mirror of
https://github.com/telemt/telemt.git
synced 2026-04-18 19:14:09 +03:00
Traffic Control + Fairness + Evaluating hard-idle timeout + Improve FakeTLS server-flight fidelity + PROXY Protocol V2 UNKNOWN/LOCAL misuse fixes: merge pull request #714 from telemt/flow
Cargo.lock — 2 lines changed (generated)
@@ -2780,7 +2780,7 @@ checksum = "7b2093cf4c8eb1e67749a6762251bc9cd836b6fc171623bd0a9d324d37af2417"
 [[package]]
 name = "telemt"
-version = "3.4.0"
+version = "3.4.1"
 dependencies = [
  "aes",
  "anyhow",
@@ -1,6 +1,6 @@
 [package]
 name = "telemt"
-version = "3.4.0"
+version = "3.4.1"
 edition = "2024"

 [features]
@@ -98,4 +98,3 @@ harness = false
 [profile.release]
 lto = "fat"
 codegen-units = 1
-
@@ -0,0 +1,225 @@
# TLS Front Profile Fidelity

## Overview

This document describes how Telemt reuses captured TLS behavior in the FakeTLS server flight and how to validate the result on a real deployment.

When TLS front emulation is enabled, Telemt can capture useful server-side TLS behavior from the selected origin and reuse that behavior in the emulated success path. The goal is not to reproduce the origin byte-for-byte, but to reduce stable synthetic traits and make the emitted server flight structurally closer to the captured profile.

## Why this change exists

The project already captures useful server-side TLS behavior in the TLS front fetch path:

- `change_cipher_spec_count`
- `app_data_record_sizes`
- `ticket_record_sizes`

Before this change, the emulator used only part of that information, which left a gap between captured origin behavior and the emitted FakeTLS server flight.

## What is implemented

- The emulator now replays the observed `ChangeCipherSpec` count from the fetched behavior profile.
- The emulator now replays observed ticket-like tail ApplicationData record sizes when raw or merged TLS profile data is available.
- The emulator now preserves more of the profiled encrypted-flight structure instead of collapsing it into a smaller synthetic shape.
- The emulator still falls back to the previous synthetic behavior when the cached profile does not contain raw TLS behavior information.
- Operator-configured `tls_new_session_tickets` still works as an additive fallback when the profile does not provide enough tail records.
## Practical benefit

- Reduced distinguishability between profiled origin TLS behavior and emulated TLS behavior.
- Lower chance of stable server-flight fingerprints caused by a fixed CCS count or synthetic-only tail record sizes.
- Better reuse of already captured TLS profile data without changing MTProto logic, KDF routing, or the transport architecture.

## Limitations

This mechanism does not aim to make Telemt byte-identical to the origin server.

It also does not change:

- MTProto business logic;
- KDF routing behavior;
- the overall transport architecture.

The practical goal is narrower:

- reuse more captured profile data;
- reduce fixed synthetic behavior in the server flight;
- preserve a valid FakeTLS success path while changing the emitted shape on the wire.

## Validation targets

- Correct count of emulated `ChangeCipherSpec` records.
- Correct replay of observed ticket-tail record sizes.
- No regression in existing ALPN and payload-placement behavior.
## How to validate the result

Recommended validation consists of two layers:

- focused unit and security tests for CCS-count replay and ticket-tail replay;
- real packet-capture comparison for a selected origin and a successful FakeTLS session.

When testing on the network, the expected result is:

- a valid FakeTLS and MTProto success path is preserved;
- the early encrypted server flight changes shape when richer profile data is available;
- the change is visible on the wire without changing MTProto logic or the transport architecture.

This validation is intended to show better reuse of captured TLS profile data. It is not intended to prove byte-level equivalence with the real origin server.

## How to test on a real deployment

The strongest practical validation is a side-by-side trace comparison between:

- a real TLS origin server used as `mask_host`;
- a Telemt FakeTLS success-path connection for the same SNI;
- optional captures from different Telemt builds or configurations.

The purpose of the comparison is to inspect the shape of the server flight:

- record order;
- count of `ChangeCipherSpec` records;
- count and grouping of early encrypted `ApplicationData` records;
- lengths of tail or continuation `ApplicationData` records.

## Recommended environment

Use a Linux host or Docker container for the cleanest reproduction.

Recommended setup:

1. One Telemt instance.
2. One real HTTPS origin as `mask_host`.
3. One Telegram client configured with an `ee` proxy link for the Telemt instance.
4. `tcpdump` or Wireshark available for capture analysis.
## Step-by-step test procedure

### 1. Prepare the origin

1. Choose a real HTTPS origin.
2. Set both `censorship.tls_domain` and `censorship.mask_host` to that hostname.
3. Confirm that a direct TLS request works:

```bash
openssl s_client -connect ORIGIN_IP:443 -servername YOUR_DOMAIN </dev/null
```

### 2. Configure Telemt

Use a configuration that enables:

- `censorship.mask = true`
- `censorship.tls_emulation = true`
- `censorship.mask_host`
- `censorship.mask_port`

Recommended for cleaner testing:

- keep `censorship.tls_new_session_tickets = 0`, so the result depends primarily on fetched profile data rather than operator-forced synthetic tail records;
- keep `censorship.tls_fetch.strict_route = true` if cleaner provenance for captured profile data is important.
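The settings above can be collected into one config sketch. The key names come from this document; the `[censorship]` table layout and the placeholder values are assumptions about the config file shape, not verified syntax:

```toml
[censorship]
mask = true
tls_emulation = true
tls_domain = "YOUR_DOMAIN"    # same hostname for SNI and masking
mask_host = "YOUR_DOMAIN"
mask_port = 443
tls_new_session_tickets = 0   # rely on fetched profile data, not forced tails

[censorship.tls_fetch]
strict_route = true           # cleaner provenance for captured profile data
```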
### 3. Refresh TLS profile data

1. Start Telemt.
2. Let it fetch TLS front profile data for the configured domain.
3. If `tls_front_dir` is persisted, confirm that the TLS front cache is populated.

Persisted cache artifacts are useful, but they are not required if packet captures already demonstrate the runtime result.

### 4. Capture a direct-origin trace

From a separate client host, connect directly to the origin:

```bash
openssl s_client -connect ORIGIN_IP:443 -servername YOUR_DOMAIN </dev/null
```

Capture with:

```bash
sudo tcpdump -i any -w origin-direct.pcap host ORIGIN_IP and port 443
```

### 5. Capture a Telemt FakeTLS success-path trace

Now connect to Telemt with a real Telegram client through an `ee` proxy link that targets the Telemt instance.

`openssl s_client` is useful for direct-origin capture and fallback sanity checks, but it does not exercise the successful FakeTLS and MTProto path.
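If you need to assemble the `ee` link by hand, the following sketch shows the conventional FakeTLS link shape — `ee` marker, a 16-byte secret in hex, then the hex-encoded SNI. This shape is an assumption from the general MTProto FakeTLS convention, not Telemt documentation; the dummy base secret and the `TELEMT_IP` placeholder must be replaced with the values your instance actually reports.

```shell
# Placeholder 16-byte secret (32 hex chars); use your instance's real secret.
BASE_SECRET="00000000000000000000000000000000"
DOMAIN="YOUR_DOMAIN"
# Hex-encode the SNI and append it after the ee-prefixed secret.
DOMAIN_HEX=$(printf '%s' "$DOMAIN" | od -An -tx1 | tr -d ' \n')
LINK="tg://proxy?server=TELEMT_IP&port=443&secret=ee${BASE_SECRET}${DOMAIN_HEX}"
echo "$LINK"
```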
Capture with:

```bash
sudo tcpdump -i any -w telemt-emulated.pcap host TELEMT_IP and port 443
```

### 6. Decode TLS record structure

Use `tshark` to print record-level structure:

```bash
tshark -r origin-direct.pcap -Y "tls.record" -T fields \
  -e frame.number \
  -e ip.src \
  -e ip.dst \
  -e tls.record.content_type \
  -e tls.record.length
```

```bash
tshark -r telemt-emulated.pcap -Y "tls.record" -T fields \
  -e frame.number \
  -e ip.src \
  -e ip.dst \
  -e tls.record.content_type \
  -e tls.record.length
```

Focus on the server flight after ClientHello:

- `22` = Handshake
- `20` = ChangeCipherSpec
- `23` = ApplicationData
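The dumps above can be reduced to the comparison numbers mechanically. One awk sketch (the inline sample stands in for real `tshark -T fields` output, and `SERVER_IP` is a placeholder): tshark comma-joins the records of a multi-record frame, so each field is split before counting.

```shell
SERVER_IP="203.0.113.10"   # placeholder origin/Telemt address
# Sample rows in the field order used above: frame, src, dst, type(s), length(s).
sample=$(printf '1\t198.51.100.7\t203.0.113.10\t22\t512\n2\t203.0.113.10\t198.51.100.7\t22,20,23\t1400,1,281\n3\t203.0.113.10\t198.51.100.7\t23,23\t570,570\n')
summary=$(printf '%s\n' "$sample" | awk -F'\t' -v srv="$SERVER_IP" '
    $2 == srv {
        n = split($4, types, ","); split($5, lens, ",")
        for (i = 1; i <= n; i++) {
            if (types[i] == 20) ccs++                       # ChangeCipherSpec
            if (types[i] == 23) appdata = appdata " " lens[i]  # ApplicationData
        }
    }
    END { printf "CCS=%d AppData lengths:%s", ccs, appdata }')
echo "$summary"
```

In real use, replace the `sample` variable with the output of the tshark command for each pcap.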
### 7. Build a comparison table

A compact table like the following is usually enough:

| Path | CCS count | AppData count in first encrypted flight | Tail AppData lengths |
| --- | --- | --- | --- |
| Origin | `N` | `M` | `[a, b, ...]` |
| Telemt build A | `...` | `...` | `...` |
| Telemt build B | `...` | `...` | `...` |

The comparison should make it easy to see that:

- the FakeTLS success path remains valid;
- the early encrypted server flight changes when richer profile data is reused;
- the result is backed by packet evidence.
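Once each pcap is reduced to one summary line, the per-row comparison is a plain string check. A minimal sketch, assuming summary lines in the format produced above (the two sample values here are placeholders):

```shell
# Placeholder summaries; in real use these come from the origin and emulated pcaps.
origin="CCS=1 AppData lengths: 281 570 570"
emulated="CCS=1 AppData lengths: 281 570 570"
if [ "$origin" = "$emulated" ]; then
    verdict="server flight shape matches the profiled origin"
else
    verdict="server flight shape differs: origin [$origin] vs emulated [$emulated]"
fi
echo "$verdict"
```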
## Example capture set

One practical example of this workflow uses:

- `origin-direct-nginx.pcap`
- `telemt-ee-before-nginx.pcap`
- `telemt-ee-after-nginx.pcap`

Practical notes:

- `origin` was captured as a direct TLS 1.2 connection to `nginx.org`;
- `before` and `after` were captured on the Telemt FakeTLS success path with a real Telegram client;
- the first server-side FakeTLS response remains valid in both cases;
- the early encrypted server-flight segmentation differs between `before` and `after`, which is consistent with better reuse of captured profile data;
- this kind of result shows a wire-visible effect without breaking the success path, but it does not claim full indistinguishability from the origin.

## Stronger validation

For broader confidence, repeat the same comparison on:

1. one CDN-backed origin;
2. one regular nginx origin;
3. one origin with a multi-record encrypted flight and visible ticket-like tails.

If the same directional improvement appears across all three, confidence in the result is much higher than for a single-origin example.
@@ -0,0 +1,225 @@
# TLS Front Profile Fidelity

## Overview

This document describes how Telemt reuses captured TLS behavior in the FakeTLS server flight and how to validate the result on a real deployment.

When TLS front emulation is enabled, Telemt can collect useful server-side TLS behavior from the selected origin and use it in the emulated success path. The goal is not byte-for-byte copying of the origin, but reducing stable synthetic traits so that the emitted server flight is structurally closer to the captured profile.

## Why this change exists

The project already collects useful server-side TLS behavior in the TLS front fetch path:

- `change_cipher_spec_count`
- `app_data_record_sizes`
- `ticket_record_sizes`

Before this change, the emulator used only part of that information, which left a gap between the captured origin behavior and the FakeTLS server flight that actually went out on the wire.

## What is implemented

- The emulator now replays the observed `ChangeCipherSpec` count from the fetched `behavior_profile`.
- The emulator now replays observed ticket-like tail ApplicationData record sizes when raw or merged TLS profile data is available.
- The emulator now preserves more of the profiled encrypted-flight structure instead of collapsing it into a smaller synthetic shape.
- For profiles without raw TLS behavior, the previous synthetic fallback is still used.
- Operator-configured `tls_new_session_tickets` still works as an additive fallback when the profile does not provide enough tail records.

## Practical benefit

- Reduced distinguishability between profiled origin TLS behavior and emulated TLS behavior.
- Lower chance of stable server-flight fingerprints caused by a fixed CCS count or fully synthetic tail record sizes.
- Better reuse of already collected TLS profile data without changing MTProto logic, KDF routing, or the transport architecture.

## Limitations

This mechanism does not aim to make Telemt byte-identical to the origin server.

It also does not change:

- MTProto business logic;
- KDF routing behavior;
- the overall transport architecture.

The practical goal is narrower:

- use more of the already collected profile data;
- reduce fixed synthetic behavior in the server flight;
- preserve a valid FakeTLS success path while changing the shape of the emitted traffic on the wire.

## Validation targets

- Correct count of emulated `ChangeCipherSpec` records.
- Correct replay of the observed ticket-tail record sizes.
- No regression in the existing ALPN and payload-placement behavior.

## How to validate the result

Recommended validation consists of two layers:

- focused unit and security tests for CCS-count replay and ticket-tail replay;
- real packet-capture comparison for a selected origin and a successful FakeTLS session.

When testing on the network, the expected result is:

- a valid FakeTLS and MTProto success path is preserved;
- the shape of the early encrypted server flight changes when richer profile data is available;
- the change is visible on the wire without changing MTProto logic or the transport architecture.

This check is meant to confirm that already collected TLS profile data is used better. It is not meant to prove byte-level equivalence with the real origin server.

## How to test on a real deployment

The strongest practical validation is a side-by-side trace comparison between:

- a real TLS origin server used as `mask_host`;
- a Telemt FakeTLS success-path connection for the same SNI;
- optionally, captures from different Telemt builds or configurations.

The point of the comparison is to look at the shape of the server flight:

- record order;
- count of `ChangeCipherSpec` records;
- count and grouping of early encrypted `ApplicationData` records;
- sizes of tail or continuation `ApplicationData` records.

## Recommended environment

Use a Linux host or Docker container for the cleanest reproduction.

Recommended setup:

1. One Telemt instance.
2. One real HTTPS origin as `mask_host`.
3. One Telegram client configured with an `ee` proxy link for the Telemt instance.
4. `tcpdump` or Wireshark for capture analysis.

## Step-by-step test procedure

### 1. Prepare the origin

1. Choose a real HTTPS origin.
2. Set both `censorship.tls_domain` and `censorship.mask_host` to that origin's hostname.
3. Confirm that a direct TLS request works:

```bash
openssl s_client -connect ORIGIN_IP:443 -servername YOUR_DOMAIN </dev/null
```

### 2. Configure Telemt

Use a config that enables:

- `censorship.mask = true`
- `censorship.tls_emulation = true`
- `censorship.mask_host`
- `censorship.mask_port`

Recommended for cleaner testing:

- keep `censorship.tls_new_session_tickets = 0`, so the result depends primarily on fetched profile data rather than operator-forced synthetic tail records;
- keep `censorship.tls_fetch.strict_route = true` if cleaner provenance for captured profile data is important.

### 3. Refresh TLS profile data

1. Start Telemt.
2. Let it fetch TLS front profile data for the chosen domain.
3. If `tls_front_dir` is stored persistently, confirm that the TLS front cache is populated.

Persisted cache artifacts are useful, but not required if packet captures already demonstrate the runtime result.

### 4. Capture a direct-origin trace

From a separate client host, connect directly to the origin:

```bash
openssl s_client -connect ORIGIN_IP:443 -servername YOUR_DOMAIN </dev/null
```

Capture with:

```bash
sudo tcpdump -i any -w origin-direct.pcap host ORIGIN_IP and port 443
```

### 5. Capture a Telemt FakeTLS success-path trace

Now connect to Telemt with a real Telegram client through an `ee` proxy link that points at the Telemt instance.

`openssl s_client` is useful for direct-origin capture and fallback sanity checks, but it does not exercise the successful FakeTLS and MTProto path.

Capture with:

```bash
sudo tcpdump -i any -w telemt-emulated.pcap host TELEMT_IP and port 443
```

### 6. Decode the TLS record structure

Use `tshark` to print the record-level structure:

```bash
tshark -r origin-direct.pcap -Y "tls.record" -T fields \
  -e frame.number \
  -e ip.src \
  -e ip.dst \
  -e tls.record.content_type \
  -e tls.record.length
```

```bash
tshark -r telemt-emulated.pcap -Y "tls.record" -T fields \
  -e frame.number \
  -e ip.src \
  -e ip.dst \
  -e tls.record.content_type \
  -e tls.record.length
```

Look at the server flight after ClientHello:

- `22` = Handshake
- `20` = ChangeCipherSpec
- `23` = ApplicationData

### 7. Build a comparison table

A short table of the following form is usually enough:

| Path | CCS count | AppData count in first encrypted flight | Tail AppData lengths |
| --- | --- | --- | --- |
| Origin | `N` | `M` | `[a, b, ...]` |
| Telemt build A | `...` | `...` | `...` |
| Telemt build B | `...` | `...` | `...` |

The table should make it easy to see that:

- the FakeTLS success path remains valid;
- the early encrypted server flight changes when richer profile data is reused;
- the result is backed by packet evidence.

## Example capture set

One practical example of this check uses:

- `origin-direct-nginx.pcap`
- `telemt-ee-before-nginx.pcap`
- `telemt-ee-after-nginx.pcap`

Practical notes:

- `origin` was captured as a direct TLS 1.2 connection to `nginx.org`;
- `before` and `after` were captured on the Telemt FakeTLS success path with a real Telegram client;
- the first server-side FakeTLS response remains valid in both cases;
- the early encrypted server-flight segmentation differs between `before` and `after`, which is consistent with better use of captured profile data;
- such a result shows a wire-visible effect without breaking the success path, but it does not claim full indistinguishability from the origin.

## Stronger validation

For broader confidence, repeat the same procedure on:

1. one CDN-backed origin;
2. one regular nginx origin;
3. one origin with a multi-record encrypted flight and visible ticket-like tails.

If the same directional improvement repeats on all three, confidence in the result will be considerably higher than for a single-origin example.
@@ -121,6 +121,9 @@ pub struct HotFields {
     pub user_max_tcp_conns_global_each: usize,
     pub user_expirations: std::collections::HashMap<String, chrono::DateTime<chrono::Utc>>,
     pub user_data_quota: std::collections::HashMap<String, u64>,
+    pub user_rate_limits: std::collections::HashMap<String, crate::config::RateLimitBps>,
+    pub cidr_rate_limits:
+        std::collections::HashMap<ipnetwork::IpNetwork, crate::config::RateLimitBps>,
     pub user_max_unique_ips: std::collections::HashMap<String, usize>,
     pub user_max_unique_ips_global_each: usize,
     pub user_max_unique_ips_mode: crate::config::UserMaxUniqueIpsMode,
@@ -245,6 +248,8 @@ impl HotFields {
             user_max_tcp_conns_global_each: cfg.access.user_max_tcp_conns_global_each,
             user_expirations: cfg.access.user_expirations.clone(),
             user_data_quota: cfg.access.user_data_quota.clone(),
+            user_rate_limits: cfg.access.user_rate_limits.clone(),
+            cidr_rate_limits: cfg.access.cidr_rate_limits.clone(),
             user_max_unique_ips: cfg.access.user_max_unique_ips.clone(),
             user_max_unique_ips_global_each: cfg.access.user_max_unique_ips_global_each,
             user_max_unique_ips_mode: cfg.access.user_max_unique_ips_mode,
@@ -545,6 +550,8 @@ fn overlay_hot_fields(old: &ProxyConfig, new: &ProxyConfig) -> ProxyConfig {
     cfg.access.user_max_tcp_conns_global_each = new.access.user_max_tcp_conns_global_each;
     cfg.access.user_expirations = new.access.user_expirations.clone();
     cfg.access.user_data_quota = new.access.user_data_quota.clone();
+    cfg.access.user_rate_limits = new.access.user_rate_limits.clone();
+    cfg.access.cidr_rate_limits = new.access.cidr_rate_limits.clone();
     cfg.access.user_max_unique_ips = new.access.user_max_unique_ips.clone();
     cfg.access.user_max_unique_ips_global_each = new.access.user_max_unique_ips_global_each;
     cfg.access.user_max_unique_ips_mode = new.access.user_max_unique_ips_mode;
@@ -1183,6 +1190,18 @@ fn log_changes(
             new_hot.user_data_quota.len()
         );
     }
+    if old_hot.user_rate_limits != new_hot.user_rate_limits {
+        info!(
+            "config reload: user_rate_limits updated ({} entries)",
+            new_hot.user_rate_limits.len()
+        );
+    }
+    if old_hot.cidr_rate_limits != new_hot.cidr_rate_limits {
+        info!(
+            "config reload: cidr_rate_limits updated ({} entries)",
+            new_hot.cidr_rate_limits.len()
+        );
+    }
     if old_hot.user_max_unique_ips != new_hot.user_max_unique_ips {
         info!(
             "config reload: user_max_unique_ips updated ({} entries)",
@@ -861,6 +861,22 @@ impl ProxyConfig {
             ));
         }
+
+        for (user, limit) in &config.access.user_rate_limits {
+            if limit.up_bps == 0 && limit.down_bps == 0 {
+                return Err(ProxyError::Config(format!(
+                    "access.user_rate_limits.{user} must set at least one non-zero direction"
+                )));
+            }
+        }
+
+        for (cidr, limit) in &config.access.cidr_rate_limits {
+            if limit.up_bps == 0 && limit.down_bps == 0 {
+                return Err(ProxyError::Config(format!(
+                    "access.cidr_rate_limits.{cidr} must set at least one non-zero direction"
+                )));
+            }
+        }
+
         if config.general.me_reinit_every_secs == 0 {
             return Err(ProxyError::Config(
                 "general.me_reinit_every_secs must be > 0".to_string(),
@@ -1826,6 +1826,21 @@ pub struct AccessConfig {
     #[serde(default)]
     pub user_data_quota: HashMap<String, u64>,
+
+    /// Per-user transport rate limits in bits-per-second.
+    ///
+    /// Each entry supports independent upload (`up_bps`) and download
+    /// (`down_bps`) ceilings. A value of `0` in one direction means
+    /// "unlimited" for that direction.
+    #[serde(default)]
+    pub user_rate_limits: HashMap<String, RateLimitBps>,
+
+    /// Per-CIDR aggregate transport rate limits in bits-per-second.
+    ///
+    /// Matching uses longest-prefix-wins semantics. A value of `0` in one
+    /// direction means "unlimited" for that direction.
+    #[serde(default)]
+    pub cidr_rate_limits: HashMap<IpNetwork, RateLimitBps>,
+
     #[serde(default)]
     pub user_max_unique_ips: HashMap<String, usize>,

@@ -1859,6 +1874,8 @@ impl Default for AccessConfig {
             user_max_tcp_conns_global_each: default_user_max_tcp_conns_global_each(),
             user_expirations: HashMap::new(),
             user_data_quota: HashMap::new(),
+            user_rate_limits: HashMap::new(),
+            cidr_rate_limits: HashMap::new(),
             user_max_unique_ips: HashMap::new(),
             user_max_unique_ips_global_each: default_user_max_unique_ips_global_each(),
             user_max_unique_ips_mode: UserMaxUniqueIpsMode::default(),

@@ -1870,6 +1887,14 @@ impl Default for AccessConfig {
         }
     }
+
+#[derive(Debug, Clone, Copy, Default, PartialEq, Eq, Serialize, Deserialize)]
+pub struct RateLimitBps {
+    #[serde(default)]
+    pub up_bps: u64,
+    #[serde(default)]
+    pub down_bps: u64,
+}
+
 // ============= Aux Structures =============

 #[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
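The new `AccessConfig` fields above deserialize from operator config via serde, so per-user and per-CIDR limits can be sketched as TOML like the following. This is an illustrative fragment inferred from the field names, the `RateLimitBps` shape, and the validation rules in this diff — the exact table syntax Telemt accepts is an assumption, and the user names and numbers are placeholders:

```toml
[access.user_rate_limits]
# Validation rejects entries where both directions are 0;
# 0 in one direction means "unlimited" for that side.
alice = { up_bps = 1_000_000, down_bps = 8_000_000 }
bob   = { up_bps = 0, down_bps = 4_000_000 }

[access.cidr_rate_limits]
# Longest prefix wins when several CIDRs match a peer address.
"203.0.113.0/24" = { up_bps = 20_000_000, down_bps = 20_000_000 }
```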
@@ -8,6 +8,7 @@ use std::io::{self, Read, Write};
|
|||||||
use std::os::unix::fs::OpenOptionsExt;
|
use std::os::unix::fs::OpenOptionsExt;
|
||||||
use std::path::{Path, PathBuf};
|
use std::path::{Path, PathBuf};
|
||||||
|
|
||||||
|
use nix::errno::Errno;
|
||||||
use nix::fcntl::{Flock, FlockArg};
|
use nix::fcntl::{Flock, FlockArg};
|
||||||
use nix::unistd::{self, ForkResult, Gid, Pid, Uid, chdir, close, fork, getpid, setsid};
|
use nix::unistd::{self, ForkResult, Gid, Pid, Uid, chdir, close, fork, getpid, setsid};
|
||||||
use tracing::{debug, info, warn};
|
use tracing::{debug, info, warn};
|
||||||
@@ -157,15 +158,15 @@ fn redirect_stdio_to_devnull() -> Result<(), DaemonError> {
|
|||||||
unsafe {
|
unsafe {
|
||||||
// Redirect stdin (fd 0)
|
// Redirect stdin (fd 0)
|
||||||
if libc::dup2(devnull_fd, 0) < 0 {
|
if libc::dup2(devnull_fd, 0) < 0 {
|
||||||
return Err(DaemonError::RedirectFailed(nix::errno::Errno::last()));
|
return Err(DaemonError::RedirectFailed(Errno::last()));
|
||||||
}
|
}
|
||||||
// Redirect stdout (fd 1)
|
// Redirect stdout (fd 1)
|
||||||
if libc::dup2(devnull_fd, 1) < 0 {
|
if libc::dup2(devnull_fd, 1) < 0 {
|
||||||
return Err(DaemonError::RedirectFailed(nix::errno::Errno::last()));
|
return Err(DaemonError::RedirectFailed(Errno::last()));
|
||||||
}
|
}
|
||||||
// Redirect stderr (fd 2)
|
// Redirect stderr (fd 2)
|
||||||
if libc::dup2(devnull_fd, 2) < 0 {
|
if libc::dup2(devnull_fd, 2) < 0 {
|
||||||
return Err(DaemonError::RedirectFailed(nix::errno::Errno::last()));
|
return Err(DaemonError::RedirectFailed(Errno::last()));
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -337,6 +338,27 @@ fn is_process_running(pid: i32) -> bool {
|
|||||||
nix::sys::signal::kill(Pid::from_raw(pid), None).is_ok()
|
nix::sys::signal::kill(Pid::from_raw(pid), None).is_ok()
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// macOS gates nix::unistd::setgroups differently in the current dependency set,
|
||||||
|
// so call libc directly there while preserving the original nix path elsewhere.
|
||||||
|
fn set_supplementary_groups(gid: Gid) -> Result<(), nix::Error> {
|
||||||
|
#[cfg(target_os = "macos")]
|
||||||
|
{
|
||||||
|
let groups = [gid.as_raw()];
|
||||||
|
let rc = unsafe {
|
||||||
|
libc::setgroups(
|
||||||
|
i32::try_from(groups.len()).expect("single supplementary group must fit in c_int"),
|
||||||
|
groups.as_ptr(),
|
||||||
|
)
|
||||||
|
};
|
||||||
|
if rc == 0 { Ok(()) } else { Err(Errno::last()) }
|
||||||
|
}
|
||||||
|
|
||||||
|
#[cfg(not(target_os = "macos"))]
|
||||||
|
{
|
||||||
|
unistd::setgroups(&[gid])
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
/// Drops privileges to the specified user and group.
|
/// Drops privileges to the specified user and group.
|
||||||
///
|
///
|
||||||
/// This should be called after binding privileged ports but before entering
|
/// This should be called after binding privileged ports but before entering
|
||||||
@@ -368,7 +390,7 @@ pub fn drop_privileges(
|
|||||||
|
|
||||||
if let Some(gid) = target_gid {
|
if let Some(gid) = target_gid {
|
||||||
unistd::setgid(gid).map_err(DaemonError::PrivilegeDrop)?;
|
unistd::setgid(gid).map_err(DaemonError::PrivilegeDrop)?;
|
||||||
unistd::setgroups(&[gid]).map_err(DaemonError::PrivilegeDrop)?;
|
set_supplementary_groups(gid).map_err(DaemonError::PrivilegeDrop)?;
|
||||||
info!(gid = gid.as_raw(), "Dropped group privileges");
|
info!(gid = gid.as_raw(), "Dropped group privileges");
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -664,6 +664,11 @@ async fn run_telemt_core(
     ));
 
     let buffer_pool = Arc::new(BufferPool::with_config(64 * 1024, 4096));
+    let shared_state = ProxySharedState::new();
+    shared_state.traffic_limiter.apply_policy(
+        config.access.user_rate_limits.clone(),
+        config.access.cidr_rate_limits.clone(),
+    );
 
     connectivity::run_startup_connectivity(
         &config,
@@ -695,6 +700,7 @@ async fn run_telemt_core(
         beobachten.clone(),
         api_config_tx.clone(),
         me_pool.clone(),
+        shared_state.clone(),
     )
     .await;
     let config_rx = runtime_watches.config_rx;
@@ -711,7 +717,6 @@ async fn run_telemt_core(
     )
     .await;
     let _admission_tx_hold = admission_tx;
-    let shared_state = ProxySharedState::new();
     conntrack_control::spawn_conntrack_controller(
         config_rx.clone(),
         stats.clone(),
@@ -51,6 +51,7 @@ pub(crate) async fn spawn_runtime_tasks(
     beobachten: Arc<BeobachtenStore>,
     api_config_tx: watch::Sender<Arc<ProxyConfig>>,
     me_pool_for_policy: Option<Arc<MePool>>,
+    shared_state: Arc<ProxySharedState>,
 ) -> RuntimeWatches {
     let um_clone = upstream_manager.clone();
     let dc_overrides_for_health = config.dc_overrides.clone();
@@ -182,6 +183,41 @@ pub(crate) async fn spawn_runtime_tasks(
         }
     });
 
+    let limiter = shared_state.traffic_limiter.clone();
+    limiter.apply_policy(
+        config.access.user_rate_limits.clone(),
+        config.access.cidr_rate_limits.clone(),
+    );
+    let mut config_rx_rate_limits = config_rx.clone();
+    tokio::spawn(async move {
+        let mut prev_user_limits = config_rx_rate_limits
+            .borrow()
+            .access
+            .user_rate_limits
+            .clone();
+        let mut prev_cidr_limits = config_rx_rate_limits
+            .borrow()
+            .access
+            .cidr_rate_limits
+            .clone();
+        loop {
+            if config_rx_rate_limits.changed().await.is_err() {
+                break;
+            }
+            let cfg = config_rx_rate_limits.borrow_and_update().clone();
+            if prev_user_limits != cfg.access.user_rate_limits
+                || prev_cidr_limits != cfg.access.cidr_rate_limits
+            {
+                limiter.apply_policy(
+                    cfg.access.user_rate_limits.clone(),
+                    cfg.access.cidr_rate_limits.clone(),
+                );
+                prev_user_limits = cfg.access.user_rate_limits.clone();
+                prev_cidr_limits = cfg.access.cidr_rate_limits.clone();
+            }
+        }
+    });
+
     let beobachten_writer = beobachten.clone();
     let config_rx_beobachten = config_rx.clone();
     tokio::spawn(async move {
270 src/metrics.rs
@@ -575,6 +575,139 @@ async fn render_metrics(
         }
     );
 
+    let limiter_metrics = shared_state.traffic_limiter.metrics_snapshot();
+    let _ = writeln!(
+        out,
+        "# HELP telemt_rate_limiter_throttle_total Traffic limiter throttle events by scope and direction"
+    );
+    let _ = writeln!(out, "# TYPE telemt_rate_limiter_throttle_total counter");
+    let _ = writeln!(
+        out,
+        "telemt_rate_limiter_throttle_total{{scope=\"user\",direction=\"up\"}} {}",
+        if core_enabled {
+            limiter_metrics.user_throttle_up_total
+        } else {
+            0
+        }
+    );
+    let _ = writeln!(
+        out,
+        "telemt_rate_limiter_throttle_total{{scope=\"user\",direction=\"down\"}} {}",
+        if core_enabled {
+            limiter_metrics.user_throttle_down_total
+        } else {
+            0
+        }
+    );
+    let _ = writeln!(
+        out,
+        "telemt_rate_limiter_throttle_total{{scope=\"cidr\",direction=\"up\"}} {}",
+        if core_enabled {
+            limiter_metrics.cidr_throttle_up_total
+        } else {
+            0
+        }
+    );
+    let _ = writeln!(
+        out,
+        "telemt_rate_limiter_throttle_total{{scope=\"cidr\",direction=\"down\"}} {}",
+        if core_enabled {
+            limiter_metrics.cidr_throttle_down_total
+        } else {
+            0
+        }
+    );
+
+    let _ = writeln!(
+        out,
+        "# HELP telemt_rate_limiter_wait_ms_total Traffic limiter accumulated wait time in milliseconds by scope and direction"
+    );
+    let _ = writeln!(out, "# TYPE telemt_rate_limiter_wait_ms_total counter");
+    let _ = writeln!(
+        out,
+        "telemt_rate_limiter_wait_ms_total{{scope=\"user\",direction=\"up\"}} {}",
+        if core_enabled {
+            limiter_metrics.user_wait_up_ms_total
+        } else {
+            0
+        }
+    );
+    let _ = writeln!(
+        out,
+        "telemt_rate_limiter_wait_ms_total{{scope=\"user\",direction=\"down\"}} {}",
+        if core_enabled {
+            limiter_metrics.user_wait_down_ms_total
+        } else {
+            0
+        }
+    );
+    let _ = writeln!(
+        out,
+        "telemt_rate_limiter_wait_ms_total{{scope=\"cidr\",direction=\"up\"}} {}",
+        if core_enabled {
+            limiter_metrics.cidr_wait_up_ms_total
+        } else {
+            0
+        }
+    );
+    let _ = writeln!(
+        out,
+        "telemt_rate_limiter_wait_ms_total{{scope=\"cidr\",direction=\"down\"}} {}",
+        if core_enabled {
+            limiter_metrics.cidr_wait_down_ms_total
+        } else {
+            0
+        }
+    );
+
+    let _ = writeln!(
+        out,
+        "# HELP telemt_rate_limiter_active_leases Active relay leases under rate limiting by scope"
+    );
+    let _ = writeln!(out, "# TYPE telemt_rate_limiter_active_leases gauge");
+    let _ = writeln!(
+        out,
+        "telemt_rate_limiter_active_leases{{scope=\"user\"}} {}",
+        if core_enabled {
+            limiter_metrics.user_active_leases
+        } else {
+            0
+        }
+    );
+    let _ = writeln!(
+        out,
+        "telemt_rate_limiter_active_leases{{scope=\"cidr\"}} {}",
+        if core_enabled {
+            limiter_metrics.cidr_active_leases
+        } else {
+            0
+        }
+    );
+
+    let _ = writeln!(
+        out,
+        "# HELP telemt_rate_limiter_policy_entries Active rate-limit policy entries by scope"
+    );
+    let _ = writeln!(out, "# TYPE telemt_rate_limiter_policy_entries gauge");
+    let _ = writeln!(
+        out,
+        "telemt_rate_limiter_policy_entries{{scope=\"user\"}} {}",
+        if core_enabled {
+            limiter_metrics.user_policy_entries
+        } else {
+            0
+        }
+    );
+    let _ = writeln!(
+        out,
+        "telemt_rate_limiter_policy_entries{{scope=\"cidr\"}} {}",
+        if core_enabled {
+            limiter_metrics.cidr_policy_entries
+        } else {
+            0
+        }
+    );
+
     let _ = writeln!(
         out,
         "# HELP telemt_upstream_connect_attempt_total Upstream connect attempts across all requests"
@@ -1177,6 +1310,143 @@ async fn render_metrics(
             0
         }
     );
+    let _ = writeln!(
+        out,
+        "# HELP telemt_me_fair_pressure_state Worker-local fairness pressure state"
+    );
+    let _ = writeln!(out, "# TYPE telemt_me_fair_pressure_state gauge");
+    let _ = writeln!(
+        out,
+        "telemt_me_fair_pressure_state {}",
+        if me_allows_normal {
+            stats.get_me_fair_pressure_state_gauge()
+        } else {
+            0
+        }
+    );
+
+    let _ = writeln!(
+        out,
+        "# HELP telemt_me_fair_active_flows Fair-scheduler active flow count"
+    );
+    let _ = writeln!(out, "# TYPE telemt_me_fair_active_flows gauge");
+    let _ = writeln!(
+        out,
+        "telemt_me_fair_active_flows {}",
+        if me_allows_normal {
+            stats.get_me_fair_active_flows_gauge()
+        } else {
+            0
+        }
+    );
+
+    let _ = writeln!(
+        out,
+        "# HELP telemt_me_fair_queued_bytes Fair-scheduler queued bytes"
+    );
+    let _ = writeln!(out, "# TYPE telemt_me_fair_queued_bytes gauge");
+    let _ = writeln!(
+        out,
+        "telemt_me_fair_queued_bytes {}",
+        if me_allows_normal {
+            stats.get_me_fair_queued_bytes_gauge()
+        } else {
+            0
+        }
+    );
+
+    let _ = writeln!(
+        out,
+        "# HELP telemt_me_fair_flow_state_gauge Fair-scheduler flow health classes"
+    );
+    let _ = writeln!(out, "# TYPE telemt_me_fair_flow_state_gauge gauge");
+    let _ = writeln!(
+        out,
+        "telemt_me_fair_flow_state_gauge{{class=\"standing\"}} {}",
+        if me_allows_normal {
+            stats.get_me_fair_standing_flows_gauge()
+        } else {
+            0
+        }
+    );
+    let _ = writeln!(
+        out,
+        "telemt_me_fair_flow_state_gauge{{class=\"backpressured\"}} {}",
+        if me_allows_normal {
+            stats.get_me_fair_backpressured_flows_gauge()
+        } else {
+            0
+        }
+    );
+
+    let _ = writeln!(
+        out,
+        "# HELP telemt_me_fair_events_total Fair-scheduler event counters"
+    );
+    let _ = writeln!(out, "# TYPE telemt_me_fair_events_total counter");
+    let _ = writeln!(
+        out,
+        "telemt_me_fair_events_total{{event=\"scheduler_round\"}} {}",
+        if me_allows_normal {
+            stats.get_me_fair_scheduler_rounds_total()
+        } else {
+            0
+        }
+    );
+    let _ = writeln!(
+        out,
+        "telemt_me_fair_events_total{{event=\"deficit_grant\"}} {}",
+        if me_allows_normal {
+            stats.get_me_fair_deficit_grants_total()
+        } else {
+            0
+        }
+    );
+    let _ = writeln!(
+        out,
+        "telemt_me_fair_events_total{{event=\"deficit_skip\"}} {}",
+        if me_allows_normal {
+            stats.get_me_fair_deficit_skips_total()
+        } else {
+            0
+        }
+    );
+    let _ = writeln!(
+        out,
+        "telemt_me_fair_events_total{{event=\"enqueue_reject\"}} {}",
+        if me_allows_normal {
+            stats.get_me_fair_enqueue_rejects_total()
+        } else {
+            0
+        }
+    );
+    let _ = writeln!(
+        out,
+        "telemt_me_fair_events_total{{event=\"shed_drop\"}} {}",
+        if me_allows_normal {
+            stats.get_me_fair_shed_drops_total()
+        } else {
+            0
+        }
+    );
+    let _ = writeln!(
+        out,
+        "telemt_me_fair_events_total{{event=\"penalty\"}} {}",
+        if me_allows_normal {
+            stats.get_me_fair_penalties_total()
+        } else {
+            0
+        }
+    );
+    let _ = writeln!(
+        out,
+        "telemt_me_fair_events_total{{event=\"downstream_stall\"}} {}",
+        if me_allows_normal {
+            stats.get_me_fair_downstream_stalls_total()
+        } else {
+            0
+        }
+    );
 
     let _ = writeln!(
         out,
@@ -316,6 +316,9 @@ where
 
     stats.increment_user_connects(user);
     let _direct_connection_lease = stats.acquire_direct_connection_lease();
+    let traffic_lease = shared
+        .traffic_limiter
+        .acquire_lease(user, success.peer.ip());
 
     let buffer_pool_trim = Arc::clone(&buffer_pool);
     let relay_activity_timeout = if shared.conntrack_pressure_active() {
@@ -329,7 +332,7 @@ where
     } else {
         Duration::from_secs(1800)
     };
-    let relay_result = crate::proxy::relay::relay_bidirectional_with_activity_timeout(
+    let relay_result = crate::proxy::relay::relay_bidirectional_with_activity_timeout_and_lease(
         client_reader,
         client_writer,
         tg_reader,
@@ -340,6 +343,7 @@ where
         Arc::clone(&stats),
         config.access.user_data_quota.get(user).copied(),
         buffer_pool,
+        traffic_lease,
         relay_activity_timeout,
     );
     tokio::pin!(relay_result);
@@ -28,6 +28,7 @@ use crate::proxy::route_mode::{
 use crate::proxy::shared_state::{
     ConntrackCloseEvent, ConntrackClosePublishResult, ConntrackCloseReason, ProxySharedState,
 };
+use crate::proxy::traffic_limiter::{RateDirection, TrafficLease, next_refill_delay};
 use crate::stats::{
     MeD2cFlushReason, MeD2cQuotaRejectStage, MeD2cWriteMode, QuotaReserveError, Stats, UserStats,
 };
@@ -286,6 +287,10 @@ impl RelayClientIdleState {
         self.last_client_frame_at = now;
         self.soft_idle_marked = false;
     }
+
+    fn on_client_tiny_frame(&mut self, now: Instant) {
+        self.last_client_frame_at = now;
+    }
 }
 
 impl MeD2cFlushPolicy {
@@ -595,6 +600,41 @@ async fn reserve_user_quota_with_yield(
     }
 }
 
+async fn wait_for_traffic_budget(
+    lease: Option<&Arc<TrafficLease>>,
+    direction: RateDirection,
+    bytes: u64,
+) {
+    if bytes == 0 {
+        return;
+    }
+    let Some(lease) = lease else {
+        return;
+    };
+
+    let mut remaining = bytes;
+    while remaining > 0 {
+        let consume = lease.try_consume(direction, remaining);
+        if consume.granted > 0 {
+            remaining = remaining.saturating_sub(consume.granted);
+            continue;
+        }
+
+        let wait_started_at = Instant::now();
+        tokio::time::sleep(next_refill_delay()).await;
+        let wait_ms = wait_started_at
+            .elapsed()
+            .as_millis()
+            .min(u128::from(u64::MAX)) as u64;
+        lease.observe_wait_ms(
+            direction,
+            consume.blocked_user,
+            consume.blocked_cidr,
+            wait_ms,
+        );
+    }
+}
+
 fn classify_me_d2c_flush_reason(
     flush_immediately: bool,
     batch_frames: usize,
@@ -985,6 +1025,7 @@ where
     let quota_limit = config.access.user_data_quota.get(&user).copied();
     let quota_user_stats = quota_limit.map(|_| stats.get_or_create_user_stats_handle(&user));
     let peer = success.peer;
+    let traffic_lease = shared.traffic_limiter.acquire_lease(&user, peer.ip());
     let proto_tag = success.proto_tag;
     let pool_generation = me_pool.current_generation();
 
@@ -1120,6 +1161,7 @@ where
     let rng_clone = rng.clone();
     let user_clone = user.clone();
     let quota_user_stats_me_writer = quota_user_stats.clone();
+    let traffic_lease_me_writer = traffic_lease.clone();
     let last_downstream_activity_ms_clone = last_downstream_activity_ms.clone();
     let bytes_me2c_clone = bytes_me2c.clone();
     let d2c_flush_policy = MeD2cFlushPolicy::from_config(&config);
@@ -1153,7 +1195,7 @@ where
 
             let first_is_downstream_activity =
                 matches!(&first, MeResponse::Data { .. } | MeResponse::Ack(_));
-            match process_me_writer_response(
+            match process_me_writer_response_with_traffic_lease(
                 first,
                 &mut writer,
                 proto_tag,
@@ -1164,6 +1206,7 @@ where
                 quota_user_stats_me_writer.as_deref(),
                 quota_limit,
                 d2c_flush_policy.quota_soft_overshoot_bytes,
+                traffic_lease_me_writer.as_ref(),
                 bytes_me2c_clone.as_ref(),
                 conn_id,
                 d2c_flush_policy.ack_flush_immediate,
@@ -1213,7 +1256,7 @@ where
 
             let next_is_downstream_activity =
                 matches!(&next, MeResponse::Data { .. } | MeResponse::Ack(_));
-            match process_me_writer_response(
+            match process_me_writer_response_with_traffic_lease(
                 next,
                 &mut writer,
                 proto_tag,
@@ -1224,6 +1267,7 @@ where
                 quota_user_stats_me_writer.as_deref(),
                 quota_limit,
                 d2c_flush_policy.quota_soft_overshoot_bytes,
+                traffic_lease_me_writer.as_ref(),
                 bytes_me2c_clone.as_ref(),
                 conn_id,
                 d2c_flush_policy.ack_flush_immediate,
@@ -1276,7 +1320,7 @@ where
             Ok(Some(next)) => {
                 let next_is_downstream_activity =
                     matches!(&next, MeResponse::Data { .. } | MeResponse::Ack(_));
-                match process_me_writer_response(
+                match process_me_writer_response_with_traffic_lease(
                     next,
                     &mut writer,
                     proto_tag,
@@ -1287,6 +1331,7 @@ where
                     quota_user_stats_me_writer.as_deref(),
                     quota_limit,
                     d2c_flush_policy.quota_soft_overshoot_bytes,
+                    traffic_lease_me_writer.as_ref(),
                     bytes_me2c_clone.as_ref(),
                     conn_id,
                     d2c_flush_policy.ack_flush_immediate,
@@ -1341,7 +1386,7 @@ where
 
                 let extra_is_downstream_activity =
                     matches!(&extra, MeResponse::Data { .. } | MeResponse::Ack(_));
-                match process_me_writer_response(
+                match process_me_writer_response_with_traffic_lease(
                     extra,
                     &mut writer,
                     proto_tag,
@@ -1352,6 +1397,7 @@ where
                     quota_user_stats_me_writer.as_deref(),
                     quota_limit,
                     d2c_flush_policy.quota_soft_overshoot_bytes,
+                    traffic_lease_me_writer.as_ref(),
                     bytes_me2c_clone.as_ref(),
                     conn_id,
                     d2c_flush_policy.ack_flush_immediate,
@@ -1542,6 +1588,12 @@ where
         match payload_result {
             Ok(Some((payload, quickack))) => {
                 trace!(conn_id, bytes = payload.len(), "C->ME frame");
+                wait_for_traffic_budget(
+                    traffic_lease.as_ref(),
+                    RateDirection::Up,
+                    payload.len() as u64,
+                )
+                .await;
                 forensics.bytes_c2me = forensics
                     .bytes_c2me
                     .saturating_add(payload.len() as u64);
@@ -1762,40 +1814,6 @@ where
                 let downstream_ms = last_downstream_activity_ms.load(Ordering::Relaxed);
                 let hard_deadline =
                     hard_deadline(idle_policy, idle_state, session_started_at, downstream_ms);
-                if now >= hard_deadline {
-                    clear_relay_idle_candidate_in(shared, forensics.conn_id);
-                    stats.increment_relay_idle_hard_close_total();
-                    let client_idle_secs = now
-                        .saturating_duration_since(idle_state.last_client_frame_at)
-                        .as_secs();
-                    let downstream_idle_secs = now
-                        .saturating_duration_since(
-                            session_started_at + Duration::from_millis(downstream_ms),
-                        )
-                        .as_secs();
-                    warn!(
-                        trace_id = format_args!("0x{:016x}", forensics.trace_id),
-                        conn_id = forensics.conn_id,
-                        user = %forensics.user,
-                        read_label,
-                        client_idle_secs,
-                        downstream_idle_secs,
-                        soft_idle_secs = idle_policy.soft_idle.as_secs(),
-                        hard_idle_secs = idle_policy.hard_idle.as_secs(),
-                        grace_secs = idle_policy.grace_after_downstream_activity.as_secs(),
-                        "Middle-relay hard idle close"
-                    );
-                    return Err(ProxyError::Io(std::io::Error::new(
-                        std::io::ErrorKind::TimedOut,
-                        format!(
-                            "middle-relay hard idle timeout while reading {read_label}: client_idle_secs={client_idle_secs}, downstream_idle_secs={downstream_idle_secs}, soft_idle_secs={}, hard_idle_secs={}, grace_secs={}",
-                            idle_policy.soft_idle.as_secs(),
-                            idle_policy.hard_idle.as_secs(),
-                            idle_policy.grace_after_downstream_activity.as_secs(),
-                        ),
-                    )));
-                }
-
                 if !idle_state.soft_idle_marked
                     && now.saturating_duration_since(idle_state.last_client_frame_at)
                         >= idle_policy.soft_idle
@@ -1850,7 +1868,45 @@ where
                     ),
                 )));
             }
-            Err(_) => {}
+            Err(_) => {
+                let now = Instant::now();
+                let downstream_ms = last_downstream_activity_ms.load(Ordering::Relaxed);
+                let hard_deadline =
+                    hard_deadline(idle_policy, idle_state, session_started_at, downstream_ms);
+                if now >= hard_deadline {
+                    clear_relay_idle_candidate_in(shared, forensics.conn_id);
+                    stats.increment_relay_idle_hard_close_total();
+                    let client_idle_secs = now
+                        .saturating_duration_since(idle_state.last_client_frame_at)
+                        .as_secs();
+                    let downstream_idle_secs = now
+                        .saturating_duration_since(
+                            session_started_at + Duration::from_millis(downstream_ms),
+                        )
+                        .as_secs();
+                    warn!(
+                        trace_id = format_args!("0x{:016x}", forensics.trace_id),
+                        conn_id = forensics.conn_id,
+                        user = %forensics.user,
+                        read_label,
+                        client_idle_secs,
+                        downstream_idle_secs,
+                        soft_idle_secs = idle_policy.soft_idle.as_secs(),
+                        hard_idle_secs = idle_policy.hard_idle.as_secs(),
+                        grace_secs = idle_policy.grace_after_downstream_activity.as_secs(),
+                        "Middle-relay hard idle close"
+                    );
+                    return Err(ProxyError::Io(std::io::Error::new(
+                        std::io::ErrorKind::TimedOut,
+                        format!(
+                            "middle-relay hard idle timeout while reading {read_label}: client_idle_secs={client_idle_secs}, downstream_idle_secs={downstream_idle_secs}, soft_idle_secs={}, hard_idle_secs={}, grace_secs={}",
+                            idle_policy.soft_idle.as_secs(),
+                            idle_policy.hard_idle.as_secs(),
+                            idle_policy.grace_after_downstream_activity.as_secs(),
+                        ),
+                    )));
+                }
+            }
         }
     }
 
@@ -1941,6 +1997,7 @@ where
         };
 
         if len == 0 {
+            idle_state.on_client_tiny_frame(Instant::now());
            idle_state.tiny_frame_debt = idle_state
                 .tiny_frame_debt
                 .saturating_add(TINY_FRAME_DEBT_PER_TINY);
@@ -2160,6 +2217,46 @@ async fn process_me_writer_response<W>(
     ack_flush_immediate: bool,
     batched: bool,
 ) -> Result<MeWriterResponseOutcome>
+where
+    W: AsyncWrite + Unpin + Send + 'static,
+{
+    process_me_writer_response_with_traffic_lease(
+        response,
+        client_writer,
+        proto_tag,
+        rng,
+        frame_buf,
+        stats,
+        user,
+        quota_user_stats,
+        quota_limit,
+        quota_soft_overshoot_bytes,
+        None,
+        bytes_me2c,
+        conn_id,
+        ack_flush_immediate,
+        batched,
+    )
+    .await
+}
+
+async fn process_me_writer_response_with_traffic_lease<W>(
+    response: MeResponse,
+    client_writer: &mut CryptoWriter<W>,
+    proto_tag: ProtoTag,
+    rng: &SecureRandom,
+    frame_buf: &mut Vec<u8>,
+    stats: &Stats,
+    user: &str,
+    quota_user_stats: Option<&UserStats>,
+    quota_limit: Option<u64>,
+    quota_soft_overshoot_bytes: u64,
+    traffic_lease: Option<&Arc<TrafficLease>>,
+    bytes_me2c: &AtomicU64,
+    conn_id: u64,
+    ack_flush_immediate: bool,
+    batched: bool,
+) -> Result<MeWriterResponseOutcome>
 where
     W: AsyncWrite + Unpin + Send + 'static,
 {
@@ -2183,6 +2280,7 @@ where
                 });
             }
         }
+        wait_for_traffic_budget(traffic_lease, RateDirection::Down, data_len).await;
 
         let write_mode =
             match write_client_payload(client_writer, proto_tag, flags, &data, rng, frame_buf)
@@ -2220,6 +2318,7 @@ where
         } else {
             trace!(conn_id, confirm, "ME->C quickack");
         }
+        wait_for_traffic_budget(traffic_lease, RateDirection::Down, 4).await;
         write_client_ack(client_writer, proto_tag, confirm).await?;
         stats.increment_me_d2c_ack_frames_total();
 
@@ -68,6 +68,7 @@ pub mod relay;
 pub mod route_mode;
 pub mod session_eviction;
 pub mod shared_state;
+pub mod traffic_limiter;
 
 pub use client::ClientHandler;
 #[allow(unused_imports)]
|
|||||||
@@ -52,6 +52,7 @@
 //! - `SharedCounters` (atomics) let the watchdog read stats without locking

 use crate::error::{ProxyError, Result};
+use crate::proxy::traffic_limiter::{RateDirection, TrafficLease, next_refill_delay};
 use crate::stats::{Stats, UserStats};
 use crate::stream::BufferPool;
 use std::io;
@@ -61,7 +62,7 @@ use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
 use std::task::{Context, Poll};
 use std::time::Duration;
 use tokio::io::{AsyncRead, AsyncWrite, AsyncWriteExt, ReadBuf, copy_bidirectional_with_sizes};
-use tokio::time::Instant;
+use tokio::time::{Instant, Sleep};
 use tracing::{debug, trace, warn};

 // ============= Constants =============
@@ -210,12 +211,24 @@ struct StatsIo<S> {
     stats: Arc<Stats>,
     user: String,
     user_stats: Arc<UserStats>,
+    traffic_lease: Option<Arc<TrafficLease>>,
+    c2s_rate_debt_bytes: u64,
+    c2s_wait: RateWaitState,
+    s2c_wait: RateWaitState,
     quota_limit: Option<u64>,
     quota_exceeded: Arc<AtomicBool>,
     quota_bytes_since_check: u64,
     epoch: Instant,
 }

+#[derive(Default)]
+struct RateWaitState {
+    sleep: Option<Pin<Box<Sleep>>>,
+    started_at: Option<Instant>,
+    blocked_user: bool,
+    blocked_cidr: bool,
+}
+
 impl<S> StatsIo<S> {
     fn new(
         inner: S,
@@ -225,6 +238,28 @@ impl<S> StatsIo<S> {
         quota_limit: Option<u64>,
         quota_exceeded: Arc<AtomicBool>,
         epoch: Instant,
+    ) -> Self {
+        Self::new_with_traffic_lease(
+            inner,
+            counters,
+            stats,
+            user,
+            None,
+            quota_limit,
+            quota_exceeded,
+            epoch,
+        )
+    }
+
+    fn new_with_traffic_lease(
+        inner: S,
+        counters: Arc<SharedCounters>,
+        stats: Arc<Stats>,
+        user: String,
+        traffic_lease: Option<Arc<TrafficLease>>,
+        quota_limit: Option<u64>,
+        quota_exceeded: Arc<AtomicBool>,
+        epoch: Instant,
     ) -> Self {
         // Mark initial activity so the watchdog doesn't fire before data flows
         counters.touch(Instant::now(), epoch);
@@ -235,12 +270,88 @@ impl<S> StatsIo<S> {
             stats,
             user,
             user_stats,
+            traffic_lease,
+            c2s_rate_debt_bytes: 0,
+            c2s_wait: RateWaitState::default(),
+            s2c_wait: RateWaitState::default(),
             quota_limit,
             quota_exceeded,
             quota_bytes_since_check: 0,
             epoch,
         }
     }
+
+    fn record_wait(
+        wait: &mut RateWaitState,
+        lease: Option<&Arc<TrafficLease>>,
+        direction: RateDirection,
+    ) {
+        let Some(started_at) = wait.started_at.take() else {
+            return;
+        };
+        let wait_ms = started_at.elapsed().as_millis().min(u128::from(u64::MAX)) as u64;
+        if let Some(lease) = lease {
+            lease.observe_wait_ms(direction, wait.blocked_user, wait.blocked_cidr, wait_ms);
+        }
+        wait.blocked_user = false;
+        wait.blocked_cidr = false;
+    }
+
+    fn arm_wait(wait: &mut RateWaitState, blocked_user: bool, blocked_cidr: bool) {
+        if wait.sleep.is_none() {
+            wait.sleep = Some(Box::pin(tokio::time::sleep(next_refill_delay())));
+            wait.started_at = Some(Instant::now());
+        }
+        wait.blocked_user |= blocked_user;
+        wait.blocked_cidr |= blocked_cidr;
+    }
+
+    fn poll_wait(
+        wait: &mut RateWaitState,
+        cx: &mut Context<'_>,
+        lease: Option<&Arc<TrafficLease>>,
+        direction: RateDirection,
+    ) -> Poll<()> {
+        let Some(sleep) = wait.sleep.as_mut() else {
+            return Poll::Ready(());
+        };
+        if sleep.as_mut().poll(cx).is_pending() {
+            return Poll::Pending;
+        }
+        wait.sleep = None;
+        Self::record_wait(wait, lease, direction);
+        Poll::Ready(())
+    }
+
+    fn settle_c2s_rate_debt(&mut self, cx: &mut Context<'_>) -> Poll<()> {
+        let Some(lease) = self.traffic_lease.as_ref() else {
+            self.c2s_rate_debt_bytes = 0;
+            return Poll::Ready(());
+        };
+
+        while self.c2s_rate_debt_bytes > 0 {
+            let consume = lease.try_consume(RateDirection::Up, self.c2s_rate_debt_bytes);
+            if consume.granted > 0 {
+                self.c2s_rate_debt_bytes = self.c2s_rate_debt_bytes.saturating_sub(consume.granted);
+                continue;
+            }
+            Self::arm_wait(
+                &mut self.c2s_wait,
+                consume.blocked_user,
+                consume.blocked_cidr,
+            );
+            if Self::poll_wait(&mut self.c2s_wait, cx, Some(lease), RateDirection::Up).is_pending()
+            {
+                return Poll::Pending;
+            }
+        }
+
+        if Self::poll_wait(&mut self.c2s_wait, cx, Some(lease), RateDirection::Up).is_pending() {
+            return Poll::Pending;
+        }
+
+        Poll::Ready(())
+    }
 }

 #[derive(Debug)]
@@ -286,6 +397,25 @@ fn should_immediate_quota_check(remaining_before: u64, charge_bytes: u64) -> boo
     remaining_before <= QUOTA_NEAR_LIMIT_BYTES || charge_bytes >= QUOTA_LARGE_CHARGE_BYTES
 }

+fn refund_reserved_quota_bytes(user_stats: &UserStats, reserved_bytes: u64) {
+    if reserved_bytes == 0 {
+        return;
+    }
+    let mut current = user_stats.quota_used.load(Ordering::Relaxed);
+    loop {
+        let next = current.saturating_sub(reserved_bytes);
+        match user_stats.quota_used.compare_exchange_weak(
+            current,
+            next,
+            Ordering::Relaxed,
+            Ordering::Relaxed,
+        ) {
+            Ok(_) => return,
+            Err(observed) => current = observed,
+        }
+    }
+}
+
 impl<S: AsyncRead + Unpin> AsyncRead for StatsIo<S> {
     fn poll_read(
         self: Pin<&mut Self>,
@@ -296,6 +426,9 @@ impl<S: AsyncRead + Unpin> AsyncRead for StatsIo<S> {
         if this.quota_exceeded.load(Ordering::Acquire) {
             return Poll::Ready(Err(quota_io_error()));
         }
+        if this.settle_c2s_rate_debt(cx).is_pending() {
+            return Poll::Pending;
+        }

         let mut remaining_before = None;
         if let Some(limit) = this.quota_limit {
@@ -377,6 +510,11 @@ impl<S: AsyncRead + Unpin> AsyncRead for StatsIo<S> {
                 .add_user_octets_from_handle(this.user_stats.as_ref(), n_to_charge);
             this.stats
                 .increment_user_msgs_from_handle(this.user_stats.as_ref());
+            if this.traffic_lease.is_some() {
+                this.c2s_rate_debt_bytes =
+                    this.c2s_rate_debt_bytes.saturating_add(n_to_charge);
+                let _ = this.settle_c2s_rate_debt(cx);
+            }

             trace!(user = %this.user, bytes = n, "C->S");
         }
@@ -398,28 +536,66 @@ impl<S: AsyncWrite + Unpin> AsyncWrite for StatsIo<S> {
             return Poll::Ready(Err(quota_io_error()));
         }

+        let mut shaper_reserved_bytes = 0u64;
+        let mut write_buf = buf;
+        if let Some(lease) = this.traffic_lease.as_ref() {
+            if !buf.is_empty() {
+                loop {
+                    let consume = lease.try_consume(RateDirection::Down, buf.len() as u64);
+                    if consume.granted > 0 {
+                        shaper_reserved_bytes = consume.granted;
+                        if consume.granted < buf.len() as u64 {
+                            write_buf = &buf[..consume.granted as usize];
+                        }
+                        let _ = Self::poll_wait(
+                            &mut this.s2c_wait,
+                            cx,
+                            Some(lease),
+                            RateDirection::Down,
+                        );
+                        break;
+                    }
+
+                    Self::arm_wait(
+                        &mut this.s2c_wait,
+                        consume.blocked_user,
+                        consume.blocked_cidr,
+                    );
+                    if Self::poll_wait(&mut this.s2c_wait, cx, Some(lease), RateDirection::Down)
+                        .is_pending()
+                    {
+                        return Poll::Pending;
+                    }
+                }
+            } else {
+                let _ = Self::poll_wait(&mut this.s2c_wait, cx, Some(lease), RateDirection::Down);
+            }
+        }
+
         let mut remaining_before = None;
         let mut reserved_bytes = 0u64;
-        let mut write_buf = buf;
         if let Some(limit) = this.quota_limit {
-            if !buf.is_empty() {
+            if !write_buf.is_empty() {
                 let mut reserve_rounds = 0usize;
                 while reserved_bytes == 0 {
                     let used_before = this.user_stats.quota_used();
                     let remaining = limit.saturating_sub(used_before);
                     if remaining == 0 {
+                        if let Some(lease) = this.traffic_lease.as_ref() {
+                            lease.refund(RateDirection::Down, shaper_reserved_bytes);
+                        }
                         this.quota_exceeded.store(true, Ordering::Release);
                         return Poll::Ready(Err(quota_io_error()));
                     }
                     remaining_before = Some(remaining);

-                    let desired = remaining.min(buf.len() as u64);
+                    let desired = remaining.min(write_buf.len() as u64);
                     let mut saw_contention = false;
                     for _ in 0..QUOTA_RESERVE_SPIN_RETRIES {
                         match this.user_stats.quota_try_reserve(desired, limit) {
                             Ok(_) => {
                                 reserved_bytes = desired;
-                                write_buf = &buf[..desired as usize];
+                                write_buf = &write_buf[..desired as usize];
                                 break;
                             }
                             Err(crate::stats::QuotaReserveError::LimitExceeded) => {
@@ -434,6 +610,9 @@ impl<S: AsyncWrite + Unpin> AsyncWrite for StatsIo<S> {
                     if reserved_bytes == 0 {
                         reserve_rounds = reserve_rounds.saturating_add(1);
                         if reserve_rounds >= QUOTA_RESERVE_MAX_ROUNDS {
+                            if let Some(lease) = this.traffic_lease.as_ref() {
+                                lease.refund(RateDirection::Down, shaper_reserved_bytes);
+                            }
                             this.quota_exceeded.store(true, Ordering::Release);
                             return Poll::Ready(Err(quota_io_error()));
                         }
@@ -446,6 +625,9 @@ impl<S: AsyncWrite + Unpin> AsyncWrite for StatsIo<S> {
             let used_before = this.user_stats.quota_used();
             let remaining = limit.saturating_sub(used_before);
             if remaining == 0 {
+                if let Some(lease) = this.traffic_lease.as_ref() {
+                    lease.refund(RateDirection::Down, shaper_reserved_bytes);
+                }
                 this.quota_exceeded.store(true, Ordering::Release);
                 return Poll::Ready(Err(quota_io_error()));
             }
@@ -456,23 +638,20 @@ impl<S: AsyncWrite + Unpin> AsyncWrite for StatsIo<S> {
         match Pin::new(&mut this.inner).poll_write(cx, write_buf) {
             Poll::Ready(Ok(n)) => {
                 if reserved_bytes > n as u64 {
-                    let refund = reserved_bytes - n as u64;
-                    let mut current = this.user_stats.quota_used.load(Ordering::Relaxed);
-                    loop {
-                        let next = current.saturating_sub(refund);
-                        match this.user_stats.quota_used.compare_exchange_weak(
-                            current,
-                            next,
-                            Ordering::Relaxed,
-                            Ordering::Relaxed,
-                        ) {
-                            Ok(_) => break,
-                            Err(observed) => current = observed,
-                        }
-                    }
+                    refund_reserved_quota_bytes(
+                        this.user_stats.as_ref(),
+                        reserved_bytes - n as u64,
+                    );
                 }
+                if shaper_reserved_bytes > n as u64
+                    && let Some(lease) = this.traffic_lease.as_ref()
+                {
+                    lease.refund(RateDirection::Down, shaper_reserved_bytes - n as u64);
+                }
-                }

                 if n > 0 {
+                    if let Some(lease) = this.traffic_lease.as_ref() {
+                        Self::record_wait(&mut this.s2c_wait, Some(lease), RateDirection::Down);
+                    }
                     let n_to_charge = n as u64;

                     // S→C: data written to client
@@ -512,37 +691,23 @@ impl<S: AsyncWrite + Unpin> AsyncWrite for StatsIo<S> {
             }
             Poll::Ready(Err(err)) => {
                 if reserved_bytes > 0 {
-                    let mut current = this.user_stats.quota_used.load(Ordering::Relaxed);
-                    loop {
-                        let next = current.saturating_sub(reserved_bytes);
-                        match this.user_stats.quota_used.compare_exchange_weak(
-                            current,
-                            next,
-                            Ordering::Relaxed,
-                            Ordering::Relaxed,
-                        ) {
-                            Ok(_) => break,
-                            Err(observed) => current = observed,
-                        }
-                    }
+                    refund_reserved_quota_bytes(this.user_stats.as_ref(), reserved_bytes);
                 }
+                if shaper_reserved_bytes > 0
+                    && let Some(lease) = this.traffic_lease.as_ref()
+                {
+                    lease.refund(RateDirection::Down, shaper_reserved_bytes);
                 }
                 Poll::Ready(Err(err))
             }
             Poll::Pending => {
                 if reserved_bytes > 0 {
-                    let mut current = this.user_stats.quota_used.load(Ordering::Relaxed);
-                    loop {
-                        let next = current.saturating_sub(reserved_bytes);
-                        match this.user_stats.quota_used.compare_exchange_weak(
-                            current,
-                            next,
-                            Ordering::Relaxed,
-                            Ordering::Relaxed,
-                        ) {
-                            Ok(_) => break,
-                            Err(observed) => current = observed,
-                        }
-                    }
+                    refund_reserved_quota_bytes(this.user_stats.as_ref(), reserved_bytes);
                 }
+                if shaper_reserved_bytes > 0
+                    && let Some(lease) = this.traffic_lease.as_ref()
+                {
+                    lease.refund(RateDirection::Down, shaper_reserved_bytes);
                 }
                 Poll::Pending
             }
@@ -627,6 +792,43 @@ pub async fn relay_bidirectional_with_activity_timeout<CR, CW, SR, SW>(
     _buffer_pool: Arc<BufferPool>,
     activity_timeout: Duration,
 ) -> Result<()>
+where
+    CR: AsyncRead + Unpin + Send + 'static,
+    CW: AsyncWrite + Unpin + Send + 'static,
+    SR: AsyncRead + Unpin + Send + 'static,
+    SW: AsyncWrite + Unpin + Send + 'static,
+{
+    relay_bidirectional_with_activity_timeout_and_lease(
+        client_reader,
+        client_writer,
+        server_reader,
+        server_writer,
+        c2s_buf_size,
+        s2c_buf_size,
+        user,
+        stats,
+        quota_limit,
+        _buffer_pool,
+        None,
+        activity_timeout,
+    )
+    .await
+}
+
+pub async fn relay_bidirectional_with_activity_timeout_and_lease<CR, CW, SR, SW>(
+    client_reader: CR,
+    client_writer: CW,
+    server_reader: SR,
+    server_writer: SW,
+    c2s_buf_size: usize,
+    s2c_buf_size: usize,
+    user: &str,
+    stats: Arc<Stats>,
+    quota_limit: Option<u64>,
+    _buffer_pool: Arc<BufferPool>,
+    traffic_lease: Option<Arc<TrafficLease>>,
+    activity_timeout: Duration,
+) -> Result<()>
 where
     CR: AsyncRead + Unpin + Send + 'static,
     CW: AsyncWrite + Unpin + Send + 'static,
@@ -644,11 +846,12 @@ where
     let mut server = CombinedStream::new(server_reader, server_writer);

     // Wrap client with stats/activity tracking
-    let mut client = StatsIo::new(
+    let mut client = StatsIo::new_with_traffic_lease(
         client_combined,
         Arc::clone(&counters),
         Arc::clone(&stats),
         user_owned.clone(),
+        traffic_lease,
         quota_limit,
         Arc::clone(&quota_exceeded),
         epoch,
@@ -10,6 +10,7 @@ use tokio::sync::mpsc;

 use crate::proxy::handshake::{AuthProbeSaturationState, AuthProbeState};
 use crate::proxy::middle_relay::{DesyncDedupRotationState, RelayIdleCandidateRegistry};
+use crate::proxy::traffic_limiter::TrafficLimiter;

 const HANDSHAKE_RECENT_USER_RING_LEN: usize = 64;

@@ -65,6 +66,7 @@ pub(crate) struct MiddleRelaySharedState {
 pub(crate) struct ProxySharedState {
     pub(crate) handshake: HandshakeSharedState,
     pub(crate) middle_relay: MiddleRelaySharedState,
+    pub(crate) traffic_limiter: Arc<TrafficLimiter>,
     pub(crate) conntrack_pressure_active: AtomicBool,
     pub(crate) conntrack_close_tx: Mutex<Option<mpsc::Sender<ConntrackCloseEvent>>>,
 }
@@ -98,6 +100,7 @@ impl ProxySharedState {
                 relay_idle_registry: Mutex::new(RelayIdleCandidateRegistry::default()),
                 relay_idle_mark_seq: AtomicU64::new(0),
             },
+            traffic_limiter: TrafficLimiter::new(),
             conntrack_pressure_active: AtomicBool::new(false),
             conntrack_close_tx: Mutex::new(None),
         })
src/proxy/traffic_limiter.rs (new file, 853 lines)
@@ -0,0 +1,853 @@
+use std::collections::{HashMap, HashSet};
+use std::hash::{Hash, Hasher};
+use std::net::IpAddr;
+use std::sync::Arc;
+use std::sync::OnceLock;
+use std::sync::atomic::{AtomicU64, Ordering};
+use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};
+
+use arc_swap::ArcSwap;
+use dashmap::DashMap;
+use ipnetwork::IpNetwork;
+
+use crate::config::RateLimitBps;
+
+const REGISTRY_SHARDS: usize = 64;
+const FAIR_EPOCH_MS: u64 = 20;
+const MAX_BORROW_CHUNK_BYTES: u64 = 32 * 1024;
+const CLEANUP_INTERVAL_SECS: u64 = 60;
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum RateDirection {
+    Up,
+    Down,
+}
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub struct TrafficConsumeResult {
+    pub granted: u64,
+    pub blocked_user: bool,
+    pub blocked_cidr: bool,
+}
+
+#[derive(Debug, Clone, Copy)]
+pub struct TrafficLimiterMetricsSnapshot {
+    pub user_throttle_up_total: u64,
+    pub user_throttle_down_total: u64,
+    pub cidr_throttle_up_total: u64,
+    pub cidr_throttle_down_total: u64,
+    pub user_wait_up_ms_total: u64,
+    pub user_wait_down_ms_total: u64,
+    pub cidr_wait_up_ms_total: u64,
+    pub cidr_wait_down_ms_total: u64,
+    pub user_active_leases: u64,
+    pub cidr_active_leases: u64,
+    pub user_policy_entries: u64,
+    pub cidr_policy_entries: u64,
+}
+
+#[derive(Default)]
+struct ScopeMetrics {
+    throttle_up_total: AtomicU64,
+    throttle_down_total: AtomicU64,
+    wait_up_ms_total: AtomicU64,
+    wait_down_ms_total: AtomicU64,
+    active_leases: AtomicU64,
+    policy_entries: AtomicU64,
+}
+
+impl ScopeMetrics {
+    fn throttle(&self, direction: RateDirection) {
+        match direction {
+            RateDirection::Up => {
+                self.throttle_up_total.fetch_add(1, Ordering::Relaxed);
+            }
+            RateDirection::Down => {
+                self.throttle_down_total.fetch_add(1, Ordering::Relaxed);
+            }
+        }
+    }
+
+    fn wait_ms(&self, direction: RateDirection, wait_ms: u64) {
+        match direction {
+            RateDirection::Up => {
+                self.wait_up_ms_total.fetch_add(wait_ms, Ordering::Relaxed);
+            }
+            RateDirection::Down => {
+                self.wait_down_ms_total
+                    .fetch_add(wait_ms, Ordering::Relaxed);
+            }
+        }
+    }
+}
+
+#[derive(Default)]
+struct AtomicRatePair {
+    up_bps: AtomicU64,
+    down_bps: AtomicU64,
+}
+
+impl AtomicRatePair {
+    fn set(&self, limits: RateLimitBps) {
+        self.up_bps.store(limits.up_bps, Ordering::Relaxed);
+        self.down_bps.store(limits.down_bps, Ordering::Relaxed);
+    }
+
+    fn get(&self, direction: RateDirection) -> u64 {
+        match direction {
+            RateDirection::Up => self.up_bps.load(Ordering::Relaxed),
+            RateDirection::Down => self.down_bps.load(Ordering::Relaxed),
+        }
+    }
+}
+
+#[derive(Default)]
+struct DirectionBucket {
+    epoch: AtomicU64,
+    used: AtomicU64,
+}
+
+impl DirectionBucket {
+    fn sync_epoch(&self, epoch: u64) {
+        let current = self.epoch.load(Ordering::Relaxed);
+        if current == epoch {
+            return;
+        }
+        if current < epoch
+            && self
+                .epoch
+                .compare_exchange(current, epoch, Ordering::Relaxed, Ordering::Relaxed)
+                .is_ok()
+        {
+            self.used.store(0, Ordering::Relaxed);
+        }
+    }
+
+    fn try_consume(&self, cap_bps: u64, requested: u64) -> u64 {
+        if requested == 0 {
+            return 0;
+        }
+        if cap_bps == 0 {
+            return requested;
+        }
+
+        let epoch = current_epoch();
+        self.sync_epoch(epoch);
+        let cap_epoch = bytes_per_epoch(cap_bps);
+
+        loop {
+            let used = self.used.load(Ordering::Relaxed);
+            if used >= cap_epoch {
+                return 0;
+            }
+            let remaining = cap_epoch.saturating_sub(used);
+            let grant = requested.min(remaining);
+            if grant == 0 {
+                return 0;
+            }
+            let next = used.saturating_add(grant);
+            if self
+                .used
+                .compare_exchange_weak(used, next, Ordering::Relaxed, Ordering::Relaxed)
+                .is_ok()
+            {
+                return grant;
+            }
+        }
+    }
+
+    fn refund(&self, bytes: u64) {
+        if bytes == 0 {
+            return;
+        }
+        decrement_atomic_saturating(&self.used, bytes);
+    }
+}
+
+struct UserBucket {
+    rates: AtomicRatePair,
+    up: DirectionBucket,
+    down: DirectionBucket,
+    active_leases: AtomicU64,
+}
+
+impl UserBucket {
+    fn new(limits: RateLimitBps) -> Self {
+        let rates = AtomicRatePair::default();
+        rates.set(limits);
+        Self {
+            rates,
+            up: DirectionBucket::default(),
+            down: DirectionBucket::default(),
+            active_leases: AtomicU64::new(0),
+        }
+    }
+
+    fn set_rates(&self, limits: RateLimitBps) {
+        self.rates.set(limits);
+    }
+
+    fn try_consume(&self, direction: RateDirection, requested: u64) -> u64 {
+        let cap_bps = self.rates.get(direction);
+        match direction {
+            RateDirection::Up => self.up.try_consume(cap_bps, requested),
+            RateDirection::Down => self.down.try_consume(cap_bps, requested),
+        }
+    }
+
+    fn refund(&self, direction: RateDirection, bytes: u64) {
+        match direction {
+            RateDirection::Up => self.up.refund(bytes),
+            RateDirection::Down => self.down.refund(bytes),
+        }
+    }
+}
+
+#[derive(Default)]
+struct CidrDirectionBucket {
+    epoch: AtomicU64,
+    used: AtomicU64,
+    active_users: AtomicU64,
+}
+
+impl CidrDirectionBucket {
+    fn sync_epoch(&self, epoch: u64) {
+        let current = self.epoch.load(Ordering::Relaxed);
+        if current == epoch {
+            return;
+        }
+        if current < epoch
+            && self
+                .epoch
+                .compare_exchange(current, epoch, Ordering::Relaxed, Ordering::Relaxed)
+                .is_ok()
+        {
+            self.used.store(0, Ordering::Relaxed);
+            self.active_users.store(0, Ordering::Relaxed);
+        }
+    }
+
+    fn try_consume(
+        &self,
+        user_state: &CidrUserDirectionState,
+        cap_epoch: u64,
+        requested: u64,
+    ) -> u64 {
+        if requested == 0 || cap_epoch == 0 {
+            return 0;
+        }
+
+        let epoch = current_epoch();
+        self.sync_epoch(epoch);
+        user_state.sync_epoch_and_mark_active(epoch, &self.active_users);
+        let active_users = self.active_users.load(Ordering::Relaxed).max(1);
+        let fair_share = cap_epoch.saturating_div(active_users).max(1);
+
+        loop {
+            let total_used = self.used.load(Ordering::Relaxed);
+            if total_used >= cap_epoch {
+                return 0;
+            }
+            let total_remaining = cap_epoch.saturating_sub(total_used);
+            let user_used = user_state.used.load(Ordering::Relaxed);
+            let guaranteed_remaining = fair_share.saturating_sub(user_used);
+
+            let grant = if guaranteed_remaining > 0 {
+                requested.min(guaranteed_remaining).min(total_remaining)
+            } else {
+                requested.min(total_remaining).min(MAX_BORROW_CHUNK_BYTES)
+            };
+
+            if grant == 0 {
+                return 0;
+            }
+
+            let next_total = total_used.saturating_add(grant);
+            if self
+                .used
+                .compare_exchange_weak(total_used, next_total, Ordering::Relaxed, Ordering::Relaxed)
+                .is_ok()
+            {
+                user_state.used.fetch_add(grant, Ordering::Relaxed);
+                return grant;
+            }
+        }
+    }
+
+    fn refund(&self, bytes: u64) {
+        if bytes == 0 {
+            return;
+        }
+        decrement_atomic_saturating(&self.used, bytes);
+    }
+}
+
|
#[derive(Default)]
|
||||||
|
struct CidrUserDirectionState {
|
||||||
|
epoch: AtomicU64,
|
||||||
|
used: AtomicU64,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl CidrUserDirectionState {
|
||||||
|
fn sync_epoch_and_mark_active(&self, epoch: u64, active_users: &AtomicU64) {
|
||||||
|
let current = self.epoch.load(Ordering::Relaxed);
|
||||||
|
if current == epoch {
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
if current < epoch
|
||||||
|
&& self
|
||||||
|
.epoch
|
||||||
|
.compare_exchange(current, epoch, Ordering::Relaxed, Ordering::Relaxed)
|
||||||
|
.is_ok()
|
||||||
|
{
|
||||||
|
self.used.store(0, Ordering::Relaxed);
|
||||||
|
active_users.fetch_add(1, Ordering::Relaxed);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn refund(&self, bytes: u64) {
|
||||||
|
if bytes == 0 {
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
decrement_atomic_saturating(&self.used, bytes);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
struct CidrUserShare {
    active_conns: AtomicU64,
    up: CidrUserDirectionState,
    down: CidrUserDirectionState,
}

impl CidrUserShare {
    fn new() -> Self {
        Self {
            active_conns: AtomicU64::new(0),
            up: CidrUserDirectionState::default(),
            down: CidrUserDirectionState::default(),
        }
    }
}

struct CidrBucket {
    rates: AtomicRatePair,
    up: CidrDirectionBucket,
    down: CidrDirectionBucket,
    users: ShardedRegistry<CidrUserShare>,
    active_leases: AtomicU64,
}

impl CidrBucket {
    fn new(limits: RateLimitBps) -> Self {
        let rates = AtomicRatePair::default();
        rates.set(limits);
        Self {
            rates,
            up: CidrDirectionBucket::default(),
            down: CidrDirectionBucket::default(),
            users: ShardedRegistry::new(REGISTRY_SHARDS),
            active_leases: AtomicU64::new(0),
        }
    }

    fn set_rates(&self, limits: RateLimitBps) {
        self.rates.set(limits);
    }

    fn acquire_user_share(&self, user: &str) -> Arc<CidrUserShare> {
        let share = self.users.get_or_insert_with(user, CidrUserShare::new);
        share.active_conns.fetch_add(1, Ordering::Relaxed);
        share
    }

    fn release_user_share(&self, user: &str, share: &Arc<CidrUserShare>) {
        decrement_atomic_saturating(&share.active_conns, 1);
        let share_for_remove = Arc::clone(share);
        let _ = self.users.remove_if(user, |candidate| {
            Arc::ptr_eq(candidate, &share_for_remove)
                && candidate.active_conns.load(Ordering::Relaxed) == 0
        });
    }

    fn try_consume_for_user(
        &self,
        direction: RateDirection,
        share: &CidrUserShare,
        requested: u64,
    ) -> u64 {
        let cap_bps = self.rates.get(direction);
        if cap_bps == 0 {
            return requested;
        }
        let cap_epoch = bytes_per_epoch(cap_bps);
        match direction {
            RateDirection::Up => self.up.try_consume(&share.up, cap_epoch, requested),
            RateDirection::Down => self.down.try_consume(&share.down, cap_epoch, requested),
        }
    }

    fn refund_for_user(&self, direction: RateDirection, share: &CidrUserShare, bytes: u64) {
        match direction {
            RateDirection::Up => {
                self.up.refund(bytes);
                share.up.refund(bytes);
            }
            RateDirection::Down => {
                self.down.refund(bytes);
                share.down.refund(bytes);
            }
        }
    }

    fn cleanup_idle_users(&self) {
        self.users
            .retain(|_, share| share.active_conns.load(Ordering::Relaxed) > 0);
    }
}

#[derive(Clone)]
struct CidrRule {
    key: String,
    cidr: IpNetwork,
    limits: RateLimitBps,
    prefix_len: u8,
}

#[derive(Default)]
struct PolicySnapshot {
    user_limits: HashMap<String, RateLimitBps>,
    cidr_rules_v4: Vec<CidrRule>,
    cidr_rules_v6: Vec<CidrRule>,
    cidr_rule_keys: HashSet<String>,
}

impl PolicySnapshot {
    fn match_cidr(&self, ip: IpAddr) -> Option<&CidrRule> {
        match ip {
            IpAddr::V4(_) => self
                .cidr_rules_v4
                .iter()
                .find(|rule| rule.cidr.contains(ip)),
            IpAddr::V6(_) => self
                .cidr_rules_v6
                .iter()
                .find(|rule| rule.cidr.contains(ip)),
        }
    }
}

struct ShardedRegistry<T> {
    shards: Box<[DashMap<String, Arc<T>>]>,
    mask: usize,
}

impl<T> ShardedRegistry<T> {
    fn new(shards: usize) -> Self {
        let shard_count = shards.max(1).next_power_of_two();
        let mut items = Vec::with_capacity(shard_count);
        for _ in 0..shard_count {
            items.push(DashMap::<String, Arc<T>>::new());
        }
        Self {
            shards: items.into_boxed_slice(),
            mask: shard_count.saturating_sub(1),
        }
    }

    fn shard_index(&self, key: &str) -> usize {
        let mut hasher = std::collections::hash_map::DefaultHasher::new();
        key.hash(&mut hasher);
        (hasher.finish() as usize) & self.mask
    }

    fn get_or_insert_with<F>(&self, key: &str, make: F) -> Arc<T>
    where
        F: FnOnce() -> T,
    {
        let shard = &self.shards[self.shard_index(key)];
        match shard.entry(key.to_string()) {
            dashmap::mapref::entry::Entry::Occupied(entry) => Arc::clone(entry.get()),
            dashmap::mapref::entry::Entry::Vacant(slot) => {
                let value = Arc::new(make());
                slot.insert(Arc::clone(&value));
                value
            }
        }
    }

    fn retain<F>(&self, predicate: F)
    where
        F: Fn(&String, &Arc<T>) -> bool + Copy,
    {
        for shard in &*self.shards {
            shard.retain(|key, value| predicate(key, value));
        }
    }

    fn remove_if<F>(&self, key: &str, predicate: F) -> bool
    where
        F: Fn(&Arc<T>) -> bool,
    {
        let shard = &self.shards[self.shard_index(key)];
        let should_remove = match shard.get(key) {
            Some(entry) => predicate(entry.value()),
            None => false,
        };
        if !should_remove {
            return false;
        }
        shard.remove(key).is_some()
    }
}

pub struct TrafficLease {
    limiter: Arc<TrafficLimiter>,
    user_bucket: Option<Arc<UserBucket>>,
    cidr_bucket: Option<Arc<CidrBucket>>,
    cidr_user_key: Option<String>,
    cidr_user_share: Option<Arc<CidrUserShare>>,
}

impl TrafficLease {
    pub fn try_consume(&self, direction: RateDirection, requested: u64) -> TrafficConsumeResult {
        if requested == 0 {
            return TrafficConsumeResult {
                granted: 0,
                blocked_user: false,
                blocked_cidr: false,
            };
        }

        let mut granted = requested;
        if let Some(user_bucket) = self.user_bucket.as_ref() {
            let user_granted = user_bucket.try_consume(direction, granted);
            if user_granted == 0 {
                self.limiter.observe_throttle(direction, true, false);
                return TrafficConsumeResult {
                    granted: 0,
                    blocked_user: true,
                    blocked_cidr: false,
                };
            }
            granted = user_granted;
        }

        if let (Some(cidr_bucket), Some(cidr_user_share)) =
            (self.cidr_bucket.as_ref(), self.cidr_user_share.as_ref())
        {
            let cidr_granted =
                cidr_bucket.try_consume_for_user(direction, cidr_user_share, granted);
            if cidr_granted < granted
                && let Some(user_bucket) = self.user_bucket.as_ref()
            {
                user_bucket.refund(direction, granted.saturating_sub(cidr_granted));
            }
            if cidr_granted == 0 {
                self.limiter.observe_throttle(direction, false, true);
                return TrafficConsumeResult {
                    granted: 0,
                    blocked_user: false,
                    blocked_cidr: true,
                };
            }
            granted = cidr_granted;
        }

        TrafficConsumeResult {
            granted,
            blocked_user: false,
            blocked_cidr: false,
        }
    }

    pub fn refund(&self, direction: RateDirection, bytes: u64) {
        if bytes == 0 {
            return;
        }

        if let Some(user_bucket) = self.user_bucket.as_ref() {
            user_bucket.refund(direction, bytes);
        }
        if let (Some(cidr_bucket), Some(cidr_user_share)) =
            (self.cidr_bucket.as_ref(), self.cidr_user_share.as_ref())
        {
            cidr_bucket.refund_for_user(direction, cidr_user_share, bytes);
        }
    }

    pub fn observe_wait_ms(
        &self,
        direction: RateDirection,
        blocked_user: bool,
        blocked_cidr: bool,
        wait_ms: u64,
    ) {
        if wait_ms == 0 {
            return;
        }
        self.limiter
            .observe_wait(direction, blocked_user, blocked_cidr, wait_ms);
    }
}

impl Drop for TrafficLease {
    fn drop(&mut self) {
        if let Some(bucket) = self.user_bucket.as_ref() {
            decrement_atomic_saturating(&bucket.active_leases, 1);
            decrement_atomic_saturating(&self.limiter.user_scope.active_leases, 1);
        }

        if let Some(bucket) = self.cidr_bucket.as_ref() {
            if let (Some(user_key), Some(share)) =
                (self.cidr_user_key.as_ref(), self.cidr_user_share.as_ref())
            {
                bucket.release_user_share(user_key, share);
            }
            decrement_atomic_saturating(&bucket.active_leases, 1);
            decrement_atomic_saturating(&self.limiter.cidr_scope.active_leases, 1);
        }
    }
}

pub struct TrafficLimiter {
    policy: ArcSwap<PolicySnapshot>,
    user_buckets: ShardedRegistry<UserBucket>,
    cidr_buckets: ShardedRegistry<CidrBucket>,
    user_scope: ScopeMetrics,
    cidr_scope: ScopeMetrics,
    last_cleanup_epoch_secs: AtomicU64,
}

impl TrafficLimiter {
    pub fn new() -> Arc<Self> {
        Arc::new(Self {
            policy: ArcSwap::from_pointee(PolicySnapshot::default()),
            user_buckets: ShardedRegistry::new(REGISTRY_SHARDS),
            cidr_buckets: ShardedRegistry::new(REGISTRY_SHARDS),
            user_scope: ScopeMetrics::default(),
            cidr_scope: ScopeMetrics::default(),
            last_cleanup_epoch_secs: AtomicU64::new(0),
        })
    }

    pub fn apply_policy(
        &self,
        user_limits: HashMap<String, RateLimitBps>,
        cidr_limits: HashMap<IpNetwork, RateLimitBps>,
    ) {
        let filtered_users = user_limits
            .into_iter()
            .filter(|(_, limit)| limit.up_bps > 0 || limit.down_bps > 0)
            .collect::<HashMap<_, _>>();

        let mut cidr_rules_v4 = Vec::new();
        let mut cidr_rules_v6 = Vec::new();
        let mut cidr_rule_keys = HashSet::new();
        for (cidr, limits) in cidr_limits {
            if limits.up_bps == 0 && limits.down_bps == 0 {
                continue;
            }
            let key = cidr.to_string();
            let rule = CidrRule {
                key: key.clone(),
                cidr,
                limits,
                prefix_len: cidr.prefix(),
            };
            cidr_rule_keys.insert(key);
            match rule.cidr {
                IpNetwork::V4(_) => cidr_rules_v4.push(rule),
                IpNetwork::V6(_) => cidr_rules_v6.push(rule),
            }
        }

        cidr_rules_v4.sort_by(|a, b| b.prefix_len.cmp(&a.prefix_len));
        cidr_rules_v6.sort_by(|a, b| b.prefix_len.cmp(&a.prefix_len));

        self.user_scope
            .policy_entries
            .store(filtered_users.len() as u64, Ordering::Relaxed);
        self.cidr_scope
            .policy_entries
            .store(cidr_rule_keys.len() as u64, Ordering::Relaxed);

        self.policy.store(Arc::new(PolicySnapshot {
            user_limits: filtered_users,
            cidr_rules_v4,
            cidr_rules_v6,
            cidr_rule_keys,
        }));

        self.maybe_cleanup();
    }

    pub fn acquire_lease(
        self: &Arc<Self>,
        user: &str,
        client_ip: IpAddr,
    ) -> Option<Arc<TrafficLease>> {
        let policy = self.policy.load_full();
        let mut user_bucket = None;
        if let Some(limit) = policy.user_limits.get(user).copied() {
            let bucket = self
                .user_buckets
                .get_or_insert_with(user, || UserBucket::new(limit));
            bucket.set_rates(limit);
            bucket.active_leases.fetch_add(1, Ordering::Relaxed);
            self.user_scope
                .active_leases
                .fetch_add(1, Ordering::Relaxed);
            user_bucket = Some(bucket);
        }

        let mut cidr_bucket = None;
        let mut cidr_user_key = None;
        let mut cidr_user_share = None;
        if let Some(rule) = policy.match_cidr(client_ip) {
            let bucket = self
                .cidr_buckets
                .get_or_insert_with(rule.key.as_str(), || CidrBucket::new(rule.limits));
            bucket.set_rates(rule.limits);
            bucket.active_leases.fetch_add(1, Ordering::Relaxed);
            self.cidr_scope
                .active_leases
                .fetch_add(1, Ordering::Relaxed);
            let share = bucket.acquire_user_share(user);
            cidr_user_key = Some(user.to_string());
            cidr_user_share = Some(share);
            cidr_bucket = Some(bucket);
        }

        if user_bucket.is_none() && cidr_bucket.is_none() {
            return None;
        }

        self.maybe_cleanup();
        Some(Arc::new(TrafficLease {
            limiter: Arc::clone(self),
            user_bucket,
            cidr_bucket,
            cidr_user_key,
            cidr_user_share,
        }))
    }

    pub fn metrics_snapshot(&self) -> TrafficLimiterMetricsSnapshot {
        TrafficLimiterMetricsSnapshot {
            user_throttle_up_total: self.user_scope.throttle_up_total.load(Ordering::Relaxed),
            user_throttle_down_total: self.user_scope.throttle_down_total.load(Ordering::Relaxed),
            cidr_throttle_up_total: self.cidr_scope.throttle_up_total.load(Ordering::Relaxed),
            cidr_throttle_down_total: self.cidr_scope.throttle_down_total.load(Ordering::Relaxed),
            user_wait_up_ms_total: self.user_scope.wait_up_ms_total.load(Ordering::Relaxed),
            user_wait_down_ms_total: self.user_scope.wait_down_ms_total.load(Ordering::Relaxed),
            cidr_wait_up_ms_total: self.cidr_scope.wait_up_ms_total.load(Ordering::Relaxed),
            cidr_wait_down_ms_total: self.cidr_scope.wait_down_ms_total.load(Ordering::Relaxed),
            user_active_leases: self.user_scope.active_leases.load(Ordering::Relaxed),
            cidr_active_leases: self.cidr_scope.active_leases.load(Ordering::Relaxed),
            user_policy_entries: self.user_scope.policy_entries.load(Ordering::Relaxed),
            cidr_policy_entries: self.cidr_scope.policy_entries.load(Ordering::Relaxed),
        }
    }

    fn observe_throttle(&self, direction: RateDirection, blocked_user: bool, blocked_cidr: bool) {
        if blocked_user {
            self.user_scope.throttle(direction);
        }
        if blocked_cidr {
            self.cidr_scope.throttle(direction);
        }
    }

    fn observe_wait(
        &self,
        direction: RateDirection,
        blocked_user: bool,
        blocked_cidr: bool,
        wait_ms: u64,
    ) {
        if blocked_user {
            self.user_scope.wait_ms(direction, wait_ms);
        }
        if blocked_cidr {
            self.cidr_scope.wait_ms(direction, wait_ms);
        }
    }

    fn maybe_cleanup(&self) {
        let now_epoch_secs = now_epoch_secs();
        let last = self.last_cleanup_epoch_secs.load(Ordering::Relaxed);
        if now_epoch_secs.saturating_sub(last) < CLEANUP_INTERVAL_SECS {
            return;
        }
        if self
            .last_cleanup_epoch_secs
            .compare_exchange(last, now_epoch_secs, Ordering::Relaxed, Ordering::Relaxed)
            .is_err()
        {
            return;
        }

        let policy = self.policy.load_full();
        self.user_buckets.retain(|user, bucket| {
            bucket.active_leases.load(Ordering::Relaxed) > 0
                || policy.user_limits.contains_key(user)
        });
        self.cidr_buckets.retain(|cidr_key, bucket| {
            bucket.cleanup_idle_users();
            bucket.active_leases.load(Ordering::Relaxed) > 0
                || policy.cidr_rule_keys.contains(cidr_key)
        });
    }
}

pub fn next_refill_delay() -> Duration {
    let start = limiter_epoch_start();
    let elapsed_ms = start.elapsed().as_millis() as u64;
    let epoch_pos = elapsed_ms % FAIR_EPOCH_MS;
    let wait_ms = FAIR_EPOCH_MS.saturating_sub(epoch_pos).max(1);
    Duration::from_millis(wait_ms)
}

fn decrement_atomic_saturating(counter: &AtomicU64, by: u64) {
    if by == 0 {
        return;
    }
    let mut current = counter.load(Ordering::Relaxed);
    loop {
        if current == 0 {
            return;
        }
        let next = current.saturating_sub(by);
        match counter.compare_exchange_weak(current, next, Ordering::Relaxed, Ordering::Relaxed) {
            Ok(_) => return,
            Err(actual) => current = actual,
        }
    }
}

fn now_epoch_secs() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap_or_default()
        .as_secs()
}

fn bytes_per_epoch(bps: u64) -> u64 {
    if bps == 0 {
        return 0;
    }
    let numerator = bps.saturating_mul(FAIR_EPOCH_MS);
    let bytes = numerator.saturating_div(8_000);
    bytes.max(1)
}

fn current_epoch() -> u64 {
    let start = limiter_epoch_start();
    let elapsed_ms = start.elapsed().as_millis() as u64;
    elapsed_ms / FAIR_EPOCH_MS
}

fn limiter_epoch_start() -> &'static Instant {
    static START: OnceLock<Instant> = OnceLock::new();
    START.get_or_init(Instant::now)
}
121
src/stats/mod.rs
@@ -175,6 +175,18 @@ pub struct Stats {
     me_route_drop_queue_full: AtomicU64,
     me_route_drop_queue_full_base: AtomicU64,
     me_route_drop_queue_full_high: AtomicU64,
+    me_fair_pressure_state_gauge: AtomicU64,
+    me_fair_active_flows_gauge: AtomicU64,
+    me_fair_queued_bytes_gauge: AtomicU64,
+    me_fair_standing_flows_gauge: AtomicU64,
+    me_fair_backpressured_flows_gauge: AtomicU64,
+    me_fair_scheduler_rounds_total: AtomicU64,
+    me_fair_deficit_grants_total: AtomicU64,
+    me_fair_deficit_skips_total: AtomicU64,
+    me_fair_enqueue_rejects_total: AtomicU64,
+    me_fair_shed_drops_total: AtomicU64,
+    me_fair_penalties_total: AtomicU64,
+    me_fair_downstream_stalls_total: AtomicU64,
     me_d2c_batches_total: AtomicU64,
     me_d2c_batch_frames_total: AtomicU64,
     me_d2c_batch_bytes_total: AtomicU64,
@@ -856,6 +868,78 @@ impl Stats {
                 .fetch_add(1, Ordering::Relaxed);
         }
     }
+    pub fn set_me_fair_pressure_state_gauge(&self, value: u64) {
+        if self.telemetry_me_allows_normal() {
+            self.me_fair_pressure_state_gauge
+                .store(value, Ordering::Relaxed);
+        }
+    }
+    pub fn set_me_fair_active_flows_gauge(&self, value: u64) {
+        if self.telemetry_me_allows_normal() {
+            self.me_fair_active_flows_gauge
+                .store(value, Ordering::Relaxed);
+        }
+    }
+    pub fn set_me_fair_queued_bytes_gauge(&self, value: u64) {
+        if self.telemetry_me_allows_normal() {
+            self.me_fair_queued_bytes_gauge
+                .store(value, Ordering::Relaxed);
+        }
+    }
+    pub fn set_me_fair_standing_flows_gauge(&self, value: u64) {
+        if self.telemetry_me_allows_normal() {
+            self.me_fair_standing_flows_gauge
+                .store(value, Ordering::Relaxed);
+        }
+    }
+    pub fn set_me_fair_backpressured_flows_gauge(&self, value: u64) {
+        if self.telemetry_me_allows_normal() {
+            self.me_fair_backpressured_flows_gauge
+                .store(value, Ordering::Relaxed);
+        }
+    }
+    pub fn add_me_fair_scheduler_rounds_total(&self, value: u64) {
+        if self.telemetry_me_allows_normal() && value > 0 {
+            self.me_fair_scheduler_rounds_total
+                .fetch_add(value, Ordering::Relaxed);
+        }
+    }
+    pub fn add_me_fair_deficit_grants_total(&self, value: u64) {
+        if self.telemetry_me_allows_normal() && value > 0 {
+            self.me_fair_deficit_grants_total
+                .fetch_add(value, Ordering::Relaxed);
+        }
+    }
+    pub fn add_me_fair_deficit_skips_total(&self, value: u64) {
+        if self.telemetry_me_allows_normal() && value > 0 {
+            self.me_fair_deficit_skips_total
+                .fetch_add(value, Ordering::Relaxed);
+        }
+    }
+    pub fn add_me_fair_enqueue_rejects_total(&self, value: u64) {
+        if self.telemetry_me_allows_normal() && value > 0 {
+            self.me_fair_enqueue_rejects_total
+                .fetch_add(value, Ordering::Relaxed);
+        }
+    }
+    pub fn add_me_fair_shed_drops_total(&self, value: u64) {
+        if self.telemetry_me_allows_normal() && value > 0 {
+            self.me_fair_shed_drops_total
+                .fetch_add(value, Ordering::Relaxed);
+        }
+    }
+    pub fn add_me_fair_penalties_total(&self, value: u64) {
+        if self.telemetry_me_allows_normal() && value > 0 {
+            self.me_fair_penalties_total
+                .fetch_add(value, Ordering::Relaxed);
+        }
+    }
+    pub fn add_me_fair_downstream_stalls_total(&self, value: u64) {
+        if self.telemetry_me_allows_normal() && value > 0 {
+            self.me_fair_downstream_stalls_total
+                .fetch_add(value, Ordering::Relaxed);
+        }
+    }
     pub fn increment_me_d2c_batches_total(&self) {
         if self.telemetry_me_allows_normal() {
             self.me_d2c_batches_total.fetch_add(1, Ordering::Relaxed);
@@ -1806,6 +1890,43 @@ impl Stats {
     pub fn get_me_route_drop_queue_full_high(&self) -> u64 {
         self.me_route_drop_queue_full_high.load(Ordering::Relaxed)
     }
+    pub fn get_me_fair_pressure_state_gauge(&self) -> u64 {
+        self.me_fair_pressure_state_gauge.load(Ordering::Relaxed)
+    }
+    pub fn get_me_fair_active_flows_gauge(&self) -> u64 {
+        self.me_fair_active_flows_gauge.load(Ordering::Relaxed)
+    }
+    pub fn get_me_fair_queued_bytes_gauge(&self) -> u64 {
+        self.me_fair_queued_bytes_gauge.load(Ordering::Relaxed)
+    }
+    pub fn get_me_fair_standing_flows_gauge(&self) -> u64 {
+        self.me_fair_standing_flows_gauge.load(Ordering::Relaxed)
+    }
+    pub fn get_me_fair_backpressured_flows_gauge(&self) -> u64 {
+        self.me_fair_backpressured_flows_gauge
+            .load(Ordering::Relaxed)
+    }
+    pub fn get_me_fair_scheduler_rounds_total(&self) -> u64 {
+        self.me_fair_scheduler_rounds_total.load(Ordering::Relaxed)
+    }
+    pub fn get_me_fair_deficit_grants_total(&self) -> u64 {
+        self.me_fair_deficit_grants_total.load(Ordering::Relaxed)
+    }
+    pub fn get_me_fair_deficit_skips_total(&self) -> u64 {
+        self.me_fair_deficit_skips_total.load(Ordering::Relaxed)
+    }
+    pub fn get_me_fair_enqueue_rejects_total(&self) -> u64 {
+        self.me_fair_enqueue_rejects_total.load(Ordering::Relaxed)
+    }
+    pub fn get_me_fair_shed_drops_total(&self) -> u64 {
+        self.me_fair_shed_drops_total.load(Ordering::Relaxed)
+    }
+    pub fn get_me_fair_penalties_total(&self) -> u64 {
+        self.me_fair_penalties_total.load(Ordering::Relaxed)
+    }
+    pub fn get_me_fair_downstream_stalls_total(&self) -> u64 {
+        self.me_fair_downstream_stalls_total.load(Ordering::Relaxed)
+    }
     pub fn get_me_d2c_batches_total(&self) -> u64 {
         self.me_d2c_batches_total.load(Ordering::Relaxed)
     }

@@ -11,6 +11,7 @@ use crc32fast::Hasher;
 
 const MIN_APP_DATA: usize = 64;
 const MAX_APP_DATA: usize = MAX_TLS_CIPHERTEXT_SIZE;
+const MAX_TICKET_RECORDS: usize = 4;
 
 fn jitter_and_clamp_sizes(sizes: &[usize], rng: &SecureRandom) -> Vec<usize> {
     sizes
@@ -62,6 +63,53 @@ fn ensure_payload_capacity(mut sizes: Vec<usize>, payload_len: usize) -> Vec<usi
     sizes
 }
+
+fn emulated_app_data_sizes(cached: &CachedTlsData) -> Vec<usize> {
+    match cached.behavior_profile.source {
+        TlsProfileSource::Raw | TlsProfileSource::Merged => {
+            if !cached.behavior_profile.app_data_record_sizes.is_empty() {
+                return cached.behavior_profile.app_data_record_sizes.clone();
+            }
+        }
+        TlsProfileSource::Default | TlsProfileSource::Rustls => {}
+    }
+
+    let mut sizes = cached.app_data_records_sizes.clone();
+    if sizes.is_empty() {
+        sizes.push(cached.total_app_data_len.max(1024));
+    }
+    sizes
+}
+
+fn emulated_change_cipher_spec_count(cached: &CachedTlsData) -> usize {
+    usize::from(cached.behavior_profile.change_cipher_spec_count.max(1))
+}
+
+fn emulated_ticket_record_sizes(
+    cached: &CachedTlsData,
+    new_session_tickets: u8,
+    rng: &SecureRandom,
+) -> Vec<usize> {
+    let mut sizes = match cached.behavior_profile.source {
+        TlsProfileSource::Raw | TlsProfileSource::Merged => {
+            cached.behavior_profile.ticket_record_sizes.clone()
+        }
+        TlsProfileSource::Default | TlsProfileSource::Rustls => Vec::new(),
+    };
+
+    let target_count = sizes
+        .len()
+        .max(usize::from(
+            new_session_tickets.min(MAX_TICKET_RECORDS as u8),
+        ))
+        .min(MAX_TICKET_RECORDS);
+
+    while sizes.len() < target_count {
+        sizes.push(rng.range(48) + 48);
+    }
+
+    sizes
+}
+
 fn build_compact_cert_info_payload(cert_info: &ParsedCertificateInfo) -> Option<Vec<u8>> {
     let mut fields = Vec::new();
 
@@ -180,39 +228,21 @@ pub fn build_emulated_server_hello(
     server_hello.extend_from_slice(&message);
 
     // --- ChangeCipherSpec ---
-    let change_cipher_spec = [
+    let change_cipher_spec_count = emulated_change_cipher_spec_count(cached);
+    let mut change_cipher_spec = Vec::with_capacity(change_cipher_spec_count * 6);
+    for _ in 0..change_cipher_spec_count {
+        change_cipher_spec.extend_from_slice(&[
         TLS_RECORD_CHANGE_CIPHER,
         TLS_VERSION[0],
        TLS_VERSION[1],
         0x00,
         0x01,
         0x01,
-    ];
+        ]);
+    }
 
     // --- ApplicationData (fake encrypted records) ---
-    let sizes = match cached.behavior_profile.source {
-        TlsProfileSource::Raw | TlsProfileSource::Merged => cached
-            .app_data_records_sizes
-            .first()
-            .copied()
-            .or_else(|| {
-                cached
-                    .behavior_profile
-                    .app_data_record_sizes
-                    .first()
-                    .copied()
-            })
-            .map(|size| vec![size])
-            .unwrap_or_else(|| vec![cached.total_app_data_len.max(1024)]),
-        _ => {
-            let mut sizes = cached.app_data_records_sizes.clone();
-            if sizes.is_empty() {
-                sizes.push(cached.total_app_data_len.max(1024));
-            }
-            sizes
-        }
-    };
-    let mut sizes = jitter_and_clamp_sizes(&sizes, rng);
+    let mut sizes = jitter_and_clamp_sizes(&emulated_app_data_sizes(cached), rng);
     let compact_payload = cached
         .cert_info
         .as_ref()
@@ -299,10 +329,7 @@ pub fn build_emulated_server_hello(
     // --- Combine ---
     // Optional NewSessionTicket mimic records (opaque ApplicationData for fingerprint).
     let mut tickets = Vec::new();
-    let ticket_count = new_session_tickets.min(4);
-    if ticket_count > 0 {
-        for _ in 0..ticket_count {
-            let ticket_len: usize = rng.range(48) + 48;
+    for ticket_len in emulated_ticket_record_sizes(cached, new_session_tickets, rng) {
         let mut rec = Vec::with_capacity(5 + ticket_len);
         rec.push(TLS_RECORD_APPLICATION);
         rec.extend_from_slice(&TLS_VERSION);
@@ -310,7 +337,6 @@ pub fn build_emulated_server_hello(
         rec.extend_from_slice(&rng.bytes(ticket_len));
         tickets.extend_from_slice(&rec);
     }
-        }
 
     let mut response = Vec::with_capacity(
         server_hello.len() + change_cipher_spec.len() + app_data.len() + tickets.len(),
@@ -334,6 +360,10 @@ pub fn build_emulated_server_hello(
 #[path = "tests/emulator_security_tests.rs"]
 mod security_tests;
 
+#[cfg(test)]
+#[path = "tests/emulator_profile_fidelity_security_tests.rs"]
+mod emulator_profile_fidelity_security_tests;
+
 #[cfg(test)]
 mod tests {
     use std::time::SystemTime;
@@ -478,7 +508,7 @@ mod tests {
     }
 
     #[test]
-    fn test_build_emulated_server_hello_ignores_tail_records_for_raw_profile() {
+    fn test_build_emulated_server_hello_replays_tail_records_for_profiled_tls() {
         let mut cached = make_cached(None);
         cached.app_data_records_sizes = vec![27, 3905, 537, 69];
         cached.total_app_data_len = 4538;
@@ -500,11 +530,19 @@
 
         let hello_len = u16::from_be_bytes([response[3], response[4]]) as usize;
         let ccs_start = 5 + hello_len;
-        let app_start = ccs_start + 6;
-        let app_len =
-            u16::from_be_bytes([response[app_start + 3], response[app_start + 4]]) as usize;
+        let mut pos = ccs_start + 6;
+        let mut app_lengths = Vec::new();
+        while pos + 5 <= response.len() {
+            assert_eq!(response[pos], TLS_RECORD_APPLICATION);
+            let record_len = u16::from_be_bytes([response[pos + 3], response[pos + 4]]) as usize;
+            app_lengths.push(record_len);
+            pos += 5 + record_len;
+        }
 
-        assert_eq!(response[app_start], TLS_RECORD_APPLICATION);
-        assert_eq!(app_start + 5 + app_len, response.len());
+        assert_eq!(app_lengths.len(), 4);
+        assert_eq!(app_lengths[0], 64);
+        assert_eq!(app_lengths[3], 69);
+        assert!(app_lengths[1] >= 64);
+        assert!(app_lengths[2] >= 64);
     }
 }

|||||||
@@ -1,6 +1,7 @@
 #![allow(clippy::too_many_arguments)]

 use dashmap::DashMap;
+use std::net::SocketAddr;
 use std::sync::Arc;
 use std::sync::OnceLock;
 use std::time::{Duration, Instant};
@@ -793,6 +794,51 @@ async fn connect_tcp_with_upstream(
 ))
 }

+fn socket_addrs_from_upstream_stream(
+stream: &UpstreamStream,
+) -> (Option<SocketAddr>, Option<SocketAddr>) {
+match stream {
+UpstreamStream::Tcp(tcp) => (tcp.local_addr().ok(), tcp.peer_addr().ok()),
+UpstreamStream::Shadowsocks(_) => (None, None),
+}
+}
+
+fn build_tls_fetch_proxy_header(
+proxy_protocol: u8,
+src_addr: Option<SocketAddr>,
+dst_addr: Option<SocketAddr>,
+) -> Option<Vec<u8>> {
+match proxy_protocol {
+0 => None,
+2 => {
+let header = match (src_addr, dst_addr) {
+(Some(src @ SocketAddr::V4(_)), Some(dst @ SocketAddr::V4(_)))
+| (Some(src @ SocketAddr::V6(_)), Some(dst @ SocketAddr::V6(_))) => {
+ProxyProtocolV2Builder::new().with_addrs(src, dst).build()
+}
+_ => ProxyProtocolV2Builder::new().build(),
+};
+Some(header)
+}
+_ => {
+let header = match (src_addr, dst_addr) {
+(Some(SocketAddr::V4(src)), Some(SocketAddr::V4(dst))) => {
+ProxyProtocolV1Builder::new()
+.tcp4(src.into(), dst.into())
+.build()
+}
+(Some(SocketAddr::V6(src)), Some(SocketAddr::V6(dst))) => {
+ProxyProtocolV1Builder::new()
+.tcp6(src.into(), dst.into())
+.build()
+}
+_ => ProxyProtocolV1Builder::new().build(),
+};
+Some(header)
+}
+}
+}
+
 fn encode_tls13_certificate_message(cert_chain_der: &[Vec<u8>]) -> Option<Vec<u8>> {
 if cert_chain_der.is_empty() {
 return None;
@@ -824,7 +870,7 @@ async fn fetch_via_raw_tls_stream<S>(
 mut stream: S,
 sni: &str,
 connect_timeout: Duration,
-proxy_protocol: u8,
+proxy_header: Option<Vec<u8>>,
 profile: TlsFetchProfile,
 grease_enabled: bool,
 deterministic: bool,
@@ -835,11 +881,7 @@ where
 let rng = SecureRandom::new();
 let client_hello = build_client_hello(sni, &rng, profile, grease_enabled, deterministic);
 timeout(connect_timeout, async {
-if proxy_protocol > 0 {
-let header = match proxy_protocol {
-2 => ProxyProtocolV2Builder::new().build(),
-_ => ProxyProtocolV1Builder::new().build(),
-};
+if let Some(header) = proxy_header.as_ref() {
 stream.write_all(&header).await?;
 }
 stream.write_all(&client_hello).await?;
@@ -921,11 +963,12 @@ async fn fetch_via_raw_tls(
 sock = %sock_path,
 "Raw TLS fetch using mask unix socket"
 );
+let proxy_header = build_tls_fetch_proxy_header(proxy_protocol, None, None);
 return fetch_via_raw_tls_stream(
 stream,
 sni,
 connect_timeout,
-proxy_protocol,
+proxy_header,
 profile,
 grease_enabled,
 deterministic,
@@ -956,11 +999,13 @@ async fn fetch_via_raw_tls(
 let stream =
 connect_tcp_with_upstream(host, port, connect_timeout, upstream, scope, strict_route)
 .await?;
+let (src_addr, dst_addr) = socket_addrs_from_upstream_stream(&stream);
+let proxy_header = build_tls_fetch_proxy_header(proxy_protocol, src_addr, dst_addr);
 fetch_via_raw_tls_stream(
 stream,
 sni,
 connect_timeout,
-proxy_protocol,
+proxy_header,
 profile,
 grease_enabled,
 deterministic,
@@ -972,17 +1017,13 @@ async fn fetch_via_rustls_stream<S>(
 mut stream: S,
 host: &str,
 sni: &str,
-proxy_protocol: u8,
+proxy_header: Option<Vec<u8>>,
 ) -> Result<TlsFetchResult>
 where
 S: AsyncRead + AsyncWrite + Unpin,
 {
 // rustls handshake path for certificate and basic negotiated metadata.
-if proxy_protocol > 0 {
-let header = match proxy_protocol {
-2 => ProxyProtocolV2Builder::new().build(),
-_ => ProxyProtocolV1Builder::new().build(),
-};
+if let Some(header) = proxy_header.as_ref() {
 stream.write_all(&header).await?;
 stream.flush().await?;
 }
@@ -1082,7 +1123,8 @@ async fn fetch_via_rustls(
 sock = %sock_path,
 "Rustls fetch using mask unix socket"
 );
-return fetch_via_rustls_stream(stream, host, sni, proxy_protocol).await;
+let proxy_header = build_tls_fetch_proxy_header(proxy_protocol, None, None);
+return fetch_via_rustls_stream(stream, host, sni, proxy_header).await;
 }
 Ok(Err(e)) => {
 warn!(
@@ -1108,7 +1150,9 @@ async fn fetch_via_rustls(
 let stream =
 connect_tcp_with_upstream(host, port, connect_timeout, upstream, scope, strict_route)
 .await?;
-fetch_via_rustls_stream(stream, host, sni, proxy_protocol).await
+let (src_addr, dst_addr) = socket_addrs_from_upstream_stream(&stream);
+let proxy_header = build_tls_fetch_proxy_header(proxy_protocol, src_addr, dst_addr);
+fetch_via_rustls_stream(stream, host, sni, proxy_header).await
 }

 /// Fetch real TLS metadata with an adaptive multi-profile strategy.
@@ -1278,11 +1322,13 @@ pub async fn fetch_real_tls(

 #[cfg(test)]
 mod tests {
+use std::net::SocketAddr;
 use std::time::{Duration, Instant};

 use super::{
-ProfileCacheValue, TlsFetchStrategy, build_client_hello, derive_behavior_profile,
-encode_tls13_certificate_message, order_profiles, profile_cache, profile_cache_key,
+ProfileCacheValue, TlsFetchStrategy, build_client_hello, build_tls_fetch_proxy_header,
+derive_behavior_profile, encode_tls13_certificate_message, order_profiles, profile_cache,
+profile_cache_key,
 };
 use crate::config::TlsFetchProfile;
 use crate::crypto::SecureRandom;
@@ -1423,4 +1469,48 @@ mod tests {

 assert_eq!(first, second);
 }
+
+#[test]
+fn test_build_tls_fetch_proxy_header_v2_with_tcp_addrs() {
+let src: SocketAddr = "198.51.100.10:42000".parse().expect("valid src");
+let dst: SocketAddr = "203.0.113.20:443".parse().expect("valid dst");
+let header = build_tls_fetch_proxy_header(2, Some(src), Some(dst)).expect("header");
+
+assert_eq!(
+&header[..12],
+&[
+0x0d, 0x0a, 0x0d, 0x0a, 0x00, 0x0d, 0x0a, 0x51, 0x55, 0x49, 0x54, 0x0a
+]
+);
+assert_eq!(header[12], 0x21);
+assert_eq!(header[13], 0x11);
+assert_eq!(u16::from_be_bytes([header[14], header[15]]), 12);
+assert_eq!(&header[16..20], &[198, 51, 100, 10]);
+assert_eq!(&header[20..24], &[203, 0, 113, 20]);
+assert_eq!(u16::from_be_bytes([header[24], header[25]]), 42000);
+assert_eq!(u16::from_be_bytes([header[26], header[27]]), 443);
+}
+
+#[test]
+fn test_build_tls_fetch_proxy_header_v2_mixed_family_falls_back_to_local_command() {
+let src: SocketAddr = "198.51.100.10:42000".parse().expect("valid src");
+let dst: SocketAddr = "[2001:db8::20]:443".parse().expect("valid dst");
+let header = build_tls_fetch_proxy_header(2, Some(src), Some(dst)).expect("header");
+
+assert_eq!(header[12], 0x20);
+assert_eq!(header[13], 0x00);
+assert_eq!(u16::from_be_bytes([header[14], header[15]]), 0);
+}
+
+#[test]
+fn test_build_tls_fetch_proxy_header_v1_with_tcp_addrs() {
+let src: SocketAddr = "198.51.100.10:42000".parse().expect("valid src");
+let dst: SocketAddr = "203.0.113.20:443".parse().expect("valid dst");
+let header = build_tls_fetch_proxy_header(1, Some(src), Some(dst)).expect("header");
+
+assert_eq!(
+header,
+b"PROXY TCP4 198.51.100.10 203.0.113.20 42000 443\r\n"
+);
+}
 }
@@ -0,0 +1,95 @@
+use std::time::SystemTime;
+
+use crate::crypto::SecureRandom;
+use crate::protocol::constants::{
+TLS_RECORD_APPLICATION, TLS_RECORD_CHANGE_CIPHER, TLS_RECORD_HANDSHAKE,
+};
+use crate::tls_front::emulator::build_emulated_server_hello;
+use crate::tls_front::types::{
+CachedTlsData, ParsedServerHello, TlsBehaviorProfile, TlsProfileSource,
+};
+
+fn make_cached() -> CachedTlsData {
+CachedTlsData {
+server_hello_template: ParsedServerHello {
+version: [0x03, 0x03],
+random: [0u8; 32],
+session_id: Vec::new(),
+cipher_suite: [0x13, 0x01],
+compression: 0,
+extensions: Vec::new(),
+},
+cert_info: None,
+cert_payload: None,
+app_data_records_sizes: vec![1200, 900, 220, 180],
+total_app_data_len: 2500,
+behavior_profile: TlsBehaviorProfile {
+change_cipher_spec_count: 2,
+app_data_record_sizes: vec![1200, 900],
+ticket_record_sizes: vec![220, 180],
+source: TlsProfileSource::Merged,
+},
+fetched_at: SystemTime::now(),
+domain: "example.com".to_string(),
+}
+}
+
+fn record_lengths_by_type(response: &[u8], wanted_type: u8) -> Vec<usize> {
+let mut out = Vec::new();
+let mut pos = 0usize;
+while pos + 5 <= response.len() {
+let record_type = response[pos];
+let record_len = u16::from_be_bytes([response[pos + 3], response[pos + 4]]) as usize;
+if pos + 5 + record_len > response.len() {
+break;
+}
+if record_type == wanted_type {
+out.push(record_len);
+}
+pos += 5 + record_len;
+}
+out
+}
+
+#[test]
+fn emulated_server_hello_replays_profile_change_cipher_spec_count() {
+let cached = make_cached();
+let rng = SecureRandom::new();
+
+let response = build_emulated_server_hello(
+b"secret",
+&[0x71; 32],
+&[0x72; 16],
+&cached,
+false,
+&rng,
+None,
+0,
+);
+
+assert_eq!(response[0], TLS_RECORD_HANDSHAKE);
+let ccs_records = record_lengths_by_type(&response, TLS_RECORD_CHANGE_CIPHER);
+assert_eq!(ccs_records.len(), 2);
+assert!(ccs_records.iter().all(|len| *len == 1));
+}
+
+#[test]
+fn emulated_server_hello_replays_profile_ticket_tail_lengths() {
+let cached = make_cached();
+let rng = SecureRandom::new();
+
+let response = build_emulated_server_hello(
+b"secret",
+&[0x81; 32],
+&[0x82; 16],
+&cached,
+false,
+&rng,
+None,
+0,
+);
+
+let app_records = record_lengths_by_type(&response, TLS_RECORD_APPLICATION);
+assert!(app_records.len() >= 4);
+assert_eq!(&app_records[app_records.len() - 2..], &[220, 180]);
+}
13  src/transport/middle_proxy/fairness/mod.rs  Normal file
@@ -0,0 +1,13 @@
+//! Backpressure-driven fairness control for ME reader routing.
+//!
+//! This module keeps fairness decisions worker-local:
+//! each reader loop owns one scheduler instance and mutates it without locks.
+
+mod model;
+mod pressure;
+mod scheduler;
+
+#[cfg(test)]
+pub(crate) use model::PressureState;
+pub(crate) use model::{AdmissionDecision, DispatchAction, DispatchFeedback, SchedulerDecision};
+pub(crate) use scheduler::{WorkerFairnessConfig, WorkerFairnessSnapshot, WorkerFairnessState};
140  src/transport/middle_proxy/fairness/model.rs  Normal file
@@ -0,0 +1,140 @@
+use std::time::Instant;
+
+use bytes::Bytes;
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
+#[repr(u8)]
+pub(crate) enum PressureState {
+Normal = 0,
+Pressured = 1,
+Shedding = 2,
+Saturated = 3,
+}
+
+impl PressureState {
+pub(crate) fn as_u8(self) -> u8 {
+self as u8
+}
+}
+
+impl Default for PressureState {
+fn default() -> Self {
+Self::Normal
+}
+}
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) enum FlowPressureClass {
+Healthy,
+Bursty,
+Backpressured,
+Standing,
+}
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) enum StandingQueueState {
+Transient,
+Standing,
+}
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) enum FlowSchedulerState {
+Idle,
+Active,
+Backpressured,
+Penalized,
+SheddingCandidate,
+}
+
+#[derive(Debug, Clone)]
+pub(crate) struct QueuedFrame {
+pub(crate) conn_id: u64,
+pub(crate) flags: u32,
+pub(crate) data: Bytes,
+pub(crate) enqueued_at: Instant,
+}
+
+impl QueuedFrame {
+#[inline]
+pub(crate) fn queued_bytes(&self) -> u64 {
+self.data.len() as u64
+}
+}
+
+#[derive(Debug, Clone)]
+pub(crate) struct FlowFairnessState {
+pub(crate) _flow_id: u64,
+pub(crate) _worker_id: u16,
+pub(crate) pending_bytes: u64,
+pub(crate) deficit_bytes: i64,
+pub(crate) queue_started_at: Option<Instant>,
+pub(crate) last_drain_at: Option<Instant>,
+pub(crate) recent_drain_bytes: u64,
+pub(crate) consecutive_stalls: u8,
+pub(crate) consecutive_skips: u8,
+pub(crate) penalty_score: u16,
+pub(crate) pressure_class: FlowPressureClass,
+pub(crate) standing_state: StandingQueueState,
+pub(crate) scheduler_state: FlowSchedulerState,
+pub(crate) bucket_id: usize,
+pub(crate) in_active_ring: bool,
+}
+
+impl FlowFairnessState {
+pub(crate) fn new(flow_id: u64, worker_id: u16, bucket_id: usize) -> Self {
+Self {
+_flow_id: flow_id,
+_worker_id: worker_id,
+pending_bytes: 0,
+deficit_bytes: 0,
+queue_started_at: None,
+last_drain_at: None,
+recent_drain_bytes: 0,
+consecutive_stalls: 0,
+consecutive_skips: 0,
+penalty_score: 0,
+pressure_class: FlowPressureClass::Healthy,
+standing_state: StandingQueueState::Transient,
+scheduler_state: FlowSchedulerState::Idle,
+bucket_id,
+in_active_ring: false,
+}
+}
+}
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) enum AdmissionDecision {
+Admit,
+RejectWorkerCap,
+RejectFlowCap,
+RejectBucketCap,
+RejectSaturated,
+RejectStandingFlow,
+}
+
+#[derive(Debug, Clone)]
+pub(crate) enum SchedulerDecision {
+Idle,
+Dispatch(DispatchCandidate),
+}
+
+#[derive(Debug, Clone)]
+pub(crate) struct DispatchCandidate {
+pub(crate) frame: QueuedFrame,
+pub(crate) pressure_state: PressureState,
+pub(crate) flow_class: FlowPressureClass,
+}
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) enum DispatchFeedback {
+Routed,
+QueueFull,
+ChannelClosed,
+NoConn,
+}
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) enum DispatchAction {
+Continue,
+CloseFlow,
+}
214  src/transport/middle_proxy/fairness/pressure.rs  Normal file
@@ -0,0 +1,214 @@
+use std::time::{Duration, Instant};
+
+use super::model::PressureState;
+
+#[derive(Debug, Clone, Copy)]
+pub(crate) struct PressureSignals {
+pub(crate) active_flows: usize,
+pub(crate) total_queued_bytes: u64,
+pub(crate) standing_flows: usize,
+pub(crate) backpressured_flows: usize,
+}
+
+#[derive(Debug, Clone)]
+pub(crate) struct PressureConfig {
+pub(crate) evaluate_every_rounds: u32,
+pub(crate) transition_hysteresis_rounds: u8,
+pub(crate) standing_ratio_pressured_pct: u8,
+pub(crate) standing_ratio_shedding_pct: u8,
+pub(crate) standing_ratio_saturated_pct: u8,
+pub(crate) queue_ratio_pressured_pct: u8,
+pub(crate) queue_ratio_shedding_pct: u8,
+pub(crate) queue_ratio_saturated_pct: u8,
+pub(crate) reject_window: Duration,
+pub(crate) rejects_pressured: u32,
+pub(crate) rejects_shedding: u32,
+pub(crate) rejects_saturated: u32,
+pub(crate) stalls_pressured: u32,
+pub(crate) stalls_shedding: u32,
+pub(crate) stalls_saturated: u32,
+}
+
+impl Default for PressureConfig {
+fn default() -> Self {
+Self {
+evaluate_every_rounds: 8,
+transition_hysteresis_rounds: 3,
+standing_ratio_pressured_pct: 20,
+standing_ratio_shedding_pct: 35,
+standing_ratio_saturated_pct: 50,
+queue_ratio_pressured_pct: 65,
+queue_ratio_shedding_pct: 82,
+queue_ratio_saturated_pct: 94,
+reject_window: Duration::from_secs(2),
+rejects_pressured: 32,
+rejects_shedding: 96,
+rejects_saturated: 256,
+stalls_pressured: 32,
+stalls_shedding: 96,
+stalls_saturated: 256,
+}
+}
+}
+
+#[derive(Debug)]
+pub(crate) struct PressureEvaluator {
+state: PressureState,
+candidate_state: PressureState,
+candidate_hits: u8,
+rounds_since_eval: u32,
+window_started_at: Instant,
+admission_rejects_window: u32,
+route_stalls_window: u32,
+}
+
+impl PressureEvaluator {
+pub(crate) fn new(now: Instant) -> Self {
+Self {
+state: PressureState::Normal,
+candidate_state: PressureState::Normal,
+candidate_hits: 0,
+rounds_since_eval: 0,
+window_started_at: now,
+admission_rejects_window: 0,
+route_stalls_window: 0,
+}
+}
+
+#[inline]
+pub(crate) fn state(&self) -> PressureState {
+self.state
+}
+
+pub(crate) fn note_admission_reject(&mut self, now: Instant, cfg: &PressureConfig) {
+self.rotate_window_if_needed(now, cfg);
+self.admission_rejects_window = self.admission_rejects_window.saturating_add(1);
+}
+
+pub(crate) fn note_route_stall(&mut self, now: Instant, cfg: &PressureConfig) {
+self.rotate_window_if_needed(now, cfg);
+self.route_stalls_window = self.route_stalls_window.saturating_add(1);
+}
+
+pub(crate) fn maybe_evaluate(
+&mut self,
+now: Instant,
+cfg: &PressureConfig,
+max_total_queued_bytes: u64,
+signals: PressureSignals,
+force: bool,
+) -> PressureState {
+self.rotate_window_if_needed(now, cfg);
+self.rounds_since_eval = self.rounds_since_eval.saturating_add(1);
+if !force && self.rounds_since_eval < cfg.evaluate_every_rounds.max(1) {
+return self.state;
+}
+self.rounds_since_eval = 0;
+
+let target = self.derive_target_state(cfg, max_total_queued_bytes, signals);
+if target == self.state {
+self.candidate_state = target;
+self.candidate_hits = 0;
+return self.state;
+}
+
+if self.candidate_state == target {
+self.candidate_hits = self.candidate_hits.saturating_add(1);
+} else {
+self.candidate_state = target;
+self.candidate_hits = 1;
+}
+
+if self.candidate_hits >= cfg.transition_hysteresis_rounds.max(1) {
+self.state = target;
+self.candidate_hits = 0;
+}
+
+self.state
+}
+
+fn derive_target_state(
+&self,
+cfg: &PressureConfig,
+max_total_queued_bytes: u64,
+signals: PressureSignals,
+) -> PressureState {
+let queue_ratio_pct = if max_total_queued_bytes == 0 {
+100
+} else {
+((signals.total_queued_bytes.saturating_mul(100)) / max_total_queued_bytes).min(100)
+as u8
+};
+
+let standing_ratio_pct = if signals.active_flows == 0 {
+0
+} else {
+((signals.standing_flows.saturating_mul(100)) / signals.active_flows).min(100) as u8
+};
+
+let mut pressure_score = 0u8;
+
+if queue_ratio_pct >= cfg.queue_ratio_pressured_pct {
+pressure_score = pressure_score.max(1);
+}
+if queue_ratio_pct >= cfg.queue_ratio_shedding_pct {
+pressure_score = pressure_score.max(2);
+}
+if queue_ratio_pct >= cfg.queue_ratio_saturated_pct {
+pressure_score = pressure_score.max(3);
+}
+
+if standing_ratio_pct >= cfg.standing_ratio_pressured_pct {
+pressure_score = pressure_score.max(1);
+}
+if standing_ratio_pct >= cfg.standing_ratio_shedding_pct {
+pressure_score = pressure_score.max(2);
+}
+if standing_ratio_pct >= cfg.standing_ratio_saturated_pct {
+pressure_score = pressure_score.max(3);
+}
+
+if self.admission_rejects_window >= cfg.rejects_pressured {
+pressure_score = pressure_score.max(1);
+}
+if self.admission_rejects_window >= cfg.rejects_shedding {
+pressure_score = pressure_score.max(2);
+}
+if self.admission_rejects_window >= cfg.rejects_saturated {
+pressure_score = pressure_score.max(3);
+}
+
+if self.route_stalls_window >= cfg.stalls_pressured {
+pressure_score = pressure_score.max(1);
+}
+if self.route_stalls_window >= cfg.stalls_shedding {
+pressure_score = pressure_score.max(2);
+}
+if self.route_stalls_window >= cfg.stalls_saturated {
+pressure_score = pressure_score.max(3);
+}
+
+if signals.backpressured_flows > signals.active_flows.saturating_div(2)
+&& signals.active_flows > 0
+{
+pressure_score = pressure_score.max(2);
+}
+
+match pressure_score {
+0 => PressureState::Normal,
+1 => PressureState::Pressured,
+2 => PressureState::Shedding,
+_ => PressureState::Saturated,
+}
+}
+
+fn rotate_window_if_needed(&mut self, now: Instant, cfg: &PressureConfig) {
+if now.saturating_duration_since(self.window_started_at) < cfg.reject_window {
+return;
+}
+
+self.window_started_at = now;
+self.admission_rejects_window = 0;
+self.route_stalls_window = 0;
+}
+}
556
src/transport/middle_proxy/fairness/scheduler.rs
Normal file
556
src/transport/middle_proxy/fairness/scheduler.rs
Normal file
@@ -0,0 +1,556 @@
|
|||||||
|
use std::collections::{HashMap, VecDeque};
|
||||||
|
use std::time::{Duration, Instant};
|
||||||
|
|
||||||
|
use bytes::Bytes;
|
||||||
|
|
||||||
|
use super::model::{
|
||||||
|
AdmissionDecision, DispatchAction, DispatchCandidate, DispatchFeedback, FlowFairnessState,
|
||||||
|
FlowPressureClass, FlowSchedulerState, PressureState, QueuedFrame, SchedulerDecision,
|
||||||
|
StandingQueueState,
|
||||||
|
};
|
||||||
|
use super::pressure::{PressureConfig, PressureEvaluator, PressureSignals};
|
||||||
|
|
||||||
|
#[derive(Debug, Clone)]
|
||||||
|
pub(crate) struct WorkerFairnessConfig {
|
||||||
|
pub(crate) worker_id: u16,
|
||||||
|
pub(crate) max_active_flows: usize,
|
||||||
|
pub(crate) max_total_queued_bytes: u64,
|
||||||
|
pub(crate) max_flow_queued_bytes: u64,
|
||||||
|
pub(crate) base_quantum_bytes: u32,
|
||||||
|
pub(crate) pressured_quantum_bytes: u32,
|
||||||
|
pub(crate) penalized_quantum_bytes: u32,
|
||||||
|
pub(crate) standing_queue_min_age: Duration,
|
||||||
|
pub(crate) standing_queue_min_backlog_bytes: u64,
|
||||||
|
pub(crate) standing_stall_threshold: u8,
|
||||||
|
pub(crate) max_consecutive_stalls_before_shed: u8,
|
||||||
|
pub(crate) max_consecutive_stalls_before_close: u8,
|
||||||
|
pub(crate) soft_bucket_count: usize,
|
||||||
|
pub(crate) soft_bucket_share_pct: u8,
|
||||||
|
pub(crate) pressure: PressureConfig,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl Default for WorkerFairnessConfig {
|
||||||
|
fn default() -> Self {
|
||||||
|
Self {
|
||||||
|
worker_id: 0,
|
||||||
|
max_active_flows: 4096,
|
||||||
|
max_total_queued_bytes: 16 * 1024 * 1024,
|
||||||
|
max_flow_queued_bytes: 512 * 1024,
|
||||||
|
base_quantum_bytes: 32 * 1024,
|
||||||
|
pressured_quantum_bytes: 16 * 1024,
|
||||||
|
penalized_quantum_bytes: 8 * 1024,
|
||||||
|
standing_queue_min_age: Duration::from_millis(250),
|
||||||
|
standing_queue_min_backlog_bytes: 64 * 1024,
|
||||||
|
standing_stall_threshold: 3,
|
||||||
|
max_consecutive_stalls_before_shed: 4,
|
||||||
|
max_consecutive_stalls_before_close: 16,
|
||||||
|
soft_bucket_count: 64,
|
||||||
|
soft_bucket_share_pct: 25,
|
||||||
|
pressure: PressureConfig::default(),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
struct FlowEntry {
|
||||||
|
fairness: FlowFairnessState,
|
||||||
|
queue: VecDeque<QueuedFrame>,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl FlowEntry {
|
||||||
|
fn new(flow_id: u64, worker_id: u16, bucket_id: usize) -> Self {
|
||||||
|
Self {
|
||||||
|
fairness: FlowFairnessState::new(flow_id, worker_id, bucket_id),
|
||||||
|
queue: VecDeque::new(),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Copy, Default)]
pub(crate) struct WorkerFairnessSnapshot {
    pub(crate) pressure_state: PressureState,
    pub(crate) active_flows: usize,
    pub(crate) total_queued_bytes: u64,
    pub(crate) standing_flows: usize,
    pub(crate) backpressured_flows: usize,
    pub(crate) scheduler_rounds: u64,
    pub(crate) deficit_grants: u64,
    pub(crate) deficit_skips: u64,
    pub(crate) enqueue_rejects: u64,
    pub(crate) shed_drops: u64,
    pub(crate) fairness_penalties: u64,
    pub(crate) downstream_stalls: u64,
}

pub(crate) struct WorkerFairnessState {
    config: WorkerFairnessConfig,
    pressure: PressureEvaluator,
    flows: HashMap<u64, FlowEntry>,
    active_ring: VecDeque<u64>,
    total_queued_bytes: u64,
    bucket_queued_bytes: Vec<u64>,
    bucket_active_flows: Vec<usize>,
    standing_flow_count: usize,
    backpressured_flow_count: usize,
    scheduler_rounds: u64,
    deficit_grants: u64,
    deficit_skips: u64,
    enqueue_rejects: u64,
    shed_drops: u64,
    fairness_penalties: u64,
    downstream_stalls: u64,
}

impl WorkerFairnessState {
    pub(crate) fn new(config: WorkerFairnessConfig, now: Instant) -> Self {
        let bucket_count = config.soft_bucket_count.max(1);
        Self {
            config,
            pressure: PressureEvaluator::new(now),
            flows: HashMap::new(),
            active_ring: VecDeque::new(),
            total_queued_bytes: 0,
            bucket_queued_bytes: vec![0; bucket_count],
            bucket_active_flows: vec![0; bucket_count],
            standing_flow_count: 0,
            backpressured_flow_count: 0,
            scheduler_rounds: 0,
            deficit_grants: 0,
            deficit_skips: 0,
            enqueue_rejects: 0,
            shed_drops: 0,
            fairness_penalties: 0,
            downstream_stalls: 0,
        }
    }

    pub(crate) fn pressure_state(&self) -> PressureState {
        self.pressure.state()
    }

    pub(crate) fn snapshot(&self) -> WorkerFairnessSnapshot {
        WorkerFairnessSnapshot {
            pressure_state: self.pressure.state(),
            active_flows: self.flows.len(),
            total_queued_bytes: self.total_queued_bytes,
            standing_flows: self.standing_flow_count,
            backpressured_flows: self.backpressured_flow_count,
            scheduler_rounds: self.scheduler_rounds,
            deficit_grants: self.deficit_grants,
            deficit_skips: self.deficit_skips,
            enqueue_rejects: self.enqueue_rejects,
            shed_drops: self.shed_drops,
            fairness_penalties: self.fairness_penalties,
            downstream_stalls: self.downstream_stalls,
        }
    }

    pub(crate) fn enqueue_data(
        &mut self,
        conn_id: u64,
        flags: u32,
        data: Bytes,
        now: Instant,
    ) -> AdmissionDecision {
        let frame = QueuedFrame {
            conn_id,
            flags,
            data,
            enqueued_at: now,
        };
        let frame_bytes = frame.queued_bytes();

        if self.pressure.state() == PressureState::Saturated {
            self.pressure
                .note_admission_reject(now, &self.config.pressure);
            self.enqueue_rejects = self.enqueue_rejects.saturating_add(1);
            return AdmissionDecision::RejectSaturated;
        }

        if self.total_queued_bytes.saturating_add(frame_bytes) > self.config.max_total_queued_bytes
        {
            self.pressure
                .note_admission_reject(now, &self.config.pressure);
            self.enqueue_rejects = self.enqueue_rejects.saturating_add(1);
            self.evaluate_pressure(now, true);
            return AdmissionDecision::RejectWorkerCap;
        }

        if !self.flows.contains_key(&conn_id) && self.flows.len() >= self.config.max_active_flows {
            self.pressure
                .note_admission_reject(now, &self.config.pressure);
            self.enqueue_rejects = self.enqueue_rejects.saturating_add(1);
            self.evaluate_pressure(now, true);
            return AdmissionDecision::RejectWorkerCap;
        }

        let bucket_id = self.bucket_for(conn_id);
        let bucket_cap = self
            .config
            .max_total_queued_bytes
            .saturating_mul(self.config.soft_bucket_share_pct.max(1) as u64)
            .saturating_div(100)
            .max(self.config.max_flow_queued_bytes);
        if self.bucket_queued_bytes[bucket_id].saturating_add(frame_bytes) > bucket_cap {
            self.pressure
                .note_admission_reject(now, &self.config.pressure);
            self.enqueue_rejects = self.enqueue_rejects.saturating_add(1);
            self.evaluate_pressure(now, true);
            return AdmissionDecision::RejectBucketCap;
        }

        let entry = if let Some(flow) = self.flows.get_mut(&conn_id) {
            flow
        } else {
            self.bucket_active_flows[bucket_id] =
                self.bucket_active_flows[bucket_id].saturating_add(1);
            self.flows.insert(
                conn_id,
                FlowEntry::new(conn_id, self.config.worker_id, bucket_id),
            );
            self.flows
                .get_mut(&conn_id)
                .expect("flow inserted must be retrievable")
        };

        if entry.fairness.pending_bytes.saturating_add(frame_bytes)
            > self.config.max_flow_queued_bytes
        {
            self.pressure
                .note_admission_reject(now, &self.config.pressure);
            self.enqueue_rejects = self.enqueue_rejects.saturating_add(1);
            self.evaluate_pressure(now, true);
            return AdmissionDecision::RejectFlowCap;
        }

        if self.pressure.state() >= PressureState::Shedding
            && entry.fairness.standing_state == StandingQueueState::Standing
        {
            self.pressure
                .note_admission_reject(now, &self.config.pressure);
            self.enqueue_rejects = self.enqueue_rejects.saturating_add(1);
            self.evaluate_pressure(now, true);
            return AdmissionDecision::RejectStandingFlow;
        }

        entry.fairness.pending_bytes = entry.fairness.pending_bytes.saturating_add(frame_bytes);
        if entry.fairness.queue_started_at.is_none() {
            entry.fairness.queue_started_at = Some(now);
        }
        entry.queue.push_back(frame);

        self.total_queued_bytes = self.total_queued_bytes.saturating_add(frame_bytes);
        self.bucket_queued_bytes[bucket_id] =
            self.bucket_queued_bytes[bucket_id].saturating_add(frame_bytes);

        if !entry.fairness.in_active_ring {
            entry.fairness.in_active_ring = true;
            self.active_ring.push_back(conn_id);
        }

        self.evaluate_pressure(now, true);
        AdmissionDecision::Admit
    }

    pub(crate) fn next_decision(&mut self, now: Instant) -> SchedulerDecision {
        self.scheduler_rounds = self.scheduler_rounds.saturating_add(1);
        self.evaluate_pressure(now, false);

        let active_len = self.active_ring.len();
        for _ in 0..active_len {
            let Some(conn_id) = self.active_ring.pop_front() else {
                break;
            };

            let mut candidate = None;
            let mut requeue_active = false;
            let mut drained_bytes = 0u64;
            let mut bucket_id = 0usize;
            let pressure_state = self.pressure.state();

            if let Some(flow) = self.flows.get_mut(&conn_id) {
                bucket_id = flow.fairness.bucket_id;

                if flow.queue.is_empty() {
                    flow.fairness.in_active_ring = false;
                    flow.fairness.scheduler_state = FlowSchedulerState::Idle;
                    flow.fairness.pending_bytes = 0;
                    flow.fairness.queue_started_at = None;
                    continue;
                }

                Self::classify_flow(&self.config, pressure_state, now, &mut flow.fairness);

                let quantum =
                    Self::effective_quantum_bytes(&self.config, pressure_state, &flow.fairness);
                flow.fairness.deficit_bytes = flow
                    .fairness
                    .deficit_bytes
                    .saturating_add(i64::from(quantum));
                self.deficit_grants = self.deficit_grants.saturating_add(1);

                let front_len = flow.queue.front().map_or(0, |front| front.queued_bytes());
                if flow.fairness.deficit_bytes < front_len as i64 {
                    flow.fairness.consecutive_skips =
                        flow.fairness.consecutive_skips.saturating_add(1);
                    self.deficit_skips = self.deficit_skips.saturating_add(1);
                    requeue_active = true;
                } else if let Some(frame) = flow.queue.pop_front() {
                    drained_bytes = frame.queued_bytes();
                    flow.fairness.pending_bytes =
                        flow.fairness.pending_bytes.saturating_sub(drained_bytes);
                    flow.fairness.deficit_bytes = flow
                        .fairness
                        .deficit_bytes
                        .saturating_sub(drained_bytes as i64);
                    flow.fairness.consecutive_skips = 0;
                    flow.fairness.queue_started_at =
                        flow.queue.front().map(|front| front.enqueued_at);
                    requeue_active = !flow.queue.is_empty();
                    if !requeue_active {
                        flow.fairness.scheduler_state = FlowSchedulerState::Idle;
                        flow.fairness.in_active_ring = false;
                    }
                    candidate = Some(DispatchCandidate {
                        pressure_state,
                        flow_class: flow.fairness.pressure_class,
                        frame,
                    });
                }
            }

            if drained_bytes > 0 {
                self.total_queued_bytes = self.total_queued_bytes.saturating_sub(drained_bytes);
                self.bucket_queued_bytes[bucket_id] =
                    self.bucket_queued_bytes[bucket_id].saturating_sub(drained_bytes);
            }

            if requeue_active {
                if let Some(flow) = self.flows.get_mut(&conn_id) {
                    flow.fairness.in_active_ring = true;
                }
                self.active_ring.push_back(conn_id);
            }

            if let Some(candidate) = candidate {
                return SchedulerDecision::Dispatch(candidate);
            }
        }

        SchedulerDecision::Idle
    }

    pub(crate) fn apply_dispatch_feedback(
        &mut self,
        conn_id: u64,
        candidate: DispatchCandidate,
        feedback: DispatchFeedback,
        now: Instant,
    ) -> DispatchAction {
        match feedback {
            DispatchFeedback::Routed => {
                if let Some(flow) = self.flows.get_mut(&conn_id) {
                    flow.fairness.last_drain_at = Some(now);
                    flow.fairness.recent_drain_bytes = flow
                        .fairness
                        .recent_drain_bytes
                        .saturating_add(candidate.frame.queued_bytes());
                    flow.fairness.consecutive_stalls = 0;
                    if flow.fairness.scheduler_state != FlowSchedulerState::Idle {
                        flow.fairness.scheduler_state = FlowSchedulerState::Active;
                    }
                }
                self.evaluate_pressure(now, false);
                DispatchAction::Continue
            }
            DispatchFeedback::QueueFull => {
                self.pressure.note_route_stall(now, &self.config.pressure);
                self.downstream_stalls = self.downstream_stalls.saturating_add(1);
                let Some(flow) = self.flows.get_mut(&conn_id) else {
                    self.evaluate_pressure(now, true);
                    return DispatchAction::Continue;
                };

                flow.fairness.consecutive_stalls =
                    flow.fairness.consecutive_stalls.saturating_add(1);
                flow.fairness.scheduler_state = FlowSchedulerState::Backpressured;
                flow.fairness.pressure_class = FlowPressureClass::Backpressured;

                let state = self.pressure.state();
                let should_shed_frame = matches!(state, PressureState::Saturated)
                    || (matches!(state, PressureState::Shedding)
                        && flow.fairness.standing_state == StandingQueueState::Standing
                        && flow.fairness.consecutive_stalls
                            >= self.config.max_consecutive_stalls_before_shed);

                if should_shed_frame {
                    self.shed_drops = self.shed_drops.saturating_add(1);
                    self.fairness_penalties = self.fairness_penalties.saturating_add(1);
                } else {
                    let frame_bytes = candidate.frame.queued_bytes();
                    flow.queue.push_front(candidate.frame);
                    flow.fairness.pending_bytes =
                        flow.fairness.pending_bytes.saturating_add(frame_bytes);
                    flow.fairness.queue_started_at =
                        flow.queue.front().map(|front| front.enqueued_at);
                    self.total_queued_bytes = self.total_queued_bytes.saturating_add(frame_bytes);
                    self.bucket_queued_bytes[flow.fairness.bucket_id] = self.bucket_queued_bytes
                        [flow.fairness.bucket_id]
                        .saturating_add(frame_bytes);
                    if !flow.fairness.in_active_ring {
                        flow.fairness.in_active_ring = true;
                        self.active_ring.push_back(conn_id);
                    }
                }

                if flow.fairness.consecutive_stalls
                    >= self.config.max_consecutive_stalls_before_close
                    && self.pressure.state() == PressureState::Saturated
                {
                    self.remove_flow(conn_id);
                    self.evaluate_pressure(now, true);
                    return DispatchAction::CloseFlow;
                }

                self.evaluate_pressure(now, true);
                DispatchAction::Continue
            }
            DispatchFeedback::ChannelClosed | DispatchFeedback::NoConn => {
                self.remove_flow(conn_id);
                self.evaluate_pressure(now, true);
                DispatchAction::CloseFlow
            }
        }
    }

    pub(crate) fn remove_flow(&mut self, conn_id: u64) {
        let Some(entry) = self.flows.remove(&conn_id) else {
            return;
        };

        self.bucket_active_flows[entry.fairness.bucket_id] =
            self.bucket_active_flows[entry.fairness.bucket_id].saturating_sub(1);

        let mut reclaimed = 0u64;
        for frame in entry.queue {
            reclaimed = reclaimed.saturating_add(frame.queued_bytes());
        }
        self.total_queued_bytes = self.total_queued_bytes.saturating_sub(reclaimed);
        self.bucket_queued_bytes[entry.fairness.bucket_id] =
            self.bucket_queued_bytes[entry.fairness.bucket_id].saturating_sub(reclaimed);
    }

    fn evaluate_pressure(&mut self, now: Instant, force: bool) {
        let mut standing = 0usize;
        let mut backpressured = 0usize;

        for flow in self.flows.values_mut() {
            Self::classify_flow(&self.config, self.pressure.state(), now, &mut flow.fairness);
            if flow.fairness.standing_state == StandingQueueState::Standing {
                standing = standing.saturating_add(1);
            }
            if matches!(
                flow.fairness.scheduler_state,
                FlowSchedulerState::Backpressured
                    | FlowSchedulerState::Penalized
                    | FlowSchedulerState::SheddingCandidate
            ) {
                backpressured = backpressured.saturating_add(1);
            }
        }

        self.standing_flow_count = standing;
        self.backpressured_flow_count = backpressured;

        let _ = self.pressure.maybe_evaluate(
            now,
            &self.config.pressure,
            self.config.max_total_queued_bytes,
            PressureSignals {
                active_flows: self.flows.len(),
                total_queued_bytes: self.total_queued_bytes,
                standing_flows: standing,
                backpressured_flows: backpressured,
            },
            force,
        );
    }

    fn classify_flow(
        config: &WorkerFairnessConfig,
        pressure_state: PressureState,
        now: Instant,
        fairness: &mut FlowFairnessState,
    ) {
        if fairness.pending_bytes == 0 {
            fairness.pressure_class = FlowPressureClass::Healthy;
            fairness.standing_state = StandingQueueState::Transient;
            fairness.scheduler_state = FlowSchedulerState::Idle;
            fairness.penalty_score = fairness.penalty_score.saturating_sub(1);
            return;
        }

        let queue_age = fairness
            .queue_started_at
            .map(|ts| now.saturating_duration_since(ts))
            .unwrap_or_default();
        let drain_stalled = fairness
            .last_drain_at
            .map(|ts| now.saturating_duration_since(ts) >= config.standing_queue_min_age)
            .unwrap_or(true);

        let standing = fairness.pending_bytes >= config.standing_queue_min_backlog_bytes
            && queue_age >= config.standing_queue_min_age
            && (fairness.consecutive_stalls >= config.standing_stall_threshold || drain_stalled);

        if standing {
            fairness.standing_state = StandingQueueState::Standing;
            fairness.pressure_class = FlowPressureClass::Standing;
            fairness.penalty_score = fairness.penalty_score.saturating_add(1);
            fairness.scheduler_state = if pressure_state >= PressureState::Shedding {
                FlowSchedulerState::SheddingCandidate
            } else {
                FlowSchedulerState::Penalized
            };
            return;
        }

        fairness.standing_state = StandingQueueState::Transient;
        if fairness.consecutive_stalls > 0 {
            fairness.pressure_class = FlowPressureClass::Backpressured;
            fairness.scheduler_state = FlowSchedulerState::Backpressured;
        } else if fairness.pending_bytes >= config.standing_queue_min_backlog_bytes {
            fairness.pressure_class = FlowPressureClass::Bursty;
            fairness.scheduler_state = FlowSchedulerState::Active;
        } else {
            fairness.pressure_class = FlowPressureClass::Healthy;
            fairness.scheduler_state = FlowSchedulerState::Active;
        }
        fairness.penalty_score = fairness.penalty_score.saturating_sub(1);
    }

    fn effective_quantum_bytes(
        config: &WorkerFairnessConfig,
        pressure_state: PressureState,
        fairness: &FlowFairnessState,
    ) -> u32 {
        let penalized = matches!(
            fairness.scheduler_state,
            FlowSchedulerState::Penalized | FlowSchedulerState::SheddingCandidate
        );

        if penalized {
            return config.penalized_quantum_bytes.max(1);
        }

        match pressure_state {
            PressureState::Normal => config.base_quantum_bytes.max(1),
            PressureState::Pressured => config.pressured_quantum_bytes.max(1),
            PressureState::Shedding => config.pressured_quantum_bytes.max(1),
            PressureState::Saturated => config.penalized_quantum_bytes.max(1),
        }
    }

    fn bucket_for(&self, conn_id: u64) -> usize {
        (conn_id as usize) % self.bucket_queued_bytes.len().max(1)
    }
}
@@ -2,6 +2,10 @@
 mod codec;
 mod config_updater;
+mod fairness;
+#[cfg(test)]
+#[path = "tests/fairness_security_tests.rs"]
+mod fairness_security_tests;
 mod handshake;
 mod health;
 #[cfg(test)]
@@ -20,11 +20,15 @@ use crate::protocol::constants::*;
 use crate::stats::Stats;

 use super::codec::{RpcChecksumMode, WriterCommand, rpc_crc};
+use super::fairness::{
+    AdmissionDecision, DispatchAction, DispatchFeedback, SchedulerDecision, WorkerFairnessConfig,
+    WorkerFairnessSnapshot, WorkerFairnessState,
+};
 use super::registry::RouteResult;
 use super::{ConnRegistry, MeResponse};

 const DATA_ROUTE_MAX_ATTEMPTS: usize = 3;
 const DATA_ROUTE_QUEUE_FULL_STARVATION_THRESHOLD: u8 = 3;
+const FAIRNESS_DRAIN_BUDGET_PER_LOOP: usize = 128;

 fn should_close_on_route_result_for_data(result: RouteResult) -> bool {
     matches!(result, RouteResult::NoConn | RouteResult::ChannelClosed)
@@ -77,6 +81,118 @@ async fn route_data_with_retry(
     }
 }
+
+#[inline]
+fn route_feedback(result: RouteResult) -> DispatchFeedback {
+    match result {
+        RouteResult::Routed => DispatchFeedback::Routed,
+        RouteResult::NoConn => DispatchFeedback::NoConn,
+        RouteResult::ChannelClosed => DispatchFeedback::ChannelClosed,
+        RouteResult::QueueFullBase | RouteResult::QueueFullHigh => DispatchFeedback::QueueFull,
+    }
+}
+
+fn report_route_drop(result: RouteResult, stats: &Stats) {
+    match result {
+        RouteResult::NoConn => stats.increment_me_route_drop_no_conn(),
+        RouteResult::ChannelClosed => stats.increment_me_route_drop_channel_closed(),
+        RouteResult::QueueFullBase => {
+            stats.increment_me_route_drop_queue_full();
+            stats.increment_me_route_drop_queue_full_base();
+        }
+        RouteResult::QueueFullHigh => {
+            stats.increment_me_route_drop_queue_full();
+            stats.increment_me_route_drop_queue_full_high();
+        }
+        RouteResult::Routed => {}
+    }
+}
+
+fn apply_fairness_metrics_delta(
+    stats: &Stats,
+    prev: &mut WorkerFairnessSnapshot,
+    current: WorkerFairnessSnapshot,
+) {
+    stats.set_me_fair_active_flows_gauge(current.active_flows as u64);
+    stats.set_me_fair_queued_bytes_gauge(current.total_queued_bytes);
+    stats.set_me_fair_standing_flows_gauge(current.standing_flows as u64);
+    stats.set_me_fair_backpressured_flows_gauge(current.backpressured_flows as u64);
+    stats.set_me_fair_pressure_state_gauge(current.pressure_state.as_u8() as u64);
+    stats.add_me_fair_scheduler_rounds_total(
+        current
+            .scheduler_rounds
+            .saturating_sub(prev.scheduler_rounds),
+    );
+    stats.add_me_fair_deficit_grants_total(
+        current.deficit_grants.saturating_sub(prev.deficit_grants),
+    );
+    stats.add_me_fair_deficit_skips_total(current.deficit_skips.saturating_sub(prev.deficit_skips));
+    stats.add_me_fair_enqueue_rejects_total(
+        current.enqueue_rejects.saturating_sub(prev.enqueue_rejects),
+    );
+    stats.add_me_fair_shed_drops_total(current.shed_drops.saturating_sub(prev.shed_drops));
+    stats.add_me_fair_penalties_total(
+        current
+            .fairness_penalties
+            .saturating_sub(prev.fairness_penalties),
+    );
+    stats.add_me_fair_downstream_stalls_total(
+        current
+            .downstream_stalls
+            .saturating_sub(prev.downstream_stalls),
+    );
+    *prev = current;
+}
+
+async fn drain_fairness_scheduler(
+    fairness: &mut WorkerFairnessState,
+    reg: &ConnRegistry,
+    tx: &mpsc::Sender<WriterCommand>,
+    data_route_queue_full_streak: &mut HashMap<u64, u8>,
+    route_wait_ms: u64,
+    stats: &Stats,
+) {
+    for _ in 0..FAIRNESS_DRAIN_BUDGET_PER_LOOP {
+        let now = Instant::now();
+        let SchedulerDecision::Dispatch(candidate) = fairness.next_decision(now) else {
+            break;
+        };
+        let cid = candidate.frame.conn_id;
+        let _pressure_state = candidate.pressure_state;
+        let _flow_class = candidate.flow_class;
+        let routed = route_data_with_retry(
+            reg,
+            cid,
+            candidate.frame.flags,
+            candidate.frame.data.clone(),
+            route_wait_ms,
+        )
+        .await;
+        if matches!(routed, RouteResult::Routed) {
+            data_route_queue_full_streak.remove(&cid);
+        } else {
+            report_route_drop(routed, stats);
+        }
+        let action = fairness.apply_dispatch_feedback(cid, candidate, route_feedback(routed), now);
+        if is_data_route_queue_full(routed) {
+            let streak = data_route_queue_full_streak.entry(cid).or_insert(0);
+            *streak = streak.saturating_add(1);
+            if should_close_on_queue_full_streak(*streak) {
+                fairness.remove_flow(cid);
+                data_route_queue_full_streak.remove(&cid);
+                reg.unregister(cid).await;
+                send_close_conn(tx, cid).await;
+                continue;
+            }
+        }
+        if action == DispatchAction::CloseFlow || should_close_on_route_result_for_data(routed) {
+            fairness.remove_flow(cid);
+            data_route_queue_full_streak.remove(&cid);
+            reg.unregister(cid).await;
+            send_close_conn(tx, cid).await;
+        }
+    }
+}

 pub(crate) async fn reader_loop(
     mut rd: tokio::io::ReadHalf<TcpStream>,
     dk: [u8; 32],
@@ -98,7 +214,21 @@ pub(crate) async fn reader_loop(
     let mut raw = enc_leftover;
     let mut expected_seq: i32 = 0;
     let mut data_route_queue_full_streak = HashMap::<u64, u8>::new();
+    let mut fairness = WorkerFairnessState::new(
+        WorkerFairnessConfig {
+            worker_id: (writer_id as u16).saturating_add(1),
+            max_active_flows: reg.route_channel_capacity().saturating_mul(4).max(256),
+            max_total_queued_bytes: (reg.route_channel_capacity() as u64)
+                .saturating_mul(16 * 1024)
+                .max(4 * 1024 * 1024),
+            max_flow_queued_bytes: (reg.route_channel_capacity() as u64)
+                .saturating_mul(2 * 1024)
+                .clamp(64 * 1024, 2 * 1024 * 1024),
+            ..WorkerFairnessConfig::default()
+        },
+        Instant::now(),
+    );
+    let mut fairness_snapshot = fairness.snapshot();
     loop {
         let mut tmp = [0u8; 65_536];
         let n = tokio::select! {
@@ -181,36 +311,20 @@ pub(crate) async fn reader_loop(
                 let data = body.slice(12..);
                 trace!(cid, flags, len = data.len(), "RPC_PROXY_ANS");

-                let route_wait_ms = reader_route_data_wait_ms.load(Ordering::Relaxed);
-                let routed =
-                    route_data_with_retry(reg.as_ref(), cid, flags, data, route_wait_ms).await;
-                if matches!(routed, RouteResult::Routed) {
-                    data_route_queue_full_streak.remove(&cid);
-                    continue;
-                }
-                match routed {
-                    RouteResult::NoConn => stats.increment_me_route_drop_no_conn(),
-                    RouteResult::ChannelClosed => stats.increment_me_route_drop_channel_closed(),
-                    RouteResult::QueueFullBase => {
-                        stats.increment_me_route_drop_queue_full();
-                        stats.increment_me_route_drop_queue_full_base();
-                    }
-                    RouteResult::QueueFullHigh => {
+                let admission = fairness.enqueue_data(cid, flags, data, Instant::now());
+                if !matches!(admission, AdmissionDecision::Admit) {
                     stats.increment_me_route_drop_queue_full();
                     stats.increment_me_route_drop_queue_full_high();
-                    }
-                    RouteResult::Routed => {}
-                }
-                if should_close_on_route_result_for_data(routed) {
-                    data_route_queue_full_streak.remove(&cid);
-                    reg.unregister(cid).await;
-                    send_close_conn(&tx, cid).await;
-                    continue;
-                }
-                if is_data_route_queue_full(routed) {
                     let streak = data_route_queue_full_streak.entry(cid).or_insert(0);
                     *streak = streak.saturating_add(1);
-                    if should_close_on_queue_full_streak(*streak) {
+                    if should_close_on_queue_full_streak(*streak)
+                        || matches!(
+                            admission,
+                            AdmissionDecision::RejectSaturated
+                                | AdmissionDecision::RejectStandingFlow
+                        )
+                    {
+                        fairness.remove_flow(cid);
                         data_route_queue_full_streak.remove(&cid);
                         reg.unregister(cid).await;
                         send_close_conn(&tx, cid).await;
@@ -249,12 +363,14 @@ pub(crate) async fn reader_loop(
                 let _ = reg.route_nowait(cid, MeResponse::Close).await;
                 reg.unregister(cid).await;
                 data_route_queue_full_streak.remove(&cid);
+                fairness.remove_flow(cid);
             } else if pt == RPC_CLOSE_CONN_U32 && body.len() >= 8 {
                 let cid = u64::from_le_bytes(body[0..8].try_into().unwrap());
                 debug!(cid, "RPC_CLOSE_CONN from ME");
                 let _ = reg.route_nowait(cid, MeResponse::Close).await;
                 reg.unregister(cid).await;
                 data_route_queue_full_streak.remove(&cid);
+                fairness.remove_flow(cid);
             } else if pt == RPC_PING_U32 && body.len() >= 8 {
                 let ping_id = i64::from_le_bytes(body[0..8].try_into().unwrap());
                 trace!(ping_id, "RPC_PING -> RPC_PONG");
@@ -310,6 +426,19 @@ pub(crate) async fn reader_loop(
                     "Unknown RPC"
                 );
             }
+
+            let route_wait_ms = reader_route_data_wait_ms.load(Ordering::Relaxed);
+            drain_fairness_scheduler(
+                &mut fairness,
+                reg.as_ref(),
+                &tx,
+                &mut data_route_queue_full_streak,
+                route_wait_ms,
+                stats.as_ref(),
+            )
+            .await;
+            let current_snapshot = fairness.snapshot();
+            apply_fairness_metrics_delta(stats.as_ref(), &mut fairness_snapshot, current_snapshot);
         }
     }
 }
@@ -140,6 +140,10 @@ impl ConnRegistry {
        }
    }

    pub fn route_channel_capacity(&self) -> usize {
        self.route_channel_capacity
    }

    #[cfg(test)]
    pub fn new() -> Self {
        Self::with_route_channel_capacity(4096)
185 src/transport/middle_proxy/tests/fairness_security_tests.rs Normal file
@@ -0,0 +1,185 @@
use std::time::{Duration, Instant};

use bytes::Bytes;

use crate::transport::middle_proxy::fairness::{
    AdmissionDecision, DispatchAction, DispatchFeedback, PressureState, SchedulerDecision,
    WorkerFairnessConfig, WorkerFairnessState,
};

fn enqueue_payload(size: usize) -> Bytes {
    Bytes::from(vec![0xAB; size])
}

#[test]
fn fairness_rejects_when_worker_budget_is_exhausted() {
    let now = Instant::now();
    let mut fairness = WorkerFairnessState::new(
        WorkerFairnessConfig {
            max_total_queued_bytes: 1024,
            max_flow_queued_bytes: 1024,
            ..WorkerFairnessConfig::default()
        },
        now,
    );

    assert_eq!(
        fairness.enqueue_data(1, 0, enqueue_payload(700), now),
        AdmissionDecision::Admit
    );
    assert_eq!(
        fairness.enqueue_data(2, 0, enqueue_payload(400), now),
        AdmissionDecision::RejectWorkerCap
    );

    let snapshot = fairness.snapshot();
    assert!(snapshot.total_queued_bytes <= 1024);
    assert_eq!(snapshot.enqueue_rejects, 1);
}

#[test]
fn fairness_marks_standing_queue_after_stall_and_age_threshold() {
    let mut now = Instant::now();
    let mut fairness = WorkerFairnessState::new(
        WorkerFairnessConfig {
            standing_queue_min_age: Duration::from_millis(50),
            standing_queue_min_backlog_bytes: 256,
            standing_stall_threshold: 1,
            max_flow_queued_bytes: 4096,
            max_total_queued_bytes: 4096,
            ..WorkerFairnessConfig::default()
        },
        now,
    );

    assert_eq!(
        fairness.enqueue_data(11, 0, enqueue_payload(512), now),
        AdmissionDecision::Admit
    );

    now += Duration::from_millis(100);
    let SchedulerDecision::Dispatch(candidate) = fairness.next_decision(now) else {
        panic!("expected dispatch candidate");
    };

    let action = fairness.apply_dispatch_feedback(11, candidate, DispatchFeedback::QueueFull, now);
    assert!(matches!(action, DispatchAction::Continue));

    let snapshot = fairness.snapshot();
    assert_eq!(snapshot.standing_flows, 1);
    assert!(snapshot.backpressured_flows >= 1);
}

#[test]
fn fairness_keeps_fast_flow_progress_under_slow_neighbor() {
    let mut now = Instant::now();
    let mut fairness = WorkerFairnessState::new(
        WorkerFairnessConfig {
            max_total_queued_bytes: 64 * 1024,
            max_flow_queued_bytes: 32 * 1024,
            ..WorkerFairnessConfig::default()
        },
        now,
    );

    for _ in 0..16 {
        assert_eq!(
            fairness.enqueue_data(1, 0, enqueue_payload(512), now),
            AdmissionDecision::Admit
        );
        assert_eq!(
            fairness.enqueue_data(2, 0, enqueue_payload(512), now),
            AdmissionDecision::Admit
        );
    }

    let mut fast_routed = 0u64;
    for _ in 0..128 {
        now += Duration::from_millis(5);
        let SchedulerDecision::Dispatch(candidate) = fairness.next_decision(now) else {
            break;
        };
        let cid = candidate.frame.conn_id;
        let feedback = if cid == 2 {
            DispatchFeedback::QueueFull
        } else {
            fast_routed = fast_routed.saturating_add(1);
            DispatchFeedback::Routed
        };
        let _ = fairness.apply_dispatch_feedback(cid, candidate, feedback, now);
    }

    let snapshot = fairness.snapshot();
    assert!(fast_routed > 0, "fast flow must continue making progress");
    assert!(snapshot.total_queued_bytes <= 64 * 1024);
}

#[test]
fn fairness_pressure_hysteresis_prevents_instant_flapping() {
    let mut now = Instant::now();
    let mut cfg = WorkerFairnessConfig::default();
    cfg.max_total_queued_bytes = 4096;
    cfg.max_flow_queued_bytes = 4096;
    cfg.pressure.evaluate_every_rounds = 1;
    cfg.pressure.transition_hysteresis_rounds = 3;
    cfg.pressure.queue_ratio_pressured_pct = 40;
    cfg.pressure.queue_ratio_shedding_pct = 60;
    cfg.pressure.queue_ratio_saturated_pct = 80;

    let mut fairness = WorkerFairnessState::new(cfg, now);

    for _ in 0..4 {
        assert_eq!(
            fairness.enqueue_data(9, 0, enqueue_payload(900), now),
            AdmissionDecision::Admit
        );
    }

    for _ in 0..2 {
        now += Duration::from_millis(1);
        let _ = fairness.next_decision(now);
    }

    assert_eq!(
        fairness.pressure_state(),
        PressureState::Normal,
        "state must not flip before hysteresis confirmations"
    );
}

#[test]
fn fairness_randomized_sequence_preserves_memory_bounds() {
    let mut now = Instant::now();
    let mut fairness = WorkerFairnessState::new(
        WorkerFairnessConfig {
            max_total_queued_bytes: 32 * 1024,
            max_flow_queued_bytes: 4 * 1024,
            ..WorkerFairnessConfig::default()
        },
        now,
    );

    let mut seed = 0xC0FFEE_u64;
    for _ in 0..4096 {
        seed ^= seed << 7;
        seed ^= seed >> 9;
        seed ^= seed << 8;
        let flow = (seed % 32) + 1;
        let size = ((seed >> 8) % 512 + 64) as usize;
        let _ = fairness.enqueue_data(flow, 0, enqueue_payload(size), now);

        now += Duration::from_millis(1);
        if let SchedulerDecision::Dispatch(candidate) = fairness.next_decision(now) {
            let feedback = if seed & 0x1 == 0 {
                DispatchFeedback::Routed
            } else {
                DispatchFeedback::QueueFull
            };
            let _ =
                fairness.apply_dispatch_feedback(candidate.frame.conn_id, candidate, feedback, now);
        }
    }

    let snapshot = fairness.snapshot();
    assert!(snapshot.total_queued_bytes <= 32 * 1024);
}
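The randomized test above drives the scheduler with an inline xorshift-style generator rather than an external RNG crate, which keeps the sequence deterministic across runs. As a standalone sanity sketch (no telemt types involved; the helper name `next_seed` is ours, not from the repo), the generator and its flow/size derivation can be checked in isolation:

```rust
// Standalone sketch of the xorshift-style step used by the randomized
// fairness test. Shift constants and the 0xC0FFEE seed mirror the test;
// everything else here is illustrative.
fn next_seed(mut seed: u64) -> u64 {
    seed ^= seed << 7;
    seed ^= seed >> 9;
    seed ^= seed << 8;
    seed
}

fn main() {
    let mut seed = 0xC0FFEE_u64;
    for _ in 0..4096 {
        seed = next_seed(seed);
        // Same derivations as the test: flow ids land in 1..=32,
        // payload sizes in 64..=575 bytes.
        let flow = (seed % 32) + 1;
        let size = ((seed >> 8) % 512 + 64) as usize;
        assert!((1..=32).contains(&flow));
        assert!((64..=575).contains(&size));
    }
    println!("ok: 4096 draws stayed within the expected bounds");
}
```

Because the sequence is fixed by the seed, a failure in the randomized test is reproducible without recording anything extra.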