Compare commits

22 Commits

Author SHA1 Message Date
Alexey
30ba41eb47 Merge pull request #479 from telemt/bump
Update Cargo.toml
2026-03-18 11:57:25 +03:00
Alexey
42f946f29e Update Cargo.toml 2026-03-18 11:57:09 +03:00
Alexey
c53d7951b5 Merge pull request #468 from temandroid/main
feat: add Telemt Control API Python simple client with CLI
2026-03-18 11:56:32 +03:00
Alexey
f36e264093 Merge pull request #477 from Dimasssss/CONFIG_PARAMS.md
Create CONFIG_PARAMS.en.md
2026-03-18 11:56:17 +03:00
Alexey
a3bdf64353 ME Coverage Ratio in API + as Draining Factor: merge pull request #478 from telemt/flow-api
ME Coverage Ratio in API + as Draining Factor
2026-03-18 11:56:01 +03:00
Alexey
2aa7ea5137 ME Coverage Ratio in API + as Draining Factor 2026-03-18 11:46:13 +03:00
Dimasssss
462c927da6 Create CONFIG_PARAMS.en.md 2026-03-18 10:53:09 +03:00
Alexey
cb87b2eac3 Adaptive Buffers + Session Eviction Method: merge pull request #475 from telemt/flow-buffers
Adaptive Buffers + Session Eviction Method
2026-03-18 10:52:22 +03:00
Alexey
3739f38440 Adaptive Buffers + Session Eviction Method 2026-03-18 10:49:02 +03:00
TEMAndroid
8e96039a1c Merge branch 'telemt:main' into main 2026-03-17 20:09:50 +03:00
TEMAndroid
36b360dfb6 feat: add Telemt Control API Python simple client with CLI
Stdlib-only HTTP client covering all /v1 endpoints with argparse CLI.
Supports If-Match concurrency, typed errors, user CRUD, and all runtime/stats routes.

Usage: ./telemt_api.py help

AI-Generated from API.md. 
Partially tested. 
Use with caution...
2026-03-17 20:09:36 +03:00
Alexey
5dd0c47f14 Merge pull request #464 from temandroid/patch-1
feat(zabbix): add graphs to Telemt template
2026-03-17 18:53:07 +03:00
TEMAndroid
4739083f57 feat(zabbix): add graphs to Telemt template
- Add per-user graph prototypes (Connections, IPs, Traffic, Messages)
- Add server-level graphs (Connections, Uptime, ME Keepalive, ME Reconnects,
  ME Route Drops, ME Writer Pool/Removals, Desync, Upstream, Refill)
2026-03-17 18:24:57 +03:00
Alexey
37a31c13cb Merge pull request #460 from telemt/bump
Update Cargo.toml
2026-03-17 16:31:46 +03:00
Alexey
35bca7d4cc Update Cargo.toml 2026-03-17 16:31:32 +03:00
Alexey
f39d317d93 Merge pull request #459 from telemt/flow-perf
Flow perf
2026-03-17 16:28:59 +03:00
Alexey
d4d93aabf5 Merge pull request #458 from DavidOsipov/ME-draining-fix-3.3.19
Add health monitoring tests for draining writers
2026-03-17 16:17:41 +03:00
David Osipov
c9271d9083 Add health monitoring tests for draining writers
- Introduced adversarial tests to validate the behavior of the health monitoring system under various conditions, including the management of draining writers.
- Implemented integration tests to ensure the health monitor correctly handles expired and empty draining writers.
- Added regression tests to verify the functionality of the draining writers' cleanup process, ensuring it adheres to the defined thresholds and budgets.
- Updated the module structure to include the new test files for better organization and maintainability.
2026-03-17 17:11:51 +04:00
Alexey
9c9ba4becd Merge pull request #452 from Dimasssss/patch-1
Update TLS-F-TCP-S.ru.md
2026-03-17 15:27:43 +03:00
Dimasssss
bd0cefdb12 Update TLS-F-TCP-S.ru.md 2026-03-17 11:56:56 +03:00
Alexey
e2ed1eb286 Merge pull request #450 from kutovoys/main
feat: add metrics_listen option for metrics endpoint bind address
2026-03-17 11:46:52 +03:00
Sergey Kutovoy
a74def9561 Update metrics configuration to support custom listen address
- Bump telemt dependency version from 3.3.15 to 3.3.19.
- Add `metrics_listen` option to `config.toml` for specifying a custom address for the metrics endpoint.
- Update `ServerConfig` struct to include `metrics_listen` and adjust logic in `spawn_metrics_if_configured` to prioritize this new option over `metrics_port`.
- Enhance error handling for invalid listen addresses in metrics setup.
2026-03-17 12:58:40 +05:00
53 changed files with 5214 additions and 6059 deletions

@@ -1,15 +0,0 @@
-[bans]
-multiple-versions = "deny"
-wildcards = "allow"
-highlight = "all"
-# Explicitly flag the weak cryptography so the agent is forced to justify its existence
-[[bans.skip]]
-name = "md-5"
-version = "*"
-reason = "MUST VERIFY: Only allowed for legacy checksums, never for security."
-[[bans.skip]]
-name = "sha1"
-version = "*"
-reason = "MUST VERIFY: Only allowed for backwards compatibility."

@@ -5,22 +5,6 @@ Your responses are precise, minimal, and architecturally sound. You are working
---
### Context: The Telemt Project
You are working on **Telemt**, a high-performance, production-grade Telegram MTProxy implementation written in Rust. It is explicitly designed to operate in highly hostile network environments and evade advanced network censorship.
**Adversarial Threat Model:**
The proxy operates under constant surveillance by DPI (Deep Packet Inspection) systems and active scanners (state firewalls, mobile operator fraud controls). These entities actively probe IPs, analyze protocol handshakes, and look for known proxy signatures to block or throttle traffic.
**Core Architectural Pillars:**
1. **TLS-Fronting (TLS-F) & TCP-Splitting (TCP-S):** To the outside world, Telemt looks like a standard TLS server. If a client presents a valid MTProxy key, the connection is handled internally. If a censor's scanner, web browser, or unauthorized crawler connects, Telemt seamlessly splices the TCP connection (L4) to a real, legitimate HTTPS fallback server (e.g., Nginx) without modifying the `ClientHello` or terminating the TLS handshake.
2. **Middle-End (ME) Orchestration:** A highly concurrent, generation-based pool managing upstream connections to Telegram Datacenters (DCs). It utilizes an **Adaptive Floor** (dynamically scaling writer connections based on traffic), **Hardswaps** (zero-downtime pool reconfiguration), and **STUN/NAT** reflection mechanisms.
3. **Strict KDF Routing:** Cryptographic Key Derivation Functions (KDF) in this protocol strictly rely on the exact pairing of Source IP/Port and Destination IP/Port. Deviations or missing port logic will silently break the MTProto handshake.
4. **Data Plane vs. Control Plane Isolation:** The Data Plane (readers, writers, payload relay, TCP splicing) must remain strictly non-blocking, zero-allocation in hot paths, and highly resilient to network backpressure. The Control Plane (API, metrics, pool generation swaps, config reloads) orchestrates the state asynchronously without stalling the Data Plane.
Any modification you make must preserve Telemt's invisibility to censors, its strict memory-safety invariants, and its hot-path throughput.
### 0. Priority Resolution — Scope Control
This section resolves conflicts between code quality enforcement and scope limitation.

Cargo.lock (generated, 8 lines changed)

@@ -2025,12 +2025,6 @@ version = "1.2.1"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "6ce2be8dc25455e1f91df71bfa12ad37d7af1092ae736f3a6cd0e37bc7810596"
-[[package]]
-name = "static_assertions"
-version = "1.1.0"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "a2eb9349b6444b326872e140eb1cf5e7c522154d69e7a0ffb0fb81c06b37543f"
 [[package]]
 name = "subtle"
 version = "2.6.1"
@@ -2133,8 +2127,6 @@ dependencies = [
 "sha1",
 "sha2",
 "socket2 0.5.10",
-"static_assertions",
-"subtle",
 "thiserror 2.0.18",
 "tokio",
 "tokio-rustls",

@@ -1,6 +1,6 @@
 [package]
 name = "telemt"
-version = "3.3.20"
+version = "3.3.21"
 edition = "2024"
 [dependencies]
@@ -22,8 +22,6 @@ hmac = "0.12"
 crc32fast = "1.4"
 crc32c = "0.6"
 zeroize = { version = "1.8", features = ["derive"] }
-subtle = "2.6"
-static_assertions = "1.1"
 # Network
 socket2 = { version = "0.5", features = ["all"] }

@@ -32,12 +32,13 @@ show = "*"
 port = 443
 # proxy_protocol = false # Enable if behind HAProxy/nginx with PROXY protocol
 # metrics_port = 9090
+# metrics_listen = "0.0.0.0:9090" # Listen address for metrics (overrides metrics_port)
 # metrics_whitelist = ["127.0.0.1", "::1", "0.0.0.0/0"]
 [server.api]
 enabled = true
 listen = "0.0.0.0:9091"
-whitelist = ["127.0.0.0/8", "172.16.0.0/12"]
+whitelist = ["127.0.0.0/8"]
 minimal_runtime_enabled = false
 minimal_runtime_cache_ttl_ms = 1000

@@ -6,8 +6,7 @@ services:
 restart: unless-stopped
 ports:
 - "443:443"
-- "127.0.0.1:9090:9090" # Metrics
-- "127.0.0.1:9091:9091" # API
+- "127.0.0.1:9090:9090"
 # Allow caching 'proxy-secret' in read-only container
 working_dir: /run/telemt
 volumes:
@@ -29,8 +28,3 @@ services:
 nofile:
 soft: 65536
 hard: 65536
-logging:
-driver: "json-file"
-options:
-max-size: "10m"
-max-file: "3"

docs/CONFIG_PARAMS.en.md (new file, 289 lines added)

@@ -0,0 +1,289 @@
# Telemt Config Parameters Reference
This document lists all configuration keys accepted by `config.toml`.
> [!WARNING]
>
> The configuration parameters detailed in this document are intended for advanced users and fine-tuning purposes. Modifying these settings without a clear understanding of their function may lead to application instability or other unexpected behavior. Please proceed with caution and at your own risk.
## Top-level keys
| Parameter | Type | Description |
|---|---|---|
| include | `String` (special directive) | Includes another TOML file with `include = "relative/or/absolute/path.toml"`; includes are processed recursively before parsing. |
| show_link | `"*" \| String[]` | Legacy top-level link visibility selector (`"*"` for all users or explicit usernames list). |
| dc_overrides | `Map<String, String[]>` | Overrides DC endpoints for non-standard DCs; key is DC id string, value is `ip:port` list. |
| default_dc | `u8` | Default DC index used for unmapped non-standard DCs. |
## [general]
| Parameter | Type | Description |
|---|---|---|
| data_path | `String` | Optional runtime data directory path. |
| prefer_ipv6 | `bool` | Prefer IPv6 where applicable in runtime logic. |
| fast_mode | `bool` | Enables fast-path optimizations for traffic processing. |
| use_middle_proxy | `bool` | Enables Middle Proxy mode. |
| proxy_secret_path | `String` | Path to proxy secret binary; can be auto-downloaded if absent. |
| proxy_config_v4_cache_path | `String` | Optional cache path for raw `getProxyConfig` (IPv4) snapshot. |
| proxy_config_v6_cache_path | `String` | Optional cache path for raw `getProxyConfigV6` (IPv6) snapshot. |
| ad_tag | `String` | Global fallback ad tag (32 hex characters). |
| middle_proxy_nat_ip | `IpAddr` | Explicit public IP override for NAT environments. |
| middle_proxy_nat_probe | `bool` | Enables NAT probing for Middle Proxy KDF/public address discovery. |
| middle_proxy_nat_stun | `String` | Deprecated legacy single STUN server for NAT probing. |
| middle_proxy_nat_stun_servers | `String[]` | Deprecated legacy STUN list for NAT probing fallback. |
| stun_nat_probe_concurrency | `usize` | Maximum concurrent STUN probes during NAT detection. |
| middle_proxy_pool_size | `usize` | Target size of active Middle Proxy writer pool. |
| middle_proxy_warm_standby | `usize` | Number of warm standby Middle-End connections. |
| me_init_retry_attempts | `u32` | Startup retries for ME pool initialization (`0` means unlimited). |
| me2dc_fallback | `bool` | Allows fallback from ME mode to direct DC when ME startup fails. |
| me_keepalive_enabled | `bool` | Enables ME keepalive padding frames. |
| me_keepalive_interval_secs | `u64` | Keepalive interval in seconds. |
| me_keepalive_jitter_secs | `u64` | Keepalive jitter in seconds. |
| me_keepalive_payload_random | `bool` | Randomizes keepalive payload bytes instead of zero payload. |
| rpc_proxy_req_every | `u64` | Interval for service `RPC_PROXY_REQ` activity signals (`0` disables). |
| me_writer_cmd_channel_capacity | `usize` | Capacity of per-writer command channel. |
| me_route_channel_capacity | `usize` | Capacity of per-connection ME response route channel. |
| me_c2me_channel_capacity | `usize` | Capacity of per-client command queue (client reader -> ME sender). |
| me_reader_route_data_wait_ms | `u64` | Bounded wait for routing ME DATA to per-connection queue (`0` = no wait). |
| me_d2c_flush_batch_max_frames | `usize` | Max ME->client frames coalesced before flush. |
| me_d2c_flush_batch_max_bytes | `usize` | Max ME->client payload bytes coalesced before flush. |
| me_d2c_flush_batch_max_delay_us | `u64` | Max microsecond wait for coalescing more ME->client frames (`0` disables timed coalescing). |
| me_d2c_ack_flush_immediate | `bool` | Flushes client writer immediately after quick-ack write. |
| direct_relay_copy_buf_c2s_bytes | `usize` | Copy buffer size for client->DC direction in direct relay. |
| direct_relay_copy_buf_s2c_bytes | `usize` | Copy buffer size for DC->client direction in direct relay. |
| crypto_pending_buffer | `usize` | Max pending ciphertext buffer per client writer (bytes). |
| max_client_frame | `usize` | Maximum allowed client MTProto frame size (bytes). |
| desync_all_full | `bool` | Emits full crypto-desync forensic logs for every event. |
| beobachten | `bool` | Enables per-IP forensic observation buckets. |
| beobachten_minutes | `u64` | Retention window (minutes) for per-IP observation buckets. |
| beobachten_flush_secs | `u64` | Snapshot flush interval (seconds) for observation output file. |
| beobachten_file | `String` | Observation snapshot output file path. |
| hardswap | `bool` | Enables hard-swap generation switching for ME pool updates. |
| me_warmup_stagger_enabled | `bool` | Enables staggered warmup for extra ME writers. |
| me_warmup_step_delay_ms | `u64` | Base delay between warmup connections (ms). |
| me_warmup_step_jitter_ms | `u64` | Jitter for warmup delay (ms). |
| me_reconnect_max_concurrent_per_dc | `u32` | Max concurrent reconnect attempts per DC. |
| me_reconnect_backoff_base_ms | `u64` | Base reconnect backoff in ms. |
| me_reconnect_backoff_cap_ms | `u64` | Cap reconnect backoff in ms. |
| me_reconnect_fast_retry_count | `u32` | Number of fast retry attempts before backoff. |
| me_single_endpoint_shadow_writers | `u8` | Additional reserve writers for one-endpoint DC groups. |
| me_single_endpoint_outage_mode_enabled | `bool` | Enables aggressive outage recovery for one-endpoint DC groups. |
| me_single_endpoint_outage_disable_quarantine | `bool` | Ignores endpoint quarantine in one-endpoint outage mode. |
| me_single_endpoint_outage_backoff_min_ms | `u64` | Minimum reconnect backoff in outage mode (ms). |
| me_single_endpoint_outage_backoff_max_ms | `u64` | Maximum reconnect backoff in outage mode (ms). |
| me_single_endpoint_shadow_rotate_every_secs | `u64` | Periodic shadow writer rotation interval (`0` disables). |
| me_floor_mode | `"static" \| "adaptive"` | Writer floor policy mode. |
| me_adaptive_floor_idle_secs | `u64` | Idle time before adaptive floor may reduce one-endpoint target. |
| me_adaptive_floor_min_writers_single_endpoint | `u8` | Minimum adaptive writer target for one-endpoint DC groups. |
| me_adaptive_floor_min_writers_multi_endpoint | `u8` | Minimum adaptive writer target for multi-endpoint DC groups. |
| me_adaptive_floor_recover_grace_secs | `u64` | Grace period to hold static floor after activity. |
| me_adaptive_floor_writers_per_core_total | `u16` | Global writer budget per logical CPU core in adaptive mode. |
| me_adaptive_floor_cpu_cores_override | `u16` | Manual CPU core count override (`0` uses auto-detection). |
| me_adaptive_floor_max_extra_writers_single_per_core | `u16` | Per-core max extra writers above base floor for one-endpoint DCs. |
| me_adaptive_floor_max_extra_writers_multi_per_core | `u16` | Per-core max extra writers above base floor for multi-endpoint DCs. |
| me_adaptive_floor_max_active_writers_per_core | `u16` | Hard cap for active ME writers per logical CPU core. |
| me_adaptive_floor_max_warm_writers_per_core | `u16` | Hard cap for warm ME writers per logical CPU core. |
| me_adaptive_floor_max_active_writers_global | `u32` | Hard global cap for active ME writers. |
| me_adaptive_floor_max_warm_writers_global | `u32` | Hard global cap for warm ME writers. |
| upstream_connect_retry_attempts | `u32` | Connect attempts for selected upstream before error/fallback. |
| upstream_connect_retry_backoff_ms | `u64` | Delay between upstream connect attempts (ms). |
| upstream_connect_budget_ms | `u64` | Total wall-clock budget for one upstream connect request (ms). |
| upstream_unhealthy_fail_threshold | `u32` | Consecutive failed requests before upstream is marked unhealthy. |
| upstream_connect_failfast_hard_errors | `bool` | Skips additional retries for hard non-transient connect errors. |
| stun_iface_mismatch_ignore | `bool` | Ignores STUN/interface mismatch and keeps Middle Proxy mode. |
| unknown_dc_log_path | `String` | File path for unknown-DC request logging (`null` disables file path). |
| unknown_dc_file_log_enabled | `bool` | Enables unknown-DC file logging. |
| log_level | `"debug" \| "verbose" \| "normal" \| "silent"` | Runtime logging verbosity. |
| disable_colors | `bool` | Disables ANSI colors in logs. |
| me_socks_kdf_policy | `"strict" \| "compat"` | SOCKS-bound KDF fallback policy for ME handshake. |
| me_route_backpressure_base_timeout_ms | `u64` | Base backpressure timeout for route-channel send (ms). |
| me_route_backpressure_high_timeout_ms | `u64` | High backpressure timeout when queue occupancy exceeds watermark (ms). |
| me_route_backpressure_high_watermark_pct | `u8` | Queue occupancy threshold (%) for high timeout mode. |
| me_health_interval_ms_unhealthy | `u64` | Health monitor interval while writer coverage is degraded (ms). |
| me_health_interval_ms_healthy | `u64` | Health monitor interval while writer coverage is healthy (ms). |
| me_admission_poll_ms | `u64` | Poll interval for conditional-admission checks (ms). |
| me_warn_rate_limit_ms | `u64` | Cooldown for repetitive ME warning logs (ms). |
| me_route_no_writer_mode | `"async_recovery_failfast" \| "inline_recovery_legacy" \| "hybrid_async_persistent"` | Route behavior when no writer is immediately available. |
| me_route_no_writer_wait_ms | `u64` | Max wait in async-recovery failfast mode (ms). |
| me_route_inline_recovery_attempts | `u32` | Inline recovery attempts in legacy mode. |
| me_route_inline_recovery_wait_ms | `u64` | Max inline recovery wait in legacy mode (ms). |
| fast_mode_min_tls_record | `usize` | Minimum TLS record size when fast-mode coalescing is enabled (`0` disables). |
| update_every | `u64` | Unified interval for config/secret updater tasks. |
| me_reinit_every_secs | `u64` | Periodic ME pool reinitialization interval (seconds). |
| me_hardswap_warmup_delay_min_ms | `u64` | Minimum delay between hardswap warmup connects (ms). |
| me_hardswap_warmup_delay_max_ms | `u64` | Maximum delay between hardswap warmup connects (ms). |
| me_hardswap_warmup_extra_passes | `u8` | Additional warmup passes per hardswap cycle. |
| me_hardswap_warmup_pass_backoff_base_ms | `u64` | Base backoff between hardswap warmup passes (ms). |
| me_config_stable_snapshots | `u8` | Number of identical config snapshots required before apply. |
| me_config_apply_cooldown_secs | `u64` | Cooldown between applied ME map updates (seconds). |
| me_snapshot_require_http_2xx | `bool` | Requires 2xx HTTP responses for applying config snapshots. |
| me_snapshot_reject_empty_map | `bool` | Rejects empty config snapshots. |
| me_snapshot_min_proxy_for_lines | `u32` | Minimum parsed `proxy_for` rows required to accept snapshot. |
| proxy_secret_stable_snapshots | `u8` | Number of identical secret snapshots required before runtime rotation. |
| proxy_secret_rotate_runtime | `bool` | Enables runtime proxy-secret rotation from remote source. |
| me_secret_atomic_snapshot | `bool` | Keeps selector and secret bytes from the same snapshot atomically. |
| proxy_secret_len_max | `usize` | Maximum allowed proxy-secret length (bytes). |
| me_pool_drain_ttl_secs | `u64` | Drain TTL for stale ME writers after endpoint-map changes (seconds). |
| me_pool_drain_threshold | `u64` | Max draining stale writers before batch force-close (`0` disables threshold cleanup). |
| me_bind_stale_mode | `"never" \| "ttl" \| "always"` | Policy for new binds on stale draining writers. |
| me_bind_stale_ttl_secs | `u64` | TTL for stale bind allowance when stale mode is `ttl`. |
| me_pool_min_fresh_ratio | `f32` | Minimum desired-DC fresh coverage ratio before draining stale writers. |
| me_reinit_drain_timeout_secs | `u64` | Force-close timeout for stale writers after endpoint-map changes (`0` disables force-close). |
| proxy_secret_auto_reload_secs | `u64` | Deprecated legacy secret reload interval (fallback when `update_every` is not set). |
| proxy_config_auto_reload_secs | `u64` | Deprecated legacy config reload interval (fallback when `update_every` is not set). |
| me_reinit_singleflight | `bool` | Serializes ME reinit cycles across trigger sources. |
| me_reinit_trigger_channel | `usize` | Trigger queue capacity for reinit scheduler. |
| me_reinit_coalesce_window_ms | `u64` | Trigger coalescing window before starting reinit (ms). |
| me_deterministic_writer_sort | `bool` | Enables deterministic candidate sort for writer binding path. |
| me_writer_pick_mode | `"sorted_rr" \| "p2c"` | Writer selection mode for route bind path. |
| me_writer_pick_sample_size | `u8` | Number of candidates sampled by picker in `p2c` mode. |
| ntp_check | `bool` | Enables NTP drift check at startup. |
| ntp_servers | `String[]` | NTP servers used for drift check. |
| auto_degradation_enabled | `bool` | Enables automatic degradation from ME to direct DC. |
| degradation_min_unavailable_dc_groups | `u8` | Minimum unavailable ME DC groups required before degrading. |
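
The three `me_route_backpressure_*` keys in the table above describe a two-level timeout. The following is a minimal Rust sketch of one way to read them, not the Telemt implementation. It assumes that occupancy is queue length divided by capacity and that "exceeds" means strictly greater than the watermark. The function name `route_send_timeout` and its parameters are hypothetical.

```rust
use std::time::Duration;

/// Picks the route-channel send timeout from the current queue occupancy.
/// Hypothetical helper: name, signature, and rounding are assumptions.
fn route_send_timeout(
    queue_len: usize,
    queue_capacity: usize,
    base_timeout_ms: u64,
    high_timeout_ms: u64,
    high_watermark_pct: u8,
) -> Duration {
    // Integer occupancy in percent; a zero-capacity queue is treated as full.
    let occupancy_pct = if queue_capacity == 0 {
        100
    } else {
        queue_len.min(queue_capacity).saturating_mul(100) / queue_capacity
    };
    // Switch to the high timeout only when occupancy is above the watermark.
    let timeout_ms = if occupancy_pct > usize::from(high_watermark_pct) {
        high_timeout_ms
    } else {
        base_timeout_ms
    };
    Duration::from_millis(timeout_ms)
}
```

With illustrative values (100-slot queue, 25 ms base, 120 ms high, 80% watermark), this sketch keeps the base timeout at 80 queued items and switches to the high timeout at 81.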
## [general.modes]
| Parameter | Type | Description |
|---|---|---|
| classic | `bool` | Enables classic MTProxy mode. |
| secure | `bool` | Enables secure mode. |
| tls | `bool` | Enables TLS mode. |
## [general.links]
| Parameter | Type | Description |
|---|---|---|
| show | `"*" \| String[]` | Selects users whose tg:// links are shown at startup. |
| public_host | `String` | Public hostname/IP override for generated tg:// links. |
| public_port | `u16` | Public port override for generated tg:// links. |
## [general.telemetry]
| Parameter | Type | Description |
|---|---|---|
| core_enabled | `bool` | Enables core hot-path telemetry counters. |
| user_enabled | `bool` | Enables per-user telemetry counters. |
| me_level | `"silent" \| "normal" \| "debug"` | Middle-End telemetry verbosity level. |
## [network]
| Parameter | Type | Description |
|---|---|---|
| ipv4 | `bool` | Enables IPv4 networking. |
| ipv6 | `bool` | Enables/disables IPv6 (`null` = auto-detect availability). |
| prefer | `u8` | Preferred IP family for selection (`4` or `6`). |
| multipath | `bool` | Enables multipath behavior where supported. |
| stun_use | `bool` | Global switch for STUN probing. |
| stun_servers | `String[]` | STUN server list for public IP detection. |
| stun_tcp_fallback | `bool` | Enables TCP STUN fallback when UDP STUN is blocked. |
| http_ip_detect_urls | `String[]` | HTTP endpoints used as fallback public IP detectors. |
| cache_public_ip_path | `String` | File path for caching detected public IP. |
| dns_overrides | `String[]` | Runtime DNS overrides in `host:port:ip` format. |
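
The `dns_overrides` row gives the entry format as `host:port:ip`. A rough Rust sketch of parsing one such entry follows. It is an illustration only: the `DnsOverride` struct and `parse_dns_override` function are hypothetical names, and the handling of IPv6 (taking everything after the second colon as the address, without brackets) is an assumption that this document does not confirm.

```rust
use std::net::IpAddr;

/// One parsed `host:port:ip` entry (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq)]
struct DnsOverride {
    host: String,
    port: u16,
    ip: IpAddr,
}

/// Parses a single override entry; returns `None` for malformed input.
fn parse_dns_override(entry: &str) -> Option<DnsOverride> {
    // Split into at most three parts so an IPv6 address keeps its inner colons.
    let mut parts = entry.splitn(3, ':');
    let host = parts.next().filter(|h| !h.is_empty())?;
    let port = parts.next()?.parse::<u16>().ok()?;
    let ip = parts.next()?.parse::<IpAddr>().ok()?;
    Some(DnsOverride {
        host: host.to_string(),
        port,
        ip,
    })
}
```

Under these assumptions, `example.com:443:203.0.113.10` parses into host, port, and IPv4 address, while an entry with a missing part or a non-numeric port is rejected.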
## [server]
| Parameter | Type | Description |
|---|---|---|
| port | `u16` | Main proxy listen port. |
| listen_addr_ipv4 | `String` | IPv4 bind address for TCP listener. |
| listen_addr_ipv6 | `String` | IPv6 bind address for TCP listener. |
| listen_unix_sock | `String` | Unix socket path for listener. |
| listen_unix_sock_perm | `String` | Unix socket permissions in octal string (e.g., `"0666"`). |
| listen_tcp | `bool` | Explicit TCP listener enable/disable override. |
| proxy_protocol | `bool` | Enables HAProxy PROXY protocol parsing on incoming client connections. |
| proxy_protocol_header_timeout_ms | `u64` | Timeout for PROXY protocol header read/parse (ms). |
| metrics_port | `u16` | Metrics endpoint port (enables metrics listener). |
| metrics_listen | `String` | Full metrics bind address (`IP:PORT`), overrides `metrics_port`. |
| metrics_whitelist | `IpNetwork[]` | CIDR whitelist for metrics endpoint access. |
| max_connections | `u32` | Max concurrent client connections (`0` = unlimited). |
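
According to the table above, `metrics_listen` is a full `IP:PORT` address that overrides `metrics_port`. The precedence can be sketched in Rust as below. This is not the project's `spawn_metrics_if_configured` logic: the function name `resolve_metrics_addr` is hypothetical, and the fallback IP used when only the port is set is an assumption, because this document does not state the real default.

```rust
use std::net::{AddrParseError, IpAddr, Ipv4Addr, SocketAddr};

/// Assumed fallback IP when only `metrics_port` is set (not confirmed by the docs).
const ASSUMED_DEFAULT_IP: IpAddr = IpAddr::V4(Ipv4Addr::UNSPECIFIED);

/// Resolves the metrics bind address; `metrics_listen` wins over `metrics_port`.
/// Returns `Ok(None)` when neither key is set, meaning no metrics listener.
fn resolve_metrics_addr(
    metrics_listen: Option<&str>,
    metrics_port: Option<u16>,
) -> Result<Option<SocketAddr>, AddrParseError> {
    match (metrics_listen, metrics_port) {
        // An explicit `IP:PORT` string takes priority; a malformed value is an error.
        (Some(listen), _) => listen.parse::<SocketAddr>().map(Some),
        // Otherwise fall back to the port-only form on the assumed default IP.
        (None, Some(port)) => Ok(Some(SocketAddr::new(ASSUMED_DEFAULT_IP, port))),
        (None, None) => Ok(None),
    }
}
```

Returning an error for an unparsable `metrics_listen` value, instead of silently falling back to `metrics_port`, is a design choice of this sketch.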
## [server.api]
| Parameter | Type | Description |
|---|---|---|
| enabled | `bool` | Enables control-plane REST API. |
| listen | `String` | API bind address in `IP:PORT` format. |
| whitelist | `IpNetwork[]` | CIDR whitelist allowed to access API. |
| auth_header | `String` | Exact expected `Authorization` header value (empty = disabled). |
| request_body_limit_bytes | `usize` | Maximum accepted HTTP request body size. |
| minimal_runtime_enabled | `bool` | Enables minimal runtime snapshots endpoint logic. |
| minimal_runtime_cache_ttl_ms | `u64` | Cache TTL for minimal runtime snapshots (ms; `0` disables cache). |
| runtime_edge_enabled | `bool` | Enables runtime edge endpoints. |
| runtime_edge_cache_ttl_ms | `u64` | Cache TTL for runtime edge aggregation payloads (ms). |
| runtime_edge_top_n | `usize` | Top-N size for edge connection leaderboard. |
| runtime_edge_events_capacity | `usize` | Ring-buffer capacity for runtime edge events. |
| read_only | `bool` | Rejects mutating API endpoints when enabled. |
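
The `auth_header` row describes an exact-match check that is disabled when the value is empty. A minimal Rust sketch of that rule is shown below. The function name `authorization_ok` and its parameters are hypothetical, and the plain `==` comparison is a simplification: whether the real check is constant-time is not stated here.

```rust
/// Returns true when a request may proceed under the `auth_header` rule.
/// Hypothetical helper; not taken from the Telemt source.
fn authorization_ok(configured_auth_header: &str, request_header: Option<&str>) -> bool {
    if configured_auth_header.is_empty() {
        // Empty setting: the check is disabled and every request passes.
        return true;
    }
    // Exact, case-sensitive comparison of the whole header value.
    request_header == Some(configured_auth_header)
}
```

In this sketch a missing `Authorization` header fails the check whenever a non-empty value is configured.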
## [[server.listeners]]
| Parameter | Type | Description |
|---|---|---|
| ip | `IpAddr` | Listener bind IP. |
| announce | `String` | Public IP/domain announced in proxy links (priority over `announce_ip`). |
| announce_ip | `IpAddr` | Deprecated legacy announce IP (migrated to `announce` if needed). |
| proxy_protocol | `bool` | Per-listener override for PROXY protocol enable flag. |
| reuse_allow | `bool` | Enables `SO_REUSEPORT` for multi-instance bind sharing. |
## [timeouts]
| Parameter | Type | Description |
|---|---|---|
| client_handshake | `u64` | Client handshake timeout. |
| tg_connect | `u64` | Upstream Telegram connect timeout. |
| client_keepalive | `u64` | Client keepalive timeout. |
| client_ack | `u64` | Client ACK timeout. |
| me_one_retry | `u8` | Quick ME reconnect attempts for single-address DC. |
| me_one_timeout_ms | `u64` | Timeout per quick attempt for single-address DC (ms). |
## [censorship]
| Parameter | Type | Description |
|---|---|---|
| tls_domain | `String` | Primary TLS domain used in fake TLS handshake profile. |
| tls_domains | `String[]` | Additional TLS domains for generating multiple links. |
| mask | `bool` | Enables masking/fronting relay mode. |
| mask_host | `String` | Upstream mask host for TLS fronting relay. |
| mask_port | `u16` | Upstream mask port for TLS fronting relay. |
| mask_unix_sock | `String` | Unix socket path for mask backend instead of TCP host/port. |
| fake_cert_len | `usize` | Length of synthetic certificate payload when emulation data is unavailable. |
| tls_emulation | `bool` | Enables certificate/TLS behavior emulation from cached real fronts. |
| tls_front_dir | `String` | Directory path for TLS front cache storage. |
| server_hello_delay_min_ms | `u64` | Minimum server_hello delay for anti-fingerprint behavior (ms). |
| server_hello_delay_max_ms | `u64` | Maximum server_hello delay for anti-fingerprint behavior (ms). |
| tls_new_session_tickets | `u8` | Number of `NewSessionTicket` messages to emit after handshake. |
| tls_full_cert_ttl_secs | `u64` | TTL for sending full cert payload per (domain, client IP) tuple. |
| alpn_enforce | `bool` | Enforces ALPN echo behavior based on client preference. |
| mask_proxy_protocol | `u8` | PROXY protocol mode for mask backend (`0` disabled, `1` v1, `2` v2). |
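
The `mask_proxy_protocol` row maps small integers to modes (`0` disabled, `1` v1, `2` v2). One way to make that mapping explicit is an enum, sketched below in Rust. The type `MaskProxyProtocol` and the method `from_config` are hypothetical names, and rejecting values above `2` is an assumption about how unknown values should be treated.

```rust
/// PROXY protocol mode for the mask backend (hypothetical illustration type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MaskProxyProtocol {
    Disabled,
    V1,
    V2,
}

impl MaskProxyProtocol {
    /// Maps the raw config value to a mode; unknown values yield `None`.
    fn from_config(value: u8) -> Option<Self> {
        match value {
            0 => Some(Self::Disabled),
            1 => Some(Self::V1),
            2 => Some(Self::V2),
            _ => None,
        }
    }
}
```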
## [access]
| Parameter | Type | Description |
|---|---|---|
| users | `Map<String, String>` | Username -> 32-hex secret mapping. |
| user_ad_tags | `Map<String, String>` | Per-user ad tags (32 hex chars). |
| user_max_tcp_conns | `Map<String, usize>` | Per-user maximum concurrent TCP connections. |
| user_expirations | `Map<String, DateTime<Utc>>` | Per-user account expiration timestamps. |
| user_data_quota | `Map<String, u64>` | Per-user data quota limits. |
| user_max_unique_ips | `Map<String, usize>` | Per-user unique source IP limits. |
| user_max_unique_ips_global_each | `usize` | Global fallback per-user unique IP limit when no per-user override exists. |
| user_max_unique_ips_mode | `"active_window" \| "time_window" \| "combined"` | Unique source IP limit accounting mode. |
| user_max_unique_ips_window_secs | `u64` | Recent-window size for unique IP accounting (seconds). |
| replay_check_len | `usize` | Replay check storage length. |
| replay_window_secs | `u64` | Replay protection time window in seconds. |
| ignore_time_skew | `bool` | Ignores client/server timestamp skew in replay validation. |
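
The `users` row specifies a 32-hex-character secret per username. The check and the decoding into 16 raw bytes can be sketched in Rust as follows. Both function names are hypothetical, and accepting upper-case hex digits is an assumption of this sketch.

```rust
/// Checks the documented shape of a user secret: exactly 32 hex characters.
fn is_valid_secret(secret: &str) -> bool {
    secret.len() == 32 && secret.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Decodes a valid secret into its 16 raw bytes; returns `None` otherwise.
fn decode_secret(secret: &str) -> Option<[u8; 16]> {
    if !is_valid_secret(secret) {
        return None;
    }
    let mut out = [0u8; 16];
    for (i, byte) in out.iter_mut().enumerate() {
        // Safe to slice by byte index: the input is validated as ASCII hex.
        *byte = u8::from_str_radix(&secret[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(out)
}
```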
## [[upstreams]]
| Parameter | Type | Description |
|---|---|---|
| type | `"direct" \| "socks4" \| "socks5"` | Upstream transport type selector. |
| weight | `u16` | Weighted selection coefficient for this upstream. |
| enabled | `bool` | Enables/disables this upstream entry. |
| scopes | `String` | Comma-separated scope tags for routing. |
| interface | `String` | Optional outgoing interface name (`direct`, `socks4`, `socks5`). |
| bind_addresses | `String[]` | Optional source bind addresses for `direct` upstream. |
| address | `String` | Upstream proxy address (`host:port`) for SOCKS upstreams. |
| user_id | `String` | SOCKS4 user ID (only for `type = "socks4"`). |
| username | `String` | SOCKS5 username (only for `type = "socks5"`). |
| password | `String` | SOCKS5 password (only for `type = "socks5"`). |
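
The `weight` row calls the value a weighted selection coefficient but does not describe the selection algorithm. As an illustration only, the Rust sketch below shows a common proportional scheme in which an upstream with weight 3 is chosen three times as often as one with weight 1. The function `pick_weighted` is hypothetical, the caller supplies the random `roll`, and filtering by `enabled` or `scopes` is left out.

```rust
/// Picks an index in proportion to its weight, given a caller-supplied roll.
/// Returns `None` when the list is empty or every weight is zero.
fn pick_weighted(weights: &[u16], roll: u32) -> Option<usize> {
    let total: u32 = weights.iter().map(|&w| u32::from(w)).sum();
    if total == 0 {
        return None;
    }
    // Reduce the roll into [0, total) and walk the cumulative weights.
    let mut remaining = roll % total;
    for (index, &weight) in weights.iter().enumerate() {
        let weight = u32::from(weight);
        if remaining < weight {
            return Some(index);
        }
        remaining -= weight;
    }
    None
}
```

Taking the roll as a parameter keeps the sketch deterministic and free of external crates; a real caller would pass a value from its random number generator.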

@@ -38,8 +38,9 @@ umweltschutz.de -> A record 198.18.88.88
 In the Telemt configuration:
-```
-tls_domain = umweltschutz.de
+```toml
+[censorship]
+tls_domain = "umweltschutz.de"
 ```
 The client uses this domain as the SNI in ClientHello
@@ -56,8 +57,9 @@ tls_domain = umweltschutz.de
 In the Telemt configuration:
-```
-mask_host = 127.0.0.1
+```toml
+[censorship]
+mask_host = "127.0.0.1"
 mask_port = 8443
 ```
@@ -151,16 +153,18 @@ mask_host:mask_port
 For example:
-```
-tls_domain = github.com
-mask_host = github.com
+```toml
+[censorship]
+tls_domain = "github.com"
+mask_host = "github.com"
 mask_port = 443
 ```
 or
-```
-mask_host = 140.82.121.4
+```toml
+[censorship]
+mask_host = "140.82.121.4"
 ```
 In this case:

@@ -195,6 +195,8 @@ pub(super) struct ZeroPoolData {
 pub(super) pool_swap_total: u64,
 pub(super) pool_drain_active: u64,
 pub(super) pool_force_close_total: u64,
+pub(super) pool_drain_soft_evict_total: u64,
+pub(super) pool_drain_soft_evict_writer_total: u64,
 pub(super) pool_stale_pick_total: u64,
 pub(super) writer_removed_total: u64,
 pub(super) writer_removed_unexpected_total: u64,
@@ -235,6 +237,7 @@ pub(super) struct MeWritersSummary {
 pub(super) available_pct: f64,
 pub(super) required_writers: usize,
 pub(super) alive_writers: usize,
+pub(super) coverage_ratio: f64,
 pub(super) coverage_pct: f64,
 pub(super) fresh_alive_writers: usize,
 pub(super) fresh_coverage_pct: f64,
@@ -283,6 +286,7 @@ pub(super) struct DcStatus {
 pub(super) floor_max: usize,
 pub(super) floor_capped: bool,
 pub(super) alive_writers: usize,
+pub(super) coverage_ratio: f64,
 pub(super) coverage_pct: f64,
 pub(super) fresh_alive_writers: usize,
 pub(super) fresh_coverage_pct: f64,
@@ -360,6 +364,11 @@ pub(super) struct MinimalMeRuntimeData {
 pub(super) me_reconnect_backoff_cap_ms: u64,
 pub(super) me_reconnect_fast_retry_count: u32,
 pub(super) me_pool_drain_ttl_secs: u64,
+pub(super) me_pool_drain_soft_evict_enabled: bool,
+pub(super) me_pool_drain_soft_evict_grace_secs: u64,
+pub(super) me_pool_drain_soft_evict_per_writer: u8,
+pub(super) me_pool_drain_soft_evict_budget_per_core: u16,
+pub(super) me_pool_drain_soft_evict_cooldown_ms: u64,
 pub(super) me_pool_force_close_secs: u64,
 pub(super) me_pool_min_fresh_ratio: f32,
 pub(super) me_bind_stale_mode: &'static str,
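
The hunks above expose `coverage_ratio` next to the existing `coverage_pct`, and the commit titles describe the ratio as a draining factor. The Rust sketch below shows how such fields could relate. It is an assumption-laden illustration, not the Telemt code: the three function names are hypothetical, treating zero required writers as full coverage is a guess, and the drain gate merely compares a fresh-coverage ratio against `me_pool_min_fresh_ratio`, as that key's description suggests.

```rust
/// Alive writers divided by required writers, as an unclamped ratio.
/// Assumption: zero required writers counts as full coverage.
fn coverage_ratio(alive_writers: usize, required_writers: usize) -> f64 {
    if required_writers == 0 {
        1.0
    } else {
        alive_writers as f64 / required_writers as f64
    }
}

/// The same coverage expressed in percent.
fn coverage_pct(ratio: f64) -> f64 {
    ratio * 100.0
}

/// Hypothetical drain gate: stale writers may drain only once fresh
/// coverage reaches the configured minimum ratio.
fn may_drain_stale(fresh_ratio: f64, min_fresh_ratio: f32) -> bool {
    fresh_ratio >= f64::from(min_fresh_ratio)
}
```

For example, 3 alive writers out of 4 required give a ratio of 0.75 (75%), which would not pass a minimum fresh ratio of 0.8 in this sketch.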

@@ -113,6 +113,7 @@ pub(super) struct RuntimeMeQualityDcRttData {
pub(super) rtt_ema_ms: Option<f64>,
pub(super) alive_writers: usize,
pub(super) required_writers: usize,
pub(super) coverage_ratio: f64,
pub(super) coverage_pct: f64,
}
@@ -388,6 +389,7 @@ pub(super) async fn build_runtime_me_quality_data(shared: &ApiShared) -> Runtime
rtt_ema_ms: dc.rtt_ms,
alive_writers: dc.alive_writers,
required_writers: dc.required_writers,
coverage_ratio: dc.coverage_ratio,
coverage_pct: dc.coverage_pct,
})
.collect(),
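
Note on `coverage_ratio`: the diff exposes the new field next to `coverage_pct` but does not show how either value is computed. A minimal sketch, assuming the ratio is `alive_writers / required_writers` and the percentage is that ratio times 100. The helper name `coverage` is hypothetical, and the zero-required convention is borrowed from `disabled_me_writers`, which reports 0.0 for both fields.

```rust
/// Hypothetical helper: returns `(coverage_ratio, coverage_pct)`.
/// Assumption (not shown in the diff): ratio = alive / required, pct = ratio * 100.
fn coverage(alive_writers: usize, required_writers: usize) -> (f64, f64) {
    if required_writers == 0 {
        // Assumed convention: with nothing required, report zero coverage
        // instead of dividing by zero.
        return (0.0, 0.0);
    }
    let ratio = alive_writers as f64 / required_writers as f64;
    (ratio, ratio * 100.0)
}
```

Under these assumptions, `coverage(3, 4)` yields a ratio of 0.75 and a percentage of 75.0, so API consumers could use the ratio directly as a scaling factor without dividing the percentage by 100.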

@@ -96,6 +96,8 @@ pub(super) fn build_zero_all_data(stats: &Stats, configured_users: usize) -> Zer
pool_swap_total: stats.get_pool_swap_total(),
pool_drain_active: stats.get_pool_drain_active(),
pool_force_close_total: stats.get_pool_force_close_total(),
pool_drain_soft_evict_total: stats.get_pool_drain_soft_evict_total(),
pool_drain_soft_evict_writer_total: stats.get_pool_drain_soft_evict_writer_total(),
pool_stale_pick_total: stats.get_pool_stale_pick_total(),
writer_removed_total: stats.get_me_writer_removed_total(),
writer_removed_unexpected_total: stats.get_me_writer_removed_unexpected_total(),
@@ -313,6 +315,7 @@ async fn get_minimal_payload_cached(
available_pct: status.available_pct,
required_writers: status.required_writers,
alive_writers: status.alive_writers,
coverage_ratio: status.coverage_ratio,
coverage_pct: status.coverage_pct,
fresh_alive_writers: status.fresh_alive_writers,
fresh_coverage_pct: status.fresh_coverage_pct,
@@ -370,6 +373,7 @@ async fn get_minimal_payload_cached(
floor_max: entry.floor_max,
floor_capped: entry.floor_capped,
alive_writers: entry.alive_writers,
coverage_ratio: entry.coverage_ratio,
coverage_pct: entry.coverage_pct,
fresh_alive_writers: entry.fresh_alive_writers,
fresh_coverage_pct: entry.fresh_coverage_pct,
@@ -427,6 +431,11 @@ async fn get_minimal_payload_cached(
me_reconnect_backoff_cap_ms: runtime.me_reconnect_backoff_cap_ms,
me_reconnect_fast_retry_count: runtime.me_reconnect_fast_retry_count,
me_pool_drain_ttl_secs: runtime.me_pool_drain_ttl_secs,
me_pool_drain_soft_evict_enabled: runtime.me_pool_drain_soft_evict_enabled,
me_pool_drain_soft_evict_grace_secs: runtime.me_pool_drain_soft_evict_grace_secs,
me_pool_drain_soft_evict_per_writer: runtime.me_pool_drain_soft_evict_per_writer,
me_pool_drain_soft_evict_budget_per_core: runtime.me_pool_drain_soft_evict_budget_per_core,
me_pool_drain_soft_evict_cooldown_ms: runtime.me_pool_drain_soft_evict_cooldown_ms,
me_pool_force_close_secs: runtime.me_pool_force_close_secs,
me_pool_min_fresh_ratio: runtime.me_pool_min_fresh_ratio,
me_bind_stale_mode: runtime.me_bind_stale_mode,
@@ -495,6 +504,7 @@ fn disabled_me_writers(now_epoch_secs: u64, reason: &'static str) -> MeWritersDa
available_pct: 0.0,
required_writers: 0,
alive_writers: 0,
coverage_ratio: 0.0,
coverage_pct: 0.0,
fresh_alive_writers: 0,
fresh_coverage_pct: 0.0,

@@ -27,8 +27,8 @@ const DEFAULT_ME_C2ME_CHANNEL_CAPACITY: usize = 1024;
const DEFAULT_ME_READER_ROUTE_DATA_WAIT_MS: u64 = 2;
const DEFAULT_ME_D2C_FLUSH_BATCH_MAX_FRAMES: usize = 32;
const DEFAULT_ME_D2C_FLUSH_BATCH_MAX_BYTES: usize = 128 * 1024;
-const DEFAULT_ME_D2C_FLUSH_BATCH_MAX_DELAY_US: u64 = 1500;
-const DEFAULT_ME_D2C_ACK_FLUSH_IMMEDIATE: bool = false;
+const DEFAULT_ME_D2C_FLUSH_BATCH_MAX_DELAY_US: u64 = 500;
+const DEFAULT_ME_D2C_ACK_FLUSH_IMMEDIATE: bool = true;
const DEFAULT_DIRECT_RELAY_COPY_BUF_C2S_BYTES: usize = 64 * 1024;
const DEFAULT_DIRECT_RELAY_COPY_BUF_S2C_BYTES: usize = 256 * 1024;
const DEFAULT_ME_WRITER_PICK_SAMPLE_SIZE: u8 = 3;
@@ -36,6 +36,11 @@ const DEFAULT_ME_HEALTH_INTERVAL_MS_UNHEALTHY: u64 = 1000;
const DEFAULT_ME_HEALTH_INTERVAL_MS_HEALTHY: u64 = 3000;
const DEFAULT_ME_ADMISSION_POLL_MS: u64 = 1000;
const DEFAULT_ME_WARN_RATE_LIMIT_MS: u64 = 5000;
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_ENABLED: bool = true;
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_GRACE_SECS: u64 = 30;
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_PER_WRITER: u8 = 1;
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_BUDGET_PER_CORE: u16 = 8;
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_COOLDOWN_MS: u64 = 5000;
const DEFAULT_USER_MAX_UNIQUE_IPS_WINDOW_SECS: u64 = 30;
const DEFAULT_UPSTREAM_CONNECT_RETRY_ATTEMPTS: u32 = 2;
const DEFAULT_UPSTREAM_UNHEALTHY_FAIL_THRESHOLD: u32 = 5;
@@ -85,11 +90,11 @@ pub(crate) fn default_connect_timeout() -> u64 {
}
pub(crate) fn default_keepalive() -> u64 {
-    60
+    15
}
pub(crate) fn default_ack_timeout() -> u64 {
-    300
+    90
}
pub(crate) fn default_me_one_retry() -> u8 {
12
@@ -592,6 +597,26 @@ pub(crate) fn default_me_pool_drain_threshold() -> u64 {
128
}
pub(crate) fn default_me_pool_drain_soft_evict_enabled() -> bool {
DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_ENABLED
}
pub(crate) fn default_me_pool_drain_soft_evict_grace_secs() -> u64 {
DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_GRACE_SECS
}
pub(crate) fn default_me_pool_drain_soft_evict_per_writer() -> u8 {
DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_PER_WRITER
}
pub(crate) fn default_me_pool_drain_soft_evict_budget_per_core() -> u16 {
DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_BUDGET_PER_CORE
}
pub(crate) fn default_me_pool_drain_soft_evict_cooldown_ms() -> u64 {
DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_COOLDOWN_MS
}
pub(crate) fn default_me_bind_stale_ttl_secs() -> u64 {
default_me_pool_drain_ttl_secs()
}

@@ -56,6 +56,11 @@ pub struct HotFields {
pub hardswap: bool,
pub me_pool_drain_ttl_secs: u64,
pub me_pool_drain_threshold: u64,
pub me_pool_drain_soft_evict_enabled: bool,
pub me_pool_drain_soft_evict_grace_secs: u64,
pub me_pool_drain_soft_evict_per_writer: u8,
pub me_pool_drain_soft_evict_budget_per_core: u16,
pub me_pool_drain_soft_evict_cooldown_ms: u64,
pub me_pool_min_fresh_ratio: f32,
pub me_reinit_drain_timeout_secs: u64,
pub me_hardswap_warmup_delay_min_ms: u64,
@@ -138,6 +143,15 @@ impl HotFields {
hardswap: cfg.general.hardswap,
me_pool_drain_ttl_secs: cfg.general.me_pool_drain_ttl_secs,
me_pool_drain_threshold: cfg.general.me_pool_drain_threshold,
me_pool_drain_soft_evict_enabled: cfg.general.me_pool_drain_soft_evict_enabled,
me_pool_drain_soft_evict_grace_secs: cfg.general.me_pool_drain_soft_evict_grace_secs,
me_pool_drain_soft_evict_per_writer: cfg.general.me_pool_drain_soft_evict_per_writer,
me_pool_drain_soft_evict_budget_per_core: cfg
.general
.me_pool_drain_soft_evict_budget_per_core,
me_pool_drain_soft_evict_cooldown_ms: cfg
.general
.me_pool_drain_soft_evict_cooldown_ms,
me_pool_min_fresh_ratio: cfg.general.me_pool_min_fresh_ratio,
me_reinit_drain_timeout_secs: cfg.general.me_reinit_drain_timeout_secs,
me_hardswap_warmup_delay_min_ms: cfg.general.me_hardswap_warmup_delay_min_ms,
@@ -455,6 +469,15 @@ fn overlay_hot_fields(old: &ProxyConfig, new: &ProxyConfig) -> ProxyConfig {
cfg.general.hardswap = new.general.hardswap;
cfg.general.me_pool_drain_ttl_secs = new.general.me_pool_drain_ttl_secs;
cfg.general.me_pool_drain_threshold = new.general.me_pool_drain_threshold;
cfg.general.me_pool_drain_soft_evict_enabled = new.general.me_pool_drain_soft_evict_enabled;
cfg.general.me_pool_drain_soft_evict_grace_secs =
new.general.me_pool_drain_soft_evict_grace_secs;
cfg.general.me_pool_drain_soft_evict_per_writer =
new.general.me_pool_drain_soft_evict_per_writer;
cfg.general.me_pool_drain_soft_evict_budget_per_core =
new.general.me_pool_drain_soft_evict_budget_per_core;
cfg.general.me_pool_drain_soft_evict_cooldown_ms =
new.general.me_pool_drain_soft_evict_cooldown_ms;
cfg.general.me_pool_min_fresh_ratio = new.general.me_pool_min_fresh_ratio;
cfg.general.me_reinit_drain_timeout_secs = new.general.me_reinit_drain_timeout_secs;
cfg.general.me_hardswap_warmup_delay_min_ms = new.general.me_hardswap_warmup_delay_min_ms;
@@ -835,6 +858,25 @@ fn log_changes(
old_hot.me_pool_drain_threshold, new_hot.me_pool_drain_threshold,
);
}
if old_hot.me_pool_drain_soft_evict_enabled != new_hot.me_pool_drain_soft_evict_enabled
|| old_hot.me_pool_drain_soft_evict_grace_secs
!= new_hot.me_pool_drain_soft_evict_grace_secs
|| old_hot.me_pool_drain_soft_evict_per_writer
!= new_hot.me_pool_drain_soft_evict_per_writer
|| old_hot.me_pool_drain_soft_evict_budget_per_core
!= new_hot.me_pool_drain_soft_evict_budget_per_core
|| old_hot.me_pool_drain_soft_evict_cooldown_ms
!= new_hot.me_pool_drain_soft_evict_cooldown_ms
{
info!(
"config reload: me_pool_drain_soft_evict: enabled={} grace={}s per_writer={} budget_per_core={} cooldown={}ms",
new_hot.me_pool_drain_soft_evict_enabled,
new_hot.me_pool_drain_soft_evict_grace_secs,
new_hot.me_pool_drain_soft_evict_per_writer,
new_hot.me_pool_drain_soft_evict_budget_per_core,
new_hot.me_pool_drain_soft_evict_cooldown_ms
);
}
if (old_hot.me_pool_min_fresh_ratio - new_hot.me_pool_min_fresh_ratio).abs() > f32::EPSILON {
info!(

@@ -406,6 +406,35 @@ impl ProxyConfig {
));
}
if config.general.me_pool_drain_soft_evict_grace_secs > 3600 {
return Err(ProxyError::Config(
"general.me_pool_drain_soft_evict_grace_secs must be within [0, 3600]".to_string(),
));
}
if config.general.me_pool_drain_soft_evict_per_writer == 0
|| config.general.me_pool_drain_soft_evict_per_writer > 16
{
return Err(ProxyError::Config(
"general.me_pool_drain_soft_evict_per_writer must be within [1, 16]".to_string(),
));
}
if config.general.me_pool_drain_soft_evict_budget_per_core == 0
|| config.general.me_pool_drain_soft_evict_budget_per_core > 64
{
return Err(ProxyError::Config(
"general.me_pool_drain_soft_evict_budget_per_core must be within [1, 64]"
.to_string(),
));
}
if config.general.me_pool_drain_soft_evict_cooldown_ms == 0 {
return Err(ProxyError::Config(
"general.me_pool_drain_soft_evict_cooldown_ms must be > 0".to_string(),
));
}
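
Note on the range checks above: they can be exercised in isolation. The sketch below restates them on a stand-in struct; `SoftEvictConfig`, its shortened field names, and the `String` error type are hypothetical (the real code validates `GeneralConfig` and returns `ProxyError::Config`). The default values are copied from the `DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_*` constants in this diff.

```rust
/// Hypothetical stand-in for the soft-eviction subset of `GeneralConfig`.
#[derive(Debug, Clone, PartialEq)]
struct SoftEvictConfig {
    enabled: bool,
    grace_secs: u64,
    per_writer: u8,
    budget_per_core: u16,
    cooldown_ms: u64,
}

impl Default for SoftEvictConfig {
    fn default() -> Self {
        // Values taken from the DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_* constants.
        Self {
            enabled: true,
            grace_secs: 30,
            per_writer: 1,
            budget_per_core: 8,
            cooldown_ms: 5000,
        }
    }
}

impl SoftEvictConfig {
    /// Applies the same bounds as the checks in the hunk above.
    /// The error is a plain `String` here, standing in for `ProxyError::Config`.
    fn validate(&self) -> Result<(), String> {
        if self.grace_secs > 3600 {
            return Err("grace_secs must be within [0, 3600]".to_string());
        }
        if !(1..=16).contains(&self.per_writer) {
            return Err("per_writer must be within [1, 16]".to_string());
        }
        if !(1..=64).contains(&self.budget_per_core) {
            return Err("budget_per_core must be within [1, 64]".to_string());
        }
        if self.cooldown_ms == 0 {
            return Err("cooldown_ms must be > 0".to_string());
        }
        Ok(())
    }
}
```

With this shape, the shipped defaults pass validation, while a value such as `per_writer = 0` or `grace_secs = 3601` is rejected before the pool is built.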
if config.access.user_max_unique_ips_window_secs == 0 {
return Err(ProxyError::Config(
"access.user_max_unique_ips_window_secs must be > 0".to_string(),

@@ -803,6 +803,26 @@ pub struct GeneralConfig {
#[serde(default = "default_me_pool_drain_threshold")]
pub me_pool_drain_threshold: u64,
/// Enable staged client eviction for draining ME writers that remain non-empty past TTL.
#[serde(default = "default_me_pool_drain_soft_evict_enabled")]
pub me_pool_drain_soft_evict_enabled: bool,
/// Extra grace in seconds after drain TTL before soft-eviction stage starts.
#[serde(default = "default_me_pool_drain_soft_evict_grace_secs")]
pub me_pool_drain_soft_evict_grace_secs: u64,
/// Maximum number of client sessions to evict from one draining writer per health tick.
#[serde(default = "default_me_pool_drain_soft_evict_per_writer")]
pub me_pool_drain_soft_evict_per_writer: u8,
/// Soft-eviction budget per CPU core for one health tick.
#[serde(default = "default_me_pool_drain_soft_evict_budget_per_core")]
pub me_pool_drain_soft_evict_budget_per_core: u16,
/// Cooldown for repetitive soft-eviction on the same writer in milliseconds.
#[serde(default = "default_me_pool_drain_soft_evict_cooldown_ms")]
pub me_pool_drain_soft_evict_cooldown_ms: u64,
/// Policy for new binds on stale draining writers.
#[serde(default)]
pub me_bind_stale_mode: MeBindStaleMode,
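
Note on how the soft-eviction knobs might interact: the doc comments above describe a grace period, a per-writer cap, a per-core budget, and a per-writer cooldown, but the eviction loop itself is not part of this diff. The following is only a sketch of one plausible reading. `DrainingWriter`, `SoftEvictParams`, `plan_soft_evictions`, and the rule that the tick budget equals `budget_per_core * cores` are all assumptions made for illustration.

```rust
/// Hypothetical snapshot of one draining writer at a health tick.
struct DrainingWriter {
    id: u64,
    bound_clients: usize,
    /// Seconds since the writer entered the draining state.
    drain_age_secs: u64,
    /// Milliseconds since the last soft eviction on this writer, if any.
    ms_since_last_evict: Option<u64>,
}

/// Knobs named after the config fields in this diff.
struct SoftEvictParams {
    drain_ttl_secs: u64,
    grace_secs: u64,
    per_writer: u8,
    budget_per_core: u16,
    cooldown_ms: u64,
}

/// Returns `(writer id, sessions to evict)` pairs for one health tick.
fn plan_soft_evictions(
    writers: &[DrainingWriter],
    p: &SoftEvictParams,
    cores: usize,
) -> Vec<(u64, usize)> {
    // Assumed: the per-core budget scales with the number of cores.
    let mut budget = usize::from(p.budget_per_core) * cores.max(1);
    let mut plan = Vec::new();
    for w in writers {
        if budget == 0 {
            break;
        }
        // Eligible only after drain TTL plus the extra grace period.
        let past_grace = w.drain_age_secs >= p.drain_ttl_secs.saturating_add(p.grace_secs);
        // Skip writers that were soft-evicted more recently than the cooldown.
        let cooled_down = w.ms_since_last_evict.map_or(true, |ms| ms >= p.cooldown_ms);
        if !past_grace || !cooled_down || w.bound_clients == 0 {
            continue;
        }
        // Cap by the per-writer limit and by what is left of the tick budget.
        let n = w.bound_clients.min(usize::from(p.per_writer)).min(budget);
        budget -= n;
        plan.push((w.id, n));
    }
    plan
}
```

Under this reading, the defaults (`per_writer = 1`, `budget_per_core = 8`) would evict at most one session per stuck writer per tick, which keeps the drain gradual rather than closing a whole writer at once.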
@@ -984,6 +1004,13 @@ impl Default for GeneralConfig {
proxy_secret_len_max: default_proxy_secret_len_max(),
me_pool_drain_ttl_secs: default_me_pool_drain_ttl_secs(),
me_pool_drain_threshold: default_me_pool_drain_threshold(),
me_pool_drain_soft_evict_enabled: default_me_pool_drain_soft_evict_enabled(),
me_pool_drain_soft_evict_grace_secs: default_me_pool_drain_soft_evict_grace_secs(),
me_pool_drain_soft_evict_per_writer: default_me_pool_drain_soft_evict_per_writer(),
me_pool_drain_soft_evict_budget_per_core:
default_me_pool_drain_soft_evict_budget_per_core(),
me_pool_drain_soft_evict_cooldown_ms:
default_me_pool_drain_soft_evict_cooldown_ms(),
me_bind_stale_mode: MeBindStaleMode::default(),
me_bind_stale_ttl_secs: default_me_bind_stale_ttl_secs(),
me_pool_min_fresh_ratio: default_me_pool_min_fresh_ratio(),
@@ -1156,16 +1183,17 @@ pub struct ServerConfig {
#[serde(default = "default_proxy_protocol_header_timeout_ms")]
pub proxy_protocol_header_timeout_ms: u64,
/// Trusted source CIDRs allowed to send incoming PROXY protocol headers.
///
/// When non-empty, connections from addresses outside this allowlist are
/// rejected before `src_addr` is applied.
#[serde(default)]
pub proxy_protocol_trusted_cidrs: Vec<IpNetwork>,
/// Port for the Prometheus-compatible metrics endpoint.
/// Enables metrics when set; binds on all interfaces (dual-stack) by default.
#[serde(default)]
pub metrics_port: Option<u16>,
/// Listen address for metrics in `IP:PORT` format (e.g. `"127.0.0.1:9090"`).
/// When set, takes precedence over `metrics_port` and binds on the specified address only.
#[serde(default)]
pub metrics_listen: Option<String>,
/// CIDR whitelist for the metrics endpoint.
#[serde(default = "default_metrics_whitelist")]
pub metrics_whitelist: Vec<IpNetwork>,
@@ -1192,8 +1220,8 @@ impl Default for ServerConfig {
listen_tcp: None,
proxy_protocol: false,
proxy_protocol_header_timeout_ms: default_proxy_protocol_header_timeout_ms(),
proxy_protocol_trusted_cidrs: Vec::new(),
metrics_port: None,
metrics_listen: None,
metrics_whitelist: default_metrics_whitelist(),
api: ApiConfig::default(),
listeners: Vec::new(),

@@ -0,0 +1,450 @@
use std::collections::HashMap;
use std::net::{IpAddr, Ipv4Addr};
use std::sync::Arc;
use std::time::Duration;
use crate::config::UserMaxUniqueIpsMode;
use crate::ip_tracker::UserIpTracker;
fn ip_from_idx(idx: u32) -> IpAddr {
let a = 10u8;
let b = ((idx / 65_536) % 256) as u8;
let c = ((idx / 256) % 256) as u8;
let d = (idx % 256) as u8;
IpAddr::V4(Ipv4Addr::new(a, b, c, d))
}
#[tokio::test]
async fn active_window_enforces_large_unique_ip_burst() {
let tracker = UserIpTracker::new();
tracker.set_user_limit("burst_user", 64).await;
tracker
.set_limit_policy(UserMaxUniqueIpsMode::ActiveWindow, 30)
.await;
for idx in 0..64 {
assert!(tracker.check_and_add("burst_user", ip_from_idx(idx)).await.is_ok());
}
assert!(tracker.check_and_add("burst_user", ip_from_idx(9_999)).await.is_err());
assert_eq!(tracker.get_active_ip_count("burst_user").await, 64);
}
#[tokio::test]
async fn global_limit_applies_across_many_users() {
let tracker = UserIpTracker::new();
tracker.load_limits(3, &HashMap::new()).await;
for user_idx in 0..150u32 {
let user = format!("u{}", user_idx);
assert!(tracker.check_and_add(&user, ip_from_idx(user_idx * 10)).await.is_ok());
assert!(tracker
.check_and_add(&user, ip_from_idx(user_idx * 10 + 1))
.await
.is_ok());
assert!(tracker
.check_and_add(&user, ip_from_idx(user_idx * 10 + 2))
.await
.is_ok());
assert!(tracker
.check_and_add(&user, ip_from_idx(user_idx * 10 + 3))
.await
.is_err());
}
assert_eq!(tracker.get_stats().await.len(), 150);
}
#[tokio::test]
async fn user_zero_override_falls_back_to_global_limit() {
let tracker = UserIpTracker::new();
let mut limits = HashMap::new();
limits.insert("target".to_string(), 0);
tracker.load_limits(2, &limits).await;
assert!(tracker.check_and_add("target", ip_from_idx(1)).await.is_ok());
assert!(tracker.check_and_add("target", ip_from_idx(2)).await.is_ok());
assert!(tracker.check_and_add("target", ip_from_idx(3)).await.is_err());
assert_eq!(tracker.get_user_limit("target").await, Some(2));
}
#[tokio::test]
async fn remove_ip_is_idempotent_after_counter_reaches_zero() {
let tracker = UserIpTracker::new();
tracker.set_user_limit("u", 2).await;
let ip = ip_from_idx(42);
tracker.check_and_add("u", ip).await.unwrap();
tracker.remove_ip("u", ip).await;
tracker.remove_ip("u", ip).await;
tracker.remove_ip("u", ip).await;
assert_eq!(tracker.get_active_ip_count("u").await, 0);
assert!(!tracker.is_ip_active("u", ip).await);
}
#[tokio::test]
async fn clear_user_ips_resets_active_and_recent() {
let tracker = UserIpTracker::new();
tracker.set_user_limit("u", 10).await;
for idx in 0..6 {
tracker.check_and_add("u", ip_from_idx(idx)).await.unwrap();
}
tracker.clear_user_ips("u").await;
assert_eq!(tracker.get_active_ip_count("u").await, 0);
let counts = tracker
.get_recent_counts_for_users(&["u".to_string()])
.await;
assert_eq!(counts.get("u").copied().unwrap_or(0), 0);
}
#[tokio::test]
async fn clear_all_resets_multi_user_state() {
let tracker = UserIpTracker::new();
for user_idx in 0..80u32 {
let user = format!("u{}", user_idx);
for ip_idx in 0..3 {
tracker
.check_and_add(&user, ip_from_idx(user_idx * 100 + ip_idx))
.await
.unwrap();
}
}
tracker.clear_all().await;
assert!(tracker.get_stats().await.is_empty());
let users = (0..80u32)
.map(|idx| format!("u{}", idx))
.collect::<Vec<_>>();
let recent = tracker.get_recent_counts_for_users(&users).await;
assert!(recent.values().all(|count| *count == 0));
}
#[tokio::test]
async fn get_active_ips_for_users_are_sorted() {
let tracker = UserIpTracker::new();
tracker.set_user_limit("user", 10).await;
tracker
.check_and_add("user", IpAddr::V4(Ipv4Addr::new(10, 0, 0, 9)))
.await
.unwrap();
tracker
.check_and_add("user", IpAddr::V4(Ipv4Addr::new(10, 0, 0, 1)))
.await
.unwrap();
tracker
.check_and_add("user", IpAddr::V4(Ipv4Addr::new(10, 0, 0, 5)))
.await
.unwrap();
let map = tracker
.get_active_ips_for_users(&["user".to_string()])
.await;
let ips = map.get("user").cloned().unwrap_or_default();
assert_eq!(
ips,
vec![
IpAddr::V4(Ipv4Addr::new(10, 0, 0, 1)),
IpAddr::V4(Ipv4Addr::new(10, 0, 0, 5)),
IpAddr::V4(Ipv4Addr::new(10, 0, 0, 9)),
]
);
}
#[tokio::test]
async fn get_recent_ips_for_users_are_sorted() {
let tracker = UserIpTracker::new();
tracker.set_user_limit("user", 10).await;
tracker
.check_and_add("user", IpAddr::V4(Ipv4Addr::new(10, 1, 0, 9)))
.await
.unwrap();
tracker
.check_and_add("user", IpAddr::V4(Ipv4Addr::new(10, 1, 0, 1)))
.await
.unwrap();
tracker
.check_and_add("user", IpAddr::V4(Ipv4Addr::new(10, 1, 0, 5)))
.await
.unwrap();
let map = tracker
.get_recent_ips_for_users(&["user".to_string()])
.await;
let ips = map.get("user").cloned().unwrap_or_default();
assert_eq!(
ips,
vec![
IpAddr::V4(Ipv4Addr::new(10, 1, 0, 1)),
IpAddr::V4(Ipv4Addr::new(10, 1, 0, 5)),
IpAddr::V4(Ipv4Addr::new(10, 1, 0, 9)),
]
);
}
#[tokio::test]
async fn time_window_expires_for_large_rotation() {
let tracker = UserIpTracker::new();
tracker.set_user_limit("tw", 1).await;
tracker
.set_limit_policy(UserMaxUniqueIpsMode::TimeWindow, 1)
.await;
tracker.check_and_add("tw", ip_from_idx(1)).await.unwrap();
tracker.remove_ip("tw", ip_from_idx(1)).await;
assert!(tracker.check_and_add("tw", ip_from_idx(2)).await.is_err());
tokio::time::sleep(Duration::from_millis(1_100)).await;
assert!(tracker.check_and_add("tw", ip_from_idx(2)).await.is_ok());
}
#[tokio::test]
async fn combined_mode_blocks_recent_after_disconnect() {
let tracker = UserIpTracker::new();
tracker.set_user_limit("cmb", 1).await;
tracker
.set_limit_policy(UserMaxUniqueIpsMode::Combined, 2)
.await;
tracker.check_and_add("cmb", ip_from_idx(11)).await.unwrap();
tracker.remove_ip("cmb", ip_from_idx(11)).await;
assert!(tracker.check_and_add("cmb", ip_from_idx(12)).await.is_err());
}
#[tokio::test]
async fn load_limits_replaces_large_limit_map() {
let tracker = UserIpTracker::new();
let mut first = HashMap::new();
let mut second = HashMap::new();
for idx in 0..300usize {
first.insert(format!("u{}", idx), 2usize);
}
for idx in 150..450usize {
second.insert(format!("u{}", idx), 4usize);
}
tracker.load_limits(0, &first).await;
tracker.load_limits(0, &second).await;
assert_eq!(tracker.get_user_limit("u20").await, None);
assert_eq!(tracker.get_user_limit("u200").await, Some(4));
assert_eq!(tracker.get_user_limit("u420").await, Some(4));
}
#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn concurrent_same_user_unique_ip_pressure_stays_bounded() {
let tracker = Arc::new(UserIpTracker::new());
tracker.set_user_limit("hot", 32).await;
tracker
.set_limit_policy(UserMaxUniqueIpsMode::ActiveWindow, 30)
.await;
let mut handles = Vec::new();
for worker in 0..16u32 {
let tracker_cloned = tracker.clone();
handles.push(tokio::spawn(async move {
let base = worker * 200;
for step in 0..200u32 {
let _ = tracker_cloned
.check_and_add("hot", ip_from_idx(base + step))
.await;
}
}));
}
for handle in handles {
handle.await.unwrap();
}
assert!(tracker.get_active_ip_count("hot").await <= 32);
}
#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn concurrent_many_users_isolate_limits() {
let tracker = Arc::new(UserIpTracker::new());
tracker.load_limits(4, &HashMap::new()).await;
let mut handles = Vec::new();
for user_idx in 0..120u32 {
let tracker_cloned = tracker.clone();
handles.push(tokio::spawn(async move {
let user = format!("u{}", user_idx);
for ip_idx in 0..10u32 {
let _ = tracker_cloned
.check_and_add(&user, ip_from_idx(user_idx * 1_000 + ip_idx))
.await;
}
}));
}
for handle in handles {
handle.await.unwrap();
}
let stats = tracker.get_stats().await;
assert_eq!(stats.len(), 120);
assert!(stats.iter().all(|(_, active, limit)| *active <= 4 && *limit == 4));
}
#[tokio::test]
async fn same_ip_reconnect_high_frequency_keeps_single_unique() {
let tracker = UserIpTracker::new();
tracker.set_user_limit("same", 2).await;
let ip = ip_from_idx(9);
for _ in 0..2_000 {
tracker.check_and_add("same", ip).await.unwrap();
}
assert_eq!(tracker.get_active_ip_count("same").await, 1);
assert!(tracker.is_ip_active("same", ip).await);
}
#[tokio::test]
async fn format_stats_contains_expected_limited_and_unlimited_markers() {
let tracker = UserIpTracker::new();
tracker.set_user_limit("limited", 2).await;
tracker.check_and_add("limited", ip_from_idx(1)).await.unwrap();
tracker.check_and_add("open", ip_from_idx(2)).await.unwrap();
let text = tracker.format_stats().await;
assert!(text.contains("limited"));
assert!(text.contains("open"));
assert!(text.contains("unlimited"));
}
#[tokio::test]
async fn stats_report_global_default_for_users_without_override() {
let tracker = UserIpTracker::new();
tracker.load_limits(5, &HashMap::new()).await;
tracker.check_and_add("a", ip_from_idx(1)).await.unwrap();
tracker.check_and_add("b", ip_from_idx(2)).await.unwrap();
let stats = tracker.get_stats().await;
assert!(stats.iter().any(|(user, _, limit)| user == "a" && *limit == 5));
assert!(stats.iter().any(|(user, _, limit)| user == "b" && *limit == 5));
}
#[tokio::test]
async fn stress_cycle_add_remove_clear_preserves_empty_end_state() {
let tracker = UserIpTracker::new();
for cycle in 0..50u32 {
let user = format!("cycle{}", cycle);
tracker.set_user_limit(&user, 128).await;
for ip_idx in 0..128u32 {
tracker
.check_and_add(&user, ip_from_idx(cycle * 10_000 + ip_idx))
.await
.unwrap();
}
for ip_idx in 0..128u32 {
tracker
.remove_ip(&user, ip_from_idx(cycle * 10_000 + ip_idx))
.await;
}
tracker.clear_user_ips(&user).await;
}
assert!(tracker.get_stats().await.is_empty());
}
#[tokio::test]
async fn remove_unknown_user_or_ip_does_not_corrupt_state() {
let tracker = UserIpTracker::new();
tracker.remove_ip("no_user", ip_from_idx(1)).await;
tracker.check_and_add("x", ip_from_idx(2)).await.unwrap();
tracker.remove_ip("x", ip_from_idx(3)).await;
assert_eq!(tracker.get_active_ip_count("x").await, 1);
assert!(tracker.is_ip_active("x", ip_from_idx(2)).await);
}
#[tokio::test]
async fn active_and_recent_views_match_after_mixed_workload() {
let tracker = UserIpTracker::new();
tracker.set_user_limit("mix", 16).await;
for ip_idx in 0..12u32 {
tracker.check_and_add("mix", ip_from_idx(ip_idx)).await.unwrap();
}
for ip_idx in 0..6u32 {
tracker.remove_ip("mix", ip_from_idx(ip_idx)).await;
}
let active = tracker
.get_active_ips_for_users(&["mix".to_string()])
.await
.get("mix")
.cloned()
.unwrap_or_default();
let recent_count = tracker
.get_recent_counts_for_users(&["mix".to_string()])
.await
.get("mix")
.copied()
.unwrap_or(0);
assert_eq!(active.len(), 6);
assert!(recent_count >= active.len());
assert!(recent_count <= 12);
}
#[tokio::test]
async fn global_limit_switch_updates_enforcement_immediately() {
let tracker = UserIpTracker::new();
tracker.load_limits(2, &HashMap::new()).await;
assert!(tracker.check_and_add("u", ip_from_idx(1)).await.is_ok());
assert!(tracker.check_and_add("u", ip_from_idx(2)).await.is_ok());
assert!(tracker.check_and_add("u", ip_from_idx(3)).await.is_err());
tracker.clear_user_ips("u").await;
tracker.load_limits(4, &HashMap::new()).await;
assert!(tracker.check_and_add("u", ip_from_idx(1)).await.is_ok());
assert!(tracker.check_and_add("u", ip_from_idx(2)).await.is_ok());
assert!(tracker.check_and_add("u", ip_from_idx(3)).await.is_ok());
assert!(tracker.check_and_add("u", ip_from_idx(4)).await.is_ok());
assert!(tracker.check_and_add("u", ip_from_idx(5)).await.is_err());
}
#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn concurrent_reconnect_and_disconnect_preserves_non_negative_counts() {
let tracker = Arc::new(UserIpTracker::new());
tracker.set_user_limit("cc", 8).await;
let mut handles = Vec::new();
for worker in 0..8u32 {
let tracker_cloned = tracker.clone();
handles.push(tokio::spawn(async move {
let ip = ip_from_idx(50 + worker);
for _ in 0..500u32 {
let _ = tracker_cloned.check_and_add("cc", ip).await;
tracker_cloned.remove_ip("cc", ip).await;
}
}));
}
for handle in handles {
handle.await.unwrap();
}
assert!(tracker.get_active_ip_count("cc").await <= 8);
}
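
Note on `ip_from_idx`: every test in this file relies on the helper producing a distinct address per index. The mapping is repeated below as a standalone snippet so the arithmetic can be checked by hand; the function body is the one from the file above, and only the worked values in the comments are added.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// Same mapping as the test helper: spreads `idx` across 10.b.c.d,
/// where b, c, d are the three low-order bytes of `idx`.
fn ip_from_idx(idx: u32) -> IpAddr {
    let a = 10u8;
    let b = ((idx / 65_536) % 256) as u8; // bits 16..24
    let c = ((idx / 256) % 256) as u8; // bits 8..16
    let d = (idx % 256) as u8; // bits 0..8
    IpAddr::V4(Ipv4Addr::new(a, b, c, d))
}

// Worked example: 9_999 = 39 * 256 + 15, so the address is 10.0.39.15.
// Indexes wrap at 2^24 (16_777_216), where the result is 10.0.0.0 again.
```

Addresses are therefore unique for indexes below 2^24. The largest index used in the tests above is `49 * 10_000 + 127`, well inside that range.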

@@ -238,6 +238,11 @@ pub(crate) async fn initialize_me_pool(
config.general.hardswap,
config.general.me_pool_drain_ttl_secs,
config.general.me_pool_drain_threshold,
config.general.me_pool_drain_soft_evict_enabled,
config.general.me_pool_drain_soft_evict_grace_secs,
config.general.me_pool_drain_soft_evict_per_writer,
config.general.me_pool_drain_soft_evict_budget_per_core,
config.general.me_pool_drain_soft_evict_cooldown_ms,
config.general.effective_me_pool_force_close_secs(),
config.general.me_pool_min_fresh_ratio,
config.general.me_hardswap_warmup_delay_min_ms,

@@ -476,7 +476,7 @@ pub async fn run() -> std::result::Result<(), Box<dyn std::error::Error>> {
Duration::from_secs(config.access.replay_window_secs),
));
-let buffer_pool = Arc::new(BufferPool::with_config(16 * 1024, 4096));
+let buffer_pool = Arc::new(BufferPool::with_config(64 * 1024, 4096));
connectivity::run_startup_connectivity(
&config,

@@ -279,11 +279,32 @@ pub(crate) async fn spawn_metrics_if_configured(
ip_tracker: Arc<UserIpTracker>,
config_rx: watch::Receiver<Arc<ProxyConfig>>,
) {
-if let Some(port) = config.server.metrics_port {
// metrics_listen takes precedence; fall back to metrics_port for backward compat.
let metrics_target: Option<(u16, Option<String>)> =
if let Some(ref listen) = config.server.metrics_listen {
match listen.parse::<std::net::SocketAddr>() {
Ok(addr) => Some((addr.port(), Some(listen.clone()))),
Err(e) => {
startup_tracker
.skip_component(
COMPONENT_METRICS_START,
Some(format!("invalid metrics_listen \"{}\": {}", listen, e)),
)
.await;
None
}
}
} else {
config.server.metrics_port.map(|p| (p, None))
};
if let Some((port, listen)) = metrics_target {
let fallback_label = format!("port {}", port);
let label = listen.as_deref().unwrap_or(&fallback_label);
startup_tracker
.start_component(
COMPONENT_METRICS_START,
-Some(format!("spawn metrics endpoint on {}", port)),
+Some(format!("spawn metrics endpoint on {}", label)),
)
.await;
let stats = stats.clone();
@@ -294,6 +315,7 @@ pub(crate) async fn spawn_metrics_if_configured(
tokio::spawn(async move {
metrics::serve(
port,
listen,
stats,
beobachten,
ip_tracker_metrics,
@@ -308,7 +330,7 @@ pub(crate) async fn spawn_metrics_if_configured(
Some("metrics task spawned".to_string()),
)
.await;
-} else {
+} else if config.server.metrics_listen.is_none() {
startup_tracker
.skip_component(
COMPONENT_METRICS_START,

@@ -6,6 +6,8 @@ mod config;
mod crypto;
mod error;
mod ip_tracker;
#[cfg(test)]
mod ip_tracker_regression_tests;
mod maestro;
mod metrics;
mod network;

@@ -21,6 +21,7 @@ use crate::transport::{ListenOptions, create_listener};
pub async fn serve(
port: u16,
listen: Option<String>,
stats: Arc<Stats>,
beobachten: Arc<BeobachtenStore>,
ip_tracker: Arc<UserIpTracker>,
@@ -28,6 +29,33 @@ pub async fn serve(
whitelist: Vec<IpNetwork>,
) {
let whitelist = Arc::new(whitelist);
// If `metrics_listen` is set, bind on that single address only.
if let Some(ref listen_addr) = listen {
let addr: SocketAddr = match listen_addr.parse() {
Ok(a) => a,
Err(e) => {
warn!(error = %e, "Invalid metrics_listen address: {}", listen_addr);
return;
}
};
let is_ipv6 = addr.is_ipv6();
match bind_metrics_listener(addr, is_ipv6) {
Ok(listener) => {
info!("Metrics endpoint: http://{}/metrics and /beobachten", addr);
serve_listener(
listener, stats, beobachten, ip_tracker, config_rx, whitelist,
)
.await;
}
Err(e) => {
warn!(error = %e, "Failed to bind metrics on {}", addr);
}
}
return;
}
// Fallback: bind on 0.0.0.0 and [::] using metrics_port.
let mut listener_v4 = None;
let mut listener_v6 = None;
@@ -264,6 +292,109 @@ async fn render_metrics(stats: &Stats, config: &ProxyConfig, ip_tracker: &UserIp
"telemt_connections_bad_total {}",
if core_enabled { stats.get_connects_bad() } else { 0 }
);
let _ = writeln!(out, "# HELP telemt_connections_current Current active connections");
let _ = writeln!(out, "# TYPE telemt_connections_current gauge");
let _ = writeln!(
out,
"telemt_connections_current {}",
if core_enabled {
stats.get_current_connections_total()
} else {
0
}
);
let _ = writeln!(out, "# HELP telemt_connections_direct_current Current active direct connections");
let _ = writeln!(out, "# TYPE telemt_connections_direct_current gauge");
let _ = writeln!(
out,
"telemt_connections_direct_current {}",
if core_enabled {
stats.get_current_connections_direct()
} else {
0
}
);
let _ = writeln!(out, "# HELP telemt_connections_me_current Current active middle-end connections");
let _ = writeln!(out, "# TYPE telemt_connections_me_current gauge");
let _ = writeln!(
out,
"telemt_connections_me_current {}",
if core_enabled {
stats.get_current_connections_me()
} else {
0
}
);
let _ = writeln!(
out,
"# HELP telemt_relay_adaptive_promotions_total Adaptive relay tier promotions"
);
let _ = writeln!(out, "# TYPE telemt_relay_adaptive_promotions_total counter");
let _ = writeln!(
out,
"telemt_relay_adaptive_promotions_total {}",
if core_enabled {
stats.get_relay_adaptive_promotions_total()
} else {
0
}
);
let _ = writeln!(
out,
"# HELP telemt_relay_adaptive_demotions_total Adaptive relay tier demotions"
);
let _ = writeln!(out, "# TYPE telemt_relay_adaptive_demotions_total counter");
let _ = writeln!(
out,
"telemt_relay_adaptive_demotions_total {}",
if core_enabled {
stats.get_relay_adaptive_demotions_total()
} else {
0
}
);
let _ = writeln!(
out,
"# HELP telemt_relay_adaptive_hard_promotions_total Adaptive relay hard promotions triggered by write pressure"
);
let _ = writeln!(
out,
"# TYPE telemt_relay_adaptive_hard_promotions_total counter"
);
let _ = writeln!(
out,
"telemt_relay_adaptive_hard_promotions_total {}",
if core_enabled {
stats.get_relay_adaptive_hard_promotions_total()
} else {
0
}
);
let _ = writeln!(out, "# HELP telemt_reconnect_evict_total Reconnect-driven session evictions");
let _ = writeln!(out, "# TYPE telemt_reconnect_evict_total counter");
let _ = writeln!(
out,
"telemt_reconnect_evict_total {}",
if core_enabled {
stats.get_reconnect_evict_total()
} else {
0
}
);
let _ = writeln!(
out,
"# HELP telemt_reconnect_stale_close_total Sessions closed because they became stale after reconnect"
);
let _ = writeln!(out, "# TYPE telemt_reconnect_stale_close_total counter");
let _ = writeln!(
out,
"telemt_reconnect_stale_close_total {}",
if core_enabled {
stats.get_reconnect_stale_close_total()
} else {
0
}
);
let _ = writeln!(out, "# HELP telemt_handshake_timeouts_total Handshake timeouts");
let _ = writeln!(out, "# TYPE telemt_handshake_timeouts_total counter");
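
Note on the repeated `writeln!` triples above: each metric emits a `# HELP` line, a `# TYPE` line, and a sample line in the Prometheus text exposition format. The pattern could be captured in one helper, sketched below. `write_metric` is a hypothetical name and is not part of this diff, which writes the three lines inline for every metric.

```rust
use std::fmt::Write;

/// Hypothetical helper that emits one unlabeled metric in Prometheus text format.
/// `kind` is the metric type as used above, e.g. "gauge" or "counter".
fn write_metric(out: &mut String, name: &str, kind: &str, help: &str, value: u64) {
    // Writing into a String does not fail, so the results are discarded
    // the same way the original code does with `let _ =`.
    let _ = writeln!(out, "# HELP {} {}", name, help);
    let _ = writeln!(out, "# TYPE {} {}", name, kind);
    let _ = writeln!(out, "{} {}", name, value);
}
```

A call such as `write_metric(&mut out, "telemt_connections_current", "gauge", "Current active connections", 2)` would append the same three lines that the gauge block above produces, with the `core_enabled` gating left to the caller.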
@@ -1519,6 +1650,36 @@ async fn render_metrics(stats: &Stats, config: &ProxyConfig, ip_tracker: &UserIp
}
);
let _ = writeln!(
out,
"# HELP telemt_pool_drain_soft_evict_total Soft-evicted client sessions on stuck draining writers"
);
let _ = writeln!(out, "# TYPE telemt_pool_drain_soft_evict_total counter");
let _ = writeln!(
out,
"telemt_pool_drain_soft_evict_total {}",
if me_allows_normal {
stats.get_pool_drain_soft_evict_total()
} else {
0
}
);
let _ = writeln!(
out,
"# HELP telemt_pool_drain_soft_evict_writer_total Draining writers with at least one soft eviction"
);
let _ = writeln!(out, "# TYPE telemt_pool_drain_soft_evict_writer_total counter");
let _ = writeln!(
out,
"telemt_pool_drain_soft_evict_writer_total {}",
if me_allows_normal {
stats.get_pool_drain_soft_evict_writer_total()
} else {
0
}
);
let _ = writeln!(out, "# HELP telemt_pool_stale_pick_total Stale writer fallback picks for new binds");
let _ = writeln!(out, "# TYPE telemt_pool_stale_pick_total counter");
let _ = writeln!(
@@ -1836,6 +1997,8 @@ mod tests {
stats.increment_connects_all();
stats.increment_connects_all();
stats.increment_connects_bad();
stats.increment_current_connections_direct();
stats.increment_current_connections_me();
stats.increment_handshake_timeouts();
stats.increment_upstream_connect_attempt_total();
stats.increment_upstream_connect_attempt_total();
@@ -1867,6 +2030,9 @@ mod tests {
assert!(output.contains("telemt_connections_total 2"));
assert!(output.contains("telemt_connections_bad_total 1"));
assert!(output.contains("telemt_connections_current 2"));
assert!(output.contains("telemt_connections_direct_current 1"));
assert!(output.contains("telemt_connections_me_current 1"));
assert!(output.contains("telemt_handshake_timeouts_total 1"));
assert!(output.contains("telemt_upstream_connect_attempt_total 2"));
assert!(output.contains("telemt_upstream_connect_success_total 1"));
@@ -1909,6 +2075,9 @@ mod tests {
let output = render_metrics(&stats, &config, &tracker).await;
assert!(output.contains("telemt_connections_total 0"));
assert!(output.contains("telemt_connections_bad_total 0"));
assert!(output.contains("telemt_connections_current 0"));
assert!(output.contains("telemt_connections_direct_current 0"));
assert!(output.contains("telemt_connections_me_current 0"));
assert!(output.contains("telemt_handshake_timeouts_total 0"));
assert!(output.contains("telemt_user_unique_ips_current{user="));
assert!(output.contains("telemt_user_unique_ips_recent_window{user="));
@@ -1942,11 +2111,21 @@ mod tests {
assert!(output.contains("# TYPE telemt_uptime_seconds gauge"));
assert!(output.contains("# TYPE telemt_connections_total counter"));
assert!(output.contains("# TYPE telemt_connections_bad_total counter"));
assert!(output.contains("# TYPE telemt_connections_current gauge"));
assert!(output.contains("# TYPE telemt_connections_direct_current gauge"));
assert!(output.contains("# TYPE telemt_connections_me_current gauge"));
assert!(output.contains("# TYPE telemt_relay_adaptive_promotions_total counter"));
assert!(output.contains("# TYPE telemt_relay_adaptive_demotions_total counter"));
assert!(output.contains("# TYPE telemt_relay_adaptive_hard_promotions_total counter"));
assert!(output.contains("# TYPE telemt_reconnect_evict_total counter"));
assert!(output.contains("# TYPE telemt_reconnect_stale_close_total counter"));
assert!(output.contains("# TYPE telemt_handshake_timeouts_total counter"));
assert!(output.contains("# TYPE telemt_upstream_connect_attempt_total counter"));
assert!(output.contains("# TYPE telemt_me_rpc_proxy_req_signal_sent_total counter"));
assert!(output.contains("# TYPE telemt_me_idle_close_by_peer_total counter"));
assert!(output.contains("# TYPE telemt_me_writer_removed_total counter"));
assert!(output.contains("# TYPE telemt_pool_drain_soft_evict_total counter"));
assert!(output.contains("# TYPE telemt_pool_drain_soft_evict_writer_total counter"));
assert!(output.contains(
"# TYPE telemt_me_writer_removed_unexpected_minus_restored_total gauge"
));
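Note: the counters added in this hunk all repeat the same three-line pattern (HELP, TYPE, then a value that is forced to 0 when the gating flag is off). A minimal sketch of that pattern follows; `write_gated_counter` is a hypothetical helper for illustration and is not part of this change.

```rust
use std::fmt::Write;

/// Hypothetical helper (not in the diff): emit one counter in the
/// Prometheus text exposition format, reporting 0 when the gate is off.
fn write_gated_counter(out: &mut String, name: &str, help: &str, enabled: bool, value: u64) {
    // Writing into a String cannot fail, so the results are discarded,
    // mirroring the `let _ = writeln!(...)` style used in the diff.
    let _ = writeln!(out, "# HELP {name} {help}");
    let _ = writeln!(out, "# TYPE {name} counter");
    let _ = writeln!(out, "{name} {}", if enabled { value } else { 0 });
}

fn main() {
    let mut out = String::new();
    // Gate on: the real value is exported.
    write_gated_counter(
        &mut out,
        "telemt_reconnect_evict_total",
        "Reconnect-driven session evictions",
        true,
        7,
    );
    // Gate off: the series still exists, but reads 0.
    write_gated_counter(
        &mut out,
        "telemt_pool_drain_soft_evict_total",
        "Soft-evicted client sessions on stuck draining writers",
        false,
        7,
    );
    print!("{out}");
}
```

Emitting the series even when the gate is off keeps the set of exported metric names stable, which is presumably why the diff writes 0 instead of skipping the lines.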

@@ -13,7 +13,6 @@ use super::constants::*;
use std::time::{SystemTime, UNIX_EPOCH};
use num_bigint::BigUint;
use num_traits::One;
use subtle::ConstantTimeEq;
// ============= Public Constants =============
@@ -29,8 +28,6 @@ pub const TLS_DIGEST_HALF_LEN: usize = 16;
/// Time skew limits for anti-replay (in seconds)
pub const TIME_SKEW_MIN: i64 = -20 * 60; // 20 minutes before
pub const TIME_SKEW_MAX: i64 = 10 * 60; // 10 minutes after
/// Maximum accepted boot-time timestamp (seconds) before skew checks are enforced.
pub const BOOT_TIME_MAX_SECS: u32 = 7 * 24 * 60 * 60;
// ============= Private Constants =============
@@ -128,7 +125,7 @@ impl TlsExtensionBuilder {
// protocol name length (1 byte)
// protocol name bytes
let proto_len = proto.len() as u8;
let list_len: u16 = 1 + u16::from(proto_len);
let list_len: u16 = 1 + proto_len as u16;
let ext_len: u16 = 2 + list_len;
self.extensions.extend_from_slice(&ext_len.to_be_bytes());
@@ -276,86 +273,13 @@ impl ServerHelloBuilder {
// ============= Public Functions =============
/// Validate TLS ClientHello against user secrets.
/// Validate TLS ClientHello against user secrets
///
/// Returns validation result if a matching user is found.
/// The result **must** be used — ignoring it silently bypasses authentication.
#[must_use]
pub fn validate_tls_handshake(
handshake: &[u8],
secrets: &[(String, Vec<u8>)],
ignore_time_skew: bool,
) -> Option<TlsValidation> {
validate_tls_handshake_with_replay_window(
handshake,
secrets,
ignore_time_skew,
u64::from(BOOT_TIME_MAX_SECS),
)
}
/// Validate TLS ClientHello and cap the boot-time bypass by replay-cache TTL.
///
/// A boot-time timestamp is only accepted when it falls below both
/// `BOOT_TIME_MAX_SECS` and the configured replay window, preventing timestamp
/// reuse outside replay cache coverage.
#[must_use]
pub fn validate_tls_handshake_with_replay_window(
handshake: &[u8],
secrets: &[(String, Vec<u8>)],
ignore_time_skew: bool,
replay_window_secs: u64,
) -> Option<TlsValidation> {
// Only pay the clock syscall when we will actually compare against it.
// If `ignore_time_skew` is set, a broken or unavailable system clock
// must not block legitimate clients — that would be a DoS via clock failure.
let now = if !ignore_time_skew {
system_time_to_unix_secs(SystemTime::now())?
} else {
0_i64
};
let replay_window_u32 = u32::try_from(replay_window_secs).unwrap_or(u32::MAX);
let boot_time_cap_secs = BOOT_TIME_MAX_SECS.min(replay_window_u32);
validate_tls_handshake_at_time_with_boot_cap(
handshake,
secrets,
ignore_time_skew,
now,
boot_time_cap_secs,
)
}
fn system_time_to_unix_secs(now: SystemTime) -> Option<i64> {
// `try_from` rejects values that overflow i64 (> ~292 billion years CE),
// whereas `as i64` would silently wrap to a negative timestamp and corrupt
// every subsequent time-skew comparison.
let d = now.duration_since(UNIX_EPOCH).ok()?;
i64::try_from(d.as_secs()).ok()
}
fn validate_tls_handshake_at_time(
handshake: &[u8],
secrets: &[(String, Vec<u8>)],
ignore_time_skew: bool,
now: i64,
) -> Option<TlsValidation> {
validate_tls_handshake_at_time_with_boot_cap(
handshake,
secrets,
ignore_time_skew,
now,
BOOT_TIME_MAX_SECS,
)
}
fn validate_tls_handshake_at_time_with_boot_cap(
handshake: &[u8],
secrets: &[(String, Vec<u8>)],
ignore_time_skew: bool,
now: i64,
boot_time_cap_secs: u32,
) -> Option<TlsValidation> {
if handshake.len() < TLS_DIGEST_POS + TLS_DIGEST_LEN + 1 {
return None;
@@ -381,56 +305,50 @@ fn validate_tls_handshake_at_time_with_boot_cap(
let mut msg = handshake.to_vec();
msg[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN].fill(0);
let mut first_match: Option<TlsValidation> = None;
// Get current time
let now = SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap()
.as_secs() as i64;
for (user, secret) in secrets {
let computed = sha256_hmac(secret, &msg);
// Constant-time equality check on the 28-byte HMAC window.
// A variable-time short-circuit here lets an active censor measure how many
// bytes matched, enabling secret brute-force via timing side-channels.
// Direct comparison on the original arrays avoids a heap allocation and
// removes the `try_into().unwrap()` that the intermediate Vec would require.
if !bool::from(digest[..28].ct_eq(&computed[..28])) {
// XOR digests
let xored: Vec<u8> = digest.iter()
.zip(computed.iter())
.map(|(a, b)| a ^ b)
.collect();
// Check that first 28 bytes are zeros (timestamp in last 4)
if !xored[..28].iter().all(|&b| b == 0) {
continue;
}
// The last 4 bytes encode the timestamp as XOR(digest[28..32], computed[28..32]).
// Inline array construction is infallible: both slices are [u8; 32] by construction.
let timestamp = u32::from_le_bytes([
digest[28] ^ computed[28],
digest[29] ^ computed[29],
digest[30] ^ computed[30],
digest[31] ^ computed[31],
]);
// time_diff is only meaningful (and `now` is only valid) when we are
// actually checking the window. Keep both inside the guard to make
// the dead-code path explicit and prevent accidental future use of
// a sentinel `now` value outside its intended scope.
// Extract timestamp
let timestamp = u32::from_le_bytes(xored[28..32].try_into().unwrap());
let time_diff = now - timestamp as i64;
// Check time skew
if !ignore_time_skew {
// Allow very small timestamps (boot time instead of unix time)
// This is a quirk in some clients that use uptime instead of real time
let is_boot_time = timestamp < boot_time_cap_secs;
if !is_boot_time {
let time_diff = now - i64::from(timestamp);
if !(TIME_SKEW_MIN..=TIME_SKEW_MAX).contains(&time_diff) {
continue;
}
let is_boot_time = timestamp < 60 * 60 * 24 * 1000; // < ~2.7 years in seconds
if !is_boot_time && !(TIME_SKEW_MIN..=TIME_SKEW_MAX).contains(&time_diff) {
continue;
}
}
if first_match.is_none() {
first_match = Some(TlsValidation {
user: user.clone(),
session_id: session_id.clone(),
digest,
timestamp,
});
}
return Some(TlsValidation {
user: user.clone(),
session_id,
digest,
timestamp,
});
}
first_match
None
}
fn curve25519_prime() -> BigUint {
@@ -610,9 +528,7 @@ pub fn extract_sni_from_client_hello(handshake: &[u8]) -> Option<String> {
if name_type == 0 && name_len > 0
&& let Ok(host) = std::str::from_utf8(&handshake[sn_pos..sn_pos + name_len])
{
if is_valid_sni_hostname(host) {
return Some(host.to_string());
}
return Some(host.to_string());
}
sn_pos += name_len;
}
@@ -623,35 +539,6 @@ pub fn extract_sni_from_client_hello(handshake: &[u8]) -> Option<String> {
None
}
fn is_valid_sni_hostname(host: &str) -> bool {
if host.is_empty() || host.len() > 253 {
return false;
}
if host.starts_with('.') || host.ends_with('.') {
return false;
}
if host.parse::<std::net::IpAddr>().is_ok() {
return false;
}
for label in host.split('.') {
if label.is_empty() || label.len() > 63 {
return false;
}
if label.starts_with('-') || label.ends_with('-') {
return false;
}
if !label
.bytes()
.all(|b| b.is_ascii_alphanumeric() || b == b'-')
{
return false;
}
}
true
}
/// Extract ALPN protocol list from ClientHello, return in offered order.
pub fn extract_alpn_from_client_hello(handshake: &[u8]) -> Vec<Vec<u8>> {
let mut pos = 5; // after record header
@@ -780,29 +667,291 @@ fn validate_server_hello_structure(data: &[u8]) -> Result<(), ProxyError> {
Ok(())
}
// ============= Compile-time Security Invariants =============
/// Compile-time checks that enforce invariants the rest of the code relies on.
/// Using `static_assertions` ensures these can never silently break across
/// refactors without a compile error.
mod compile_time_security_checks {
use super::{TLS_DIGEST_LEN, TLS_DIGEST_HALF_LEN};
use static_assertions::const_assert;
// The digest must be exactly one SHA-256 output.
const_assert!(TLS_DIGEST_LEN == 32);
// Replay-dedup stores the first half; verify it is literally half.
const_assert!(TLS_DIGEST_HALF_LEN * 2 == TLS_DIGEST_LEN);
// The HMAC check window (28 bytes) plus the embedded timestamp (4 bytes)
// must exactly fill the digest. If TLS_DIGEST_LEN ever changes, these
// assertions will catch the mismatch before any timing-oracle fix is broken.
const_assert!(28 + 4 == TLS_DIGEST_LEN);
}
// ============= Security-focused regression tests =============
#[cfg(test)]
#[path = "tls_security_tests.rs"]
mod security_tests;
mod tests {
use super::*;
#[test]
fn test_is_tls_handshake() {
assert!(is_tls_handshake(&[0x16, 0x03, 0x01]));
assert!(is_tls_handshake(&[0x16, 0x03, 0x01, 0x02, 0x00]));
assert!(!is_tls_handshake(&[0x17, 0x03, 0x01])); // Application data
assert!(!is_tls_handshake(&[0x16, 0x03, 0x02])); // Wrong version
assert!(!is_tls_handshake(&[0x16, 0x03])); // Too short
}
#[test]
fn test_parse_tls_record_header() {
let header = [0x16, 0x03, 0x01, 0x02, 0x00];
let result = parse_tls_record_header(&header).unwrap();
assert_eq!(result.0, TLS_RECORD_HANDSHAKE);
assert_eq!(result.1, 512);
let header = [0x17, 0x03, 0x03, 0x40, 0x00];
let result = parse_tls_record_header(&header).unwrap();
assert_eq!(result.0, TLS_RECORD_APPLICATION);
assert_eq!(result.1, 16384);
}
#[test]
fn test_gen_fake_x25519_key() {
let rng = SecureRandom::new();
let key1 = gen_fake_x25519_key(&rng);
let key2 = gen_fake_x25519_key(&rng);
assert_eq!(key1.len(), 32);
assert_eq!(key2.len(), 32);
assert_ne!(key1, key2); // Should be random
}
#[test]
fn test_fake_x25519_key_is_quadratic_residue() {
let rng = SecureRandom::new();
let key = gen_fake_x25519_key(&rng);
let p = curve25519_prime();
let k_num = BigUint::from_bytes_le(&key);
let exponent = (&p - BigUint::one()) >> 1;
let legendre = k_num.modpow(&exponent, &p);
assert_eq!(legendre, BigUint::one());
}
#[test]
fn test_tls_extension_builder() {
let key = [0x42u8; 32];
let mut builder = TlsExtensionBuilder::new();
builder.add_key_share(&key);
builder.add_supported_versions(0x0304);
let result = builder.build();
// Check length prefix
let len = u16::from_be_bytes([result[0], result[1]]) as usize;
assert_eq!(len, result.len() - 2);
// Check key_share extension is present
assert!(result.len() > 40); // At least key share
}
#[test]
fn test_server_hello_builder() {
let session_id = vec![0x01, 0x02, 0x03, 0x04];
let key = [0x55u8; 32];
let builder = ServerHelloBuilder::new(session_id.clone())
.with_x25519_key(&key)
.with_tls13_version();
let record = builder.build_record();
// Validate structure
validate_server_hello_structure(&record).expect("Invalid ServerHello structure");
// Check record type
assert_eq!(record[0], TLS_RECORD_HANDSHAKE);
// Check version
assert_eq!(&record[1..3], &TLS_VERSION);
// Check message type (ServerHello = 0x02)
assert_eq!(record[5], 0x02);
}
#[test]
fn test_build_server_hello_structure() {
let secret = b"test secret";
let client_digest = [0x42u8; 32];
let session_id = vec![0xAA; 32];
let rng = SecureRandom::new();
let response = build_server_hello(secret, &client_digest, &session_id, 2048, &rng, None, 0);
// Should have at least 3 records
assert!(response.len() > 100);
// First record should be ServerHello
assert_eq!(response[0], TLS_RECORD_HANDSHAKE);
// Validate ServerHello structure
validate_server_hello_structure(&response).expect("Invalid ServerHello");
// Find Change Cipher Spec
let server_hello_len = 5 + u16::from_be_bytes([response[3], response[4]]) as usize;
let ccs_start = server_hello_len;
assert!(response.len() > ccs_start + 6);
assert_eq!(response[ccs_start], TLS_RECORD_CHANGE_CIPHER);
// Find Application Data
let ccs_len = 5 + u16::from_be_bytes([response[ccs_start + 3], response[ccs_start + 4]]) as usize;
let app_start = ccs_start + ccs_len;
assert!(response.len() > app_start + 5);
assert_eq!(response[app_start], TLS_RECORD_APPLICATION);
}
#[test]
fn test_build_server_hello_digest() {
let secret = b"test secret key here";
let client_digest = [0x42u8; 32];
let session_id = vec![0xAA; 32];
let rng = SecureRandom::new();
let response1 = build_server_hello(secret, &client_digest, &session_id, 1024, &rng, None, 0);
let response2 = build_server_hello(secret, &client_digest, &session_id, 1024, &rng, None, 0);
// Digest position should have non-zero data
let digest1 = &response1[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN];
assert!(!digest1.iter().all(|&b| b == 0));
// Different calls should have different digests (due to random cert)
let digest2 = &response2[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN];
assert_ne!(digest1, digest2);
}
#[test]
fn test_server_hello_extensions_length() {
let session_id = vec![0x01; 32];
let key = [0x55u8; 32];
let builder = ServerHelloBuilder::new(session_id)
.with_x25519_key(&key)
.with_tls13_version();
let record = builder.build_record();
// Parse to find extensions
let msg_start = 5; // After record header
let msg_len = u32::from_be_bytes([0, record[6], record[7], record[8]]) as usize;
// Skip to session ID
let session_id_pos = msg_start + 4 + 2 + 32; // header(4) + version(2) + random(32)
let session_id_len = record[session_id_pos] as usize;
// Skip to extensions
let ext_len_pos = session_id_pos + 1 + session_id_len + 2 + 1; // session_id + cipher(2) + compression(1)
let ext_len = u16::from_be_bytes([record[ext_len_pos], record[ext_len_pos + 1]]) as usize;
// Verify extensions length matches actual data
let extensions_data = &record[ext_len_pos + 2..msg_start + 4 + msg_len];
assert_eq!(ext_len, extensions_data.len(),
"Extension length mismatch: declared {}, actual {}", ext_len, extensions_data.len());
}
#[test]
fn test_validate_tls_handshake_format() {
// Build a minimal ClientHello-like structure
let mut handshake = vec![0u8; 100];
// Put a valid-looking digest at position 11
handshake[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN]
.copy_from_slice(&[0x42; 32]);
// Session ID length
handshake[TLS_DIGEST_POS + TLS_DIGEST_LEN] = 32;
// This won't validate (wrong HMAC) but shouldn't panic
let secrets = vec![("test".to_string(), b"secret".to_vec())];
let result = validate_tls_handshake(&handshake, &secrets, true);
// Should return None (no match) but not panic
assert!(result.is_none());
}
fn build_client_hello_with_exts(exts: Vec<(u16, Vec<u8>)>, host: &str) -> Vec<u8> {
let mut body = Vec::new();
body.extend_from_slice(&TLS_VERSION); // legacy version
body.extend_from_slice(&[0u8; 32]); // random
body.push(0); // session id len
body.extend_from_slice(&2u16.to_be_bytes()); // cipher suites len
body.extend_from_slice(&[0x13, 0x01]); // TLS_AES_128_GCM_SHA256
body.push(1); // compression len
body.push(0); // null compression
// Build SNI extension
let host_bytes = host.as_bytes();
let mut sni_ext = Vec::new();
sni_ext.extend_from_slice(&(host_bytes.len() as u16 + 3).to_be_bytes());
sni_ext.push(0);
sni_ext.extend_from_slice(&(host_bytes.len() as u16).to_be_bytes());
sni_ext.extend_from_slice(host_bytes);
let mut ext_blob = Vec::new();
for (typ, data) in exts {
ext_blob.extend_from_slice(&typ.to_be_bytes());
ext_blob.extend_from_slice(&(data.len() as u16).to_be_bytes());
ext_blob.extend_from_slice(&data);
}
// SNI last
ext_blob.extend_from_slice(&0x0000u16.to_be_bytes());
ext_blob.extend_from_slice(&(sni_ext.len() as u16).to_be_bytes());
ext_blob.extend_from_slice(&sni_ext);
body.extend_from_slice(&(ext_blob.len() as u16).to_be_bytes());
body.extend_from_slice(&ext_blob);
let mut handshake = Vec::new();
handshake.push(0x01); // ClientHello
let len_bytes = (body.len() as u32).to_be_bytes();
handshake.extend_from_slice(&len_bytes[1..4]);
handshake.extend_from_slice(&body);
let mut record = Vec::new();
record.push(TLS_RECORD_HANDSHAKE);
record.extend_from_slice(&[0x03, 0x01]);
record.extend_from_slice(&(handshake.len() as u16).to_be_bytes());
record.extend_from_slice(&handshake);
record
}
#[test]
fn test_extract_sni_with_grease_extension() {
// GREASE type 0x0a0a with zero length before SNI
let ch = build_client_hello_with_exts(vec![(0x0a0a, Vec::new())], "example.com");
let sni = extract_sni_from_client_hello(&ch);
assert_eq!(sni.as_deref(), Some("example.com"));
}
#[test]
fn test_extract_sni_tolerates_empty_unknown_extension() {
let ch = build_client_hello_with_exts(vec![(0x1234, Vec::new())], "test.local");
let sni = extract_sni_from_client_hello(&ch);
assert_eq!(sni.as_deref(), Some("test.local"));
}
#[test]
fn test_extract_alpn_single() {
let mut alpn_data = Vec::new();
// list length = 3 (1 length byte + "h2")
alpn_data.extend_from_slice(&3u16.to_be_bytes());
alpn_data.push(2);
alpn_data.extend_from_slice(b"h2");
let ch = build_client_hello_with_exts(vec![(0x0010, alpn_data)], "alpn.test");
let alpn = extract_alpn_from_client_hello(&ch);
let alpn_str: Vec<String> = alpn
.iter()
.map(|p| std::str::from_utf8(p).unwrap().to_string())
.collect();
assert_eq!(alpn_str, vec!["h2"]);
}
#[test]
fn test_extract_alpn_multiple() {
let mut alpn_data = Vec::new();
// list length = 11 (sum of per-proto lengths including length bytes)
alpn_data.extend_from_slice(&11u16.to_be_bytes());
alpn_data.push(2);
alpn_data.extend_from_slice(b"h2");
alpn_data.push(4);
alpn_data.extend_from_slice(b"spdy");
alpn_data.push(2);
alpn_data.extend_from_slice(b"h3");
let ch = build_client_hello_with_exts(vec![(0x0010, alpn_data)], "alpn.test");
let alpn = extract_alpn_from_client_hello(&ch);
let alpn_str: Vec<String> = alpn
.iter()
.map(|p| std::str::from_utf8(p).unwrap().to_string())
.collect();
assert_eq!(alpn_str, vec!["h2", "spdy", "h3"]);
}
}
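Note: the digest layout that `validate_tls_handshake` relies on (28 bytes that must equal the HMAC, then 4 bytes carrying the timestamp XORed with the HMAC tail) can be sketched with the standard library alone. The `ct_eq`, `extract_timestamp` and `within_skew` names below are illustrative assumptions, and the fold-based comparison is only a simplified stand-in for `subtle::ConstantTimeEq`, not a replacement for it.

```rust
const TIME_SKEW_MIN: i64 = -20 * 60; // values taken from the diff
const TIME_SKEW_MAX: i64 = 10 * 60;

/// Compare two slices without returning early on the first mismatch.
/// Simplified stand-in for `subtle::ConstantTimeEq` (assumption: the
/// compiler does not reintroduce a short-circuit).
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Return the embedded timestamp if the first 28 bytes match the HMAC.
fn extract_timestamp(digest: &[u8; 32], computed: &[u8; 32]) -> Option<u32> {
    if !ct_eq(&digest[..28], &computed[..28]) {
        return None;
    }
    // The last 4 bytes are timestamp XOR hmac[28..32], little-endian.
    Some(u32::from_le_bytes([
        digest[28] ^ computed[28],
        digest[29] ^ computed[29],
        digest[30] ^ computed[30],
        digest[31] ^ computed[31],
    ]))
}

/// Check `now - timestamp` against the accepted skew window.
fn within_skew(now: i64, timestamp: u32) -> bool {
    let diff = now - i64::from(timestamp);
    (TIME_SKEW_MIN..=TIME_SKEW_MAX).contains(&diff)
}

fn main() {
    // Pretend this is the HMAC the server computed for some secret.
    let computed = [0x11u8; 32];
    let ts: u32 = 1_700_000_000;
    // A client embeds the timestamp by XORing it into the last 4 bytes.
    let mut digest = computed;
    for (i, b) in ts.to_le_bytes().iter().enumerate() {
        digest[28 + i] ^= *b;
    }
    println!("timestamp: {:?}", extract_timestamp(&digest, &computed));
    println!("in window: {}", within_skew(i64::from(ts) + 300, ts));
}
```

Both variants of the check in the hunk above implement this layout; they differ in whether the 28-byte comparison short-circuits and whether an intermediate `Vec` is allocated.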

File diff suppressed because it is too large.

@@ -0,0 +1,383 @@
use dashmap::DashMap;
use std::cmp::max;
use std::sync::OnceLock;
use std::time::{Duration, Instant};
const EMA_ALPHA: f64 = 0.2;
const PROFILE_TTL: Duration = Duration::from_secs(300);
const THROUGHPUT_UP_BPS: f64 = 8_000_000.0;
const THROUGHPUT_DOWN_BPS: f64 = 2_000_000.0;
const RATIO_CONFIRM_THRESHOLD: f64 = 1.12;
const TIER1_HOLD_TICKS: u32 = 8;
const TIER2_HOLD_TICKS: u32 = 4;
const QUIET_DEMOTE_TICKS: u32 = 480;
const HARD_COOLDOWN_TICKS: u32 = 20;
const HARD_PENDING_THRESHOLD: u32 = 3;
const HARD_PARTIAL_RATIO_THRESHOLD: f64 = 0.25;
const DIRECT_C2S_CAP_BYTES: usize = 128 * 1024;
const DIRECT_S2C_CAP_BYTES: usize = 512 * 1024;
const ME_FRAMES_CAP: usize = 96;
const ME_BYTES_CAP: usize = 384 * 1024;
const ME_DELAY_MIN_US: u64 = 150;
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum AdaptiveTier {
Base = 0,
Tier1 = 1,
Tier2 = 2,
Tier3 = 3,
}
impl AdaptiveTier {
pub fn promote(self) -> Self {
match self {
Self::Base => Self::Tier1,
Self::Tier1 => Self::Tier2,
Self::Tier2 => Self::Tier3,
Self::Tier3 => Self::Tier3,
}
}
pub fn demote(self) -> Self {
match self {
Self::Base => Self::Base,
Self::Tier1 => Self::Base,
Self::Tier2 => Self::Tier1,
Self::Tier3 => Self::Tier2,
}
}
fn ratio(self) -> (usize, usize) {
match self {
Self::Base => (1, 1),
Self::Tier1 => (5, 4),
Self::Tier2 => (3, 2),
Self::Tier3 => (2, 1),
}
}
pub fn as_u8(self) -> u8 {
self as u8
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TierTransitionReason {
SoftConfirmed,
HardPressure,
QuietDemotion,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TierTransition {
pub from: AdaptiveTier,
pub to: AdaptiveTier,
pub reason: TierTransitionReason,
}
#[derive(Debug, Clone, Copy, Default)]
pub struct RelaySignalSample {
pub c2s_bytes: u64,
pub s2c_requested_bytes: u64,
pub s2c_written_bytes: u64,
pub s2c_write_ops: u64,
pub s2c_partial_writes: u64,
pub s2c_consecutive_pending_writes: u32,
}
#[derive(Debug, Clone, Copy)]
pub struct SessionAdaptiveController {
tier: AdaptiveTier,
max_tier_seen: AdaptiveTier,
throughput_ema_bps: f64,
incoming_ema_bps: f64,
outgoing_ema_bps: f64,
tier1_hold_ticks: u32,
tier2_hold_ticks: u32,
quiet_ticks: u32,
hard_cooldown_ticks: u32,
}
impl SessionAdaptiveController {
pub fn new(initial_tier: AdaptiveTier) -> Self {
Self {
tier: initial_tier,
max_tier_seen: initial_tier,
throughput_ema_bps: 0.0,
incoming_ema_bps: 0.0,
outgoing_ema_bps: 0.0,
tier1_hold_ticks: 0,
tier2_hold_ticks: 0,
quiet_ticks: 0,
hard_cooldown_ticks: 0,
}
}
pub fn max_tier_seen(&self) -> AdaptiveTier {
self.max_tier_seen
}
pub fn observe(&mut self, sample: RelaySignalSample, tick_secs: f64) -> Option<TierTransition> {
if tick_secs <= f64::EPSILON {
return None;
}
if self.hard_cooldown_ticks > 0 {
self.hard_cooldown_ticks -= 1;
}
let c2s_bps = (sample.c2s_bytes as f64 * 8.0) / tick_secs;
let incoming_bps = (sample.s2c_requested_bytes as f64 * 8.0) / tick_secs;
let outgoing_bps = (sample.s2c_written_bytes as f64 * 8.0) / tick_secs;
let throughput = c2s_bps.max(outgoing_bps);
self.throughput_ema_bps = ema(self.throughput_ema_bps, throughput);
self.incoming_ema_bps = ema(self.incoming_ema_bps, incoming_bps);
self.outgoing_ema_bps = ema(self.outgoing_ema_bps, outgoing_bps);
let tier1_now = self.throughput_ema_bps >= THROUGHPUT_UP_BPS;
if tier1_now {
self.tier1_hold_ticks = self.tier1_hold_ticks.saturating_add(1);
} else {
self.tier1_hold_ticks = 0;
}
let ratio = if self.outgoing_ema_bps <= f64::EPSILON {
0.0
} else {
self.incoming_ema_bps / self.outgoing_ema_bps
};
let tier2_now = ratio >= RATIO_CONFIRM_THRESHOLD;
if tier2_now {
self.tier2_hold_ticks = self.tier2_hold_ticks.saturating_add(1);
} else {
self.tier2_hold_ticks = 0;
}
let partial_ratio = if sample.s2c_write_ops == 0 {
0.0
} else {
sample.s2c_partial_writes as f64 / sample.s2c_write_ops as f64
};
let hard_now = sample.s2c_consecutive_pending_writes >= HARD_PENDING_THRESHOLD
|| partial_ratio >= HARD_PARTIAL_RATIO_THRESHOLD;
if hard_now && self.hard_cooldown_ticks == 0 {
return self.promote(TierTransitionReason::HardPressure, HARD_COOLDOWN_TICKS);
}
if self.tier1_hold_ticks >= TIER1_HOLD_TICKS && self.tier2_hold_ticks >= TIER2_HOLD_TICKS {
return self.promote(TierTransitionReason::SoftConfirmed, 0);
}
let demote_candidate = self.throughput_ema_bps < THROUGHPUT_DOWN_BPS && !tier2_now && !hard_now;
if demote_candidate {
self.quiet_ticks = self.quiet_ticks.saturating_add(1);
if self.quiet_ticks >= QUIET_DEMOTE_TICKS {
self.quiet_ticks = 0;
return self.demote(TierTransitionReason::QuietDemotion);
}
} else {
self.quiet_ticks = 0;
}
None
}
fn promote(
&mut self,
reason: TierTransitionReason,
hard_cooldown_ticks: u32,
) -> Option<TierTransition> {
let from = self.tier;
let to = from.promote();
if from == to {
return None;
}
self.tier = to;
self.max_tier_seen = max(self.max_tier_seen, to);
self.hard_cooldown_ticks = hard_cooldown_ticks;
self.tier1_hold_ticks = 0;
self.tier2_hold_ticks = 0;
self.quiet_ticks = 0;
Some(TierTransition { from, to, reason })
}
fn demote(&mut self, reason: TierTransitionReason) -> Option<TierTransition> {
let from = self.tier;
let to = from.demote();
if from == to {
return None;
}
self.tier = to;
self.tier1_hold_ticks = 0;
self.tier2_hold_ticks = 0;
Some(TierTransition { from, to, reason })
}
}
#[derive(Debug, Clone, Copy)]
struct UserAdaptiveProfile {
tier: AdaptiveTier,
seen_at: Instant,
}
fn profiles() -> &'static DashMap<String, UserAdaptiveProfile> {
static USER_PROFILES: OnceLock<DashMap<String, UserAdaptiveProfile>> = OnceLock::new();
USER_PROFILES.get_or_init(DashMap::new)
}
pub fn seed_tier_for_user(user: &str) -> AdaptiveTier {
let now = Instant::now();
if let Some(entry) = profiles().get(user) {
let value = entry.value();
if now.duration_since(value.seen_at) <= PROFILE_TTL {
return value.tier;
}
}
AdaptiveTier::Base
}
pub fn record_user_tier(user: &str, tier: AdaptiveTier) {
let now = Instant::now();
if let Some(mut entry) = profiles().get_mut(user) {
let existing = *entry;
let effective = if now.duration_since(existing.seen_at) > PROFILE_TTL {
tier
} else {
max(existing.tier, tier)
};
*entry = UserAdaptiveProfile {
tier: effective,
seen_at: now,
};
return;
}
profiles().insert(
user.to_string(),
UserAdaptiveProfile { tier, seen_at: now },
);
}
pub fn direct_copy_buffers_for_tier(
tier: AdaptiveTier,
base_c2s: usize,
base_s2c: usize,
) -> (usize, usize) {
let (num, den) = tier.ratio();
(
scale(base_c2s, num, den, DIRECT_C2S_CAP_BYTES),
scale(base_s2c, num, den, DIRECT_S2C_CAP_BYTES),
)
}
pub fn me_flush_policy_for_tier(
tier: AdaptiveTier,
base_frames: usize,
base_bytes: usize,
base_delay: Duration,
) -> (usize, usize, Duration) {
let (num, den) = tier.ratio();
let frames = scale(base_frames, num, den, ME_FRAMES_CAP).max(1);
let bytes = scale(base_bytes, num, den, ME_BYTES_CAP).max(4096);
let delay_us = base_delay.as_micros() as u64;
let adjusted_delay_us = match tier {
AdaptiveTier::Base => delay_us,
AdaptiveTier::Tier1 => (delay_us.saturating_mul(7)).saturating_div(10),
AdaptiveTier::Tier2 => delay_us.saturating_div(2),
AdaptiveTier::Tier3 => (delay_us.saturating_mul(3)).saturating_div(10),
}
.max(ME_DELAY_MIN_US)
.min(delay_us.max(ME_DELAY_MIN_US));
(frames, bytes, Duration::from_micros(adjusted_delay_us))
}
fn ema(prev: f64, value: f64) -> f64 {
if prev <= f64::EPSILON {
value
} else {
(prev * (1.0 - EMA_ALPHA)) + (value * EMA_ALPHA)
}
}
fn scale(base: usize, numerator: usize, denominator: usize, cap: usize) -> usize {
let scaled = base
.saturating_mul(numerator)
.saturating_div(denominator.max(1));
scaled.min(cap).max(1)
}
#[cfg(test)]
mod tests {
use super::*;
fn sample(
c2s_bytes: u64,
s2c_requested_bytes: u64,
s2c_written_bytes: u64,
s2c_write_ops: u64,
s2c_partial_writes: u64,
s2c_consecutive_pending_writes: u32,
) -> RelaySignalSample {
RelaySignalSample {
c2s_bytes,
s2c_requested_bytes,
s2c_written_bytes,
s2c_write_ops,
s2c_partial_writes,
s2c_consecutive_pending_writes,
}
}
#[test]
fn test_soft_promotion_requires_tier1_and_tier2() {
let mut ctrl = SessionAdaptiveController::new(AdaptiveTier::Base);
let tick_secs = 0.25;
let mut promoted = None;
for _ in 0..8 {
promoted = ctrl.observe(
sample(
300_000, // ~9.6 Mbps
320_000, // incoming > outgoing to confirm tier2
250_000,
10,
0,
0,
),
tick_secs,
);
}
let transition = promoted.expect("expected soft promotion");
assert_eq!(transition.from, AdaptiveTier::Base);
assert_eq!(transition.to, AdaptiveTier::Tier1);
assert_eq!(transition.reason, TierTransitionReason::SoftConfirmed);
}
#[test]
fn test_hard_promotion_on_pending_pressure() {
let mut ctrl = SessionAdaptiveController::new(AdaptiveTier::Base);
let transition = ctrl
.observe(
sample(10_000, 20_000, 10_000, 4, 1, 3),
0.25,
)
.expect("expected hard promotion");
assert_eq!(transition.reason, TierTransitionReason::HardPressure);
assert_eq!(transition.to, AdaptiveTier::Tier1);
}
#[test]
fn test_quiet_demotion_is_slow_and_stepwise() {
let mut ctrl = SessionAdaptiveController::new(AdaptiveTier::Tier2);
let mut demotion = None;
for _ in 0..QUIET_DEMOTE_TICKS {
demotion = ctrl.observe(sample(1, 1, 1, 1, 0, 0), 0.25);
}
let transition = demotion.expect("expected quiet demotion");
assert_eq!(transition.from, AdaptiveTier::Tier2);
assert_eq!(transition.to, AdaptiveTier::Tier1);
assert_eq!(transition.reason, TierTransitionReason::QuietDemotion);
}
}
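Note: two small pieces of arithmetic drive the adaptive tiers in this file: the EMA that smooths throughput, and the capped ratio scaling that sizes buffers. The sketch below copies `ema` and `scale` from the diff so it runs standalone; the base buffer sizes in `main` are made-up inputs, not project defaults.

```rust
const EMA_ALPHA: f64 = 0.2;

/// EMA as in the diff: the first non-zero sample seeds the average
/// directly instead of being blended with 0.
fn ema(prev: f64, value: f64) -> f64 {
    if prev <= f64::EPSILON {
        value
    } else {
        prev * (1.0 - EMA_ALPHA) + value * EMA_ALPHA
    }
}

/// Scale `base` by `numerator / denominator`, clamp to `cap`, never return 0.
fn scale(base: usize, numerator: usize, denominator: usize, cap: usize) -> usize {
    base.saturating_mul(numerator)
        .saturating_div(denominator.max(1))
        .min(cap)
        .max(1)
}

fn main() {
    // Seeding: a 9.6 Mbps first sample puts the EMA above the 8 Mbps
    // promotion threshold on the very first tick.
    let first = ema(0.0, 9_600_000.0);
    println!("ema after first tick: {first}");

    // Later samples move the average by 20% of the difference.
    println!("ema(10, 20) = {}", ema(10.0, 20.0));

    // Tier ratios from the diff: Tier1 = 5/4, Tier3 = 2/1.
    let cap = 128 * 1024; // DIRECT_C2S_CAP_BYTES in the diff
    println!("tier1 of 64 KiB: {}", scale(64 * 1024, 5, 4, cap));
    println!("tier3 of 96 KiB: {}", scale(96 * 1024, 2, 1, cap));
}
```

The seeding behavior is what lets `test_soft_promotion_requires_tier1_and_tier2` reach `TIER1_HOLD_TICKS` in exactly 8 observations: the threshold is already met on the first tick rather than after a warm-up period.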

@@ -4,10 +4,7 @@ use std::future::Future;
use std::net::{IpAddr, SocketAddr};
use std::pin::Pin;
use std::sync::Arc;
use std::sync::OnceLock;
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::Duration;
use ipnetwork::IpNetwork;
use tokio::io::{AsyncRead, AsyncReadExt, AsyncWrite};
use tokio::net::TcpStream;
use tokio::time::timeout;
@@ -26,7 +23,7 @@ enum HandshakeOutcome {
use crate::config::ProxyConfig;
use crate::crypto::SecureRandom;
use crate::error::{HandshakeResult, ProxyError, Result, StreamError};
use crate::error::{HandshakeResult, ProxyError, Result};
use crate::ip_tracker::UserIpTracker;
use crate::protocol::constants::*;
use crate::protocol::tls;
@@ -43,6 +40,7 @@ use crate::proxy::handshake::{HandshakeSuccess, handle_mtproto_handshake, handle
use crate::proxy::masking::handle_bad_client;
use crate::proxy::middle_relay::handle_via_middle_proxy;
use crate::proxy::route_mode::{RelayRouteMode, RouteRuntimeController};
use crate::proxy::session_eviction::register_session;
fn beobachten_ttl(config: &ProxyConfig) -> Duration {
Duration::from_secs(config.general.beobachten_minutes.saturating_mul(60))
@@ -66,30 +64,14 @@ fn record_handshake_failure_class(
peer_ip: IpAddr,
error: &ProxyError,
) {
let class = match error {
ProxyError::Io(err) if err.kind() == std::io::ErrorKind::UnexpectedEof => {
"expected_64_got_0"
}
ProxyError::Stream(StreamError::UnexpectedEof) => "expected_64_got_0",
_ => "other",
let class = if error.to_string().contains("expected 64 bytes, got 0") {
"expected_64_got_0"
} else {
"other"
};
record_beobachten_class(beobachten, config, peer_ip, class);
}
fn is_trusted_proxy_source(peer_ip: IpAddr, trusted: &[IpNetwork]) -> bool {
if trusted.is_empty() {
static EMPTY_PROXY_TRUST_WARNED: OnceLock<AtomicBool> = OnceLock::new();
let warned = EMPTY_PROXY_TRUST_WARNED.get_or_init(|| AtomicBool::new(false));
if !warned.swap(true, Ordering::Relaxed) {
warn!(
"PROXY protocol enabled but server.proxy_protocol_trusted_cidrs is empty; rejecting all PROXY headers by default"
);
}
return false;
}
trusted.iter().any(|cidr| cidr.contains(peer_ip))
}
pub async fn handle_client_stream<S>(
mut stream: S,
peer: SocketAddr,
@@ -123,17 +105,6 @@ where
);
match timeout(proxy_header_timeout, parse_proxy_protocol(&mut stream, peer)).await {
Ok(Ok(info)) => {
if !is_trusted_proxy_source(peer.ip(), &config.server.proxy_protocol_trusted_cidrs)
{
stats.increment_connects_bad();
warn!(
peer = %peer,
trusted = ?config.server.proxy_protocol_trusted_cidrs,
"Rejecting PROXY protocol header from untrusted source"
);
record_beobachten_class(&beobachten, &config, peer.ip(), "other");
return Err(ProxyError::InvalidProxyProtocol);
}
debug!(
peer = %peer,
client = %info.src_addr,
@@ -179,13 +150,8 @@ where
if is_tls {
let tls_len = u16::from_be_bytes([first_bytes[3], first_bytes[4]]) as usize;
// RFC 8446 §5.1 mandates that TLSPlaintext records must not exceed 2^14
// bytes (16_384). A client claiming a larger record is non-compliant and
// may be an active probe attempting to force large allocations.
//
// Also enforce a minimum record size to avoid trivial/garbage probes.
if !(512..=MAX_TLS_RECORD_SIZE).contains(&tls_len) {
debug!(peer = %real_peer, tls_len = tls_len, max_tls_len = MAX_TLS_RECORD_SIZE, "TLS handshake length out of bounds");
if tls_len < 512 {
debug!(peer = %real_peer, tls_len = tls_len, "TLS handshake too short");
stats.increment_connects_bad();
let (reader, writer) = tokio::io::split(stream);
handle_bad_client(
@@ -239,19 +205,9 @@ where
&config, &replay_checker, true, Some(tls_user.as_str()),
).await {
HandshakeResult::Success(result) => result,
HandshakeResult::BadClient { reader, writer } => {
HandshakeResult::BadClient { reader: _, writer: _ } => {
stats.increment_connects_bad();
debug!(peer = %peer, "Valid TLS but invalid MTProto handshake");
handle_bad_client(
reader,
writer,
&mtproto_handshake,
real_peer,
local_addr,
&config,
&beobachten,
)
.await;
return Ok(HandshakeOutcome::Handled);
}
HandshakeResult::Error(e) => return Err(e),
@@ -490,24 +446,6 @@ impl RunningClientHandler {
.await
{
Ok(Ok(info)) => {
if !is_trusted_proxy_source(
self.peer.ip(),
&self.config.server.proxy_protocol_trusted_cidrs,
) {
self.stats.increment_connects_bad();
warn!(
peer = %self.peer,
trusted = ?self.config.server.proxy_protocol_trusted_cidrs,
"Rejecting PROXY protocol header from untrusted source"
);
record_beobachten_class(
&self.beobachten,
&self.config,
self.peer.ip(),
"other",
);
return Err(ProxyError::InvalidProxyProtocol);
}
debug!(
peer = %self.peer,
client = %info.src_addr,
@@ -576,10 +514,8 @@ impl RunningClientHandler {
debug!(peer = %peer, tls_len = tls_len, "Reading TLS handshake");
// See RFC 8446 §5.1: TLSPlaintext records must not exceed 16_384 bytes.
// Treat too-small or too-large lengths as active probes and mask them.
if !(512..=MAX_TLS_RECORD_SIZE).contains(&tls_len) {
debug!(peer = %peer, tls_len = tls_len, max_tls_len = MAX_TLS_RECORD_SIZE, "TLS handshake length out of bounds");
if tls_len < 512 {
debug!(peer = %peer, tls_len = tls_len, "TLS handshake too short");
self.stats.increment_connects_bad();
let (reader, writer) = self.stream.into_split();
handle_bad_client(
@@ -655,19 +591,12 @@ impl RunningClientHandler {
.await
{
HandshakeResult::Success(result) => result,
HandshakeResult::BadClient { reader, writer } => {
HandshakeResult::BadClient {
reader: _,
writer: _,
} => {
stats.increment_connects_bad();
debug!(peer = %peer, "Valid TLS but invalid MTProto handshake");
handle_bad_client(
reader,
writer,
&mtproto_handshake,
peer,
local_addr,
&config,
&self.beobachten,
)
.await;
return Ok(HandshakeOutcome::Handled);
}
HandshakeResult::Error(e) => return Err(e),
@@ -803,6 +732,17 @@ impl RunningClientHandler {
return Err(e);
}
let registration = register_session(&user, success.dc_idx);
if registration.replaced_existing {
stats.increment_reconnect_evict_total();
warn!(
user = %user,
dc = success.dc_idx,
"Reconnect detected: replacing active session for user+dc"
);
}
let session_lease = registration.lease;
let route_snapshot = route_runtime.snapshot();
let session_id = rng.u64();
let relay_result = if config.general.use_middle_proxy
@@ -814,7 +754,7 @@ impl RunningClientHandler {
client_writer,
success,
pool.clone(),
stats.clone(),
stats,
config,
buffer_pool,
local_addr,
@@ -822,6 +762,7 @@ impl RunningClientHandler {
route_runtime.subscribe(),
route_snapshot,
session_id,
session_lease.clone(),
)
.await
} else {
@@ -831,13 +772,14 @@ impl RunningClientHandler {
client_writer,
success,
upstream_manager,
stats.clone(),
stats,
config,
buffer_pool,
rng,
route_runtime.subscribe(),
route_snapshot,
session_id,
session_lease.clone(),
)
.await
}
@@ -848,18 +790,18 @@ impl RunningClientHandler {
client_writer,
success,
upstream_manager,
stats.clone(),
stats,
config,
buffer_pool,
rng,
route_runtime.subscribe(),
route_snapshot,
session_id,
session_lease.clone(),
)
.await
};
stats.decrement_user_curr_connects(&user);
ip_tracker.remove_ip(&user, peer_addr.ip()).await;
relay_result
}
@@ -879,29 +821,9 @@ impl RunningClientHandler {
});
}
if let Some(quota) = config.access.user_data_quota.get(user)
&& stats.get_user_total_octets(user) >= *quota
{
return Err(ProxyError::DataQuotaExceeded {
user: user.to_string(),
});
}
let limit = config
.access
.user_max_tcp_conns
.get(user)
.map(|v| *v as u64);
if !stats.try_acquire_user_curr_connects(user, limit) {
return Err(ProxyError::ConnectionLimitExceeded {
user: user.to_string(),
});
}
match ip_tracker.check_and_add(user, peer_addr.ip()).await {
Ok(()) => {}
let ip_reserved = match ip_tracker.check_and_add(user, peer_addr.ip()).await {
Ok(()) => true,
Err(reason) => {
stats.decrement_user_curr_connects(user);
warn!(
user = %user,
ip = %peer_addr.ip(),
@@ -912,12 +834,33 @@ impl RunningClientHandler {
user: user.to_string(),
});
}
};
// Per-user TCP connection limit check (rolls back the IP reservation on failure)
if let Some(limit) = config.access.user_max_tcp_conns.get(user)
&& stats.get_user_curr_connects(user) >= *limit as u64
{
if ip_reserved {
ip_tracker.remove_ip(user, peer_addr.ip()).await;
stats.increment_ip_reservation_rollback_tcp_limit_total();
}
return Err(ProxyError::ConnectionLimitExceeded {
user: user.to_string(),
});
}
if let Some(quota) = config.access.user_data_quota.get(user)
&& stats.get_user_total_octets(user) >= *quota
{
if ip_reserved {
ip_tracker.remove_ip(user, peer_addr.ip()).await;
stats.increment_ip_reservation_rollback_quota_limit_total();
}
return Err(ProxyError::DataQuotaExceeded {
user: user.to_string(),
});
}
Ok(())
}
}
#[cfg(test)]
#[path = "client_security_tests.rs"]
mod security_tests;
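
For readers comparing the two TLS length checks in the hunks above, the following is a minimal standalone sketch of both predicates, not the project's code. The function names and `MIN_TLS_HELLO_LEN` are hypothetical; `MAX_TLS_RECORD_SIZE` is assumed to be 16_384, matching the RFC 8446 §5.1 comment in the diff, and the 512-byte floor is taken from the diff.

```rust
/// Assumed value; the diff's comment cites 2^14 bytes from RFC 8446 §5.1.
const MAX_TLS_RECORD_SIZE: usize = 16_384;
/// Hypothetical name for the 512-byte floor used in the diff.
const MIN_TLS_HELLO_LEN: usize = 512;

/// Read the big-endian length field (bytes 3 and 4) of a 5-byte TLS record header.
fn tls_record_len(header: &[u8; 5]) -> usize {
    u16::from_be_bytes([header[3], header[4]]) as usize
}

/// Range check: rejects both too-short and oversized claimed lengths.
fn len_within_bounds(tls_len: usize) -> bool {
    (MIN_TLS_HELLO_LEN..=MAX_TLS_RECORD_SIZE).contains(&tls_len)
}

/// Lower-bound-only check: rejects too-short lengths but accepts any larger value.
fn len_at_least_min(tls_len: usize) -> bool {
    tls_len >= MIN_TLS_HELLO_LEN
}
```

Under these assumptions, a claimed length of 16_385 passes `len_at_least_min` but fails `len_within_bounds`; that is the only difference between the two predicates.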

File diff suppressed because it is too large


@@ -2,8 +2,6 @@ use std::fs::OpenOptions;
use std::io::Write;
use std::net::SocketAddr;
use std::sync::Arc;
use std::collections::HashSet;
use std::sync::{Mutex, OnceLock};
use tokio::io::{AsyncRead, AsyncWrite, AsyncWriteExt};
use tokio::net::TcpStream;
@@ -20,49 +18,12 @@ use crate::proxy::route_mode::{
RelayRouteMode, RouteCutoverState, ROUTE_SWITCH_ERROR_MSG, affected_cutover_state,
cutover_stagger_delay,
};
use crate::proxy::adaptive_buffers;
use crate::proxy::session_eviction::SessionLease;
use crate::stats::Stats;
use crate::stream::{BufferPool, CryptoReader, CryptoWriter};
use crate::transport::UpstreamManager;
const UNKNOWN_DC_LOG_DISTINCT_LIMIT: usize = 1024;
static LOGGED_UNKNOWN_DCS: OnceLock<Mutex<HashSet<i16>>> = OnceLock::new();
// In tests, this function shares global mutable state. Callers that also use
// cache-reset helpers must hold `unknown_dc_test_lock()` to keep assertions
// deterministic under parallel execution.
fn should_log_unknown_dc(dc_idx: i16) -> bool {
let set = LOGGED_UNKNOWN_DCS.get_or_init(|| Mutex::new(HashSet::new()));
match set.lock() {
Ok(mut guard) => {
if guard.contains(&dc_idx) {
return false;
}
if guard.len() >= UNKNOWN_DC_LOG_DISTINCT_LIMIT {
return false;
}
guard.insert(dc_idx)
}
// If the lock is poisoned, keep logging rather than silently dropping
// operator-visible diagnostics.
Err(_) => true,
}
}
#[cfg(test)]
fn clear_unknown_dc_log_cache_for_testing() {
if let Some(set) = LOGGED_UNKNOWN_DCS.get()
&& let Ok(mut guard) = set.lock()
{
guard.clear();
}
}
#[cfg(test)]
fn unknown_dc_test_lock() -> &'static Mutex<()> {
static TEST_LOCK: OnceLock<Mutex<()>> = OnceLock::new();
TEST_LOCK.get_or_init(|| Mutex::new(()))
}
pub(crate) async fn handle_via_direct<R, W>(
client_reader: CryptoReader<R>,
client_writer: CryptoWriter<W>,
@@ -75,6 +36,7 @@ pub(crate) async fn handle_via_direct<R, W>(
mut route_rx: watch::Receiver<RouteCutoverState>,
route_snapshot: RouteCutoverState,
session_id: u64,
session_lease: SessionLease,
) -> Result<()>
where
R: AsyncRead + Unpin + Send + 'static,
@@ -105,18 +67,29 @@ where
debug!(peer = %success.peer, "TG handshake complete, starting relay");
stats.increment_user_connects(user);
stats.increment_user_curr_connects(user);
stats.increment_current_connections_direct();
let seed_tier = adaptive_buffers::seed_tier_for_user(user);
let (c2s_copy_buf, s2c_copy_buf) = adaptive_buffers::direct_copy_buffers_for_tier(
seed_tier,
config.general.direct_relay_copy_buf_c2s_bytes,
config.general.direct_relay_copy_buf_s2c_bytes,
);
let relay_result = relay_bidirectional(
client_reader,
client_writer,
tg_reader,
tg_writer,
config.general.direct_relay_copy_buf_c2s_bytes,
config.general.direct_relay_copy_buf_s2c_bytes,
c2s_copy_buf,
s2c_copy_buf,
user,
success.dc_idx,
Arc::clone(&stats),
buffer_pool,
session_lease,
seed_tier,
);
tokio::pin!(relay_result);
let relay_result = loop {
@@ -149,6 +122,7 @@ where
};
stats.decrement_current_connections_direct();
stats.decrement_user_curr_connects(user);
match &relay_result {
Ok(()) => debug!(user = %user, "Direct relay completed"),
@@ -199,7 +173,6 @@ fn get_dc_addr_static(dc_idx: i16, config: &ProxyConfig) -> Result<SocketAddr> {
warn!(dc_idx = dc_idx, "Requested non-standard DC with no override; falling back to default cluster");
if config.general.unknown_dc_file_log_enabled
&& let Some(path) = &config.general.unknown_dc_log_path
&& should_log_unknown_dc(dc_idx)
&& let Ok(handle) = tokio::runtime::Handle::try_current()
{
let path = path.clone();
@@ -215,7 +188,7 @@ fn get_dc_addr_static(dc_idx: i16, config: &ProxyConfig) -> Result<SocketAddr> {
let fallback_idx = if default_dc >= 1 && default_dc <= num_dcs {
default_dc - 1
} else {
0
1
};
info!(
@@ -243,6 +216,8 @@ async fn do_tg_handshake_static(
let (nonce, _tg_enc_key, _tg_enc_iv, _tg_dec_key, _tg_dec_iv) = generate_tg_nonce(
success.proto_tag,
success.dc_idx,
&success.dec_key,
success.dec_iv,
&success.enc_key,
success.enc_iv,
rng,
@@ -268,7 +243,3 @@ async fn do_tg_handshake_static(
CryptoWriter::new(write_half, tg_encryptor, max_pending),
))
}
#[cfg(test)]
#[path = "direct_relay_security_tests.rs"]
mod security_tests;
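
The `should_log_unknown_dc` helper touched by this file's diff de-duplicates log writes per `dc_idx` and caps the number of distinct entries. A minimal sketch of that pattern follows, using an instance-owned set instead of a global `OnceLock` so it is easier to exercise in isolation. The type `BoundedLogDedup` and its methods are hypothetical names, not part of the codebase.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical helper: remembers which keys were already logged, up to `limit`.
struct BoundedLogDedup {
    seen: Mutex<HashSet<i16>>,
    limit: usize,
}

impl BoundedLogDedup {
    fn new(limit: usize) -> Self {
        Self { seen: Mutex::new(HashSet::new()), limit }
    }

    /// Returns true only the first time a key is seen while the set is below `limit`.
    fn should_log(&self, dc_idx: i16) -> bool {
        match self.seen.lock() {
            Ok(mut guard) => {
                if guard.contains(&dc_idx) || guard.len() >= self.limit {
                    return false;
                }
                guard.insert(dc_idx)
            }
            // On a poisoned lock, keep logging rather than hide diagnostics.
            Err(_) => true,
        }
    }
}
```

The cap bounds memory when many distinct values arrive; the trade-off is that keys first seen after the cap is reached are never logged.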


@@ -1,51 +0,0 @@
use super::*;
#[test]
fn unknown_dc_log_is_deduplicated_per_dc_idx() {
let _guard = unknown_dc_test_lock()
.lock()
.expect("unknown dc test lock must be available");
clear_unknown_dc_log_cache_for_testing();
assert!(should_log_unknown_dc(777));
assert!(
!should_log_unknown_dc(777),
"same unknown dc_idx must not be logged repeatedly"
);
assert!(
should_log_unknown_dc(778),
"different unknown dc_idx must still be loggable"
);
}
#[test]
fn unknown_dc_log_respects_distinct_limit() {
let _guard = unknown_dc_test_lock()
.lock()
.expect("unknown dc test lock must be available");
clear_unknown_dc_log_cache_for_testing();
for dc in 1..=UNKNOWN_DC_LOG_DISTINCT_LIMIT {
assert!(
should_log_unknown_dc(dc as i16),
"expected first-time unknown dc_idx to be loggable"
);
}
assert!(
!should_log_unknown_dc(i16::MAX),
"distinct unknown dc_idx entries above limit must not be logged"
);
}
#[test]
fn fallback_dc_never_panics_with_single_dc_list() {
let mut cfg = ProxyConfig::default();
cfg.network.prefer = 6;
cfg.network.ipv6 = Some(true);
cfg.default_dc = Some(42);
let addr = get_dc_addr_static(999, &cfg).expect("fallback dc must resolve safely");
let expected = SocketAddr::new(TG_DATACENTERS_V6[0], TG_DATACENTER_PORT);
assert_eq!(addr, expected);
}
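
The `fallback_dc_never_panics_with_single_dc_list` test above relates to how `get_dc_addr_static` turns a 1-based `default_dc` into a list index. A minimal sketch of that mapping is shown here, under the assumption that the index is later used on a list of `num_dcs` entries. The function name and the `usize` parameter types are hypothetical.

```rust
/// Map a 1-based `default_dc` to a 0-based index into a list of `num_dcs`
/// entries, falling back to index 0 when `default_dc` is out of range.
fn fallback_index(default_dc: usize, num_dcs: usize) -> usize {
    if (1..=num_dcs).contains(&default_dc) {
        default_dc - 1
    } else {
        // Index 0 is valid for any non-empty list.
        0
    }
}
```

With a fallback value of 0 the result is always in range for a non-empty list, whereas a fallback value of 1 would be out of bounds for a one-entry list.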


@@ -3,12 +3,8 @@
#![allow(dead_code)]
use std::net::SocketAddr;
use std::collections::HashSet;
use std::net::IpAddr;
use std::sync::Arc;
use std::sync::{Mutex, OnceLock};
use std::time::{Duration, Instant};
use dashmap::DashMap;
use std::time::Duration;
use tokio::io::{AsyncRead, AsyncWrite, AsyncWriteExt};
use tracing::{debug, warn, trace};
use zeroize::Zeroize;
@@ -23,231 +19,6 @@ use crate::stats::ReplayChecker;
use crate::config::ProxyConfig;
use crate::tls_front::{TlsFrontCache, emulator};
const ACCESS_SECRET_BYTES: usize = 16;
static INVALID_SECRET_WARNED: OnceLock<Mutex<HashSet<(String, String)>>> = OnceLock::new();
const AUTH_PROBE_TRACK_RETENTION_SECS: u64 = 10 * 60;
#[cfg(test)]
const AUTH_PROBE_TRACK_MAX_ENTRIES: usize = 256;
#[cfg(not(test))]
const AUTH_PROBE_TRACK_MAX_ENTRIES: usize = 65_536;
const AUTH_PROBE_PRUNE_SCAN_LIMIT: usize = 1_024;
const AUTH_PROBE_BACKOFF_START_FAILS: u32 = 4;
#[cfg(test)]
const AUTH_PROBE_BACKOFF_BASE_MS: u64 = 1;
#[cfg(not(test))]
const AUTH_PROBE_BACKOFF_BASE_MS: u64 = 25;
#[cfg(test)]
const AUTH_PROBE_BACKOFF_MAX_MS: u64 = 16;
#[cfg(not(test))]
const AUTH_PROBE_BACKOFF_MAX_MS: u64 = 1_000;
#[derive(Clone, Copy)]
struct AuthProbeState {
fail_streak: u32,
blocked_until: Instant,
last_seen: Instant,
}
static AUTH_PROBE_STATE: OnceLock<DashMap<IpAddr, AuthProbeState>> = OnceLock::new();
fn auth_probe_state_map() -> &'static DashMap<IpAddr, AuthProbeState> {
AUTH_PROBE_STATE.get_or_init(DashMap::new)
}
fn auth_probe_backoff(fail_streak: u32) -> Duration {
if fail_streak < AUTH_PROBE_BACKOFF_START_FAILS {
return Duration::ZERO;
}
let shift = (fail_streak - AUTH_PROBE_BACKOFF_START_FAILS).min(10);
let multiplier = 1u64.checked_shl(shift).unwrap_or(u64::MAX);
let ms = AUTH_PROBE_BACKOFF_BASE_MS
.saturating_mul(multiplier)
.min(AUTH_PROBE_BACKOFF_MAX_MS);
Duration::from_millis(ms)
}
fn auth_probe_state_expired(state: &AuthProbeState, now: Instant) -> bool {
let retention = Duration::from_secs(AUTH_PROBE_TRACK_RETENTION_SECS);
now.duration_since(state.last_seen) > retention
}
fn auth_probe_is_throttled(peer_ip: IpAddr, now: Instant) -> bool {
let state = auth_probe_state_map();
let Some(entry) = state.get(&peer_ip) else {
return false;
};
if auth_probe_state_expired(&entry, now) {
drop(entry);
state.remove(&peer_ip);
return false;
}
now < entry.blocked_until
}
fn auth_probe_record_failure(peer_ip: IpAddr, now: Instant) {
let state = auth_probe_state_map();
auth_probe_record_failure_with_state(state, peer_ip, now);
}
fn auth_probe_record_failure_with_state(
state: &DashMap<IpAddr, AuthProbeState>,
peer_ip: IpAddr,
now: Instant,
) {
if let Some(mut entry) = state.get_mut(&peer_ip) {
if auth_probe_state_expired(&entry, now) {
*entry = AuthProbeState {
fail_streak: 1,
blocked_until: now + auth_probe_backoff(1),
last_seen: now,
};
return;
}
entry.fail_streak = entry.fail_streak.saturating_add(1);
entry.last_seen = now;
entry.blocked_until = now + auth_probe_backoff(entry.fail_streak);
return;
};
if state.len() >= AUTH_PROBE_TRACK_MAX_ENTRIES {
let mut stale_keys = Vec::new();
for entry in state.iter().take(AUTH_PROBE_PRUNE_SCAN_LIMIT) {
if auth_probe_state_expired(entry.value(), now) {
stale_keys.push(*entry.key());
}
}
for stale_key in stale_keys {
state.remove(&stale_key);
}
if state.len() >= AUTH_PROBE_TRACK_MAX_ENTRIES {
return;
}
}
state.insert(peer_ip, AuthProbeState {
fail_streak: 0,
blocked_until: now,
last_seen: now,
});
if let Some(mut entry) = state.get_mut(&peer_ip) {
entry.fail_streak = 1;
entry.blocked_until = now + auth_probe_backoff(1);
}
}
fn auth_probe_record_success(peer_ip: IpAddr) {
let state = auth_probe_state_map();
state.remove(&peer_ip);
}
#[cfg(test)]
fn clear_auth_probe_state_for_testing() {
if let Some(state) = AUTH_PROBE_STATE.get() {
state.clear();
}
}
#[cfg(test)]
fn auth_probe_fail_streak_for_testing(peer_ip: IpAddr) -> Option<u32> {
let state = AUTH_PROBE_STATE.get()?;
state.get(&peer_ip).map(|entry| entry.fail_streak)
}
#[cfg(test)]
fn auth_probe_is_throttled_for_testing(peer_ip: IpAddr) -> bool {
auth_probe_is_throttled(peer_ip, Instant::now())
}
#[cfg(test)]
fn auth_probe_test_lock() -> &'static Mutex<()> {
static TEST_LOCK: OnceLock<Mutex<()>> = OnceLock::new();
TEST_LOCK.get_or_init(|| Mutex::new(()))
}
#[cfg(test)]
fn clear_warned_secrets_for_testing() {
if let Some(warned) = INVALID_SECRET_WARNED.get()
&& let Ok(mut guard) = warned.lock()
{
guard.clear();
}
}
fn warn_invalid_secret_once(name: &str, reason: &str, expected: usize, got: Option<usize>) {
let key = (name.to_string(), reason.to_string());
let warned = INVALID_SECRET_WARNED.get_or_init(|| Mutex::new(HashSet::new()));
let should_warn = match warned.lock() {
Ok(mut guard) => guard.insert(key),
Err(_) => true,
};
if !should_warn {
return;
}
match got {
Some(actual) => {
warn!(
user = %name,
expected = expected,
got = actual,
"Skipping user: access secret has unexpected length"
);
}
None => {
warn!(
user = %name,
"Skipping user: access secret is not valid hex"
);
}
}
}
fn decode_user_secret(name: &str, secret_hex: &str) -> Option<Vec<u8>> {
match hex::decode(secret_hex) {
Ok(bytes) if bytes.len() == ACCESS_SECRET_BYTES => Some(bytes),
Ok(bytes) => {
warn_invalid_secret_once(
name,
"invalid_length",
ACCESS_SECRET_BYTES,
Some(bytes.len()),
);
None
}
Err(_) => {
warn_invalid_secret_once(name, "invalid_hex", ACCESS_SECRET_BYTES, None);
None
}
}
}
// Decide whether a client-supplied proto tag is allowed given the configured
// proxy modes and the transport that carried the handshake.
//
// A common mistake is to treat `modes.tls` and `modes.secure` as interchangeable
// even though they correspond to different transport profiles: `modes.tls` is
// for the TLS-fronted (EE-TLS) path, while `modes.secure` is for direct MTProto
// over TCP (DD). Enforcing this separation prevents an attacker from using a
// TLS-capable client to bypass the operator intent for the direct MTProto mode,
// and vice versa.
fn mode_enabled_for_proto(config: &ProxyConfig, proto_tag: ProtoTag, is_tls: bool) -> bool {
match proto_tag {
ProtoTag::Secure => {
if is_tls {
config.general.modes.tls
} else {
config.general.modes.secure
}
}
ProtoTag::Intermediate | ProtoTag::Abridged => config.general.modes.classic,
}
}
fn decode_user_secrets(
config: &ProxyConfig,
preferred_user: Option<&str>,
@@ -256,7 +27,7 @@ fn decode_user_secrets(
if let Some(preferred) = preferred_user
&& let Some(secret_hex) = config.access.users.get(preferred)
&& let Some(bytes) = decode_user_secret(preferred, secret_hex)
&& let Ok(bytes) = hex::decode(secret_hex)
{
secrets.push((preferred.to_string(), bytes));
}
@@ -265,7 +36,7 @@ fn decode_user_secrets(
if preferred_user.is_some_and(|preferred| preferred == name.as_str()) {
continue;
}
if let Some(bytes) = decode_user_secret(name, secret_hex) {
if let Ok(bytes) = hex::decode(secret_hex) {
secrets.push((name.clone(), bytes));
}
}
@@ -277,7 +48,7 @@ fn decode_user_secrets(
///
/// Key material (`dec_key`, `dec_iv`, `enc_key`, `enc_iv`) is
/// zeroized on drop.
#[derive(Debug)]
#[derive(Debug, Clone)]
pub struct HandshakeSuccess {
/// Authenticated user name
pub user: String,
@@ -323,27 +94,28 @@ where
{
debug!(peer = %peer, handshake_len = handshake.len(), "Processing TLS handshake");
if auth_probe_is_throttled(peer.ip(), Instant::now()) {
debug!(peer = %peer, "TLS handshake rejected by pre-auth probe throttle");
return HandshakeResult::BadClient { reader, writer };
}
if handshake.len() < tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN + 1 {
debug!(peer = %peer, "TLS handshake too short");
return HandshakeResult::BadClient { reader, writer };
}
let digest = &handshake[tls::TLS_DIGEST_POS..tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN];
let digest_half = &digest[..tls::TLS_DIGEST_HALF_LEN];
if replay_checker.check_and_add_tls_digest(digest_half) {
warn!(peer = %peer, "TLS replay attack detected (duplicate digest)");
return HandshakeResult::BadClient { reader, writer };
}
let secrets = decode_user_secrets(config, None);
let validation = match tls::validate_tls_handshake_with_replay_window(
let validation = match tls::validate_tls_handshake(
handshake,
&secrets,
config.access.ignore_time_skew,
config.access.replay_window_secs,
) {
Some(v) => v,
None => {
auth_probe_record_failure(peer.ip(), Instant::now());
debug!(
peer = %peer,
ignore_time_skew = config.access.ignore_time_skew,
@@ -353,15 +125,6 @@ where
}
};
// Replay tracking is applied only after successful authentication to avoid
// letting unauthenticated probes evict valid entries from the replay cache.
let digest_half = &validation.digest[..tls::TLS_DIGEST_HALF_LEN];
if replay_checker.check_and_add_tls_digest(digest_half) {
auth_probe_record_failure(peer.ip(), Instant::now());
warn!(peer = %peer, "TLS replay attack detected (duplicate digest)");
return HandshakeResult::BadClient { reader, writer };
}
let secret = match secrets.iter().find(|(name, _)| *name == validation.user) {
Some((_, s)) => s,
None => return HandshakeResult::BadClient { reader, writer },
@@ -403,9 +166,6 @@ where
Some(b"h2".to_vec())
} else if alpn_list.iter().any(|p| p == b"http/1.1") {
Some(b"http/1.1".to_vec())
} else if !alpn_list.is_empty() {
debug!(peer = %peer, "Client ALPN list has no supported protocol; using masking fallback");
return HandshakeResult::BadClient { reader, writer };
} else {
None
}
@@ -468,8 +228,6 @@ where
"TLS handshake successful"
);
auth_probe_record_success(peer.ip());
HandshakeResult::Success((
FakeTlsReader::new(reader),
FakeTlsWriter::new(writer),
@@ -494,13 +252,13 @@ where
{
trace!(peer = %peer, handshake = ?hex::encode(handshake), "MTProto handshake bytes");
if auth_probe_is_throttled(peer.ip(), Instant::now()) {
debug!(peer = %peer, "MTProto handshake rejected by pre-auth probe throttle");
let dec_prekey_iv = &handshake[SKIP_LEN..SKIP_LEN + PREKEY_LEN + IV_LEN];
if replay_checker.check_and_add_handshake(dec_prekey_iv) {
warn!(peer = %peer, "MTProto replay attack detected");
return HandshakeResult::BadClient { reader, writer };
}
let dec_prekey_iv = &handshake[SKIP_LEN..SKIP_LEN + PREKEY_LEN + IV_LEN];
let enc_prekey_iv: Vec<u8> = dec_prekey_iv.iter().rev().copied().collect();
let decoded_users = decode_user_secrets(config, preferred_user);
@@ -515,33 +273,39 @@ where
dec_key_input.extend_from_slice(&secret);
let dec_key = sha256(&dec_key_input);
let mut dec_iv_arr = [0u8; IV_LEN];
dec_iv_arr.copy_from_slice(dec_iv_bytes);
let dec_iv = u128::from_be_bytes(dec_iv_arr);
let dec_iv = u128::from_be_bytes(dec_iv_bytes.try_into().unwrap());
let mut decryptor = AesCtr::new(&dec_key, dec_iv);
let decrypted = decryptor.decrypt(handshake);
let tag_bytes: [u8; 4] = [
decrypted[PROTO_TAG_POS],
decrypted[PROTO_TAG_POS + 1],
decrypted[PROTO_TAG_POS + 2],
decrypted[PROTO_TAG_POS + 3],
];
let tag_bytes: [u8; 4] = decrypted[PROTO_TAG_POS..PROTO_TAG_POS + 4]
.try_into()
.unwrap();
let proto_tag = match ProtoTag::from_bytes(tag_bytes) {
Some(tag) => tag,
None => continue,
};
let mode_ok = mode_enabled_for_proto(config, proto_tag, is_tls);
let mode_ok = match proto_tag {
ProtoTag::Secure => {
if is_tls {
config.general.modes.tls || config.general.modes.secure
} else {
config.general.modes.secure || config.general.modes.tls
}
}
ProtoTag::Intermediate | ProtoTag::Abridged => config.general.modes.classic,
};
if !mode_ok {
debug!(peer = %peer, user = %user, proto = ?proto_tag, "Mode not enabled");
continue;
}
let dc_idx = i16::from_le_bytes([decrypted[DC_IDX_POS], decrypted[DC_IDX_POS + 1]]);
let dc_idx = i16::from_le_bytes(
decrypted[DC_IDX_POS..DC_IDX_POS + 2].try_into().unwrap()
);
let enc_prekey = &enc_prekey_iv[..PREKEY_LEN];
let enc_iv_bytes = &enc_prekey_iv[PREKEY_LEN..];
@@ -551,24 +315,10 @@ where
enc_key_input.extend_from_slice(&secret);
let enc_key = sha256(&enc_key_input);
let mut enc_iv_arr = [0u8; IV_LEN];
enc_iv_arr.copy_from_slice(enc_iv_bytes);
let enc_iv = u128::from_be_bytes(enc_iv_arr);
let enc_iv = u128::from_be_bytes(enc_iv_bytes.try_into().unwrap());
let encryptor = AesCtr::new(&enc_key, enc_iv);
// Apply replay tracking only after successful authentication.
//
// This ordering prevents an attacker from producing invalid handshakes that
// still collide with a valid handshake's replay slot and thus evict a valid
// entry from the cache. We accept the cost of performing the full
// authentication check first to avoid poisoning the replay cache.
if replay_checker.check_and_add_handshake(dec_prekey_iv) {
auth_probe_record_failure(peer.ip(), Instant::now());
warn!(peer = %peer, user = %user, "MTProto replay attack detected");
return HandshakeResult::BadClient { reader, writer };
}
let success = HandshakeSuccess {
user: user.clone(),
dc_idx,
@@ -590,8 +340,6 @@ where
"MTProto handshake successful"
);
auth_probe_record_success(peer.ip());
let max_pending = config.general.crypto_pending_buffer;
return HandshakeResult::Success((
CryptoReader::new(reader, decryptor),
@@ -600,7 +348,6 @@ where
));
}
auth_probe_record_failure(peer.ip(), Instant::now());
debug!(peer = %peer, "MTProto handshake: no matching user found");
HandshakeResult::BadClient { reader, writer }
}
@@ -609,6 +356,8 @@ where
pub fn generate_tg_nonce(
proto_tag: ProtoTag,
dc_idx: i16,
_client_dec_key: &[u8; 32],
_client_dec_iv: u128,
client_enc_key: &[u8; 32],
client_enc_iv: u128,
rng: &SecureRandom,
@@ -616,16 +365,14 @@ pub fn generate_tg_nonce(
) -> ([u8; HANDSHAKE_LEN], [u8; 32], u128, [u8; 32], u128) {
loop {
let bytes = rng.bytes(HANDSHAKE_LEN);
let Ok(mut nonce): Result<[u8; HANDSHAKE_LEN], _> = bytes.try_into() else {
continue;
};
let mut nonce: [u8; HANDSHAKE_LEN] = bytes.try_into().unwrap();
if RESERVED_NONCE_FIRST_BYTES.contains(&nonce[0]) { continue; }
let first_four: [u8; 4] = [nonce[0], nonce[1], nonce[2], nonce[3]];
let first_four: [u8; 4] = nonce[..4].try_into().unwrap();
if RESERVED_NONCE_BEGINNINGS.contains(&first_four) { continue; }
let continue_four: [u8; 4] = [nonce[4], nonce[5], nonce[6], nonce[7]];
let continue_four: [u8; 4] = nonce[4..8].try_into().unwrap();
if RESERVED_NONCE_CONTINUES.contains(&continue_four) { continue; }
nonce[PROTO_TAG_POS..PROTO_TAG_POS + 4].copy_from_slice(&proto_tag.to_bytes());
@@ -643,17 +390,11 @@ pub fn generate_tg_nonce(
let enc_key_iv = &nonce[SKIP_LEN..SKIP_LEN + KEY_LEN + IV_LEN];
let dec_key_iv: Vec<u8> = enc_key_iv.iter().rev().copied().collect();
let mut tg_enc_key = [0u8; 32];
tg_enc_key.copy_from_slice(&enc_key_iv[..KEY_LEN]);
let mut tg_enc_iv_arr = [0u8; IV_LEN];
tg_enc_iv_arr.copy_from_slice(&enc_key_iv[KEY_LEN..]);
let tg_enc_iv = u128::from_be_bytes(tg_enc_iv_arr);
let tg_enc_key: [u8; 32] = enc_key_iv[..KEY_LEN].try_into().unwrap();
let tg_enc_iv = u128::from_be_bytes(enc_key_iv[KEY_LEN..].try_into().unwrap());
let mut tg_dec_key = [0u8; 32];
tg_dec_key.copy_from_slice(&dec_key_iv[..KEY_LEN]);
let mut tg_dec_iv_arr = [0u8; IV_LEN];
tg_dec_iv_arr.copy_from_slice(&dec_key_iv[KEY_LEN..]);
let tg_dec_iv = u128::from_be_bytes(tg_dec_iv_arr);
let tg_dec_key: [u8; 32] = dec_key_iv[..KEY_LEN].try_into().unwrap();
let tg_dec_iv = u128::from_be_bytes(dec_key_iv[KEY_LEN..].try_into().unwrap());
return (nonce, tg_enc_key, tg_enc_iv, tg_dec_key, tg_dec_iv);
}
@@ -664,17 +405,11 @@ pub fn encrypt_tg_nonce_with_ciphers(nonce: &[u8; HANDSHAKE_LEN]) -> (Vec<u8>, A
let enc_key_iv = &nonce[SKIP_LEN..SKIP_LEN + KEY_LEN + IV_LEN];
let dec_key_iv: Vec<u8> = enc_key_iv.iter().rev().copied().collect();
let mut enc_key = [0u8; 32];
enc_key.copy_from_slice(&enc_key_iv[..KEY_LEN]);
let mut enc_iv_arr = [0u8; IV_LEN];
enc_iv_arr.copy_from_slice(&enc_key_iv[KEY_LEN..]);
let enc_iv = u128::from_be_bytes(enc_iv_arr);
let enc_key: [u8; 32] = enc_key_iv[..KEY_LEN].try_into().unwrap();
let enc_iv = u128::from_be_bytes(enc_key_iv[KEY_LEN..].try_into().unwrap());
let mut dec_key = [0u8; 32];
dec_key.copy_from_slice(&dec_key_iv[..KEY_LEN]);
let mut dec_iv_arr = [0u8; IV_LEN];
dec_iv_arr.copy_from_slice(&dec_key_iv[KEY_LEN..]);
let dec_iv = u128::from_be_bytes(dec_iv_arr);
let dec_key: [u8; 32] = dec_key_iv[..KEY_LEN].try_into().unwrap();
let dec_iv = u128::from_be_bytes(dec_key_iv[KEY_LEN..].try_into().unwrap());
let mut encryptor = AesCtr::new(&enc_key, enc_iv);
let encrypted_full = encryptor.encrypt(nonce); // counter: 0 → 4
@@ -694,15 +429,80 @@ pub fn encrypt_tg_nonce(nonce: &[u8; HANDSHAKE_LEN]) -> Vec<u8> {
}
#[cfg(test)]
#[path = "handshake_security_tests.rs"]
mod security_tests;
mod tests {
use super::*;
/// Compile-time guard: HandshakeSuccess holds cryptographic key material and
/// must never be Copy. A Copy impl would allow silent key duplication,
/// undermining the zeroize-on-drop guarantee.
mod compile_time_security_checks {
use super::HandshakeSuccess;
use static_assertions::assert_not_impl_all;
#[test]
fn test_generate_tg_nonce() {
let client_dec_key = [0x42u8; 32];
let client_dec_iv = 12345u128;
let client_enc_key = [0x24u8; 32];
let client_enc_iv = 54321u128;
assert_not_impl_all!(HandshakeSuccess: Copy, Clone);
let rng = SecureRandom::new();
let (nonce, _tg_enc_key, _tg_enc_iv, _tg_dec_key, _tg_dec_iv) =
generate_tg_nonce(
ProtoTag::Secure,
2,
&client_dec_key,
client_dec_iv,
&client_enc_key,
client_enc_iv,
&rng,
false,
);
assert_eq!(nonce.len(), HANDSHAKE_LEN);
let tag_bytes: [u8; 4] = nonce[PROTO_TAG_POS..PROTO_TAG_POS + 4].try_into().unwrap();
assert_eq!(ProtoTag::from_bytes(tag_bytes), Some(ProtoTag::Secure));
}
#[test]
fn test_encrypt_tg_nonce() {
let client_dec_key = [0x42u8; 32];
let client_dec_iv = 12345u128;
let client_enc_key = [0x24u8; 32];
let client_enc_iv = 54321u128;
let rng = SecureRandom::new();
let (nonce, _, _, _, _) =
generate_tg_nonce(
ProtoTag::Secure,
2,
&client_dec_key,
client_dec_iv,
&client_enc_key,
client_enc_iv,
&rng,
false,
);
let encrypted = encrypt_tg_nonce(&nonce);
assert_eq!(encrypted.len(), HANDSHAKE_LEN);
assert_eq!(&encrypted[..PROTO_TAG_POS], &nonce[..PROTO_TAG_POS]);
assert_ne!(&encrypted[PROTO_TAG_POS..], &nonce[PROTO_TAG_POS..]);
}
#[test]
fn test_handshake_success_zeroize_on_drop() {
let success = HandshakeSuccess {
user: "test".to_string(),
dc_idx: 2,
proto_tag: ProtoTag::Secure,
dec_key: [0xAA; 32],
dec_iv: 0xBBBBBBBB,
enc_key: [0xCC; 32],
enc_iv: 0xDDDDDDDD,
peer: "127.0.0.1:1234".parse().unwrap(),
is_tls: true,
};
assert_eq!(success.dec_key, [0xAA; 32]);
assert_eq!(success.enc_key, [0xCC; 32]);
drop(success);
// Drop impl zeroizes key material without panic
}
}
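
The `auth_probe_backoff` function in this file's diff computes a capped exponential delay from a failure streak. The sketch below reproduces that arithmetic in isolation, using the non-test constants shown in the diff (start at 4 failures, 25 ms base, 1_000 ms cap). The shortened constant and function names are hypothetical.

```rust
use std::time::Duration;

const BACKOFF_START_FAILS: u32 = 4;
const BACKOFF_BASE_MS: u64 = 25;
const BACKOFF_MAX_MS: u64 = 1_000;

/// Capped exponential backoff: zero below the start threshold, then
/// base * 2^(streak - start), clamped to the maximum.
fn probe_backoff(fail_streak: u32) -> Duration {
    if fail_streak < BACKOFF_START_FAILS {
        return Duration::ZERO;
    }
    // Clamp the exponent so the shift can never overflow a u64.
    let shift = (fail_streak - BACKOFF_START_FAILS).min(10);
    let multiplier = 1u64.checked_shl(shift).unwrap_or(u64::MAX);
    let ms = BACKOFF_BASE_MS
        .saturating_mul(multiplier)
        .min(BACKOFF_MAX_MS);
    Duration::from_millis(ms)
}
```

With these constants the delay is 25 ms at the fourth failure, doubles with each further failure, and reaches the 1_000 ms cap at the tenth.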


@@ -1,891 +0,0 @@
use super::*;
use crate::crypto::sha256_hmac;
use dashmap::DashMap;
use std::net::{IpAddr, Ipv4Addr};
use std::sync::Arc;
use std::time::{Duration, Instant};
fn make_valid_tls_handshake(secret: &[u8], timestamp: u32) -> Vec<u8> {
let session_id_len: usize = 32;
let len = tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN + 1 + session_id_len;
let mut handshake = vec![0x42u8; len];
handshake[tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN] = session_id_len as u8;
handshake[tls::TLS_DIGEST_POS..tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN].fill(0);
let computed = sha256_hmac(secret, &handshake);
let mut digest = computed;
let ts = timestamp.to_le_bytes();
for i in 0..4 {
digest[28 + i] ^= ts[i];
}
handshake[tls::TLS_DIGEST_POS..tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN]
.copy_from_slice(&digest);
handshake
}
fn make_valid_tls_client_hello_with_alpn(
secret: &[u8],
timestamp: u32,
alpn_protocols: &[&[u8]],
) -> Vec<u8> {
let mut body = Vec::new();
body.extend_from_slice(&TLS_VERSION);
body.extend_from_slice(&[0u8; 32]);
body.push(32);
body.extend_from_slice(&[0x42u8; 32]);
body.extend_from_slice(&2u16.to_be_bytes());
body.extend_from_slice(&[0x13, 0x01]);
body.push(1);
body.push(0);
let mut ext_blob = Vec::new();
if !alpn_protocols.is_empty() {
let mut alpn_list = Vec::new();
for proto in alpn_protocols {
alpn_list.push(proto.len() as u8);
alpn_list.extend_from_slice(proto);
}
let mut alpn_data = Vec::new();
alpn_data.extend_from_slice(&(alpn_list.len() as u16).to_be_bytes());
alpn_data.extend_from_slice(&alpn_list);
ext_blob.extend_from_slice(&0x0010u16.to_be_bytes());
ext_blob.extend_from_slice(&(alpn_data.len() as u16).to_be_bytes());
ext_blob.extend_from_slice(&alpn_data);
}
body.extend_from_slice(&(ext_blob.len() as u16).to_be_bytes());
body.extend_from_slice(&ext_blob);
let mut handshake = Vec::new();
handshake.push(0x01);
let body_len = (body.len() as u32).to_be_bytes();
handshake.extend_from_slice(&body_len[1..4]);
handshake.extend_from_slice(&body);
let mut record = Vec::new();
record.push(TLS_RECORD_HANDSHAKE);
record.extend_from_slice(&[0x03, 0x01]);
record.extend_from_slice(&(handshake.len() as u16).to_be_bytes());
record.extend_from_slice(&handshake);
record[tls::TLS_DIGEST_POS..tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN].fill(0);
let computed = sha256_hmac(secret, &record);
let mut digest = computed;
let ts = timestamp.to_le_bytes();
for i in 0..4 {
digest[28 + i] ^= ts[i];
}
record[tls::TLS_DIGEST_POS..tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN]
.copy_from_slice(&digest);
record
}
fn test_config_with_secret_hex(secret_hex: &str) -> ProxyConfig {
clear_auth_probe_state_for_testing();
let mut cfg = ProxyConfig::default();
cfg.access.users.clear();
cfg.access
.users
.insert("user".to_string(), secret_hex.to_string());
cfg.access.ignore_time_skew = true;
cfg
}
#[test]
fn test_generate_tg_nonce() {
let client_enc_key = [0x24u8; 32];
let client_enc_iv = 54321u128;
let rng = SecureRandom::new();
let (nonce, _tg_enc_key, _tg_enc_iv, _tg_dec_key, _tg_dec_iv) = generate_tg_nonce(
ProtoTag::Secure,
2,
&client_enc_key,
client_enc_iv,
&rng,
false,
);
assert_eq!(nonce.len(), HANDSHAKE_LEN);
let tag_bytes: [u8; 4] = nonce[PROTO_TAG_POS..PROTO_TAG_POS + 4].try_into().unwrap();
assert_eq!(ProtoTag::from_bytes(tag_bytes), Some(ProtoTag::Secure));
}
#[test]
fn test_encrypt_tg_nonce() {
let client_enc_key = [0x24u8; 32];
let client_enc_iv = 54321u128;
let rng = SecureRandom::new();
let (nonce, _, _, _, _) = generate_tg_nonce(
ProtoTag::Secure,
2,
&client_enc_key,
client_enc_iv,
&rng,
false,
);
let encrypted = encrypt_tg_nonce(&nonce);
assert_eq!(encrypted.len(), HANDSHAKE_LEN);
assert_eq!(&encrypted[..PROTO_TAG_POS], &nonce[..PROTO_TAG_POS]);
assert_ne!(&encrypted[PROTO_TAG_POS..], &nonce[PROTO_TAG_POS..]);
}
#[test]
fn test_handshake_success_drop_does_not_panic() {
let success = HandshakeSuccess {
user: "test".to_string(),
dc_idx: 2,
proto_tag: ProtoTag::Secure,
dec_key: [0xAA; 32],
dec_iv: 0xBBBBBBBB,
enc_key: [0xCC; 32],
enc_iv: 0xDDDDDDDD,
peer: "198.51.100.10:1234".parse().unwrap(),
is_tls: true,
};
assert_eq!(success.dec_key, [0xAA; 32]);
assert_eq!(success.enc_key, [0xCC; 32]);
drop(success);
}
#[test]
fn test_generate_tg_nonce_enc_dec_material_is_consistent() {
let client_enc_key = [0x34u8; 32];
let client_enc_iv = 0xffeeddccbbaa00998877665544332211u128;
let rng = SecureRandom::new();
let (nonce, tg_enc_key, tg_enc_iv, tg_dec_key, tg_dec_iv) = generate_tg_nonce(
ProtoTag::Secure,
7,
&client_enc_key,
client_enc_iv,
&rng,
false,
);
let enc_key_iv = &nonce[SKIP_LEN..SKIP_LEN + KEY_LEN + IV_LEN];
let dec_key_iv: Vec<u8> = enc_key_iv.iter().rev().copied().collect();
let mut expected_tg_enc_key = [0u8; 32];
expected_tg_enc_key.copy_from_slice(&enc_key_iv[..KEY_LEN]);
let mut expected_tg_enc_iv_arr = [0u8; IV_LEN];
expected_tg_enc_iv_arr.copy_from_slice(&enc_key_iv[KEY_LEN..]);
let expected_tg_enc_iv = u128::from_be_bytes(expected_tg_enc_iv_arr);
let mut expected_tg_dec_key = [0u8; 32];
expected_tg_dec_key.copy_from_slice(&dec_key_iv[..KEY_LEN]);
let mut expected_tg_dec_iv_arr = [0u8; IV_LEN];
expected_tg_dec_iv_arr.copy_from_slice(&dec_key_iv[KEY_LEN..]);
let expected_tg_dec_iv = u128::from_be_bytes(expected_tg_dec_iv_arr);
assert_eq!(tg_enc_key, expected_tg_enc_key);
assert_eq!(tg_enc_iv, expected_tg_enc_iv);
assert_eq!(tg_dec_key, expected_tg_dec_key);
assert_eq!(tg_dec_iv, expected_tg_dec_iv);
assert_eq!(
i16::from_le_bytes([nonce[DC_IDX_POS], nonce[DC_IDX_POS + 1]]),
7,
"Generated nonce must keep target dc index in protocol slot"
);
}
#[test]
fn test_generate_tg_nonce_fast_mode_embeds_reversed_client_enc_material() {
let client_enc_key = [0xABu8; 32];
let client_enc_iv = 0x11223344556677889900aabbccddeeffu128;
let rng = SecureRandom::new();
let (nonce, _, _, _, _) = generate_tg_nonce(
ProtoTag::Secure,
9,
&client_enc_key,
client_enc_iv,
&rng,
true,
);
let mut expected = Vec::with_capacity(KEY_LEN + IV_LEN);
expected.extend_from_slice(&client_enc_key);
expected.extend_from_slice(&client_enc_iv.to_be_bytes());
expected.reverse();
assert_eq!(&nonce[SKIP_LEN..SKIP_LEN + KEY_LEN + IV_LEN], expected.as_slice());
}
#[test]
fn test_encrypt_tg_nonce_with_ciphers_matches_manual_suffix_encryption() {
let client_enc_key = [0x24u8; 32];
let client_enc_iv = 54321u128;
let rng = SecureRandom::new();
let (nonce, _, _, _, _) = generate_tg_nonce(
ProtoTag::Secure,
2,
&client_enc_key,
client_enc_iv,
&rng,
false,
);
let (encrypted, _, _) = encrypt_tg_nonce_with_ciphers(&nonce);
let enc_key_iv = &nonce[SKIP_LEN..SKIP_LEN + KEY_LEN + IV_LEN];
let mut expected_enc_key = [0u8; 32];
expected_enc_key.copy_from_slice(&enc_key_iv[..KEY_LEN]);
let mut expected_enc_iv_arr = [0u8; IV_LEN];
expected_enc_iv_arr.copy_from_slice(&enc_key_iv[KEY_LEN..]);
let expected_enc_iv = u128::from_be_bytes(expected_enc_iv_arr);
let mut manual_encryptor = AesCtr::new(&expected_enc_key, expected_enc_iv);
let manual = manual_encryptor.encrypt(&nonce);
assert_eq!(encrypted.len(), HANDSHAKE_LEN);
assert_eq!(&encrypted[..PROTO_TAG_POS], &nonce[..PROTO_TAG_POS]);
assert_eq!(
&encrypted[PROTO_TAG_POS..],
&manual[PROTO_TAG_POS..],
"Encrypted nonce suffix must match AES-CTR output with derived enc key/iv"
);
}
#[tokio::test]
async fn tls_replay_second_identical_handshake_is_rejected() {
let secret = [0x11u8; 16];
let config = test_config_with_secret_hex("11111111111111111111111111111111");
let replay_checker = ReplayChecker::new(128, Duration::from_secs(60));
let rng = SecureRandom::new();
let peer: SocketAddr = "198.51.100.21:44321".parse().unwrap();
let handshake = make_valid_tls_handshake(&secret, 0);
let first = handle_tls_handshake(
&handshake,
tokio::io::empty(),
tokio::io::sink(),
peer,
&config,
&replay_checker,
&rng,
None,
)
.await;
assert!(matches!(first, HandshakeResult::Success(_)));
let second = handle_tls_handshake(
&handshake,
tokio::io::empty(),
tokio::io::sink(),
peer,
&config,
&replay_checker,
&rng,
None,
)
.await;
assert!(matches!(second, HandshakeResult::BadClient { .. }));
}
#[tokio::test]
async fn tls_replay_concurrent_identical_handshake_allows_exactly_one_success() {
let secret = [0x77u8; 16];
let config = Arc::new(test_config_with_secret_hex("77777777777777777777777777777777"));
let replay_checker = Arc::new(ReplayChecker::new(4096, Duration::from_secs(60)));
let rng = Arc::new(SecureRandom::new());
let handshake = Arc::new(make_valid_tls_handshake(&secret, 0));
let mut tasks = Vec::new();
for _ in 0..50 {
let config = config.clone();
let replay_checker = replay_checker.clone();
let rng = rng.clone();
let handshake = handshake.clone();
tasks.push(tokio::spawn(async move {
handle_tls_handshake(
&handshake,
tokio::io::empty(),
tokio::io::sink(),
"198.51.100.22:45000".parse().unwrap(),
&config,
&replay_checker,
&rng,
None,
)
.await
}));
}
let mut success_count = 0usize;
for task in tasks {
let result = task.await.unwrap();
if matches!(result, HandshakeResult::Success(_)) {
success_count += 1;
} else {
assert!(matches!(result, HandshakeResult::BadClient { .. }));
}
}
assert_eq!(
success_count, 1,
"Concurrent replay attempts must allow exactly one successful handshake"
);
}
#[tokio::test]
async fn invalid_tls_probe_does_not_pollute_replay_cache() {
let config = test_config_with_secret_hex("11111111111111111111111111111111");
let replay_checker = ReplayChecker::new(128, Duration::from_secs(60));
let rng = SecureRandom::new();
let peer: SocketAddr = "198.51.100.23:44322".parse().unwrap();
let mut invalid = vec![0x42u8; tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN + 1 + 32];
invalid[tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN] = 32;
let before = replay_checker.stats();
let result = handle_tls_handshake(
&invalid,
tokio::io::empty(),
tokio::io::sink(),
peer,
&config,
&replay_checker,
&rng,
None,
)
.await;
let after = replay_checker.stats();
assert!(matches!(result, HandshakeResult::BadClient { .. }));
assert_eq!(before.total_additions, after.total_additions);
assert_eq!(before.total_hits, after.total_hits);
}
#[tokio::test]
async fn empty_decoded_secret_is_rejected() {
clear_warned_secrets_for_testing();
let config = test_config_with_secret_hex("");
let replay_checker = ReplayChecker::new(128, Duration::from_secs(60));
let rng = SecureRandom::new();
let peer: SocketAddr = "198.51.100.24:44323".parse().unwrap();
let handshake = make_valid_tls_handshake(&[], 0);
let result = handle_tls_handshake(
&handshake,
tokio::io::empty(),
tokio::io::sink(),
peer,
&config,
&replay_checker,
&rng,
None,
)
.await;
assert!(matches!(result, HandshakeResult::BadClient { .. }));
}
#[tokio::test]
async fn wrong_length_decoded_secret_is_rejected() {
clear_warned_secrets_for_testing();
let config = test_config_with_secret_hex("aa");
let replay_checker = ReplayChecker::new(128, Duration::from_secs(60));
let rng = SecureRandom::new();
let peer: SocketAddr = "198.51.100.25:44324".parse().unwrap();
let handshake = make_valid_tls_handshake(&[0xaau8], 0);
let result = handle_tls_handshake(
&handshake,
tokio::io::empty(),
tokio::io::sink(),
peer,
&config,
&replay_checker,
&rng,
None,
)
.await;
assert!(matches!(result, HandshakeResult::BadClient { .. }));
}
#[tokio::test]
async fn invalid_mtproto_probe_does_not_pollute_replay_cache() {
let config = test_config_with_secret_hex("11111111111111111111111111111111");
let replay_checker = ReplayChecker::new(128, Duration::from_secs(60));
let peer: SocketAddr = "198.51.100.26:44325".parse().unwrap();
let handshake = [0u8; HANDSHAKE_LEN];
let before = replay_checker.stats();
let result = handle_mtproto_handshake(
&handshake,
tokio::io::empty(),
tokio::io::sink(),
peer,
&config,
&replay_checker,
false,
None,
)
.await;
let after = replay_checker.stats();
assert!(matches!(result, HandshakeResult::BadClient { .. }));
assert_eq!(before.total_additions, after.total_additions);
assert_eq!(before.total_hits, after.total_hits);
}
#[tokio::test]
async fn mixed_secret_lengths_keep_valid_user_authenticating() {
clear_warned_secrets_for_testing();
clear_auth_probe_state_for_testing();
let good_secret = [0x22u8; 16];
let mut config = ProxyConfig::default();
config.access.users.clear();
config
.access
.users
.insert("broken_user".to_string(), "aa".to_string());
config
.access
.users
.insert("valid_user".to_string(), "22222222222222222222222222222222".to_string());
config.access.ignore_time_skew = true;
let replay_checker = ReplayChecker::new(128, Duration::from_secs(60));
let rng = SecureRandom::new();
let peer: SocketAddr = "198.51.100.27:44326".parse().unwrap();
let handshake = make_valid_tls_handshake(&good_secret, 0);
let result = handle_tls_handshake(
&handshake,
tokio::io::empty(),
tokio::io::sink(),
peer,
&config,
&replay_checker,
&rng,
None,
)
.await;
assert!(matches!(result, HandshakeResult::Success(_)));
}
#[tokio::test]
async fn alpn_enforce_rejects_unsupported_client_alpn() {
let secret = [0x33u8; 16];
let mut config = test_config_with_secret_hex("33333333333333333333333333333333");
config.censorship.alpn_enforce = true;
let replay_checker = ReplayChecker::new(128, Duration::from_secs(60));
let rng = SecureRandom::new();
let peer: SocketAddr = "198.51.100.28:44327".parse().unwrap();
let handshake = make_valid_tls_client_hello_with_alpn(&secret, 0, &[b"h3"]);
let result = handle_tls_handshake(
&handshake,
tokio::io::empty(),
tokio::io::sink(),
peer,
&config,
&replay_checker,
&rng,
None,
)
.await;
assert!(matches!(result, HandshakeResult::BadClient { .. }));
}
#[tokio::test]
async fn alpn_enforce_accepts_h2() {
let secret = [0x44u8; 16];
let mut config = test_config_with_secret_hex("44444444444444444444444444444444");
config.censorship.alpn_enforce = true;
let replay_checker = ReplayChecker::new(128, Duration::from_secs(60));
let rng = SecureRandom::new();
let peer: SocketAddr = "198.51.100.29:44328".parse().unwrap();
let handshake = make_valid_tls_client_hello_with_alpn(&secret, 0, &[b"h2", b"h3"]);
let result = handle_tls_handshake(
&handshake,
tokio::io::empty(),
tokio::io::sink(),
peer,
&config,
&replay_checker,
&rng,
None,
)
.await;
assert!(matches!(result, HandshakeResult::Success(_)));
}
#[tokio::test]
async fn malformed_tls_classes_complete_within_bounded_time() {
let secret = [0x55u8; 16];
let mut config = test_config_with_secret_hex("55555555555555555555555555555555");
config.censorship.alpn_enforce = true;
let replay_checker = ReplayChecker::new(512, Duration::from_secs(60));
let rng = SecureRandom::new();
let peer: SocketAddr = "198.51.100.30:44329".parse().unwrap();
let too_short = vec![0x16, 0x03, 0x01];
let mut bad_hmac = make_valid_tls_handshake(&secret, 0);
bad_hmac[tls::TLS_DIGEST_POS] ^= 0x01;
let alpn_mismatch = make_valid_tls_client_hello_with_alpn(&secret, 0, &[b"h3"]);
for probe in [too_short, bad_hmac, alpn_mismatch] {
let result = tokio::time::timeout(
Duration::from_millis(200),
handle_tls_handshake(
&probe,
tokio::io::empty(),
tokio::io::sink(),
peer,
&config,
&replay_checker,
&rng,
None,
),
)
.await
.expect("Malformed TLS classes must be rejected within bounded time");
assert!(matches!(result, HandshakeResult::BadClient { .. }));
}
}
#[tokio::test]
#[ignore = "timing-sensitive; run manually on low-jitter hosts"]
async fn malformed_tls_classes_share_close_latency_buckets() {
const ITER: usize = 24;
const BUCKET_MS: u128 = 10;
let secret = [0x99u8; 16];
let mut config = test_config_with_secret_hex("99999999999999999999999999999999");
config.censorship.alpn_enforce = true;
let replay_checker = ReplayChecker::new(4096, Duration::from_secs(60));
let rng = SecureRandom::new();
let peer: SocketAddr = "198.51.100.31:44330".parse().unwrap();
let too_short = vec![0x16, 0x03, 0x01];
let mut bad_hmac = make_valid_tls_handshake(&secret, 0);
bad_hmac[tls::TLS_DIGEST_POS + 1] ^= 0x01;
let alpn_mismatch = make_valid_tls_client_hello_with_alpn(&secret, 0, &[b"h3"]);
let mut class_means_ms = Vec::new();
for probe in [too_short, bad_hmac, alpn_mismatch] {
let mut sum_micros: u128 = 0;
for _ in 0..ITER {
let started = Instant::now();
let result = handle_tls_handshake(
&probe,
tokio::io::empty(),
tokio::io::sink(),
peer,
&config,
&replay_checker,
&rng,
None,
)
.await;
let elapsed = started.elapsed();
assert!(matches!(result, HandshakeResult::BadClient { .. }));
sum_micros += elapsed.as_micros();
}
class_means_ms.push(sum_micros / ITER as u128 / 1_000);
}
let min_bucket = class_means_ms
.iter()
.map(|ms| ms / BUCKET_MS)
.min()
.unwrap();
let max_bucket = class_means_ms
.iter()
.map(|ms| ms / BUCKET_MS)
.max()
.unwrap();
assert!(
max_bucket <= min_bucket + 1,
"Malformed TLS classes diverged across latency buckets: means_ms={:?}",
class_means_ms
);
}
#[test]
fn secure_tag_requires_tls_mode_on_tls_transport() {
let mut config = ProxyConfig::default();
config.general.modes.classic = false;
config.general.modes.secure = true;
config.general.modes.tls = false;
assert!(
!mode_enabled_for_proto(&config, ProtoTag::Secure, true),
"Secure tag over TLS must be rejected when tls mode is disabled"
);
config.general.modes.tls = true;
assert!(
mode_enabled_for_proto(&config, ProtoTag::Secure, true),
"Secure tag over TLS must be accepted when tls mode is enabled"
);
}
#[test]
fn secure_tag_requires_secure_mode_on_direct_transport() {
let mut config = ProxyConfig::default();
config.general.modes.classic = false;
config.general.modes.secure = false;
config.general.modes.tls = true;
assert!(
!mode_enabled_for_proto(&config, ProtoTag::Secure, false),
"Secure tag without TLS must be rejected when secure mode is disabled"
);
config.general.modes.secure = true;
assert!(
mode_enabled_for_proto(&config, ProtoTag::Secure, false),
"Secure tag without TLS must be accepted when secure mode is enabled"
);
}
#[test]
fn mode_policy_matrix_is_stable_for_all_tag_transport_mode_combinations() {
let tags = [ProtoTag::Secure, ProtoTag::Intermediate, ProtoTag::Abridged];
for classic in [false, true] {
for secure in [false, true] {
for tls in [false, true] {
let mut config = ProxyConfig::default();
config.general.modes.classic = classic;
config.general.modes.secure = secure;
config.general.modes.tls = tls;
for is_tls in [false, true] {
for tag in tags {
let expected = match (tag, is_tls) {
(ProtoTag::Secure, true) => tls,
(ProtoTag::Secure, false) => secure,
(ProtoTag::Intermediate | ProtoTag::Abridged, _) => classic,
};
assert_eq!(
mode_enabled_for_proto(&config, tag, is_tls),
expected,
"mode policy drifted for tag={:?}, transport_tls={}, modes=(classic={}, secure={}, tls={})",
tag,
is_tls,
classic,
secure,
tls
);
}
}
}
}
}
}
#[test]
fn invalid_secret_warning_keys_do_not_collide_on_colon_boundaries() {
clear_warned_secrets_for_testing();
warn_invalid_secret_once("a:b", "c", ACCESS_SECRET_BYTES, Some(1));
warn_invalid_secret_once("a", "b:c", ACCESS_SECRET_BYTES, Some(2));
let warned = INVALID_SECRET_WARNED
.get()
.expect("warned set must be initialized");
let guard = warned.lock().expect("warned set lock must be available");
assert_eq!(
guard.len(),
2,
"(name, reason) pairs that stringify to the same colon-joined key must remain distinct"
);
}
#[tokio::test]
async fn repeated_invalid_tls_probes_trigger_pre_auth_throttle() {
let _guard = auth_probe_test_lock()
.lock()
.unwrap_or_else(|poisoned| poisoned.into_inner());
clear_auth_probe_state_for_testing();
let config = test_config_with_secret_hex("11111111111111111111111111111111");
let replay_checker = ReplayChecker::new(128, Duration::from_secs(60));
let rng = SecureRandom::new();
let peer: SocketAddr = "198.51.100.61:44361".parse().unwrap();
let mut invalid = vec![0x42u8; tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN + 1 + 32];
invalid[tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN] = 32;
for _ in 0..AUTH_PROBE_BACKOFF_START_FAILS {
let result = handle_tls_handshake(
&invalid,
tokio::io::empty(),
tokio::io::sink(),
peer,
&config,
&replay_checker,
&rng,
None,
)
.await;
assert!(matches!(result, HandshakeResult::BadClient { .. }));
}
assert!(
auth_probe_is_throttled_for_testing(peer.ip()),
"invalid probe burst must activate per-IP pre-auth throttle"
);
}
#[tokio::test]
async fn successful_tls_handshake_clears_pre_auth_failure_streak() {
let _guard = auth_probe_test_lock()
.lock()
.unwrap_or_else(|poisoned| poisoned.into_inner());
clear_auth_probe_state_for_testing();
let secret = [0x23u8; 16];
let config = test_config_with_secret_hex("23232323232323232323232323232323");
let replay_checker = ReplayChecker::new(256, Duration::from_secs(60));
let rng = SecureRandom::new();
let peer: SocketAddr = "198.51.100.62:44362".parse().unwrap();
let mut invalid = vec![0x42u8; tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN + 1 + 32];
invalid[tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN] = 32;
for expected in 1..AUTH_PROBE_BACKOFF_START_FAILS {
let result = handle_tls_handshake(
&invalid,
tokio::io::empty(),
tokio::io::sink(),
peer,
&config,
&replay_checker,
&rng,
None,
)
.await;
assert!(matches!(result, HandshakeResult::BadClient { .. }));
assert_eq!(
auth_probe_fail_streak_for_testing(peer.ip()),
Some(expected),
"failure streak must grow before a successful authentication"
);
}
let valid = make_valid_tls_handshake(&secret, 0);
let success = handle_tls_handshake(
&valid,
tokio::io::empty(),
tokio::io::sink(),
peer,
&config,
&replay_checker,
&rng,
None,
)
.await;
assert!(matches!(success, HandshakeResult::Success(_)));
assert_eq!(
auth_probe_fail_streak_for_testing(peer.ip()),
None,
"successful authentication must clear accumulated pre-auth failures"
);
}
#[test]
fn auth_probe_capacity_prunes_stale_entries_for_new_ips() {
let state = DashMap::new();
let now = Instant::now();
let stale_seen = now - Duration::from_secs(AUTH_PROBE_TRACK_RETENTION_SECS + 1);
for idx in 0..AUTH_PROBE_TRACK_MAX_ENTRIES {
let ip = IpAddr::V4(Ipv4Addr::new(
10,
1,
((idx >> 8) & 0xff) as u8,
(idx & 0xff) as u8,
));
state.insert(
ip,
AuthProbeState {
fail_streak: 1,
blocked_until: now,
last_seen: stale_seen,
},
);
}
let newcomer = IpAddr::V4(Ipv4Addr::new(198, 51, 100, 200));
auth_probe_record_failure_with_state(&state, newcomer, now);
assert_eq!(
state.get(&newcomer).map(|entry| entry.fail_streak),
Some(1),
"stale-entry pruning must admit and track a new probe source"
);
assert!(
state.len() <= AUTH_PROBE_TRACK_MAX_ENTRIES,
"auth probe map must remain bounded after stale pruning"
);
}
#[test]
fn auth_probe_capacity_stays_fail_closed_when_map_is_fresh_and_full() {
let state = DashMap::new();
let now = Instant::now();
for idx in 0..AUTH_PROBE_TRACK_MAX_ENTRIES {
let ip = IpAddr::V4(Ipv4Addr::new(
172,
16,
((idx >> 8) & 0xff) as u8,
(idx & 0xff) as u8,
));
state.insert(
ip,
AuthProbeState {
fail_streak: 1,
blocked_until: now,
last_seen: now,
},
);
}
let newcomer = IpAddr::V4(Ipv4Addr::new(203, 0, 113, 55));
auth_probe_record_failure_with_state(&state, newcomer, now);
assert!(
state.get(&newcomer).is_none(),
"when all entries are fresh and full, new probes must not be admitted"
);
assert_eq!(
state.len(),
AUTH_PROBE_TRACK_MAX_ENTRIES,
"auth probe map must stay at the configured cap"
);
}
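
The last two tests above pin down a capacity policy for the per-IP probe map: when the map is full, stale entries are pruned to admit a new source, and when every entry is still fresh the newcomer is not tracked at all. The following is a minimal std-only sketch of that policy, not the project's `auth_probe_record_failure_with_state`. It assumes a plain `HashMap` in place of `DashMap`, and the names `ProbeEntry`, `record_failure`, `MAX_ENTRIES` and `RETENTION` (with their values) are hypothetical.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

// Hypothetical limits chosen for illustration only.
const MAX_ENTRIES: usize = 4;
const RETENTION: Duration = Duration::from_secs(60);

struct ProbeEntry {
    fail_streak: u32,
    last_seen: Instant,
}

/// Records one failed probe for `ip`.
/// Returns `false` when the map is full of fresh entries and `ip` was not admitted.
fn record_failure(state: &mut HashMap<IpAddr, ProbeEntry>, ip: IpAddr, now: Instant) -> bool {
    // Known source: bump the streak and refresh the timestamp.
    if let Some(entry) = state.get_mut(&ip) {
        entry.fail_streak = entry.fail_streak.saturating_add(1);
        entry.last_seen = now;
        return true;
    }
    // New source and the map is full: drop entries older than the retention window.
    if state.len() >= MAX_ENTRIES {
        state.retain(|_, e| now.saturating_duration_since(e.last_seen) <= RETENTION);
    }
    // Still full after pruning: every entry is fresh, so the newcomer is not tracked.
    if state.len() >= MAX_ENTRIES {
        return false;
    }
    state.insert(ip, ProbeEntry { fail_streak: 1, last_seen: now });
    true
}
```

Refusing the newcomer keeps memory bounded under a flood of distinct source addresses; the cost is that a new offender goes untracked until older entries age out.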

@@ -14,41 +14,12 @@ use crate::network::dns_overrides::resolve_socket_addr;
use crate::stats::beobachten::BeobachtenStore;
use crate::transport::proxy_protocol::{ProxyProtocolV1Builder, ProxyProtocolV2Builder};
#[cfg(not(test))]
const MASK_TIMEOUT: Duration = Duration::from_secs(5);
#[cfg(test)]
const MASK_TIMEOUT: Duration = Duration::from_millis(50);
/// Maximum duration for the entire masking relay.
/// Limits resource consumption from slow-loris attacks and port scanners.
#[cfg(not(test))]
const MASK_RELAY_TIMEOUT: Duration = Duration::from_secs(60);
#[cfg(test)]
const MASK_RELAY_TIMEOUT: Duration = Duration::from_millis(200);
const MASK_BUFFER_SIZE: usize = 8192;
async fn write_proxy_header_with_timeout<W>(mask_write: &mut W, header: &[u8]) -> bool
where
W: AsyncWrite + Unpin,
{
match timeout(MASK_TIMEOUT, mask_write.write_all(header)).await {
Ok(Ok(())) => true,
Ok(Err(_)) => false,
Err(_) => {
debug!("Timeout writing proxy protocol header to mask backend");
false
}
}
}
async fn consume_client_data_with_timeout<R>(reader: R)
where
R: AsyncRead + Unpin,
{
if timeout(MASK_RELAY_TIMEOUT, consume_client_data(reader)).await.is_err() {
debug!("Timed out while consuming client data on masking fallback path");
}
}
/// Detect client type based on initial data
fn detect_client_type(data: &[u8]) -> &'static str {
// Check for HTTP request
@@ -100,7 +71,7 @@ where
if !config.censorship.mask {
// Masking disabled, just consume data
consume_client_data_with_timeout(reader).await;
consume_client_data(reader).await;
return;
}
@@ -136,7 +107,7 @@ where
}
};
if let Some(header) = proxy_header {
if !write_proxy_header_with_timeout(&mut mask_write, &header).await {
if mask_write.write_all(&header).await.is_err() {
return;
}
}
@@ -146,11 +117,11 @@ where
}
Ok(Err(e)) => {
debug!(error = %e, "Failed to connect to mask unix socket");
consume_client_data_with_timeout(reader).await;
consume_client_data(reader).await;
}
Err(_) => {
debug!("Timeout connecting to mask unix socket");
consume_client_data_with_timeout(reader).await;
consume_client_data(reader).await;
}
}
return;
@@ -195,7 +166,7 @@ where
let (mask_read, mut mask_write) = stream.into_split();
if let Some(header) = proxy_header {
if !write_proxy_header_with_timeout(&mut mask_write, &header).await {
if mask_write.write_all(&header).await.is_err() {
return;
}
}
@@ -205,11 +176,11 @@ where
}
Ok(Err(e)) => {
debug!(error = %e, "Failed to connect to mask host");
consume_client_data_with_timeout(reader).await;
consume_client_data(reader).await;
}
Err(_) => {
debug!("Timeout connecting to mask host");
consume_client_data_with_timeout(reader).await;
consume_client_data(reader).await;
}
}
}
@@ -223,51 +194,55 @@ async fn relay_to_mask<R, W, MR, MW>(
initial_data: &[u8],
)
where
R: AsyncRead + Unpin + Send,
W: AsyncWrite + Unpin + Send,
MR: AsyncRead + Unpin + Send,
MW: AsyncWrite + Unpin + Send,
R: AsyncRead + Unpin + Send + 'static,
W: AsyncWrite + Unpin + Send + 'static,
MR: AsyncRead + Unpin + Send + 'static,
MW: AsyncWrite + Unpin + Send + 'static,
{
// Send initial data to mask host
if mask_write.write_all(initial_data).await.is_err() {
return;
}
if mask_write.flush().await.is_err() {
return;
}
let mut client_buf = vec![0u8; MASK_BUFFER_SIZE];
let mut mask_buf = vec![0u8; MASK_BUFFER_SIZE];
loop {
tokio::select! {
client_read = reader.read(&mut client_buf) => {
match client_read {
Ok(0) | Err(_) => {
let _ = mask_write.shutdown().await;
break;
}
Ok(n) => {
if mask_write.write_all(&client_buf[..n]).await.is_err() {
break;
}
}
// Relay traffic
let c2m = tokio::spawn(async move {
let mut buf = vec![0u8; MASK_BUFFER_SIZE];
loop {
match reader.read(&mut buf).await {
Ok(0) | Err(_) => {
let _ = mask_write.shutdown().await;
break;
}
}
mask_read_res = mask_read.read(&mut mask_buf) => {
match mask_read_res {
Ok(0) | Err(_) => {
let _ = writer.shutdown().await;
Ok(n) => {
if mask_write.write_all(&buf[..n]).await.is_err() {
break;
}
Ok(n) => {
if writer.write_all(&mask_buf[..n]).await.is_err() {
break;
}
}
}
}
}
});
let m2c = tokio::spawn(async move {
let mut buf = vec![0u8; MASK_BUFFER_SIZE];
loop {
match mask_read.read(&mut buf).await {
Ok(0) | Err(_) => {
let _ = writer.shutdown().await;
break;
}
Ok(n) => {
if writer.write_all(&buf[..n]).await.is_err() {
break;
}
}
}
}
});
// Wait for either to complete
tokio::select! {
_ = c2m => {}
_ = m2c => {}
}
}
@@ -280,7 +255,3 @@ async fn consume_client_data<R: AsyncRead + Unpin>(mut reader: R) {
}
}
}
#[cfg(test)]
#[path = "masking_security_tests.rs"]
mod security_tests;
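
The body of `detect_client_type` is elided in this diff beyond its first comment. As a rough illustration, here is one classifier that would be consistent with the labels asserted in `masking_security_tests.rs` ("SSH", "TLS-scanner", "port-scanner", "unknown"). The function name `classify_probe`, the "HTTP" label, the list of HTTP methods and the order of the checks are assumptions, not taken from the source.

```rust
/// Hypothetical stand-in for `detect_client_type`.
/// The "HTTP" label and the method list are assumptions.
fn classify_probe(data: &[u8]) -> &'static str {
    const HTTP_METHODS: [&[u8]; 4] = [b"GET ", b"POST ", b"HEAD ", b"PUT "];
    if HTTP_METHODS.iter().any(|&m| data.starts_with(m)) {
        return "HTTP";
    }
    // TLS record header: content type 0x16 (handshake), then major version 0x03.
    if data.len() >= 2 && data[0] == 0x16 && data[1] == 0x03 {
        return "TLS-scanner";
    }
    // SSH identification strings start with "SSH-".
    if data.starts_with(b"SSH-") {
        return "SSH";
    }
    // Very short payloads that match nothing else are treated as port scans.
    if data.len() < 10 {
        return "port-scanner";
    }
    "unknown"
}
```

The protocol prefixes are checked before the length rule so that a short TLS or SSH probe is still labelled by protocol rather than as a generic port scan, which matches the 9-byte TLS probe the tests expect to be recorded as `[TLS-scanner]`.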

@@ -1,550 +0,0 @@
use super::*;
use crate::config::ProxyConfig;
use std::pin::Pin;
use std::task::{Context, Poll};
use tokio::io::{duplex, AsyncBufReadExt, BufReader};
use tokio::net::TcpListener;
#[cfg(unix)]
use tokio::net::UnixListener;
use tokio::time::{timeout, Duration};
#[tokio::test]
async fn bad_client_probe_is_forwarded_verbatim_to_mask_backend() {
let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let backend_addr = listener.local_addr().unwrap();
let probe = b"GET / HTTP/1.1\r\nHost: front.example\r\n\r\n".to_vec();
let backend_reply = b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nOK".to_vec();
let accept_task = tokio::spawn({
let probe = probe.clone();
let backend_reply = backend_reply.clone();
async move {
let (mut stream, _) = listener.accept().await.unwrap();
let mut received = vec![0u8; probe.len()];
stream.read_exact(&mut received).await.unwrap();
assert_eq!(received, probe);
stream.write_all(&backend_reply).await.unwrap();
}
});
let mut config = ProxyConfig::default();
config.general.beobachten = false;
config.censorship.mask = true;
config.censorship.mask_host = Some("127.0.0.1".to_string());
config.censorship.mask_port = backend_addr.port();
config.censorship.mask_unix_sock = None;
config.censorship.mask_proxy_protocol = 0;
let peer: SocketAddr = "203.0.113.10:42424".parse().unwrap();
let local_addr: SocketAddr = "127.0.0.1:443".parse().unwrap();
let (client_reader, _client_writer) = duplex(256);
let (mut client_visible_reader, client_visible_writer) = duplex(2048);
let beobachten = BeobachtenStore::new();
handle_bad_client(
client_reader,
client_visible_writer,
&probe,
peer,
local_addr,
&config,
&beobachten,
)
.await;
let mut observed = vec![0u8; backend_reply.len()];
client_visible_reader.read_exact(&mut observed).await.unwrap();
assert_eq!(observed, backend_reply);
accept_task.await.unwrap();
}
#[tokio::test]
async fn tls_scanner_probe_keeps_http_like_fallback_surface() {
let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let backend_addr = listener.local_addr().unwrap();
let probe = vec![0x16, 0x03, 0x01, 0x00, 0x10, 0x01, 0x02, 0x03, 0x04];
let backend_reply = b"HTTP/1.1 400 Bad Request\r\nContent-Length: 0\r\n\r\n".to_vec();
let accept_task = tokio::spawn({
let probe = probe.clone();
let backend_reply = backend_reply.clone();
async move {
let (mut stream, _) = listener.accept().await.unwrap();
let mut received = vec![0u8; probe.len()];
stream.read_exact(&mut received).await.unwrap();
assert_eq!(received, probe);
stream.write_all(&backend_reply).await.unwrap();
}
});
let mut config = ProxyConfig::default();
config.general.beobachten = true;
config.general.beobachten_minutes = 1;
config.censorship.mask = true;
config.censorship.mask_host = Some("127.0.0.1".to_string());
config.censorship.mask_port = backend_addr.port();
config.censorship.mask_unix_sock = None;
config.censorship.mask_proxy_protocol = 0;
let peer: SocketAddr = "198.51.100.44:55221".parse().unwrap();
let local_addr: SocketAddr = "127.0.0.1:443".parse().unwrap();
let (client_reader, _client_writer) = duplex(256);
let (mut client_visible_reader, client_visible_writer) = duplex(2048);
let beobachten = BeobachtenStore::new();
handle_bad_client(
client_reader,
client_visible_writer,
&probe,
peer,
local_addr,
&config,
&beobachten,
)
.await;
let mut observed = vec![0u8; backend_reply.len()];
client_visible_reader.read_exact(&mut observed).await.unwrap();
assert_eq!(observed, backend_reply);
let snapshot = beobachten.snapshot_text(Duration::from_secs(60));
assert!(snapshot.contains("[TLS-scanner]"));
assert!(snapshot.contains("198.51.100.44-1"));
accept_task.await.unwrap();
}
#[test]
fn detect_client_type_covers_ssh_port_scanner_and_unknown() {
assert_eq!(detect_client_type(b"SSH-2.0-OpenSSH_9.7"), "SSH");
assert_eq!(detect_client_type(b"\x01\x02\x03"), "port-scanner");
assert_eq!(detect_client_type(b"random-binary-payload"), "unknown");
}
#[test]
fn detect_client_type_len_boundary_9_vs_10_bytes() {
assert_eq!(detect_client_type(b"123456789"), "port-scanner");
assert_eq!(detect_client_type(b"1234567890"), "unknown");
}
#[tokio::test]
async fn beobachten_records_scanner_class_when_mask_is_disabled() {
let mut config = ProxyConfig::default();
config.general.beobachten = true;
config.general.beobachten_minutes = 1;
config.censorship.mask = false;
let peer: SocketAddr = "203.0.113.99:41234".parse().unwrap();
let local_addr: SocketAddr = "127.0.0.1:443".parse().unwrap();
let initial = b"SSH-2.0-probe";
let (mut client_reader_side, client_reader) = duplex(256);
let (_client_visible_reader, client_visible_writer) = duplex(256);
let beobachten = BeobachtenStore::new();
let task = tokio::spawn(async move {
handle_bad_client(
client_reader,
client_visible_writer,
initial,
peer,
local_addr,
&config,
&beobachten,
)
.await;
beobachten
});
client_reader_side.write_all(b"noise").await.unwrap();
drop(client_reader_side);
let beobachten = timeout(Duration::from_secs(3), task).await.unwrap().unwrap();
let snapshot = beobachten.snapshot_text(Duration::from_secs(60));
assert!(snapshot.contains("[SSH]"));
assert!(snapshot.contains("203.0.113.99-1"));
}
#[tokio::test]
async fn backend_unavailable_falls_back_to_silent_consume() {
let temp_listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let unused_port = temp_listener.local_addr().unwrap().port();
drop(temp_listener);
let mut config = ProxyConfig::default();
config.general.beobachten = false;
config.censorship.mask = true;
config.censorship.mask_host = Some("127.0.0.1".to_string());
config.censorship.mask_port = unused_port;
config.censorship.mask_unix_sock = None;
config.censorship.mask_proxy_protocol = 0;
let peer: SocketAddr = "203.0.113.11:42425".parse().unwrap();
let local_addr: SocketAddr = "127.0.0.1:443".parse().unwrap();
let probe = b"GET /probe HTTP/1.1\r\nHost: x\r\n\r\n";
let (mut client_reader_side, client_reader) = duplex(256);
let (mut client_visible_reader, client_visible_writer) = duplex(256);
let beobachten = BeobachtenStore::new();
let task = tokio::spawn(async move {
handle_bad_client(
client_reader,
client_visible_writer,
probe,
peer,
local_addr,
&config,
&beobachten,
)
.await;
});
client_reader_side.write_all(b"noise").await.unwrap();
drop(client_reader_side);
timeout(Duration::from_secs(3), task).await.unwrap().unwrap();
let mut buf = [0u8; 1];
let n = timeout(Duration::from_secs(1), client_visible_reader.read(&mut buf))
.await
.unwrap()
.unwrap();
assert_eq!(n, 0);
}
#[tokio::test]
async fn mask_disabled_consumes_client_data_without_response() {
let mut config = ProxyConfig::default();
config.general.beobachten = false;
config.censorship.mask = false;
let peer: SocketAddr = "198.51.100.12:45454".parse().unwrap();
let local_addr: SocketAddr = "127.0.0.1:443".parse().unwrap();
let initial = b"scanner";
let (mut client_reader_side, client_reader) = duplex(256);
let (mut client_visible_reader, client_visible_writer) = duplex(256);
let beobachten = BeobachtenStore::new();
let task = tokio::spawn(async move {
handle_bad_client(
client_reader,
client_visible_writer,
initial,
peer,
local_addr,
&config,
&beobachten,
)
.await;
});
client_reader_side.write_all(b"untrusted payload").await.unwrap();
drop(client_reader_side);
timeout(Duration::from_secs(3), task).await.unwrap().unwrap();
let mut buf = [0u8; 1];
let n = timeout(Duration::from_secs(1), client_visible_reader.read(&mut buf))
.await
.unwrap()
.unwrap();
assert_eq!(n, 0);
}
#[tokio::test]
async fn proxy_protocol_v1_header_is_sent_before_probe() {
let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let backend_addr = listener.local_addr().unwrap();
let probe = b"GET / HTTP/1.1\r\nHost: front.example\r\n\r\n".to_vec();
let backend_reply = b"HTTP/1.1 204 No Content\r\nContent-Length: 0\r\n\r\n".to_vec();
let accept_task = tokio::spawn({
let probe = probe.clone();
let backend_reply = backend_reply.clone();
async move {
let (stream, _) = listener.accept().await.unwrap();
let mut reader = BufReader::new(stream);
let mut header_line = Vec::new();
reader.read_until(b'\n', &mut header_line).await.unwrap();
let header_text = String::from_utf8(header_line.clone()).unwrap();
assert!(header_text.starts_with("PROXY TCP4 "));
assert!(header_text.ends_with("\r\n"));
let mut received_probe = vec![0u8; probe.len()];
reader.read_exact(&mut received_probe).await.unwrap();
assert_eq!(received_probe, probe);
let mut stream = reader.into_inner();
stream.write_all(&backend_reply).await.unwrap();
}
});
let mut config = ProxyConfig::default();
config.general.beobachten = false;
config.censorship.mask = true;
config.censorship.mask_host = Some("127.0.0.1".to_string());
config.censorship.mask_port = backend_addr.port();
config.censorship.mask_unix_sock = None;
config.censorship.mask_proxy_protocol = 1;
let peer: SocketAddr = "203.0.113.15:50001".parse().unwrap();
let local_addr: SocketAddr = "127.0.0.1:443".parse().unwrap();
let (client_reader, _client_writer) = duplex(256);
let (mut client_visible_reader, client_visible_writer) = duplex(2048);
let beobachten = BeobachtenStore::new();
handle_bad_client(
client_reader,
client_visible_writer,
&probe,
peer,
local_addr,
&config,
&beobachten,
)
.await;
let mut observed = vec![0u8; backend_reply.len()];
client_visible_reader.read_exact(&mut observed).await.unwrap();
assert_eq!(observed, backend_reply);
accept_task.await.unwrap();
}
#[tokio::test]
async fn proxy_protocol_v2_header_is_sent_before_probe() {
let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let backend_addr = listener.local_addr().unwrap();
let probe = b"GET / HTTP/1.1\r\nHost: front.example\r\n\r\n".to_vec();
let backend_reply = b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n".to_vec();
let accept_task = tokio::spawn({
let probe = probe.clone();
let backend_reply = backend_reply.clone();
async move {
let (mut stream, _) = listener.accept().await.unwrap();
let mut sig = [0u8; 12];
stream.read_exact(&mut sig).await.unwrap();
assert_eq!(&sig, b"\r\n\r\n\0\r\nQUIT\n");
let mut fixed = [0u8; 4];
stream.read_exact(&mut fixed).await.unwrap();
let addr_len = u16::from_be_bytes([fixed[2], fixed[3]]) as usize;
let mut addr_block = vec![0u8; addr_len];
stream.read_exact(&mut addr_block).await.unwrap();
let mut received_probe = vec![0u8; probe.len()];
stream.read_exact(&mut received_probe).await.unwrap();
assert_eq!(received_probe, probe);
stream.write_all(&backend_reply).await.unwrap();
}
});
let mut config = ProxyConfig::default();
config.general.beobachten = false;
config.censorship.mask = true;
config.censorship.mask_host = Some("127.0.0.1".to_string());
config.censorship.mask_port = backend_addr.port();
config.censorship.mask_unix_sock = None;
config.censorship.mask_proxy_protocol = 2;
let peer: SocketAddr = "203.0.113.18:50004".parse().unwrap();
let local_addr: SocketAddr = "127.0.0.1:443".parse().unwrap();
let (client_reader, _client_writer) = duplex(256);
let (mut client_visible_reader, client_visible_writer) = duplex(2048);
let beobachten = BeobachtenStore::new();
handle_bad_client(
client_reader,
client_visible_writer,
&probe,
peer,
local_addr,
&config,
&beobachten,
)
.await;
let mut observed = vec![0u8; backend_reply.len()];
client_visible_reader.read_exact(&mut observed).await.unwrap();
assert_eq!(observed, backend_reply);
accept_task.await.unwrap();
}
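The v2 test above reads a 12-byte signature, a 4-byte fixed block whose last two bytes are a big-endian length, and then the address block. A minimal sketch of a header with that layout for TCP over IPv4 follows. The function name `proxy_v2_header_tcp4` is hypothetical, and the byte values (`0x21`, `0x11`, address length 12) follow the public PROXY protocol v2 layout rather than anything shown in this diff.

```rust
use std::net::SocketAddrV4;

/// 12-byte PROXY protocol v2 signature (the same bytes the test asserts on).
const PROXY_V2_SIGNATURE: [u8; 12] = *b"\r\n\r\n\0\r\nQUIT\n";

/// Hypothetical helper: builds a v2 header for a TCP/IPv4 connection.
fn proxy_v2_header_tcp4(src: SocketAddrV4, dst: SocketAddrV4) -> Vec<u8> {
    let mut out = Vec::with_capacity(28);
    out.extend_from_slice(&PROXY_V2_SIGNATURE);
    out.push(0x21); // version 2, command PROXY
    out.push(0x11); // address family AF_INET, transport STREAM
    out.extend_from_slice(&12u16.to_be_bytes()); // address block length
    out.extend_from_slice(&src.ip().octets());
    out.extend_from_slice(&dst.ip().octets());
    out.extend_from_slice(&src.port().to_be_bytes());
    out.extend_from_slice(&dst.port().to_be_bytes());
    out
}
```

A backend that parses bytes 14 and 15 as the length, as the test does, would then skip exactly 12 address bytes before reading the forwarded probe.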
#[tokio::test]
async fn proxy_protocol_v1_mixed_family_falls_back_to_unknown_header() {
let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let backend_addr = listener.local_addr().unwrap();
let probe = b"GET /mix HTTP/1.1\r\nHost: front.example\r\n\r\n".to_vec();
let backend_reply = b"HTTP/1.1 204 No Content\r\nContent-Length: 0\r\n\r\n".to_vec();
let accept_task = tokio::spawn({
let probe = probe.clone();
let backend_reply = backend_reply.clone();
async move {
let (stream, _) = listener.accept().await.unwrap();
let mut reader = BufReader::new(stream);
let mut header_line = Vec::new();
reader.read_until(b'\n', &mut header_line).await.unwrap();
let header_text = String::from_utf8(header_line).unwrap();
assert_eq!(header_text, "PROXY UNKNOWN\r\n");
let mut received_probe = vec![0u8; probe.len()];
reader.read_exact(&mut received_probe).await.unwrap();
assert_eq!(received_probe, probe);
let mut stream = reader.into_inner();
stream.write_all(&backend_reply).await.unwrap();
}
});
let mut config = ProxyConfig::default();
config.general.beobachten = false;
config.censorship.mask = true;
config.censorship.mask_host = Some("127.0.0.1".to_string());
config.censorship.mask_port = backend_addr.port();
config.censorship.mask_unix_sock = None;
config.censorship.mask_proxy_protocol = 1;
let peer: SocketAddr = "203.0.113.20:50006".parse().unwrap();
let local_addr: SocketAddr = "[::1]:443".parse().unwrap();
let (client_reader, _client_writer) = duplex(256);
let (mut client_visible_reader, client_visible_writer) = duplex(2048);
let beobachten = BeobachtenStore::new();
handle_bad_client(
client_reader,
client_visible_writer,
&probe,
peer,
local_addr,
&config,
&beobachten,
)
.await;
let mut observed = vec![0u8; backend_reply.len()];
client_visible_reader.read_exact(&mut observed).await.unwrap();
assert_eq!(observed, backend_reply);
accept_task.await.unwrap();
}
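The v1 tests above expect a `PROXY TCP4 ` line when both addresses are IPv4 and `PROXY UNKNOWN\r\n` when the peer and local address families differ. A minimal sketch of that selection logic is given below. The function name `proxy_v1_header` is hypothetical and this is not the project's implementation. The field order (source address, destination address, source port, destination port) follows the public PROXY protocol v1 text format.

```rust
use std::net::SocketAddr;

/// Hypothetical helper: builds a PROXY protocol v1 line.
/// v1 cannot describe a connection whose endpoints use different address
/// families, so that case falls back to the UNKNOWN form.
fn proxy_v1_header(src: SocketAddr, dst: SocketAddr) -> String {
    match (src, dst) {
        (SocketAddr::V4(s), SocketAddr::V4(d)) => {
            format!("PROXY TCP4 {} {} {} {}\r\n", s.ip(), d.ip(), s.port(), d.port())
        }
        (SocketAddr::V6(s), SocketAddr::V6(d)) => {
            format!("PROXY TCP6 {} {} {} {}\r\n", s.ip(), d.ip(), s.port(), d.port())
        }
        // Mixed IPv4/IPv6 pair: no valid TCP4/TCP6 line exists.
        _ => "PROXY UNKNOWN\r\n".to_string(),
    }
}
```

With the addresses used in the mixed-family test (`203.0.113.20:50006` and `[::1]:443`) this sketch takes the fallback arm.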
#[cfg(unix)]
#[tokio::test]
async fn unix_socket_mask_path_forwards_probe_and_response() {
let sock_path = format!("/tmp/telemt-mask-test-{}-{}.sock", std::process::id(), rand::random::<u64>());
let _ = std::fs::remove_file(&sock_path);
let listener = UnixListener::bind(&sock_path).unwrap();
let probe = b"GET /unix HTTP/1.1\r\nHost: front.example\r\n\r\n".to_vec();
let backend_reply = b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nOK".to_vec();
let accept_task = tokio::spawn({
let probe = probe.clone();
let backend_reply = backend_reply.clone();
async move {
let (mut stream, _) = listener.accept().await.unwrap();
let mut received = vec![0u8; probe.len()];
stream.read_exact(&mut received).await.unwrap();
assert_eq!(received, probe);
stream.write_all(&backend_reply).await.unwrap();
}
});
let mut config = ProxyConfig::default();
config.general.beobachten = false;
config.censorship.mask = true;
config.censorship.mask_unix_sock = Some(sock_path.clone());
config.censorship.mask_proxy_protocol = 0;
let peer: SocketAddr = "203.0.113.30:50010".parse().unwrap();
let local_addr: SocketAddr = "127.0.0.1:443".parse().unwrap();
let (client_reader, _client_writer) = duplex(256);
let (mut client_visible_reader, client_visible_writer) = duplex(2048);
let beobachten = BeobachtenStore::new();
handle_bad_client(
client_reader,
client_visible_writer,
&probe,
peer,
local_addr,
&config,
&beobachten,
)
.await;
let mut observed = vec![0u8; backend_reply.len()];
client_visible_reader.read_exact(&mut observed).await.unwrap();
assert_eq!(observed, backend_reply);
accept_task.await.unwrap();
let _ = std::fs::remove_file(sock_path);
}
#[tokio::test]
async fn mask_disabled_slowloris_connection_is_closed_by_consume_timeout() {
let mut config = ProxyConfig::default();
config.general.beobachten = false;
config.censorship.mask = false;
let peer: SocketAddr = "198.51.100.33:45455".parse().unwrap();
let local_addr: SocketAddr = "127.0.0.1:443".parse().unwrap();
let (_client_reader_side, client_reader) = duplex(256);
let (_client_visible_reader, client_visible_writer) = duplex(256);
let beobachten = BeobachtenStore::new();
let task = tokio::spawn(async move {
handle_bad_client(
client_reader,
client_visible_writer,
b"slowloris",
peer,
local_addr,
&config,
&beobachten,
)
.await;
});
timeout(Duration::from_secs(1), task).await.unwrap().unwrap();
}
struct PendingWriter;
impl tokio::io::AsyncWrite for PendingWriter {
fn poll_write(
self: Pin<&mut Self>,
_cx: &mut Context<'_>,
_buf: &[u8],
) -> Poll<std::io::Result<usize>> {
Poll::Pending
}
fn poll_flush(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<std::io::Result<()>> {
Poll::Ready(Ok(()))
}
fn poll_shutdown(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<std::io::Result<()>> {
Poll::Ready(Ok(()))
}
}
#[tokio::test]
async fn proxy_header_write_timeout_returns_false() {
let mut writer = PendingWriter;
let ok = write_proxy_header_with_timeout(&mut writer, b"PROXY UNKNOWN\r\n").await;
assert!(!ok, "Proxy header writes that never complete must time out");
}

@@ -1,17 +1,14 @@
use std::collections::HashMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::net::{IpAddr, SocketAddr};
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, OnceLock};
use std::sync::{Arc, Mutex, OnceLock};
use std::time::{Duration, Instant};
#[cfg(test)]
use std::sync::Mutex;
use bytes::{Bytes, BytesMut};
use dashmap::DashMap;
use bytes::Bytes;
use tokio::io::{AsyncRead, AsyncReadExt, AsyncWrite, AsyncWriteExt};
use tokio::sync::{mpsc, oneshot, watch};
use tokio::time::timeout;
use tracing::{debug, trace, warn};
use crate::config::ProxyConfig;
@@ -23,6 +20,8 @@ use crate::proxy::route_mode::{
RelayRouteMode, RouteCutoverState, ROUTE_SWITCH_ERROR_MSG, affected_cutover_state,
cutover_stagger_delay,
};
use crate::proxy::adaptive_buffers::{self, AdaptiveTier};
use crate::proxy::session_eviction::SessionLease;
use crate::stats::Stats;
use crate::stream::{BufferPool, CryptoReader, CryptoWriter};
use crate::transport::middle_proxy::{MePool, MeResponse, proto_flags_for_tag};
@@ -33,15 +32,13 @@ enum C2MeCommand {
}
const DESYNC_DEDUP_WINDOW: Duration = Duration::from_secs(60);
const DESYNC_DEDUP_MAX_ENTRIES: usize = 65_536;
const DESYNC_DEDUP_PRUNE_SCAN_LIMIT: usize = 1024;
const DESYNC_ERROR_CLASS: &str = "frame_too_large_crypto_desync";
const C2ME_CHANNEL_CAPACITY_FALLBACK: usize = 128;
const C2ME_SOFT_PRESSURE_MIN_FREE_SLOTS: usize = 64;
const C2ME_SENDER_FAIRNESS_BUDGET: usize = 32;
const ME_D2C_FLUSH_BATCH_MAX_FRAMES_MIN: usize = 1;
const ME_D2C_FLUSH_BATCH_MAX_BYTES_MIN: usize = 4096;
static DESYNC_DEDUP: OnceLock<DashMap<u64, Instant>> = OnceLock::new();
static DESYNC_DEDUP: OnceLock<Mutex<HashMap<u64, Instant>>> = OnceLock::new();
struct RelayForensicsState {
trace_id: u64,
@@ -64,8 +61,8 @@ struct MeD2cFlushPolicy {
}
impl MeD2cFlushPolicy {
fn from_config(config: &ProxyConfig) -> Self {
Self {
fn from_config(config: &ProxyConfig, tier: AdaptiveTier) -> Self {
let base = Self {
max_frames: config
.general
.me_d2c_flush_batch_max_frames
@@ -76,6 +73,18 @@ impl MeD2cFlushPolicy {
.max(ME_D2C_FLUSH_BATCH_MAX_BYTES_MIN),
max_delay: Duration::from_micros(config.general.me_d2c_flush_batch_max_delay_us),
ack_flush_immediate: config.general.me_d2c_ack_flush_immediate,
};
let (max_frames, max_bytes, max_delay) = adaptive_buffers::me_flush_policy_for_tier(
tier,
base.max_frames,
base.max_bytes,
base.max_delay,
);
Self {
max_frames,
max_bytes,
max_delay,
ack_flush_immediate: base.ack_flush_immediate,
}
}
}
@@ -95,46 +104,24 @@ fn should_emit_full_desync(key: u64, all_full: bool, now: Instant) -> bool {
return true;
}
let dedup = DESYNC_DEDUP.get_or_init(DashMap::new);
let dedup = DESYNC_DEDUP.get_or_init(|| Mutex::new(HashMap::new()));
let mut guard = dedup.lock().expect("desync dedup mutex poisoned");
guard.retain(|_, seen_at| now.duration_since(*seen_at) < DESYNC_DEDUP_WINDOW);
if let Some(mut seen_at) = dedup.get_mut(&key) {
if now.duration_since(*seen_at) >= DESYNC_DEDUP_WINDOW {
*seen_at = now;
return true;
}
return false;
}
if dedup.len() >= DESYNC_DEDUP_MAX_ENTRIES {
let mut stale_keys = Vec::new();
for entry in dedup.iter().take(DESYNC_DEDUP_PRUNE_SCAN_LIMIT) {
if now.duration_since(*entry.value()) >= DESYNC_DEDUP_WINDOW {
stale_keys.push(*entry.key());
match guard.get_mut(&key) {
Some(seen_at) => {
if now.duration_since(*seen_at) >= DESYNC_DEDUP_WINDOW {
*seen_at = now;
true
} else {
false
}
}
for stale_key in stale_keys {
dedup.remove(&stale_key);
}
if dedup.len() >= DESYNC_DEDUP_MAX_ENTRIES {
return false;
None => {
guard.insert(key, now);
true
}
}
dedup.insert(key, now);
true
}
#[cfg(test)]
fn clear_desync_dedup_for_testing() {
if let Some(dedup) = DESYNC_DEDUP.get() {
dedup.clear();
}
}
#[cfg(test)]
fn desync_dedup_test_lock() -> &'static Mutex<()> {
static TEST_LOCK: OnceLock<Mutex<()>> = OnceLock::new();
TEST_LOCK.get_or_init(|| Mutex::new(()))
}
fn report_desync_frame_too_large(
@@ -256,12 +243,13 @@ pub(crate) async fn handle_via_middle_proxy<R, W>(
me_pool: Arc<MePool>,
stats: Arc<Stats>,
config: Arc<ProxyConfig>,
buffer_pool: Arc<BufferPool>,
_buffer_pool: Arc<BufferPool>,
local_addr: SocketAddr,
rng: Arc<SecureRandom>,
mut route_rx: watch::Receiver<RouteCutoverState>,
route_snapshot: RouteCutoverState,
session_id: u64,
session_lease: SessionLease,
) -> Result<()>
where
R: AsyncRead + Unpin + Send + 'static,
@@ -271,6 +259,7 @@ where
let peer = success.peer;
let proto_tag = success.proto_tag;
let pool_generation = me_pool.current_generation();
let seed_tier = adaptive_buffers::seed_tier_for_user(&user);
debug!(
user = %user,
@@ -298,6 +287,7 @@ where
};
stats.increment_user_connects(&user);
stats.increment_user_curr_connects(&user);
stats.increment_current_connections_me();
if let Some(cutover) = affected_cutover_state(
@@ -317,9 +307,19 @@ where
let _ = me_pool.send_close(conn_id).await;
me_pool.registry().unregister(conn_id).await;
stats.decrement_current_connections_me();
stats.decrement_user_curr_connects(&user);
return Err(ProxyError::Proxy(ROUTE_SWITCH_ERROR_MSG.to_string()));
}
if session_lease.is_stale() {
stats.increment_reconnect_stale_close_total();
let _ = me_pool.send_close(conn_id).await;
me_pool.registry().unregister(conn_id).await;
stats.decrement_current_connections_me();
stats.decrement_user_curr_connects(&user);
return Err(ProxyError::Proxy("Session evicted by reconnect".to_string()));
}
// Per-user ad_tag from access.user_ad_tags; fallback to general.ad_tag (hot-reloadable)
let user_tag: Option<Vec<u8>> = config
.access
@@ -393,7 +393,7 @@ where
let rng_clone = rng.clone();
let user_clone = user.clone();
let bytes_me2c_clone = bytes_me2c.clone();
let d2c_flush_policy = MeD2cFlushPolicy::from_config(&config);
let d2c_flush_policy = MeD2cFlushPolicy::from_config(&config, seed_tier);
let me_writer = tokio::spawn(async move {
let mut writer = crypto_writer;
let mut frame_buf = Vec::with_capacity(16 * 1024);
@@ -553,6 +553,12 @@ where
let mut frame_counter: u64 = 0;
let mut route_watch_open = true;
loop {
if session_lease.is_stale() {
stats.increment_reconnect_stale_close_total();
let _ = enqueue_c2me_command(&c2me_tx, C2MeCommand::Close).await;
main_result = Err(ProxyError::Proxy("Session evicted by reconnect".to_string()));
break;
}
if let Some(cutover) = affected_cutover_state(
&route_rx,
RelayRouteMode::Middle,
@@ -582,8 +588,6 @@ where
&mut crypto_reader,
proto_tag,
frame_limit,
Duration::from_secs(config.timeouts.client_handshake.max(1)),
&buffer_pool,
&forensics,
&mut frame_counter,
&stats,
@@ -663,8 +667,10 @@ where
frames_ok = frame_counter,
"ME relay cleanup"
);
adaptive_buffers::record_user_tier(&user, seed_tier);
me_pool.registry().unregister(conn_id).await;
stats.decrement_current_connections_me();
stats.decrement_user_curr_connects(&user);
result
}
@@ -672,8 +678,6 @@ async fn read_client_payload<R>(
client_reader: &mut CryptoReader<R>,
proto_tag: ProtoTag,
max_frame: usize,
frame_read_timeout: Duration,
buffer_pool: &Arc<BufferPool>,
forensics: &RelayForensicsState,
frame_counter: &mut u64,
stats: &Stats,
@@ -681,40 +685,23 @@ async fn read_client_payload<R>(
where
R: AsyncRead + Unpin + Send + 'static,
{
async fn read_exact_with_timeout<R>(
client_reader: &mut CryptoReader<R>,
buf: &mut [u8],
frame_read_timeout: Duration,
) -> Result<()>
where
R: AsyncRead + Unpin + Send + 'static,
{
match timeout(frame_read_timeout, client_reader.read_exact(buf)).await {
Ok(Ok(_)) => Ok(()),
Ok(Err(e)) => Err(ProxyError::Io(e)),
Err(_) => Err(ProxyError::Io(std::io::Error::new(
std::io::ErrorKind::TimedOut,
"middle-relay client frame read timeout",
))),
}
}
loop {
let (len, quickack, raw_len_bytes) = match proto_tag {
ProtoTag::Abridged => {
let mut first = [0u8; 1];
match read_exact_with_timeout(client_reader, &mut first, frame_read_timeout).await {
Ok(()) => {}
Err(ProxyError::Io(e)) if e.kind() == std::io::ErrorKind::UnexpectedEof => {
return Ok(None);
}
Err(e) => return Err(e),
match client_reader.read_exact(&mut first).await {
Ok(_) => {}
Err(e) if e.kind() == std::io::ErrorKind::UnexpectedEof => return Ok(None),
Err(e) => return Err(ProxyError::Io(e)),
}
let quickack = (first[0] & 0x80) != 0;
let len_words = if (first[0] & 0x7f) == 0x7f {
let mut ext = [0u8; 3];
read_exact_with_timeout(client_reader, &mut ext, frame_read_timeout).await?;
client_reader
.read_exact(&mut ext)
.await
.map_err(ProxyError::Io)?;
u32::from_le_bytes([ext[0], ext[1], ext[2], 0]) as usize
} else {
(first[0] & 0x7f) as usize
@@ -727,12 +714,10 @@ where
}
ProtoTag::Intermediate | ProtoTag::Secure => {
let mut len_buf = [0u8; 4];
match read_exact_with_timeout(client_reader, &mut len_buf, frame_read_timeout).await {
Ok(()) => {}
Err(ProxyError::Io(e)) if e.kind() == std::io::ErrorKind::UnexpectedEof => {
return Ok(None);
}
Err(e) => return Err(e),
match client_reader.read_exact(&mut len_buf).await {
Ok(_) => {}
Err(e) if e.kind() == std::io::ErrorKind::UnexpectedEof => return Ok(None),
Err(e) => return Err(ProxyError::Io(e)),
}
let quickack = (len_buf[3] & 0x80) != 0;
(
@@ -784,25 +769,18 @@ where
len
};
let chunk_cap = buffer_pool.buffer_size().max(1024);
let mut payload = BytesMut::with_capacity(len.min(chunk_cap));
let mut remaining = len;
while remaining > 0 {
let chunk_len = remaining.min(chunk_cap);
let mut chunk = buffer_pool.get();
chunk.resize(chunk_len, 0);
read_exact_with_timeout(client_reader, &mut chunk[..chunk_len], frame_read_timeout)
.await?;
payload.extend_from_slice(&chunk[..chunk_len]);
remaining -= chunk_len;
}
let mut payload = vec![0u8; len];
client_reader
.read_exact(&mut payload)
.await
.map_err(ProxyError::Io)?;
// Secure Intermediate: strip validated trailing padding bytes.
if proto_tag == ProtoTag::Secure {
payload.truncate(secure_payload_len);
}
*frame_counter += 1;
return Ok(Some((payload.freeze(), quickack)));
return Ok(Some((Bytes::from(payload), quickack)));
}
}
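The `ProtoTag::Abridged` branch of `read_client_payload` decodes a one-byte prefix whose high bit is the quick-ack flag and whose low seven bits are either the length or the marker `0x7f`, which announces a 3-byte little-endian extension. The same decoding can be sketched as a pure function over a byte slice. The type `AbridgedHeader` and the function `parse_abridged_header` are hypothetical, and the multiplication by 4 assumes the usual abridged-transport convention that the length counts 4-byte words (the hunk shown here is cut off before that step).

```rust
/// Hypothetical result type for the sketch.
#[derive(Debug, PartialEq, Eq)]
struct AbridgedHeader {
    quickack: bool,
    /// Payload length in bytes (length in words times 4, an assumption).
    len_bytes: usize,
    /// Number of prefix bytes consumed: 1 or 4.
    header_len: usize,
}

/// Returns None when `buf` does not yet hold the complete prefix.
fn parse_abridged_header(buf: &[u8]) -> Option<AbridgedHeader> {
    let first = *buf.first()?;
    let quickack = (first & 0x80) != 0;
    let (len_words, header_len) = if (first & 0x7f) == 0x7f {
        // Extended form: three more bytes, little-endian.
        let ext = buf.get(1..4)?;
        (u32::from_le_bytes([ext[0], ext[1], ext[2], 0]) as usize, 4)
    } else {
        ((first & 0x7f) as usize, 1)
    };
    Some(AbridgedHeader {
        quickack,
        len_bytes: len_words * 4,
        header_len,
    })
}
```

Keeping the decoding separate from the I/O makes the bit handling easy to unit-test without a `CryptoReader`.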
@@ -994,5 +972,82 @@ where
}
#[cfg(test)]
#[path = "middle_relay_security_tests.rs"]
mod security_tests;
mod tests {
use super::*;
use tokio::time::{Duration as TokioDuration, timeout};
#[test]
fn should_yield_sender_only_on_budget_with_backlog() {
assert!(!should_yield_c2me_sender(0, true));
assert!(!should_yield_c2me_sender(C2ME_SENDER_FAIRNESS_BUDGET - 1, true));
assert!(!should_yield_c2me_sender(C2ME_SENDER_FAIRNESS_BUDGET, false));
assert!(should_yield_c2me_sender(C2ME_SENDER_FAIRNESS_BUDGET, true));
}
#[tokio::test]
async fn enqueue_c2me_command_uses_try_send_fast_path() {
let (tx, mut rx) = mpsc::channel::<C2MeCommand>(2);
enqueue_c2me_command(
&tx,
C2MeCommand::Data {
payload: Bytes::from_static(&[1, 2, 3]),
flags: 0,
},
)
.await
.unwrap();
let recv = timeout(TokioDuration::from_millis(50), rx.recv())
.await
.unwrap()
.unwrap();
match recv {
C2MeCommand::Data { payload, flags } => {
assert_eq!(payload.as_ref(), &[1, 2, 3]);
assert_eq!(flags, 0);
}
C2MeCommand::Close => panic!("unexpected close command"),
}
}
#[tokio::test]
async fn enqueue_c2me_command_falls_back_to_send_when_queue_is_full() {
let (tx, mut rx) = mpsc::channel::<C2MeCommand>(1);
tx.send(C2MeCommand::Data {
payload: Bytes::from_static(&[9]),
flags: 9,
})
.await
.unwrap();
let tx2 = tx.clone();
let producer = tokio::spawn(async move {
enqueue_c2me_command(
&tx2,
C2MeCommand::Data {
payload: Bytes::from_static(&[7, 7]),
flags: 7,
},
)
.await
.unwrap();
});
let _ = timeout(TokioDuration::from_millis(100), rx.recv())
.await
.unwrap();
producer.await.unwrap();
let recv = timeout(TokioDuration::from_millis(100), rx.recv())
.await
.unwrap()
.unwrap();
match recv {
C2MeCommand::Data { payload, flags } => {
assert_eq!(payload.as_ref(), &[7, 7]);
assert_eq!(flags, 7);
}
C2MeCommand::Close => panic!("unexpected close command"),
}
}
}
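The `should_emit_full_desync` change in this file replaces the `DashMap` with a `Mutex<HashMap<u64, Instant>>` and prunes expired entries with `retain` on every call. A minimal sketch of that windowed dedup pattern is shown below, without the global `OnceLock` so it can be exercised in isolation. The `DedupWindow` type and its method names are hypothetical.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-alone version of a windowed dedup cache.
struct DedupWindow {
    window: Duration,
    seen: HashMap<u64, Instant>,
}

impl DedupWindow {
    fn new(window: Duration) -> Self {
        Self { window, seen: HashMap::new() }
    }

    /// Returns true when `key` has not been emitted within the window.
    fn should_emit(&mut self, key: u64, now: Instant) -> bool {
        // Drop every entry older than the window before the lookup.
        let window = self.window;
        self.seen
            .retain(|_, seen_at| now.duration_since(*seen_at) < window);
        if self.seen.contains_key(&key) {
            // Still inside the window: suppress the duplicate.
            false
        } else {
            self.seen.insert(key, now);
            true
        }
    }
}
```

Note the trade-off: `retain` walks the whole map on each call, so the cost grows with the number of distinct keys seen inside the window. The size-capped variant with a bounded prune scan appears to be what this diff removes.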

@@ -1,201 +0,0 @@
use super::*;
use crate::crypto::AesCtr;
use crate::stats::Stats;
use crate::stream::{BufferPool, CryptoReader};
use std::net::SocketAddr;
use std::sync::Arc;
use std::sync::atomic::AtomicU64;
use tokio::io::AsyncWriteExt;
use tokio::io::duplex;
use tokio::time::{Duration as TokioDuration, timeout};
#[test]
fn should_yield_sender_only_on_budget_with_backlog() {
assert!(!should_yield_c2me_sender(0, true));
assert!(!should_yield_c2me_sender(C2ME_SENDER_FAIRNESS_BUDGET - 1, true));
assert!(!should_yield_c2me_sender(C2ME_SENDER_FAIRNESS_BUDGET, false));
assert!(should_yield_c2me_sender(C2ME_SENDER_FAIRNESS_BUDGET, true));
}
#[tokio::test]
async fn enqueue_c2me_command_uses_try_send_fast_path() {
let (tx, mut rx) = mpsc::channel::<C2MeCommand>(2);
enqueue_c2me_command(
&tx,
C2MeCommand::Data {
payload: Bytes::from_static(&[1, 2, 3]),
flags: 0,
},
)
.await
.unwrap();
let recv = timeout(TokioDuration::from_millis(50), rx.recv())
.await
.unwrap()
.unwrap();
match recv {
C2MeCommand::Data { payload, flags } => {
assert_eq!(payload.as_ref(), &[1, 2, 3]);
assert_eq!(flags, 0);
}
C2MeCommand::Close => panic!("unexpected close command"),
}
}
#[tokio::test]
async fn enqueue_c2me_command_falls_back_to_send_when_queue_is_full() {
let (tx, mut rx) = mpsc::channel::<C2MeCommand>(1);
tx.send(C2MeCommand::Data {
payload: Bytes::from_static(&[9]),
flags: 9,
})
.await
.unwrap();
let tx2 = tx.clone();
let producer = tokio::spawn(async move {
enqueue_c2me_command(
&tx2,
C2MeCommand::Data {
payload: Bytes::from_static(&[7, 7]),
flags: 7,
},
)
.await
.unwrap();
});
let _ = timeout(TokioDuration::from_millis(100), rx.recv())
.await
.unwrap();
producer.await.unwrap();
let recv = timeout(TokioDuration::from_millis(100), rx.recv())
.await
.unwrap()
.unwrap();
match recv {
C2MeCommand::Data { payload, flags } => {
assert_eq!(payload.as_ref(), &[7, 7]);
assert_eq!(flags, 7);
}
C2MeCommand::Close => panic!("unexpected close command"),
}
}
#[test]
fn desync_dedup_cache_is_bounded() {
let _guard = desync_dedup_test_lock()
.lock()
.expect("desync dedup test lock must be available");
clear_desync_dedup_for_testing();
let now = Instant::now();
for key in 0..DESYNC_DEDUP_MAX_ENTRIES as u64 {
assert!(
should_emit_full_desync(key, false, now),
"unique keys up to cap must be tracked"
);
}
assert!(
!should_emit_full_desync(u64::MAX, false, now),
"new key above cap must be suppressed to bound memory"
);
assert!(
!should_emit_full_desync(7, false, now),
"already tracked key inside dedup window must stay suppressed"
);
}
fn make_forensics_state() -> RelayForensicsState {
RelayForensicsState {
trace_id: 1,
conn_id: 2,
user: "test-user".to_string(),
peer: "127.0.0.1:50000".parse::<SocketAddr>().unwrap(),
peer_hash: 3,
started_at: Instant::now(),
bytes_c2me: 0,
bytes_me2c: Arc::new(AtomicU64::new(0)),
desync_all_full: false,
}
}
fn make_crypto_reader(reader: tokio::io::DuplexStream) -> CryptoReader<tokio::io::DuplexStream> {
let key = [0u8; 32];
let iv = 0u128;
CryptoReader::new(reader, AesCtr::new(&key, iv))
}
fn encrypt_for_reader(plaintext: &[u8]) -> Vec<u8> {
let key = [0u8; 32];
let iv = 0u128;
let mut cipher = AesCtr::new(&key, iv);
cipher.encrypt(plaintext)
}
#[tokio::test]
async fn read_client_payload_times_out_on_header_stall() {
let _guard = desync_dedup_test_lock()
.lock()
.expect("middle relay test lock must be available");
let (reader, _writer) = duplex(1024);
let mut crypto_reader = make_crypto_reader(reader);
let buffer_pool = Arc::new(BufferPool::new());
let stats = Stats::new();
let forensics = make_forensics_state();
let mut frame_counter = 0;
let result = read_client_payload(
&mut crypto_reader,
ProtoTag::Intermediate,
1024,
TokioDuration::from_millis(25),
&buffer_pool,
&forensics,
&mut frame_counter,
&stats,
)
.await;
assert!(
matches!(result, Err(ProxyError::Io(ref e)) if e.kind() == std::io::ErrorKind::TimedOut),
"stalled header read must time out"
);
}
#[tokio::test]
async fn read_client_payload_times_out_on_payload_stall() {
let _guard = desync_dedup_test_lock()
.lock()
.expect("middle relay test lock must be available");
let (reader, mut writer) = duplex(1024);
let encrypted_len = encrypt_for_reader(&[8, 0, 0, 0]);
writer.write_all(&encrypted_len).await.unwrap();
let mut crypto_reader = make_crypto_reader(reader);
let buffer_pool = Arc::new(BufferPool::new());
let stats = Stats::new();
let forensics = make_forensics_state();
let mut frame_counter = 0;
let result = read_client_payload(
&mut crypto_reader,
ProtoTag::Intermediate,
1024,
TokioDuration::from_millis(25),
&buffer_pool,
&forensics,
&mut frame_counter,
&stats,
)
.await;
assert!(
matches!(result, Err(ProxyError::Io(ref e)) if e.kind() == std::io::ErrorKind::TimedOut),
"stalled payload body read must time out"
);
}

@@ -1,5 +1,6 @@
//! Proxy Defs
pub mod adaptive_buffers;
pub mod client;
pub mod direct_relay;
pub mod handshake;
@@ -7,6 +8,7 @@ pub mod masking;
pub mod middle_relay;
pub mod route_mode;
pub mod relay;
pub mod session_eviction;
pub use client::ClientHandler;
#[allow(unused_imports)]

@@ -63,6 +63,10 @@ use tokio::io::{
use tokio::time::Instant;
use tracing::{debug, trace, warn};
use crate::error::Result;
use crate::proxy::adaptive_buffers::{
self, AdaptiveTier, RelaySignalSample, SessionAdaptiveController, TierTransitionReason,
};
use crate::proxy::session_eviction::SessionLease;
use crate::stats::Stats;
use crate::stream::BufferPool;
@@ -79,6 +83,7 @@ const ACTIVITY_TIMEOUT: Duration = Duration::from_secs(1800);
/// 10 seconds gives responsive timeout detection (±10s accuracy)
/// without measurable overhead from atomic reads.
const WATCHDOG_INTERVAL: Duration = Duration::from_secs(10);
const ADAPTIVE_TICK: Duration = Duration::from_millis(250);
// ============= CombinedStream =============
@@ -155,6 +160,16 @@ struct SharedCounters {
s2c_ops: AtomicU64,
/// Milliseconds since relay epoch of last I/O activity
last_activity_ms: AtomicU64,
/// Bytes requested to write to client (S→C direction).
s2c_requested_bytes: AtomicU64,
/// Total write operations for S→C direction.
s2c_write_ops: AtomicU64,
/// Number of partial writes to client.
s2c_partial_writes: AtomicU64,
/// Number of times S→C poll_write returned Pending.
s2c_pending_writes: AtomicU64,
/// Consecutive pending writes in S→C direction.
s2c_consecutive_pending_writes: AtomicU64,
}
impl SharedCounters {
@@ -165,6 +180,11 @@ impl SharedCounters {
c2s_ops: AtomicU64::new(0),
s2c_ops: AtomicU64::new(0),
last_activity_ms: AtomicU64::new(0),
s2c_requested_bytes: AtomicU64::new(0),
s2c_write_ops: AtomicU64::new(0),
s2c_partial_writes: AtomicU64::new(0),
s2c_pending_writes: AtomicU64::new(0),
s2c_consecutive_pending_writes: AtomicU64::new(0),
}
}
@@ -259,9 +279,21 @@ impl<S: AsyncWrite + Unpin> AsyncWrite for StatsIo<S> {
buf: &[u8],
) -> Poll<io::Result<usize>> {
let this = self.get_mut();
this.counters
.s2c_requested_bytes
.fetch_add(buf.len() as u64, Ordering::Relaxed);
match Pin::new(&mut this.inner).poll_write(cx, buf) {
Poll::Ready(Ok(n)) => {
this.counters.s2c_write_ops.fetch_add(1, Ordering::Relaxed);
this.counters
.s2c_consecutive_pending_writes
.store(0, Ordering::Relaxed);
if n < buf.len() {
this.counters
.s2c_partial_writes
.fetch_add(1, Ordering::Relaxed);
}
if n > 0 {
// S→C: data written to client
this.counters.s2c_bytes.fetch_add(n as u64, Ordering::Relaxed);
@@ -275,6 +307,15 @@ impl<S: AsyncWrite + Unpin> AsyncWrite for StatsIo<S> {
}
Poll::Ready(Ok(n))
}
Poll::Pending => {
this.counters
.s2c_pending_writes
.fetch_add(1, Ordering::Relaxed);
this.counters
.s2c_consecutive_pending_writes
.fetch_add(1, Ordering::Relaxed);
Poll::Pending
}
other => other,
}
}
@@ -316,8 +357,11 @@ pub async fn relay_bidirectional<CR, CW, SR, SW>(
c2s_buf_size: usize,
s2c_buf_size: usize,
user: &str,
dc_idx: i16,
stats: Arc<Stats>,
_buffer_pool: Arc<BufferPool>,
session_lease: SessionLease,
seed_tier: AdaptiveTier,
) -> Result<()>
where
CR: AsyncRead + Unpin + Send + 'static,
@@ -345,13 +389,33 @@ where
// ── Watchdog: activity timeout + periodic rate logging ──────────
let wd_counters = Arc::clone(&counters);
let wd_user = user_owned.clone();
let wd_dc = dc_idx;
let wd_stats = Arc::clone(&stats);
let wd_session = session_lease.clone();
let watchdog = async {
let mut prev_c2s: u64 = 0;
let mut prev_s2c: u64 = 0;
let mut prev_c2s_log: u64 = 0;
let mut prev_s2c_log: u64 = 0;
let mut prev_c2s_sample: u64 = 0;
let mut prev_s2c_requested_sample: u64 = 0;
let mut prev_s2c_written_sample: u64 = 0;
let mut prev_s2c_write_ops_sample: u64 = 0;
let mut prev_s2c_partial_sample: u64 = 0;
let mut accumulated_log = Duration::ZERO;
let mut adaptive = SessionAdaptiveController::new(seed_tier);
loop {
tokio::time::sleep(WATCHDOG_INTERVAL).await;
tokio::time::sleep(ADAPTIVE_TICK).await;
if wd_session.is_stale() {
wd_stats.increment_reconnect_stale_close_total();
warn!(
user = %wd_user,
dc = wd_dc,
"Session evicted by reconnect"
);
return;
}
let now = Instant::now();
let idle = wd_counters.idle_duration(now, epoch);
@@ -370,11 +434,80 @@ where
return; // Causes select! to cancel copy_bidirectional
}
let c2s_total = wd_counters.c2s_bytes.load(Ordering::Relaxed);
let s2c_requested_total = wd_counters
.s2c_requested_bytes
.load(Ordering::Relaxed);
let s2c_written_total = wd_counters.s2c_bytes.load(Ordering::Relaxed);
let s2c_write_ops_total = wd_counters
.s2c_write_ops
.load(Ordering::Relaxed);
let s2c_partial_total = wd_counters
.s2c_partial_writes
.load(Ordering::Relaxed);
let consecutive_pending = wd_counters
.s2c_consecutive_pending_writes
.load(Ordering::Relaxed) as u32;
let sample = RelaySignalSample {
c2s_bytes: c2s_total.saturating_sub(prev_c2s_sample),
s2c_requested_bytes: s2c_requested_total
.saturating_sub(prev_s2c_requested_sample),
s2c_written_bytes: s2c_written_total
.saturating_sub(prev_s2c_written_sample),
s2c_write_ops: s2c_write_ops_total
.saturating_sub(prev_s2c_write_ops_sample),
s2c_partial_writes: s2c_partial_total
.saturating_sub(prev_s2c_partial_sample),
s2c_consecutive_pending_writes: consecutive_pending,
};
if let Some(transition) = adaptive.observe(sample, ADAPTIVE_TICK.as_secs_f64()) {
match transition.reason {
TierTransitionReason::SoftConfirmed => {
wd_stats.increment_relay_adaptive_promotions_total();
}
TierTransitionReason::HardPressure => {
wd_stats.increment_relay_adaptive_promotions_total();
wd_stats.increment_relay_adaptive_hard_promotions_total();
}
TierTransitionReason::QuietDemotion => {
wd_stats.increment_relay_adaptive_demotions_total();
}
}
adaptive_buffers::record_user_tier(&wd_user, adaptive.max_tier_seen());
debug!(
user = %wd_user,
dc = wd_dc,
from_tier = transition.from.as_u8(),
to_tier = transition.to.as_u8(),
reason = ?transition.reason,
throughput_ema_bps = sample
.c2s_bytes
.max(sample.s2c_written_bytes)
.saturating_mul(8)
.saturating_mul(4),
"Adaptive relay tier transition"
);
}
prev_c2s_sample = c2s_total;
prev_s2c_requested_sample = s2c_requested_total;
prev_s2c_written_sample = s2c_written_total;
prev_s2c_write_ops_sample = s2c_write_ops_total;
prev_s2c_partial_sample = s2c_partial_total;
accumulated_log = accumulated_log.saturating_add(ADAPTIVE_TICK);
if accumulated_log < WATCHDOG_INTERVAL {
continue;
}
accumulated_log = Duration::ZERO;
// ── Periodic rate logging ───────────────────────────────
let c2s = wd_counters.c2s_bytes.load(Ordering::Relaxed);
let s2c = wd_counters.s2c_bytes.load(Ordering::Relaxed);
let c2s_delta = c2s - prev_c2s;
let s2c_delta = s2c - prev_s2c;
let c2s_delta = c2s.saturating_sub(prev_c2s_log);
let s2c_delta = s2c.saturating_sub(prev_s2c_log);
if c2s_delta > 0 || s2c_delta > 0 {
let secs = WATCHDOG_INTERVAL.as_secs_f64();
@@ -388,8 +521,8 @@ where
);
}
prev_c2s = c2s;
prev_s2c = s2c;
prev_c2s_log = c2s;
prev_s2c_log = s2c;
}
};
@@ -424,6 +557,7 @@ where
let c2s_ops = counters.c2s_ops.load(Ordering::Relaxed);
let s2c_ops = counters.s2c_ops.load(Ordering::Relaxed);
let duration = epoch.elapsed();
adaptive_buffers::record_user_tier(&user_owned, seed_tier);
match copy_result {
Some(Ok((c2s, s2c))) => {
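
Note on the watchdog hunk above: it reads monotonically increasing counters once per adaptive tick, converts them to per-tick deltas with `saturating_sub`, and only emits the rate log once enough ticks have accumulated. A minimal sketch of that pattern follows. The helper names `take_delta` and `log_due` are hypothetical (neither appears in the diff), and the tick and interval values in the usage note are assumptions, since the diff does not show the constants.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Hypothetical helper: returns the growth of `counter` since the previous
/// sample and stores the new total in `prev`.
/// `saturating_sub` yields 0 instead of underflowing if the counter is ever
/// observed below the previous sample.
fn take_delta(counter: &AtomicU64, prev: &mut u64) -> u64 {
    let total = counter.load(Ordering::Relaxed);
    let delta = total.saturating_sub(*prev);
    *prev = total;
    delta
}

/// Hypothetical helper: adds one tick to `accumulated` and reports whether
/// the logging interval has been reached. The accumulator is reset when it
/// fires, mirroring the `accumulated_log` handling in the hunk.
fn log_due(accumulated: &mut Duration, tick: Duration, interval: Duration) -> bool {
    *accumulated = accumulated.saturating_add(tick);
    if *accumulated < interval {
        return false;
    }
    *accumulated = Duration::ZERO;
    true
}
```

With an assumed 250 ms tick and a 1 s interval, `log_due` returns `true` on every fourth call, so sampling can run at the faster cadence while logging keeps the slower one.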

@@ -0,0 +1,46 @@
/// Session eviction is intentionally disabled in runtime.
///
/// The initial `user+dc` single-lease model caused valid parallel client
/// connections to evict each other. Keep the API shape for compatibility,
/// but make it a no-op until a safer policy is introduced.
#[derive(Debug, Clone, Default)]
pub struct SessionLease;
impl SessionLease {
pub fn is_stale(&self) -> bool {
false
}
#[allow(dead_code)]
pub fn release(&self) {}
}
pub struct RegistrationResult {
pub lease: SessionLease,
pub replaced_existing: bool,
}
pub fn register_session(_user: &str, _dc_idx: i16) -> RegistrationResult {
RegistrationResult {
lease: SessionLease,
replaced_existing: false,
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_session_eviction_disabled_behavior() {
let first = register_session("alice", 2);
let second = register_session("alice", 2);
assert!(!first.replaced_existing);
assert!(!second.replaced_existing);
assert!(!first.lease.is_stale());
assert!(!second.lease.is_stale());
first.lease.release();
second.lease.release();
}
}

@@ -120,6 +120,8 @@ pub struct Stats {
pool_swap_total: AtomicU64,
pool_drain_active: AtomicU64,
pool_force_close_total: AtomicU64,
pool_drain_soft_evict_total: AtomicU64,
pool_drain_soft_evict_writer_total: AtomicU64,
pool_stale_pick_total: AtomicU64,
me_writer_removed_total: AtomicU64,
me_writer_removed_unexpected_total: AtomicU64,
@@ -133,6 +135,11 @@ pub struct Stats {
me_inline_recovery_total: AtomicU64,
ip_reservation_rollback_tcp_limit_total: AtomicU64,
ip_reservation_rollback_quota_limit_total: AtomicU64,
relay_adaptive_promotions_total: AtomicU64,
relay_adaptive_demotions_total: AtomicU64,
relay_adaptive_hard_promotions_total: AtomicU64,
reconnect_evict_total: AtomicU64,
reconnect_stale_close_total: AtomicU64,
telemetry_core_enabled: AtomicBool,
telemetry_user_enabled: AtomicBool,
telemetry_me_level: AtomicU8,
@@ -285,6 +292,36 @@ impl Stats {
pub fn decrement_current_connections_me(&self) {
Self::decrement_atomic_saturating(&self.current_connections_me);
}
pub fn increment_relay_adaptive_promotions_total(&self) {
if self.telemetry_core_enabled() {
self.relay_adaptive_promotions_total
.fetch_add(1, Ordering::Relaxed);
}
}
pub fn increment_relay_adaptive_demotions_total(&self) {
if self.telemetry_core_enabled() {
self.relay_adaptive_demotions_total
.fetch_add(1, Ordering::Relaxed);
}
}
pub fn increment_relay_adaptive_hard_promotions_total(&self) {
if self.telemetry_core_enabled() {
self.relay_adaptive_hard_promotions_total
.fetch_add(1, Ordering::Relaxed);
}
}
pub fn increment_reconnect_evict_total(&self) {
if self.telemetry_core_enabled() {
self.reconnect_evict_total
.fetch_add(1, Ordering::Relaxed);
}
}
pub fn increment_reconnect_stale_close_total(&self) {
if self.telemetry_core_enabled() {
self.reconnect_stale_close_total
.fetch_add(1, Ordering::Relaxed);
}
}
pub fn increment_handshake_timeouts(&self) {
if self.telemetry_core_enabled() {
self.handshake_timeouts.fetch_add(1, Ordering::Relaxed);
@@ -680,6 +717,18 @@ impl Stats {
self.pool_force_close_total.fetch_add(1, Ordering::Relaxed);
}
}
pub fn increment_pool_drain_soft_evict_total(&self) {
if self.telemetry_me_allows_normal() {
self.pool_drain_soft_evict_total
.fetch_add(1, Ordering::Relaxed);
}
}
pub fn increment_pool_drain_soft_evict_writer_total(&self) {
if self.telemetry_me_allows_normal() {
self.pool_drain_soft_evict_writer_total
.fetch_add(1, Ordering::Relaxed);
}
}
pub fn increment_pool_stale_pick_total(&self) {
if self.telemetry_me_allows_normal() {
self.pool_stale_pick_total.fetch_add(1, Ordering::Relaxed);
@@ -933,6 +982,22 @@ impl Stats {
self.get_current_connections_direct()
.saturating_add(self.get_current_connections_me())
}
pub fn get_relay_adaptive_promotions_total(&self) -> u64 {
self.relay_adaptive_promotions_total.load(Ordering::Relaxed)
}
pub fn get_relay_adaptive_demotions_total(&self) -> u64 {
self.relay_adaptive_demotions_total.load(Ordering::Relaxed)
}
pub fn get_relay_adaptive_hard_promotions_total(&self) -> u64 {
self.relay_adaptive_hard_promotions_total
.load(Ordering::Relaxed)
}
pub fn get_reconnect_evict_total(&self) -> u64 {
self.reconnect_evict_total.load(Ordering::Relaxed)
}
pub fn get_reconnect_stale_close_total(&self) -> u64 {
self.reconnect_stale_close_total.load(Ordering::Relaxed)
}
pub fn get_me_keepalive_sent(&self) -> u64 { self.me_keepalive_sent.load(Ordering::Relaxed) }
pub fn get_me_keepalive_failed(&self) -> u64 { self.me_keepalive_failed.load(Ordering::Relaxed) }
pub fn get_me_keepalive_pong(&self) -> u64 { self.me_keepalive_pong.load(Ordering::Relaxed) }
@@ -1185,6 +1250,12 @@ impl Stats {
pub fn get_pool_force_close_total(&self) -> u64 {
self.pool_force_close_total.load(Ordering::Relaxed)
}
pub fn get_pool_drain_soft_evict_total(&self) -> u64 {
self.pool_drain_soft_evict_total.load(Ordering::Relaxed)
}
pub fn get_pool_drain_soft_evict_writer_total(&self) -> u64 {
self.pool_drain_soft_evict_writer_total.load(Ordering::Relaxed)
}
pub fn get_pool_stale_pick_total(&self) -> u64 {
self.pool_stale_pick_total.load(Ordering::Relaxed)
}
@@ -1256,35 +1327,11 @@ impl Stats {
Self::touch_user_stats(stats.value());
stats.curr_connects.fetch_add(1, Ordering::Relaxed);
}
pub fn try_acquire_user_curr_connects(&self, user: &str, limit: Option<u64>) -> bool {
if !self.telemetry_user_enabled() {
return true;
}
self.maybe_cleanup_user_stats();
let stats = self.user_stats.entry(user.to_string()).or_default();
Self::touch_user_stats(stats.value());
let counter = &stats.curr_connects;
let mut current = counter.load(Ordering::Relaxed);
loop {
if let Some(max) = limit && current >= max {
return false;
}
match counter.compare_exchange_weak(
current,
current.saturating_add(1),
Ordering::Relaxed,
Ordering::Relaxed,
) {
Ok(_) => return true,
Err(actual) => current = actual,
}
}
}
pub fn decrement_user_curr_connects(&self, user: &str) {
if !self.telemetry_user_enabled() {
return;
}
self.maybe_cleanup_user_stats();
if let Some(stats) = self.user_stats.get(user) {
Self::touch_user_stats(stats.value());
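
Note on the hunk above: the removed `try_acquire_user_curr_connects` enforced an optional per-user cap with a compare-exchange loop. The sketch below isolates that loop over a bare `AtomicU64`. The free function `try_acquire` is a hypothetical name, and the telemetry check and per-user map lookup from the original are left out.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical standalone version of the capped increment.
/// Returns `false` without modifying the counter when `limit` is reached;
/// otherwise increments by one and returns `true`.
fn try_acquire(counter: &AtomicU64, limit: Option<u64>) -> bool {
    let mut current = counter.load(Ordering::Relaxed);
    loop {
        if let Some(max) = limit {
            if current >= max {
                return false;
            }
        }
        // `compare_exchange_weak` may fail spuriously or because another
        // thread changed the value; in both cases retry with the value
        // that was actually observed.
        match counter.compare_exchange_weak(
            current,
            current.saturating_add(1),
            Ordering::Relaxed,
            Ordering::Relaxed,
        ) {
            Ok(_) => return true,
            Err(actual) => current = actual,
        }
    }
}
```

Because the limit check and the increment happen against the same observed value, two racing callers cannot both pass the check and push the counter above the cap, which a plain load followed by `fetch_add` would allow.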

@@ -14,8 +14,7 @@ use std::sync::Arc;
// ============= Configuration =============
/// Default buffer size
/// CHANGED: Reduced from 64KB to 16KB to match TLS record size and prevent bufferbloat.
pub const DEFAULT_BUFFER_SIZE: usize = 16 * 1024;
pub const DEFAULT_BUFFER_SIZE: usize = 64 * 1024;
/// Default maximum number of pooled buffers
pub const DEFAULT_MAX_BUFFERS: usize = 1024;

@@ -513,7 +513,6 @@ impl FrameCodecTrait for SecureCodec {
#[cfg(test)]
mod tests {
use super::*;
use std::collections::HashSet;
use tokio_util::codec::{FramedRead, FramedWrite};
use tokio::io::duplex;
use futures::{SinkExt, StreamExt};
@@ -631,31 +630,4 @@ mod tests {
let result = codec.decode(&mut buf);
assert!(result.is_err());
}
#[test]
fn secure_codec_always_adds_padding_and_jitters_wire_length() {
let codec = SecureCodec::new(Arc::new(SecureRandom::new()));
let payload = Bytes::from_static(&[1, 2, 3, 4, 5, 6, 7, 8]);
let mut wire_lens = HashSet::new();
for _ in 0..64 {
let frame = Frame::new(payload.clone());
let mut out = BytesMut::new();
codec.encode(&frame, &mut out).unwrap();
assert!(out.len() >= 4 + payload.len() + 1);
let wire_len = u32::from_le_bytes([out[0], out[1], out[2], out[3]]) as usize;
assert!(
(payload.len() + 1..=payload.len() + 3).contains(&wire_len),
"Secure wire length must be payload+1..3, got {wire_len}"
);
assert_ne!(wire_len % 4, 0, "Secure wire length must be non-4-aligned");
wire_lens.insert(wire_len);
}
assert!(
wire_lens.len() >= 2,
"Secure padding should create observable wire-length jitter"
);
}
}

@@ -299,6 +299,11 @@ async fn run_update_cycle(
cfg.general.hardswap,
cfg.general.me_pool_drain_ttl_secs,
cfg.general.me_pool_drain_threshold,
cfg.general.me_pool_drain_soft_evict_enabled,
cfg.general.me_pool_drain_soft_evict_grace_secs,
cfg.general.me_pool_drain_soft_evict_per_writer,
cfg.general.me_pool_drain_soft_evict_budget_per_core,
cfg.general.me_pool_drain_soft_evict_cooldown_ms,
cfg.general.effective_me_pool_force_close_secs(),
cfg.general.me_pool_min_fresh_ratio,
cfg.general.me_hardswap_warmup_delay_min_ms,
@@ -526,6 +531,11 @@ pub async fn me_config_updater(
cfg.general.hardswap,
cfg.general.me_pool_drain_ttl_secs,
cfg.general.me_pool_drain_threshold,
cfg.general.me_pool_drain_soft_evict_enabled,
cfg.general.me_pool_drain_soft_evict_grace_secs,
cfg.general.me_pool_drain_soft_evict_per_writer,
cfg.general.me_pool_drain_soft_evict_budget_per_core,
cfg.general.me_pool_drain_soft_evict_cooldown_ms,
cfg.general.effective_me_pool_force_close_secs(),
cfg.general.me_pool_min_fresh_ratio,
cfg.general.me_hardswap_warmup_delay_min_ms,

@@ -25,6 +25,11 @@ const HEALTH_RECONNECT_BUDGET_PER_CORE: usize = 2;
const HEALTH_RECONNECT_BUDGET_PER_DC: usize = 1;
const HEALTH_RECONNECT_BUDGET_MIN: usize = 4;
const HEALTH_RECONNECT_BUDGET_MAX: usize = 128;
const HEALTH_DRAIN_CLOSE_BUDGET_PER_CORE: usize = 16;
const HEALTH_DRAIN_CLOSE_BUDGET_MIN: usize = 16;
const HEALTH_DRAIN_CLOSE_BUDGET_MAX: usize = 256;
const HEALTH_DRAIN_SOFT_EVICT_BUDGET_MIN: usize = 8;
const HEALTH_DRAIN_SOFT_EVICT_BUDGET_MAX: usize = 256;
#[derive(Debug, Clone)]
struct DcFloorPlanEntry {
@@ -63,6 +68,7 @@ pub async fn me_health_monitor(pool: Arc<MePool>, rng: Arc<SecureRandom>, _min_c
let mut adaptive_recover_until: HashMap<(i32, IpFamily), Instant> = HashMap::new();
let mut floor_warn_next_allowed: HashMap<(i32, IpFamily), Instant> = HashMap::new();
let mut drain_warn_next_allowed: HashMap<u64, Instant> = HashMap::new();
let mut drain_soft_evict_next_allowed: HashMap<u64, Instant> = HashMap::new();
let mut degraded_interval = true;
loop {
let interval = if degraded_interval {
@@ -72,7 +78,12 @@ pub async fn me_health_monitor(pool: Arc<MePool>, rng: Arc<SecureRandom>, _min_c
};
tokio::time::sleep(interval).await;
pool.prune_closed_writers().await;
reap_draining_writers(&pool, &mut drain_warn_next_allowed).await;
reap_draining_writers(
&pool,
&mut drain_warn_next_allowed,
&mut drain_soft_evict_next_allowed,
)
.await;
let v4_degraded = check_family(
IpFamily::V4,
&pool,
@@ -111,9 +122,10 @@ pub async fn me_health_monitor(pool: Arc<MePool>, rng: Arc<SecureRandom>, _min_c
}
}
async fn reap_draining_writers(
pub(super) async fn reap_draining_writers(
pool: &Arc<MePool>,
warn_next_allowed: &mut HashMap<u64, Instant>,
soft_evict_next_allowed: &mut HashMap<u64, Instant>,
) {
let now_epoch_secs = MePool::now_epoch_secs();
let now = Instant::now();
@@ -122,14 +134,22 @@ async fn reap_draining_writers(
.me_pool_drain_threshold
.load(std::sync::atomic::Ordering::Relaxed);
let writers = pool.writers.read().await.clone();
let activity = pool.registry.writer_activity_snapshot().await;
let mut draining_writers = Vec::new();
let mut empty_writer_ids = Vec::<u64>::new();
let mut force_close_writer_ids = Vec::<u64>::new();
for writer in writers {
if !writer.draining.load(std::sync::atomic::Ordering::Relaxed) {
continue;
}
let is_empty = pool.registry.is_writer_empty(writer.id).await;
if is_empty {
pool.remove_writer_and_close_clients(writer.id).await;
if activity
.bound_clients_by_writer
.get(&writer.id)
.copied()
.unwrap_or(0)
== 0
{
empty_writer_ids.push(writer.id);
continue;
}
draining_writers.push(writer);
@@ -156,12 +176,13 @@ async fn reap_draining_writers(
"ME draining writer threshold exceeded, force-closing oldest draining writers"
);
for writer in draining_writers.drain(..overflow) {
pool.stats.increment_pool_force_close_total();
pool.remove_writer_and_close_clients(writer.id).await;
force_close_writer_ids.push(writer.id);
}
}
for writer in draining_writers {
let mut active_draining_writer_ids = HashSet::with_capacity(draining_writers.len());
for writer in &draining_writers {
active_draining_writer_ids.insert(writer.id);
let drain_started_at_epoch_secs = writer
.draining_started_at_epoch_secs
.load(std::sync::atomic::Ordering::Relaxed);
@@ -191,10 +212,152 @@ async fn reap_draining_writers(
.load(std::sync::atomic::Ordering::Relaxed);
if deadline_epoch_secs != 0 && now_epoch_secs >= deadline_epoch_secs {
warn!(writer_id = writer.id, "Drain timeout, force-closing");
pool.stats.increment_pool_force_close_total();
pool.remove_writer_and_close_clients(writer.id).await;
force_close_writer_ids.push(writer.id);
active_draining_writer_ids.remove(&writer.id);
}
}
warn_next_allowed.retain(|writer_id, _| active_draining_writer_ids.contains(writer_id));
soft_evict_next_allowed.retain(|writer_id, _| active_draining_writer_ids.contains(writer_id));
if pool.drain_soft_evict_enabled() && drain_ttl_secs > 0 && !draining_writers.is_empty() {
let mut force_close_ids = HashSet::<u64>::with_capacity(force_close_writer_ids.len());
for writer_id in &force_close_writer_ids {
force_close_ids.insert(*writer_id);
}
let soft_grace_secs = pool.drain_soft_evict_grace_secs();
let soft_trigger_age_secs = drain_ttl_secs.saturating_add(soft_grace_secs);
let per_writer_limit = pool.drain_soft_evict_per_writer();
let soft_budget = health_drain_soft_evict_budget(pool);
let soft_cooldown = pool.drain_soft_evict_cooldown();
let mut soft_evicted_total = 0usize;
for writer in &draining_writers {
if soft_evicted_total >= soft_budget {
break;
}
if force_close_ids.contains(&writer.id) {
continue;
}
if pool.writer_accepts_new_binding(writer) {
continue;
}
let started_epoch_secs = writer
.draining_started_at_epoch_secs
.load(std::sync::atomic::Ordering::Relaxed);
if started_epoch_secs == 0
|| now_epoch_secs.saturating_sub(started_epoch_secs) < soft_trigger_age_secs
{
continue;
}
if !should_emit_writer_warn(
soft_evict_next_allowed,
writer.id,
now,
soft_cooldown,
) {
continue;
}
let remaining_budget = soft_budget.saturating_sub(soft_evicted_total);
let limit = per_writer_limit.min(remaining_budget);
if limit == 0 {
break;
}
let conn_ids = pool
.registry
.bound_conn_ids_for_writer_limited(writer.id, limit)
.await;
if conn_ids.is_empty() {
continue;
}
let mut evicted_for_writer = 0usize;
for conn_id in conn_ids {
if pool.registry.evict_bound_conn_if_writer(conn_id, writer.id).await {
evicted_for_writer = evicted_for_writer.saturating_add(1);
soft_evicted_total = soft_evicted_total.saturating_add(1);
pool.stats.increment_pool_drain_soft_evict_total();
if soft_evicted_total >= soft_budget {
break;
}
}
}
if evicted_for_writer > 0 {
pool.stats.increment_pool_drain_soft_evict_writer_total();
info!(
writer_id = writer.id,
writer_dc = writer.writer_dc,
endpoint = %writer.addr,
drained_connections = evicted_for_writer,
soft_budget,
soft_trigger_age_secs,
"ME draining writer soft-evicted bound clients"
);
}
}
}
let close_budget = health_drain_close_budget();
let requested_force_close = force_close_writer_ids.len();
let requested_empty_close = empty_writer_ids.len();
let requested_close_total = requested_force_close.saturating_add(requested_empty_close);
let mut closed_writer_ids = HashSet::<u64>::new();
let mut closed_total = 0usize;
for writer_id in force_close_writer_ids {
if closed_total >= close_budget {
break;
}
if !closed_writer_ids.insert(writer_id) {
continue;
}
pool.stats.increment_pool_force_close_total();
pool.remove_writer_and_close_clients(writer_id).await;
closed_total = closed_total.saturating_add(1);
}
for writer_id in empty_writer_ids {
if closed_total >= close_budget {
break;
}
if !closed_writer_ids.insert(writer_id) {
continue;
}
pool.remove_writer_and_close_clients(writer_id).await;
closed_total = closed_total.saturating_add(1);
}
let pending_close_total = requested_close_total.saturating_sub(closed_total);
if pending_close_total > 0 {
warn!(
close_budget,
closed_total,
pending_close_total,
"ME draining close backlog deferred to next health cycle"
);
}
}
pub(super) fn health_drain_close_budget() -> usize {
let cpu_cores = std::thread::available_parallelism()
.map(std::num::NonZeroUsize::get)
.unwrap_or(1);
cpu_cores
.saturating_mul(HEALTH_DRAIN_CLOSE_BUDGET_PER_CORE)
.clamp(HEALTH_DRAIN_CLOSE_BUDGET_MIN, HEALTH_DRAIN_CLOSE_BUDGET_MAX)
}
pub(super) fn health_drain_soft_evict_budget(pool: &MePool) -> usize {
let cpu_cores = std::thread::available_parallelism()
.map(std::num::NonZeroUsize::get)
.unwrap_or(1);
let per_core = pool.drain_soft_evict_budget_per_core();
cpu_cores
.saturating_mul(per_core)
.clamp(
HEALTH_DRAIN_SOFT_EVICT_BUDGET_MIN,
HEALTH_DRAIN_SOFT_EVICT_BUDGET_MAX,
)
}
fn should_emit_writer_warn(
@@ -1382,6 +1545,11 @@ mod tests {
general.hardswap,
general.me_pool_drain_ttl_secs,
general.me_pool_drain_threshold,
general.me_pool_drain_soft_evict_enabled,
general.me_pool_drain_soft_evict_grace_secs,
general.me_pool_drain_soft_evict_per_writer,
general.me_pool_drain_soft_evict_budget_per_core,
general.me_pool_drain_soft_evict_cooldown_ms,
general.effective_me_pool_force_close_secs(),
general.me_pool_min_fresh_ratio,
general.me_hardswap_warmup_delay_min_ms,
@@ -1463,8 +1631,9 @@ mod tests {
let conn_b = insert_draining_writer(&pool, 20, now_epoch_secs.saturating_sub(20)).await;
let conn_c = insert_draining_writer(&pool, 30, now_epoch_secs.saturating_sub(10)).await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed).await;
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
let writer_ids: Vec<u64> = pool.writers.read().await.iter().map(|writer| writer.id).collect();
assert_eq!(writer_ids, vec![20, 30]);
@@ -1481,8 +1650,9 @@ mod tests {
let conn_b = insert_draining_writer(&pool, 20, now_epoch_secs.saturating_sub(20)).await;
let conn_c = insert_draining_writer(&pool, 30, now_epoch_secs.saturating_sub(10)).await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed).await;
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
let writer_ids: Vec<u64> = pool.writers.read().await.iter().map(|writer| writer.id).collect();
assert_eq!(writer_ids, vec![10, 20, 30]);
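
Note on the health-monitor hunks above: both drain budgets scale with the core count and are then clamped to a fixed range. The sketch below isolates that arithmetic. `scaled_budget` is a hypothetical helper that takes the core count as a parameter instead of calling `std::thread::available_parallelism`, so the result does not depend on the machine it runs on.

```rust
/// Hypothetical helper: per-core budget scaled by the core count, then
/// clamped to `[min, max]`. `saturating_mul` avoids overflow for very large
/// configured per-core values. `clamp` panics if `min > max`, so callers
/// must pass a valid range.
fn scaled_budget(cpu_cores: usize, per_core: usize, min: usize, max: usize) -> usize {
    cpu_cores.saturating_mul(per_core).clamp(min, max)
}
```

With the close-budget constants from the diff (16 per core, bounds 16 and 256), 4 cores give a budget of 64 and anything from 16 cores upward is capped at 256. With the soft-evict bounds (8 and 256), a small configured per-core value on a single core is raised to the floor of 8.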

@@ -0,0 +1,450 @@
use std::collections::HashMap;
use std::net::{IpAddr, Ipv4Addr, SocketAddr};
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, AtomicU8, AtomicU32, AtomicU64, Ordering};
use std::time::{Duration, Instant};
use tokio::sync::mpsc;
use tokio_util::sync::CancellationToken;
use super::codec::WriterCommand;
use super::health::{health_drain_close_budget, reap_draining_writers};
use super::pool::{MePool, MeWriter, WriterContour};
use super::registry::ConnMeta;
use super::me_health_monitor;
use crate::config::{GeneralConfig, MeRouteNoWriterMode, MeSocksKdfPolicy, MeWriterPickMode};
use crate::crypto::SecureRandom;
use crate::network::probe::NetworkDecision;
use crate::stats::Stats;
async fn make_pool(
me_pool_drain_threshold: u64,
me_health_interval_ms_unhealthy: u64,
me_health_interval_ms_healthy: u64,
) -> (Arc<MePool>, Arc<SecureRandom>) {
let general = GeneralConfig {
me_pool_drain_threshold,
me_health_interval_ms_unhealthy,
me_health_interval_ms_healthy,
..GeneralConfig::default()
};
let rng = Arc::new(SecureRandom::new());
let pool = MePool::new(
None,
vec![1u8; 32],
None,
false,
None,
Vec::new(),
1,
None,
12,
1200,
HashMap::new(),
HashMap::new(),
None,
NetworkDecision::default(),
None,
rng.clone(),
Arc::new(Stats::default()),
general.me_keepalive_enabled,
general.me_keepalive_interval_secs,
general.me_keepalive_jitter_secs,
general.me_keepalive_payload_random,
general.rpc_proxy_req_every,
general.me_warmup_stagger_enabled,
general.me_warmup_step_delay_ms,
general.me_warmup_step_jitter_ms,
general.me_reconnect_max_concurrent_per_dc,
general.me_reconnect_backoff_base_ms,
general.me_reconnect_backoff_cap_ms,
general.me_reconnect_fast_retry_count,
general.me_single_endpoint_shadow_writers,
general.me_single_endpoint_outage_mode_enabled,
general.me_single_endpoint_outage_disable_quarantine,
general.me_single_endpoint_outage_backoff_min_ms,
general.me_single_endpoint_outage_backoff_max_ms,
general.me_single_endpoint_shadow_rotate_every_secs,
general.me_floor_mode,
general.me_adaptive_floor_idle_secs,
general.me_adaptive_floor_min_writers_single_endpoint,
general.me_adaptive_floor_min_writers_multi_endpoint,
general.me_adaptive_floor_recover_grace_secs,
general.me_adaptive_floor_writers_per_core_total,
general.me_adaptive_floor_cpu_cores_override,
general.me_adaptive_floor_max_extra_writers_single_per_core,
general.me_adaptive_floor_max_extra_writers_multi_per_core,
general.me_adaptive_floor_max_active_writers_per_core,
general.me_adaptive_floor_max_warm_writers_per_core,
general.me_adaptive_floor_max_active_writers_global,
general.me_adaptive_floor_max_warm_writers_global,
general.hardswap,
general.me_pool_drain_ttl_secs,
general.me_pool_drain_threshold,
general.me_pool_drain_soft_evict_enabled,
general.me_pool_drain_soft_evict_grace_secs,
general.me_pool_drain_soft_evict_per_writer,
general.me_pool_drain_soft_evict_budget_per_core,
general.me_pool_drain_soft_evict_cooldown_ms,
general.effective_me_pool_force_close_secs(),
general.me_pool_min_fresh_ratio,
general.me_hardswap_warmup_delay_min_ms,
general.me_hardswap_warmup_delay_max_ms,
general.me_hardswap_warmup_extra_passes,
general.me_hardswap_warmup_pass_backoff_base_ms,
general.me_bind_stale_mode,
general.me_bind_stale_ttl_secs,
general.me_secret_atomic_snapshot,
general.me_deterministic_writer_sort,
MeWriterPickMode::default(),
general.me_writer_pick_sample_size,
MeSocksKdfPolicy::default(),
general.me_writer_cmd_channel_capacity,
general.me_route_channel_capacity,
general.me_route_backpressure_base_timeout_ms,
general.me_route_backpressure_high_timeout_ms,
general.me_route_backpressure_high_watermark_pct,
general.me_reader_route_data_wait_ms,
general.me_health_interval_ms_unhealthy,
general.me_health_interval_ms_healthy,
general.me_warn_rate_limit_ms,
MeRouteNoWriterMode::default(),
general.me_route_no_writer_wait_ms,
general.me_route_inline_recovery_attempts,
general.me_route_inline_recovery_wait_ms,
);
(pool, rng)
}
async fn insert_draining_writer(
pool: &Arc<MePool>,
writer_id: u64,
drain_started_at_epoch_secs: u64,
bound_clients: usize,
drain_deadline_epoch_secs: u64,
) {
let (tx, _writer_rx) = mpsc::channel::<WriterCommand>(8);
let writer = MeWriter {
id: writer_id,
addr: SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 6000 + writer_id as u16),
source_ip: IpAddr::V4(Ipv4Addr::LOCALHOST),
writer_dc: 2,
generation: 1,
contour: Arc::new(AtomicU8::new(WriterContour::Draining.as_u8())),
created_at: Instant::now() - Duration::from_secs(writer_id),
tx: tx.clone(),
cancel: CancellationToken::new(),
degraded: Arc::new(AtomicBool::new(false)),
rtt_ema_ms_x10: Arc::new(AtomicU32::new(0)),
draining: Arc::new(AtomicBool::new(true)),
draining_started_at_epoch_secs: Arc::new(AtomicU64::new(drain_started_at_epoch_secs)),
drain_deadline_epoch_secs: Arc::new(AtomicU64::new(drain_deadline_epoch_secs)),
allow_drain_fallback: Arc::new(AtomicBool::new(false)),
};
pool.writers.write().await.push(writer);
pool.registry.register_writer(writer_id, tx).await;
pool.conn_count.fetch_add(1, Ordering::Relaxed);
for idx in 0..bound_clients {
let (conn_id, _rx) = pool.registry.register().await;
assert!(
pool.registry
.bind_writer(
conn_id,
writer_id,
ConnMeta {
target_dc: 2,
client_addr: SocketAddr::new(
IpAddr::V4(Ipv4Addr::LOCALHOST),
8000 + idx as u16,
),
our_addr: SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 443),
proto_flags: 0,
},
)
.await
);
}
}
async fn writer_count(pool: &Arc<MePool>) -> usize {
pool.writers.read().await.len()
}
async fn sorted_writer_ids(pool: &Arc<MePool>) -> Vec<u64> {
let mut ids = pool
.writers
.read()
.await
.iter()
.map(|writer| writer.id)
.collect::<Vec<_>>();
ids.sort_unstable();
ids
}
#[tokio::test]
async fn reap_draining_writers_clears_warn_state_when_pool_empty() {
let (pool, _rng) = make_pool(128, 1, 1).await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
warn_next_allowed.insert(11, Instant::now() + Duration::from_secs(5));
warn_next_allowed.insert(22, Instant::now() + Duration::from_secs(5));
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert!(warn_next_allowed.is_empty());
}
#[tokio::test]
async fn reap_draining_writers_respects_threshold_across_multiple_overflow_cycles() {
let threshold = 3u64;
let (pool, _rng) = make_pool(threshold, 1, 1).await;
pool.me_pool_drain_soft_evict_enabled
.store(false, Ordering::Relaxed);
let now_epoch_secs = MePool::now_epoch_secs();
for writer_id in 1..=60u64 {
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(600).saturating_add(writer_id),
1,
0,
)
.await;
}
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
for _ in 0..64 {
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
if writer_count(&pool).await <= threshold as usize {
break;
}
}
assert_eq!(writer_count(&pool).await, threshold as usize);
assert_eq!(sorted_writer_ids(&pool).await, vec![58, 59, 60]);
}
#[tokio::test]
async fn reap_draining_writers_handles_large_empty_writer_population() {
let (pool, _rng) = make_pool(128, 1, 1).await;
let now_epoch_secs = MePool::now_epoch_secs();
let total = health_drain_close_budget().saturating_mul(3).saturating_add(27);
for writer_id in 1..=total as u64 {
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(120),
0,
0,
)
.await;
}
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
for _ in 0..24 {
if writer_count(&pool).await == 0 {
break;
}
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
}
assert_eq!(writer_count(&pool).await, 0);
}
#[tokio::test]
async fn reap_draining_writers_processes_mass_deadline_expiry_without_unbounded_growth() {
let (pool, _rng) = make_pool(128, 1, 1).await;
let now_epoch_secs = MePool::now_epoch_secs();
let total = health_drain_close_budget().saturating_mul(4).saturating_add(31);
for writer_id in 1..=total as u64 {
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(180),
1,
now_epoch_secs.saturating_sub(1),
)
.await;
}
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
for _ in 0..40 {
if writer_count(&pool).await == 0 {
break;
}
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
}
assert_eq!(writer_count(&pool).await, 0);
}
#[tokio::test]
async fn reap_draining_writers_maintains_warn_state_subset_property_under_bulk_churn() {
let (pool, _rng) = make_pool(128, 1, 1).await;
let now_epoch_secs = MePool::now_epoch_secs();
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
for wave in 0..40u64 {
for offset in 0..8u64 {
insert_draining_writer(
&pool,
wave * 100 + offset,
now_epoch_secs.saturating_sub(400 + offset),
1,
0,
)
.await;
}
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert!(warn_next_allowed.len() <= writer_count(&pool).await);
let ids = sorted_writer_ids(&pool).await;
for writer_id in ids.into_iter().take(3) {
let _ = pool.remove_writer_and_close_clients(writer_id).await;
}
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert!(warn_next_allowed.len() <= writer_count(&pool).await);
}
}
#[tokio::test]
async fn reap_draining_writers_budgeted_cleanup_never_increases_pool_size() {
let (pool, _rng) = make_pool(5, 1, 1).await;
let now_epoch_secs = MePool::now_epoch_secs();
for writer_id in 1..=200u64 {
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(240).saturating_add(writer_id),
1,
0,
)
.await;
}
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
let mut previous = writer_count(&pool).await;
for _ in 0..32 {
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
let current = writer_count(&pool).await;
assert!(current <= previous);
previous = current;
}
}
#[tokio::test]
async fn me_health_monitor_converges_to_threshold_under_live_injection_churn() {
let threshold = 7u64;
let (pool, rng) = make_pool(threshold, 1, 1).await;
let now_epoch_secs = MePool::now_epoch_secs();
for writer_id in 1..=40u64 {
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(300).saturating_add(writer_id),
1,
0,
)
.await;
}
let monitor = tokio::spawn(me_health_monitor(pool.clone(), rng, 0));
for wave in 0..8u64 {
for offset in 0..10u64 {
insert_draining_writer(
&pool,
1000 + wave * 100 + offset,
now_epoch_secs.saturating_sub(120).saturating_add(offset),
1,
0,
)
.await;
}
tokio::time::sleep(Duration::from_millis(5)).await;
}
tokio::time::sleep(Duration::from_millis(120)).await;
monitor.abort();
let _ = monitor.await;
assert!(writer_count(&pool).await <= threshold as usize);
}
#[tokio::test]
async fn me_health_monitor_drains_deadline_storm_with_budgeted_progress() {
let (pool, rng) = make_pool(128, 1, 1).await;
let now_epoch_secs = MePool::now_epoch_secs();
for writer_id in 1..=220u64 {
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(120),
1,
now_epoch_secs.saturating_sub(1),
)
.await;
}
let monitor = tokio::spawn(me_health_monitor(pool.clone(), rng, 0));
tokio::time::sleep(Duration::from_millis(120)).await;
monitor.abort();
let _ = monitor.await;
assert_eq!(writer_count(&pool).await, 0);
}
#[tokio::test]
async fn me_health_monitor_eliminates_mixed_empty_and_deadline_backlog() {
let threshold = 12u64;
let (pool, rng) = make_pool(threshold, 1, 1).await;
let now_epoch_secs = MePool::now_epoch_secs();
for writer_id in 1..=180u64 {
let bound_clients = if writer_id % 3 == 0 { 0 } else { 1 };
let deadline = if writer_id % 2 == 0 {
now_epoch_secs.saturating_sub(1)
} else {
0
};
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(250).saturating_add(writer_id),
bound_clients,
deadline,
)
.await;
}
let monitor = tokio::spawn(me_health_monitor(pool.clone(), rng, 0));
tokio::time::sleep(Duration::from_millis(140)).await;
monitor.abort();
let _ = monitor.await;
assert!(writer_count(&pool).await <= threshold as usize);
}
#[test]
fn health_drain_close_budget_is_within_expected_bounds() {
let budget = health_drain_close_budget();
assert!((16..=256).contains(&budget));
}
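
Note on the tests above: several of them call `reap_draining_writers` in a loop because each health cycle closes at most `health_drain_close_budget()` writers and defers the rest. The sketch below models that two-phase close on plain ids, as an illustration only. `close_with_budget` is a hypothetical function that returns the ids it would close instead of touching a pool.

```rust
use std::collections::HashSet;

/// Hypothetical model of one budgeted close pass.
/// Force-close ids are handled before empty-writer ids, duplicates are
/// skipped, and at most `budget` ids are closed. Returns the closed ids and
/// the number of requests left for a later cycle. As in the diff, the
/// pending count is `requested - closed`, so a skipped duplicate is also
/// counted as pending.
fn close_with_budget(force: &[u64], empty: &[u64], budget: usize) -> (Vec<u64>, usize) {
    let requested = force.len().saturating_add(empty.len());
    let mut seen = HashSet::new();
    let mut closed = Vec::new();
    for &id in force.iter().chain(empty.iter()) {
        if closed.len() >= budget {
            break;
        }
        if !seen.insert(id) {
            continue;
        }
        closed.push(id);
    }
    let pending = requested.saturating_sub(closed.len());
    (closed, pending)
}
```

Under this model a backlog of `n` distinct ids needs roughly `n / budget` cycles, rounded up, to clear, which is consistent with the tests iterating a bounded number of times before asserting that the pool is empty.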

@@ -0,0 +1,232 @@
use std::collections::HashMap;
use std::net::{IpAddr, Ipv4Addr, SocketAddr};
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, AtomicU8, AtomicU32, AtomicU64, Ordering};
use std::time::{Duration, Instant};
use tokio::sync::mpsc;
use tokio_util::sync::CancellationToken;
use super::codec::WriterCommand;
use super::health::health_drain_close_budget;
use super::pool::{MePool, MeWriter, WriterContour};
use super::registry::ConnMeta;
use super::me_health_monitor;
use crate::config::{GeneralConfig, MeRouteNoWriterMode, MeSocksKdfPolicy, MeWriterPickMode};
use crate::crypto::SecureRandom;
use crate::network::probe::NetworkDecision;
use crate::stats::Stats;
async fn make_pool(
me_pool_drain_threshold: u64,
me_health_interval_ms_unhealthy: u64,
me_health_interval_ms_healthy: u64,
) -> (Arc<MePool>, Arc<SecureRandom>) {
let general = GeneralConfig {
me_pool_drain_threshold,
me_health_interval_ms_unhealthy,
me_health_interval_ms_healthy,
..GeneralConfig::default()
};
let rng = Arc::new(SecureRandom::new());
let pool = MePool::new(
None,
vec![1u8; 32],
None,
false,
None,
Vec::new(),
1,
None,
12,
1200,
HashMap::new(),
HashMap::new(),
None,
NetworkDecision::default(),
None,
rng.clone(),
Arc::new(Stats::default()),
general.me_keepalive_enabled,
general.me_keepalive_interval_secs,
general.me_keepalive_jitter_secs,
general.me_keepalive_payload_random,
general.rpc_proxy_req_every,
general.me_warmup_stagger_enabled,
general.me_warmup_step_delay_ms,
general.me_warmup_step_jitter_ms,
general.me_reconnect_max_concurrent_per_dc,
general.me_reconnect_backoff_base_ms,
general.me_reconnect_backoff_cap_ms,
general.me_reconnect_fast_retry_count,
general.me_single_endpoint_shadow_writers,
general.me_single_endpoint_outage_mode_enabled,
general.me_single_endpoint_outage_disable_quarantine,
general.me_single_endpoint_outage_backoff_min_ms,
general.me_single_endpoint_outage_backoff_max_ms,
general.me_single_endpoint_shadow_rotate_every_secs,
general.me_floor_mode,
general.me_adaptive_floor_idle_secs,
general.me_adaptive_floor_min_writers_single_endpoint,
general.me_adaptive_floor_min_writers_multi_endpoint,
general.me_adaptive_floor_recover_grace_secs,
general.me_adaptive_floor_writers_per_core_total,
general.me_adaptive_floor_cpu_cores_override,
general.me_adaptive_floor_max_extra_writers_single_per_core,
general.me_adaptive_floor_max_extra_writers_multi_per_core,
general.me_adaptive_floor_max_active_writers_per_core,
general.me_adaptive_floor_max_warm_writers_per_core,
general.me_adaptive_floor_max_active_writers_global,
general.me_adaptive_floor_max_warm_writers_global,
general.hardswap,
general.me_pool_drain_ttl_secs,
general.me_pool_drain_threshold,
general.me_pool_drain_soft_evict_enabled,
general.me_pool_drain_soft_evict_grace_secs,
general.me_pool_drain_soft_evict_per_writer,
general.me_pool_drain_soft_evict_budget_per_core,
general.me_pool_drain_soft_evict_cooldown_ms,
general.effective_me_pool_force_close_secs(),
general.me_pool_min_fresh_ratio,
general.me_hardswap_warmup_delay_min_ms,
general.me_hardswap_warmup_delay_max_ms,
general.me_hardswap_warmup_extra_passes,
general.me_hardswap_warmup_pass_backoff_base_ms,
general.me_bind_stale_mode,
general.me_bind_stale_ttl_secs,
general.me_secret_atomic_snapshot,
general.me_deterministic_writer_sort,
MeWriterPickMode::default(),
general.me_writer_pick_sample_size,
MeSocksKdfPolicy::default(),
general.me_writer_cmd_channel_capacity,
general.me_route_channel_capacity,
general.me_route_backpressure_base_timeout_ms,
general.me_route_backpressure_high_timeout_ms,
general.me_route_backpressure_high_watermark_pct,
general.me_reader_route_data_wait_ms,
general.me_health_interval_ms_unhealthy,
general.me_health_interval_ms_healthy,
general.me_warn_rate_limit_ms,
MeRouteNoWriterMode::default(),
general.me_route_no_writer_wait_ms,
general.me_route_inline_recovery_attempts,
general.me_route_inline_recovery_wait_ms,
);
(pool, rng)
}
async fn insert_draining_writer(
pool: &Arc<MePool>,
writer_id: u64,
drain_started_at_epoch_secs: u64,
bound_clients: usize,
drain_deadline_epoch_secs: u64,
) {
let (tx, _writer_rx) = mpsc::channel::<WriterCommand>(8);
let writer = MeWriter {
id: writer_id,
addr: SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 5500 + writer_id as u16),
source_ip: IpAddr::V4(Ipv4Addr::LOCALHOST),
writer_dc: 2,
generation: 1,
contour: Arc::new(AtomicU8::new(WriterContour::Draining.as_u8())),
created_at: Instant::now() - Duration::from_secs(writer_id),
tx: tx.clone(),
cancel: CancellationToken::new(),
degraded: Arc::new(AtomicBool::new(false)),
rtt_ema_ms_x10: Arc::new(AtomicU32::new(0)),
draining: Arc::new(AtomicBool::new(true)),
draining_started_at_epoch_secs: Arc::new(AtomicU64::new(drain_started_at_epoch_secs)),
drain_deadline_epoch_secs: Arc::new(AtomicU64::new(drain_deadline_epoch_secs)),
allow_drain_fallback: Arc::new(AtomicBool::new(false)),
};
pool.writers.write().await.push(writer);
pool.registry.register_writer(writer_id, tx).await;
pool.conn_count.fetch_add(1, Ordering::Relaxed);
for idx in 0..bound_clients {
let (conn_id, _rx) = pool.registry.register().await;
assert!(
pool.registry
.bind_writer(
conn_id,
writer_id,
ConnMeta {
target_dc: 2,
client_addr: SocketAddr::new(
IpAddr::V4(Ipv4Addr::LOCALHOST),
7200 + idx as u16,
),
our_addr: SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 443),
proto_flags: 0,
},
)
.await
);
}
}
#[tokio::test]
async fn me_health_monitor_drains_expired_backlog_over_multiple_cycles() {
let (pool, rng) = make_pool(128, 1, 1).await;
let now_epoch_secs = MePool::now_epoch_secs();
let writer_total = health_drain_close_budget().saturating_mul(2).saturating_add(9);
for writer_id in 1..=writer_total as u64 {
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(120),
1,
now_epoch_secs.saturating_sub(1),
)
.await;
}
let monitor = tokio::spawn(me_health_monitor(pool.clone(), rng, 0));
tokio::time::sleep(Duration::from_millis(60)).await;
monitor.abort();
let _ = monitor.await;
assert!(pool.writers.read().await.is_empty());
}
#[tokio::test]
async fn me_health_monitor_cleans_empty_draining_writers_without_force_close() {
let (pool, rng) = make_pool(128, 1, 1).await;
let now_epoch_secs = MePool::now_epoch_secs();
for writer_id in 1..=24u64 {
insert_draining_writer(&pool, writer_id, now_epoch_secs.saturating_sub(60), 0, 0).await;
}
let monitor = tokio::spawn(me_health_monitor(pool.clone(), rng, 0));
tokio::time::sleep(Duration::from_millis(30)).await;
monitor.abort();
let _ = monitor.await;
assert!(pool.writers.read().await.is_empty());
}
#[tokio::test]
async fn me_health_monitor_converges_retry_like_threshold_backlog_to_empty() {
let threshold = 4u64;
let (pool, rng) = make_pool(threshold, 1, 1).await;
let now_epoch_secs = MePool::now_epoch_secs();
let writer_total = threshold as usize + health_drain_close_budget().saturating_add(11);
for writer_id in 1..=writer_total as u64 {
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(300).saturating_add(writer_id),
1,
0,
)
.await;
}
let monitor = tokio::spawn(me_health_monitor(pool.clone(), rng, 0));
tokio::time::sleep(Duration::from_millis(60)).await;
monitor.abort();
let _ = monitor.await;
assert!(pool.writers.read().await.is_empty());
}
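
The integration tests above rely on the monitor closing only a bounded number of expired writers per health tick, so a backlog larger than the budget drains over several ticks. A minimal sketch of that idea follows. `DrainingWriter`, `reap_tick`, and `drain_until_empty` are hypothetical names used for illustration and are not the crate's API, and the `<=` deadline comparison is an assumption.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a draining writer; only the deadline matters here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DrainingWriter {
    pub id: u64,
    /// 0 means "no deadline", mirroring how the tests above pass 0.
    pub deadline_epoch_secs: u64,
}

/// Closes at most `budget` expired writers in one tick and returns the count.
/// Writers without a deadline, or beyond the budget, stay in the backlog.
pub fn reap_tick(
    backlog: &mut VecDeque<DrainingWriter>,
    now_epoch_secs: u64,
    budget: usize,
) -> usize {
    let mut closed = 0usize;
    let mut kept = VecDeque::with_capacity(backlog.len());
    while let Some(writer) = backlog.pop_front() {
        // Assumption: a deadline at or before `now` counts as expired.
        let expired =
            writer.deadline_epoch_secs != 0 && writer.deadline_epoch_secs <= now_epoch_secs;
        if expired && closed < budget {
            closed += 1; // the real code would close the writer and its clients here
        } else {
            kept.push_back(writer);
        }
    }
    *backlog = kept;
    closed
}

/// Runs ticks until the backlog is empty or `max_ticks` is reached.
/// Returns the number of ticks that actually did work.
pub fn drain_until_empty(
    backlog: &mut VecDeque<DrainingWriter>,
    now_epoch_secs: u64,
    budget: usize,
    max_ticks: usize,
) -> usize {
    for tick in 0..max_ticks {
        if backlog.is_empty() {
            return tick;
        }
        reap_tick(backlog, now_epoch_secs, budget);
    }
    max_ticks
}
```

With a budget of 16 and a backlog of `2 * 16 + 7` expired writers, this sketch needs three ticks to empty the queue, which is the shape the `..._over_multiple_cycles` test exercises.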

@@ -0,0 +1,533 @@
use std::collections::HashMap;
use std::net::{IpAddr, Ipv4Addr, SocketAddr};
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, AtomicU8, AtomicU32, AtomicU64, Ordering};
use std::time::{Duration, Instant};
use tokio::sync::mpsc;
use tokio_util::sync::CancellationToken;
use super::codec::WriterCommand;
use super::health::{health_drain_close_budget, reap_draining_writers};
use super::pool::{MePool, MeWriter, WriterContour};
use super::registry::ConnMeta;
use crate::config::{GeneralConfig, MeRouteNoWriterMode, MeSocksKdfPolicy, MeWriterPickMode};
use crate::crypto::SecureRandom;
use crate::network::probe::NetworkDecision;
use crate::stats::Stats;
async fn make_pool(me_pool_drain_threshold: u64) -> Arc<MePool> {
let general = GeneralConfig {
me_pool_drain_threshold,
..GeneralConfig::default()
};
MePool::new(
None,
vec![1u8; 32],
None,
false,
None,
Vec::new(),
1,
None,
12,
1200,
HashMap::new(),
HashMap::new(),
None,
NetworkDecision::default(),
None,
Arc::new(SecureRandom::new()),
Arc::new(Stats::new()),
general.me_keepalive_enabled,
general.me_keepalive_interval_secs,
general.me_keepalive_jitter_secs,
general.me_keepalive_payload_random,
general.rpc_proxy_req_every,
general.me_warmup_stagger_enabled,
general.me_warmup_step_delay_ms,
general.me_warmup_step_jitter_ms,
general.me_reconnect_max_concurrent_per_dc,
general.me_reconnect_backoff_base_ms,
general.me_reconnect_backoff_cap_ms,
general.me_reconnect_fast_retry_count,
general.me_single_endpoint_shadow_writers,
general.me_single_endpoint_outage_mode_enabled,
general.me_single_endpoint_outage_disable_quarantine,
general.me_single_endpoint_outage_backoff_min_ms,
general.me_single_endpoint_outage_backoff_max_ms,
general.me_single_endpoint_shadow_rotate_every_secs,
general.me_floor_mode,
general.me_adaptive_floor_idle_secs,
general.me_adaptive_floor_min_writers_single_endpoint,
general.me_adaptive_floor_min_writers_multi_endpoint,
general.me_adaptive_floor_recover_grace_secs,
general.me_adaptive_floor_writers_per_core_total,
general.me_adaptive_floor_cpu_cores_override,
general.me_adaptive_floor_max_extra_writers_single_per_core,
general.me_adaptive_floor_max_extra_writers_multi_per_core,
general.me_adaptive_floor_max_active_writers_per_core,
general.me_adaptive_floor_max_warm_writers_per_core,
general.me_adaptive_floor_max_active_writers_global,
general.me_adaptive_floor_max_warm_writers_global,
general.hardswap,
general.me_pool_drain_ttl_secs,
general.me_pool_drain_threshold,
general.me_pool_drain_soft_evict_enabled,
general.me_pool_drain_soft_evict_grace_secs,
general.me_pool_drain_soft_evict_per_writer,
general.me_pool_drain_soft_evict_budget_per_core,
general.me_pool_drain_soft_evict_cooldown_ms,
general.effective_me_pool_force_close_secs(),
general.me_pool_min_fresh_ratio,
general.me_hardswap_warmup_delay_min_ms,
general.me_hardswap_warmup_delay_max_ms,
general.me_hardswap_warmup_extra_passes,
general.me_hardswap_warmup_pass_backoff_base_ms,
general.me_bind_stale_mode,
general.me_bind_stale_ttl_secs,
general.me_secret_atomic_snapshot,
general.me_deterministic_writer_sort,
MeWriterPickMode::default(),
general.me_writer_pick_sample_size,
MeSocksKdfPolicy::default(),
general.me_writer_cmd_channel_capacity,
general.me_route_channel_capacity,
general.me_route_backpressure_base_timeout_ms,
general.me_route_backpressure_high_timeout_ms,
general.me_route_backpressure_high_watermark_pct,
general.me_reader_route_data_wait_ms,
general.me_health_interval_ms_unhealthy,
general.me_health_interval_ms_healthy,
general.me_warn_rate_limit_ms,
MeRouteNoWriterMode::default(),
general.me_route_no_writer_wait_ms,
general.me_route_inline_recovery_attempts,
general.me_route_inline_recovery_wait_ms,
)
}
async fn insert_draining_writer(
pool: &Arc<MePool>,
writer_id: u64,
drain_started_at_epoch_secs: u64,
bound_clients: usize,
drain_deadline_epoch_secs: u64,
) -> Vec<u64> {
let mut conn_ids = Vec::with_capacity(bound_clients);
let (tx, _writer_rx) = mpsc::channel::<WriterCommand>(8);
let writer = MeWriter {
id: writer_id,
addr: SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 4500 + writer_id as u16),
source_ip: IpAddr::V4(Ipv4Addr::LOCALHOST),
writer_dc: 2,
generation: 1,
contour: Arc::new(AtomicU8::new(WriterContour::Draining.as_u8())),
created_at: Instant::now() - Duration::from_secs(writer_id),
tx: tx.clone(),
cancel: CancellationToken::new(),
degraded: Arc::new(AtomicBool::new(false)),
rtt_ema_ms_x10: Arc::new(AtomicU32::new(0)),
draining: Arc::new(AtomicBool::new(true)),
draining_started_at_epoch_secs: Arc::new(AtomicU64::new(drain_started_at_epoch_secs)),
drain_deadline_epoch_secs: Arc::new(AtomicU64::new(drain_deadline_epoch_secs)),
allow_drain_fallback: Arc::new(AtomicBool::new(false)),
};
pool.writers.write().await.push(writer);
pool.registry.register_writer(writer_id, tx).await;
pool.conn_count.fetch_add(1, Ordering::Relaxed);
for idx in 0..bound_clients {
let (conn_id, _rx) = pool.registry.register().await;
assert!(
pool.registry
.bind_writer(
conn_id,
writer_id,
ConnMeta {
target_dc: 2,
client_addr: SocketAddr::new(
IpAddr::V4(Ipv4Addr::LOCALHOST),
6200 + idx as u16,
),
our_addr: SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 443),
proto_flags: 0,
},
)
.await
);
conn_ids.push(conn_id);
}
conn_ids
}
async fn current_writer_ids(pool: &Arc<MePool>) -> Vec<u64> {
let mut writer_ids = pool
.writers
.read()
.await
.iter()
.map(|writer| writer.id)
.collect::<Vec<_>>();
writer_ids.sort_unstable();
writer_ids
}
#[tokio::test]
async fn reap_draining_writers_drops_warn_state_for_removed_writer() {
let pool = make_pool(128).await;
let now_epoch_secs = MePool::now_epoch_secs();
let conn_ids =
insert_draining_writer(&pool, 7, now_epoch_secs.saturating_sub(180), 1, 0).await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert!(warn_next_allowed.contains_key(&7));
let _ = pool.remove_writer_and_close_clients(7).await;
assert!(pool.registry.get_writer(conn_ids[0]).await.is_none());
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert!(!warn_next_allowed.contains_key(&7));
}
#[tokio::test]
async fn reap_draining_writers_removes_empty_draining_writers() {
let pool = make_pool(128).await;
let now_epoch_secs = MePool::now_epoch_secs();
insert_draining_writer(&pool, 1, now_epoch_secs.saturating_sub(40), 0, 0).await;
insert_draining_writer(&pool, 2, now_epoch_secs.saturating_sub(30), 0, 0).await;
insert_draining_writer(&pool, 3, now_epoch_secs.saturating_sub(20), 1, 0).await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert_eq!(current_writer_ids(&pool).await, vec![3]);
}
#[tokio::test]
async fn reap_draining_writers_overflow_closes_oldest_non_empty_writers() {
let pool = make_pool(2).await;
let now_epoch_secs = MePool::now_epoch_secs();
insert_draining_writer(&pool, 11, now_epoch_secs.saturating_sub(40), 1, 0).await;
insert_draining_writer(&pool, 22, now_epoch_secs.saturating_sub(30), 1, 0).await;
insert_draining_writer(&pool, 33, now_epoch_secs.saturating_sub(20), 1, 0).await;
insert_draining_writer(&pool, 44, now_epoch_secs.saturating_sub(10), 1, 0).await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert_eq!(current_writer_ids(&pool).await, vec![33, 44]);
}
#[tokio::test]
async fn reap_draining_writers_deadline_force_close_applies_under_threshold() {
let pool = make_pool(128).await;
let now_epoch_secs = MePool::now_epoch_secs();
insert_draining_writer(
&pool,
50,
now_epoch_secs.saturating_sub(15),
1,
now_epoch_secs.saturating_sub(1),
)
.await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert!(current_writer_ids(&pool).await.is_empty());
}
#[tokio::test]
async fn reap_draining_writers_limits_closes_per_health_tick() {
let pool = make_pool(128).await;
let now_epoch_secs = MePool::now_epoch_secs();
let close_budget = health_drain_close_budget();
let writer_total = close_budget.saturating_add(19);
for writer_id in 1..=writer_total as u64 {
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(20),
1,
now_epoch_secs.saturating_sub(1),
)
.await;
}
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert_eq!(pool.writers.read().await.len(), writer_total - close_budget);
}
#[tokio::test]
async fn reap_draining_writers_backlog_drains_across_ticks() {
let pool = make_pool(128).await;
let now_epoch_secs = MePool::now_epoch_secs();
let close_budget = health_drain_close_budget();
let writer_total = close_budget.saturating_mul(2).saturating_add(7);
for writer_id in 1..=writer_total as u64 {
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(20),
1,
now_epoch_secs.saturating_sub(1),
)
.await;
}
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
for _ in 0..8 {
if pool.writers.read().await.is_empty() {
break;
}
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
}
assert!(pool.writers.read().await.is_empty());
}
#[tokio::test]
async fn reap_draining_writers_threshold_backlog_converges_to_threshold() {
let threshold = 5u64;
let pool = make_pool(threshold).await;
let now_epoch_secs = MePool::now_epoch_secs();
let close_budget = health_drain_close_budget();
let writer_total = threshold as usize + close_budget.saturating_add(12);
for writer_id in 1..=writer_total as u64 {
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(200).saturating_add(writer_id),
1,
0,
)
.await;
}
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
for _ in 0..16 {
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
if pool.writers.read().await.len() <= threshold as usize {
break;
}
}
assert_eq!(pool.writers.read().await.len(), threshold as usize);
}
#[tokio::test]
async fn reap_draining_writers_threshold_zero_preserves_non_expired_non_empty_writers() {
let pool = make_pool(0).await;
let now_epoch_secs = MePool::now_epoch_secs();
insert_draining_writer(&pool, 10, now_epoch_secs.saturating_sub(40), 1, 0).await;
insert_draining_writer(&pool, 20, now_epoch_secs.saturating_sub(30), 1, 0).await;
insert_draining_writer(&pool, 30, now_epoch_secs.saturating_sub(20), 1, 0).await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert_eq!(current_writer_ids(&pool).await, vec![10, 20, 30]);
}
#[tokio::test]
async fn reap_draining_writers_prioritizes_force_close_before_empty_cleanup() {
let pool = make_pool(128).await;
let now_epoch_secs = MePool::now_epoch_secs();
let close_budget = health_drain_close_budget();
for writer_id in 1..=close_budget as u64 {
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(20),
1,
now_epoch_secs.saturating_sub(1),
)
.await;
}
let empty_writer_id = close_budget as u64 + 1;
insert_draining_writer(&pool, empty_writer_id, now_epoch_secs.saturating_sub(20), 0, 0).await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert_eq!(current_writer_ids(&pool).await, vec![empty_writer_id]);
}
#[tokio::test]
async fn reap_draining_writers_empty_cleanup_does_not_increment_force_close_metric() {
let pool = make_pool(128).await;
let now_epoch_secs = MePool::now_epoch_secs();
insert_draining_writer(&pool, 1, now_epoch_secs.saturating_sub(60), 0, 0).await;
insert_draining_writer(&pool, 2, now_epoch_secs.saturating_sub(50), 0, 0).await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert!(current_writer_ids(&pool).await.is_empty());
assert_eq!(pool.stats.get_pool_force_close_total(), 0);
}
#[tokio::test]
async fn reap_draining_writers_handles_duplicate_force_close_requests_for_same_writer() {
let pool = make_pool(1).await;
let now_epoch_secs = MePool::now_epoch_secs();
insert_draining_writer(
&pool,
10,
now_epoch_secs.saturating_sub(30),
1,
now_epoch_secs.saturating_sub(1),
)
.await;
insert_draining_writer(
&pool,
20,
now_epoch_secs.saturating_sub(20),
1,
now_epoch_secs.saturating_sub(1),
)
.await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert!(current_writer_ids(&pool).await.is_empty());
}
#[tokio::test]
async fn reap_draining_writers_warn_state_never_exceeds_live_draining_population_under_churn() {
let pool = make_pool(128).await;
let now_epoch_secs = MePool::now_epoch_secs();
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
for wave in 0..12u64 {
for offset in 0..9u64 {
insert_draining_writer(
&pool,
wave * 100 + offset,
now_epoch_secs.saturating_sub(120 + offset),
1,
0,
)
.await;
}
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert!(warn_next_allowed.len() <= pool.writers.read().await.len());
let existing_writer_ids = current_writer_ids(&pool).await;
for writer_id in existing_writer_ids.into_iter().take(4) {
let _ = pool.remove_writer_and_close_clients(writer_id).await;
}
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert!(warn_next_allowed.len() <= pool.writers.read().await.len());
}
}
#[tokio::test]
async fn reap_draining_writers_mixed_backlog_converges_without_leaking_warn_state() {
let pool = make_pool(6).await;
let now_epoch_secs = MePool::now_epoch_secs();
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
for writer_id in 1..=18u64 {
let bound_clients = if writer_id % 3 == 0 { 0 } else { 1 };
let deadline = if writer_id % 2 == 0 {
now_epoch_secs.saturating_sub(1)
} else {
0
};
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(300).saturating_add(writer_id),
bound_clients,
deadline,
)
.await;
}
for _ in 0..16 {
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
if pool.writers.read().await.len() <= 6 {
break;
}
}
assert!(pool.writers.read().await.len() <= 6);
assert!(warn_next_allowed.len() <= pool.writers.read().await.len());
}
#[tokio::test]
async fn reap_draining_writers_soft_evicts_stuck_writer_with_per_writer_cap() {
let pool = make_pool(128).await;
pool.me_pool_drain_soft_evict_enabled.store(true, Ordering::Relaxed);
pool.me_pool_drain_soft_evict_grace_secs.store(0, Ordering::Relaxed);
pool.me_pool_drain_soft_evict_per_writer.store(1, Ordering::Relaxed);
pool.me_pool_drain_soft_evict_budget_per_core.store(8, Ordering::Relaxed);
pool.me_pool_drain_soft_evict_cooldown_ms
.store(1, Ordering::Relaxed);
let now_epoch_secs = MePool::now_epoch_secs();
insert_draining_writer(&pool, 77, now_epoch_secs.saturating_sub(240), 3, 0).await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
let activity = pool.registry.writer_activity_snapshot().await;
assert_eq!(activity.bound_clients_by_writer.get(&77), Some(&2));
assert_eq!(pool.stats.get_pool_drain_soft_evict_total(), 1);
assert_eq!(pool.stats.get_pool_drain_soft_evict_writer_total(), 1);
assert_eq!(current_writer_ids(&pool).await, vec![77]);
}
#[tokio::test]
async fn reap_draining_writers_soft_evict_respects_cooldown_per_writer() {
let pool = make_pool(128).await;
pool.me_pool_drain_soft_evict_enabled.store(true, Ordering::Relaxed);
pool.me_pool_drain_soft_evict_grace_secs.store(0, Ordering::Relaxed);
pool.me_pool_drain_soft_evict_per_writer.store(1, Ordering::Relaxed);
pool.me_pool_drain_soft_evict_budget_per_core.store(8, Ordering::Relaxed);
pool.me_pool_drain_soft_evict_cooldown_ms
.store(60_000, Ordering::Relaxed);
let now_epoch_secs = MePool::now_epoch_secs();
insert_draining_writer(&pool, 88, now_epoch_secs.saturating_sub(240), 3, 0).await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
let activity = pool.registry.writer_activity_snapshot().await;
assert_eq!(activity.bound_clients_by_writer.get(&88), Some(&2));
assert_eq!(pool.stats.get_pool_drain_soft_evict_total(), 1);
assert_eq!(pool.stats.get_pool_drain_soft_evict_writer_total(), 1);
}
#[test]
fn general_config_default_drain_threshold_remains_enabled() {
assert_eq!(GeneralConfig::default().me_pool_drain_threshold, 128);
assert!(GeneralConfig::default().me_pool_drain_soft_evict_enabled);
assert_eq!(
GeneralConfig::default().me_pool_drain_soft_evict_per_writer,
1
);
}
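
The two soft-evict tests above check that a stuck draining writer loses at most `per_writer` clients per pass, and that a second pass inside the cooldown window evicts nothing. A rough sketch of that policy is shown below. `SoftEvictPolicy` and `soft_evict_tick` are hypothetical names, and the real `reap_draining_writers` also applies a grace period and a per-core budget that are left out here.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical policy knobs, loosely modelled on the `me_pool_drain_soft_evict_*` settings.
pub struct SoftEvictPolicy {
    pub per_writer: usize,
    pub cooldown: Duration,
}

/// Evicts up to `per_writer` clients from each writer that is not in cooldown.
/// `bound_clients` maps writer id to its bound client count;
/// `next_allowed` tracks when each writer may be soft-evicted again.
/// Returns the total number of clients evicted in this pass.
pub fn soft_evict_tick(
    bound_clients: &mut HashMap<u64, usize>,
    next_allowed: &mut HashMap<u64, Instant>,
    policy: &SoftEvictPolicy,
    now: Instant,
) -> usize {
    let mut evicted_total = 0usize;
    for (writer_id, clients) in bound_clients.iter_mut() {
        if *clients == 0 {
            continue; // nothing to evict from an empty writer
        }
        if let Some(allowed_at) = next_allowed.get(writer_id) {
            if now < *allowed_at {
                continue; // still cooling down from the previous eviction
            }
        }
        // Clamp the cap to at least 1, as the pool accessors in this diff do.
        let evicted = (*clients).min(policy.per_writer.max(1));
        *clients -= evicted;
        evicted_total += evicted;
        next_allowed.insert(*writer_id, now + policy.cooldown);
    }
    evicted_total
}
```

Calling the function twice with the same `now` and a 60-second cooldown evicts one client on the first call and none on the second, matching the `Some(&2)` expectation for writer 88.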

@@ -21,6 +21,12 @@ mod secret;
mod selftest;
mod wire;
mod pool_status;
#[cfg(test)]
mod health_regression_tests;
#[cfg(test)]
mod health_integration_tests;
#[cfg(test)]
mod health_adversarial_tests;
use bytes::Bytes;

@@ -172,6 +172,11 @@ pub struct MePool {
pub(super) kdf_material_fingerprint: Arc<RwLock<HashMap<SocketAddr, (u64, u16)>>>,
pub(super) me_pool_drain_ttl_secs: AtomicU64,
pub(super) me_pool_drain_threshold: AtomicU64,
pub(super) me_pool_drain_soft_evict_enabled: AtomicBool,
pub(super) me_pool_drain_soft_evict_grace_secs: AtomicU64,
pub(super) me_pool_drain_soft_evict_per_writer: AtomicU8,
pub(super) me_pool_drain_soft_evict_budget_per_core: AtomicU32,
pub(super) me_pool_drain_soft_evict_cooldown_ms: AtomicU64,
pub(super) me_pool_force_close_secs: AtomicU64,
pub(super) me_pool_min_fresh_ratio_permille: AtomicU32,
pub(super) me_hardswap_warmup_delay_min_ms: AtomicU64,
@@ -273,6 +278,11 @@ impl MePool {
hardswap: bool,
me_pool_drain_ttl_secs: u64,
me_pool_drain_threshold: u64,
me_pool_drain_soft_evict_enabled: bool,
me_pool_drain_soft_evict_grace_secs: u64,
me_pool_drain_soft_evict_per_writer: u8,
me_pool_drain_soft_evict_budget_per_core: u16,
me_pool_drain_soft_evict_cooldown_ms: u64,
me_pool_force_close_secs: u64,
me_pool_min_fresh_ratio: f32,
me_hardswap_warmup_delay_min_ms: u64,
@@ -449,6 +459,17 @@ impl MePool {
kdf_material_fingerprint: Arc::new(RwLock::new(HashMap::new())),
me_pool_drain_ttl_secs: AtomicU64::new(me_pool_drain_ttl_secs),
me_pool_drain_threshold: AtomicU64::new(me_pool_drain_threshold),
me_pool_drain_soft_evict_enabled: AtomicBool::new(me_pool_drain_soft_evict_enabled),
me_pool_drain_soft_evict_grace_secs: AtomicU64::new(me_pool_drain_soft_evict_grace_secs),
me_pool_drain_soft_evict_per_writer: AtomicU8::new(
me_pool_drain_soft_evict_per_writer.max(1),
),
me_pool_drain_soft_evict_budget_per_core: AtomicU32::new(
me_pool_drain_soft_evict_budget_per_core.max(1) as u32,
),
me_pool_drain_soft_evict_cooldown_ms: AtomicU64::new(
me_pool_drain_soft_evict_cooldown_ms.max(1),
),
me_pool_force_close_secs: AtomicU64::new(me_pool_force_close_secs),
me_pool_min_fresh_ratio_permille: AtomicU32::new(Self::ratio_to_permille(
me_pool_min_fresh_ratio,
@@ -496,6 +517,11 @@ impl MePool {
hardswap: bool,
drain_ttl_secs: u64,
pool_drain_threshold: u64,
pool_drain_soft_evict_enabled: bool,
pool_drain_soft_evict_grace_secs: u64,
pool_drain_soft_evict_per_writer: u8,
pool_drain_soft_evict_budget_per_core: u16,
pool_drain_soft_evict_cooldown_ms: u64,
force_close_secs: u64,
min_fresh_ratio: f32,
hardswap_warmup_delay_min_ms: u64,
@@ -536,6 +562,18 @@ impl MePool {
.store(drain_ttl_secs, Ordering::Relaxed);
self.me_pool_drain_threshold
.store(pool_drain_threshold, Ordering::Relaxed);
self.me_pool_drain_soft_evict_enabled
.store(pool_drain_soft_evict_enabled, Ordering::Relaxed);
self.me_pool_drain_soft_evict_grace_secs
.store(pool_drain_soft_evict_grace_secs, Ordering::Relaxed);
self.me_pool_drain_soft_evict_per_writer
.store(pool_drain_soft_evict_per_writer.max(1), Ordering::Relaxed);
self.me_pool_drain_soft_evict_budget_per_core.store(
pool_drain_soft_evict_budget_per_core.max(1) as u32,
Ordering::Relaxed,
);
self.me_pool_drain_soft_evict_cooldown_ms
.store(pool_drain_soft_evict_cooldown_ms.max(1), Ordering::Relaxed);
self.me_pool_force_close_secs
.store(force_close_secs, Ordering::Relaxed);
self.me_pool_min_fresh_ratio_permille
@@ -690,6 +728,36 @@ impl MePool {
}
}
pub(super) fn drain_soft_evict_enabled(&self) -> bool {
self.me_pool_drain_soft_evict_enabled
.load(Ordering::Relaxed)
}
pub(super) fn drain_soft_evict_grace_secs(&self) -> u64 {
self.me_pool_drain_soft_evict_grace_secs
.load(Ordering::Relaxed)
}
pub(super) fn drain_soft_evict_per_writer(&self) -> usize {
self.me_pool_drain_soft_evict_per_writer
.load(Ordering::Relaxed)
.max(1) as usize
}
pub(super) fn drain_soft_evict_budget_per_core(&self) -> usize {
self.me_pool_drain_soft_evict_budget_per_core
.load(Ordering::Relaxed)
.max(1) as usize
}
pub(super) fn drain_soft_evict_cooldown(&self) -> Duration {
Duration::from_millis(
self.me_pool_drain_soft_evict_cooldown_ms
.load(Ordering::Relaxed)
.max(1),
)
}
pub(super) async fn key_selector(&self) -> u32 {
self.proxy_secret.read().await.key_selector
}
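
The accessors added in this hunk clamp the soft-evict knobs to a minimum of 1 both when storing and when loading, so a zero from configuration can never disable the per-writer cap or produce a zero cooldown. The pattern can be sketched in isolation as follows. `SoftEvictKnobs` is a hypothetical type used for illustration and only covers two of the five fields.

```rust
use std::sync::atomic::{AtomicU64, AtomicU8, Ordering};
use std::time::Duration;

/// Hypothetical holder for two of the runtime-tunable soft-evict settings.
pub struct SoftEvictKnobs {
    per_writer: AtomicU8,
    cooldown_ms: AtomicU64,
}

impl SoftEvictKnobs {
    /// Clamps on construction so the stored values are already valid.
    pub fn new(per_writer: u8, cooldown_ms: u64) -> Self {
        Self {
            per_writer: AtomicU8::new(per_writer.max(1)),
            cooldown_ms: AtomicU64::new(cooldown_ms.max(1)),
        }
    }

    /// Applies a config reload; the same clamp is repeated on every store.
    pub fn update(&self, per_writer: u8, cooldown_ms: u64) {
        self.per_writer.store(per_writer.max(1), Ordering::Relaxed);
        self.cooldown_ms.store(cooldown_ms.max(1), Ordering::Relaxed);
    }

    /// Clamps again on load as a defensive measure against direct stores of 0.
    pub fn per_writer(&self) -> usize {
        self.per_writer.load(Ordering::Relaxed).max(1) as usize
    }

    pub fn cooldown(&self) -> Duration {
        Duration::from_millis(self.cooldown_ms.load(Ordering::Relaxed).max(1))
    }
}
```

`Ordering::Relaxed` is used here because each knob is read independently and no other memory is published through it. That matches the ordering the diff itself uses.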

@@ -70,10 +70,12 @@ impl MePool {
let mut missing_dc = Vec::<i32>::new();
let mut covered = 0usize;
let mut total = 0usize;
for (dc, endpoints) in desired_by_dc {
if endpoints.is_empty() {
continue;
}
total += 1;
if endpoints
.iter()
.any(|addr| active_writer_addrs.contains(&(*dc, *addr)))
@@ -85,7 +87,9 @@ impl MePool {
}
missing_dc.sort_unstable();
let total = desired_by_dc.len().max(1);
if total == 0 {
return (1.0, missing_dc);
}
let ratio = (covered as f32) / (total as f32);
(ratio, missing_dc)
}
@@ -399,29 +403,21 @@ impl MePool {
}
if hardswap {
let mut fresh_missing_dc = Vec::<(i32, usize, usize)>::new();
for (dc, endpoints) in &desired_by_dc {
if endpoints.is_empty() {
continue;
}
let required = self.required_writers_for_dc(endpoints.len());
let fresh_count = writers
.iter()
.filter(|w| !w.draining.load(Ordering::Relaxed))
.filter(|w| w.generation == generation)
.filter(|w| w.writer_dc == *dc)
.filter(|w| endpoints.contains(&w.addr))
.count();
if fresh_count < required {
fresh_missing_dc.push((*dc, fresh_count, required));
}
}
let fresh_writer_addrs: HashSet<(i32, SocketAddr)> = writers
.iter()
.filter(|w| !w.draining.load(Ordering::Relaxed))
.filter(|w| w.generation == generation)
.map(|w| (w.writer_dc, w.addr))
.collect();
let (fresh_coverage_ratio, fresh_missing_dc) =
Self::coverage_ratio(&desired_by_dc, &fresh_writer_addrs);
if !fresh_missing_dc.is_empty() {
warn!(
previous_generation,
generation,
fresh_coverage_ratio = format_args!("{fresh_coverage_ratio:.3}"),
missing_dc = ?fresh_missing_dc,
"ME hardswap pending: fresh generation coverage incomplete"
"ME hardswap pending: fresh generation DC coverage incomplete"
);
return;
}
@@ -491,3 +487,61 @@ impl MePool {
self.zero_downtime_reinit_after_map_change(rng).await;
}
}
#[cfg(test)]
mod tests {
use std::collections::{HashMap, HashSet};
use std::net::{IpAddr, Ipv4Addr, SocketAddr};
use super::MePool;
fn addr(octet: u8, port: u16) -> SocketAddr {
SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 0, octet)), port)
}
#[test]
fn coverage_ratio_counts_dc_coverage_not_floor() {
let dc1 = addr(1, 2001);
let dc2 = addr(2, 2002);
let mut desired_by_dc = HashMap::<i32, HashSet<SocketAddr>>::new();
desired_by_dc.insert(1, HashSet::from([dc1]));
desired_by_dc.insert(2, HashSet::from([dc2]));
let active_writer_addrs = HashSet::from([(1, dc1)]);
let (ratio, missing_dc) = MePool::coverage_ratio(&desired_by_dc, &active_writer_addrs);
assert_eq!(ratio, 0.5);
assert_eq!(missing_dc, vec![2]);
}
#[test]
fn coverage_ratio_ignores_empty_dc_groups() {
let dc1 = addr(1, 2001);
let mut desired_by_dc = HashMap::<i32, HashSet<SocketAddr>>::new();
desired_by_dc.insert(1, HashSet::from([dc1]));
desired_by_dc.insert(2, HashSet::new());
let active_writer_addrs = HashSet::from([(1, dc1)]);
let (ratio, missing_dc) = MePool::coverage_ratio(&desired_by_dc, &active_writer_addrs);
assert_eq!(ratio, 1.0);
assert!(missing_dc.is_empty());
}
#[test]
fn coverage_ratio_reports_missing_dcs_sorted() {
let dc1 = addr(1, 2001);
let dc2 = addr(2, 2002);
let mut desired_by_dc = HashMap::<i32, HashSet<SocketAddr>>::new();
desired_by_dc.insert(2, HashSet::from([dc2]));
desired_by_dc.insert(1, HashSet::from([dc1]));
let (ratio, missing_dc) = MePool::coverage_ratio(&desired_by_dc, &HashSet::new());
assert_eq!(ratio, 0.0);
assert_eq!(missing_dc, vec![1, 2]);
}
}
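The three tests above pin down the behaviour of `coverage_ratio`: it counts covered DCs rather than writers, skips DCs with no desired endpoints, and returns the missing DCs sorted. A minimal Python sketch of that logic follows. It is an illustration inferred from the tests, not the Rust implementation; the return value of `1.0` when no DC has endpoints is an assumption, since the diff does not show that case.

```python
from typing import Dict, List, Set, Tuple

Addr = Tuple[str, int]  # (ip, port), standing in for SocketAddr


def coverage_ratio(
    desired_by_dc: Dict[int, Set[Addr]],
    active_writer_addrs: Set[Tuple[int, Addr]],
) -> Tuple[float, List[int]]:
    """Return (covered_dcs / non_empty_dcs, sorted missing DC ids)."""
    total = 0
    covered = 0
    missing: List[int] = []
    for dc, endpoints in desired_by_dc.items():
        if not endpoints:
            # DCs without desired endpoints do not count towards coverage.
            continue
        total += 1
        # A DC is covered when at least one of its endpoints has a writer.
        if any((dc, addr) in active_writer_addrs for addr in endpoints):
            covered += 1
        else:
            missing.append(dc)
    missing.sort()
    if total == 0:
        # Assumption: nothing to cover is treated as fully covered.
        return 1.0, missing
    return covered / total, missing
```

Because the ratio is per DC, one live writer in a DC is enough to mark it covered, regardless of the configured writer floor.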


@@ -40,6 +40,7 @@ pub(crate) struct MeApiDcStatusSnapshot {
pub floor_max: usize,
pub floor_capped: bool,
pub alive_writers: usize,
pub coverage_ratio: f64,
pub coverage_pct: f64,
pub fresh_alive_writers: usize,
pub fresh_coverage_pct: f64,
@@ -62,6 +63,7 @@ pub(crate) struct MeApiStatusSnapshot {
pub available_pct: f64,
pub required_writers: usize,
pub alive_writers: usize,
pub coverage_ratio: f64,
pub coverage_pct: f64,
pub fresh_alive_writers: usize,
pub fresh_coverage_pct: f64,
@@ -124,6 +126,11 @@ pub(crate) struct MeApiRuntimeSnapshot {
pub me_reconnect_backoff_cap_ms: u64,
pub me_reconnect_fast_retry_count: u32,
pub me_pool_drain_ttl_secs: u64,
pub me_pool_drain_soft_evict_enabled: bool,
pub me_pool_drain_soft_evict_grace_secs: u64,
pub me_pool_drain_soft_evict_per_writer: u8,
pub me_pool_drain_soft_evict_budget_per_core: u16,
pub me_pool_drain_soft_evict_cooldown_ms: u64,
pub me_pool_force_close_secs: u64,
pub me_pool_min_fresh_ratio: f32,
pub me_bind_stale_mode: &'static str,
@@ -337,6 +344,8 @@ impl MePool {
let mut available_endpoints = 0usize;
let mut alive_writers = 0usize;
let mut fresh_alive_writers = 0usize;
let mut coverage_ratio_dcs_total = 0usize;
let mut coverage_ratio_dcs_covered = 0usize;
let floor_mode = self.floor_mode();
let adaptive_cpu_cores = (self
.me_adaptive_floor_cpu_cores_effective
@@ -388,6 +397,12 @@ impl MePool {
available_endpoints += dc_available_endpoints;
alive_writers += dc_alive_writers;
fresh_alive_writers += dc_fresh_alive_writers;
if endpoint_count > 0 {
coverage_ratio_dcs_total += 1;
if dc_alive_writers > 0 {
coverage_ratio_dcs_covered += 1;
}
}
dcs.push(MeApiDcStatusSnapshot {
dc,
@@ -410,6 +425,11 @@ impl MePool {
floor_max,
floor_capped,
alive_writers: dc_alive_writers,
coverage_ratio: if endpoint_count > 0 && dc_alive_writers > 0 {
100.0
} else {
0.0
},
coverage_pct: ratio_pct(dc_alive_writers, dc_required_writers),
fresh_alive_writers: dc_fresh_alive_writers,
fresh_coverage_pct: ratio_pct(dc_fresh_alive_writers, dc_required_writers),
@@ -426,6 +446,7 @@ impl MePool {
available_pct: ratio_pct(available_endpoints, configured_endpoints),
required_writers,
alive_writers,
coverage_ratio: ratio_pct(coverage_ratio_dcs_covered, coverage_ratio_dcs_total),
coverage_pct: ratio_pct(alive_writers, required_writers),
fresh_alive_writers,
fresh_coverage_pct: ratio_pct(fresh_alive_writers, required_writers),
@@ -562,6 +583,22 @@ impl MePool {
me_reconnect_backoff_cap_ms: self.me_reconnect_backoff_cap.as_millis() as u64,
me_reconnect_fast_retry_count: self.me_reconnect_fast_retry_count,
me_pool_drain_ttl_secs: self.me_pool_drain_ttl_secs.load(Ordering::Relaxed),
me_pool_drain_soft_evict_enabled: self
.me_pool_drain_soft_evict_enabled
.load(Ordering::Relaxed),
me_pool_drain_soft_evict_grace_secs: self
.me_pool_drain_soft_evict_grace_secs
.load(Ordering::Relaxed),
me_pool_drain_soft_evict_per_writer: self
.me_pool_drain_soft_evict_per_writer
.load(Ordering::Relaxed),
me_pool_drain_soft_evict_budget_per_core: self
.me_pool_drain_soft_evict_budget_per_core
.load(Ordering::Relaxed)
.min(u16::MAX as u32) as u16,
me_pool_drain_soft_evict_cooldown_ms: self
.me_pool_drain_soft_evict_cooldown_ms
.load(Ordering::Relaxed),
me_pool_force_close_secs: self.me_pool_force_close_secs.load(Ordering::Relaxed),
me_pool_min_fresh_ratio: Self::permille_to_ratio(
self.me_pool_min_fresh_ratio_permille.load(Ordering::Relaxed),


@@ -394,6 +394,56 @@ impl ConnRegistry {
inner.writer_for_conn.keys().copied().collect()
}
pub(super) async fn bound_conn_ids_for_writer_limited(
&self,
writer_id: u64,
limit: usize,
) -> Vec<u64> {
if limit == 0 {
return Vec::new();
}
let inner = self.inner.read().await;
let Some(conn_ids) = inner.conns_for_writer.get(&writer_id) else {
return Vec::new();
};
let mut out = conn_ids.iter().copied().collect::<Vec<_>>();
out.sort_unstable();
out.truncate(limit);
out
}
pub(super) async fn evict_bound_conn_if_writer(&self, conn_id: u64, writer_id: u64) -> bool {
let maybe_client_tx = {
let mut inner = self.inner.write().await;
if inner.writer_for_conn.get(&conn_id).copied() != Some(writer_id) {
return false;
}
let client_tx = inner.map.get(&conn_id).cloned();
inner.map.remove(&conn_id);
inner.meta.remove(&conn_id);
inner.writer_for_conn.remove(&conn_id);
let became_empty = if let Some(set) = inner.conns_for_writer.get_mut(&writer_id) {
set.remove(&conn_id);
set.is_empty()
} else {
false
};
if became_empty {
inner
.writer_idle_since_epoch_secs
.insert(writer_id, Self::now_epoch_secs());
}
client_tx
};
if let Some(client_tx) = maybe_client_tx {
let _ = client_tx.try_send(MeResponse::Close);
}
true
}
pub async fn writer_lost(&self, writer_id: u64) -> Vec<BoundConn> {
let mut inner = self.inner.write().await;
inner.writers.remove(&writer_id);
@@ -444,6 +494,7 @@ mod tests {
use super::ConnMeta;
use super::ConnRegistry;
use super::MeResponse;
#[tokio::test]
async fn writer_activity_snapshot_tracks_writer_and_dc_load() {
@@ -634,4 +685,86 @@ mod tests {
);
assert!(registry.get_writer(conn_id).await.is_none());
}
#[tokio::test]
async fn bound_conn_ids_for_writer_limited_is_sorted_and_bounded() {
let registry = ConnRegistry::new();
let (writer_tx, _writer_rx) = tokio::sync::mpsc::channel(8);
registry.register_writer(10, writer_tx).await;
let addr = SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 443);
let mut conn_ids = Vec::new();
for _ in 0..5 {
let (conn_id, _rx) = registry.register().await;
assert!(
registry
.bind_writer(
conn_id,
10,
ConnMeta {
target_dc: 2,
client_addr: addr,
our_addr: addr,
proto_flags: 0,
},
)
.await
);
conn_ids.push(conn_id);
}
conn_ids.sort_unstable();
let limited = registry.bound_conn_ids_for_writer_limited(10, 3).await;
assert_eq!(limited.len(), 3);
assert_eq!(limited, conn_ids.into_iter().take(3).collect::<Vec<_>>());
}
#[tokio::test]
async fn evict_bound_conn_if_writer_does_not_touch_rebound_conn() {
let registry = ConnRegistry::new();
let (conn_id, mut rx) = registry.register().await;
let (writer_tx_a, _writer_rx_a) = tokio::sync::mpsc::channel(8);
let (writer_tx_b, _writer_rx_b) = tokio::sync::mpsc::channel(8);
registry.register_writer(10, writer_tx_a).await;
registry.register_writer(20, writer_tx_b).await;
let addr = SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 443);
assert!(
registry
.bind_writer(
conn_id,
10,
ConnMeta {
target_dc: 2,
client_addr: addr,
our_addr: addr,
proto_flags: 0,
},
)
.await
);
assert!(
registry
.bind_writer(
conn_id,
20,
ConnMeta {
target_dc: 2,
client_addr: addr,
our_addr: addr,
proto_flags: 1,
},
)
.await
);
let evicted = registry.evict_bound_conn_if_writer(conn_id, 10).await;
assert!(!evicted);
assert_eq!(registry.get_writer(conn_id).await.expect("writer").writer_id, 20);
assert!(rx.try_recv().is_err());
let evicted = registry.evict_bound_conn_if_writer(conn_id, 20).await;
assert!(evicted);
assert!(registry.get_writer(conn_id).await.is_none());
assert!(matches!(rx.try_recv(), Ok(MeResponse::Close)));
}
}
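The two registry methods added in this hunk can be modelled with plain dictionaries. The sketch below is a simplified, synchronous Python illustration of the guard that `evict_bound_conn_if_writer` applies (evict only if the connection is still bound to the expected writer) and of the sorted, bounded listing. The class name `RegistrySketch` and the `closed` list, which stands in for sending `MeResponse::Close`, are hypothetical.

```python
from typing import Dict, List, Set


class RegistrySketch:
    """Toy model of the conn <-> writer bindings; not the real ConnRegistry."""

    def __init__(self) -> None:
        self.writer_for_conn: Dict[int, int] = {}
        self.conns_for_writer: Dict[int, Set[int]] = {}
        self.closed: List[int] = []  # stands in for the Close notification

    def bind_writer(self, conn_id: int, writer_id: int) -> None:
        # Rebinding moves the connection from its previous writer, if any.
        previous = self.writer_for_conn.get(conn_id)
        if previous is not None:
            self.conns_for_writer.get(previous, set()).discard(conn_id)
        self.writer_for_conn[conn_id] = writer_id
        self.conns_for_writer.setdefault(writer_id, set()).add(conn_id)

    def bound_conn_ids_for_writer_limited(self, writer_id: int, limit: int) -> List[int]:
        if limit <= 0:
            return []
        # Sorting first makes the selection deterministic.
        return sorted(self.conns_for_writer.get(writer_id, set()))[:limit]

    def evict_bound_conn_if_writer(self, conn_id: int, writer_id: int) -> bool:
        # Guard: a connection rebound to another writer must be left alone.
        if self.writer_for_conn.get(conn_id) != writer_id:
            return False
        del self.writer_for_conn[conn_id]
        self.conns_for_writer.get(writer_id, set()).discard(conn_id)
        self.closed.append(conn_id)
        return True
```

The guard matters because a drain pass works from a snapshot of connection ids; by the time it acts, a connection may already have moved to a fresh writer.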


@@ -11,6 +11,8 @@ use tokio::net::TcpStream;
use socket2::{Socket, TcpKeepalive, Domain, Type, Protocol};
use tracing::debug;
const DEFAULT_SOCKET_BUFFER_BYTES: usize = 256 * 1024;
/// Configure TCP socket with recommended settings for proxy use
#[allow(dead_code)]
pub fn configure_tcp_socket(
@@ -34,10 +36,10 @@ pub fn configure_tcp_socket(
socket.set_tcp_keepalive(&keepalive)?;
}
// Use explicit baseline buffers to reduce slow-start stalls on high RTT links.
socket.set_recv_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
socket.set_send_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
Ok(())
}
@@ -62,6 +64,10 @@ pub fn configure_client_socket(
let keepalive = keepalive.with_interval(Duration::from_secs(keepalive_secs));
socket.set_tcp_keepalive(&keepalive)?;
// Keep explicit baseline buffers for predictable throughput across busy hosts.
socket.set_recv_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
socket.set_send_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
// Set TCP user timeout (Linux only)
// NOTE: iOS does not support TCP_USER_TIMEOUT - application-level timeout
@@ -124,6 +130,8 @@ pub fn create_outgoing_socket_bound(addr: SocketAddr, bind_addr: Option<IpAddr>)
// Disable Nagle
socket.set_nodelay(true)?;
socket.set_recv_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
socket.set_send_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
if let Some(bind_ip) = bind_addr {
let bind_sock_addr = SocketAddr::new(bind_ip, 0);
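The hunks above set explicit 256 KiB send and receive buffers on each socket. The same operation can be tried from Python's standard `socket` module, as in the sketch below, to see what the kernel actually grants. This is only an illustration: the function names are hypothetical, and the effective sizes are platform dependent (Linux, for example, doubles the requested value and caps it at the system limit, and an explicit `SO_RCVBUF` turns off receive-buffer autotuning for that socket).

```python
import socket
from typing import Tuple

DEFAULT_SOCKET_BUFFER_BYTES = 256 * 1024  # same baseline as the Rust constant


def open_socket_with_baseline_buffers() -> socket.socket:
    """Create an unconnected TCP socket and request baseline buffer sizes."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, DEFAULT_SOCKET_BUFFER_BYTES)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, DEFAULT_SOCKET_BUFFER_BYTES)
    return sock


def effective_buffer_sizes(sock: socket.socket) -> Tuple[int, int]:
    """Return (receive, send) buffer sizes as reported by the kernel."""
    rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return rcv, snd
```

Comparing the reported sizes with the requested value is a quick way to check whether host limits are clamping the baseline.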

tools/telemt_api.py (new file, 728 lines)

@@ -0,0 +1,728 @@
"""
Telemt Control API Python Client
Full-coverage client for https://github.com/telemt/telemt
Usage:
client = TememtAPI("http://127.0.0.1:9091", auth_header="your-secret")
client.health()
client.create_user("alice", max_tcp_conns=10)
client.patch_user("alice", data_quota_bytes=1_000_000_000)
client.delete_user("alice")
"""
from __future__ import annotations
import json
import secrets
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Union
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen
# ---------------------------------------------------------------------------
# Exceptions
# ---------------------------------------------------------------------------
class TememtAPIError(Exception):
"""Raised when the API returns an error envelope or a transport error."""
def __init__(self, message: str, code: str | None = None,
http_status: int | None = None, request_id: int | None = None):
super().__init__(message)
self.code = code
self.http_status = http_status
self.request_id = request_id
def __repr__(self) -> str:
return (f"TememtAPIError(message={str(self)!r}, code={self.code!r}, "
f"http_status={self.http_status}, request_id={self.request_id})")
# ---------------------------------------------------------------------------
# Response wrapper
# ---------------------------------------------------------------------------
@dataclass
class APIResponse:
"""Wraps a successful API response envelope."""
ok: bool
data: Any
revision: str | None = None
def __repr__(self) -> str: # pragma: no cover
return f"APIResponse(ok={self.ok}, revision={self.revision!r}, data={self.data!r})"
# ---------------------------------------------------------------------------
# Main client
# ---------------------------------------------------------------------------
class TememtAPI:
"""
HTTP client for the Telemt Control API.
Parameters
----------
base_url:
Scheme + host + port, e.g. ``"http://127.0.0.1:9091"``.
Trailing slash is stripped automatically.
auth_header:
Exact value for the ``Authorization`` header.
Leave *None* when ``auth_header`` is not configured server-side.
timeout:
Socket timeout in seconds for every request (default 10).
"""
def __init__(
self,
base_url: str = "http://127.0.0.1:9091",
auth_header: str | None = None,
timeout: int = 10,
) -> None:
self.base_url = base_url.rstrip("/")
self.auth_header = auth_header
self.timeout = timeout
# ------------------------------------------------------------------
# Low-level HTTP helpers
# ------------------------------------------------------------------
def _headers(self, extra: dict | None = None) -> dict:
h = {"Content-Type": "application/json; charset=utf-8",
"Accept": "application/json"}
if self.auth_header:
h["Authorization"] = self.auth_header
if extra:
h.update(extra)
return h
def _request(
self,
method: str,
path: str,
body: dict | None = None,
if_match: str | None = None,
query: dict | None = None,
) -> APIResponse:
url = self.base_url + path
if query:
# URL-encode keys and values so reserved characters cannot corrupt the query string.
from urllib.parse import urlencode
url = f"{url}?{urlencode(query)}"
raw_body: bytes | None = None
if body is not None:
raw_body = json.dumps(body).encode()
extra_headers: dict = {}
if if_match is not None:
extra_headers["If-Match"] = if_match
req = Request(
url,
data=raw_body,
headers=self._headers(extra_headers),
method=method,
)
try:
with urlopen(req, timeout=self.timeout) as resp:
payload = json.loads(resp.read())
except HTTPError as exc:
raw = exc.read()
try:
payload = json.loads(raw)
except Exception:
raise TememtAPIError(
str(exc), http_status=exc.code
) from exc
err = payload.get("error", {})
raise TememtAPIError(
err.get("message", str(exc)),
code=err.get("code"),
http_status=exc.code,
request_id=payload.get("request_id"),
) from exc
except URLError as exc:
raise TememtAPIError(str(exc)) from exc
if not payload.get("ok"):
err = payload.get("error", {})
raise TememtAPIError(
err.get("message", "unknown error"),
code=err.get("code"),
request_id=payload.get("request_id"),
)
return APIResponse(
ok=True,
data=payload.get("data"),
revision=payload.get("revision"),
)
def _get(self, path: str, query: dict | None = None) -> APIResponse:
return self._request("GET", path, query=query)
def _post(self, path: str, body: dict | None = None,
if_match: str | None = None) -> APIResponse:
return self._request("POST", path, body=body, if_match=if_match)
def _patch(self, path: str, body: dict,
if_match: str | None = None) -> APIResponse:
return self._request("PATCH", path, body=body, if_match=if_match)
def _delete(self, path: str, if_match: str | None = None) -> APIResponse:
return self._request("DELETE", path, if_match=if_match)
# ------------------------------------------------------------------
# Health & system
# ------------------------------------------------------------------
def health(self) -> APIResponse:
"""GET /v1/health — liveness probe."""
return self._get("/v1/health")
def system_info(self) -> APIResponse:
"""GET /v1/system/info — binary version, uptime, config hash."""
return self._get("/v1/system/info")
# ------------------------------------------------------------------
# Runtime gates & initialization
# ------------------------------------------------------------------
def runtime_gates(self) -> APIResponse:
"""GET /v1/runtime/gates — admission gates and startup progress."""
return self._get("/v1/runtime/gates")
def runtime_initialization(self) -> APIResponse:
"""GET /v1/runtime/initialization — detailed startup timeline."""
return self._get("/v1/runtime/initialization")
# ------------------------------------------------------------------
# Limits & security
# ------------------------------------------------------------------
def limits_effective(self) -> APIResponse:
"""GET /v1/limits/effective — effective timeout/upstream/ME limits."""
return self._get("/v1/limits/effective")
def security_posture(self) -> APIResponse:
"""GET /v1/security/posture — API auth, telemetry, log-level summary."""
return self._get("/v1/security/posture")
def security_whitelist(self) -> APIResponse:
"""GET /v1/security/whitelist — current IP whitelist CIDRs."""
return self._get("/v1/security/whitelist")
# ------------------------------------------------------------------
# Stats
# ------------------------------------------------------------------
def stats_summary(self) -> APIResponse:
"""GET /v1/stats/summary — uptime, connection totals, user count."""
return self._get("/v1/stats/summary")
def stats_zero_all(self) -> APIResponse:
"""GET /v1/stats/zero/all — zero-cost counters (core, upstream, ME, pool, desync)."""
return self._get("/v1/stats/zero/all")
def stats_upstreams(self) -> APIResponse:
"""GET /v1/stats/upstreams — upstream health + zero counters."""
return self._get("/v1/stats/upstreams")
def stats_minimal_all(self) -> APIResponse:
"""GET /v1/stats/minimal/all — ME writers + DC snapshot (requires minimal_runtime_enabled)."""
return self._get("/v1/stats/minimal/all")
def stats_me_writers(self) -> APIResponse:
"""GET /v1/stats/me-writers — per-writer ME status (requires minimal_runtime_enabled)."""
return self._get("/v1/stats/me-writers")
def stats_dcs(self) -> APIResponse:
"""GET /v1/stats/dcs — per-DC coverage and writer counts (requires minimal_runtime_enabled)."""
return self._get("/v1/stats/dcs")
# ------------------------------------------------------------------
# Runtime deep-dive
# ------------------------------------------------------------------
def runtime_me_pool_state(self) -> APIResponse:
"""GET /v1/runtime/me_pool_state — ME pool generation/writer/refill snapshot."""
return self._get("/v1/runtime/me_pool_state")
def runtime_me_quality(self) -> APIResponse:
"""GET /v1/runtime/me_quality — ME KDF, route-drop, and per-DC RTT counters."""
return self._get("/v1/runtime/me_quality")
def runtime_upstream_quality(self) -> APIResponse:
"""GET /v1/runtime/upstream_quality — per-upstream health, latency, DC preferences."""
return self._get("/v1/runtime/upstream_quality")
def runtime_nat_stun(self) -> APIResponse:
"""GET /v1/runtime/nat_stun — NAT probe state, STUN servers, reflected IPs."""
return self._get("/v1/runtime/nat_stun")
def runtime_me_selftest(self) -> APIResponse:
"""GET /v1/runtime/me-selftest — KDF/timeskew/IP/PID/BND health state."""
return self._get("/v1/runtime/me-selftest")
def runtime_connections_summary(self) -> APIResponse:
"""GET /v1/runtime/connections/summary — live connection totals + top-N users (requires runtime_edge_enabled)."""
return self._get("/v1/runtime/connections/summary")
def runtime_events_recent(self, limit: int | None = None) -> APIResponse:
"""GET /v1/runtime/events/recent — recent ring-buffer events (requires runtime_edge_enabled).
Parameters
----------
limit:
Optional cap on returned events (1-1000, server default 50).
"""
query = {"limit": str(limit)} if limit is not None else None
return self._get("/v1/runtime/events/recent", query=query)
# ------------------------------------------------------------------
# Users (read)
# ------------------------------------------------------------------
def list_users(self) -> APIResponse:
"""GET /v1/users — list all users with connection/traffic info."""
return self._get("/v1/users")
def get_user(self, username: str) -> APIResponse:
"""GET /v1/users/{username} — single user info."""
return self._get(f"/v1/users/{_safe(username)}")
# ------------------------------------------------------------------
# Users (write)
# ------------------------------------------------------------------
def create_user(
self,
username: str,
*,
secret: str | None = None,
user_ad_tag: str | None = None,
max_tcp_conns: int | None = None,
expiration_rfc3339: str | None = None,
data_quota_bytes: int | None = None,
max_unique_ips: int | None = None,
if_match: str | None = None,
) -> APIResponse:
"""POST /v1/users — create a new user.
Parameters
----------
username:
``[A-Za-z0-9_.-]``, length 1-64.
secret:
Exactly 32 hex chars. Auto-generated if omitted.
user_ad_tag:
Exactly 32 hex chars.
max_tcp_conns:
Per-user concurrent TCP limit.
expiration_rfc3339:
RFC3339 expiration timestamp, e.g. ``"2025-12-31T23:59:59Z"``.
data_quota_bytes:
Per-user traffic quota in bytes.
max_unique_ips:
Per-user unique source IP limit.
if_match:
Optional ``If-Match`` revision for optimistic concurrency.
"""
body: Dict[str, Any] = {"username": username}
_opt(body, "secret", secret)
_opt(body, "user_ad_tag", user_ad_tag)
_opt(body, "max_tcp_conns", max_tcp_conns)
_opt(body, "expiration_rfc3339", expiration_rfc3339)
_opt(body, "data_quota_bytes", data_quota_bytes)
_opt(body, "max_unique_ips", max_unique_ips)
return self._post("/v1/users", body=body, if_match=if_match)
def patch_user(
self,
username: str,
*,
secret: str | None = None,
user_ad_tag: str | None = None,
max_tcp_conns: int | None = None,
expiration_rfc3339: str | None = None,
data_quota_bytes: int | None = None,
max_unique_ips: int | None = None,
if_match: str | None = None,
) -> APIResponse:
"""PATCH /v1/users/{username} — partial update; only provided fields change.
Parameters
----------
username:
Existing username to update.
secret:
New secret (32 hex chars).
user_ad_tag:
New ad tag (32 hex chars).
max_tcp_conns:
New TCP concurrency limit.
expiration_rfc3339:
New expiration timestamp.
data_quota_bytes:
New quota in bytes.
max_unique_ips:
New unique IP limit.
if_match:
Optional ``If-Match`` revision.
"""
body: Dict[str, Any] = {}
_opt(body, "secret", secret)
_opt(body, "user_ad_tag", user_ad_tag)
_opt(body, "max_tcp_conns", max_tcp_conns)
_opt(body, "expiration_rfc3339", expiration_rfc3339)
_opt(body, "data_quota_bytes", data_quota_bytes)
_opt(body, "max_unique_ips", max_unique_ips)
if not body:
raise ValueError("patch_user: at least one field must be provided")
return self._patch(f"/v1/users/{_safe(username)}", body=body,
if_match=if_match)
def delete_user(
self,
username: str,
*,
if_match: str | None = None,
) -> APIResponse:
"""DELETE /v1/users/{username} — remove user; blocks deletion of last user.
Parameters
----------
if_match:
Optional ``If-Match`` revision for optimistic concurrency.
"""
return self._delete(f"/v1/users/{_safe(username)}", if_match=if_match)
# NOTE: POST /v1/users/{username}/rotate-secret currently returns 404
# in the route matcher (documented limitation). The method is provided
# for completeness and future compatibility.
def rotate_secret(
self,
username: str,
*,
secret: str | None = None,
if_match: str | None = None,
) -> APIResponse:
"""POST /v1/users/{username}/rotate-secret — rotate user secret.
.. warning::
This endpoint currently returns ``404 not_found`` in all released
versions (documented route matcher limitation). The method is
included for future compatibility.
Parameters
----------
secret:
New secret (32 hex chars). Auto-generated if omitted.
"""
body: Dict[str, Any] = {}
_opt(body, "secret", secret)
return self._post(f"/v1/users/{_safe(username)}/rotate-secret",
body=body or None, if_match=if_match)
# ------------------------------------------------------------------
# Convenience helpers
# ------------------------------------------------------------------
@staticmethod
def generate_secret() -> str:
"""Generate a random 32-character hex secret suitable for user creation."""
return secrets.token_hex(16) # 16 bytes → 32 hex chars
# ---------------------------------------------------------------------------
# Internal helpers
# ---------------------------------------------------------------------------
def _safe(username: str) -> str:
"""Minimal guard: reject obvious path-injection attempts."""
if "/" in username or "\\" in username:
raise ValueError(f"Invalid username: {username!r}")
return username
def _opt(d: dict, key: str, value: Any) -> None:
"""Add key to dict only when value is not None."""
if value is not None:
d[key] = value
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def _print(resp: APIResponse) -> None:
print(json.dumps(resp.data, indent=2))
if resp.revision:
print(f"# revision: {resp.revision}", flush=True)
def _build_parser():
import argparse
p = argparse.ArgumentParser(
prog="telemt_api.py",
description="Telemt Control API CLI",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
COMMANDS (read)
health Liveness check
info System info (version, uptime, config hash)
status Runtime gates + startup progress
init Runtime initialization timeline
limits Effective limits (timeouts, upstream, ME)
posture Security posture summary
whitelist IP whitelist entries
summary Stats summary (conns, uptime, users)
zero Zero-cost counters (core/upstream/ME/pool/desync)
upstreams Upstream health + zero counters
minimal ME writers + DC snapshot [minimal_runtime_enabled]
me-writers Per-writer ME status [minimal_runtime_enabled]
dcs Per-DC coverage [minimal_runtime_enabled]
me-pool ME pool generation/writer/refill snapshot
me-quality ME KDF, route-drops, per-DC RTT
upstream-quality Per-upstream health + latency
nat-stun NAT probe state + STUN servers
me-selftest KDF/timeskew/IP/PID/BND health
connections Live connection totals + top-N [runtime_edge_enabled]
events [--limit N] Recent ring-buffer events [runtime_edge_enabled]
COMMANDS (users)
users List all users
user <username> Get single user
create <username> [OPTIONS] Create user
patch <username> [OPTIONS] Partial update user
delete <username> Delete user
secret <username> [--secret S] Rotate secret (reserved; returns 404 in current release)
gen-secret Print a random 32-hex secret and exit
USER OPTIONS (for create / patch)
--secret S 32 hex chars
--ad-tag S 32 hex chars (ad tag)
--max-conns N Max concurrent TCP connections
--expires DATETIME RFC3339 expiration (e.g. 2026-12-31T23:59:59Z)
--quota N Data quota in bytes
--max-ips N Max unique source IPs
EXAMPLES
telemt_api.py health
telemt_api.py -u http://10.0.0.1:9091 -a mysecret users
telemt_api.py create alice --max-conns 5 --quota 10000000000
telemt_api.py patch alice --expires 2027-01-01T00:00:00Z
telemt_api.py delete alice
telemt_api.py events --limit 20
""",
)
p.add_argument("-u", "--url", default="http://127.0.0.1:9091",
metavar="URL", help="API base URL (default: http://127.0.0.1:9091)")
p.add_argument("-a", "--auth", default=None, metavar="TOKEN",
help="Authorization header value")
p.add_argument("-t", "--timeout", type=int, default=10, metavar="SEC",
help="Request timeout in seconds (default: 10)")
p.add_argument("command", nargs="?", default="help",
help="Command to run (see COMMANDS below)")
p.add_argument("arg", nargs="?", default=None, metavar="USERNAME",
help="Username for user commands")
# user create/patch fields
p.add_argument("--secret", default=None)
p.add_argument("--ad-tag", dest="ad_tag", default=None)
p.add_argument("--max-conns", dest="max_conns", type=int, default=None)
p.add_argument("--expires", default=None)
p.add_argument("--quota", type=int, default=None)
p.add_argument("--max-ips", dest="max_ips", type=int, default=None)
# events
p.add_argument("--limit", type=int, default=None,
help="Max events for `events` command")
# optimistic concurrency
p.add_argument("--if-match", dest="if_match", default=None,
metavar="REVISION", help="If-Match revision header")
return p
if __name__ == "__main__":
import sys
parser = _build_parser()
args = parser.parse_args()
cmd = (args.command or "help").lower()
if cmd in ("help", "--help"):
parser.print_help()
sys.exit(0)
if cmd == "gen-secret":
print(TememtAPI.generate_secret())
sys.exit(0)
api = TememtAPI(args.url, auth_header=args.auth, timeout=args.timeout)
try:
# -- read endpoints --------------------------------------------------
if cmd == "health":
_print(api.health())
elif cmd == "info":
_print(api.system_info())
elif cmd == "status":
_print(api.runtime_gates())
elif cmd == "init":
_print(api.runtime_initialization())
elif cmd == "limits":
_print(api.limits_effective())
elif cmd == "posture":
_print(api.security_posture())
elif cmd == "whitelist":
_print(api.security_whitelist())
elif cmd == "summary":
_print(api.stats_summary())
elif cmd == "zero":
_print(api.stats_zero_all())
elif cmd == "upstreams":
_print(api.stats_upstreams())
elif cmd == "minimal":
_print(api.stats_minimal_all())
elif cmd == "me-writers":
_print(api.stats_me_writers())
elif cmd == "dcs":
_print(api.stats_dcs())
elif cmd == "me-pool":
_print(api.runtime_me_pool_state())
elif cmd == "me-quality":
_print(api.runtime_me_quality())
elif cmd == "upstream-quality":
_print(api.runtime_upstream_quality())
elif cmd == "nat-stun":
_print(api.runtime_nat_stun())
elif cmd == "me-selftest":
_print(api.runtime_me_selftest())
elif cmd == "connections":
_print(api.runtime_connections_summary())
elif cmd == "events":
_print(api.runtime_events_recent(limit=args.limit))
# -- user read -------------------------------------------------------
elif cmd == "users":
resp = api.list_users()
users = resp.data or []
if not users:
print("No users configured.")
else:
fmt = "{:<24} {:>7} {:>14} {}"
print(fmt.format("USERNAME", "CONNS", "OCTETS", "LINKS"))
print("-" * 72)
for u in users:
links = (u.get("links") or {})
all_links = (links.get("classic") or []) + \
(links.get("secure") or []) + \
(links.get("tls") or [])
link_str = all_links[0] if all_links else "-"
print(fmt.format(
u["username"],
u.get("current_connections", 0),
u.get("total_octets", 0),
link_str,
))
if resp.revision:
print(f"# revision: {resp.revision}")
elif cmd == "user":
if not args.arg:
parser.error("user command requires <username>")
_print(api.get_user(args.arg))
# -- user write ------------------------------------------------------
elif cmd == "create":
if not args.arg:
parser.error("create command requires <username>")
resp = api.create_user(
args.arg,
secret=args.secret,
user_ad_tag=args.ad_tag,
max_tcp_conns=args.max_conns,
expiration_rfc3339=args.expires,
data_quota_bytes=args.quota,
max_unique_ips=args.max_ips,
if_match=args.if_match,
)
d = resp.data or {}
print(f"Created: {d.get('user', {}).get('username')}")
print(f"Secret: {d.get('secret')}")
links = (d.get("user") or {}).get("links") or {}
for kind, lst in links.items():
for link in (lst or []):
print(f"Link ({kind}): {link}")
if resp.revision:
print(f"# revision: {resp.revision}")
elif cmd == "patch":
if not args.arg:
parser.error("patch command requires <username>")
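# Compare against None so explicit zero values (e.g. --quota 0) still count as provided.
if all(v is None for v in (args.secret, args.ad_tag, args.max_conns,
args.expires, args.quota, args.max_ips)):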
parser.error("patch requires at least one field (--secret, --max-conns, --expires, --quota, --max-ips, --ad-tag)")
_print(api.patch_user(
args.arg,
secret=args.secret,
user_ad_tag=args.ad_tag,
max_tcp_conns=args.max_conns,
expiration_rfc3339=args.expires,
data_quota_bytes=args.quota,
max_unique_ips=args.max_ips,
if_match=args.if_match,
))
elif cmd == "delete":
if not args.arg:
parser.error("delete command requires <username>")
resp = api.delete_user(args.arg, if_match=args.if_match)
print(f"Deleted: {resp.data}")
if resp.revision:
print(f"# revision: {resp.revision}")
elif cmd == "secret":
if not args.arg:
parser.error("secret command requires <username>")
_print(api.rotate_secret(args.arg, secret=args.secret,
if_match=args.if_match))
else:
print(f"Unknown command: {cmd!r}\nRun with 'help' to see available commands.",
file=sys.stderr)
sys.exit(1)
except TememtAPIError as exc:
print(f"API error [{exc.http_status}] {exc.code}: {exc}", file=sys.stderr)
sys.exit(1)
except KeyboardInterrupt:
sys.exit(130)
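The client's `_request` method relies on a response envelope with an `ok` flag, a `data` payload, an optional `revision`, and, on failure, an `error` object with `code` and `message` plus a `request_id`. The standalone sketch below isolates just that parsing step so the envelope shape is easy to see. `EnvelopeError` and `parse_envelope` are hypothetical names, and the field layout is taken from the client code above rather than from the server's API specification.

```python
import json
from typing import Any, Optional, Tuple


class EnvelopeError(Exception):
    """Raised when the envelope reports ok == false."""

    def __init__(self, message: str, code: Optional[str] = None,
                 request_id: Optional[int] = None) -> None:
        super().__init__(message)
        self.code = code
        self.request_id = request_id


def parse_envelope(raw: bytes) -> Tuple[Any, Optional[str]]:
    """Return (data, revision) for a success envelope, raise otherwise."""
    payload = json.loads(raw)
    if not payload.get("ok"):
        err = payload.get("error") or {}
        raise EnvelopeError(
            err.get("message", "unknown error"),
            code=err.get("code"),
            request_id=payload.get("request_id"),
        )
    return payload.get("data"), payload.get("revision")
```

A caller that wants optimistic concurrency would keep the returned `revision` and pass it back as `if_match` on the next write.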


@@ -1165,6 +1165,60 @@ zabbix_export:
tags:
- tag: Application
value: 'Users connections'
graph_prototypes:
- uuid: 4199de3dcea943d8a1ec62dc297b2e9f
name: 'User {#TELEMT_USER}: Connections'
graph_items:
- color: 1A7C11
item:
host: Telemt
key: 'telemt.active_conn_[{#TELEMT_USER}]'
- color: F63100
sortorder: '1'
item:
host: Telemt
key: 'telemt.total_conn_[{#TELEMT_USER}]'
- uuid: 84b8f22d891e49768891f497cac12fb3
name: 'User {#TELEMT_USER}: IPs'
graph_items:
- color: 0080FF
item:
host: Telemt
key: 'telemt.ips_current_[{#TELEMT_USER}]'
- color: FF8000
sortorder: '1'
item:
host: Telemt
key: 'telemt.ips_limit_[{#TELEMT_USER}]'
- color: AA00FF
sortorder: '2'
item:
host: Telemt
key: 'telemt.ips_utilization_[{#TELEMT_USER}]'
- uuid: 09dabe7125114e36a6ce40788a7cb888
name: 'User {#TELEMT_USER}: Traffic'
graph_items:
- color: 00AA00
item:
host: Telemt
key: 'telemt.octets_from_[{#TELEMT_USER}]'
- color: AA0000
sortorder: '1'
item:
host: Telemt
key: 'telemt.octets_to_[{#TELEMT_USER}]'
- uuid: 367f458962574b0ab3c02278a4cd7ecb
name: 'User {#TELEMT_USER}: Messages'
graph_items:
- color: 00AAFF
item:
host: Telemt
key: 'telemt.msgs_from_[{#TELEMT_USER}]'
- color: FF5500
sortorder: '1'
item:
host: Telemt
key: 'telemt.msgs_to_[{#TELEMT_USER}]'
master_item:
key: telemt.prom_metrics
lld_macro_paths:
@@ -1177,3 +1231,206 @@ zabbix_export:
      tags:
        - tag: target
          value: Telemt
  graphs:
    - uuid: f162658049ca4f50893c5cc02515ff10
      name: 'Telemt: Server Connections Overview'
      graph_items:
        - color: 1A7C11
          item:
            host: Telemt
            key: telemt.conn_total
        - color: F63100
          sortorder: '1'
          item:
            host: Telemt
            key: telemt.conn_bad_total
        - color: FC6EA3
          sortorder: '2'
          item:
            host: Telemt
            key: telemt.handshake_timeouts_total
    - uuid: 759eca5e687142f19248f9d9343e1adf
      name: 'Telemt: Uptime'
      graph_items:
        - color: 0080FF
          item:
            host: Telemt
            key: telemt.uptime
    - uuid: 0a27dbd0490d4a508c03ed39fa18545d
      name: 'Telemt: ME Keepalive'
      graph_items:
        - color: 1A7C11
          item:
            host: Telemt
            key: telemt.me_keepalive_sent_total
        - color: 00AA00
          sortorder: '1'
          item:
            host: Telemt
            key: telemt.me_keepalive_pong_total
        - color: F63100
          sortorder: '2'
          item:
            host: Telemt
            key: telemt.me_keepalive_failed_total
        - color: FF8000
          sortorder: '3'
          item:
            host: Telemt
            key: telemt.me_keepalive_timeout_total
    - uuid: 4015e24ff70b49f484e884d1dde687c0
      name: 'Telemt: ME Reconnects'
      graph_items:
        - color: 0080FF
          item:
            host: Telemt
            key: telemt.me_reconnect_attempts_total
        - color: 1A7C11
          sortorder: '1'
          item:
            host: Telemt
            key: telemt.me_reconnect_success_total
    - uuid: f3e3eeb0663c471aa26cf4b6872b0c50
      name: 'Telemt: ME Route Drops'
      graph_items:
        - color: F63100
          item:
            host: Telemt
            key: telemt.me_route_drop_channel_closed_total
        - color: FF8000
          sortorder: '1'
          item:
            host: Telemt
            key: telemt.me_route_drop_no_conn_total
        - color: AA00FF
          sortorder: '2'
          item:
            host: Telemt
            key: telemt.me_route_drop_queue_full_total
    - uuid: 49b51ed78a5943bdbd6d1d34fe28bf61
      name: 'Telemt: ME Writer Pool'
      graph_items:
        - color: 0080FF
          item:
            host: Telemt
            key: telemt.pool_drain_active
        - color: F63100
          sortorder: '1'
          item:
            host: Telemt
            key: telemt.pool_force_close_total
        - color: FF8000
          sortorder: '2'
          item:
            host: Telemt
            key: telemt.pool_stale_pick_total
        - color: 1A7C11
          sortorder: '3'
          item:
            host: Telemt
            key: telemt.pool_swap_total
    - uuid: a0779e6c979f4c1ab7ac4da7123a5ecb
      name: 'Telemt: ME Writer Removals and Restores'
      graph_items:
        - color: F63100
          item:
            host: Telemt
            key: telemt.me_writer_removed_total
        - color: FF8000
          sortorder: '1'
          item:
            host: Telemt
            key: telemt.me_writer_removed_unexpected_total
        - color: FFAA00
          sortorder: '2'
          item:
            host: Telemt
            key: telemt.me_writer_removed_unexpected_minus_restored_total
        - color: 1A7C11
          sortorder: '3'
          item:
            host: Telemt
            key: telemt.me_writer_restored_same_endpoint_total
        - color: 00AA00
          sortorder: '4'
          item:
            host: Telemt
            key: telemt.me_writer_restored_fallback_total
    - uuid: 4fead70290664953b026a228108bee0e
      name: 'Telemt: Desync Detections'
      graph_items:
        - color: F63100
          item:
            host: Telemt
            key: telemt.desync_total
        - color: 1A7C11
          sortorder: '1'
          item:
            host: Telemt
            key: telemt.desync_full_logged_total
        - color: FF8000
          sortorder: '2'
          item:
            host: Telemt
            key: telemt.desync_suppressed_total
    - uuid: 9f8c9f48cb534a66ac21b1bba1acb602
      name: 'Telemt: Upstream Connect Cycles'
      graph_items:
        - color: 0080FF
          item:
            host: Telemt
            key: telemt.upstream_connect_attempt_total
        - color: 1A7C11
          sortorder: '1'
          item:
            host: Telemt
            key: telemt.upstream_connect_success_total
        - color: F63100
          sortorder: '2'
          item:
            host: Telemt
            key: telemt.upstream_connect_fail_total
        - color: FF8000
          sortorder: '3'
          item:
            host: Telemt
            key: telemt.upstream_connect_failfast_hard_error_total
    - uuid: 05182057727547f8b8884b7e71e34f19
      name: 'Telemt: ME Single-Endpoint Outages'
      graph_items:
        - color: F63100
          item:
            host: Telemt
            key: telemt.me_single_endpoint_outage_enter_total
        - color: 1A7C11
          sortorder: '1'
          item:
            host: Telemt
            key: telemt.me_single_endpoint_outage_exit_total
        - color: 0080FF
          sortorder: '2'
          item:
            host: Telemt
            key: telemt.me_single_endpoint_outage_reconnect_attempt_total
        - color: 00AA00
          sortorder: '3'
          item:
            host: Telemt
            key: telemt.me_single_endpoint_outage_reconnect_success_total
    - uuid: 6892e8b7fbd2445d9ccc0574af58a354
      name: 'Telemt: ME Refill Activity'
      graph_items:
        - color: 0080FF
          item:
            host: Telemt
            key: telemt.me_refill_triggered_total
        - color: F63100
          sortorder: '1'
          item:
            host: Telemt
            key: telemt.me_refill_failed_total
        - color: FF8000
          sortorder: '2'
          item:
            host: Telemt
            key: telemt.me_refill_skipped_inflight_total