Merge pull request #495 from DavidOsipov/rescue/flow-sec-security

Add adversarial and security tests for client, handshake, and relay modules
Potential fix for pull request finding
2026-06-19 09:21:10 +03:00 · 2026-03-19 17:33:08 +03:00 · 2026-03-19 18:23:36 +04:00 · 2026-03-19 17:31:19 +04:00 · 2026-03-19 17:31:19 +04:00 · 2026-03-18 23:02:58 +03:00
70 changed files with 23103 additions and 3897 deletions
@@ -0,0 +1,15 @@
+[bans]
+multiple-versions = "deny"
+wildcards = "allow"
+highlight = "all"
+
+# Explicitly flag the weak cryptography so the agent is forced to justify its existence
+[[bans.skip]]
+name = "md-5"
+version = "*"
+reason = "MUST VERIFY: Only allowed for legacy checksums, never for security."
+
+[[bans.skip]]
+name = "sha1"
+version = "*"
+reason = "MUST VERIFY: Only allowed for backwards compatibility."
@@ -21,3 +21,4 @@ target
 #.idea/

 proxy-secret
+coverage-html/
@@ -5,6 +5,22 @@ Your responses are precise, minimal, and architecturally sound. You are working

 ---

+### Context: The Telemt Project
+
+You are working on **Telemt**, a high-performance, production-grade Telegram MTProxy implementation written in Rust. It is explicitly designed to operate in highly hostile network environments and evade advanced network censorship.
+
+**Adversarial Threat Model:**
+The proxy operates under constant surveillance by DPI (Deep Packet Inspection) systems and active scanners (state firewalls, mobile operator fraud controls). These entities actively probe IPs, analyze protocol handshakes, and look for known proxy signatures to block or throttle traffic.
+
+**Core Architectural Pillars:**
+1. **TLS-Fronting (TLS-F) & TCP-Splitting (TCP-S):** To the outside world, Telemt looks like a standard TLS server. If a client presents a valid MTProxy key, the connection is handled internally. If a censor's scanner, web browser, or unauthorized crawler connects, Telemt seamlessly splices the TCP connection (L4) to a real, legitimate HTTPS fallback server (e.g., Nginx) without modifying the `ClientHello` or terminating the TLS handshake.
+2. **Middle-End (ME) Orchestration:** A highly concurrent, generation-based pool managing upstream connections to Telegram Datacenters (DCs). It utilizes an **Adaptive Floor** (dynamically scaling writer connections based on traffic), **Hardswaps** (zero-downtime pool reconfiguration), and **STUN/NAT** reflection mechanisms.
+3. **Strict KDF Routing:** Cryptographic Key Derivation Functions (KDF) in this protocol strictly rely on the exact pairing of Source IP/Port and Destination IP/Port. Deviations or missing port logic will silently break the MTProto handshake.
+4. **Data Plane vs. Control Plane Isolation:** The Data Plane (readers, writers, payload relay, TCP splicing) must remain strictly non-blocking, zero-allocation in hot paths, and highly resilient to network backpressure. The Control Plane (API, metrics, pool generation swaps, config reloads) orchestrates the state asynchronously without stalling the Data Plane.
+
+Any modification you make must preserve Telemt's invisibility to censors, its strict memory-safety invariants, and its hot-path throughput.
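
The "Strict KDF Routing" pillar above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `KdfEndpoints` type, the `kdf_input` function, and the byte layout are hypothetical and are not Telemt's actual code or the MTProto wire format. The only point is that when both endpoints are carried as full `SocketAddr` values, a derivation that forgets the port cannot be written, and any change to the 4-tuple changes the material fed to the KDF.

```rust
use std::net::{IpAddr, SocketAddr};

/// Hypothetical type: both endpoints carry IP and port together,
/// so a port-less derivation cannot be expressed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct KdfEndpoints {
    source: SocketAddr,
    destination: SocketAddr,
}

/// Assembles the bytes that would be fed to a KDF.
/// The layout (ip, port, ip, port, secret) is an illustrative assumption,
/// and no actual hashing is performed here.
fn kdf_input(endpoints: &KdfEndpoints, secret: &[u8]) -> Vec<u8> {
    let mut buf = Vec::new();
    for addr in [endpoints.source, endpoints.destination] {
        match addr.ip() {
            IpAddr::V4(v4) => buf.extend_from_slice(&v4.octets()),
            IpAddr::V6(v6) => buf.extend_from_slice(&v6.octets()),
        }
        // The port is always part of the input.
        buf.extend_from_slice(&addr.port().to_be_bytes());
    }
    buf.extend_from_slice(secret);
    buf
}

fn main() {
    let a = KdfEndpoints {
        source: "203.0.113.7:50000".parse().unwrap(),
        destination: "198.51.100.2:443".parse().unwrap(),
    };
    // Same IPs, different source port: the KDF input must differ.
    let b = KdfEndpoints {
        source: "203.0.113.7:50001".parse().unwrap(),
        ..a
    };
    assert_ne!(kdf_input(&a, b"secret"), kdf_input(&b, b"secret"));
    assert_eq!(kdf_input(&a, b"secret"), kdf_input(&a, b"secret"));
    println!("kdf input length: {}", kdf_input(&a, b"secret").len());
}
```

A type like this makes the "missing port logic" failure a compile-time problem rather than a silent handshake break, which is the property the pillar asks contributors to preserve.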
+
+
 ### 0. Priority Resolution — Scope Control

 This section resolves conflicts between code quality enforcement and scope limitation.
@@ -374,6 +390,12 @@ you MUST explain why existing invariants remain valid.
 - Do not modify existing tests unless the task explicitly requires it.
 - Do not weaken assertions.
 - Preserve determinism in testable components.
+- Bug-first forces the discipline of proving you understand a bug before you fix it. Tests written after a fix almost always pass trivially and catch nothing new.
+- Invariants over scenarios is the core shift. The `route_mode` table alone would have caught both BUG-1 and BUG-2 before they were written: "snapshot equals watch state after any transition burst" is a two-line property test that fails immediately on the current diverged-atomics code.
+- Differential/model testing catches logic drift over time.
+- Scheduler pressure is specifically aimed at the concurrent state bugs that keep reappearing. A single-threaded happy-path test of `set_mode` will never find subtle bugs; 10,000 concurrent calls will find them on the first run.
+- Mutation gate measures test power directly. If you can remove a bounds check and nothing breaks, the suite is not covering that branch yet, and the gate says so explicitly.
+- Dead parameter is a code-smell rule.
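
The "invariants over scenarios" and "scheduler pressure" bullets above can be combined in one small sketch. This is a toy model, not the project's code: `RouteState`, the `RouteMode` variants, and `run_transition_burst` are hypothetical names, and the real `set_mode` implementation is not shown in this diff. The sketch keeps an authoritative value behind a mutex (standing in for the watch state) plus an atomic snapshot, hammers `set_mode` from several threads, and then checks the invariant "snapshot equals watch state after any transition burst".

```rust
use std::sync::atomic::{AtomicU8, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical route modes; the real variants are not shown in the diff.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
enum RouteMode {
    Direct = 0,
    Middle = 1,
}

impl RouteMode {
    fn from_u8(raw: u8) -> RouteMode {
        if raw == 0 { RouteMode::Direct } else { RouteMode::Middle }
    }
}

/// Toy state: `watch` is authoritative, `snapshot` is a lock-free mirror.
struct RouteState {
    watch: Mutex<RouteMode>,
    snapshot: AtomicU8,
}

impl RouteState {
    fn new(mode: RouteMode) -> Self {
        RouteState { watch: Mutex::new(mode), snapshot: AtomicU8::new(mode as u8) }
    }

    /// Updates both views while holding the lock, so they cannot diverge.
    fn set_mode(&self, mode: RouteMode) {
        let mut guard = self.watch.lock().unwrap();
        *guard = mode;
        self.snapshot.store(mode as u8, Ordering::Release);
    }

    fn snapshot(&self) -> RouteMode {
        RouteMode::from_u8(self.snapshot.load(Ordering::Acquire))
    }

    fn watch_value(&self) -> RouteMode {
        *self.watch.lock().unwrap()
    }
}

/// Runs a concurrent burst of transitions and reports whether the
/// invariant holds once every thread has finished.
fn run_transition_burst(threads: usize, calls_per_thread: usize) -> bool {
    let state = Arc::new(RouteState::new(RouteMode::Direct));
    let handles: Vec<_> = (0..threads)
        .map(|t| {
            let state = Arc::clone(&state);
            thread::spawn(move || {
                for i in 0..calls_per_thread {
                    let mode = if (t + i) % 2 == 0 { RouteMode::Direct } else { RouteMode::Middle };
                    state.set_mode(mode);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    state.snapshot() == state.watch_value()
}

fn main() {
    // 8 threads x 1,250 calls = 10,000 concurrent transitions.
    assert!(run_transition_burst(8, 1_250));
    println!("invariant held after burst");
}
```

If the atomic store were moved outside the locked region, two threads could interleave their writes and leave the two views disagreeing, which is the kind of divergence a single-threaded happy-path test would not exercise.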

 ### 15. Security Constraints

@@ -425,6 +425,32 @@ dependencies = [
 "cipher",
 ]

+[[package]]
+name = "curve25519-dalek"
+version = "4.1.3"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "97fb8b7c4503de7d6ae7b42ab72a5a59857b4c937ec27a3d4539dba95b5ab2be"
+dependencies = [
+ "cfg-if",
+ "cpufeatures",
+ "curve25519-dalek-derive",
+ "fiat-crypto",
+ "rustc_version",
+ "subtle",
+ "zeroize",
+]
+
+[[package]]
+name = "curve25519-dalek-derive"
+version = "0.1.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "f46882e17999c6cc590af592290432be3bce0428cb0d5f8b6715e4dc7b383eb3"
+dependencies = [
+ "proc-macro2",
+ "quote",
+ "syn 2.0.114",
+]
+
 [[package]]
 name = "dashmap"
 version = "5.5.3"
@@ -517,6 +543,12 @@ version = "2.3.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "37909eebbb50d72f9059c3b6d82c0463f2ff062c9e95845c43a6c9c0355411be"

+[[package]]
+name = "fiat-crypto"
+version = "0.2.9"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "28dea519a9695b9977216879a3ebfddf92f1c08c05d984f8996aecd6ecdc811d"
+
 [[package]]
 name = "filetime"
 version = "0.2.27"
@@ -1609,7 +1641,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "6db2770f06117d490610c7488547d543617b21bfa07796d7a12f6f1bd53850d1"
 dependencies = [
 "rand_chacha",
- "rand_core",
+ "rand_core 0.9.5",
 ]

 [[package]]
@@ -1619,9 +1651,15 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "d3022b5f1df60f26e1ffddd6c66e8aa15de382ae63b3a0c1bfc0e4d3e3f325cb"
 dependencies = [
 "ppv-lite86",
- "rand_core",
+ "rand_core 0.9.5",
 ]

+[[package]]
+name = "rand_core"
+version = "0.6.4"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "ec0be4795e2f6a28069bec0b5ff3e2ac9bafc99e6a9a7dc3547996c5c816922c"
+
 [[package]]
 name = "rand_core"
 version = "0.9.5"
@@ -1637,7 +1675,7 @@ version = "0.4.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "513962919efc330f829edb2535844d1b912b0fbe2ca165d613e4e8788bb05a5a"
 dependencies = [
- "rand_core",
+ "rand_core 0.9.5",
 ]

 [[package]]
@@ -2025,6 +2063,12 @@ version = "1.2.1"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "6ce2be8dc25455e1f91df71bfa12ad37d7af1092ae736f3a6cd0e37bc7810596"

+[[package]]
+name = "static_assertions"
+version = "1.1.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "a2eb9349b6444b326872e140eb1cf5e7c522154d69e7a0ffb0fb81c06b37543f"
+
 [[package]]
 name = "subtle"
 version = "2.6.1"
@@ -2087,7 +2131,7 @@ dependencies = [

 [[package]]
 name = "telemt"
-version = "3.3.19"
+version = "3.3.20"
 dependencies = [
 "aes",
 "anyhow",
@@ -2127,6 +2171,8 @@ dependencies = [
 "sha1",
 "sha2",
 "socket2 0.5.10",
+ "static_assertions",
+ "subtle",
 "thiserror 2.0.18",
 "tokio",
 "tokio-rustls",
@@ -2137,6 +2183,7 @@ dependencies = [
 "tracing-subscriber",
 "url",
 "webpki-roots 0.26.11",
+ "x25519-dalek",
 "x509-parser",
 "zeroize",
 ]
@@ -3136,6 +3183,18 @@ version = "0.6.2"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "9edde0db4769d2dc68579893f2306b26c6ecfbe0ef499b013d731b7b9247e0b9"

+[[package]]
+name = "x25519-dalek"
+version = "2.0.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "c7e468321c81fb07fa7f4c636c3972b9100f0346e5b6a9f2bd0603a52f7ed277"
+dependencies = [
+ "curve25519-dalek",
+ "rand_core 0.6.4",
+ "serde",
+ "zeroize",
+]
+
 [[package]]
 name = "x509-parser"
 version = "0.15.1"
@@ -1,6 +1,6 @@
 [package]
 name = "telemt"
-version = "3.3.22"
+version = "3.3.20"
 edition = "2024"

 [dependencies]
@@ -22,6 +22,8 @@ hmac = "0.12"
 crc32fast = "1.4"
 crc32c = "0.6"
 zeroize = { version = "1.8", features = ["derive"] }
+subtle = "2.6"
+static_assertions = "1.1"

 # Network
 socket2 = { version = "0.5", features = ["all"] }
@@ -50,6 +52,7 @@ regex = "1.11"
 crossbeam-queue = "0.3"
 num-bigint = "0.4"
 num-traits = "0.2"
+x25519-dalek = "2"
 anyhow = "1.0"

 # HTTP
@@ -1,289 +0,0 @@
-# Telemt Config Parameters Reference
-
-This document lists all configuration keys accepted by `config.toml`.
-
-> [!WARNING]
-> 
-> The configuration parameters detailed in this document are intended for advanced users and fine-tuning purposes. Modifying these settings without a clear understanding of their function may lead to application instability or other unexpected behavior. Please proceed with caution and at your own risk.
-
-## Top-level keys
-
-| Parameter | Type | Description |
-|---|---|---|
-| include | `String` (special directive) | Includes another TOML file with `include = "relative/or/absolute/path.toml"`; includes are processed recursively before parsing. |
-| show_link | `"*" \| String[]` | Legacy top-level link visibility selector (`"*"` for all users or explicit usernames list). |
-| dc_overrides | `Map<String, String[]>` | Overrides DC endpoints for non-standard DCs; key is DC id string, value is `ip:port` list. |
-| default_dc | `u8` | Default DC index used for unmapped non-standard DCs. |
-
-## [general]
-
-| Parameter | Type | Description |
-|---|---|---|
-| data_path | `String` | Optional runtime data directory path. |
-| prefer_ipv6 | `bool` | Prefer IPv6 where applicable in runtime logic. |
-| fast_mode | `bool` | Enables fast-path optimizations for traffic processing. |
-| use_middle_proxy | `bool` | Enables Middle Proxy mode. |
-| proxy_secret_path | `String` | Path to proxy secret binary; can be auto-downloaded if absent. |
-| proxy_config_v4_cache_path | `String` | Optional cache path for raw `getProxyConfig` (IPv4) snapshot. |
-| proxy_config_v6_cache_path | `String` | Optional cache path for raw `getProxyConfigV6` (IPv6) snapshot. |
-| ad_tag | `String` | Global fallback ad tag (32 hex characters). |
-| middle_proxy_nat_ip | `IpAddr` | Explicit public IP override for NAT environments. |
-| middle_proxy_nat_probe | `bool` | Enables NAT probing for Middle Proxy KDF/public address discovery. |
-| middle_proxy_nat_stun | `String` | Deprecated legacy single STUN server for NAT probing. |
-| middle_proxy_nat_stun_servers | `String[]` | Deprecated legacy STUN list for NAT probing fallback. |
-| stun_nat_probe_concurrency | `usize` | Maximum concurrent STUN probes during NAT detection. |
-| middle_proxy_pool_size | `usize` | Target size of active Middle Proxy writer pool. |
-| middle_proxy_warm_standby | `usize` | Number of warm standby Middle-End connections. |
-| me_init_retry_attempts | `u32` | Startup retries for ME pool initialization (`0` means unlimited). |
-| me2dc_fallback | `bool` | Allows fallback from ME mode to direct DC when ME startup fails. |
-| me_keepalive_enabled | `bool` | Enables ME keepalive padding frames. |
-| me_keepalive_interval_secs | `u64` | Keepalive interval in seconds. |
-| me_keepalive_jitter_secs | `u64` | Keepalive jitter in seconds. |
-| me_keepalive_payload_random | `bool` | Randomizes keepalive payload bytes instead of zero payload. |
-| rpc_proxy_req_every | `u64` | Interval for service `RPC_PROXY_REQ` activity signals (`0` disables). |
-| me_writer_cmd_channel_capacity | `usize` | Capacity of per-writer command channel. |
-| me_route_channel_capacity | `usize` | Capacity of per-connection ME response route channel. |
-| me_c2me_channel_capacity | `usize` | Capacity of per-client command queue (client reader -> ME sender). |
-| me_reader_route_data_wait_ms | `u64` | Bounded wait for routing ME DATA to per-connection queue (`0` = no wait). |
-| me_d2c_flush_batch_max_frames | `usize` | Max ME->client frames coalesced before flush. |
-| me_d2c_flush_batch_max_bytes | `usize` | Max ME->client payload bytes coalesced before flush. |
-| me_d2c_flush_batch_max_delay_us | `u64` | Max microsecond wait for coalescing more ME->client frames (`0` disables timed coalescing). |
-| me_d2c_ack_flush_immediate | `bool` | Flushes client writer immediately after quick-ack write. |
-| direct_relay_copy_buf_c2s_bytes | `usize` | Copy buffer size for client->DC direction in direct relay. |
-| direct_relay_copy_buf_s2c_bytes | `usize` | Copy buffer size for DC->client direction in direct relay. |
-| crypto_pending_buffer | `usize` | Max pending ciphertext buffer per client writer (bytes). |
-| max_client_frame | `usize` | Maximum allowed client MTProto frame size (bytes). |
-| desync_all_full | `bool` | Emits full crypto-desync forensic logs for every event. |
-| beobachten | `bool` | Enables per-IP forensic observation buckets. |
-| beobachten_minutes | `u64` | Retention window (minutes) for per-IP observation buckets. |
-| beobachten_flush_secs | `u64` | Snapshot flush interval (seconds) for observation output file. |
-| beobachten_file | `String` | Observation snapshot output file path. |
-| hardswap | `bool` | Enables hard-swap generation switching for ME pool updates. |
-| me_warmup_stagger_enabled | `bool` | Enables staggered warmup for extra ME writers. |
-| me_warmup_step_delay_ms | `u64` | Base delay between warmup connections (ms). |
-| me_warmup_step_jitter_ms | `u64` | Jitter for warmup delay (ms). |
-| me_reconnect_max_concurrent_per_dc | `u32` | Max concurrent reconnect attempts per DC. |
-| me_reconnect_backoff_base_ms | `u64` | Base reconnect backoff in ms. |
-| me_reconnect_backoff_cap_ms | `u64` | Cap reconnect backoff in ms. |
-| me_reconnect_fast_retry_count | `u32` | Number of fast retry attempts before backoff. |
-| me_single_endpoint_shadow_writers | `u8` | Additional reserve writers for one-endpoint DC groups. |
-| me_single_endpoint_outage_mode_enabled | `bool` | Enables aggressive outage recovery for one-endpoint DC groups. |
-| me_single_endpoint_outage_disable_quarantine | `bool` | Ignores endpoint quarantine in one-endpoint outage mode. |
-| me_single_endpoint_outage_backoff_min_ms | `u64` | Minimum reconnect backoff in outage mode (ms). |
-| me_single_endpoint_outage_backoff_max_ms | `u64` | Maximum reconnect backoff in outage mode (ms). |
-| me_single_endpoint_shadow_rotate_every_secs | `u64` | Periodic shadow writer rotation interval (`0` disables). |
-| me_floor_mode | `"static" \| "adaptive"` | Writer floor policy mode. |
-| me_adaptive_floor_idle_secs | `u64` | Idle time before adaptive floor may reduce one-endpoint target. |
-| me_adaptive_floor_min_writers_single_endpoint | `u8` | Minimum adaptive writer target for one-endpoint DC groups. |
-| me_adaptive_floor_min_writers_multi_endpoint | `u8` | Minimum adaptive writer target for multi-endpoint DC groups. |
-| me_adaptive_floor_recover_grace_secs | `u64` | Grace period to hold static floor after activity. |
-| me_adaptive_floor_writers_per_core_total | `u16` | Global writer budget per logical CPU core in adaptive mode. |
-| me_adaptive_floor_cpu_cores_override | `u16` | Manual CPU core count override (`0` uses auto-detection). |
-| me_adaptive_floor_max_extra_writers_single_per_core | `u16` | Per-core max extra writers above base floor for one-endpoint DCs. |
-| me_adaptive_floor_max_extra_writers_multi_per_core | `u16` | Per-core max extra writers above base floor for multi-endpoint DCs. |
-| me_adaptive_floor_max_active_writers_per_core | `u16` | Hard cap for active ME writers per logical CPU core. |
-| me_adaptive_floor_max_warm_writers_per_core | `u16` | Hard cap for warm ME writers per logical CPU core. |
-| me_adaptive_floor_max_active_writers_global | `u32` | Hard global cap for active ME writers. |
-| me_adaptive_floor_max_warm_writers_global | `u32` | Hard global cap for warm ME writers. |
-| upstream_connect_retry_attempts | `u32` | Connect attempts for selected upstream before error/fallback. |
-| upstream_connect_retry_backoff_ms | `u64` | Delay between upstream connect attempts (ms). |
-| upstream_connect_budget_ms | `u64` | Total wall-clock budget for one upstream connect request (ms). |
-| upstream_unhealthy_fail_threshold | `u32` | Consecutive failed requests before upstream is marked unhealthy. |
-| upstream_connect_failfast_hard_errors | `bool` | Skips additional retries for hard non-transient connect errors. |
-| stun_iface_mismatch_ignore | `bool` | Ignores STUN/interface mismatch and keeps Middle Proxy mode. |
-| unknown_dc_log_path | `String` | File path for unknown-DC request logging (`null` disables file path). |
-| unknown_dc_file_log_enabled | `bool` | Enables unknown-DC file logging. |
-| log_level | `"debug" \| "verbose" \| "normal" \| "silent"` | Runtime logging verbosity. |
-| disable_colors | `bool` | Disables ANSI colors in logs. |
-| me_socks_kdf_policy | `"strict" \| "compat"` | SOCKS-bound KDF fallback policy for ME handshake. |
-| me_route_backpressure_base_timeout_ms | `u64` | Base backpressure timeout for route-channel send (ms). |
-| me_route_backpressure_high_timeout_ms | `u64` | High backpressure timeout when queue occupancy exceeds watermark (ms). |
-| me_route_backpressure_high_watermark_pct | `u8` | Queue occupancy threshold (%) for high timeout mode. |
-| me_health_interval_ms_unhealthy | `u64` | Health monitor interval while writer coverage is degraded (ms). |
-| me_health_interval_ms_healthy | `u64` | Health monitor interval while writer coverage is healthy (ms). |
-| me_admission_poll_ms | `u64` | Poll interval for conditional-admission checks (ms). |
-| me_warn_rate_limit_ms | `u64` | Cooldown for repetitive ME warning logs (ms). |
-| me_route_no_writer_mode | `"async_recovery_failfast" \| "inline_recovery_legacy" \| "hybrid_async_persistent"` | Route behavior when no writer is immediately available. |
-| me_route_no_writer_wait_ms | `u64` | Max wait in async-recovery failfast mode (ms). |
-| me_route_inline_recovery_attempts | `u32` | Inline recovery attempts in legacy mode. |
-| me_route_inline_recovery_wait_ms | `u64` | Max inline recovery wait in legacy mode (ms). |
-| fast_mode_min_tls_record | `usize` | Minimum TLS record size when fast-mode coalescing is enabled (`0` disables). |
-| update_every | `u64` | Unified interval for config/secret updater tasks. |
-| me_reinit_every_secs | `u64` | Periodic ME pool reinitialization interval (seconds). |
-| me_hardswap_warmup_delay_min_ms | `u64` | Minimum delay between hardswap warmup connects (ms). |
-| me_hardswap_warmup_delay_max_ms | `u64` | Maximum delay between hardswap warmup connects (ms). |
-| me_hardswap_warmup_extra_passes | `u8` | Additional warmup passes per hardswap cycle. |
-| me_hardswap_warmup_pass_backoff_base_ms | `u64` | Base backoff between hardswap warmup passes (ms). |
-| me_config_stable_snapshots | `u8` | Number of identical config snapshots required before apply. |
-| me_config_apply_cooldown_secs | `u64` | Cooldown between applied ME map updates (seconds). |
-| me_snapshot_require_http_2xx | `bool` | Requires 2xx HTTP responses for applying config snapshots. |
-| me_snapshot_reject_empty_map | `bool` | Rejects empty config snapshots. |
-| me_snapshot_min_proxy_for_lines | `u32` | Minimum parsed `proxy_for` rows required to accept snapshot. |
-| proxy_secret_stable_snapshots | `u8` | Number of identical secret snapshots required before runtime rotation. |
-| proxy_secret_rotate_runtime | `bool` | Enables runtime proxy-secret rotation from remote source. |
-| me_secret_atomic_snapshot | `bool` | Keeps selector and secret bytes from the same snapshot atomically. |
-| proxy_secret_len_max | `usize` | Maximum allowed proxy-secret length (bytes). |
-| me_pool_drain_ttl_secs | `u64` | Drain TTL for stale ME writers after endpoint-map changes (seconds). |
-| me_pool_drain_threshold | `u64` | Max draining stale writers before batch force-close (`0` disables threshold cleanup). |
-| me_bind_stale_mode | `"never" \| "ttl" \| "always"` | Policy for new binds on stale draining writers. |
-| me_bind_stale_ttl_secs | `u64` | TTL for stale bind allowance when stale mode is `ttl`. |
-| me_pool_min_fresh_ratio | `f32` | Minimum desired-DC fresh coverage ratio before draining stale writers. |
-| me_reinit_drain_timeout_secs | `u64` | Force-close timeout for stale writers after endpoint-map changes (`0` disables force-close). |
-| proxy_secret_auto_reload_secs | `u64` | Deprecated legacy secret reload interval (fallback when `update_every` is not set). |
-| proxy_config_auto_reload_secs | `u64` | Deprecated legacy config reload interval (fallback when `update_every` is not set). |
-| me_reinit_singleflight | `bool` | Serializes ME reinit cycles across trigger sources. |
-| me_reinit_trigger_channel | `usize` | Trigger queue capacity for reinit scheduler. |
-| me_reinit_coalesce_window_ms | `u64` | Trigger coalescing window before starting reinit (ms). |
-| me_deterministic_writer_sort | `bool` | Enables deterministic candidate sort for writer binding path. |
-| me_writer_pick_mode | `"sorted_rr" \| "p2c"` | Writer selection mode for route bind path. |
-| me_writer_pick_sample_size | `u8` | Number of candidates sampled by picker in `p2c` mode. |
-| ntp_check | `bool` | Enables NTP drift check at startup. |
-| ntp_servers | `String[]` | NTP servers used for drift check. |
-| auto_degradation_enabled | `bool` | Enables automatic degradation from ME to direct DC. |
-| degradation_min_unavailable_dc_groups | `u8` | Minimum unavailable ME DC groups required before degrading. |
-
-## [general.modes]
-
-| Parameter | Type | Description |
-|---|---|---|
-| classic | `bool` | Enables classic MTProxy mode. |
-| secure | `bool` | Enables secure mode. |
-| tls | `bool` | Enables TLS mode. |
-
-## [general.links]
-
-| Parameter | Type | Description |
-|---|---|---|
-| show | `"*" \| String[]` | Selects users whose tg:// links are shown at startup. |
-| public_host | `String` | Public hostname/IP override for generated tg:// links. |
-| public_port | `u16` | Public port override for generated tg:// links. |
-
-## [general.telemetry]
-
-| Parameter | Type | Description |
-|---|---|---|
-| core_enabled | `bool` | Enables core hot-path telemetry counters. |
-| user_enabled | `bool` | Enables per-user telemetry counters. |
-| me_level | `"silent" \| "normal" \| "debug"` | Middle-End telemetry verbosity level. |
-
-## [network]
-
-| Parameter | Type | Description |
-|---|---|---|
-| ipv4 | `bool` | Enables IPv4 networking. |
-| ipv6 | `bool` | Enables/disables IPv6 (`null` = auto-detect availability). |
-| prefer | `u8` | Preferred IP family for selection (`4` or `6`). |
-| multipath | `bool` | Enables multipath behavior where supported. |
-| stun_use | `bool` | Global switch for STUN probing. |
-| stun_servers | `String[]` | STUN server list for public IP detection. |
-| stun_tcp_fallback | `bool` | Enables TCP STUN fallback when UDP STUN is blocked. |
-| http_ip_detect_urls | `String[]` | HTTP endpoints used as fallback public IP detectors. |
-| cache_public_ip_path | `String` | File path for caching detected public IP. |
-| dns_overrides | `String[]` | Runtime DNS overrides in `host:port:ip` format. |
-
-## [server]
-
-| Parameter | Type | Description |
-|---|---|---|
-| port | `u16` | Main proxy listen port. |
-| listen_addr_ipv4 | `String` | IPv4 bind address for TCP listener. |
-| listen_addr_ipv6 | `String` | IPv6 bind address for TCP listener. |
-| listen_unix_sock | `String` | Unix socket path for listener. |
-| listen_unix_sock_perm | `String` | Unix socket permissions in octal string (e.g., `"0666"`). |
-| listen_tcp | `bool` | Explicit TCP listener enable/disable override. |
-| proxy_protocol | `bool` | Enables HAProxy PROXY protocol parsing on incoming client connections. |
-| proxy_protocol_header_timeout_ms | `u64` | Timeout for PROXY protocol header read/parse (ms). |
-| metrics_port | `u16` | Metrics endpoint port (enables metrics listener). |
-| metrics_listen | `String` | Full metrics bind address (`IP:PORT`), overrides `metrics_port`. |
-| metrics_whitelist | `IpNetwork[]` | CIDR whitelist for metrics endpoint access. |
-| max_connections | `u32` | Max concurrent client connections (`0` = unlimited). |
-
-## [server.api]
-
-| Parameter | Type | Description |
-|---|---|---|
-| enabled | `bool` | Enables control-plane REST API. |
-| listen | `String` | API bind address in `IP:PORT` format. |
-| whitelist | `IpNetwork[]` | CIDR whitelist allowed to access API. |
-| auth_header | `String` | Exact expected `Authorization` header value (empty = disabled). |
-| request_body_limit_bytes | `usize` | Maximum accepted HTTP request body size. |
-| minimal_runtime_enabled | `bool` | Enables minimal runtime snapshots endpoint logic. |
-| minimal_runtime_cache_ttl_ms | `u64` | Cache TTL for minimal runtime snapshots (ms; `0` disables cache). |
-| runtime_edge_enabled | `bool` | Enables runtime edge endpoints. |
-| runtime_edge_cache_ttl_ms | `u64` | Cache TTL for runtime edge aggregation payloads (ms). |
-| runtime_edge_top_n | `usize` | Top-N size for edge connection leaderboard. |
-| runtime_edge_events_capacity | `usize` | Ring-buffer capacity for runtime edge events. |
-| read_only | `bool` | Rejects mutating API endpoints when enabled. |
-
-## [[server.listeners]]
-
-| Parameter | Type | Description |
-|---|---|---|
-| ip | `IpAddr` | Listener bind IP. |
-| announce | `String` | Public IP/domain announced in proxy links (priority over `announce_ip`). |
-| announce_ip | `IpAddr` | Deprecated legacy announce IP (migrated to `announce` if needed). |
-| proxy_protocol | `bool` | Per-listener override for PROXY protocol enable flag. |
-| reuse_allow | `bool` | Enables `SO_REUSEPORT` for multi-instance bind sharing. |
-
-## [timeouts]
-
-| Parameter | Type | Description |
-|---|---|---|
-| client_handshake | `u64` | Client handshake timeout. |
-| tg_connect | `u64` | Upstream Telegram connect timeout. |
-| client_keepalive | `u64` | Client keepalive timeout. |
-| client_ack | `u64` | Client ACK timeout. |
-| me_one_retry | `u8` | Quick ME reconnect attempts for single-address DC. |
-| me_one_timeout_ms | `u64` | Timeout per quick attempt for single-address DC (ms). |
-
-## [censorship]
-
-| Parameter | Type | Description |
-|---|---|---|
-| tls_domain | `String` | Primary TLS domain used in fake TLS handshake profile. |
-| tls_domains | `String[]` | Additional TLS domains for generating multiple links. |
-| mask | `bool` | Enables masking/fronting relay mode. |
-| mask_host | `String` | Upstream mask host for TLS fronting relay. |
-| mask_port | `u16` | Upstream mask port for TLS fronting relay. |
-| mask_unix_sock | `String` | Unix socket path for mask backend instead of TCP host/port. |
-| fake_cert_len | `usize` | Length of synthetic certificate payload when emulation data is unavailable. |
-| tls_emulation | `bool` | Enables certificate/TLS behavior emulation from cached real fronts. |
-| tls_front_dir | `String` | Directory path for TLS front cache storage. |
-| server_hello_delay_min_ms | `u64` | Minimum server_hello delay for anti-fingerprint behavior (ms). |
-| server_hello_delay_max_ms | `u64` | Maximum server_hello delay for anti-fingerprint behavior (ms). |
-| tls_new_session_tickets | `u8` | Number of `NewSessionTicket` messages to emit after handshake. |
-| tls_full_cert_ttl_secs | `u64` | TTL for sending full cert payload per (domain, client IP) tuple. |
-| alpn_enforce | `bool` | Enforces ALPN echo behavior based on client preference. |
-| mask_proxy_protocol | `u8` | PROXY protocol mode for mask backend (`0` disabled, `1` v1, `2` v2). |
-
-## [access]
-
-| Parameter | Type | Description |
-|---|---|---|
-| users | `Map<String, String>` | Username -> 32-hex secret mapping. |
-| user_ad_tags | `Map<String, String>` | Per-user ad tags (32 hex chars). |
-| user_max_tcp_conns | `Map<String, usize>` | Per-user maximum concurrent TCP connections. |
-| user_expirations | `Map<String, DateTime<Utc>>` | Per-user account expiration timestamps. |
-| user_data_quota | `Map<String, u64>` | Per-user data quota limits. |
-| user_max_unique_ips | `Map<String, usize>` | Per-user unique source IP limits. |
-| user_max_unique_ips_global_each | `usize` | Global fallback per-user unique IP limit when no per-user override exists. |
-| user_max_unique_ips_mode | `"active_window" \| "time_window" \| "combined"` | Unique source IP limit accounting mode. |
-| user_max_unique_ips_window_secs | `u64` | Recent-window size for unique IP accounting (seconds). |
-| replay_check_len | `usize` | Replay check storage length. |
-| replay_window_secs | `u64` | Replay protection time window in seconds. |
-| ignore_time_skew | `bool` | Ignores client/server timestamp skew in replay validation. |
-
-## [[upstreams]]
-
-| Parameter | Type | Description |
-|---|---|---|
-| type | `"direct" \| "socks4" \| "socks5"` | Upstream transport type selector. |
-| weight | `u16` | Weighted selection coefficient for this upstream. |
-| enabled | `bool` | Enables/disables this upstream entry. |
-| scopes | `String` | Comma-separated scope tags for routing. |
-| interface | `String` | Optional outgoing interface name (`direct`, `socks4`, `socks5`). |
-| bind_addresses | `String[]` | Optional source bind addresses for `direct` upstream. |
-| address | `String` | Upstream proxy address (`host:port`) for SOCKS upstreams. |
-| user_id | `String` | SOCKS4 user ID (only for `type = "socks4"`). |
-| username | `String` | SOCKS5 username (only for `type = "socks5"`). |
-| password | `String` | SOCKS5 password (only for `type = "socks5"`). |
@@ -195,8 +195,6 @@ pub(super) struct ZeroPoolData {
    pub(super) pool_swap_total: u64,
    pub(super) pool_drain_active: u64,
    pub(super) pool_force_close_total: u64,
-    pub(super) pool_drain_soft_evict_total: u64,
-    pub(super) pool_drain_soft_evict_writer_total: u64,
    pub(super) pool_stale_pick_total: u64,
    pub(super) writer_removed_total: u64,
    pub(super) writer_removed_unexpected_total: u64,
@@ -237,7 +235,6 @@ pub(super) struct MeWritersSummary {
    pub(super) available_pct: f64,
    pub(super) required_writers: usize,
    pub(super) alive_writers: usize,
-    pub(super) coverage_ratio: f64,
    pub(super) coverage_pct: f64,
    pub(super) fresh_alive_writers: usize,
    pub(super) fresh_coverage_pct: f64,
@@ -286,7 +283,6 @@ pub(super) struct DcStatus {
    pub(super) floor_max: usize,
    pub(super) floor_capped: bool,
    pub(super) alive_writers: usize,
-    pub(super) coverage_ratio: f64,
    pub(super) coverage_pct: f64,
    pub(super) fresh_alive_writers: usize,
    pub(super) fresh_coverage_pct: f64,
@@ -364,11 +360,6 @@ pub(super) struct MinimalMeRuntimeData {
    pub(super) me_reconnect_backoff_cap_ms: u64,
    pub(super) me_reconnect_fast_retry_count: u32,
    pub(super) me_pool_drain_ttl_secs: u64,
-    pub(super) me_pool_drain_soft_evict_enabled: bool,
-    pub(super) me_pool_drain_soft_evict_grace_secs: u64,
-    pub(super) me_pool_drain_soft_evict_per_writer: u8,
-    pub(super) me_pool_drain_soft_evict_budget_per_core: u16,
-    pub(super) me_pool_drain_soft_evict_cooldown_ms: u64,
    pub(super) me_pool_force_close_secs: u64,
    pub(super) me_pool_min_fresh_ratio: f32,
    pub(super) me_bind_stale_mode: &'static str,
@@ -113,7 +113,6 @@ pub(super) struct RuntimeMeQualityDcRttData {
    pub(super) rtt_ema_ms: Option<f64>,
    pub(super) alive_writers: usize,
    pub(super) required_writers: usize,
-    pub(super) coverage_ratio: f64,
    pub(super) coverage_pct: f64,
 }

@@ -389,7 +388,6 @@ pub(super) async fn build_runtime_me_quality_data(shared: &ApiShared) -> Runtime
                    rtt_ema_ms: dc.rtt_ms,
                    alive_writers: dc.alive_writers,
                    required_writers: dc.required_writers,
-                    coverage_ratio: dc.coverage_ratio,
                    coverage_pct: dc.coverage_pct,
                })
                .collect(),
@@ -96,8 +96,6 @@ pub(super) fn build_zero_all_data(stats: &Stats, configured_users: usize) -> Zer
            pool_swap_total: stats.get_pool_swap_total(),
            pool_drain_active: stats.get_pool_drain_active(),
            pool_force_close_total: stats.get_pool_force_close_total(),
-            pool_drain_soft_evict_total: stats.get_pool_drain_soft_evict_total(),
-            pool_drain_soft_evict_writer_total: stats.get_pool_drain_soft_evict_writer_total(),
            pool_stale_pick_total: stats.get_pool_stale_pick_total(),
            writer_removed_total: stats.get_me_writer_removed_total(),
            writer_removed_unexpected_total: stats.get_me_writer_removed_unexpected_total(),
@@ -315,7 +313,6 @@ async fn get_minimal_payload_cached(
            available_pct: status.available_pct,
            required_writers: status.required_writers,
            alive_writers: status.alive_writers,
-            coverage_ratio: status.coverage_ratio,
            coverage_pct: status.coverage_pct,
            fresh_alive_writers: status.fresh_alive_writers,
            fresh_coverage_pct: status.fresh_coverage_pct,
@@ -373,7 +370,6 @@ async fn get_minimal_payload_cached(
                floor_max: entry.floor_max,
                floor_capped: entry.floor_capped,
                alive_writers: entry.alive_writers,
-                coverage_ratio: entry.coverage_ratio,
                coverage_pct: entry.coverage_pct,
                fresh_alive_writers: entry.fresh_alive_writers,
                fresh_coverage_pct: entry.fresh_coverage_pct,
@@ -431,11 +427,6 @@ async fn get_minimal_payload_cached(
        me_reconnect_backoff_cap_ms: runtime.me_reconnect_backoff_cap_ms,
        me_reconnect_fast_retry_count: runtime.me_reconnect_fast_retry_count,
        me_pool_drain_ttl_secs: runtime.me_pool_drain_ttl_secs,
-        me_pool_drain_soft_evict_enabled: runtime.me_pool_drain_soft_evict_enabled,
-        me_pool_drain_soft_evict_grace_secs: runtime.me_pool_drain_soft_evict_grace_secs,
-        me_pool_drain_soft_evict_per_writer: runtime.me_pool_drain_soft_evict_per_writer,
-        me_pool_drain_soft_evict_budget_per_core: runtime.me_pool_drain_soft_evict_budget_per_core,
-        me_pool_drain_soft_evict_cooldown_ms: runtime.me_pool_drain_soft_evict_cooldown_ms,
        me_pool_force_close_secs: runtime.me_pool_force_close_secs,
        me_pool_min_fresh_ratio: runtime.me_pool_min_fresh_ratio,
        me_bind_stale_mode: runtime.me_bind_stale_mode,
@@ -504,7 +495,6 @@ fn disabled_me_writers(now_epoch_secs: u64, reason: &'static str) -> MeWritersDa
            available_pct: 0.0,
            required_writers: 0,
            alive_writers: 0,
-            coverage_ratio: 0.0,
            coverage_pct: 0.0,
            fresh_alive_writers: 0,
            fresh_coverage_pct: 0.0,
@@ -239,7 +239,7 @@ tls_full_cert_ttl_secs = 90

 [access]
 replay_check_len = 65536
-replay_window_secs = 1800
+replay_window_secs = 120
 ignore_time_skew = false

 [access.users]
@@ -27,8 +27,8 @@ const DEFAULT_ME_C2ME_CHANNEL_CAPACITY: usize = 1024;
 const DEFAULT_ME_READER_ROUTE_DATA_WAIT_MS: u64 = 2;
 const DEFAULT_ME_D2C_FLUSH_BATCH_MAX_FRAMES: usize = 32;
 const DEFAULT_ME_D2C_FLUSH_BATCH_MAX_BYTES: usize = 128 * 1024;
-const DEFAULT_ME_D2C_FLUSH_BATCH_MAX_DELAY_US: u64 = 500;
-const DEFAULT_ME_D2C_ACK_FLUSH_IMMEDIATE: bool = true;
+const DEFAULT_ME_D2C_FLUSH_BATCH_MAX_DELAY_US: u64 = 1500;
+const DEFAULT_ME_D2C_ACK_FLUSH_IMMEDIATE: bool = false;
 const DEFAULT_DIRECT_RELAY_COPY_BUF_C2S_BYTES: usize = 64 * 1024;
 const DEFAULT_DIRECT_RELAY_COPY_BUF_S2C_BYTES: usize = 256 * 1024;
 const DEFAULT_ME_WRITER_PICK_SAMPLE_SIZE: u8 = 3;
@@ -36,11 +36,6 @@ const DEFAULT_ME_HEALTH_INTERVAL_MS_UNHEALTHY: u64 = 1000;
 const DEFAULT_ME_HEALTH_INTERVAL_MS_HEALTHY: u64 = 3000;
 const DEFAULT_ME_ADMISSION_POLL_MS: u64 = 1000;
 const DEFAULT_ME_WARN_RATE_LIMIT_MS: u64 = 5000;
-const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_ENABLED: bool = true;
-const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_GRACE_SECS: u64 = 30;
-const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_PER_WRITER: u8 = 1;
-const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_BUDGET_PER_CORE: u16 = 8;
-const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_COOLDOWN_MS: u64 = 5000;
 const DEFAULT_USER_MAX_UNIQUE_IPS_WINDOW_SECS: u64 = 30;
 const DEFAULT_UPSTREAM_CONNECT_RETRY_ATTEMPTS: u32 = 2;
 const DEFAULT_UPSTREAM_UNHEALTHY_FAIL_THRESHOLD: u32 = 5;
@@ -78,7 +73,9 @@ pub(crate) fn default_replay_check_len() -> usize {
 }

 pub(crate) fn default_replay_window_secs() -> u64 {
-    1800
+    // Keep replay cache TTL tight by default to reduce replay surface.
+    // Deployments with higher RTT or longer reconnect jitter can override this in config.
+    120
 }

 pub(crate) fn default_handshake_timeout() -> u64 {
@@ -90,11 +87,11 @@ pub(crate) fn default_connect_timeout() -> u64 {
 }

 pub(crate) fn default_keepalive() -> u64 {
-    15
+    60
 }

 pub(crate) fn default_ack_timeout() -> u64 {
-    90
+    300
 }
 pub(crate) fn default_me_one_retry() -> u8 {
    12
@@ -461,11 +458,11 @@ pub(crate) fn default_tls_full_cert_ttl_secs() -> u64 {
 }

 pub(crate) fn default_server_hello_delay_min_ms() -> u64 {
-    0
+    8
 }

 pub(crate) fn default_server_hello_delay_max_ms() -> u64 {
-    0
+    24
 }

 pub(crate) fn default_alpn_enforce() -> bool {
@@ -597,26 +594,6 @@ pub(crate) fn default_me_pool_drain_threshold() -> u64 {
    128
 }

-pub(crate) fn default_me_pool_drain_soft_evict_enabled() -> bool {
-    DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_ENABLED
-}
-
-pub(crate) fn default_me_pool_drain_soft_evict_grace_secs() -> u64 {
-    DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_GRACE_SECS
-}
-
-pub(crate) fn default_me_pool_drain_soft_evict_per_writer() -> u8 {
-    DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_PER_WRITER
-}
-
-pub(crate) fn default_me_pool_drain_soft_evict_budget_per_core() -> u16 {
-    DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_BUDGET_PER_CORE
-}
-
-pub(crate) fn default_me_pool_drain_soft_evict_cooldown_ms() -> u64 {
-    DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_COOLDOWN_MS
-}
-
 pub(crate) fn default_me_bind_stale_ttl_secs() -> u64 {
    default_me_pool_drain_ttl_secs()
 }
@@ -37,9 +37,7 @@ use crate::config::{
 };
 use super::load::{LoadedConfig, ProxyConfig};

-const HOT_RELOAD_STABLE_SNAPSHOTS: u8 = 2;
 const HOT_RELOAD_DEBOUNCE: Duration = Duration::from_millis(50);
-const HOT_RELOAD_STABLE_RECHECK: Duration = Duration::from_millis(75);

 // ── Hot fields ────────────────────────────────────────────────────────────────

@@ -57,11 +55,6 @@ pub struct HotFields {
    pub hardswap:                bool,
    pub me_pool_drain_ttl_secs:  u64,
    pub me_pool_drain_threshold: u64,
-    pub me_pool_drain_soft_evict_enabled: bool,
-    pub me_pool_drain_soft_evict_grace_secs: u64,
-    pub me_pool_drain_soft_evict_per_writer: u8,
-    pub me_pool_drain_soft_evict_budget_per_core: u16,
-    pub me_pool_drain_soft_evict_cooldown_ms: u64,
    pub me_pool_min_fresh_ratio: f32,
    pub me_reinit_drain_timeout_secs: u64,
    pub me_hardswap_warmup_delay_min_ms: u64,
@@ -144,15 +137,6 @@ impl HotFields {
            hardswap:                cfg.general.hardswap,
            me_pool_drain_ttl_secs:  cfg.general.me_pool_drain_ttl_secs,
            me_pool_drain_threshold: cfg.general.me_pool_drain_threshold,
-            me_pool_drain_soft_evict_enabled: cfg.general.me_pool_drain_soft_evict_enabled,
-            me_pool_drain_soft_evict_grace_secs: cfg.general.me_pool_drain_soft_evict_grace_secs,
-            me_pool_drain_soft_evict_per_writer: cfg.general.me_pool_drain_soft_evict_per_writer,
-            me_pool_drain_soft_evict_budget_per_core: cfg
-                .general
-                .me_pool_drain_soft_evict_budget_per_core,
-            me_pool_drain_soft_evict_cooldown_ms: cfg
-                .general
-                .me_pool_drain_soft_evict_cooldown_ms,
            me_pool_min_fresh_ratio: cfg.general.me_pool_min_fresh_ratio,
            me_reinit_drain_timeout_secs: cfg.general.me_reinit_drain_timeout_secs,
            me_hardswap_warmup_delay_min_ms: cfg.general.me_hardswap_warmup_delay_min_ms,
@@ -344,49 +328,19 @@ impl WatchManifest {
 #[derive(Debug, Default)]
 struct ReloadState {
    applied_snapshot_hash: Option<u64>,
-    candidate_snapshot_hash: Option<u64>,
-    candidate_hits: u8,
 }

 impl ReloadState {
    fn new(applied_snapshot_hash: Option<u64>) -> Self {
-        Self {
-            applied_snapshot_hash,
-            candidate_snapshot_hash: None,
-            candidate_hits: 0,
-        }
+        Self { applied_snapshot_hash }
    }

    fn is_applied(&self, hash: u64) -> bool {
        self.applied_snapshot_hash == Some(hash)
    }

-    fn observe_candidate(&mut self, hash: u64) -> u8 {
-        if self.candidate_snapshot_hash == Some(hash) {
-            self.candidate_hits = self.candidate_hits.saturating_add(1);
-        } else {
-            self.candidate_snapshot_hash = Some(hash);
-            self.candidate_hits = 1;
-        }
-        self.candidate_hits
-    }
-
-    fn reset_candidate(&mut self) {
-        self.candidate_snapshot_hash = None;
-        self.candidate_hits = 0;
-    }
-
    fn mark_applied(&mut self, hash: u64) {
        self.applied_snapshot_hash = Some(hash);
-        self.reset_candidate();
-    }
-
-    fn pending_candidate(&self) -> Option<(u64, u8)> {
-        let hash = self.candidate_snapshot_hash?;
-        if self.candidate_hits < HOT_RELOAD_STABLE_SNAPSHOTS {
-            return Some((hash, self.candidate_hits));
-        }
-        None
    }
 }

@@ -478,15 +432,6 @@ fn overlay_hot_fields(old: &ProxyConfig, new: &ProxyConfig) -> ProxyConfig {
    cfg.general.hardswap = new.general.hardswap;
    cfg.general.me_pool_drain_ttl_secs = new.general.me_pool_drain_ttl_secs;
    cfg.general.me_pool_drain_threshold = new.general.me_pool_drain_threshold;
-    cfg.general.me_pool_drain_soft_evict_enabled = new.general.me_pool_drain_soft_evict_enabled;
-    cfg.general.me_pool_drain_soft_evict_grace_secs =
-        new.general.me_pool_drain_soft_evict_grace_secs;
-    cfg.general.me_pool_drain_soft_evict_per_writer =
-        new.general.me_pool_drain_soft_evict_per_writer;
-    cfg.general.me_pool_drain_soft_evict_budget_per_core =
-        new.general.me_pool_drain_soft_evict_budget_per_core;
-    cfg.general.me_pool_drain_soft_evict_cooldown_ms =
-        new.general.me_pool_drain_soft_evict_cooldown_ms;
    cfg.general.me_pool_min_fresh_ratio = new.general.me_pool_min_fresh_ratio;
    cfg.general.me_reinit_drain_timeout_secs = new.general.me_reinit_drain_timeout_secs;
    cfg.general.me_hardswap_warmup_delay_min_ms = new.general.me_hardswap_warmup_delay_min_ms;
@@ -867,25 +812,6 @@ fn log_changes(
            old_hot.me_pool_drain_threshold, new_hot.me_pool_drain_threshold,
        );
    }
-    if old_hot.me_pool_drain_soft_evict_enabled != new_hot.me_pool_drain_soft_evict_enabled
-        || old_hot.me_pool_drain_soft_evict_grace_secs
-            != new_hot.me_pool_drain_soft_evict_grace_secs
-        || old_hot.me_pool_drain_soft_evict_per_writer
-            != new_hot.me_pool_drain_soft_evict_per_writer
-        || old_hot.me_pool_drain_soft_evict_budget_per_core
-            != new_hot.me_pool_drain_soft_evict_budget_per_core
-        || old_hot.me_pool_drain_soft_evict_cooldown_ms
-            != new_hot.me_pool_drain_soft_evict_cooldown_ms
-    {
-        info!(
-            "config reload: me_pool_drain_soft_evict: enabled={} grace={}s per_writer={} budget_per_core={} cooldown={}ms",
-            new_hot.me_pool_drain_soft_evict_enabled,
-            new_hot.me_pool_drain_soft_evict_grace_secs,
-            new_hot.me_pool_drain_soft_evict_per_writer,
-            new_hot.me_pool_drain_soft_evict_budget_per_core,
-            new_hot.me_pool_drain_soft_evict_cooldown_ms
-        );
-    }

    if (old_hot.me_pool_min_fresh_ratio - new_hot.me_pool_min_fresh_ratio).abs() > f32::EPSILON {
        info!(
@@ -1189,7 +1115,6 @@ fn reload_config(
    let loaded = match ProxyConfig::load_with_metadata(config_path) {
        Ok(loaded) => loaded,
        Err(e) => {
-            reload_state.reset_candidate();
            error!("config reload: failed to parse {:?}: {}", config_path, e);
            return None;
        }
@@ -1202,7 +1127,6 @@ fn reload_config(
    let next_manifest = WatchManifest::from_source_files(&source_files);

    if let Err(e) = new_cfg.validate() {
-        reload_state.reset_candidate();
        error!("config reload: validation failed: {}; keeping old config", e);
        return Some(next_manifest);
    }
@@ -1211,17 +1135,6 @@ fn reload_config(
        return Some(next_manifest);
    }

-    let candidate_hits = reload_state.observe_candidate(rendered_hash);
-    if candidate_hits < HOT_RELOAD_STABLE_SNAPSHOTS {
-        info!(
-            snapshot_hash = rendered_hash,
-            candidate_hits,
-            required_hits = HOT_RELOAD_STABLE_SNAPSHOTS,
-            "config reload: candidate snapshot observed but not stable yet"
-        );
-        return Some(next_manifest);
-    }
-
    let old_cfg = config_tx.borrow().clone();
    let applied_cfg = overlay_hot_fields(&old_cfg, &new_cfg);
    let old_hot = HotFields::from_config(&old_cfg);
@@ -1241,7 +1154,6 @@ fn reload_config(
    if old_hot.dns_overrides != applied_hot.dns_overrides
        && let Err(e) = crate::network::dns_overrides::install_entries(&applied_hot.dns_overrides)
    {
-        reload_state.reset_candidate();
        error!(
            "config reload: invalid network.dns_overrides: {}; keeping old config",
            e
@@ -1262,73 +1174,6 @@ fn reload_config(
    Some(next_manifest)
 }

-async fn reload_with_internal_stable_rechecks(
-    config_path: &PathBuf,
-    config_tx: &watch::Sender<Arc<ProxyConfig>>,
-    log_tx: &watch::Sender<LogLevel>,
-    detected_ip_v4: Option<IpAddr>,
-    detected_ip_v6: Option<IpAddr>,
-    reload_state: &mut ReloadState,
-) -> Option<WatchManifest> {
-    let mut next_manifest = reload_config(
-        config_path,
-        config_tx,
-        log_tx,
-        detected_ip_v4,
-        detected_ip_v6,
-        reload_state,
-    );
-    let mut rechecks_left = HOT_RELOAD_STABLE_SNAPSHOTS.saturating_sub(1);
-
-    while rechecks_left > 0 {
-        let Some((snapshot_hash, candidate_hits)) = reload_state.pending_candidate() else {
-            break;
-        };
-
-        info!(
-            snapshot_hash,
-            candidate_hits,
-            required_hits = HOT_RELOAD_STABLE_SNAPSHOTS,
-            rechecks_left,
-            recheck_delay_ms = HOT_RELOAD_STABLE_RECHECK.as_millis(),
-            "config reload: scheduling internal stable recheck"
-        );
-        tokio::time::sleep(HOT_RELOAD_STABLE_RECHECK).await;
-
-        let recheck_manifest = reload_config(
-            config_path,
-            config_tx,
-            log_tx,
-            detected_ip_v4,
-            detected_ip_v6,
-            reload_state,
-        );
-        if recheck_manifest.is_some() {
-            next_manifest = recheck_manifest;
-        }
-
-        if reload_state.is_applied(snapshot_hash) {
-            info!(
-                snapshot_hash,
-                "config reload: applied after internal stable recheck"
-            );
-            break;
-        }
-
-        if reload_state.pending_candidate().is_none() {
-            info!(
-                snapshot_hash,
-                "config reload: internal stable recheck aborted"
-            );
-            break;
-        }
-
-        rechecks_left = rechecks_left.saturating_sub(1);
-    }
-
-    next_manifest
-}
-
 // ── Public API ────────────────────────────────────────────────────────────────

 /// Spawn the hot-reload watcher task.
@@ -1452,16 +1297,28 @@ pub fn spawn_config_watcher(
            tokio::time::sleep(HOT_RELOAD_DEBOUNCE).await;
            while notify_rx.try_recv().is_ok() {}

-            if let Some(next_manifest) = reload_with_internal_stable_rechecks(
+            let mut next_manifest = reload_config(
                &config_path,
                &config_tx,
                &log_tx,
                detected_ip_v4,
                detected_ip_v6,
                &mut reload_state,
-            )
-            .await
-            {
+            );
+            if next_manifest.is_none() {
+                tokio::time::sleep(HOT_RELOAD_DEBOUNCE).await;
+                while notify_rx.try_recv().is_ok() {}
+                next_manifest = reload_config(
+                    &config_path,
+                    &config_tx,
+                    &log_tx,
+                    detected_ip_v4,
+                    detected_ip_v6,
+                    &mut reload_state,
+                );
+            }
+
+            if let Some(next_manifest) = next_manifest {
                apply_watch_manifest(
                    inotify_watcher.as_mut(),
                    poll_watcher.as_mut(),
@@ -1586,7 +1443,7 @@ mod tests {
    }

    #[test]
-    fn reload_requires_stable_snapshot_before_hot_apply() {
+    fn reload_applies_hot_change_on_first_observed_snapshot() {
        let initial_tag = "11111111111111111111111111111111";
        let final_tag = "22222222222222222222222222222222";
        let path = temp_config_path("telemt_hot_reload_stable");
@@ -1598,55 +1455,13 @@ mod tests {
        let (log_tx, _log_rx) = watch::channel(initial_cfg.general.log_level.clone());
        let mut reload_state = ReloadState::new(Some(initial_hash));

-        write_reload_config(&path, None, None);
-        reload_config(&path, &config_tx, &log_tx, None, None, &mut reload_state).unwrap();
-        assert_eq!(
-            config_tx.borrow().general.ad_tag.as_deref(),
-            Some(initial_tag)
-        );
-
        write_reload_config(&path, Some(final_tag), None);
-        reload_config(&path, &config_tx, &log_tx, None, None, &mut reload_state).unwrap();
-        assert_eq!(
-            config_tx.borrow().general.ad_tag.as_deref(),
-            Some(initial_tag)
-        );
-
        reload_config(&path, &config_tx, &log_tx, None, None, &mut reload_state).unwrap();
        assert_eq!(config_tx.borrow().general.ad_tag.as_deref(), Some(final_tag));

        let _ = std::fs::remove_file(path);
    }

-    #[tokio::test]
-    async fn reload_cycle_applies_after_single_external_event() {
-        let initial_tag = "10101010101010101010101010101010";
-        let final_tag = "20202020202020202020202020202020";
-        let path = temp_config_path("telemt_hot_reload_single_event");
-
-        write_reload_config(&path, Some(initial_tag), None);
-        let initial_cfg = Arc::new(ProxyConfig::load(&path).unwrap());
-        let initial_hash = ProxyConfig::load_with_metadata(&path).unwrap().rendered_hash;
-        let (config_tx, _config_rx) = watch::channel(initial_cfg.clone());
-        let (log_tx, _log_rx) = watch::channel(initial_cfg.general.log_level.clone());
-        let mut reload_state = ReloadState::new(Some(initial_hash));
-
-        write_reload_config(&path, Some(final_tag), None);
-        reload_with_internal_stable_rechecks(
-            &path,
-            &config_tx,
-            &log_tx,
-            None,
-            None,
-            &mut reload_state,
-        )
-        .await
-        .unwrap();
-
-        assert_eq!(config_tx.borrow().general.ad_tag.as_deref(), Some(final_tag));
-        let _ = std::fs::remove_file(path);
-    }
-
    #[test]
    fn reload_keeps_hot_apply_when_non_hot_fields_change() {
        let initial_tag = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
@@ -1662,7 +1477,6 @@ mod tests {

        write_reload_config(&path, Some(final_tag), Some(initial_cfg.server.port + 1));
        reload_config(&path, &config_tx, &log_tx, None, None, &mut reload_state).unwrap();
-        reload_config(&path, &config_tx, &log_tx, None, None, &mut reload_state).unwrap();

        let applied = config_tx.borrow().clone();
        assert_eq!(applied.general.ad_tag.as_deref(), Some(final_tag));
@@ -1670,4 +1484,31 @@ mod tests {

        let _ = std::fs::remove_file(path);
    }
+
+    #[test]
+    fn reload_recovers_after_parse_error_on_next_attempt() {
+        let initial_tag = "cccccccccccccccccccccccccccccccc";
+        let final_tag = "dddddddddddddddddddddddddddddddd";
+        let path = temp_config_path("telemt_hot_reload_parse_recovery");
+
+        write_reload_config(&path, Some(initial_tag), None);
+        let initial_cfg = Arc::new(ProxyConfig::load(&path).unwrap());
+        let initial_hash = ProxyConfig::load_with_metadata(&path).unwrap().rendered_hash;
+        let (config_tx, _config_rx) = watch::channel(initial_cfg.clone());
+        let (log_tx, _log_rx) = watch::channel(initial_cfg.general.log_level.clone());
+        let mut reload_state = ReloadState::new(Some(initial_hash));
+
+        std::fs::write(&path, "[access.users\nuser = \"broken\"\n").unwrap();
+        assert!(reload_config(&path, &config_tx, &log_tx, None, None, &mut reload_state).is_none());
+        assert_eq!(
+            config_tx.borrow().general.ad_tag.as_deref(),
+            Some(initial_tag)
+        );
+
+        write_reload_config(&path, Some(final_tag), None);
+        reload_config(&path, &config_tx, &log_tx, None, None, &mut reload_state).unwrap();
+        assert_eq!(config_tx.borrow().general.ad_tag.as_deref(), Some(final_tag));
+
+        let _ = std::fs::remove_file(path);
+    }
 }
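
Review note: with the stable-snapshot counters removed, the reload gate reduces to "skip if this hash is already applied, otherwise apply and remember it". A minimal std-only sketch of that gate follows. `snapshot_hash` and `observe` are hypothetical helpers for illustration, and hashing the raw text with `DefaultHasher` is an assumption that stands in for the real `rendered_hash`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Mirrors the simplified state in the diff: only the applied hash is kept.
#[derive(Debug, Default)]
struct ReloadState {
    applied_snapshot_hash: Option<u64>,
}

impl ReloadState {
    fn is_applied(&self, hash: u64) -> bool {
        self.applied_snapshot_hash == Some(hash)
    }

    fn mark_applied(&mut self, hash: u64) {
        self.applied_snapshot_hash = Some(hash);
    }
}

/// Hypothetical stand-in for the rendered-config hash.
fn snapshot_hash(rendered: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    rendered.hash(&mut hasher);
    hasher.finish()
}

/// Returns true when the snapshot is applied, false when it is a no-op.
fn observe(state: &mut ReloadState, rendered: &str) -> bool {
    let hash = snapshot_hash(rendered);
    if state.is_applied(hash) {
        return false;
    }
    // No candidate counting: the first observation of a new hash wins.
    state.mark_applied(hash);
    true
}

fn main() {
    let mut state = ReloadState::default();
    assert!(observe(&mut state, "ad_tag = \"1111\""));
    assert!(!observe(&mut state, "ad_tag = \"1111\""));
    assert!(observe(&mut state, "ad_tag = \"2222\""));
}
```

The trade-off is that a half-written file that happens to parse and validate is applied immediately; the diff appears to rely on the debounce and the single retry after a parse failure to cover that window.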
@@ -406,35 +406,6 @@ impl ProxyConfig {
            ));
        }

-        if config.general.me_pool_drain_soft_evict_grace_secs > 3600 {
-            return Err(ProxyError::Config(
-                "general.me_pool_drain_soft_evict_grace_secs must be within [0, 3600]".to_string(),
-            ));
-        }
-
-        if config.general.me_pool_drain_soft_evict_per_writer == 0
-            || config.general.me_pool_drain_soft_evict_per_writer > 16
-        {
-            return Err(ProxyError::Config(
-                "general.me_pool_drain_soft_evict_per_writer must be within [1, 16]".to_string(),
-            ));
-        }
-
-        if config.general.me_pool_drain_soft_evict_budget_per_core == 0
-            || config.general.me_pool_drain_soft_evict_budget_per_core > 64
-        {
-            return Err(ProxyError::Config(
-                "general.me_pool_drain_soft_evict_budget_per_core must be within [1, 64]"
-                    .to_string(),
-            ));
-        }
-
-        if config.general.me_pool_drain_soft_evict_cooldown_ms == 0 {
-            return Err(ProxyError::Config(
-                "general.me_pool_drain_soft_evict_cooldown_ms must be > 0".to_string(),
-            ));
-        }
-
        if config.access.user_max_unique_ips_window_secs == 0 {
            return Err(ProxyError::Config(
                "access.user_max_unique_ips_window_secs must be > 0".to_string(),
@@ -803,26 +803,6 @@ pub struct GeneralConfig {
    #[serde(default = "default_me_pool_drain_threshold")]
    pub me_pool_drain_threshold: u64,

-    /// Enable staged client eviction for draining ME writers that remain non-empty past TTL.
-    #[serde(default = "default_me_pool_drain_soft_evict_enabled")]
-    pub me_pool_drain_soft_evict_enabled: bool,
-
-    /// Extra grace in seconds after drain TTL before soft-eviction stage starts.
-    #[serde(default = "default_me_pool_drain_soft_evict_grace_secs")]
-    pub me_pool_drain_soft_evict_grace_secs: u64,
-
-    /// Maximum number of client sessions to evict from one draining writer per health tick.
-    #[serde(default = "default_me_pool_drain_soft_evict_per_writer")]
-    pub me_pool_drain_soft_evict_per_writer: u8,
-
-    /// Soft-eviction budget per CPU core for one health tick.
-    #[serde(default = "default_me_pool_drain_soft_evict_budget_per_core")]
-    pub me_pool_drain_soft_evict_budget_per_core: u16,
-
-    /// Cooldown for repetitive soft-eviction on the same writer in milliseconds.
-    #[serde(default = "default_me_pool_drain_soft_evict_cooldown_ms")]
-    pub me_pool_drain_soft_evict_cooldown_ms: u64,
-
    /// Policy for new binds on stale draining writers.
    #[serde(default)]
    pub me_bind_stale_mode: MeBindStaleMode,
@@ -1004,13 +984,6 @@ impl Default for GeneralConfig {
            proxy_secret_len_max: default_proxy_secret_len_max(),
            me_pool_drain_ttl_secs: default_me_pool_drain_ttl_secs(),
            me_pool_drain_threshold: default_me_pool_drain_threshold(),
-            me_pool_drain_soft_evict_enabled: default_me_pool_drain_soft_evict_enabled(),
-            me_pool_drain_soft_evict_grace_secs: default_me_pool_drain_soft_evict_grace_secs(),
-            me_pool_drain_soft_evict_per_writer: default_me_pool_drain_soft_evict_per_writer(),
-            me_pool_drain_soft_evict_budget_per_core:
-                default_me_pool_drain_soft_evict_budget_per_core(),
-            me_pool_drain_soft_evict_cooldown_ms:
-                default_me_pool_drain_soft_evict_cooldown_ms(),
            me_bind_stale_mode: MeBindStaleMode::default(),
            me_bind_stale_ttl_secs: default_me_bind_stale_ttl_secs(),
            me_pool_min_fresh_ratio: default_me_pool_min_fresh_ratio(),
@@ -1183,6 +1156,13 @@ pub struct ServerConfig {
    #[serde(default = "default_proxy_protocol_header_timeout_ms")]
    pub proxy_protocol_header_timeout_ms: u64,

+    /// Trusted source CIDRs allowed to send incoming PROXY protocol headers.
+    ///
+    /// When non-empty, connections from addresses outside this allowlist are
+    /// rejected before `src_addr` is applied.
+    #[serde(default)]
+    pub proxy_protocol_trusted_cidrs: Vec<IpNetwork>,
+
    /// Port for the Prometheus-compatible metrics endpoint.
    /// Enables metrics when set; binds on all interfaces (dual-stack) by default.
    #[serde(default)]
@@ -1220,6 +1200,7 @@ impl Default for ServerConfig {
            listen_tcp: None,
            proxy_protocol: false,
            proxy_protocol_header_timeout_ms: default_proxy_protocol_header_timeout_ms(),
+            proxy_protocol_trusted_cidrs: Vec::new(),
            metrics_port: None,
            metrics_listen: None,
            metrics_whitelist: default_metrics_whitelist(),
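
Review note: the doc comment on `proxy_protocol_trusted_cidrs` describes an allowlist check that runs before the PROXY header's `src_addr` is trusted. A rough std-only sketch of such a check follows. `Cidr4` and `proxy_header_allowed` are hypothetical names; the diff uses `IpNetwork`. Treating an empty list as "no restriction" and rejecting IPv6 peers when the list is non-empty are assumptions made only to keep the example short.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// Hypothetical IPv4-only CIDR type, used here instead of `IpNetwork`.
#[derive(Clone, Copy)]
struct Cidr4 {
    base: Ipv4Addr,
    prefix: u8,
}

impl Cidr4 {
    fn contains(&self, ip: Ipv4Addr) -> bool {
        let prefix = u32::from(self.prefix.min(32));
        // A /0 matches everything; avoid shifting a u32 by 32 bits.
        let mask = if prefix == 0 { 0 } else { u32::MAX << (32 - prefix) };
        (u32::from(ip) & mask) == (u32::from(self.base) & mask)
    }
}

/// Decide whether a peer may supply a PROXY protocol header.
fn proxy_header_allowed(trusted: &[Cidr4], peer: IpAddr) -> bool {
    if trusted.is_empty() {
        // Assumption: an empty allowlist means the check is not enforced.
        return true;
    }
    match peer {
        IpAddr::V4(v4) => trusted.iter().any(|cidr| cidr.contains(v4)),
        // Simplification: this sketch carries no IPv6 networks.
        IpAddr::V6(_) => false,
    }
}

fn main() {
    let lb = Cidr4 { base: Ipv4Addr::new(10, 0, 0, 0), prefix: 8 };
    let inside = IpAddr::V4(Ipv4Addr::new(10, 1, 2, 3));
    let outside = IpAddr::V4(Ipv4Addr::new(192, 0, 2, 1));
    assert!(proxy_header_allowed(&[lb], inside));
    assert!(!proxy_header_allowed(&[lb], outside));
    assert!(proxy_header_allowed(&[], outside));
}
```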
@@ -7,8 +7,9 @@ use std::net::IpAddr;
 use std::sync::Arc;
 use std::sync::atomic::{AtomicU64, Ordering};
 use std::time::{Duration, Instant};
+use std::sync::Mutex;

-use tokio::sync::RwLock;
+use tokio::sync::{Mutex as AsyncMutex, RwLock};

 use crate::config::UserMaxUniqueIpsMode;

@@ -21,6 +22,8 @@ pub struct UserIpTracker {
    limit_mode: Arc<RwLock<UserMaxUniqueIpsMode>>,
    limit_window: Arc<RwLock<Duration>>,
    last_compact_epoch_secs: Arc<AtomicU64>,
+    pub(crate) cleanup_queue: Arc<Mutex<Vec<(String, IpAddr)>>>,
+    cleanup_drain_lock: Arc<AsyncMutex<()>>,
 }

 impl UserIpTracker {
@@ -33,6 +36,67 @@ impl UserIpTracker {
            limit_mode: Arc::new(RwLock::new(UserMaxUniqueIpsMode::ActiveWindow)),
            limit_window: Arc::new(RwLock::new(Duration::from_secs(30))),
            last_compact_epoch_secs: Arc::new(AtomicU64::new(0)),
+            cleanup_queue: Arc::new(Mutex::new(Vec::new())),
+            cleanup_drain_lock: Arc::new(AsyncMutex::new(())),
+        }
+    }
+
+
+    pub fn enqueue_cleanup(&self, user: String, ip: IpAddr) {
+        match self.cleanup_queue.lock() {
+            Ok(mut queue) => queue.push((user, ip)),
+            Err(poisoned) => {
+                let mut queue = poisoned.into_inner();
+                queue.push((user.clone(), ip));
+                self.cleanup_queue.clear_poison();
+                tracing::warn!(
+                    "UserIpTracker cleanup_queue lock poisoned; recovered and enqueued IP cleanup for {} ({})",
+                    user,
+                    ip
+                );
+            }
+        }
+    }
+
+    pub(crate) async fn drain_cleanup_queue(&self) {
+        // Serialize queue draining and active-IP mutation so check-and-add cannot
+        // observe stale active entries that are already queued for removal.
+        let _drain_guard = self.cleanup_drain_lock.lock().await;
+        let to_remove = {
+            match self.cleanup_queue.lock() {
+                Ok(mut queue) => {
+                    if queue.is_empty() {
+                        return;
+                    }
+                    std::mem::take(&mut *queue)
+                }
+                Err(poisoned) => {
+                    let mut queue = poisoned.into_inner();
+                    if queue.is_empty() {
+                        self.cleanup_queue.clear_poison();
+                        return;
+                    }
+                    let drained = std::mem::take(&mut *queue);
+                    self.cleanup_queue.clear_poison();
+                    drained
+                }
+            }
+        };
+
+        let mut active_ips = self.active_ips.write().await;
+        for (user, ip) in to_remove {
+            if let Some(user_ips) = active_ips.get_mut(&user) {
+                if let Some(count) = user_ips.get_mut(&ip) {
+                    if *count > 1 {
+                        *count -= 1;
+                    } else {
+                        user_ips.remove(&ip);
+                    }
+                }
+                if user_ips.is_empty() {
+                    active_ips.remove(&user);
+                }
+            }
        }
    }
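
Review note: the enqueue/drain pair above can be exercised without tokio. The sketch below is a simplified, std-only illustration of the same idea: a synchronous queue that never panics on a poisoned lock, plus a batch decrement over a reference-count map. `CleanupQueue` and `apply_cleanup` are hypothetical names. Unlike the diff, this version does not call `clear_poison` and leaves out the async drain lock.

```rust
use std::collections::HashMap;
use std::net::{IpAddr, Ipv4Addr};
use std::sync::Mutex;

/// Hypothetical stand-in for the tracker's synchronous cleanup queue.
struct CleanupQueue {
    pending: Mutex<Vec<(String, IpAddr)>>,
}

impl CleanupQueue {
    fn new() -> Self {
        Self { pending: Mutex::new(Vec::new()) }
    }

    /// Push without panicking: a poisoned lock still yields its guard.
    fn enqueue(&self, user: String, ip: IpAddr) {
        let mut queue = self.pending.lock().unwrap_or_else(|p| p.into_inner());
        queue.push((user, ip));
    }

    /// Take the whole batch in one short critical section.
    fn take_all(&self) -> Vec<(String, IpAddr)> {
        let mut queue = self.pending.lock().unwrap_or_else(|p| p.into_inner());
        std::mem::take(&mut *queue)
    }
}

/// Apply a drained batch to a per-user, per-IP reference-count map.
fn apply_cleanup(
    active: &mut HashMap<String, HashMap<IpAddr, u64>>,
    batch: Vec<(String, IpAddr)>,
) {
    for (user, ip) in batch {
        if let Some(ips) = active.get_mut(&user) {
            if let Some(count) = ips.get_mut(&ip) {
                if *count > 1 {
                    *count -= 1;
                } else {
                    ips.remove(&ip);
                }
            }
            // Drop the user entry once its last IP is gone.
            if ips.is_empty() {
                active.remove(&user);
            }
        }
    }
}

fn main() {
    let ip = IpAddr::V4(Ipv4Addr::new(203, 0, 113, 7));
    let queue = CleanupQueue::new();
    let mut active: HashMap<String, HashMap<IpAddr, u64>> = HashMap::new();
    active.entry("alice".to_string()).or_default().insert(ip, 2);

    queue.enqueue("alice".to_string(), ip);
    apply_cleanup(&mut active, queue.take_all());
    assert_eq!(active["alice"][&ip], 1);

    queue.enqueue("alice".to_string(), ip);
    apply_cleanup(&mut active, queue.take_all());
    assert!(active.get("alice").is_none());
}
```

Taking the batch with `std::mem::take` keeps the synchronous lock held only for a pointer swap, which is presumably why the diff can call `enqueue_cleanup` from non-async drop paths.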

@@ -118,6 +182,7 @@ impl UserIpTracker {
    }

    pub async fn check_and_add(&self, username: &str, ip: IpAddr) -> Result<(), String> {
+        self.drain_cleanup_queue().await;
        self.maybe_compact_empty_users().await;
        let default_max_ips = *self.default_max_ips.read().await;
        let limit = {
@@ -194,6 +259,7 @@ impl UserIpTracker {
    }

    pub async fn get_recent_counts_for_users(&self, users: &[String]) -> HashMap<String, usize> {
+        self.drain_cleanup_queue().await;
        let window = *self.limit_window.read().await;
        let now = Instant::now();
        let recent_ips = self.recent_ips.read().await;
@@ -214,6 +280,7 @@ impl UserIpTracker {
    }

    pub async fn get_active_ips_for_users(&self, users: &[String]) -> HashMap<String, Vec<IpAddr>> {
+        self.drain_cleanup_queue().await;
        let active_ips = self.active_ips.read().await;
        let mut out = HashMap::with_capacity(users.len());
        for user in users {
@@ -228,6 +295,7 @@ impl UserIpTracker {
    }

    pub async fn get_recent_ips_for_users(&self, users: &[String]) -> HashMap<String, Vec<IpAddr>> {
+        self.drain_cleanup_queue().await;
        let window = *self.limit_window.read().await;
        let now = Instant::now();
        let recent_ips = self.recent_ips.read().await;
@@ -250,11 +318,13 @@ impl UserIpTracker {
    }

    pub async fn get_active_ip_count(&self, username: &str) -> usize {
+        self.drain_cleanup_queue().await;
        let active_ips = self.active_ips.read().await;
        active_ips.get(username).map(|ips| ips.len()).unwrap_or(0)
    }

    pub async fn get_active_ips(&self, username: &str) -> Vec<IpAddr> {
+        self.drain_cleanup_queue().await;
        let active_ips = self.active_ips.read().await;
        active_ips
            .get(username)
@@ -263,6 +333,7 @@ impl UserIpTracker {
    }

    pub async fn get_stats(&self) -> Vec<(String, usize, usize)> {
+        self.drain_cleanup_queue().await;
        let active_ips = self.active_ips.read().await;
        let max_ips = self.max_ips.read().await;
        let default_max_ips = *self.default_max_ips.read().await;
@@ -301,6 +372,7 @@ impl UserIpTracker {
    }

    pub async fn is_ip_active(&self, username: &str, ip: IpAddr) -> bool {
+        self.drain_cleanup_queue().await;
        let active_ips = self.active_ips.read().await;
        active_ips
            .get(username)
@@ -448,3 +448,172 @@ async fn concurrent_reconnect_and_disconnect_preserves_non_negative_counts() {

    assert!(tracker.get_active_ip_count("cc").await <= 8);
 }
+
+#[tokio::test]
+async fn enqueue_cleanup_recovers_from_poisoned_mutex() {
+    let tracker = UserIpTracker::new();
+    let ip = ip_from_idx(99);
+    
+    // Poison the lock by panicking while holding it
+    let result = std::panic::catch_unwind(|| {
+        let _guard = tracker.cleanup_queue.lock().unwrap();
+        panic!("Intentional poison panic");
+    });
+    assert!(result.is_err(), "Expected panic to poison mutex");
+    
+    // Attempt to enqueue anyway; should hit the poison catch arm and still insert
+    tracker.enqueue_cleanup("poison-user".to_string(), ip);
+    
+    tracker.drain_cleanup_queue().await;
+    
+    assert_eq!(tracker.get_active_ip_count("poison-user").await, 0);
+}
+
+#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
+async fn mass_reconnect_sync_cleanup_prevents_temporary_reservation_bloat() {
+    // Tests that the synchronous M-01 drop mechanism protects against starvation.
+    let tracker = Arc::new(UserIpTracker::new());
+    tracker.set_user_limit("mass", 5).await;
+    
+    let ip = ip_from_idx(42);
+    let mut join_handles = Vec::new();
+
+    // 10,000 rapid concurrent requests hitting the same IP limit
+    for _ in 0..10_000 {
+        let tracker_clone = tracker.clone();
+        join_handles.push(tokio::spawn(async move {
+            if tracker_clone.check_and_add("mass", ip).await.is_ok() {
+                // Instantly enqueue cleanup, simulating synchronous reservation drop
+                tracker_clone.enqueue_cleanup("mass".to_string(), ip);
+                // The next caller will drain it before acquiring again
+            }
+        }));
+    }
+
+    for handle in join_handles {
+        let _ = handle.await;
+    }
+
+    // Force flush
+    tracker.drain_cleanup_queue().await;
+    assert_eq!(tracker.get_active_ip_count("mass").await, 0, "No leaked footprints");
+}
+
+#[tokio::test]
+async fn adversarial_drain_cleanup_queue_race_does_not_cause_false_rejections() {
+    // Regression guard: concurrent cleanup draining must not produce false
+    // limit denials for a new IP when the previous IP is already queued.
+    let tracker = Arc::new(UserIpTracker::new());
+    tracker.set_user_limit("racer", 1).await;
+    let ip1 = ip_from_idx(1);
+    let ip2 = ip_from_idx(2);
+
+    // Initial state: add ip1
+    tracker.check_and_add("racer", ip1).await.unwrap();
+
+    // User disconnects from ip1, queuing it
+    tracker.enqueue_cleanup("racer".to_string(), ip1);
+
+    let mut saw_false_rejection = false;
+    for _ in 0..100 {
+        // Queue cleanup then race explicit drain and check-and-add on the alternative IP.
+        tracker.enqueue_cleanup("racer".to_string(), ip1);
+        let tracker_a = tracker.clone();
+        let tracker_b = tracker.clone();
+
+        let drain_handle = tokio::spawn(async move {
+            tracker_a.drain_cleanup_queue().await;
+        });
+        let handle = tokio::spawn(async move {
+            tracker_b.check_and_add("racer", ip2).await
+        });
+
+        drain_handle.await.unwrap();
+        let res = handle.await.unwrap();
+        if res.is_err() {
+            saw_false_rejection = true;
+            break;
+        }
+
+        // Restore baseline for next iteration.
+        tracker.remove_ip("racer", ip2).await;
+        tracker.check_and_add("racer", ip1).await.unwrap();
+    }
+
+    assert!(
+        !saw_false_rejection,
+        "Concurrent cleanup draining must not cause false-positive IP denials"
+    );
+}
+
+#[tokio::test]
+async fn poisoned_cleanup_queue_still_releases_slot_for_next_ip() {
+    let tracker = UserIpTracker::new();
+    tracker.set_user_limit("poison-slot", 1).await;
+    let ip1 = ip_from_idx(7001);
+    let ip2 = ip_from_idx(7002);
+
+    tracker.check_and_add("poison-slot", ip1).await.unwrap();
+
+    // Poison the queue lock as an adversarial condition.
+    let _ = std::panic::catch_unwind(|| {
+        let _guard = tracker.cleanup_queue.lock().unwrap();
+        panic!("intentional queue poison");
+    });
+
+    // Disconnect path must still queue cleanup so the next IP can be admitted.
+    tracker.enqueue_cleanup("poison-slot".to_string(), ip1);
+    let admitted = tracker.check_and_add("poison-slot", ip2).await;
+    assert!(
+        admitted.is_ok(),
+        "cleanup queue poison must not permanently block slot release for the next IP"
+    );
+}
+
+#[tokio::test]
+async fn duplicate_cleanup_entries_do_not_break_future_admission() {
+    let tracker = UserIpTracker::new();
+    tracker.set_user_limit("dup-cleanup", 1).await;
+    let ip1 = ip_from_idx(7101);
+    let ip2 = ip_from_idx(7102);
+
+    tracker.check_and_add("dup-cleanup", ip1).await.unwrap();
+    tracker.enqueue_cleanup("dup-cleanup".to_string(), ip1);
+    tracker.enqueue_cleanup("dup-cleanup".to_string(), ip1);
+    tracker.enqueue_cleanup("dup-cleanup".to_string(), ip1);
+
+    tracker.drain_cleanup_queue().await;
+
+    assert_eq!(tracker.get_active_ip_count("dup-cleanup").await, 0);
+    assert!(
+        tracker.check_and_add("dup-cleanup", ip2).await.is_ok(),
+        "extra queued cleanup entries must not leave user stuck in denied state"
+    );
+}
+
+#[tokio::test]
+async fn stress_repeated_queue_poison_recovery_preserves_admission_progress() {
+    let tracker = UserIpTracker::new();
+    tracker.set_user_limit("poison-stress", 1).await;
+    let ip_primary = ip_from_idx(7201);
+    let ip_alt = ip_from_idx(7202);
+
+    tracker.check_and_add("poison-stress", ip_primary).await.unwrap();
+
+    for _ in 0..64 {
+        let _ = std::panic::catch_unwind(|| {
+            let _guard = tracker.cleanup_queue.lock().unwrap();
+            panic!("intentional queue poison in stress loop");
+        });
+
+        tracker.enqueue_cleanup("poison-stress".to_string(), ip_primary);
+
+        assert!(
+            tracker.check_and_add("poison-stress", ip_alt).await.is_ok(),
+            "poison recovery must preserve admission progress under repeated queue poisoning"
+        );
+
+        tracker.remove_ip("poison-stress", ip_alt).await;
+        tracker.check_and_add("poison-stress", ip_primary).await.unwrap();
+    }
+}
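
Review note: several tests above poison `cleanup_queue` and then expect `enqueue_cleanup` to keep working. One common way to get that behavior with `std::sync::Mutex` is to recover the guard from the `PoisonError` instead of unwrapping. The sketch below assumes that approach; the actual recovery arm in `enqueue_cleanup` is not shown in this diff, and `CleanupQueue` with its methods is a hypothetical stand-in.

```rust
use std::collections::VecDeque;
use std::net::IpAddr;
use std::sync::Mutex;

/// Hypothetical stand-in for the tracker's cleanup queue.
pub struct CleanupQueue {
    inner: Mutex<VecDeque<(String, IpAddr)>>,
}

impl CleanupQueue {
    pub fn new() -> Self {
        Self {
            inner: Mutex::new(VecDeque::new()),
        }
    }

    /// Push an entry even if a previous holder panicked while holding the lock.
    pub fn enqueue(&self, user: String, ip: IpAddr) {
        // `into_inner` on the PoisonError hands back the guard. The queue
        // only holds plain data, so continuing after a poison is acceptable here.
        let mut guard = self
            .inner
            .lock()
            .unwrap_or_else(|poisoned| poisoned.into_inner());
        guard.push_back((user, ip));
    }

    /// Remove and return every queued entry, tolerating a poisoned lock.
    pub fn drain(&self) -> Vec<(String, IpAddr)> {
        let mut guard = self
            .inner
            .lock()
            .unwrap_or_else(|poisoned| poisoned.into_inner());
        guard.drain(..).collect()
    }
}
```

Recovering the guard is reasonable only when a panic cannot leave the protected data half-updated. A queue of `(String, IpAddr)` pairs fits that description, since each push either happened or did not.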
@@ -10,6 +10,16 @@ use crate::transport::middle_proxy::{
    ProxyConfigData, fetch_proxy_config_with_raw, load_proxy_config_cache, save_proxy_config_cache,
 };

+pub(crate) fn resolve_runtime_config_path(config_path_cli: &str, startup_cwd: &std::path::Path) -> PathBuf {
+    let raw = PathBuf::from(config_path_cli);
+    let absolute = if raw.is_absolute() {
+        raw
+    } else {
+        startup_cwd.join(raw)
+    };
+    absolute.canonicalize().unwrap_or(absolute)
+}
+
 pub(crate) fn parse_cli() -> (String, Option<PathBuf>, bool, Option<String>) {
    let mut config_path = "config.toml".to_string();
    let mut data_path: Option<PathBuf> = None;
@@ -96,6 +106,44 @@ pub(crate) fn parse_cli() -> (String, Option<PathBuf>, bool, Option<String>) {
    (config_path, data_path, silent, log_level)
 }

+#[cfg(test)]
+mod tests {
+    use super::resolve_runtime_config_path;
+
+    #[test]
+    fn resolve_runtime_config_path_anchors_relative_to_startup_cwd() {
+        let nonce = std::time::SystemTime::now()
+            .duration_since(std::time::UNIX_EPOCH)
+            .unwrap()
+            .as_nanos();
+        let startup_cwd = std::env::temp_dir().join(format!("telemt_cfg_path_{nonce}"));
+        std::fs::create_dir_all(&startup_cwd).unwrap();
+        let target = startup_cwd.join("config.toml");
+        std::fs::write(&target, " ").unwrap();
+
+        let resolved = resolve_runtime_config_path("config.toml", &startup_cwd);
+        assert_eq!(resolved, target.canonicalize().unwrap());
+
+        let _ = std::fs::remove_file(&target);
+        let _ = std::fs::remove_dir(&startup_cwd);
+    }
+
+    #[test]
+    fn resolve_runtime_config_path_keeps_absolute_for_missing_file() {
+        let nonce = std::time::SystemTime::now()
+            .duration_since(std::time::UNIX_EPOCH)
+            .unwrap()
+            .as_nanos();
+        let startup_cwd = std::env::temp_dir().join(format!("telemt_cfg_path_missing_{nonce}"));
+        std::fs::create_dir_all(&startup_cwd).unwrap();
+
+        let resolved = resolve_runtime_config_path("missing.toml", &startup_cwd);
+        assert_eq!(resolved, startup_cwd.join("missing.toml"));
+
+        let _ = std::fs::remove_dir(&startup_cwd);
+    }
+}
+
 pub(crate) fn print_proxy_links(host: &str, port: u16, config: &ProxyConfig) {
    info!(target: "telemt::links", "--- Proxy Links ({}) ---", host);
    for user_name in config.general.links.show.resolve_users(&config.access.users) {
@@ -238,11 +238,6 @@ pub(crate) async fn initialize_me_pool(
                    config.general.hardswap,
                    config.general.me_pool_drain_ttl_secs,
                    config.general.me_pool_drain_threshold,
-                    config.general.me_pool_drain_soft_evict_enabled,
-                    config.general.me_pool_drain_soft_evict_grace_secs,
-                    config.general.me_pool_drain_soft_evict_per_writer,
-                    config.general.me_pool_drain_soft_evict_budget_per_core,
-                    config.general.me_pool_drain_soft_evict_cooldown_ms,
                    config.general.effective_me_pool_force_close_secs(),
                    config.general.me_pool_min_fresh_ratio,
                    config.general.me_hardswap_warmup_delay_min_ms,
@@ -45,7 +45,7 @@ use crate::startup::{
 use crate::stream::BufferPool;
 use crate::transport::middle_proxy::MePool;
 use crate::transport::UpstreamManager;
-use helpers::parse_cli;
+use helpers::{parse_cli, resolve_runtime_config_path};

 /// Runs the full telemt runtime startup pipeline and blocks until shutdown.
 pub async fn run() -> std::result::Result<(), Box<dyn std::error::Error>> {
@@ -58,18 +58,26 @@ pub async fn run() -> std::result::Result<(), Box<dyn std::error::Error>> {
    startup_tracker
        .start_component(COMPONENT_CONFIG_LOAD, Some("load and validate config".to_string()))
        .await;
-    let (config_path, data_path, cli_silent, cli_log_level) = parse_cli();
+    let (config_path_cli, data_path, cli_silent, cli_log_level) = parse_cli();
+    let startup_cwd = match std::env::current_dir() {
+        Ok(cwd) => cwd,
+        Err(e) => {
+            eprintln!("[telemt] Can't read current_dir: {}", e);
+            std::process::exit(1);
+        }
+    };
+    let config_path = resolve_runtime_config_path(&config_path_cli, &startup_cwd);

    let mut config = match ProxyConfig::load(&config_path) {
        Ok(c) => c,
        Err(e) => {
-            if std::path::Path::new(&config_path).exists() {
+            if config_path.exists() {
                eprintln!("[telemt] Error: {}", e);
                std::process::exit(1);
            } else {
                let default = ProxyConfig::default();
                std::fs::write(&config_path, toml::to_string_pretty(&default).unwrap()).unwrap();
-                eprintln!("[telemt] Created default config at {}", config_path);
+                eprintln!("[telemt] Created default config at {}", config_path.display());
                default
            }
        }
@@ -258,7 +266,7 @@ pub async fn run() -> std::result::Result<(), Box<dyn std::error::Error>> {
            let route_runtime_api = route_runtime.clone();
            let config_rx_api = api_config_rx.clone();
            let admission_rx_api = admission_rx.clone();
-            let config_path_api = std::path::PathBuf::from(&config_path);
+            let config_path_api = config_path.clone();
            let startup_tracker_api = startup_tracker.clone();
            let detected_ips_rx_api = detected_ips_rx.clone();
            tokio::spawn(async move {
@@ -476,7 +484,7 @@ pub async fn run() -> std::result::Result<(), Box<dyn std::error::Error>> {
        Duration::from_secs(config.access.replay_window_secs),
    ));

-    let buffer_pool = Arc::new(BufferPool::with_config(64 * 1024, 4096));
+    let buffer_pool = Arc::new(BufferPool::with_config(16 * 1024, 4096));

    connectivity::run_startup_connectivity(
        &config,
@@ -1,5 +1,5 @@
 use std::net::IpAddr;
-use std::path::PathBuf;
+use std::path::Path;
 use std::sync::Arc;

 use tokio::sync::{mpsc, watch};
@@ -32,7 +32,7 @@ pub(crate) struct RuntimeWatches {
 #[allow(clippy::too_many_arguments)]
 pub(crate) async fn spawn_runtime_tasks(
    config: &Arc<ProxyConfig>,
-    config_path: &str,
+    config_path: &Path,
    probe: &NetworkProbe,
    prefer_ipv6: bool,
    decision_ipv4_dc: bool,
@@ -83,7 +83,7 @@ pub(crate) async fn spawn_runtime_tasks(
        watch::Receiver<Arc<ProxyConfig>>,
        watch::Receiver<LogLevel>,
    ) = spawn_config_watcher(
-        PathBuf::from(config_path),
+        config_path.to_path_buf(),
        config.clone(),
        detected_ip_v4,
        detected_ip_v6,
@@ -292,109 +292,6 @@ async fn render_metrics(stats: &Stats, config: &ProxyConfig, ip_tracker: &UserIp
        "telemt_connections_bad_total {}",
        if core_enabled { stats.get_connects_bad() } else { 0 }
    );
-    let _ = writeln!(out, "# HELP telemt_connections_current Current active connections");
-    let _ = writeln!(out, "# TYPE telemt_connections_current gauge");
-    let _ = writeln!(
-        out,
-        "telemt_connections_current {}",
-        if core_enabled {
-            stats.get_current_connections_total()
-        } else {
-            0
-        }
-    );
-    let _ = writeln!(out, "# HELP telemt_connections_direct_current Current active direct connections");
-    let _ = writeln!(out, "# TYPE telemt_connections_direct_current gauge");
-    let _ = writeln!(
-        out,
-        "telemt_connections_direct_current {}",
-        if core_enabled {
-            stats.get_current_connections_direct()
-        } else {
-            0
-        }
-    );
-    let _ = writeln!(out, "# HELP telemt_connections_me_current Current active middle-end connections");
-    let _ = writeln!(out, "# TYPE telemt_connections_me_current gauge");
-    let _ = writeln!(
-        out,
-        "telemt_connections_me_current {}",
-        if core_enabled {
-            stats.get_current_connections_me()
-        } else {
-            0
-        }
-    );
-    let _ = writeln!(
-        out,
-        "# HELP telemt_relay_adaptive_promotions_total Adaptive relay tier promotions"
-    );
-    let _ = writeln!(out, "# TYPE telemt_relay_adaptive_promotions_total counter");
-    let _ = writeln!(
-        out,
-        "telemt_relay_adaptive_promotions_total {}",
-        if core_enabled {
-            stats.get_relay_adaptive_promotions_total()
-        } else {
-            0
-        }
-    );
-    let _ = writeln!(
-        out,
-        "# HELP telemt_relay_adaptive_demotions_total Adaptive relay tier demotions"
-    );
-    let _ = writeln!(out, "# TYPE telemt_relay_adaptive_demotions_total counter");
-    let _ = writeln!(
-        out,
-        "telemt_relay_adaptive_demotions_total {}",
-        if core_enabled {
-            stats.get_relay_adaptive_demotions_total()
-        } else {
-            0
-        }
-    );
-    let _ = writeln!(
-        out,
-        "# HELP telemt_relay_adaptive_hard_promotions_total Adaptive relay hard promotions triggered by write pressure"
-    );
-    let _ = writeln!(
-        out,
-        "# TYPE telemt_relay_adaptive_hard_promotions_total counter"
-    );
-    let _ = writeln!(
-        out,
-        "telemt_relay_adaptive_hard_promotions_total {}",
-        if core_enabled {
-            stats.get_relay_adaptive_hard_promotions_total()
-        } else {
-            0
-        }
-    );
-    let _ = writeln!(out, "# HELP telemt_reconnect_evict_total Reconnect-driven session evictions");
-    let _ = writeln!(out, "# TYPE telemt_reconnect_evict_total counter");
-    let _ = writeln!(
-        out,
-        "telemt_reconnect_evict_total {}",
-        if core_enabled {
-            stats.get_reconnect_evict_total()
-        } else {
-            0
-        }
-    );
-    let _ = writeln!(
-        out,
-        "# HELP telemt_reconnect_stale_close_total Sessions closed because they became stale after reconnect"
-    );
-    let _ = writeln!(out, "# TYPE telemt_reconnect_stale_close_total counter");
-    let _ = writeln!(
-        out,
-        "telemt_reconnect_stale_close_total {}",
-        if core_enabled {
-            stats.get_reconnect_stale_close_total()
-        } else {
-            0
-        }
-    );

    let _ = writeln!(out, "# HELP telemt_handshake_timeouts_total Handshake timeouts");
    let _ = writeln!(out, "# TYPE telemt_handshake_timeouts_total counter");
@@ -1650,36 +1547,6 @@ async fn render_metrics(stats: &Stats, config: &ProxyConfig, ip_tracker: &UserIp
        }
    );

-    let _ = writeln!(
-        out,
-        "# HELP telemt_pool_drain_soft_evict_total Soft-evicted client sessions on stuck draining writers"
-    );
-    let _ = writeln!(out, "# TYPE telemt_pool_drain_soft_evict_total counter");
-    let _ = writeln!(
-        out,
-        "telemt_pool_drain_soft_evict_total {}",
-        if me_allows_normal {
-            stats.get_pool_drain_soft_evict_total()
-        } else {
-            0
-        }
-    );
-
-    let _ = writeln!(
-        out,
-        "# HELP telemt_pool_drain_soft_evict_writer_total Draining writers with at least one soft eviction"
-    );
-    let _ = writeln!(out, "# TYPE telemt_pool_drain_soft_evict_writer_total counter");
-    let _ = writeln!(
-        out,
-        "telemt_pool_drain_soft_evict_writer_total {}",
-        if me_allows_normal {
-            stats.get_pool_drain_soft_evict_writer_total()
-        } else {
-            0
-        }
-    );
-
    let _ = writeln!(out, "# HELP telemt_pool_stale_pick_total Stale writer fallback picks for new binds");
    let _ = writeln!(out, "# TYPE telemt_pool_stale_pick_total counter");
    let _ = writeln!(
@@ -1997,8 +1864,6 @@ mod tests {
        stats.increment_connects_all();
        stats.increment_connects_all();
        stats.increment_connects_bad();
-        stats.increment_current_connections_direct();
-        stats.increment_current_connections_me();
        stats.increment_handshake_timeouts();
        stats.increment_upstream_connect_attempt_total();
        stats.increment_upstream_connect_attempt_total();
@@ -2030,9 +1895,6 @@ mod tests {

        assert!(output.contains("telemt_connections_total 2"));
        assert!(output.contains("telemt_connections_bad_total 1"));
-        assert!(output.contains("telemt_connections_current 2"));
-        assert!(output.contains("telemt_connections_direct_current 1"));
-        assert!(output.contains("telemt_connections_me_current 1"));
        assert!(output.contains("telemt_handshake_timeouts_total 1"));
        assert!(output.contains("telemt_upstream_connect_attempt_total 2"));
        assert!(output.contains("telemt_upstream_connect_success_total 1"));
@@ -2075,9 +1937,6 @@ mod tests {
        let output = render_metrics(&stats, &config, &tracker).await;
        assert!(output.contains("telemt_connections_total 0"));
        assert!(output.contains("telemt_connections_bad_total 0"));
-        assert!(output.contains("telemt_connections_current 0"));
-        assert!(output.contains("telemt_connections_direct_current 0"));
-        assert!(output.contains("telemt_connections_me_current 0"));
        assert!(output.contains("telemt_handshake_timeouts_total 0"));
        assert!(output.contains("telemt_user_unique_ips_current{user="));
        assert!(output.contains("telemt_user_unique_ips_recent_window{user="));
@@ -2111,21 +1970,11 @@ mod tests {
        assert!(output.contains("# TYPE telemt_uptime_seconds gauge"));
        assert!(output.contains("# TYPE telemt_connections_total counter"));
        assert!(output.contains("# TYPE telemt_connections_bad_total counter"));
-        assert!(output.contains("# TYPE telemt_connections_current gauge"));
-        assert!(output.contains("# TYPE telemt_connections_direct_current gauge"));
-        assert!(output.contains("# TYPE telemt_connections_me_current gauge"));
-        assert!(output.contains("# TYPE telemt_relay_adaptive_promotions_total counter"));
-        assert!(output.contains("# TYPE telemt_relay_adaptive_demotions_total counter"));
-        assert!(output.contains("# TYPE telemt_relay_adaptive_hard_promotions_total counter"));
-        assert!(output.contains("# TYPE telemt_reconnect_evict_total counter"));
-        assert!(output.contains("# TYPE telemt_reconnect_stale_close_total counter"));
        assert!(output.contains("# TYPE telemt_handshake_timeouts_total counter"));
        assert!(output.contains("# TYPE telemt_upstream_connect_attempt_total counter"));
        assert!(output.contains("# TYPE telemt_me_rpc_proxy_req_signal_sent_total counter"));
        assert!(output.contains("# TYPE telemt_me_idle_close_by_peer_total counter"));
        assert!(output.contains("# TYPE telemt_me_writer_removed_total counter"));
-        assert!(output.contains("# TYPE telemt_pool_drain_soft_evict_total counter"));
-        assert!(output.contains("# TYPE telemt_pool_drain_soft_evict_writer_total counter"));
        assert!(output.contains(
            "# TYPE telemt_me_writer_removed_unexpected_minus_restored_total gauge"
        ));
@@ -11,8 +11,8 @@ use crate::crypto::{sha256_hmac, SecureRandom};
 use crate::error::ProxyError;
 use super::constants::*;
 use std::time::{SystemTime, UNIX_EPOCH};
-use num_bigint::BigUint;
-use num_traits::One;
+use subtle::ConstantTimeEq;
+use x25519_dalek::{X25519_BASEPOINT_BYTES, x25519};

 // ============= Public Constants =============

@@ -26,8 +26,17 @@ pub const TLS_DIGEST_POS: usize = 11;
 pub const TLS_DIGEST_HALF_LEN: usize = 16;

 /// Time skew limits for anti-replay (in seconds)
-pub const TIME_SKEW_MIN: i64 = -20 * 60; // 20 minutes before
-pub const TIME_SKEW_MAX: i64 = 10 * 60;  // 10 minutes after
+///
+/// The default window is intentionally narrow to reduce replay acceptance.
+/// Operators with known clock-drifted clients should tune deployment config
+/// (for example replay-window policy) to match their environment.
+pub const TIME_SKEW_MIN: i64 = -2 * 60; // 2 minutes before
+pub const TIME_SKEW_MAX: i64 = 2 * 60;  // 2 minutes after
+/// Upper bound (seconds) below which a timestamp is treated as boot-time (uptime) and exempted from skew checks.
+pub const BOOT_TIME_MAX_SECS: u32 = 7 * 24 * 60 * 60;
+/// Hard cap for boot-time compatibility bypass to avoid oversized acceptance
+/// windows when replay TTL is configured very large.
+pub const BOOT_TIME_COMPAT_MAX_SECS: u32 = 2 * 60;

 // ============= Private Constants =============

@@ -60,6 +69,7 @@ pub struct TlsValidation {
    /// Client digest for response generation
    pub digest: [u8; TLS_DIGEST_LEN],
    /// Timestamp extracted from digest
+    
    pub timestamp: u32,
 }

@@ -114,28 +124,8 @@ impl TlsExtensionBuilder {
        self
    }

-    /// Add ALPN extension with a single selected protocol.
-    fn add_alpn(&mut self, proto: &[u8]) -> &mut Self {
-        // Extension type: ALPN (0x0010)
-        self.extensions.extend_from_slice(&extension_type::ALPN.to_be_bytes());
-
-        // ALPN extension format:
-        // extension_data length (2 bytes)
-        //   protocols length (2 bytes)
-        //     protocol name length (1 byte)
-        //     protocol name bytes
-        let proto_len = proto.len() as u8;
-        let list_len: u16 = 1 + proto_len as u16;
-        let ext_len: u16 = 2 + list_len;
-
-        self.extensions.extend_from_slice(&ext_len.to_be_bytes());
-        self.extensions.extend_from_slice(&list_len.to_be_bytes());
-        self.extensions.push(proto_len);
-        self.extensions.extend_from_slice(proto);
-        self
-    }
-    
    /// Build final extensions with length prefix
+    
    fn build(self) -> Vec<u8> {
        let mut result = Vec::with_capacity(2 + self.extensions.len());
        
@@ -150,7 +140,7 @@ impl TlsExtensionBuilder {
    }
    
    /// Get current extensions without length prefix (for calculation)
-    #[allow(dead_code)]
+    
    fn as_bytes(&self) -> &[u8] {
        &self.extensions
    }
@@ -170,8 +160,6 @@ struct ServerHelloBuilder {
    compression: u8,
    /// Extensions
    extensions: TlsExtensionBuilder,
-    /// Selected ALPN protocol (if any)
-    alpn: Option<Vec<u8>>,
 }

 impl ServerHelloBuilder {
@@ -182,7 +170,6 @@ impl ServerHelloBuilder {
            cipher_suite: cipher_suite::TLS_AES_128_GCM_SHA256,
            compression: 0x00,
            extensions: TlsExtensionBuilder::new(),
-            alpn: None,
        }
    }
    
@@ -197,18 +184,9 @@ impl ServerHelloBuilder {
        self
    }

-    fn with_alpn(mut self, proto: Option<Vec<u8>>) -> Self {
-        self.alpn = proto;
-        self
-    }
-    
    /// Build ServerHello message (without record header)
    fn build_message(&self) -> Vec<u8> {
-        let mut ext_builder = self.extensions.clone();
-        if let Some(ref alpn) = self.alpn {
-            ext_builder.add_alpn(alpn);
-        }
-        let extensions = ext_builder.extensions.clone();
+        let extensions = self.extensions.extensions.clone();
        let extensions_len = extensions.len() as u16;
        
        // Calculate total length
@@ -273,13 +251,97 @@ impl ServerHelloBuilder {

 // ============= Public Functions =============

-/// Validate TLS ClientHello against user secrets
+/// Validate TLS ClientHello against user secrets.
 ///
 /// Returns validation result if a matching user is found.
+/// The result **must** be used — ignoring it silently bypasses authentication.
+#[must_use]
+
 pub fn validate_tls_handshake(
    handshake: &[u8],
    secrets: &[(String, Vec<u8>)],
    ignore_time_skew: bool,
+) -> Option<TlsValidation> {
+    validate_tls_handshake_with_replay_window(
+        handshake,
+        secrets,
+        ignore_time_skew,
+        u64::from(BOOT_TIME_MAX_SECS),
+    )
+}
+
+/// Validate TLS ClientHello and cap the boot-time bypass by replay-cache TTL.
+///
+/// A boot-time timestamp is only accepted when it falls below all three
+/// bounds: `BOOT_TIME_MAX_SECS`, configured replay window, and
+/// `BOOT_TIME_COMPAT_MAX_SECS`, preventing oversized compatibility windows.
+#[must_use]
+pub fn validate_tls_handshake_with_replay_window(
+    handshake: &[u8],
+    secrets: &[(String, Vec<u8>)],
+    ignore_time_skew: bool,
+    replay_window_secs: u64,
+) -> Option<TlsValidation> {
+    // Only pay the clock syscall when we will actually compare against it.
+    // If `ignore_time_skew` is set, a broken or unavailable system clock
+    // must not block legitimate clients — that would be a DoS via clock failure.
+    let now = if !ignore_time_skew {
+        system_time_to_unix_secs(SystemTime::now())?
+    } else {
+        0_i64
+    };
+
+    let replay_window_u32 = u32::try_from(replay_window_secs).unwrap_or(u32::MAX);
+    // Boot-time bypass and ignore_time_skew serve different compatibility paths.
+    // When skew checks are disabled, force boot-time cap to zero to prevent
+    // accidental future coupling of boot-time logic into the ignore-skew path.
+    let boot_time_cap_secs = if ignore_time_skew {
+        0
+    } else {
+        BOOT_TIME_MAX_SECS
+            .min(replay_window_u32)
+            .min(BOOT_TIME_COMPAT_MAX_SECS)
+    };
+
+    validate_tls_handshake_at_time_with_boot_cap(
+        handshake,
+        secrets,
+        ignore_time_skew,
+        now,
+        boot_time_cap_secs,
+    )
+}
+
+fn system_time_to_unix_secs(now: SystemTime) -> Option<i64> {
+    // `try_from` rejects values that overflow i64 (> ~292 billion years CE),
+    // whereas `as i64` would silently wrap to a negative timestamp and corrupt
+    // every subsequent time-skew comparison.
+    let d = now.duration_since(UNIX_EPOCH).ok()?;
+    i64::try_from(d.as_secs()).ok()
+}
+
+
+fn validate_tls_handshake_at_time(
+    handshake: &[u8],
+    secrets: &[(String, Vec<u8>)],
+    ignore_time_skew: bool,
+    now: i64,
+) -> Option<TlsValidation> {
+    validate_tls_handshake_at_time_with_boot_cap(
+        handshake,
+        secrets,
+        ignore_time_skew,
+        now,
+        BOOT_TIME_MAX_SECS,
+    )
+}
+
+fn validate_tls_handshake_at_time_with_boot_cap(
+    handshake: &[u8],
+    secrets: &[(String, Vec<u8>)],
+    ignore_time_skew: bool,
+    now: i64,
+    boot_time_cap_secs: u32,
 ) -> Option<TlsValidation> {
    if handshake.len() < TLS_DIGEST_POS + TLS_DIGEST_LEN + 1 {
        return None;
@@ -293,6 +355,9 @@ pub fn validate_tls_handshake(
    // Extract session ID
    let session_id_len_pos = TLS_DIGEST_POS + TLS_DIGEST_LEN;
    let session_id_len = handshake.get(session_id_len_pos).copied()? as usize;
+    if session_id_len > 32 {
+        return None;
+    }
    let session_id_start = session_id_len_pos + 1;
    
    if handshake.len() < session_id_start + session_id_len {
@@ -305,73 +370,66 @@ pub fn validate_tls_handshake(
    let mut msg = handshake.to_vec();
    msg[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN].fill(0);
    
-    // Get current time
-    let now = SystemTime::now()
-        .duration_since(UNIX_EPOCH)
-        .unwrap()
-        .as_secs() as i64;
-    
+    let mut first_match: Option<(&String, u32)> = None;
+
    for (user, secret) in secrets {
        let computed = sha256_hmac(secret, &msg);
-        
-        // XOR digests
-        let xored: Vec<u8> = digest.iter()
-            .zip(computed.iter())
-            .map(|(a, b)| a ^ b)
-            .collect();
-        
-        // Check that first 28 bytes are zeros (timestamp in last 4)
-        if !xored[..28].iter().all(|&b| b == 0) {
+
+        // Constant-time equality check on the 28-byte HMAC window.
+        // A variable-time short-circuit here lets an active censor measure how many
+        // bytes matched, enabling secret brute-force via timing side-channels.
+        // Direct comparison on the original arrays avoids a heap allocation and
+        // removes the `try_into().unwrap()` that the intermediate Vec would require.
+        if !bool::from(digest[..28].ct_eq(&computed[..28])) {
            continue;
        }
-        
-        // Extract timestamp
-        let timestamp = u32::from_le_bytes(xored[28..32].try_into().unwrap());
-        let time_diff = now - timestamp as i64;
-        
-        // Check time skew
+
+        // The last 4 bytes encode the timestamp as XOR(digest[28..32], computed[28..32]).
+        // Inline array construction is infallible: both operands are [u8; 32] arrays, so indices 28..=31 are always in bounds.
+        let timestamp = u32::from_le_bytes([
+            digest[28] ^ computed[28],
+            digest[29] ^ computed[29],
+            digest[30] ^ computed[30],
+            digest[31] ^ computed[31],
+        ]);
+
+        // time_diff is only meaningful (and `now` is only valid) when we are
+        // actually checking the window.  Keep both inside the guard to make
+        // the dead-code path explicit and prevent accidental future use of
+        // a sentinel `now` value outside its intended scope.
        if !ignore_time_skew {
            // Allow very small timestamps (boot time instead of unix time)
            // This is a quirk in some clients that use uptime instead of real time
-            let is_boot_time = timestamp < 60 * 60 * 24 * 1000; // < ~2.7 years in seconds
-            
-            if !is_boot_time && !(TIME_SKEW_MIN..=TIME_SKEW_MAX).contains(&time_diff) {
-                continue;
+            let is_boot_time = boot_time_cap_secs > 0 && timestamp < boot_time_cap_secs;
+            if !is_boot_time {
+                let time_diff = now - i64::from(timestamp);
+                if !(TIME_SKEW_MIN..=TIME_SKEW_MAX).contains(&time_diff) {
+                    continue;
+                }
            }
        }
        
-        return Some(TlsValidation {
-            user: user.clone(),
-            session_id,
-            digest,
-            timestamp,
-        });
+        if first_match.is_none() {
+            first_match = Some((user, timestamp));
+        }
    }
-    
-    None
-}

-fn curve25519_prime() -> BigUint {
-    (BigUint::one() << 255) - BigUint::from(19u32)
+    first_match.map(|(user, timestamp)| TlsValidation {
+        user: user.clone(),
+        session_id,
+        digest,
+        timestamp,
+    })
 }

 /// Generate a fake X25519 public key for TLS
 ///
-/// Produces a quadratic residue mod p = 2^255 - 19 by computing n² mod p,
-/// which matches Python/C behavior and avoids DPI fingerprinting.
+/// Uses RFC 7748 X25519 scalar multiplication over the canonical basepoint,
+/// yielding distribution-consistent public keys for anti-fingerprinting.
 pub fn gen_fake_x25519_key(rng: &SecureRandom) -> [u8; 32] {
-    let mut n_bytes = [0u8; 32];
-    n_bytes.copy_from_slice(&rng.bytes(32));
-
-    let n = BigUint::from_bytes_le(&n_bytes);
-    let p = curve25519_prime();
-    let pk = (&n * &n) % &p;
-
-    let mut out = pk.to_bytes_le();
-    out.resize(32, 0);
-    let mut result = [0u8; 32];
-    result.copy_from_slice(&out[..32]);
-    result
+    let mut scalar = [0u8; 32];
+    scalar.copy_from_slice(&rng.bytes(32));
+    x25519(scalar, X25519_BASEPOINT_BYTES)
 }

 /// Build TLS ServerHello response
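Review note: the digest check in the hunk above compares the first 28 bytes in constant time and then recovers the timestamp by XOR-ing the last 4 bytes. The following is a minimal std-only sketch of that flow. The names `ct_eq_sketch`, `check_digest`, and `embed_timestamp` are hypothetical, and the hand-rolled comparison is only a stand-in for `subtle::ConstantTimeEq`, which the patch itself uses.

```rust
/// Compare two byte slices without returning early on the first mismatch.
/// Illustration only: the compiler is free to optimise this loop, which is
/// one reason to prefer a vetted crate such as `subtle` in real code.
fn ct_eq_sketch(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut acc = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        acc |= x ^ y;
    }
    acc == 0
}

/// Return the embedded timestamp when the 28-byte prefix matches.
fn check_digest(digest: &[u8; 32], computed: &[u8; 32]) -> Option<u32> {
    if !ct_eq_sketch(&digest[..28], &computed[..28]) {
        return None;
    }
    // The last 4 bytes carry XOR(mac[28..32], timestamp_le).
    Some(u32::from_le_bytes([
        digest[28] ^ computed[28],
        digest[29] ^ computed[29],
        digest[30] ^ computed[30],
        digest[31] ^ computed[31],
    ]))
}

/// Client-side counterpart: fold a timestamp into the last 4 MAC bytes.
fn embed_timestamp(mut mac: [u8; 32], ts: u32) -> [u8; 32] {
    for (b, t) in mac[28..].iter_mut().zip(ts.to_le_bytes()) {
        *b ^= t;
    }
    mac
}
```

Calling `check_digest(&embed_timestamp(mac, ts), &mac)` round-trips the timestamp, while flipping any bit in the first 28 bytes yields `None`.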
@@ -400,7 +458,6 @@ pub fn build_server_hello(
    let server_hello = ServerHelloBuilder::new(session_id.to_vec())
        .with_x25519_key(&x25519_key)
        .with_tls13_version()
-        .with_alpn(alpn)
        .build_record();
    
    // Build Change Cipher Spec record
@@ -411,8 +468,27 @@ pub fn build_server_hello(
        0x01,       // CCS byte
    ];
    
-    // Build fake certificate (Application Data record)
-    let fake_cert = rng.bytes(fake_cert_len);
+    // Build first encrypted flight mimic as opaque ApplicationData bytes.
+    // Embed a compact EncryptedExtensions-like ALPN block when selected.
+    let mut fake_cert = Vec::with_capacity(fake_cert_len);
+    if let Some(proto) = alpn.as_ref().filter(|p| !p.is_empty() && p.len() <= u8::MAX as usize) {
+        let proto_list_len = 1usize + proto.len();
+        let ext_data_len = 2usize + proto_list_len;
+        let marker_len = 4usize + ext_data_len;
+        if marker_len <= fake_cert_len {
+            fake_cert.extend_from_slice(&0x0010u16.to_be_bytes());
+            fake_cert.extend_from_slice(&(ext_data_len as u16).to_be_bytes());
+            fake_cert.extend_from_slice(&(proto_list_len as u16).to_be_bytes());
+            fake_cert.push(proto.len() as u8);
+            fake_cert.extend_from_slice(proto);
+        }
+    }
+    if fake_cert.len() < fake_cert_len {
+        fake_cert.extend_from_slice(&rng.bytes(fake_cert_len - fake_cert.len()));
+    } else if fake_cert.len() > fake_cert_len {
+        fake_cert.truncate(fake_cert_len);
+    }
+
    let mut app_data_record = Vec::with_capacity(5 + fake_cert_len);
    app_data_record.push(TLS_RECORD_APPLICATION);
    app_data_record.extend_from_slice(&TLS_VERSION);
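Review note: the ALPN marker written at the start of `fake_cert` follows the usual extension layout (type, extension length, protocol-list length, then one length-prefixed name). Below is a small sketch of just that layout, with a hypothetical helper name `alpn_marker`; it mirrors the arithmetic in the hunk above but is not the patch's code.

```rust
/// Build the ALPN-like block for a single protocol name.
/// Returns None for names that cannot be length-prefixed with one byte.
fn alpn_marker(proto: &[u8]) -> Option<Vec<u8>> {
    if proto.is_empty() || proto.len() > u8::MAX as usize {
        return None;
    }
    let proto_list_len = 1 + proto.len(); // one length byte + name
    let ext_data_len = 2 + proto_list_len; // list length field + list
    let mut out = Vec::with_capacity(4 + ext_data_len);
    out.extend_from_slice(&0x0010u16.to_be_bytes()); // extension type: ALPN
    out.extend_from_slice(&(ext_data_len as u16).to_be_bytes());
    out.extend_from_slice(&(proto_list_len as u16).to_be_bytes());
    out.push(proto.len() as u8);
    out.extend_from_slice(proto);
    Some(out)
}
```

For `b"h2"` this gives 9 bytes: `00 10 | 00 05 | 00 03 | 02 | 68 32`, which is the `marker_len` the patch compares against `fake_cert_len`.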
@@ -424,8 +500,9 @@ pub fn build_server_hello(
    // Build optional NewSessionTicket records (TLS 1.3 handshake messages are encrypted;
    // here we mimic with opaque ApplicationData records of plausible size).
    let mut tickets = Vec::new();
-    if new_session_tickets > 0 {
-        for _ in 0..new_session_tickets {
+    let ticket_count = new_session_tickets.min(4);
+    if ticket_count > 0 {
+        for _ in 0..ticket_count {
            let ticket_len: usize = rng.range(48) + 48; // 48-95 bytes
            let mut record = Vec::with_capacity(5 + ticket_len);
            record.push(TLS_RECORD_APPLICATION);
@@ -467,6 +544,11 @@ pub fn extract_sni_from_client_hello(handshake: &[u8]) -> Option<String> {
        return None;
    }

+    let record_len = u16::from_be_bytes([handshake[3], handshake[4]]) as usize;
+    if handshake.len() < 5 + record_len {
+        return None;
+    }
+
    let mut pos = 5; // after record header
    if handshake.get(pos).copied()? != 0x01 {
        return None; // not ClientHello
@@ -528,7 +610,9 @@ pub fn extract_sni_from_client_hello(handshake: &[u8]) -> Option<String> {
                if name_type == 0 && name_len > 0
                    && let Ok(host) = std::str::from_utf8(&handshake[sn_pos..sn_pos + name_len])
                {
-                    return Some(host.to_string());
+                    if is_valid_sni_hostname(host) {
+                        return Some(host.to_string());
+                    }
                }
                sn_pos += name_len;
            }
@@ -539,8 +623,46 @@ pub fn extract_sni_from_client_hello(handshake: &[u8]) -> Option<String> {
    None
 }

+fn is_valid_sni_hostname(host: &str) -> bool {
+    if host.is_empty() || host.len() > 253 {
+        return false;
+    }
+    if host.starts_with('.') || host.ends_with('.') {
+        return false;
+    }
+    if host.parse::<std::net::IpAddr>().is_ok() {
+        return false;
+    }
+
+    for label in host.split('.') {
+        if label.is_empty() || label.len() > 63 {
+            return false;
+        }
+        if label.starts_with('-') || label.ends_with('-') {
+            return false;
+        }
+        if !label
+            .bytes()
+            .all(|b| b.is_ascii_alphanumeric() || b == b'-')
+        {
+            return false;
+        }
+    }
+
+    true
+}
+
 /// Extract ALPN protocol list from ClientHello, return in offered order.
 pub fn extract_alpn_from_client_hello(handshake: &[u8]) -> Vec<Vec<u8>> {
+    if handshake.len() < 5 || handshake[0] != TLS_RECORD_HANDSHAKE {
+        return Vec::new();
+    }
+
+    let record_len = u16::from_be_bytes([handshake[3], handshake[4]]) as usize;
+    if handshake.len() < 5 + record_len {
+        return Vec::new();
+    }
+
    let mut pos = 5; // after record header
    if handshake.get(pos) != Some(&0x01) {
        return Vec::new();
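Review note: to make the effect of `is_valid_sni_hostname` concrete, here is a stand-alone restatement of the same rules with a table of inputs. The function name `is_valid_sni_hostname_sketch` and the `CASES` table are illustrative and not part of the patch.

```rust
use std::net::IpAddr;

/// Stand-alone restatement of the hostname rules from the patch.
fn is_valid_sni_hostname_sketch(host: &str) -> bool {
    if host.is_empty() || host.len() > 253 {
        return false;
    }
    if host.starts_with('.') || host.ends_with('.') {
        return false;
    }
    // Dotted-quad IPv4 literals would pass the label checks below,
    // so IP literals are rejected explicitly.
    if host.parse::<IpAddr>().is_ok() {
        return false;
    }
    host.split('.').all(|label| {
        !label.is_empty()
            && label.len() <= 63
            && !label.starts_with('-')
            && !label.ends_with('-')
            && label.bytes().all(|b| b.is_ascii_alphanumeric() || b == b'-')
    })
}

/// (input, expected verdict) pairs.
const CASES: &[(&str, bool)] = &[
    ("example.com", true),
    ("xn--e1afmkfd.xn--p1ai", true), // punycode labels are plain LDH
    ("10.0.0.1", false),             // IP literal
    ("example.com.", false),         // trailing dot
    ("a..b", false),                 // empty label
    ("-bad.example", false),         // leading hyphen
    ("invalid_host!%^", false),      // characters outside LDH
];
```

Note that underscores are rejected, so names such as `_dmarc.example.com` never reach the caller as an SNI value.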
@@ -592,13 +714,14 @@ pub fn is_tls_handshake(first_bytes: &[u8]) -> bool {
        return false;
    }
    
-    // TLS record header: 0x16 (handshake) 0x03 0x01 (TLS 1.0)
+    // TLS ClientHello commonly uses legacy record versions 0x0301 or 0x0303.
    first_bytes[0] == TLS_RECORD_HANDSHAKE 
        && first_bytes[1] == 0x03 
-        && first_bytes[2] == 0x01
+        && (first_bytes[2] == 0x01 || first_bytes[2] == 0x03)
 }

 /// Parse TLS record header, returns (record_type, length)
+
 pub fn parse_tls_record_header(header: &[u8; 5]) -> Option<(u8, u16)> {
    let record_type = header[0];
    let version = [header[1], header[2]];
@@ -667,291 +790,37 @@ fn validate_server_hello_structure(data: &[u8]) -> Result<(), ProxyError> {
    Ok(())
 }

-#[cfg(test)]
-mod tests {
-    use super::*;
-    
-    #[test]
-    fn test_is_tls_handshake() {
-        assert!(is_tls_handshake(&[0x16, 0x03, 0x01]));
-        assert!(is_tls_handshake(&[0x16, 0x03, 0x01, 0x02, 0x00]));
-        assert!(!is_tls_handshake(&[0x17, 0x03, 0x01])); // Application data
-        assert!(!is_tls_handshake(&[0x16, 0x03, 0x02])); // Wrong version
-        assert!(!is_tls_handshake(&[0x16, 0x03])); // Too short
-    }
-    
-    #[test]
-    fn test_parse_tls_record_header() {
-        let header = [0x16, 0x03, 0x01, 0x02, 0x00];
-        let result = parse_tls_record_header(&header).unwrap();
-        assert_eq!(result.0, TLS_RECORD_HANDSHAKE);
-        assert_eq!(result.1, 512);
-        
-        let header = [0x17, 0x03, 0x03, 0x40, 0x00];
-        let result = parse_tls_record_header(&header).unwrap();
-        assert_eq!(result.0, TLS_RECORD_APPLICATION);
-        assert_eq!(result.1, 16384);
-    }
-    
-    #[test]
-    fn test_gen_fake_x25519_key() {
-        let rng = SecureRandom::new();
-        let key1 = gen_fake_x25519_key(&rng);
-        let key2 = gen_fake_x25519_key(&rng);
-        
-        assert_eq!(key1.len(), 32);
-        assert_eq!(key2.len(), 32);
-        assert_ne!(key1, key2); // Should be random
-    }
+// ============= Compile-time Security Invariants =============

-    #[test]
-    fn test_fake_x25519_key_is_quadratic_residue() {
-        let rng = SecureRandom::new();
-        let key = gen_fake_x25519_key(&rng);
-        let p = curve25519_prime();
-        let k_num = BigUint::from_bytes_le(&key);
-        let exponent = (&p - BigUint::one()) >> 1;
-        let legendre = k_num.modpow(&exponent, &p);
-        assert_eq!(legendre, BigUint::one());
-    }
-    
-    #[test]
-    fn test_tls_extension_builder() {
-        let key = [0x42u8; 32];
-        
-        let mut builder = TlsExtensionBuilder::new();
-        builder.add_key_share(&key);
-        builder.add_supported_versions(0x0304);
-        
-        let result = builder.build();
-        
-        // Check length prefix
-        let len = u16::from_be_bytes([result[0], result[1]]) as usize;
-        assert_eq!(len, result.len() - 2);
-        
-        // Check key_share extension is present
-        assert!(result.len() > 40); // At least key share
-    }
-    
-    #[test]
-    fn test_server_hello_builder() {
-        let session_id = vec![0x01, 0x02, 0x03, 0x04];
-        let key = [0x55u8; 32];
-        
-        let builder = ServerHelloBuilder::new(session_id.clone())
-            .with_x25519_key(&key)
-            .with_tls13_version();
-        
-        let record = builder.build_record();
-        
-        // Validate structure
-        validate_server_hello_structure(&record).expect("Invalid ServerHello structure");
-        
-        // Check record type
-        assert_eq!(record[0], TLS_RECORD_HANDSHAKE);
-        
-        // Check version
-        assert_eq!(&record[1..3], &TLS_VERSION);
-        
-        // Check message type (ServerHello = 0x02)
-        assert_eq!(record[5], 0x02);
-    }
-    
-    #[test]
-    fn test_build_server_hello_structure() {
-        let secret = b"test secret";
-        let client_digest = [0x42u8; 32];
-        let session_id = vec![0xAA; 32];
-        
-        let rng = SecureRandom::new();
-        let response = build_server_hello(secret, &client_digest, &session_id, 2048, &rng, None, 0);
-        
-        // Should have at least 3 records
-        assert!(response.len() > 100);
-        
-        // First record should be ServerHello
-        assert_eq!(response[0], TLS_RECORD_HANDSHAKE);
-        
-        // Validate ServerHello structure
-        validate_server_hello_structure(&response).expect("Invalid ServerHello");
-        
-        // Find Change Cipher Spec
-        let server_hello_len = 5 + u16::from_be_bytes([response[3], response[4]]) as usize;
-        let ccs_start = server_hello_len;
-        
-        assert!(response.len() > ccs_start + 6);
-        assert_eq!(response[ccs_start], TLS_RECORD_CHANGE_CIPHER);
-        
-        // Find Application Data
-        let ccs_len = 5 + u16::from_be_bytes([response[ccs_start + 3], response[ccs_start + 4]]) as usize;
-        let app_start = ccs_start + ccs_len;
-        
-        assert!(response.len() > app_start + 5);
-        assert_eq!(response[app_start], TLS_RECORD_APPLICATION);
-    }
-    
-    #[test]
-    fn test_build_server_hello_digest() {
-        let secret = b"test secret key here";
-        let client_digest = [0x42u8; 32];
-        let session_id = vec![0xAA; 32];
-        
-        let rng = SecureRandom::new();
-        let response1 = build_server_hello(secret, &client_digest, &session_id, 1024, &rng, None, 0);
-        let response2 = build_server_hello(secret, &client_digest, &session_id, 1024, &rng, None, 0);
-        
-        // Digest position should have non-zero data
-        let digest1 = &response1[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN];
-        assert!(!digest1.iter().all(|&b| b == 0));
-        
-        // Different calls should have different digests (due to random cert)
-        let digest2 = &response2[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN];
-        assert_ne!(digest1, digest2);
-    }
-    
-    #[test]
-    fn test_server_hello_extensions_length() {
-        let session_id = vec![0x01; 32];
-        let key = [0x55u8; 32];
-        
-        let builder = ServerHelloBuilder::new(session_id)
-            .with_x25519_key(&key)
-            .with_tls13_version();
-        
-        let record = builder.build_record();
-        
-        // Parse to find extensions
-        let msg_start = 5; // After record header
-        let msg_len = u32::from_be_bytes([0, record[6], record[7], record[8]]) as usize;
-        
-        // Skip to session ID
-        let session_id_pos = msg_start + 4 + 2 + 32; // header(4) + version(2) + random(32)
-        let session_id_len = record[session_id_pos] as usize;
-        
-        // Skip to extensions
-        let ext_len_pos = session_id_pos + 1 + session_id_len + 2 + 1; // session_id + cipher(2) + compression(1)
-        let ext_len = u16::from_be_bytes([record[ext_len_pos], record[ext_len_pos + 1]]) as usize;
-        
-        // Verify extensions length matches actual data
-        let extensions_data = &record[ext_len_pos + 2..msg_start + 4 + msg_len];
-        assert_eq!(ext_len, extensions_data.len(), 
-            "Extension length mismatch: declared {}, actual {}", ext_len, extensions_data.len());
-    }
-    
-    #[test]
-    fn test_validate_tls_handshake_format() {
-        // Build a minimal ClientHello-like structure
-        let mut handshake = vec![0u8; 100];
-        
-        // Put a valid-looking digest at position 11
-        handshake[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN]
-            .copy_from_slice(&[0x42; 32]);
-        
-        // Session ID length
-        handshake[TLS_DIGEST_POS + TLS_DIGEST_LEN] = 32;
-        
-        // This won't validate (wrong HMAC) but shouldn't panic
-        let secrets = vec![("test".to_string(), b"secret".to_vec())];
-        let result = validate_tls_handshake(&handshake, &secrets, true);
-        
-        // Should return None (no match) but not panic
-        assert!(result.is_none());
-    }
+/// Compile-time checks that enforce invariants the rest of the code relies on.
+/// Using `static_assertions` ensures these can never silently break across
+/// refactors without a compile error.
+mod compile_time_security_checks {
+    use super::{TLS_DIGEST_LEN, TLS_DIGEST_HALF_LEN};
+    use static_assertions::const_assert;

-    fn build_client_hello_with_exts(exts: Vec<(u16, Vec<u8>)>, host: &str) -> Vec<u8> {
-        let mut body = Vec::new();
-        body.extend_from_slice(&TLS_VERSION); // legacy version
-        body.extend_from_slice(&[0u8; 32]); // random
-        body.push(0); // session id len
-        body.extend_from_slice(&2u16.to_be_bytes()); // cipher suites len
-        body.extend_from_slice(&[0x13, 0x01]); // TLS_AES_128_GCM_SHA256
-        body.push(1); // compression len
-        body.push(0); // null compression
+    // The digest must be exactly one SHA-256 output.
+    const_assert!(TLS_DIGEST_LEN == 32);

-        // Build SNI extension
-        let host_bytes = host.as_bytes();
-        let mut sni_ext = Vec::new();
-        sni_ext.extend_from_slice(&(host_bytes.len() as u16 + 3).to_be_bytes());
-        sni_ext.push(0);
-        sni_ext.extend_from_slice(&(host_bytes.len() as u16).to_be_bytes());
-        sni_ext.extend_from_slice(host_bytes);
+    // Replay-dedup stores the first half; verify it is literally half.
+    const_assert!(TLS_DIGEST_HALF_LEN * 2 == TLS_DIGEST_LEN);

-        let mut ext_blob = Vec::new();
-        for (typ, data) in exts {
-            ext_blob.extend_from_slice(&typ.to_be_bytes());
-            ext_blob.extend_from_slice(&(data.len() as u16).to_be_bytes());
-            ext_blob.extend_from_slice(&data);
-        }
-        // SNI last
-        ext_blob.extend_from_slice(&0x0000u16.to_be_bytes());
-        ext_blob.extend_from_slice(&(sni_ext.len() as u16).to_be_bytes());
-        ext_blob.extend_from_slice(&sni_ext);
-
-        body.extend_from_slice(&(ext_blob.len() as u16).to_be_bytes());
-        body.extend_from_slice(&ext_blob);
-
-        let mut handshake = Vec::new();
-        handshake.push(0x01); // ClientHello
-        let len_bytes = (body.len() as u32).to_be_bytes();
-        handshake.extend_from_slice(&len_bytes[1..4]);
-        handshake.extend_from_slice(&body);
-
-        let mut record = Vec::new();
-        record.push(TLS_RECORD_HANDSHAKE);
-        record.extend_from_slice(&[0x03, 0x01]);
-        record.extend_from_slice(&(handshake.len() as u16).to_be_bytes());
-        record.extend_from_slice(&handshake);
-        record
-    }
-
-    #[test]
-    fn test_extract_sni_with_grease_extension() {
-        // GREASE type 0x0a0a with zero length before SNI
-        let ch = build_client_hello_with_exts(vec![(0x0a0a, Vec::new())], "example.com");
-        let sni = extract_sni_from_client_hello(&ch);
-        assert_eq!(sni.as_deref(), Some("example.com"));
-    }
-
-    #[test]
-    fn test_extract_sni_tolerates_empty_unknown_extension() {
-        let ch = build_client_hello_with_exts(vec![(0x1234, Vec::new())], "test.local");
-        let sni = extract_sni_from_client_hello(&ch);
-        assert_eq!(sni.as_deref(), Some("test.local"));
-    }
-
-    #[test]
-    fn test_extract_alpn_single() {
-        let mut alpn_data = Vec::new();
-        // list length = 3 (1 length byte + "h2")
-        alpn_data.extend_from_slice(&3u16.to_be_bytes());
-        alpn_data.push(2);
-        alpn_data.extend_from_slice(b"h2");
-        let ch = build_client_hello_with_exts(vec![(0x0010, alpn_data)], "alpn.test");
-        let alpn = extract_alpn_from_client_hello(&ch);
-        let alpn_str: Vec<String> = alpn
-            .iter()
-            .map(|p| std::str::from_utf8(p).unwrap().to_string())
-            .collect();
-        assert_eq!(alpn_str, vec!["h2"]);
-    }
-
-    #[test]
-    fn test_extract_alpn_multiple() {
-        let mut alpn_data = Vec::new();
-        // list length = 11 (sum of per-proto lengths including length bytes)
-        alpn_data.extend_from_slice(&11u16.to_be_bytes());
-        alpn_data.push(2);
-        alpn_data.extend_from_slice(b"h2");
-        alpn_data.push(4);
-        alpn_data.extend_from_slice(b"spdy");
-        alpn_data.push(2);
-        alpn_data.extend_from_slice(b"h3");
-        let ch = build_client_hello_with_exts(vec![(0x0010, alpn_data)], "alpn.test");
-        let alpn = extract_alpn_from_client_hello(&ch);
-        let alpn_str: Vec<String> = alpn
-            .iter()
-            .map(|p| std::str::from_utf8(p).unwrap().to_string())
-            .collect();
-        assert_eq!(alpn_str, vec!["h2", "spdy", "h3"]);
-    }
+    // The HMAC check window (28 bytes) plus the embedded timestamp (4 bytes)
+    // must exactly fill the digest.  If TLS_DIGEST_LEN ever changes, these
+    // assertions will catch the mismatch before any timing-oracle fix is broken.
+    const_assert!(28 + 4 == TLS_DIGEST_LEN);
 }
+
+// ============= Security-focused regression tests =============
+
+#[cfg(test)]
+#[path = "tls_security_tests.rs"]
+mod security_tests;
+
+#[cfg(test)]
+#[path = "tls_adversarial_tests.rs"]
+mod adversarial_tests;
+
+#[cfg(test)]
+#[path = "tls_fuzz_security_tests.rs"]
+mod fuzz_security_tests;
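Review note: the same invariants can also be expressed without the `static_assertions` dependency, using `assert!` in a const context (stable since Rust 1.57). This is a sketch under assumptions: `HMAC_CHECK_LEN` and `TIMESTAMP_LEN` are hypothetical names for the literals 28 and 4, and the constant values are restated locally rather than imported from the crate.

```rust
const TLS_DIGEST_LEN: usize = 32;
const TLS_DIGEST_HALF_LEN: usize = 16;
// Hypothetical names for the magic numbers 28 and 4.
const HMAC_CHECK_LEN: usize = 28;
const TIMESTAMP_LEN: usize = 4;

// Each of these fails the build, not the test run, if an invariant breaks.
const _: () = assert!(TLS_DIGEST_LEN == 32);
const _: () = assert!(TLS_DIGEST_HALF_LEN * 2 == TLS_DIGEST_LEN);
const _: () = assert!(HMAC_CHECK_LEN + TIMESTAMP_LEN == TLS_DIGEST_LEN);

/// Split a digest into the MAC-checked prefix and the timestamp suffix.
fn split_digest(digest: &[u8; TLS_DIGEST_LEN]) -> (&[u8], &[u8]) {
    digest.split_at(HMAC_CHECK_LEN)
}
```

Naming the two lengths would also let the comparison and the XOR step refer to the constants instead of repeating `28` and `32` inline.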
@@ -0,0 +1,352 @@
+use super::*;
+use std::time::Instant;
+use crate::crypto::sha256_hmac;
+
+/// Helper to create a byte vector of specific length.
+fn make_garbage(len: usize) -> Vec<u8> {
+    vec![0x42u8; len]
+}
+
+/// Helper to create a valid-looking HMAC digest for test.
+fn make_digest(secret: &[u8], msg: &[u8], ts: u32) -> [u8; 32] {
+    let mut hmac = sha256_hmac(secret, msg);
+    let ts_bytes = ts.to_le_bytes();
+    for i in 0..4 {
+        hmac[28 + i] ^= ts_bytes[i];
+    }
+    hmac
+}
+
+fn make_valid_tls_handshake_with_session_id(
+    secret: &[u8],
+    timestamp: u32,
+    session_id: &[u8],
+) -> Vec<u8> {
+    let session_id_len = session_id.len();
+    let len = TLS_DIGEST_POS + TLS_DIGEST_LEN + 1 + session_id_len;
+    let mut handshake = vec![0x42u8; len];
+
+    handshake[TLS_DIGEST_POS + TLS_DIGEST_LEN] = session_id_len as u8;
+    let sid_start = TLS_DIGEST_POS + TLS_DIGEST_LEN + 1;
+    handshake[sid_start..sid_start + session_id_len].copy_from_slice(session_id);
+    handshake[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN].fill(0);
+
+    let digest = make_digest(secret, &handshake, timestamp);
+
+    handshake[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN]
+        .copy_from_slice(&digest);
+    handshake
+}
+
+fn make_valid_tls_handshake(secret: &[u8], timestamp: u32) -> Vec<u8> {
+    make_valid_tls_handshake_with_session_id(secret, timestamp, &[0x42; 32])
+}
+
+// ------------------------------------------------------------------
+// Truncated Packet Tests (OWASP ASVS 5.1.4, 5.1.5)
+// ------------------------------------------------------------------
+
+#[test]
+fn validate_tls_handshake_truncated_10_bytes_rejected() {
+    let secrets = vec![("user".to_string(), b"secret".to_vec())];
+    let truncated = make_garbage(10);
+    assert!(validate_tls_handshake(&truncated, &secrets, true).is_none());
+}
+
+#[test]
+fn validate_tls_handshake_truncated_at_digest_start_rejected() {
+    let secrets = vec![("user".to_string(), b"secret".to_vec())];
+    // TLS_DIGEST_POS = 11. 11 bytes should be rejected.
+    let truncated = make_garbage(TLS_DIGEST_POS);
+    assert!(validate_tls_handshake(&truncated, &secrets, true).is_none());
+}
+
+#[test]
+fn validate_tls_handshake_truncated_inside_digest_rejected() {
+    let secrets = vec![("user".to_string(), b"secret".to_vec())];
+    // TLS_DIGEST_POS + 16 (half digest)
+    let truncated = make_garbage(TLS_DIGEST_POS + 16);
+    assert!(validate_tls_handshake(&truncated, &secrets, true).is_none());
+}
+
+#[test]
+fn extract_sni_truncated_at_record_header_rejected() {
+    let truncated = make_garbage(3);
+    assert!(extract_sni_from_client_hello(&truncated).is_none());
+}
+
+#[test]
+fn extract_sni_truncated_at_handshake_header_rejected() {
+    let mut truncated = vec![TLS_RECORD_HANDSHAKE, 0x03, 0x03, 0x00, 0x05];
+    truncated.extend_from_slice(&[0x01, 0x00]); // ClientHello type but truncated length
+    assert!(extract_sni_from_client_hello(&truncated).is_none());
+}
+
+// ------------------------------------------------------------------
+// Malformed Extension Parsing Tests
+// ------------------------------------------------------------------
+
+#[test]
+fn extract_sni_with_overlapping_extension_lengths_rejected() {
+    let mut h = vec![0x16, 0x03, 0x03, 0x00, 0x60]; // Record header
+    h.push(0x01); // Handshake type: ClientHello
+    h.extend_from_slice(&[0x00, 0x00, 0x5C]); // Length: 92
+    h.extend_from_slice(&[0x03, 0x03]); // Version
+    h.extend_from_slice(&[0u8; 32]); // Random
+    h.push(0); // Session ID length: 0
+    h.extend_from_slice(&[0x00, 0x02, 0x13, 0x01]); // Cipher suites
+    h.extend_from_slice(&[0x01, 0x00]); // Compression
+    
+    // Extensions start
+    h.extend_from_slice(&[0x00, 0x20]); // Total Extensions length: 32
+    
+    // Extension 1: SNI (type 0)
+    h.extend_from_slice(&[0x00, 0x00]); 
+    h.extend_from_slice(&[0x00, 0x40]); // Claimed len: 64 (OVERFLOWS total extensions len 32)
+    h.extend_from_slice(&[0u8; 64]);
+    
+    assert!(extract_sni_from_client_hello(&h).is_none());
+}
+
+#[test]
+fn extract_sni_with_infinite_loop_potential_extension_rejected() {
+    let mut h = vec![0x16, 0x03, 0x03, 0x00, 0x3F]; // Record header: 63 bytes follow
+    h.push(0x01); // Handshake type: ClientHello
+    h.extend_from_slice(&[0x00, 0x00, 0x3B]); // Length: 59
+    h.extend_from_slice(&[0x03, 0x03]); // Version
+    h.extend_from_slice(&[0u8; 32]); // Random
+    h.push(0); // Session ID length: 0
+    h.extend_from_slice(&[0x00, 0x02, 0x13, 0x01]); // Cipher suites
+    h.extend_from_slice(&[0x01, 0x00]); // Compression
+    
+    // Extensions start
+    h.extend_from_slice(&[0x00, 0x10]); // Total Extensions length: 16
+    
+    // Zero-length extension of unknown type. A parser that failed to advance
+    // past the 4-byte extension header would loop forever here.
+    // Telemt uses `pos += 4 + elen;` so it always advances.
+    h.extend_from_slice(&[0x12, 0x34]); // Unknown type
+    h.extend_from_slice(&[0x00, 0x00]); // Length 0
+    
+    // Fill the rest with garbage
+    h.extend_from_slice(&[0x42; 12]);
+    
+    // We expect it to finish without SNI found
+    assert!(extract_sni_from_client_hello(&h).is_none());
+}
+
+#[test]
+fn extract_sni_with_invalid_hostname_rejected() {
+    let host = b"invalid_host!%^";
+    let mut sni = Vec::new();
+    sni.extend_from_slice(&((host.len() + 3) as u16).to_be_bytes());
+    sni.push(0);
+    sni.extend_from_slice(&(host.len() as u16).to_be_bytes());
+    sni.extend_from_slice(host);
+    
+    let mut h = vec![0x16, 0x03, 0x03, 0x00, 0x47]; // Record header: 71 bytes follow
+    h.push(0x01); // ClientHello
+    h.extend_from_slice(&[0x00, 0x00, 0x43]); // Handshake length: 67
+    h.extend_from_slice(&[0x03, 0x03]);
+    h.extend_from_slice(&[0u8; 32]);
+    h.push(0);
+    h.extend_from_slice(&[0x00, 0x02, 0x13, 0x01]);
+    h.extend_from_slice(&[0x01, 0x00]);
+    
+    let mut ext = Vec::new();
+    ext.extend_from_slice(&0x0000u16.to_be_bytes());
+    ext.extend_from_slice(&(sni.len() as u16).to_be_bytes());
+    ext.extend_from_slice(&sni);
+    
+    h.extend_from_slice(&(ext.len() as u16).to_be_bytes());
+    h.extend_from_slice(&ext);
+    
+    assert!(extract_sni_from_client_hello(&h).is_none(), "Invalid SNI hostname must be rejected");
+}
+
+// ------------------------------------------------------------------
+// Timing Neutrality Tests (OWASP ASVS 5.1.7)
+// ------------------------------------------------------------------
+
+#[test]
+fn validate_tls_handshake_timing_neutrality() {
+    let secret = b"timing_test_secret_32_bytes_long_";
+    let secrets = vec![("u".to_string(), secret.to_vec())];
+
+    let mut base = vec![0x42u8; 100];
+    base[TLS_DIGEST_POS + TLS_DIGEST_LEN] = 32;
+
+    const ITER: usize = 600;
+    const ROUNDS: usize = 7;
+
+    let mut per_round_avg_diff_ns = Vec::with_capacity(ROUNDS);
+
+    for round in 0..ROUNDS {
+        let mut success_h = base.clone();
+        let mut fail_h = base.clone();
+
+        let start_success = Instant::now();
+        for _ in 0..ITER {
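+            // Zero the digest field first: the HMAC is computed over the handshake
+            // with that field cleared, as in make_valid_tls_handshake_with_session_id.
+            success_h[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN].fill(0);
+            let digest = make_digest(secret, &success_h, 0);
+            success_h[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN].copy_from_slice(&digest);
+            let _ = validate_tls_handshake_at_time(&success_h, &secrets, true, 0);
+        }
+        let success_elapsed = start_success.elapsed();
+
+        let start_fail = Instant::now();
+        for i in 0..ITER {
+            // Same preparation as the success path so both loops do equal work.
+            fail_h[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN].fill(0);
+            let mut digest = make_digest(secret, &fail_h, 0);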
+            let flip_idx = (i + round) % (TLS_DIGEST_LEN - 4);
+            digest[flip_idx] ^= 0xFF;
+            fail_h[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN].copy_from_slice(&digest);
+            let _ = validate_tls_handshake_at_time(&fail_h, &secrets, true, 0);
+        }
+        let fail_elapsed = start_fail.elapsed();
+
+        let diff = if success_elapsed > fail_elapsed {
+            success_elapsed - fail_elapsed
+        } else {
+            fail_elapsed - success_elapsed
+        };
+        per_round_avg_diff_ns.push(diff.as_nanos() as f64 / ITER as f64);
+    }
+
+    per_round_avg_diff_ns.sort_by(|a, b| a.partial_cmp(b).unwrap());
+    let median_avg_diff_ns = per_round_avg_diff_ns[ROUNDS / 2];
+
+    // Keep this as a coarse side-channel guard only; noisy shared CI hosts can
+    // introduce microsecond-level jitter that should not fail deterministic suites.
+    assert!(
+        median_avg_diff_ns < 50_000.0,
+        "Median timing delta too large: {} ns/iter",
+        median_avg_diff_ns
+    );
+}
+
+// ------------------------------------------------------------------
+// Adversarial Fingerprinting / Active Probing Tests
+// ------------------------------------------------------------------
+
+#[test]
+fn is_tls_handshake_robustness_against_probing() {
+    // Valid TLS 1.0 ClientHello
+    assert!(is_tls_handshake(&[0x16, 0x03, 0x01]));
+    // Valid TLS 1.2/1.3 ClientHello (Legacy Record Layer)
+    assert!(is_tls_handshake(&[0x16, 0x03, 0x03]));
+    
+    // Invalid record type but matching version
+    assert!(!is_tls_handshake(&[0x17, 0x03, 0x03]));
+    // Plaintext HTTP request
+    assert!(!is_tls_handshake(b"GET / HTTP/1.1"));
+    // Short garbage
+    assert!(!is_tls_handshake(&[0x16, 0x03]));
+}
+
+#[test]
+fn validate_tls_handshake_at_time_strict_boundary() {
+    let secret = b"strict_boundary_secret_32_bytes_";
+    let secrets = vec![("u".to_string(), secret.to_vec())];
+    let now: i64 = 1_000_000_000;
+    
+    // Boundary: exactly TIME_SKEW_MAX (120s past)
+    let ts_past = (now - TIME_SKEW_MAX) as u32;
+    let h = make_valid_tls_handshake_with_session_id(secret, ts_past, &[0x42; 32]);
+    assert!(validate_tls_handshake_at_time(&h, &secrets, false, now).is_some());
+    
+    // Boundary + 1s: should be rejected
+    let ts_too_past = (now - TIME_SKEW_MAX - 1) as u32;
+    let h2 = make_valid_tls_handshake_with_session_id(secret, ts_too_past, &[0x42; 32]);
+    assert!(validate_tls_handshake_at_time(&h2, &secrets, false, now).is_none());
+}
+
+#[test]
+fn extract_sni_with_duplicate_extensions_returns_first() {
+    // Construct a ClientHello with TWO SNI extensions
+    let host1 = b"first.com";
+    let mut sni1 = Vec::new();
+    sni1.extend_from_slice(&((host1.len() + 3) as u16).to_be_bytes());
+    sni1.push(0);
+    sni1.extend_from_slice(&(host1.len() as u16).to_be_bytes());
+    sni1.extend_from_slice(host1);
+    
+    let host2 = b"second.com";
+    let mut sni2 = Vec::new();
+    sni2.extend_from_slice(&((host2.len() + 3) as u16).to_be_bytes());
+    sni2.push(0);
+    sni2.extend_from_slice(&(host2.len() as u16).to_be_bytes());
+    sni2.extend_from_slice(host2);
+    
+    let mut ext = Vec::new();
+    // Ext 1: SNI
+    ext.extend_from_slice(&0x0000u16.to_be_bytes());
+    ext.extend_from_slice(&(sni1.len() as u16).to_be_bytes());
+    ext.extend_from_slice(&sni1);
+    // Ext 2: SNI again
+    ext.extend_from_slice(&0x0000u16.to_be_bytes());
+    ext.extend_from_slice(&(sni2.len() as u16).to_be_bytes());
+    ext.extend_from_slice(&sni2);
+    
+    let mut body = Vec::new();
+    body.extend_from_slice(&[0x03, 0x03]);
+    body.extend_from_slice(&[0u8; 32]);
+    body.push(0);
+    body.extend_from_slice(&[0x00, 0x02, 0x13, 0x01]);
+    body.extend_from_slice(&[0x01, 0x00]);
+    body.extend_from_slice(&(ext.len() as u16).to_be_bytes());
+    body.extend_from_slice(&ext);
+
+    let mut handshake = Vec::new();
+    handshake.push(0x01);
+    let body_len = (body.len() as u32).to_be_bytes();
+    handshake.extend_from_slice(&body_len[1..4]);
+    handshake.extend_from_slice(&body);
+
+    let mut h = Vec::new();
+    h.push(0x16);
+    h.extend_from_slice(&[0x03, 0x03]);
+    h.extend_from_slice(&(handshake.len() as u16).to_be_bytes());
+    h.extend_from_slice(&handshake);
+    
+    // A parser could return the first name, return the second, or fail. OWASP ASVS prefers rejecting unexpected duplicates.
+    // Telemt's `extract_sni_from_client_hello` returns the first one found, so pin that behavior.
+    assert_eq!(extract_sni_from_client_hello(&h).as_deref(), Some("first.com"));
+}
+
+#[test]
+fn extract_alpn_with_malformed_list_rejected() {
+    let mut alpn_payload = Vec::new();
+    alpn_payload.extend_from_slice(&0x0005u16.to_be_bytes()); // Total len 5
+    alpn_payload.push(10); // Declared protocol len 10 (OVERFLOWS total 5)
+    alpn_payload.extend_from_slice(b"h2");
+    
+    let mut ext = Vec::new();
+    ext.extend_from_slice(&0x0010u16.to_be_bytes()); // Type: ALPN (16)
+    ext.extend_from_slice(&(alpn_payload.len() as u16).to_be_bytes());
+    ext.extend_from_slice(&alpn_payload);
+    
+    let mut h = vec![0x16, 0x03, 0x03, 0x00, 0x38, 0x01, 0x00, 0x00, 0x34, 0x03, 0x03];
+    h.extend_from_slice(&[0u8; 32]);
+    h.push(0);
+    h.extend_from_slice(&[0x00, 0x02, 0x13, 0x01, 0x01, 0x00]);
+    h.extend_from_slice(&(ext.len() as u16).to_be_bytes());
+    h.extend_from_slice(&ext);
+    
+    let res = extract_alpn_from_client_hello(&h);
+    assert!(res.is_empty(), "Malformed ALPN list must return empty or fail");
+}
+
+#[test]
+fn extract_sni_with_huge_extension_header_rejected() {
+    let mut h = vec![0x16, 0x03, 0x03, 0x00, 0x00]; // Record header
+    h.push(0x01); // ClientHello
+    h.extend_from_slice(&[0x00, 0xFF, 0xFF]); // Huge length (65535) - overflows record
+    h.extend_from_slice(&[0x03, 0x03]);
+    h.extend_from_slice(&[0u8; 32]);
+    h.push(0);
+    h.extend_from_slice(&[0x00, 0x02, 0x13, 0x01, 0x01, 0x00]);
+    
+    // Extensions start
+    h.extend_from_slice(&[0xFF, 0xFF]); // Total extensions: 65535 (OVERFLOWS everything)
+    
+    assert!(extract_sni_from_client_hello(&h).is_none());
+}
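The three tests above probe how the extension block of a ClientHello is walked: duplicate types, a length prefix that overruns its container, and a total length larger than the input. A minimal sketch of a fail-closed walker follows. `walk_extensions` is a hypothetical helper, not Telemt's parser, and it takes the stricter route of rejecting duplicate extension types, whereas the comment in the test above says Telemt's `extract_sni` returns the first SNI found.

```rust
/// Hypothetical helper (not part of Telemt): walks a TLS extensions block,
/// i.e. the bytes that follow the 2-byte total-length field, and returns
/// `(extension_type, payload)` pairs.
///
/// Returns `None` when a header or payload is truncated, or when an
/// extension type appears more than once.
fn walk_extensions(mut block: &[u8]) -> Option<Vec<(u16, &[u8])>> {
    let mut seen: Vec<u16> = Vec::new();
    let mut out = Vec::new();

    while !block.is_empty() {
        // Each extension starts with a 2-byte type and a 2-byte length.
        if block.len() < 4 {
            return None;
        }
        let ext_type = u16::from_be_bytes([block[0], block[1]]);
        let ext_len = u16::from_be_bytes([block[2], block[3]]) as usize;
        let rest = &block[4..];

        // A declared length larger than the remaining bytes fails closed.
        if ext_len > rest.len() {
            return None;
        }
        // Assumption of this sketch: duplicates are rejected outright.
        if seen.contains(&ext_type) {
            return None;
        }
        seen.push(ext_type);

        out.push((ext_type, &rest[..ext_len]));
        block = &rest[ext_len..];
    }

    Some(out)
}
```

Every slice index is preceded by an explicit length comparison, so attacker-controlled length fields cannot cause an out-of-bounds panic in this sketch.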
@@ -0,0 +1,195 @@
+use super::*;
+use crate::crypto::sha256_hmac;
+use std::panic::catch_unwind;
+
+fn make_valid_tls_handshake_with_session_id(
+    secret: &[u8],
+    timestamp: u32,
+    session_id: &[u8],
+) -> Vec<u8> {
+    let session_id_len = session_id.len();
+    assert!(session_id_len <= u8::MAX as usize);
+
+    let len = TLS_DIGEST_POS + TLS_DIGEST_LEN + 1 + session_id_len;
+    let mut handshake = vec![0x42u8; len];
+    handshake[TLS_DIGEST_POS + TLS_DIGEST_LEN] = session_id_len as u8;
+    let sid_start = TLS_DIGEST_POS + TLS_DIGEST_LEN + 1;
+    handshake[sid_start..sid_start + session_id_len].copy_from_slice(session_id);
+    handshake[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN].fill(0);
+
+    let mut digest = sha256_hmac(secret, &handshake);
+    let ts = timestamp.to_le_bytes();
+    for idx in 0..4 {
+        digest[28 + idx] ^= ts[idx];
+    }
+
+    handshake[TLS_DIGEST_POS..TLS_DIGEST_POS + TLS_DIGEST_LEN].copy_from_slice(&digest);
+    handshake
+}
+
+fn make_valid_client_hello_record(host: &str, alpn_protocols: &[&[u8]]) -> Vec<u8> {
+    let mut body = Vec::new();
+    body.extend_from_slice(&TLS_VERSION);
+    body.extend_from_slice(&[0u8; 32]);
+    body.push(0);
+    body.extend_from_slice(&2u16.to_be_bytes());
+    body.extend_from_slice(&[0x13, 0x01]);
+    body.push(1);
+    body.push(0);
+
+    let mut ext_blob = Vec::new();
+
+    let host_bytes = host.as_bytes();
+    let mut sni_payload = Vec::new();
+    sni_payload.extend_from_slice(&((host_bytes.len() + 3) as u16).to_be_bytes());
+    sni_payload.push(0);
+    sni_payload.extend_from_slice(&(host_bytes.len() as u16).to_be_bytes());
+    sni_payload.extend_from_slice(host_bytes);
+    ext_blob.extend_from_slice(&0x0000u16.to_be_bytes());
+    ext_blob.extend_from_slice(&(sni_payload.len() as u16).to_be_bytes());
+    ext_blob.extend_from_slice(&sni_payload);
+
+    if !alpn_protocols.is_empty() {
+        let mut alpn_list = Vec::new();
+        for proto in alpn_protocols {
+            alpn_list.push(proto.len() as u8);
+            alpn_list.extend_from_slice(proto);
+        }
+        let mut alpn_data = Vec::new();
+        alpn_data.extend_from_slice(&(alpn_list.len() as u16).to_be_bytes());
+        alpn_data.extend_from_slice(&alpn_list);
+
+        ext_blob.extend_from_slice(&0x0010u16.to_be_bytes());
+        ext_blob.extend_from_slice(&(alpn_data.len() as u16).to_be_bytes());
+        ext_blob.extend_from_slice(&alpn_data);
+    }
+
+    body.extend_from_slice(&(ext_blob.len() as u16).to_be_bytes());
+    body.extend_from_slice(&ext_blob);
+
+    let mut handshake = Vec::new();
+    handshake.push(0x01);
+    let body_len = (body.len() as u32).to_be_bytes();
+    handshake.extend_from_slice(&body_len[1..4]);
+    handshake.extend_from_slice(&body);
+
+    let mut record = Vec::new();
+    record.push(TLS_RECORD_HANDSHAKE);
+    record.extend_from_slice(&[0x03, 0x01]);
+    record.extend_from_slice(&(handshake.len() as u16).to_be_bytes());
+    record.extend_from_slice(&handshake);
+    record
+}
+
+#[test]
+fn client_hello_fuzz_corpus_never_panics_or_accepts_corruption() {
+    let valid = make_valid_client_hello_record("example.com", &[b"h2", b"http/1.1"]);
+    assert_eq!(extract_sni_from_client_hello(&valid).as_deref(), Some("example.com"));
+    assert_eq!(
+        extract_alpn_from_client_hello(&valid),
+        vec![b"h2".to_vec(), b"http/1.1".to_vec()]
+    );
+    assert!(
+        extract_sni_from_client_hello(&make_valid_client_hello_record("127.0.0.1", &[])).is_none(),
+        "literal IP hostnames must be rejected"
+    );
+
+    let mut corpus = vec![
+        Vec::new(),
+        vec![0x16, 0x03, 0x03],
+        valid[..9].to_vec(),
+        valid[..valid.len() - 1].to_vec(),
+    ];
+
+    let mut wrong_type = valid.clone();
+    wrong_type[0] = 0x15;
+    corpus.push(wrong_type);
+
+    let mut wrong_handshake = valid.clone();
+    wrong_handshake[5] = 0x02;
+    corpus.push(wrong_handshake);
+
+    let mut wrong_length = valid.clone();
+    wrong_length[3] ^= 0x7f;
+    corpus.push(wrong_length);
+
+    for (idx, input) in corpus.iter().enumerate() {
+        assert!(catch_unwind(|| extract_sni_from_client_hello(input)).is_ok());
+        assert!(catch_unwind(|| extract_alpn_from_client_hello(input)).is_ok());
+
+        if idx == 0 {
+            continue;
+        }
+
+        assert!(extract_sni_from_client_hello(input).is_none(), "corpus item {idx} must fail closed for SNI");
+        assert!(extract_alpn_from_client_hello(input).is_empty(), "corpus item {idx} must fail closed for ALPN");
+    }
+}
+
+#[test]
+fn tls_handshake_fuzz_corpus_never_panics_and_rejects_digest_mutations() {
+    let secret = b"tls_fuzz_security_secret";
+    let now: i64 = 1_700_000_000;
+    let base = make_valid_tls_handshake_with_session_id(secret, now as u32, &[0x42; 32]);
+    let secrets = vec![("fuzz-user".to_string(), secret.to_vec())];
+
+    assert!(validate_tls_handshake_at_time(&base, &secrets, false, now).is_some());
+
+    let mut corpus = Vec::new();
+
+    let mut truncated = base.clone();
+    truncated.truncate(TLS_DIGEST_POS + 16);
+    corpus.push(truncated);
+
+    let mut digest_flip = base.clone();
+    digest_flip[TLS_DIGEST_POS + 7] ^= 0x80;
+    corpus.push(digest_flip);
+
+    let mut session_id_len_overflow = base.clone();
+    session_id_len_overflow[TLS_DIGEST_POS + TLS_DIGEST_LEN] = 33;
+    corpus.push(session_id_len_overflow);
+
+    // Build these through the helper so the digest stays valid and only the
+    // XOR-masked timestamp falls outside the accepted skew window.
+    let timestamp_far_past = make_valid_tls_handshake_with_session_id(
+        secret,
+        (now - i64::from(TIME_SKEW_MAX) - 1) as u32,
+        &[0x42; 32],
+    );
+    corpus.push(timestamp_far_past);
+
+    let timestamp_far_future = make_valid_tls_handshake_with_session_id(
+        secret,
+        (now - TIME_SKEW_MIN + 1) as u32,
+        &[0x42; 32],
+    );
+    corpus.push(timestamp_far_future);
+
+    let mut seed = 0xA5A5_5A5A_F00D_BAAD_u64;
+    for _ in 0..32 {
+        let mut mutated = base.clone();
+        for _ in 0..2 {
+            seed = seed.wrapping_mul(2862933555777941757).wrapping_add(3037000493);
+            let idx = TLS_DIGEST_POS + (seed as usize % TLS_DIGEST_LEN);
+            mutated[idx] ^= ((seed >> 17) as u8).wrapping_add(1);
+        }
+        corpus.push(mutated);
+    }
+
+    for (idx, handshake) in corpus.iter().enumerate() {
+        let result = catch_unwind(|| validate_tls_handshake_at_time(handshake, &secrets, false, now));
+        assert!(result.is_ok(), "corpus item {idx} must not panic");
+        assert!(result.unwrap().is_none(), "corpus item {idx} must fail closed");
+    }
+}
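The corpus in the test above is generated by a fixed-seed linear congruential generator, so every run exercises the same mutations and a failure can be reproduced from the item index alone. A small sketch of that idea is shown here. `lcg_next` and `mutate_window` are hypothetical names, and the mask is forced to be nonzero with `| 1`, which differs slightly from the `wrapping_add(1)` used in the test.

```rust
/// One step of a 64-bit LCG, using the same multiplier and increment as the
/// test above.
fn lcg_next(seed: u64) -> u64 {
    seed.wrapping_mul(2862933555777941757)
        .wrapping_add(3037000493)
}

/// Hypothetical helper: copies `base` and XORs `flips` pseudo-random bytes
/// inside the window `start..start + len`.
///
/// The mask always has its lowest bit set, so every flip changes the byte it
/// touches. Two flips may still land on the same index.
fn mutate_window(
    base: &[u8],
    start: usize,
    len: usize,
    seed: &mut u64,
    flips: usize,
) -> Vec<u8> {
    let mut out = base.to_vec();
    for _ in 0..flips {
        *seed = lcg_next(*seed);
        let idx = start + (*seed as usize % len);
        let mask = ((*seed >> 17) as u8) | 1;
        out[idx] ^= mask;
    }
    out
}
```

Calling `mutate_window` twice with the same starting seed yields identical output, which is the property that keeps a fuzz-style unit test deterministic.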
+
+#[test]
+fn tls_boot_time_acceptance_is_capped_by_replay_window() {
+    let secret = b"tls_boot_time_cap_secret";
+    let secrets = vec![("boot-user".to_string(), secret.to_vec())];
+    let boot_ts = 1u32;
+    let handshake = make_valid_tls_handshake_with_session_id(secret, boot_ts, &[0x42; 32]);
+
+    assert!(
+        validate_tls_handshake_with_replay_window(&handshake, &secrets, false, 300).is_some(),
+        "boot-time timestamp should be accepted while replay window permits it"
+    );
+    assert!(
+        validate_tls_handshake_with_replay_window(&handshake, &secrets, false, 0).is_none(),
+        "boot-time timestamp must be rejected when replay window disables the bypass"
+    );
+}
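The helper `make_valid_tls_handshake_with_session_id` in this file XORs the little-endian timestamp into the last four bytes of the digest. The sketch below illustrates that masking and the matching recovery step. It assumes the verifier recomputes the expected digest and XORs it with the received one, which is inferred from the helper and not shown in this diff. `mask_timestamp` and `unmask_timestamp` are hypothetical names, and a fixed byte pattern stands in for the real HMAC output.

```rust
const DIGEST_LEN: usize = 32;

/// XORs the little-endian timestamp into bytes 28..32 of the digest,
/// mirroring the loop in the test helper.
fn mask_timestamp(mut digest: [u8; DIGEST_LEN], timestamp: u32) -> [u8; DIGEST_LEN] {
    for (byte, ts) in digest[28..].iter_mut().zip(timestamp.to_le_bytes()) {
        *byte ^= ts;
    }
    digest
}

/// Assumed verifier side: the first 28 bytes must match the expected digest
/// exactly; the XOR of the last 4 bytes yields the client timestamp.
fn unmask_timestamp(
    received: &[u8; DIGEST_LEN],
    expected: &[u8; DIGEST_LEN],
) -> Option<u32> {
    // Accumulate differences over the whole prefix before deciding.
    let mut diff = 0u8;
    for i in 0..28 {
        diff |= received[i] ^ expected[i];
    }
    if diff != 0 {
        return None;
    }

    let mut ts = [0u8; 4];
    for i in 0..4 {
        ts[i] = received[28 + i] ^ expected[28 + i];
    }
    Some(u32::from_le_bytes(ts))
}
```

Under this assumption, writing a raw timestamp over bytes 28..32 does not decode to that timestamp, because the verifier still XORs those bytes with the expected digest.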
@@ -1,383 +0,0 @@
-use dashmap::DashMap;
-use std::cmp::max;
-use std::sync::OnceLock;
-use std::time::{Duration, Instant};
-
-const EMA_ALPHA: f64 = 0.2;
-const PROFILE_TTL: Duration = Duration::from_secs(300);
-const THROUGHPUT_UP_BPS: f64 = 8_000_000.0;
-const THROUGHPUT_DOWN_BPS: f64 = 2_000_000.0;
-const RATIO_CONFIRM_THRESHOLD: f64 = 1.12;
-const TIER1_HOLD_TICKS: u32 = 8;
-const TIER2_HOLD_TICKS: u32 = 4;
-const QUIET_DEMOTE_TICKS: u32 = 480;
-const HARD_COOLDOWN_TICKS: u32 = 20;
-const HARD_PENDING_THRESHOLD: u32 = 3;
-const HARD_PARTIAL_RATIO_THRESHOLD: f64 = 0.25;
-const DIRECT_C2S_CAP_BYTES: usize = 128 * 1024;
-const DIRECT_S2C_CAP_BYTES: usize = 512 * 1024;
-const ME_FRAMES_CAP: usize = 96;
-const ME_BYTES_CAP: usize = 384 * 1024;
-const ME_DELAY_MIN_US: u64 = 150;
-
-#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
-pub enum AdaptiveTier {
-    Base = 0,
-    Tier1 = 1,
-    Tier2 = 2,
-    Tier3 = 3,
-}
-
-impl AdaptiveTier {
-    pub fn promote(self) -> Self {
-        match self {
-            Self::Base => Self::Tier1,
-            Self::Tier1 => Self::Tier2,
-            Self::Tier2 => Self::Tier3,
-            Self::Tier3 => Self::Tier3,
-        }
-    }
-
-    pub fn demote(self) -> Self {
-        match self {
-            Self::Base => Self::Base,
-            Self::Tier1 => Self::Base,
-            Self::Tier2 => Self::Tier1,
-            Self::Tier3 => Self::Tier2,
-        }
-    }
-
-    fn ratio(self) -> (usize, usize) {
-        match self {
-            Self::Base => (1, 1),
-            Self::Tier1 => (5, 4),
-            Self::Tier2 => (3, 2),
-            Self::Tier3 => (2, 1),
-        }
-    }
-
-    pub fn as_u8(self) -> u8 {
-        self as u8
-    }
-}
-
-#[derive(Debug, Clone, Copy, PartialEq, Eq)]
-pub enum TierTransitionReason {
-    SoftConfirmed,
-    HardPressure,
-    QuietDemotion,
-}
-
-#[derive(Debug, Clone, Copy, PartialEq, Eq)]
-pub struct TierTransition {
-    pub from: AdaptiveTier,
-    pub to: AdaptiveTier,
-    pub reason: TierTransitionReason,
-}
-
-#[derive(Debug, Clone, Copy, Default)]
-pub struct RelaySignalSample {
-    pub c2s_bytes: u64,
-    pub s2c_requested_bytes: u64,
-    pub s2c_written_bytes: u64,
-    pub s2c_write_ops: u64,
-    pub s2c_partial_writes: u64,
-    pub s2c_consecutive_pending_writes: u32,
-}
-
-#[derive(Debug, Clone, Copy)]
-pub struct SessionAdaptiveController {
-    tier: AdaptiveTier,
-    max_tier_seen: AdaptiveTier,
-    throughput_ema_bps: f64,
-    incoming_ema_bps: f64,
-    outgoing_ema_bps: f64,
-    tier1_hold_ticks: u32,
-    tier2_hold_ticks: u32,
-    quiet_ticks: u32,
-    hard_cooldown_ticks: u32,
-}
-
-impl SessionAdaptiveController {
-    pub fn new(initial_tier: AdaptiveTier) -> Self {
-        Self {
-            tier: initial_tier,
-            max_tier_seen: initial_tier,
-            throughput_ema_bps: 0.0,
-            incoming_ema_bps: 0.0,
-            outgoing_ema_bps: 0.0,
-            tier1_hold_ticks: 0,
-            tier2_hold_ticks: 0,
-            quiet_ticks: 0,
-            hard_cooldown_ticks: 0,
-        }
-    }
-
-    pub fn max_tier_seen(&self) -> AdaptiveTier {
-        self.max_tier_seen
-    }
-
-    pub fn observe(&mut self, sample: RelaySignalSample, tick_secs: f64) -> Option<TierTransition> {
-        if tick_secs <= f64::EPSILON {
-            return None;
-        }
-
-        if self.hard_cooldown_ticks > 0 {
-            self.hard_cooldown_ticks -= 1;
-        }
-
-        let c2s_bps = (sample.c2s_bytes as f64 * 8.0) / tick_secs;
-        let incoming_bps = (sample.s2c_requested_bytes as f64 * 8.0) / tick_secs;
-        let outgoing_bps = (sample.s2c_written_bytes as f64 * 8.0) / tick_secs;
-        let throughput = c2s_bps.max(outgoing_bps);
-
-        self.throughput_ema_bps = ema(self.throughput_ema_bps, throughput);
-        self.incoming_ema_bps = ema(self.incoming_ema_bps, incoming_bps);
-        self.outgoing_ema_bps = ema(self.outgoing_ema_bps, outgoing_bps);
-
-        let tier1_now = self.throughput_ema_bps >= THROUGHPUT_UP_BPS;
-        if tier1_now {
-            self.tier1_hold_ticks = self.tier1_hold_ticks.saturating_add(1);
-        } else {
-            self.tier1_hold_ticks = 0;
-        }
-
-        let ratio = if self.outgoing_ema_bps <= f64::EPSILON {
-            0.0
-        } else {
-            self.incoming_ema_bps / self.outgoing_ema_bps
-        };
-        let tier2_now = ratio >= RATIO_CONFIRM_THRESHOLD;
-        if tier2_now {
-            self.tier2_hold_ticks = self.tier2_hold_ticks.saturating_add(1);
-        } else {
-            self.tier2_hold_ticks = 0;
-        }
-
-        let partial_ratio = if sample.s2c_write_ops == 0 {
-            0.0
-        } else {
-            sample.s2c_partial_writes as f64 / sample.s2c_write_ops as f64
-        };
-        let hard_now = sample.s2c_consecutive_pending_writes >= HARD_PENDING_THRESHOLD
-            || partial_ratio >= HARD_PARTIAL_RATIO_THRESHOLD;
-
-        if hard_now && self.hard_cooldown_ticks == 0 {
-            return self.promote(TierTransitionReason::HardPressure, HARD_COOLDOWN_TICKS);
-        }
-
-        if self.tier1_hold_ticks >= TIER1_HOLD_TICKS && self.tier2_hold_ticks >= TIER2_HOLD_TICKS {
-            return self.promote(TierTransitionReason::SoftConfirmed, 0);
-        }
-
-        let demote_candidate = self.throughput_ema_bps < THROUGHPUT_DOWN_BPS && !tier2_now && !hard_now;
-        if demote_candidate {
-            self.quiet_ticks = self.quiet_ticks.saturating_add(1);
-            if self.quiet_ticks >= QUIET_DEMOTE_TICKS {
-                self.quiet_ticks = 0;
-                return self.demote(TierTransitionReason::QuietDemotion);
-            }
-        } else {
-            self.quiet_ticks = 0;
-        }
-
-        None
-    }
-
-    fn promote(
-        &mut self,
-        reason: TierTransitionReason,
-        hard_cooldown_ticks: u32,
-    ) -> Option<TierTransition> {
-        let from = self.tier;
-        let to = from.promote();
-        if from == to {
-            return None;
-        }
-        self.tier = to;
-        self.max_tier_seen = max(self.max_tier_seen, to);
-        self.hard_cooldown_ticks = hard_cooldown_ticks;
-        self.tier1_hold_ticks = 0;
-        self.tier2_hold_ticks = 0;
-        self.quiet_ticks = 0;
-        Some(TierTransition { from, to, reason })
-    }
-
-    fn demote(&mut self, reason: TierTransitionReason) -> Option<TierTransition> {
-        let from = self.tier;
-        let to = from.demote();
-        if from == to {
-            return None;
-        }
-        self.tier = to;
-        self.tier1_hold_ticks = 0;
-        self.tier2_hold_ticks = 0;
-        Some(TierTransition { from, to, reason })
-    }
-}
-
-#[derive(Debug, Clone, Copy)]
-struct UserAdaptiveProfile {
-    tier: AdaptiveTier,
-    seen_at: Instant,
-}
-
-fn profiles() -> &'static DashMap<String, UserAdaptiveProfile> {
-    static USER_PROFILES: OnceLock<DashMap<String, UserAdaptiveProfile>> = OnceLock::new();
-    USER_PROFILES.get_or_init(DashMap::new)
-}
-
-pub fn seed_tier_for_user(user: &str) -> AdaptiveTier {
-    let now = Instant::now();
-    if let Some(entry) = profiles().get(user) {
-        let value = entry.value();
-        if now.duration_since(value.seen_at) <= PROFILE_TTL {
-            return value.tier;
-        }
-    }
-    AdaptiveTier::Base
-}
-
-pub fn record_user_tier(user: &str, tier: AdaptiveTier) {
-    let now = Instant::now();
-    if let Some(mut entry) = profiles().get_mut(user) {
-        let existing = *entry;
-        let effective = if now.duration_since(existing.seen_at) > PROFILE_TTL {
-            tier
-        } else {
-            max(existing.tier, tier)
-        };
-        *entry = UserAdaptiveProfile {
-            tier: effective,
-            seen_at: now,
-        };
-        return;
-    }
-    profiles().insert(
-        user.to_string(),
-        UserAdaptiveProfile { tier, seen_at: now },
-    );
-}
-
-pub fn direct_copy_buffers_for_tier(
-    tier: AdaptiveTier,
-    base_c2s: usize,
-    base_s2c: usize,
-) -> (usize, usize) {
-    let (num, den) = tier.ratio();
-    (
-        scale(base_c2s, num, den, DIRECT_C2S_CAP_BYTES),
-        scale(base_s2c, num, den, DIRECT_S2C_CAP_BYTES),
-    )
-}
-
-pub fn me_flush_policy_for_tier(
-    tier: AdaptiveTier,
-    base_frames: usize,
-    base_bytes: usize,
-    base_delay: Duration,
-) -> (usize, usize, Duration) {
-    let (num, den) = tier.ratio();
-    let frames = scale(base_frames, num, den, ME_FRAMES_CAP).max(1);
-    let bytes = scale(base_bytes, num, den, ME_BYTES_CAP).max(4096);
-    let delay_us = base_delay.as_micros() as u64;
-    let adjusted_delay_us = match tier {
-        AdaptiveTier::Base => delay_us,
-        AdaptiveTier::Tier1 => (delay_us.saturating_mul(7)).saturating_div(10),
-        AdaptiveTier::Tier2 => delay_us.saturating_div(2),
-        AdaptiveTier::Tier3 => (delay_us.saturating_mul(3)).saturating_div(10),
-    }
-    .max(ME_DELAY_MIN_US)
-    .min(delay_us.max(ME_DELAY_MIN_US));
-    (frames, bytes, Duration::from_micros(adjusted_delay_us))
-}
-
-fn ema(prev: f64, value: f64) -> f64 {
-    if prev <= f64::EPSILON {
-        value
-    } else {
-        (prev * (1.0 - EMA_ALPHA)) + (value * EMA_ALPHA)
-    }
-}
-
-fn scale(base: usize, numerator: usize, denominator: usize, cap: usize) -> usize {
-    let scaled = base
-        .saturating_mul(numerator)
-        .saturating_div(denominator.max(1));
-    scaled.min(cap).max(1)
-}
-
-#[cfg(test)]
-mod tests {
-    use super::*;
-
-    fn sample(
-        c2s_bytes: u64,
-        s2c_requested_bytes: u64,
-        s2c_written_bytes: u64,
-        s2c_write_ops: u64,
-        s2c_partial_writes: u64,
-        s2c_consecutive_pending_writes: u32,
-    ) -> RelaySignalSample {
-        RelaySignalSample {
-            c2s_bytes,
-            s2c_requested_bytes,
-            s2c_written_bytes,
-            s2c_write_ops,
-            s2c_partial_writes,
-            s2c_consecutive_pending_writes,
-        }
-    }
-
-    #[test]
-    fn test_soft_promotion_requires_tier1_and_tier2() {
-        let mut ctrl = SessionAdaptiveController::new(AdaptiveTier::Base);
-        let tick_secs = 0.25;
-        let mut promoted = None;
-        for _ in 0..8 {
-            promoted = ctrl.observe(
-                sample(
-                    300_000, // ~9.6 Mbps
-                    320_000, // incoming > outgoing to confirm tier2
-                    250_000,
-                    10,
-                    0,
-                    0,
-                ),
-                tick_secs,
-            );
-        }
-
-        let transition = promoted.expect("expected soft promotion");
-        assert_eq!(transition.from, AdaptiveTier::Base);
-        assert_eq!(transition.to, AdaptiveTier::Tier1);
-        assert_eq!(transition.reason, TierTransitionReason::SoftConfirmed);
-    }
-
-    #[test]
-    fn test_hard_promotion_on_pending_pressure() {
-        let mut ctrl = SessionAdaptiveController::new(AdaptiveTier::Base);
-        let transition = ctrl
-            .observe(
-                sample(10_000, 20_000, 10_000, 4, 1, 3),
-                0.25,
-            )
-            .expect("expected hard promotion");
-        assert_eq!(transition.reason, TierTransitionReason::HardPressure);
-        assert_eq!(transition.to, AdaptiveTier::Tier1);
-    }
-
-    #[test]
-    fn test_quiet_demotion_is_slow_and_stepwise() {
-        let mut ctrl = SessionAdaptiveController::new(AdaptiveTier::Tier2);
-        let mut demotion = None;
-        for _ in 0..QUIET_DEMOTE_TICKS {
-            demotion = ctrl.observe(sample(1, 1, 1, 1, 0, 0), 0.25);
-        }
-
-        let transition = demotion.expect("expected quiet demotion");
-        assert_eq!(transition.from, AdaptiveTier::Tier2);
-        assert_eq!(transition.to, AdaptiveTier::Tier1);
-        assert_eq!(transition.reason, TierTransitionReason::QuietDemotion);
-    }
-}
@@ -4,7 +4,10 @@ use std::future::Future;
 use std::net::{IpAddr, SocketAddr};
 use std::pin::Pin;
 use std::sync::Arc;
+use std::sync::OnceLock;
+use std::sync::atomic::{AtomicBool, Ordering};
 use std::time::Duration;
+use ipnetwork::IpNetwork;
 use tokio::io::{AsyncRead, AsyncReadExt, AsyncWrite};
 use tokio::net::TcpStream;
 use tokio::time::timeout;
@@ -21,9 +24,50 @@ enum HandshakeOutcome {
    Handled,
 }

+#[must_use = "UserConnectionReservation must be kept alive to retain user/IP reservation until release or drop"]
+struct UserConnectionReservation {
+    stats: Arc<Stats>,
+    ip_tracker: Arc<UserIpTracker>,
+    user: String,
+    ip: IpAddr,
+    active: bool,
+}
+
+impl UserConnectionReservation {
+    fn new(stats: Arc<Stats>, ip_tracker: Arc<UserIpTracker>, user: String, ip: IpAddr) -> Self {
+        Self {
+            stats,
+            ip_tracker,
+            user,
+            ip,
+            active: true,
+        }
+    }
+
+    async fn release(mut self) {
+        if !self.active {
+            return;
+        }
+        self.ip_tracker.remove_ip(&self.user, self.ip).await;
+        self.active = false;
+        self.stats.decrement_user_curr_connects(&self.user);
+    }
+}
+
+impl Drop for UserConnectionReservation {
+    fn drop(&mut self) {
+        if !self.active {
+            return;
+        }
+        self.active = false;
+        self.stats.decrement_user_curr_connects(&self.user);
+        self.ip_tracker.enqueue_cleanup(self.user.clone(), self.ip);
+    }
+}
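`UserConnectionReservation` above is an RAII guard: an `active` flag ensures the counter is decremented exactly once, whether the owner calls `release` or simply drops the value. The following synchronous sketch shows the same shape using only the standard library. `SlotReservation` and `try_acquire` are hypothetical, and the compare-and-swap admission loop is an assumption about how a check-and-reserve step can be made atomic, not a copy of Telemt's `Stats` or `UserIpTracker` code.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical guard holding one slot of a shared connection counter.
struct SlotReservation {
    counter: Arc<AtomicUsize>,
    active: bool,
}

impl SlotReservation {
    /// Reserves a slot only if the counter is below `limit`. The check and
    /// the increment happen in one compare-and-swap, so two racing callers
    /// cannot both take the last slot.
    fn try_acquire(counter: Arc<AtomicUsize>, limit: usize) -> Option<Self> {
        let mut current = counter.load(Ordering::Relaxed);
        loop {
            if current >= limit {
                return None;
            }
            match counter.compare_exchange_weak(
                current,
                current + 1,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Some(Self { counter, active: true }),
                Err(observed) => current = observed,
            }
        }
    }

    /// Explicit release; the `Drop` that follows becomes a no-op.
    fn release(mut self) {
        self.release_once();
    }

    fn release_once(&mut self) {
        if self.active {
            self.active = false;
            self.counter.fetch_sub(1, Ordering::AcqRel);
        }
    }
}

impl Drop for SlotReservation {
    fn drop(&mut self) {
        // Covers early returns and panics where `release` was never called.
        self.release_once();
    }
}
```

Keeping the guard alive for the whole session is what holds the slot, which is why the struct in the diff carries a `#[must_use]` attribute.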
+
 use crate::config::ProxyConfig;
 use crate::crypto::SecureRandom;
-use crate::error::{HandshakeResult, ProxyError, Result};
+use crate::error::{HandshakeResult, ProxyError, Result, StreamError};
 use crate::ip_tracker::UserIpTracker;
 use crate::protocol::constants::*;
 use crate::protocol::tls;
@@ -40,10 +84,21 @@ use crate::proxy::handshake::{HandshakeSuccess, handle_mtproto_handshake, handle
 use crate::proxy::masking::handle_bad_client;
 use crate::proxy::middle_relay::handle_via_middle_proxy;
 use crate::proxy::route_mode::{RelayRouteMode, RouteRuntimeController};
-use crate::proxy::session_eviction::register_session;

 fn beobachten_ttl(config: &ProxyConfig) -> Duration {
-    Duration::from_secs(config.general.beobachten_minutes.saturating_mul(60))
+    let minutes = config.general.beobachten_minutes;
+    if minutes == 0 {
+        static BEOBACHTEN_ZERO_MINUTES_WARNED: OnceLock<AtomicBool> = OnceLock::new();
+        let warned = BEOBACHTEN_ZERO_MINUTES_WARNED.get_or_init(|| AtomicBool::new(false));
+        if !warned.swap(true, Ordering::Relaxed) {
+            warn!(
+                "general.beobachten_minutes=0 is insecure because entries expire immediately; forcing minimum TTL to 1 minute"
+            );
+        }
+        return Duration::from_secs(60);
+    }
+
+    Duration::from_secs(minutes.saturating_mul(60))
 }
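`beobachten_ttl` combines two ideas: clamp an insecure zero value to a safe minimum, and log the warning only once per process. A standalone sketch of that pattern is given below. `first_time` and `clamped_ttl` are hypothetical names, `eprintln!` stands in for the `warn!` macro, and a plain `static AtomicBool` is used because `AtomicBool::new` is a `const fn`, which is a simplification relative to the `OnceLock<AtomicBool>` in the diff.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::Duration;

static ZERO_MINUTES_WARNED: AtomicBool = AtomicBool::new(false);

/// Returns `true` only for the first caller that observes the flag unset.
fn first_time(flag: &AtomicBool) -> bool {
    !flag.swap(true, Ordering::Relaxed)
}

/// Converts minutes to a TTL, forcing a one-minute minimum when the
/// configured value is zero.
fn clamped_ttl(minutes: u64) -> Duration {
    if minutes == 0 {
        if first_time(&ZERO_MINUTES_WARNED) {
            eprintln!("minutes=0 would expire entries immediately; using 1 minute");
        }
        return Duration::from_secs(60);
    }
    // Saturating multiplication avoids overflow on absurd config values.
    Duration::from_secs(minutes.saturating_mul(60))
}
```

The atomic `swap` makes the "warn once" decision race-free without a lock: exactly one thread sees the previous value `false`.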

 fn record_beobachten_class(
@@ -64,14 +119,34 @@ fn record_handshake_failure_class(
    peer_ip: IpAddr,
    error: &ProxyError,
 ) {
-    let class = if error.to_string().contains("expected 64 bytes, got 0") {
-        "expected_64_got_0"
-    } else {
-        "other"
+    let class = match error {
+        ProxyError::Io(err) if err.kind() == std::io::ErrorKind::UnexpectedEof => {
+            "expected_64_got_0"
+        }
+        ProxyError::Stream(StreamError::UnexpectedEof) => "expected_64_got_0",
+        _ => "other",
    };
    record_beobachten_class(beobachten, config, peer_ip, class);
 }
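The hunk above replaces a substring match on the rendered error message with a match on error variants. A reduced sketch of why that matters follows. `DemoError` is a hypothetical stand-in for `ProxyError`, so the variant names are assumptions. With structural matching, a message that merely contains the old text is not misclassified.

```rust
use std::io;

/// Hypothetical error type standing in for `ProxyError`.
#[derive(Debug)]
enum DemoError {
    Io(io::Error),
    UnexpectedEof,
    Other(String),
}

/// Classifies by variant and `io::ErrorKind`, never by message text.
fn classify(error: &DemoError) -> &'static str {
    match error {
        DemoError::Io(err) if err.kind() == io::ErrorKind::UnexpectedEof => {
            "expected_64_got_0"
        }
        DemoError::UnexpectedEof => "expected_64_got_0",
        _ => "other",
    }
}
```

Matching on variants also keeps the classification stable if the wording of an error message changes later.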

+fn is_trusted_proxy_source(peer_ip: IpAddr, trusted: &[IpNetwork]) -> bool {
+    if trusted.is_empty() {
+        static EMPTY_PROXY_TRUST_WARNED: OnceLock<AtomicBool> = OnceLock::new();
+        let warned = EMPTY_PROXY_TRUST_WARNED.get_or_init(|| AtomicBool::new(false));
+        if !warned.swap(true, Ordering::Relaxed) {
+            warn!(
+                "PROXY protocol enabled but server.proxy_protocol_trusted_cidrs is empty; rejecting all PROXY headers by default"
+            );
+        }
+        return false;
+    }
+    trusted.iter().any(|cidr| cidr.contains(peer_ip))
+}
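`is_trusted_proxy_source` denies by default: an empty allow-list rejects every PROXY header, and otherwise the peer must fall inside one of the configured CIDRs. The sketch below reproduces that logic for IPv4 with the standard library only. `Ipv4Cidr` and `is_trusted` are hypothetical; the diff itself relies on the `ipnetwork` crate's `IpNetwork::contains`, which also covers IPv6.

```rust
use std::net::Ipv4Addr;

/// Hypothetical minimal IPv4 CIDR type for illustration.
#[derive(Clone, Copy)]
struct Ipv4Cidr {
    base: Ipv4Addr,
    prefix: u8,
}

impl Ipv4Cidr {
    /// True when `ip` shares the first `prefix` bits with `base`.
    fn contains(&self, ip: Ipv4Addr) -> bool {
        let prefix = u32::from(self.prefix.min(32));
        // A shift by 32 would overflow, so /0 is handled explicitly.
        let mask = if prefix == 0 {
            0
        } else {
            u32::MAX << (32 - prefix)
        };
        (u32::from(self.base) & mask) == (u32::from(ip) & mask)
    }
}

/// Deny by default: `any` over an empty slice is `false`.
fn is_trusted(peer: Ipv4Addr, trusted: &[Ipv4Cidr]) -> bool {
    trusted.iter().any(|cidr| cidr.contains(peer))
}
```

The check is applied to the TCP peer address, not to the source address carried inside the PROXY header, since only the former is unforgeable at this layer.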
+
+fn synthetic_local_addr(port: u16) -> SocketAddr {
+    SocketAddr::from(([0, 0, 0, 0], port))
+}
+
 pub async fn handle_client_stream<S>(
    mut stream: S,
    peer: SocketAddr,
@@ -95,9 +170,7 @@ where
    let mut real_peer = normalize_ip(peer);

    // For non-TCP streams, use a synthetic local address; may be overridden by PROXY protocol dst
-    let mut local_addr: SocketAddr = format!("0.0.0.0:{}", config.server.port)
-        .parse()
-        .unwrap_or_else(|_| "0.0.0.0:443".parse().unwrap());
+    let mut local_addr = synthetic_local_addr(config.server.port);

    if proxy_protocol_enabled {
        let proxy_header_timeout = Duration::from_millis(
@@ -105,6 +178,17 @@ where
        );
        match timeout(proxy_header_timeout, parse_proxy_protocol(&mut stream, peer)).await {
            Ok(Ok(info)) => {
+                if !is_trusted_proxy_source(peer.ip(), &config.server.proxy_protocol_trusted_cidrs)
+                {
+                    stats.increment_connects_bad();
+                    warn!(
+                        peer = %peer,
+                        trusted = ?config.server.proxy_protocol_trusted_cidrs,
+                        "Rejecting PROXY protocol header from untrusted source"
+                    );
+                    record_beobachten_class(&beobachten, &config, peer.ip(), "other");
+                    return Err(ProxyError::InvalidProxyProtocol);
+                }
                debug!(
                    peer = %peer,
                    client = %info.src_addr,
@@ -150,8 +234,13 @@ where
        if is_tls {
            let tls_len = u16::from_be_bytes([first_bytes[3], first_bytes[4]]) as usize;

-            if tls_len < 512 {
-                debug!(peer = %real_peer, tls_len = tls_len, "TLS handshake too short");
+            // RFC 8446 §5.1 mandates that TLSPlaintext records must not exceed 2^14
+            // bytes (16_384). A client claiming a larger record is non-compliant and
+            // may be an active probe attempting to force large allocations.
+            //
+            // Also enforce a minimum record size to avoid trivial/garbage probes.
+            if !(512..=MAX_TLS_RECORD_SIZE).contains(&tls_len) {
+                debug!(peer = %real_peer, tls_len = tls_len, max_tls_len = MAX_TLS_RECORD_SIZE, "TLS handshake length out of bounds");
                stats.increment_connects_bad();
                let (reader, writer) = tokio::io::split(stream);
                handle_bad_client(
@@ -205,9 +294,19 @@ where
                &config, &replay_checker, true, Some(tls_user.as_str()),
            ).await {
                HandshakeResult::Success(result) => result,
-                HandshakeResult::BadClient { reader: _, writer: _ } => {
+                HandshakeResult::BadClient { reader, writer } => {
                    stats.increment_connects_bad();
                    debug!(peer = %peer, "Valid TLS but invalid MTProto handshake");
+                    handle_bad_client(
+                        reader,
+                        writer,
+                        &handshake,
+                        real_peer,
+                        local_addr,
+                        &config,
+                        &beobachten,
+                    )
+                    .await;
                    return Ok(HandshakeOutcome::Handled);
                }
                HandshakeResult::Error(e) => return Err(e),
@@ -382,7 +481,6 @@ impl RunningClientHandler {
    pub async fn run(self) -> Result<()> {
        self.stats.increment_connects_all();
        let peer = self.peer;
-        let _ip_tracker = self.ip_tracker.clone();
        debug!(peer = %peer, "New connection");

        if let Err(e) = configure_client_socket(
@@ -446,6 +544,24 @@ impl RunningClientHandler {
            .await
            {
                Ok(Ok(info)) => {
+                    if !is_trusted_proxy_source(
+                        self.peer.ip(),
+                        &self.config.server.proxy_protocol_trusted_cidrs,
+                    ) {
+                        self.stats.increment_connects_bad();
+                        warn!(
+                            peer = %self.peer,
+                            trusted = ?self.config.server.proxy_protocol_trusted_cidrs,
+                            "Rejecting PROXY protocol header from untrusted source"
+                        );
+                        record_beobachten_class(
+                            &self.beobachten,
+                            &self.config,
+                            self.peer.ip(),
+                            "other",
+                        );
+                        return Err(ProxyError::InvalidProxyProtocol);
+                    }
                    debug!(
                        peer = %self.peer,
                        client = %info.src_addr,
@@ -495,7 +611,6 @@ impl RunningClientHandler {

        let is_tls = tls::is_tls_handshake(&first_bytes[..3]);
        let peer = self.peer;
-        let _ip_tracker = self.ip_tracker.clone();

        debug!(peer = %peer, is_tls = is_tls, "Handshake type detected");

@@ -508,14 +623,15 @@ impl RunningClientHandler {

    async fn handle_tls_client(mut self, first_bytes: [u8; 5], local_addr: SocketAddr) -> Result<HandshakeOutcome> {
        let peer = self.peer;
-        let _ip_tracker = self.ip_tracker.clone();

        let tls_len = u16::from_be_bytes([first_bytes[3], first_bytes[4]]) as usize;

        debug!(peer = %peer, tls_len = tls_len, "Reading TLS handshake");

-        if tls_len < 512 {
-            debug!(peer = %peer, tls_len = tls_len, "TLS handshake too short");
+        // See RFC 8446 §5.1: TLSPlaintext records must not exceed 16_384 bytes.
+        // Treat too-small or too-large lengths as active probes and mask them.
+        if !(512..=MAX_TLS_RECORD_SIZE).contains(&tls_len) {
+            debug!(peer = %peer, tls_len = tls_len, max_tls_len = MAX_TLS_RECORD_SIZE, "TLS handshake length out of bounds");
            self.stats.increment_connects_bad();
            let (reader, writer) = self.stream.into_split();
            handle_bad_client(
@@ -591,12 +707,19 @@ impl RunningClientHandler {
        .await
        {
            HandshakeResult::Success(result) => result,
-            HandshakeResult::BadClient {
-                reader: _,
-                writer: _,
-            } => {
+            HandshakeResult::BadClient { reader, writer } => {
                stats.increment_connects_bad();
                debug!(peer = %peer, "Valid TLS but invalid MTProto handshake");
+                handle_bad_client(
+                    reader,
+                    writer,
+                    &handshake,
+                    peer,
+                    local_addr,
+                    &config,
+                    &self.beobachten,
+                )
+                .await;
                return Ok(HandshakeOutcome::Handled);
            }
            HandshakeResult::Error(e) => return Err(e),
@@ -623,7 +746,6 @@ impl RunningClientHandler {

    async fn handle_direct_client(mut self, first_bytes: [u8; 5], local_addr: SocketAddr) -> Result<HandshakeOutcome> {
        let peer = self.peer;
-        let _ip_tracker = self.ip_tracker.clone();

        if !self.config.general.modes.classic && !self.config.general.modes.secure {
            debug!(peer = %peer, "Non-TLS modes disabled");
@@ -727,21 +849,22 @@ impl RunningClientHandler {
    {
        let user = success.user.clone();

-        if let Err(e) = Self::check_user_limits_static(&user, &config, &stats, peer_addr, &ip_tracker).await {
-            warn!(user = %user, error = %e, "User limit exceeded");
-            return Err(e);
-        }
-
-        let registration = register_session(&user, success.dc_idx);
-        if registration.replaced_existing {
-            stats.increment_reconnect_evict_total();
-            warn!(
-                user = %user,
-                dc = success.dc_idx,
-                "Reconnect detected: replacing active session for user+dc"
-            );
-        }
-        let session_lease = registration.lease;
+        let user_limit_reservation =
+            match Self::acquire_user_connection_reservation_static(
+                &user,
+                &config,
+                stats.clone(),
+                peer_addr,
+                ip_tracker,
+            )
+            .await
+            {
+                Ok(reservation) => reservation,
+                Err(e) => {
+                    warn!(user = %user, error = %e, "User admission check failed");
+                    return Err(e);
+                }
+            };

        let route_snapshot = route_runtime.snapshot();
        let session_id = rng.u64();
@@ -754,7 +877,7 @@ impl RunningClientHandler {
                    client_writer,
                    success,
                    pool.clone(),
-                    stats,
+                    stats.clone(),
                    config,
                    buffer_pool,
                    local_addr,
@@ -762,7 +885,6 @@ impl RunningClientHandler {
                    route_runtime.subscribe(),
                    route_snapshot,
                    session_id,
-                    session_lease.clone(),
                )
                .await
            } else {
@@ -772,14 +894,13 @@ impl RunningClientHandler {
                    client_writer,
                    success,
                    upstream_manager,
-                    stats,
+                    stats.clone(),
                    config,
                    buffer_pool,
                    rng,
                    route_runtime.subscribe(),
                    route_snapshot,
                    session_id,
-                    session_lease.clone(),
                )
                .await
            }
@@ -790,25 +911,78 @@ impl RunningClientHandler {
                client_writer,
                success,
                upstream_manager,
-                stats,
+                stats.clone(),
                config,
                buffer_pool,
                rng,
                route_runtime.subscribe(),
                route_snapshot,
                session_id,
-                session_lease.clone(),
            )
            .await
        };
-
-        ip_tracker.remove_ip(&user, peer_addr.ip()).await;
+        user_limit_reservation.release().await;
        relay_result
    }

+    async fn acquire_user_connection_reservation_static(
+        user: &str,
+        config: &ProxyConfig,
+        stats: Arc<Stats>,
+        peer_addr: SocketAddr,
+        ip_tracker: Arc<UserIpTracker>,
+    ) -> Result<UserConnectionReservation> {
+        if let Some(expiration) = config.access.user_expirations.get(user)
+            && chrono::Utc::now() > *expiration
+        {
+            return Err(ProxyError::UserExpired {
+                user: user.to_string(),
+            });
+        }
+
+        if let Some(quota) = config.access.user_data_quota.get(user)
+            && stats.get_user_total_octets(user) >= *quota
+        {
+            return Err(ProxyError::DataQuotaExceeded {
+                user: user.to_string(),
+            });
+        }
+
+        let limit = config.access.user_max_tcp_conns.get(user).map(|v| *v as u64);
+        if !stats.try_acquire_user_curr_connects(user, limit) {
+            return Err(ProxyError::ConnectionLimitExceeded {
+                user: user.to_string(),
+            });
+        }
+
+        match ip_tracker.check_and_add(user, peer_addr.ip()).await {
+            Ok(()) => {}
+            Err(reason) => {
+                stats.decrement_user_curr_connects(user);
+                warn!(
+                    user = %user,
+                    ip = %peer_addr.ip(),
+                    reason = %reason,
+                    "IP limit exceeded"
+                );
+                return Err(ProxyError::ConnectionLimitExceeded {
+                    user: user.to_string(),
+                });
+            }
+        }
+
+        Ok(UserConnectionReservation::new(
+            stats,
+            ip_tracker,
+            user.to_string(),
+            peer_addr.ip(),
+        ))
+    }
+
+    #[cfg(test)]
    async fn check_user_limits_static(
-        user: &str, 
-        config: &ProxyConfig, 
+        user: &str,
+        config: &ProxyConfig,
        stats: &Stats,
        peer_addr: SocketAddr,
        ip_tracker: &UserIpTracker,
@@ -821,9 +995,32 @@ impl RunningClientHandler {
            });
        }

-        let ip_reserved = match ip_tracker.check_and_add(user, peer_addr.ip()).await {
-            Ok(()) => true,
+        if let Some(quota) = config.access.user_data_quota.get(user)
+            && stats.get_user_total_octets(user) >= *quota
+        {
+            return Err(ProxyError::DataQuotaExceeded {
+                user: user.to_string(),
+            });
+        }
+
+        let limit = config
+            .access
+            .user_max_tcp_conns
+            .get(user)
+            .map(|v| *v as u64);
+        if !stats.try_acquire_user_curr_connects(user, limit) {
+            return Err(ProxyError::ConnectionLimitExceeded {
+                user: user.to_string(),
+            });
+        }
+
+        match ip_tracker.check_and_add(user, peer_addr.ip()).await {
+            Ok(()) => {
+                ip_tracker.remove_ip(user, peer_addr.ip()).await;
+                stats.decrement_user_curr_connects(user);
+            }
            Err(reason) => {
+                stats.decrement_user_curr_connects(user);
                warn!(
                    user = %user,
                    ip = %peer_addr.ip(),
@@ -834,33 +1031,15 @@ impl RunningClientHandler {
                    user: user.to_string(),
                });
            }
-        };
-        // IP limit check
-
-        if let Some(limit) = config.access.user_max_tcp_conns.get(user)
-            && stats.get_user_curr_connects(user) >= *limit as u64
-        {
-            if ip_reserved {
-                ip_tracker.remove_ip(user, peer_addr.ip()).await;
-                stats.increment_ip_reservation_rollback_tcp_limit_total();
-            }
-            return Err(ProxyError::ConnectionLimitExceeded {
-                user: user.to_string(),
-            });
-        }
-
-        if let Some(quota) = config.access.user_data_quota.get(user)
-            && stats.get_user_total_octets(user) >= *quota
-        {
-            if ip_reserved {
-                ip_tracker.remove_ip(user, peer_addr.ip()).await;
-                stats.increment_ip_reservation_rollback_quota_limit_total();
-            }
-            return Err(ProxyError::DataQuotaExceeded {
-                user: user.to_string(),
-            });
        }

        Ok(())
    }
 }
+
+#[cfg(test)]
+#[path = "client_security_tests.rs"]
+mod security_tests;
+#[cfg(test)]
+#[path = "client_adversarial_tests.rs"]
+mod adversarial_tests;
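
Review note: the admission path in this file now goes through `try_acquire_user_curr_connects`, which checks the limit and takes the slot in one call, and returns a reservation that is released when the session ends. Below is a minimal sketch of that pattern, assuming one atomic counter per user. `ConnCounter` and `Reservation` are hypothetical names for illustration, not Telemt types; the real `UserConnectionReservation` also carries the IP tracker and peer IP.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the per-user connection counter kept in `Stats`.
#[derive(Default)]
struct ConnCounter {
    current: AtomicU64,
}

impl ConnCounter {
    /// Takes one slot only if the count is still below `limit`.
    /// The check and the increment are a single atomic update, so
    /// concurrent callers cannot push the count past the limit.
    fn try_acquire(&self, limit: Option<u64>) -> bool {
        self.current
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |cur| match limit {
                Some(max) if cur >= max => None,
                _ => cur.checked_add(1),
            })
            .is_ok()
    }

    /// Gives one slot back; stays at zero instead of underflowing.
    fn release(&self) {
        let _ = self
            .current
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |cur| cur.checked_sub(1));
    }

    fn current(&self) -> u64 {
        self.current.load(Ordering::Acquire)
    }
}

/// Hypothetical guard: holding it means one slot is taken.
struct Reservation {
    counter: Arc<ConnCounter>,
}

impl Reservation {
    /// Returns `None` when the limit is already reached.
    fn acquire(counter: Arc<ConnCounter>, limit: Option<u64>) -> Option<Self> {
        counter.try_acquire(limit).then(|| Reservation { counter })
    }
}

impl Drop for Reservation {
    /// The slot is returned on every exit path, including early returns.
    fn drop(&mut self) {
        self.counter.release();
    }
}
```

With this shape an error return between admission and relay cannot leak a slot, which is the property the stress test in `client_adversarial_tests.rs` asserts when it drops all reservations and expects the counters to return to zero.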
@@ -0,0 +1,109 @@
+use super::*;
+use crate::config::ProxyConfig;
+use crate::stats::Stats;
+use crate::ip_tracker::UserIpTracker;
+use crate::error::ProxyError;
+use std::sync::Arc;
+use std::net::{IpAddr, Ipv4Addr, SocketAddr};
+
+// ------------------------------------------------------------------
+// Priority 3: Massive Concurrency Stress (OWASP ASVS 5.1.6)
+// ------------------------------------------------------------------
+
+#[tokio::test]
+async fn client_stress_1k_connections_limit_strict() {
+    let user = "stress-user";
+    let limit = 512;
+
+    let stats = Arc::new(Stats::new());
+    let ip_tracker = Arc::new(UserIpTracker::new());
+
+    let mut config = ProxyConfig::default();
+    config.access.user_max_tcp_conns.insert(user.to_string(), limit);
+
+    let iterations = 1000;
+    let mut tasks = Vec::new();
+
+    for i in 0..iterations {
+        let stats = Arc::clone(&stats);
+        let ip_tracker = Arc::clone(&ip_tracker);
+        let config = config.clone();
+        let user_str = user.to_string();
+
+        tasks.push(tokio::spawn(async move {
+            let peer = SocketAddr::new(
+                IpAddr::V4(Ipv4Addr::new(127, 0, 0, (i % 254 + 1) as u8)),
+                10000 + (i % 1000) as u16,
+            );
+
+            match RunningClientHandler::acquire_user_connection_reservation_static(
+                &user_str,
+                &config,
+                stats,
+                peer,
+                ip_tracker,
+            ).await {
+                Ok(res) => Ok(res),
+                Err(ProxyError::ConnectionLimitExceeded { .. }) => Err(()),
+                Err(e) => panic!("Unexpected error: {:?}", e),
+            }
+        }));
+    }
+
+    let results = futures::future::join_all(tasks).await;
+    let mut successes = 0;
+    let mut failures = 0;
+    let mut reservations = Vec::new();
+
+    for res in results {
+        match res.unwrap() {
+            Ok(r) => {
+                successes += 1;
+                reservations.push(r);
+            }
+            Err(_) => failures += 1,
+        }
+    }
+
+    assert_eq!(successes, limit, "Should allow exactly 'limit' connections");
+    assert_eq!(failures, iterations - limit, "Should fail the rest with ConnectionLimitExceeded");
+    assert_eq!(stats.get_user_curr_connects(user), limit as u64);
+
+    drop(reservations);
+
+    ip_tracker.drain_cleanup_queue().await;
+
+    assert_eq!(stats.get_user_curr_connects(user), 0, "Stats must converge to 0 after all drops");
+    assert_eq!(ip_tracker.get_active_ip_count(user).await, 0, "IP tracker must converge to 0");
+}
+
+// ------------------------------------------------------------------
+// Priority 3: IP Tracker Race Stress
+// ------------------------------------------------------------------
+
+#[tokio::test]
+async fn client_ip_tracker_race_condition_stress() {
+    let user = "race-user";
+    let ip_tracker = Arc::new(UserIpTracker::new());
+    ip_tracker.set_user_limit(user, 100).await;
+
+    let iterations = 1000;
+    let mut tasks = Vec::new();
+
+    for i in 0..iterations {
+        let ip_tracker = Arc::clone(&ip_tracker);
+        let ip = IpAddr::V4(Ipv4Addr::new(10, 0, 0, (i % 254 + 1) as u8));
+
+        tasks.push(tokio::spawn(async move {
+            for _ in 0..10 {
+                if let Ok(()) = ip_tracker.check_and_add("race-user", ip).await {
+                    ip_tracker.remove_ip("race-user", ip).await;
+                }
+            }
+        }));
+    }
+
+    futures::future::join_all(tasks).await;
+
+    assert_eq!(ip_tracker.get_active_ip_count(user).await, 0, "IP count must be zero after balanced add/remove burst");
+}
@@ -1,7 +1,11 @@
+use std::ffi::OsString;
 use std::fs::OpenOptions;
 use std::io::Write;
 use std::net::SocketAddr;
+use std::path::{Component, Path, PathBuf};
 use std::sync::Arc;
+use std::collections::HashSet;
+use std::sync::{Mutex, OnceLock};

 use tokio::io::{AsyncRead, AsyncWrite, AsyncWriteExt};
 use tokio::net::TcpStream;
@@ -18,12 +22,155 @@ use crate::proxy::route_mode::{
    RelayRouteMode, RouteCutoverState, ROUTE_SWITCH_ERROR_MSG, affected_cutover_state,
    cutover_stagger_delay,
 };
-use crate::proxy::adaptive_buffers;
-use crate::proxy::session_eviction::SessionLease;
 use crate::stats::Stats;
 use crate::stream::{BufferPool, CryptoReader, CryptoWriter};
 use crate::transport::UpstreamManager;

+#[cfg(unix)]
+use std::os::unix::fs::OpenOptionsExt;
+
+const UNKNOWN_DC_LOG_DISTINCT_LIMIT: usize = 1024;
+static LOGGED_UNKNOWN_DCS: OnceLock<Mutex<HashSet<i16>>> = OnceLock::new();
+const MAX_SCOPE_HINT_LEN: usize = 64;
+
+fn validated_scope_hint(user: &str) -> Option<&str> {
+    let scope = user.strip_prefix("scope_")?;
+    if scope.is_empty() || scope.len() > MAX_SCOPE_HINT_LEN {
+        return None;
+    }
+    if scope
+        .bytes()
+        .all(|b| b.is_ascii_alphanumeric() || b == b'-')
+    {
+        Some(scope)
+    } else {
+        None
+    }
+}
+
+#[derive(Clone)]
+struct SanitizedUnknownDcLogPath {
+    resolved_path: PathBuf,
+    allowed_parent: PathBuf,
+    file_name: OsString,
+}
+
+// In tests, this function shares global mutable state. Callers that also use
+// cache-reset helpers must hold `unknown_dc_test_lock()` to keep assertions
+// deterministic under parallel execution.
+fn should_log_unknown_dc(dc_idx: i16) -> bool {
+    let set = LOGGED_UNKNOWN_DCS.get_or_init(|| Mutex::new(HashSet::new()));
+    should_log_unknown_dc_with_set(set, dc_idx)
+}
+
+fn should_log_unknown_dc_with_set(set: &Mutex<HashSet<i16>>, dc_idx: i16) -> bool {
+    match set.lock() {
+        Ok(mut guard) => {
+            if guard.contains(&dc_idx) {
+                return false;
+            }
+            if guard.len() >= UNKNOWN_DC_LOG_DISTINCT_LIMIT {
+                return false;
+            }
+            guard.insert(dc_idx)
+        }
+        // Fail closed on poisoned state to avoid unbounded blocking log writes.
+        Err(_) => false,
+    }
+}
+
+fn sanitize_unknown_dc_log_path(path: &str) -> Option<SanitizedUnknownDcLogPath> {
+    let candidate = Path::new(path);
+    if candidate.as_os_str().is_empty() {
+        return None;
+    }
+    if candidate
+        .components()
+        .any(|component| matches!(component, Component::ParentDir))
+    {
+        return None;
+    }
+
+    let cwd = std::env::current_dir().ok()?;
+    let file_name = candidate.file_name()?;
+    let parent = candidate.parent().unwrap_or_else(|| Path::new("."));
+    let parent_path = if parent.is_absolute() {
+        parent.to_path_buf()
+    } else {
+        cwd.join(parent)
+    };
+    let canonical_parent = parent_path.canonicalize().ok()?;
+    if !canonical_parent.is_dir() {
+        return None;
+    }
+
+    Some(SanitizedUnknownDcLogPath {
+        resolved_path: canonical_parent.join(file_name),
+        allowed_parent: canonical_parent,
+        file_name: file_name.to_os_string(),
+    })
+}
+
+fn unknown_dc_log_path_is_still_safe(path: &SanitizedUnknownDcLogPath) -> bool {
+    let Some(parent) = path.resolved_path.parent() else {
+        return false;
+    };
+    let Ok(current_parent) = parent.canonicalize() else {
+        return false;
+    };
+    if current_parent != path.allowed_parent {
+        return false;
+    }
+
+    if let Ok(canonical_target) = path.resolved_path.canonicalize() {
+        let Some(target_parent) = canonical_target.parent() else {
+            return false;
+        };
+        let Some(target_name) = canonical_target.file_name() else {
+            return false;
+        };
+        if target_parent != path.allowed_parent || target_name != path.file_name {
+            return false;
+        }
+    }
+
+    true
+}
+
+fn open_unknown_dc_log_append(path: &Path) -> std::io::Result<std::fs::File> {
+    #[cfg(unix)]
+    {
+        OpenOptions::new()
+            .create(true)
+            .append(true)
+            .custom_flags(libc::O_NOFOLLOW)
+            .open(path)
+    }
+    #[cfg(not(unix))]
+    {
+        let _ = path;
+        Err(std::io::Error::new(
+            std::io::ErrorKind::PermissionDenied,
+            "unknown_dc_file_log_enabled requires unix O_NOFOLLOW support",
+        ))
+    }
+}
+
+#[cfg(test)]
+fn clear_unknown_dc_log_cache_for_testing() {
+    if let Some(set) = LOGGED_UNKNOWN_DCS.get()
+        && let Ok(mut guard) = set.lock()
+    {
+        guard.clear();
+    }
+}
+
+#[cfg(test)]
+fn unknown_dc_test_lock() -> &'static Mutex<()> {
+    static TEST_LOCK: OnceLock<Mutex<()>> = OnceLock::new();
+    TEST_LOCK.get_or_init(|| Mutex::new(()))
+}
+
 pub(crate) async fn handle_via_direct<R, W>(
    client_reader: CryptoReader<R>,
    client_writer: CryptoWriter<W>,
@@ -36,7 +183,6 @@ pub(crate) async fn handle_via_direct<R, W>(
    mut route_rx: watch::Receiver<RouteCutoverState>,
    route_snapshot: RouteCutoverState,
    session_id: u64,
-    session_lease: SessionLease,
 ) -> Result<()>
 where
    R: AsyncRead + Unpin + Send + 'static,
@@ -55,8 +201,15 @@ where
        "Connecting to Telegram DC"
    );

+    let scope_hint = validated_scope_hint(user);
+    if user.starts_with("scope_") && scope_hint.is_none() {
+        warn!(
+            user = %user,
+            "Ignoring invalid scope hint and falling back to default upstream selection"
+        );
+    }
    let tg_stream = upstream_manager
-        .connect(dc_addr, Some(success.dc_idx), user.strip_prefix("scope_").filter(|s| !s.is_empty()))
+        .connect(dc_addr, Some(success.dc_idx), scope_hint)
        .await?;

    debug!(peer = %success.peer, dc_addr = %dc_addr, "Connected, performing TG handshake");
@@ -67,29 +220,19 @@ where
    debug!(peer = %success.peer, "TG handshake complete, starting relay");

    stats.increment_user_connects(user);
-    stats.increment_user_curr_connects(user);
-    stats.increment_current_connections_direct();
-
-    let seed_tier = adaptive_buffers::seed_tier_for_user(user);
-    let (c2s_copy_buf, s2c_copy_buf) = adaptive_buffers::direct_copy_buffers_for_tier(
-        seed_tier,
-        config.general.direct_relay_copy_buf_c2s_bytes,
-        config.general.direct_relay_copy_buf_s2c_bytes,
-    );
+    let _direct_connection_lease = stats.acquire_direct_connection_lease();

    let relay_result = relay_bidirectional(
        client_reader,
        client_writer,
        tg_reader,
        tg_writer,
-        c2s_copy_buf,
-        s2c_copy_buf,
+        config.general.direct_relay_copy_buf_c2s_bytes,
+        config.general.direct_relay_copy_buf_s2c_bytes,
        user,
-        success.dc_idx,
        Arc::clone(&stats),
+        config.access.user_data_quota.get(user).copied(),
        buffer_pool,
-        session_lease,
-        seed_tier,
    );
    tokio::pin!(relay_result);
    let relay_result = loop {
@@ -121,9 +264,6 @@ where
        }
    };

-    stats.decrement_current_connections_direct();
-    stats.decrement_user_curr_connects(user);
-
    match &relay_result {
        Ok(()) => debug!(user = %user, "Direct relay completed"),
        Err(e) => debug!(user = %user, error = %e, "Direct relay ended with error"),
@@ -175,12 +315,19 @@ fn get_dc_addr_static(dc_idx: i16, config: &ProxyConfig) -> Result<SocketAddr> {
            && let Some(path) = &config.general.unknown_dc_log_path
            && let Ok(handle) = tokio::runtime::Handle::try_current()
        {
-            let path = path.clone();
-            handle.spawn_blocking(move || {
-                if let Ok(mut file) = OpenOptions::new().create(true).append(true).open(path) {
-                    let _ = writeln!(file, "dc_idx={dc_idx}");
+            if let Some(path) = sanitize_unknown_dc_log_path(path) {
+                if should_log_unknown_dc(dc_idx) {
+                    handle.spawn_blocking(move || {
+                        if unknown_dc_log_path_is_still_safe(&path)
+                            && let Ok(mut file) = open_unknown_dc_log_append(&path.resolved_path)
+                        {
+                            let _ = writeln!(file, "dc_idx={dc_idx}");
+                        }
+                    });
                }
-            });
+            } else {
+                warn!(dc_idx = dc_idx, raw_path = %path, "Rejected unsafe unknown DC log path");
+            }
        }
    }

@@ -188,7 +335,7 @@ fn get_dc_addr_static(dc_idx: i16, config: &ProxyConfig) -> Result<SocketAddr> {
    let fallback_idx = if default_dc >= 1 && default_dc <= num_dcs {
        default_dc - 1
    } else {
-        1
+        0
    };

    info!(
@@ -216,8 +363,6 @@ async fn do_tg_handshake_static(
    let (nonce, _tg_enc_key, _tg_enc_iv, _tg_dec_key, _tg_dec_iv) = generate_tg_nonce(
        success.proto_tag,
        success.dc_idx,
-        &success.dec_key,
-        success.dec_iv,
        &success.enc_key,
        success.enc_iv,
        rng,
@@ -243,3 +388,7 @@ async fn do_tg_handshake_static(
        CryptoWriter::new(write_half, tg_encryptor, max_pending),
    ))
 }
+
+#[cfg(test)]
+#[path = "direct_relay_security_tests.rs"]
+mod security_tests;
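
Review note: the accept/reject behaviour of `validated_scope_hint` can be checked in isolation. In the sketch below the function body is copied from the hunk above so that the example compiles on its own; `scope_hints` and the sample user names are hypothetical and exist only for illustration.

```rust
const MAX_SCOPE_HINT_LEN: usize = 64;

// Copied from the diff: accepts `scope_<hint>` where the hint is
// 1..=64 bytes of ASCII alphanumerics or '-'.
fn validated_scope_hint(user: &str) -> Option<&str> {
    let scope = user.strip_prefix("scope_")?;
    if scope.is_empty() || scope.len() > MAX_SCOPE_HINT_LEN {
        return None;
    }
    if scope
        .bytes()
        .all(|b| b.is_ascii_alphanumeric() || b == b'-')
    {
        Some(scope)
    } else {
        None
    }
}

// Hypothetical helper: maps a batch of user names to their scope hints.
fn scope_hints<'a>(users: &[&'a str]) -> Vec<Option<&'a str>> {
    users.iter().copied().map(validated_scope_hint).collect()
}
```

For the inputs `scope_eu-west1`, `scope_`, `scope_a/b`, `scope_eu_west`, and `alice`, only the first yields a hint (`eu-west1`). Note that underscores inside the hint are rejected, so `scope_eu_west` falls back to default upstream selection and triggers the new warning.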
@@ -3,11 +3,18 @@
 #![allow(dead_code)]

 use std::net::SocketAddr;
+use std::collections::HashSet;
+use std::collections::hash_map::RandomState;
+use std::net::{IpAddr, Ipv6Addr};
 use std::sync::Arc;
-use std::time::Duration;
+use std::sync::{Mutex, OnceLock};
+use std::hash::{BuildHasher, Hash, Hasher};
+use std::time::{Duration, Instant};
+use dashmap::DashMap;
+use dashmap::mapref::entry::Entry;
 use tokio::io::{AsyncRead, AsyncWrite, AsyncWriteExt};
 use tracing::{debug, warn, trace};
-use zeroize::Zeroize;
+use zeroize::{Zeroize, Zeroizing};

 use crate::crypto::{sha256, AesCtr, SecureRandom};
 use rand::Rng;
@@ -19,6 +26,463 @@ use crate::stats::ReplayChecker;
 use crate::config::ProxyConfig;
 use crate::tls_front::{TlsFrontCache, emulator};

+const ACCESS_SECRET_BYTES: usize = 16;
+static INVALID_SECRET_WARNED: OnceLock<Mutex<HashSet<(String, String)>>> = OnceLock::new();
+#[cfg(test)]
+const WARNED_SECRET_MAX_ENTRIES: usize = 64;
+#[cfg(not(test))]
+const WARNED_SECRET_MAX_ENTRIES: usize = 1_024;
+
+const AUTH_PROBE_TRACK_RETENTION_SECS: u64 = 10 * 60;
+#[cfg(test)]
+const AUTH_PROBE_TRACK_MAX_ENTRIES: usize = 256;
+#[cfg(not(test))]
+const AUTH_PROBE_TRACK_MAX_ENTRIES: usize = 65_536;
+const AUTH_PROBE_PRUNE_SCAN_LIMIT: usize = 1_024;
+const AUTH_PROBE_BACKOFF_START_FAILS: u32 = 4;
+const AUTH_PROBE_SATURATION_GRACE_FAILS: u32 = 2;
+
+#[cfg(test)]
+const AUTH_PROBE_BACKOFF_BASE_MS: u64 = 1;
+#[cfg(not(test))]
+const AUTH_PROBE_BACKOFF_BASE_MS: u64 = 25;
+
+#[cfg(test)]
+const AUTH_PROBE_BACKOFF_MAX_MS: u64 = 16;
+#[cfg(not(test))]
+const AUTH_PROBE_BACKOFF_MAX_MS: u64 = 1_000;
+
+#[derive(Clone, Copy)]
+struct AuthProbeState {
+    fail_streak: u32,
+    blocked_until: Instant,
+    last_seen: Instant,
+}
+
+#[derive(Clone, Copy)]
+struct AuthProbeSaturationState {
+    fail_streak: u32,
+    blocked_until: Instant,
+    last_seen: Instant,
+}
+
+static AUTH_PROBE_STATE: OnceLock<DashMap<IpAddr, AuthProbeState>> = OnceLock::new();
+static AUTH_PROBE_SATURATION_STATE: OnceLock<Mutex<Option<AuthProbeSaturationState>>> = OnceLock::new();
+static AUTH_PROBE_EVICTION_HASHER: OnceLock<RandomState> = OnceLock::new();
+
+fn auth_probe_state_map() -> &'static DashMap<IpAddr, AuthProbeState> {
+    AUTH_PROBE_STATE.get_or_init(DashMap::new)
+}
+
+fn auth_probe_saturation_state() -> &'static Mutex<Option<AuthProbeSaturationState>> {
+    AUTH_PROBE_SATURATION_STATE.get_or_init(|| Mutex::new(None))
+}
+
+fn normalize_auth_probe_ip(peer_ip: IpAddr) -> IpAddr {
+    match peer_ip {
+        IpAddr::V4(ip) => IpAddr::V4(ip),
+        IpAddr::V6(ip) => {
+            let [a, b, c, d, _, _, _, _] = ip.segments();
+            IpAddr::V6(Ipv6Addr::new(a, b, c, d, 0, 0, 0, 0))
+        }
+    }
+}
+
+fn auth_probe_backoff(fail_streak: u32) -> Duration {
+    if fail_streak < AUTH_PROBE_BACKOFF_START_FAILS {
+        return Duration::ZERO;
+    }
+    let shift = (fail_streak - AUTH_PROBE_BACKOFF_START_FAILS).min(10);
+    let multiplier = 1u64.checked_shl(shift).unwrap_or(u64::MAX);
+    let ms = AUTH_PROBE_BACKOFF_BASE_MS
+        .saturating_mul(multiplier)
+        .min(AUTH_PROBE_BACKOFF_MAX_MS);
+    Duration::from_millis(ms)
+}
+
+fn auth_probe_state_expired(state: &AuthProbeState, now: Instant) -> bool {
+    let retention = Duration::from_secs(AUTH_PROBE_TRACK_RETENTION_SECS);
+    now.duration_since(state.last_seen) > retention
+}
+
+fn auth_probe_eviction_offset(peer_ip: IpAddr, now: Instant) -> usize {
+    let hasher_state = AUTH_PROBE_EVICTION_HASHER.get_or_init(RandomState::new);
+    let mut hasher = hasher_state.build_hasher();
+    peer_ip.hash(&mut hasher);
+    now.hash(&mut hasher);
+    hasher.finish() as usize
+}
+
+fn auth_probe_is_throttled(peer_ip: IpAddr, now: Instant) -> bool {
+    let peer_ip = normalize_auth_probe_ip(peer_ip);
+    let state = auth_probe_state_map();
+    let Some(entry) = state.get(&peer_ip) else {
+        return false;
+    };
+    if auth_probe_state_expired(&entry, now) {
+        drop(entry);
+        state.remove(&peer_ip);
+        return false;
+    }
+    now < entry.blocked_until
+}
+
+fn auth_probe_saturation_grace_exhausted(peer_ip: IpAddr, now: Instant) -> bool {
+    let peer_ip = normalize_auth_probe_ip(peer_ip);
+    let state = auth_probe_state_map();
+    let Some(entry) = state.get(&peer_ip) else {
+        return false;
+    };
+    if auth_probe_state_expired(&entry, now) {
+        drop(entry);
+        state.remove(&peer_ip);
+        return false;
+    }
+
+    entry.fail_streak >= AUTH_PROBE_BACKOFF_START_FAILS + AUTH_PROBE_SATURATION_GRACE_FAILS
+}
+
+fn auth_probe_should_apply_preauth_throttle(peer_ip: IpAddr, now: Instant) -> bool {
+    if !auth_probe_is_throttled(peer_ip, now) {
+        return false;
+    }
+
+    if !auth_probe_saturation_is_throttled(now) {
+        return true;
+    }
+
+    auth_probe_saturation_grace_exhausted(peer_ip, now)
+}
+
+fn auth_probe_saturation_is_throttled(now: Instant) -> bool {
+    let saturation = auth_probe_saturation_state();
+    let mut guard = match saturation.lock() {
+        Ok(guard) => guard,
+        Err(_) => return false,
+    };
+
+    let Some(state) = guard.as_mut() else {
+        return false;
+    };
+
+    if now.duration_since(state.last_seen) > Duration::from_secs(AUTH_PROBE_TRACK_RETENTION_SECS) {
+        *guard = None;
+        return false;
+    }
+
+    now < state.blocked_until
+}
+
+fn auth_probe_note_saturation(now: Instant) {
+    let saturation = auth_probe_saturation_state();
+    let mut guard = match saturation.lock() {
+        Ok(guard) => guard,
+        Err(_) => return,
+    };
+
+    match guard.as_mut() {
+        Some(state)
+            if now.duration_since(state.last_seen)
+                <= Duration::from_secs(AUTH_PROBE_TRACK_RETENTION_SECS) =>
+        {
+            state.fail_streak = state.fail_streak.saturating_add(1);
+            state.last_seen = now;
+            state.blocked_until = now + auth_probe_backoff(state.fail_streak);
+        }
+        _ => {
+            let fail_streak = AUTH_PROBE_BACKOFF_START_FAILS;
+            *guard = Some(AuthProbeSaturationState {
+                fail_streak,
+                blocked_until: now + auth_probe_backoff(fail_streak),
+                last_seen: now,
+            });
+        }
+    }
+}
+
+fn auth_probe_record_failure(peer_ip: IpAddr, now: Instant) {
+    let peer_ip = normalize_auth_probe_ip(peer_ip);
+    let state = auth_probe_state_map();
+    auth_probe_record_failure_with_state(state, peer_ip, now);
+}
+
+fn auth_probe_record_failure_with_state(
+    state: &DashMap<IpAddr, AuthProbeState>,
+    peer_ip: IpAddr,
+    now: Instant,
+) {
+    let make_new_state = || AuthProbeState {
+        fail_streak: 1,
+        blocked_until: now + auth_probe_backoff(1),
+        last_seen: now,
+    };
+
+    let update_existing = |entry: &mut AuthProbeState| {
+        if auth_probe_state_expired(entry, now) {
+            *entry = make_new_state();
+        } else {
+            entry.fail_streak = entry.fail_streak.saturating_add(1);
+            entry.last_seen = now;
+            entry.blocked_until = now + auth_probe_backoff(entry.fail_streak);
+        }
+    };
+
+    match state.entry(peer_ip) {
+        Entry::Occupied(mut entry) => {
+            update_existing(entry.get_mut());
+            return;
+        }
+        Entry::Vacant(_) => {}
+    }
+
+    if state.len() >= AUTH_PROBE_TRACK_MAX_ENTRIES {
+        let mut rounds = 0usize;
+        while state.len() >= AUTH_PROBE_TRACK_MAX_ENTRIES {
+            rounds += 1;
+            if rounds > 8 {
+                auth_probe_note_saturation(now);
+                let mut eviction_candidate: Option<(IpAddr, u32, Instant)> = None;
+                for entry in state.iter().take(AUTH_PROBE_PRUNE_SCAN_LIMIT) {
+                    let key = *entry.key();
+                    let fail_streak = entry.value().fail_streak;
+                    let last_seen = entry.value().last_seen;
+                    match eviction_candidate {
+                        Some((_, current_fail, current_seen))
+                            if fail_streak > current_fail
+                                || (fail_streak == current_fail && last_seen >= current_seen) =>
+                        {
+                        }
+                        _ => eviction_candidate = Some((key, fail_streak, last_seen)),
+                    }
+                }
+
+                let Some((evict_key, _, _)) = eviction_candidate else {
+                    return;
+                };
+                state.remove(&evict_key);
+                break;
+            }
+
+            let mut stale_keys = Vec::new();
+            let mut eviction_candidate: Option<(IpAddr, u32, Instant)> = None;
+            let state_len = state.len();
+            let scan_limit = state_len.min(AUTH_PROBE_PRUNE_SCAN_LIMIT);
+            let start_offset = if state_len == 0 {
+                0
+            } else {
+                auth_probe_eviction_offset(peer_ip, now) % state_len
+            };
+
+            let mut scanned = 0usize;
+            for entry in state.iter().skip(start_offset) {
+                let key = *entry.key();
+                let fail_streak = entry.value().fail_streak;
+                let last_seen = entry.value().last_seen;
+                match eviction_candidate {
+                    Some((_, current_fail, current_seen))
+                        if fail_streak > current_fail
+                            || (fail_streak == current_fail && last_seen >= current_seen) =>
+                    {
+                    }
+                    _ => eviction_candidate = Some((key, fail_streak, last_seen)),
+                }
+                if auth_probe_state_expired(entry.value(), now) {
+                    stale_keys.push(key);
+                }
+                scanned += 1;
+                if scanned >= scan_limit {
+                    break;
+                }
+            }
+
+            if scanned < scan_limit {
+                for entry in state.iter().take(scan_limit - scanned) {
+                    let key = *entry.key();
+                    let fail_streak = entry.value().fail_streak;
+                    let last_seen = entry.value().last_seen;
+                    match eviction_candidate {
+                        Some((_, current_fail, current_seen))
+                            if fail_streak > current_fail
+                                || (fail_streak == current_fail && last_seen >= current_seen) =>
+                        {
+                        }
+                        _ => eviction_candidate = Some((key, fail_streak, last_seen)),
+                    }
+                    if auth_probe_state_expired(entry.value(), now) {
+                        stale_keys.push(key);
+                    }
+                }
+            }
+
+            for stale_key in stale_keys {
+                state.remove(&stale_key);
+            }
+
+            if state.len() < AUTH_PROBE_TRACK_MAX_ENTRIES {
+                break;
+            }
+
+            let Some((evict_key, _, _)) = eviction_candidate else {
+                auth_probe_note_saturation(now);
+                return;
+            };
+            state.remove(&evict_key);
+            auth_probe_note_saturation(now);
+        }
+    }
+
+    match state.entry(peer_ip) {
+        Entry::Occupied(mut entry) => {
+            update_existing(entry.get_mut());
+        }
+        Entry::Vacant(entry) => {
+            entry.insert(make_new_state());
+        }
+    }
+}
+
+fn auth_probe_record_success(peer_ip: IpAddr) {
+    let peer_ip = normalize_auth_probe_ip(peer_ip);
+    let state = auth_probe_state_map();
+    state.remove(&peer_ip);
+}
+
+#[cfg(test)]
+fn clear_auth_probe_state_for_testing() {
+    if let Some(state) = AUTH_PROBE_STATE.get() {
+        state.clear();
+    }
+    if let Some(saturation) = AUTH_PROBE_SATURATION_STATE.get()
+        && let Ok(mut guard) = saturation.lock()
+    {
+        *guard = None;
+    }
+}
+
+#[cfg(test)]
+fn auth_probe_fail_streak_for_testing(peer_ip: IpAddr) -> Option<u32> {
+    let peer_ip = normalize_auth_probe_ip(peer_ip);
+    let state = AUTH_PROBE_STATE.get()?;
+    state.get(&peer_ip).map(|entry| entry.fail_streak)
+}
+
+#[cfg(test)]
+fn auth_probe_is_throttled_for_testing(peer_ip: IpAddr) -> bool {
+    auth_probe_is_throttled(peer_ip, Instant::now())
+}
+
+#[cfg(test)]
+fn auth_probe_saturation_is_throttled_for_testing() -> bool {
+    auth_probe_saturation_is_throttled(Instant::now())
+}
+
+#[cfg(test)]
+fn auth_probe_saturation_is_throttled_at_for_testing(now: Instant) -> bool {
+    auth_probe_saturation_is_throttled(now)
+}
+
+#[cfg(test)]
+fn auth_probe_test_lock() -> &'static Mutex<()> {
+    static TEST_LOCK: OnceLock<Mutex<()>> = OnceLock::new();
+    TEST_LOCK.get_or_init(|| Mutex::new(()))
+}
+
+#[cfg(test)]
+fn clear_warned_secrets_for_testing() {
+    if let Some(warned) = INVALID_SECRET_WARNED.get()
+        && let Ok(mut guard) = warned.lock()
+    {
+        guard.clear();
+    }
+}
+
+#[cfg(test)]
+fn warned_secrets_test_lock() -> &'static Mutex<()> {
+    static TEST_LOCK: OnceLock<Mutex<()>> = OnceLock::new();
+    TEST_LOCK.get_or_init(|| Mutex::new(()))
+}
+
+fn warn_invalid_secret_once(name: &str, reason: &str, expected: usize, got: Option<usize>) {
+    let key = (name.to_string(), reason.to_string());
+    let warned = INVALID_SECRET_WARNED.get_or_init(|| Mutex::new(HashSet::new()));
+    let should_warn = match warned.lock() {
+        Ok(mut guard) => {
+            if !guard.contains(&key) && guard.len() >= WARNED_SECRET_MAX_ENTRIES {
+                false
+            } else {
+                guard.insert(key)
+            }
+        }
+        Err(_) => true,
+    };
+
+    if !should_warn {
+        return;
+    }
+
+    match got {
+        Some(actual) => {
+            warn!(
+                user = %name,
+                expected = expected,
+                got = actual,
+                "Skipping user: access secret has unexpected length"
+            );
+        }
+        None => {
+            warn!(
+                user = %name,
+                "Skipping user: access secret is not valid hex"
+            );
+        }
+    }
+}
+
+fn decode_user_secret(name: &str, secret_hex: &str) -> Option<Vec<u8>> {
+    match hex::decode(secret_hex) {
+        Ok(bytes) if bytes.len() == ACCESS_SECRET_BYTES => Some(bytes),
+        Ok(bytes) => {
+            warn_invalid_secret_once(
+                name,
+                "invalid_length",
+                ACCESS_SECRET_BYTES,
+                Some(bytes.len()),
+            );
+            None
+        }
+        Err(_) => {
+            warn_invalid_secret_once(name, "invalid_hex", ACCESS_SECRET_BYTES, None);
+            None
+        }
+    }
+}
+
+// Decide whether a client-supplied proto tag is allowed given the configured
+// proxy modes and the transport that carried the handshake.
+//
+// A common mistake is to treat `modes.tls` and `modes.secure` as interchangeable
+// even though they correspond to different transport profiles: `modes.tls` is
+// for the TLS-fronted (EE-TLS) path, while `modes.secure` is for direct MTProto
+// over TCP (DD). Enforcing this separation prevents an attacker from using a
+// TLS-capable client to bypass the operator intent for the direct MTProto mode,
+// and vice versa.
+fn mode_enabled_for_proto(config: &ProxyConfig, proto_tag: ProtoTag, is_tls: bool) -> bool {
+    match proto_tag {
+        ProtoTag::Secure => {
+            if is_tls {
+                config.general.modes.tls
+            } else {
+                config.general.modes.secure
+            }
+        }
+        ProtoTag::Intermediate | ProtoTag::Abridged => config.general.modes.classic,
+    }
+}
+
 fn decode_user_secrets(
    config: &ProxyConfig,
    preferred_user: Option<&str>,
@@ -27,7 +491,7 @@ fn decode_user_secrets(

    if let Some(preferred) = preferred_user
        && let Some(secret_hex) = config.access.users.get(preferred)
-        && let Ok(bytes) = hex::decode(secret_hex)
+        && let Some(bytes) = decode_user_secret(preferred, secret_hex)
    {
        secrets.push((preferred.to_string(), bytes));
    }
@@ -36,7 +500,7 @@ fn decode_user_secrets(
        if preferred_user.is_some_and(|preferred| preferred == name.as_str()) {
            continue;
        }
-        if let Ok(bytes) = hex::decode(secret_hex) {
+        if let Some(bytes) = decode_user_secret(name, secret_hex) {
            secrets.push((name.clone(), bytes));
        }
    }
@@ -44,11 +508,29 @@ fn decode_user_secrets(
    secrets
 }

+async fn maybe_apply_server_hello_delay(config: &ProxyConfig) {
+    if config.censorship.server_hello_delay_max_ms == 0 {
+        return;
+    }
+
+    let min = config.censorship.server_hello_delay_min_ms;
+    let max = config.censorship.server_hello_delay_max_ms.max(min);
+    let delay_ms = if max == min {
+        max
+    } else {
+        rand::rng().random_range(min..=max)
+    };
+
+    if delay_ms > 0 {
+        tokio::time::sleep(Duration::from_millis(delay_ms)).await;
+    }
+}
+
 /// Result of successful handshake
 ///
 /// Key material (`dec_key`, `dec_iv`, `enc_key`, `enc_iv`) is
 /// zeroized on drop.
-#[derive(Debug, Clone)]
+#[derive(Debug)]
 pub struct HandshakeSuccess {
    /// Authenticated user name
    pub user: String,
@@ -65,6 +547,7 @@ pub struct HandshakeSuccess {
    /// Client address
    pub peer: SocketAddr,
    /// Whether TLS was used
+    
    pub is_tls: bool,
 }

@@ -94,28 +577,33 @@ where
 {
    debug!(peer = %peer, handshake_len = handshake.len(), "Processing TLS handshake");

+    let throttle_now = Instant::now();
+    if auth_probe_should_apply_preauth_throttle(peer.ip(), throttle_now) {
+        maybe_apply_server_hello_delay(config).await;
+        debug!(peer = %peer, "TLS handshake rejected by pre-auth probe throttle");
+        return HandshakeResult::BadClient { reader, writer };
+    }
+
    if handshake.len() < tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN + 1 {
+        auth_probe_record_failure(peer.ip(), Instant::now());
+        maybe_apply_server_hello_delay(config).await;
        debug!(peer = %peer, "TLS handshake too short");
        return HandshakeResult::BadClient { reader, writer };
    }

-    let digest = &handshake[tls::TLS_DIGEST_POS..tls::TLS_DIGEST_POS + tls::TLS_DIGEST_LEN];
-    let digest_half = &digest[..tls::TLS_DIGEST_HALF_LEN];
+    let client_sni = tls::extract_sni_from_client_hello(handshake);
+    let secrets = decode_user_secrets(config, client_sni.as_deref());

-    if replay_checker.check_and_add_tls_digest(digest_half) {
-        warn!(peer = %peer, "TLS replay attack detected (duplicate digest)");
-        return HandshakeResult::BadClient { reader, writer };
-    }
-
-    let secrets = decode_user_secrets(config, None);
-
-    let validation = match tls::validate_tls_handshake(
+    let validation = match tls::validate_tls_handshake_with_replay_window(
        handshake,
        &secrets,
        config.access.ignore_time_skew,
+        config.access.replay_window_secs,
    ) {
        Some(v) => v,
        None => {
+            auth_probe_record_failure(peer.ip(), Instant::now());
+            maybe_apply_server_hello_delay(config).await;
            debug!(
                peer = %peer, 
                ignore_time_skew = config.access.ignore_time_skew,
@@ -125,16 +613,29 @@ where
        }
    };

+    // Replay tracking is applied only after successful authentication to avoid
+    // letting unauthenticated probes evict valid entries from the replay cache.
+    let digest_half = &validation.digest[..tls::TLS_DIGEST_HALF_LEN];
+    if replay_checker.check_and_add_tls_digest(digest_half) {
+        auth_probe_record_failure(peer.ip(), Instant::now());
+        maybe_apply_server_hello_delay(config).await;
+        warn!(peer = %peer, "TLS replay attack detected (duplicate digest)");
+        return HandshakeResult::BadClient { reader, writer };
+    }
+
    let secret = match secrets.iter().find(|(name, _)| *name == validation.user) {
        Some((_, s)) => s,
-        None => return HandshakeResult::BadClient { reader, writer },
+        None => {
+            maybe_apply_server_hello_delay(config).await;
+            return HandshakeResult::BadClient { reader, writer };
+        }
    };

    let cached = if config.censorship.tls_emulation {
        if let Some(cache) = tls_cache.as_ref() {
-            let selected_domain = if let Some(sni) = tls::extract_sni_from_client_hello(handshake) {
+            let selected_domain = if let Some(sni) = client_sni.as_ref() {
                if cache.contains_domain(&sni).await {
-                    sni
+                    sni.clone()
                } else {
                    config.censorship.tls_domain.clone()
                }
@@ -166,6 +667,10 @@ where
            Some(b"h2".to_vec())
        } else if alpn_list.iter().any(|p| p == b"http/1.1") {
            Some(b"http/1.1".to_vec())
+        } else if !alpn_list.is_empty() {
+            maybe_apply_server_hello_delay(config).await;
+            debug!(peer = %peer, "Client ALPN list has no supported protocol; using masking fallback");
+            return HandshakeResult::BadClient { reader, writer };
        } else {
            None
        }
@@ -196,19 +701,9 @@ where
        )
    };

-    // Optional anti-fingerprint delay before sending ServerHello.
-    if config.censorship.server_hello_delay_max_ms > 0 {
-        let min = config.censorship.server_hello_delay_min_ms;
-        let max = config.censorship.server_hello_delay_max_ms.max(min);
-        let delay_ms = if max == min {
-            max
-        } else {
-            rand::rng().random_range(min..=max)
-        };
-        if delay_ms > 0 {
-            tokio::time::sleep(std::time::Duration::from_millis(delay_ms)).await;
-        }
-    }
+    // Apply the same optional delay budget used by reject paths to reduce
+    // distinguishability between success and fail-closed handshakes.
+    maybe_apply_server_hello_delay(config).await;

    debug!(peer = %peer, response_len = response.len(), "Sending TLS ServerHello");

@@ -228,6 +723,8 @@ where
        "TLS handshake successful"
    );

+    auth_probe_record_success(peer.ip());
+
    HandshakeResult::Success((
        FakeTlsReader::new(reader),
        FakeTlsWriter::new(writer),
@@ -250,15 +747,25 @@ where
    R: AsyncRead + Unpin + Send,
    W: AsyncWrite + Unpin + Send,
 {
-    trace!(peer = %peer, handshake = ?hex::encode(handshake), "MTProto handshake bytes");
+    let handshake_fingerprint = {
+        let digest = sha256(&handshake[..8]);
+        hex::encode(&digest[..4])
+    };
+    trace!(
+        peer = %peer,
+        handshake_fingerprint = %handshake_fingerprint,
+        "MTProto handshake prefix"
+    );

-    let dec_prekey_iv = &handshake[SKIP_LEN..SKIP_LEN + PREKEY_LEN + IV_LEN];
-
-    if replay_checker.check_and_add_handshake(dec_prekey_iv) {
-        warn!(peer = %peer, "MTProto replay attack detected");
+    let throttle_now = Instant::now();
+    if auth_probe_should_apply_preauth_throttle(peer.ip(), throttle_now) {
+        maybe_apply_server_hello_delay(config).await;
+        debug!(peer = %peer, "MTProto handshake rejected by pre-auth probe throttle");
        return HandshakeResult::BadClient { reader, writer };
    }

+    let dec_prekey_iv = &handshake[SKIP_LEN..SKIP_LEN + PREKEY_LEN + IV_LEN];
+
    let enc_prekey_iv: Vec<u8> = dec_prekey_iv.iter().rev().copied().collect();

    let decoded_users = decode_user_secrets(config, preferred_user);
@@ -268,57 +775,66 @@ where
        let dec_prekey = &dec_prekey_iv[..PREKEY_LEN];
        let dec_iv_bytes = &dec_prekey_iv[PREKEY_LEN..];

-        let mut dec_key_input = Vec::with_capacity(PREKEY_LEN + secret.len());
+        let mut dec_key_input = Zeroizing::new(Vec::with_capacity(PREKEY_LEN + secret.len()));
        dec_key_input.extend_from_slice(dec_prekey);
        dec_key_input.extend_from_slice(&secret);
        let dec_key = sha256(&dec_key_input);

-        let dec_iv = u128::from_be_bytes(dec_iv_bytes.try_into().unwrap());
+        let mut dec_iv_arr = [0u8; IV_LEN];
+        dec_iv_arr.copy_from_slice(dec_iv_bytes);
+        let dec_iv = u128::from_be_bytes(dec_iv_arr);

        let mut decryptor = AesCtr::new(&dec_key, dec_iv);
        let decrypted = decryptor.decrypt(handshake);

-        let tag_bytes: [u8; 4] = decrypted[PROTO_TAG_POS..PROTO_TAG_POS + 4]
-            .try_into()
-            .unwrap();
+        let tag_bytes: [u8; 4] = [
+            decrypted[PROTO_TAG_POS],
+            decrypted[PROTO_TAG_POS + 1],
+            decrypted[PROTO_TAG_POS + 2],
+            decrypted[PROTO_TAG_POS + 3],
+        ];

        let proto_tag = match ProtoTag::from_bytes(tag_bytes) {
            Some(tag) => tag,
            None => continue,
        };

-        let mode_ok = match proto_tag {
-            ProtoTag::Secure => {
-                if is_tls {
-                    config.general.modes.tls || config.general.modes.secure
-                } else {
-                    config.general.modes.secure || config.general.modes.tls
-                }
-            }
-            ProtoTag::Intermediate | ProtoTag::Abridged => config.general.modes.classic,
-        };
+        let mode_ok = mode_enabled_for_proto(config, proto_tag, is_tls);

        if !mode_ok {
            debug!(peer = %peer, user = %user, proto = ?proto_tag, "Mode not enabled");
            continue;
        }

-        let dc_idx = i16::from_le_bytes(
-            decrypted[DC_IDX_POS..DC_IDX_POS + 2].try_into().unwrap()
-        );
+        let dc_idx = i16::from_le_bytes([decrypted[DC_IDX_POS], decrypted[DC_IDX_POS + 1]]);

        let enc_prekey = &enc_prekey_iv[..PREKEY_LEN];
        let enc_iv_bytes = &enc_prekey_iv[PREKEY_LEN..];

-        let mut enc_key_input = Vec::with_capacity(PREKEY_LEN + secret.len());
+        let mut enc_key_input = Zeroizing::new(Vec::with_capacity(PREKEY_LEN + secret.len()));
        enc_key_input.extend_from_slice(enc_prekey);
        enc_key_input.extend_from_slice(&secret);
        let enc_key = sha256(&enc_key_input);

-        let enc_iv = u128::from_be_bytes(enc_iv_bytes.try_into().unwrap());
+        let mut enc_iv_arr = [0u8; IV_LEN];
+        enc_iv_arr.copy_from_slice(enc_iv_bytes);
+        let enc_iv = u128::from_be_bytes(enc_iv_arr);

        let encryptor = AesCtr::new(&enc_key, enc_iv);

+        // Apply replay tracking only after successful authentication.
+        //
+        // This ordering prevents an attacker from producing invalid handshakes that
+        // still collide with a valid handshake's replay slot and thus evict a valid
+        // entry from the cache. We accept the cost of performing the full
+        // authentication check first to avoid poisoning the replay cache.
+        if replay_checker.check_and_add_handshake(dec_prekey_iv) {
+            auth_probe_record_failure(peer.ip(), Instant::now());
+            maybe_apply_server_hello_delay(config).await;
+            warn!(peer = %peer, user = %user, "MTProto replay attack detected");
+            return HandshakeResult::BadClient { reader, writer };
+        }
+
        let success = HandshakeSuccess {
            user: user.clone(),
            dc_idx,
@@ -340,6 +856,8 @@ where
            "MTProto handshake successful"
        );

+        auth_probe_record_success(peer.ip());
+
        let max_pending = config.general.crypto_pending_buffer;
        return HandshakeResult::Success((
            CryptoReader::new(reader, decryptor),
@@ -348,6 +866,8 @@ where
        ));
    }

+    auth_probe_record_failure(peer.ip(), Instant::now());
+    maybe_apply_server_hello_delay(config).await;
    debug!(peer = %peer, "MTProto handshake: no matching user found");
    HandshakeResult::BadClient { reader, writer }
 }
@@ -356,8 +876,6 @@ where
 pub fn generate_tg_nonce(
    proto_tag: ProtoTag, 
    dc_idx: i16,
-    _client_dec_key: &[u8; 32],
-    _client_dec_iv: u128,
    client_enc_key: &[u8; 32],
    client_enc_iv: u128,
    rng: &SecureRandom,
@@ -365,14 +883,16 @@ pub fn generate_tg_nonce(
 ) -> ([u8; HANDSHAKE_LEN], [u8; 32], u128, [u8; 32], u128) {
    loop {
        let bytes = rng.bytes(HANDSHAKE_LEN);
-        let mut nonce: [u8; HANDSHAKE_LEN] = bytes.try_into().unwrap();
+        let Ok(mut nonce): Result<[u8; HANDSHAKE_LEN], _> = bytes.try_into() else {
+            continue;
+        };

        if RESERVED_NONCE_FIRST_BYTES.contains(&nonce[0]) { continue; }

-        let first_four: [u8; 4] = nonce[..4].try_into().unwrap();
+        let first_four: [u8; 4] = [nonce[0], nonce[1], nonce[2], nonce[3]];
        if RESERVED_NONCE_BEGINNINGS.contains(&first_four) { continue; }

-        let continue_four: [u8; 4] = nonce[4..8].try_into().unwrap();
+        let continue_four: [u8; 4] = [nonce[4], nonce[5], nonce[6], nonce[7]];
        if RESERVED_NONCE_CONTINUES.contains(&continue_four) { continue; }

        nonce[PROTO_TAG_POS..PROTO_TAG_POS + 4].copy_from_slice(&proto_tag.to_bytes());
@@ -380,7 +900,7 @@ pub fn generate_tg_nonce(
        nonce[DC_IDX_POS..DC_IDX_POS + 2].copy_from_slice(&dc_idx.to_le_bytes());

        if fast_mode {
-            let mut key_iv = Vec::with_capacity(KEY_LEN + IV_LEN);
+            let mut key_iv = Zeroizing::new(Vec::with_capacity(KEY_LEN + IV_LEN));
            key_iv.extend_from_slice(client_enc_key);
            key_iv.extend_from_slice(&client_enc_iv.to_be_bytes());
            key_iv.reverse(); // Python/C behavior: reversed enc_key+enc_iv in nonce
@@ -388,13 +908,19 @@ pub fn generate_tg_nonce(
        }

        let enc_key_iv = &nonce[SKIP_LEN..SKIP_LEN + KEY_LEN + IV_LEN];
-        let dec_key_iv: Vec<u8> = enc_key_iv.iter().rev().copied().collect();
+        let dec_key_iv = Zeroizing::new(enc_key_iv.iter().rev().copied().collect::<Vec<u8>>());

-        let tg_enc_key: [u8; 32] = enc_key_iv[..KEY_LEN].try_into().unwrap();
-        let tg_enc_iv = u128::from_be_bytes(enc_key_iv[KEY_LEN..].try_into().unwrap());
+        let mut tg_enc_key = [0u8; 32];
+        tg_enc_key.copy_from_slice(&enc_key_iv[..KEY_LEN]);
+        let mut tg_enc_iv_arr = [0u8; IV_LEN];
+        tg_enc_iv_arr.copy_from_slice(&enc_key_iv[KEY_LEN..]);
+        let tg_enc_iv = u128::from_be_bytes(tg_enc_iv_arr);

-        let tg_dec_key: [u8; 32] = dec_key_iv[..KEY_LEN].try_into().unwrap();
-        let tg_dec_iv = u128::from_be_bytes(dec_key_iv[KEY_LEN..].try_into().unwrap());
+        let mut tg_dec_key = [0u8; 32];
+        tg_dec_key.copy_from_slice(&dec_key_iv[..KEY_LEN]);
+        let mut tg_dec_iv_arr = [0u8; IV_LEN];
+        tg_dec_iv_arr.copy_from_slice(&dec_key_iv[KEY_LEN..]);
+        let tg_dec_iv = u128::from_be_bytes(tg_dec_iv_arr);

        return (nonce, tg_enc_key, tg_enc_iv, tg_dec_key, tg_dec_iv);
    }
@@ -403,13 +929,19 @@ pub fn generate_tg_nonce(
 /// Encrypt nonce for sending to Telegram and return cipher objects with correct counter state
 pub fn encrypt_tg_nonce_with_ciphers(nonce: &[u8; HANDSHAKE_LEN]) -> (Vec<u8>, AesCtr, AesCtr) {
    let enc_key_iv = &nonce[SKIP_LEN..SKIP_LEN + KEY_LEN + IV_LEN];
-    let dec_key_iv: Vec<u8> = enc_key_iv.iter().rev().copied().collect();
+    let dec_key_iv = Zeroizing::new(enc_key_iv.iter().rev().copied().collect::<Vec<u8>>());

-    let enc_key: [u8; 32] = enc_key_iv[..KEY_LEN].try_into().unwrap();
-    let enc_iv = u128::from_be_bytes(enc_key_iv[KEY_LEN..].try_into().unwrap());
+    let mut enc_key = [0u8; 32];
+    enc_key.copy_from_slice(&enc_key_iv[..KEY_LEN]);
+    let mut enc_iv_arr = [0u8; IV_LEN];
+    enc_iv_arr.copy_from_slice(&enc_key_iv[KEY_LEN..]);
+    let enc_iv = u128::from_be_bytes(enc_iv_arr);

-    let dec_key: [u8; 32] = dec_key_iv[..KEY_LEN].try_into().unwrap();
-    let dec_iv = u128::from_be_bytes(dec_key_iv[KEY_LEN..].try_into().unwrap());
+    let mut dec_key = [0u8; 32];
+    dec_key.copy_from_slice(&dec_key_iv[..KEY_LEN]);
+    let mut dec_iv_arr = [0u8; IV_LEN];
+    dec_iv_arr.copy_from_slice(&dec_key_iv[KEY_LEN..]);
+    let dec_iv = u128::from_be_bytes(dec_iv_arr);

    let mut encryptor = AesCtr::new(&enc_key, enc_iv);
    let encrypted_full = encryptor.encrypt(nonce);  // counter: 0 → 4
@@ -418,91 +950,37 @@ pub fn encrypt_tg_nonce_with_ciphers(nonce: &[u8; HANDSHAKE_LEN]) -> (Vec<u8>, A
    result.extend_from_slice(&encrypted_full[PROTO_TAG_POS..]);

    let decryptor = AesCtr::new(&dec_key, dec_iv);
+    enc_key.zeroize();
+    dec_key.zeroize();

    (result, encryptor, decryptor)
 }

 /// Encrypt nonce for sending to Telegram (legacy function for compatibility)
+
 pub fn encrypt_tg_nonce(nonce: &[u8; HANDSHAKE_LEN]) -> Vec<u8> {
    let (encrypted, _, _) = encrypt_tg_nonce_with_ciphers(nonce);
    encrypted
 }

 #[cfg(test)]
-mod tests {
-    use super::*;
+#[path = "handshake_security_tests.rs"]
+mod security_tests;

-    #[test]
-    fn test_generate_tg_nonce() {
-        let client_dec_key = [0x42u8; 32];
-        let client_dec_iv = 12345u128;
-        let client_enc_key = [0x24u8; 32];
-        let client_enc_iv = 54321u128;
+#[cfg(test)]
+#[path = "handshake_adversarial_tests.rs"]
+mod adversarial_tests;

-        let rng = SecureRandom::new();
-        let (nonce, _tg_enc_key, _tg_enc_iv, _tg_dec_key, _tg_dec_iv) = 
-            generate_tg_nonce(
-                ProtoTag::Secure,
-                2,
-                &client_dec_key,
-                client_dec_iv,
-                &client_enc_key,
-                client_enc_iv,
-                &rng,
-                false,
-            );
+#[cfg(test)]
+#[path = "handshake_fuzz_security_tests.rs"]
+mod fuzz_security_tests;

-        assert_eq!(nonce.len(), HANDSHAKE_LEN);
+/// Compile-time guard: `HandshakeSuccess` holds key material and must never be
+/// `Copy`; silent duplication would undermine zeroize-on-drop. This macro fails
+/// only if all listed traits are implemented, so `Clone` alone is not rejected.
+mod compile_time_security_checks {
+    use super::HandshakeSuccess;
+    use static_assertions::assert_not_impl_all;

-        let tag_bytes: [u8; 4] = nonce[PROTO_TAG_POS..PROTO_TAG_POS + 4].try_into().unwrap();
-        assert_eq!(ProtoTag::from_bytes(tag_bytes), Some(ProtoTag::Secure));
-    }
-
-    #[test]
-    fn test_encrypt_tg_nonce() {
-        let client_dec_key = [0x42u8; 32];
-        let client_dec_iv = 12345u128;
-        let client_enc_key = [0x24u8; 32];
-        let client_enc_iv = 54321u128;
-
-        let rng = SecureRandom::new();
-        let (nonce, _, _, _, _) = 
-            generate_tg_nonce(
-                ProtoTag::Secure,
-                2,
-                &client_dec_key,
-                client_dec_iv,
-                &client_enc_key,
-                client_enc_iv,
-                &rng,
-                false,
-            );
-
-        let encrypted = encrypt_tg_nonce(&nonce);
-
-        assert_eq!(encrypted.len(), HANDSHAKE_LEN);
-        assert_eq!(&encrypted[..PROTO_TAG_POS], &nonce[..PROTO_TAG_POS]);
-        assert_ne!(&encrypted[PROTO_TAG_POS..], &nonce[PROTO_TAG_POS..]);
-    }
-
-    #[test]
-    fn test_handshake_success_zeroize_on_drop() {
-        let success = HandshakeSuccess {
-            user: "test".to_string(),
-            dc_idx: 2,
-            proto_tag: ProtoTag::Secure,
-            dec_key: [0xAA; 32],
-            dec_iv: 0xBBBBBBBB,
-            enc_key: [0xCC; 32],
-            enc_iv: 0xDDDDDDDD,
-            peer: "127.0.0.1:1234".parse().unwrap(),
-            is_tls: true,
-        };
-
-        assert_eq!(success.dec_key, [0xAA; 32]);
-        assert_eq!(success.enc_key, [0xCC; 32]);
-
-        drop(success);
-        // Drop impl zeroizes key material without panic
-    }
+    assert_not_impl_all!(HandshakeSuccess: Copy, Clone);
 }
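Review note: the replay-after-authentication ordering that both handshake paths above now follow can be sketched in isolation. This is a minimal illustration using only the standard library, not Telemt's actual `ReplayChecker`; the `ReplayCache` type, the `Outcome` enum, and the `handle` function are hypothetical names introduced here, and the FIFO eviction policy is an assumption made for the sketch.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical bounded replay cache: remembers up to `capacity` digests and
/// evicts the oldest one when full. This is NOT Telemt's `ReplayChecker`.
struct ReplayCache {
    capacity: usize,
    order: VecDeque<Vec<u8>>,
    seen: HashSet<Vec<u8>>,
}

impl ReplayCache {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            order: VecDeque::new(),
            seen: HashSet::new(),
        }
    }

    /// Returns true if `digest` was already recorded (a replay);
    /// otherwise records it, evicting the oldest entry when at capacity.
    fn check_and_add(&mut self, digest: &[u8]) -> bool {
        if self.seen.contains(digest) {
            return true;
        }
        if self.order.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.seen.remove(&oldest);
            }
        }
        self.order.push_back(digest.to_vec());
        self.seen.insert(digest.to_vec());
        false
    }
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Accepted,
    BadAuth,
    Replay,
}

/// Authenticate first and consult the replay cache only afterwards, so that
/// unauthenticated probes never insert into (or evict from) the cache.
/// `authenticated` stands in for the real secret-based validation.
fn handle(cache: &mut ReplayCache, digest: &[u8], authenticated: bool) -> Outcome {
    if !authenticated {
        return Outcome::BadAuth; // cache is left untouched
    }
    if cache.check_and_add(digest) {
        return Outcome::Replay;
    }
    Outcome::Accepted
}
```

With a cache of capacity 2, one accepted handshake followed by any number of unauthenticated probes still leaves the original digest in place, so a later replay of it is detected. Under the previous ordering (replay check before authentication), the same probes would have pushed the valid entry out of a cache this small.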
@@ -0,0 +1,231 @@
+use super::*;
+use std::sync::Arc;
+use std::net::{IpAddr, Ipv4Addr};
+use std::time::{Duration, Instant};
+use crate::crypto::sha256;
+
+fn make_valid_mtproto_handshake(secret_hex: &str, proto_tag: ProtoTag, dc_idx: i16) -> [u8; HANDSHAKE_LEN] {
+    let secret = hex::decode(secret_hex).expect("secret hex must decode");
+    let mut handshake = [0x5Au8; HANDSHAKE_LEN];
+    for (idx, b) in handshake[SKIP_LEN..SKIP_LEN + PREKEY_LEN + IV_LEN]
+        .iter_mut()
+        .enumerate()
+    {
+        *b = (idx as u8).wrapping_add(1);
+    }
+
+    let dec_prekey = &handshake[SKIP_LEN..SKIP_LEN + PREKEY_LEN];
+    let dec_iv_bytes = &handshake[SKIP_LEN + PREKEY_LEN..SKIP_LEN + PREKEY_LEN + IV_LEN];
+
+    let mut dec_key_input = Vec::with_capacity(PREKEY_LEN + secret.len());
+    dec_key_input.extend_from_slice(dec_prekey);
+    dec_key_input.extend_from_slice(&secret);
+    let dec_key = sha256(&dec_key_input);
+
+    let mut dec_iv_arr = [0u8; IV_LEN];
+    dec_iv_arr.copy_from_slice(dec_iv_bytes);
+    let dec_iv = u128::from_be_bytes(dec_iv_arr);
+
+    let mut stream = AesCtr::new(&dec_key, dec_iv);
+    let keystream = stream.encrypt(&[0u8; HANDSHAKE_LEN]);
+
+    let mut target_plain = [0u8; HANDSHAKE_LEN];
+    target_plain[PROTO_TAG_POS..PROTO_TAG_POS + 4].copy_from_slice(&proto_tag.to_bytes());
+    target_plain[DC_IDX_POS..DC_IDX_POS + 2].copy_from_slice(&dc_idx.to_le_bytes());
+
+    for idx in PROTO_TAG_POS..HANDSHAKE_LEN {
+        handshake[idx] = target_plain[idx] ^ keystream[idx];
+    }
+
+    handshake
+}
+
+fn auth_probe_test_guard() -> std::sync::MutexGuard<'static, ()> {
+    auth_probe_test_lock()
+        .lock()
+        .unwrap_or_else(|poisoned| poisoned.into_inner())
+}
+
+fn test_config_with_secret_hex(secret_hex: &str) -> ProxyConfig {
+    let mut cfg = ProxyConfig::default();
+    cfg.access.users.clear();
+    cfg.access.users.insert("user".to_string(), secret_hex.to_string());
+    cfg.access.ignore_time_skew = true;
+    cfg.general.modes.secure = true;
+    cfg
+}
+
+// ------------------------------------------------------------------
+// Mutational Bit-Flipping Tests (OWASP ASVS 5.1.4)
+// ------------------------------------------------------------------
+
+#[tokio::test]
+async fn mtproto_handshake_bit_flip_anywhere_rejected() {
+    let _guard = auth_probe_test_guard();
+    clear_auth_probe_state_for_testing();
+
+    let secret_hex = "11223344556677889900aabbccddeeff";
+    let base = make_valid_mtproto_handshake(secret_hex, ProtoTag::Secure, 2);
+    let config = test_config_with_secret_hex(secret_hex);
+    let replay_checker = ReplayChecker::new(128, Duration::from_secs(60));
+    let peer: SocketAddr = "192.0.2.1:12345".parse().unwrap();
+
+    // Baseline check
+    let res = handle_mtproto_handshake(&base, tokio::io::empty(), tokio::io::sink(), peer, &config, &replay_checker, false, None).await;
+    match res {
+        HandshakeResult::Success(_) => {},
+        _ => panic!("Baseline failed: expected Success"),
+    }
+
+    // Flip one bit in each byte from SKIP_LEN onward (prekey, IV, and the trailing tag/DC bytes)
+    for byte_pos in SKIP_LEN..HANDSHAKE_LEN {
+        let mut h = base;
+        h[byte_pos] ^= 0x01; // Flip 1 bit
+        let res = handle_mtproto_handshake(&h, tokio::io::empty(), tokio::io::sink(), peer, &config, &replay_checker, false, None).await;
+        assert!(matches!(res, HandshakeResult::BadClient { .. }), "Flip at byte {byte_pos} bit 0 must be rejected");
+    }
+}
+
+// ------------------------------------------------------------------
+// Adversarial Probing / Timing Neutrality (OWASP ASVS 5.1.7)
+// ------------------------------------------------------------------
+
+#[tokio::test]
+async fn mtproto_handshake_timing_neutrality_mocked() {
+    let secret_hex = "00112233445566778899aabbccddeeff";
+    let base = make_valid_mtproto_handshake(secret_hex, ProtoTag::Secure, 1);
+    let config = test_config_with_secret_hex(secret_hex);
+    let replay_checker = ReplayChecker::new(128, Duration::from_secs(60));
+    let peer: SocketAddr = "192.0.2.2:54321".parse().unwrap();
+
+    const ITER: usize = 50;
+    
+    let mut start = Instant::now();
+    for _ in 0..ITER {
+        let _ = handle_mtproto_handshake(&base, tokio::io::empty(), tokio::io::sink(), peer, &config, &replay_checker, false, None).await;
+    }
+    let duration_success = start.elapsed();
+
+    start = Instant::now();
+    for i in 0..ITER {
+        let mut h = base;
+        h[SKIP_LEN + (i % (PREKEY_LEN + IV_LEN))] ^= 0xFF;
+        let _ = handle_mtproto_handshake(&h, tokio::io::empty(), tokio::io::sink(), peer, &config, &replay_checker, false, None).await;
+    }
+    let duration_fail = start.elapsed();
+
+    let avg_diff_ms = (duration_success.as_secs_f64() - duration_fail.as_secs_f64()).abs() * 1000.0 / ITER as f64;
+    
+    // Threshold (loose for CI)
+    assert!(avg_diff_ms < 100.0, "Timing difference too large: {} ms/iter", avg_diff_ms);
+}
+
+// ------------------------------------------------------------------
+// Stress Tests (OWASP ASVS 5.1.6)
+// ------------------------------------------------------------------
+
+#[tokio::test]
+async fn auth_probe_throttle_saturation_stress() {
+    let _guard = auth_probe_test_guard();
+    clear_auth_probe_state_for_testing();
+
+    let now = Instant::now();
+    
+    // Record enough failures for one IP to trigger backoff
+    let target_ip = IpAddr::V4(Ipv4Addr::new(1, 1, 1, 1));
+    for _ in 0..AUTH_PROBE_BACKOFF_START_FAILS {
+        auth_probe_record_failure(target_ip, now);
+    }
+    
+    assert!(auth_probe_is_throttled(target_ip, now));
+
+    // Stress with 500 failures spread over the 256 addresses of 203.0.113.0/24 (`i % 256` repeats)
+    for i in 0..500u32 {
+        let ip = IpAddr::V4(Ipv4Addr::new(203, 0, 113, (i % 256) as u8));
+        auth_probe_record_failure(ip, now);
+    }
+
+    let tracked = AUTH_PROBE_STATE
+        .get()
+        .map(|state| state.len())
+        .unwrap_or(0);
+    assert!(
+        tracked <= AUTH_PROBE_TRACK_MAX_ENTRIES,
+        "auth probe state grew past hard cap: {tracked} > {AUTH_PROBE_TRACK_MAX_ENTRIES}"
+    );
+}
+
+#[tokio::test]
+async fn mtproto_handshake_abridged_prefix_rejected() {
+    let _guard = auth_probe_test_guard();
+    clear_auth_probe_state_for_testing();
+
+    let mut handshake = [0x5Au8; HANDSHAKE_LEN];
+    handshake[0] = 0xef; // Abridged prefix
+    let config = ProxyConfig::default();
+    let replay_checker = ReplayChecker::new(128, Duration::from_secs(60));
+    let peer: SocketAddr = "192.0.2.3:12345".parse().unwrap();
+
+    let res = handle_mtproto_handshake(&handshake, tokio::io::empty(), tokio::io::sink(), peer, &config, &replay_checker, false, None).await;
+    // MTProxy stops immediately on 0xef
+    assert!(matches!(res, HandshakeResult::BadClient { .. }));
+}
+
+#[tokio::test]
+async fn mtproto_handshake_preferred_user_mismatch_continues() {
+    let _guard = auth_probe_test_guard();
+    clear_auth_probe_state_for_testing();
+
+    let secret1_hex = "11111111111111111111111111111111";
+    let secret2_hex = "22222222222222222222222222222222";
+    
+    let base = make_valid_mtproto_handshake(secret2_hex, ProtoTag::Secure, 1);
+    let mut config = ProxyConfig::default();
+    config.access.users.insert("user1".to_string(), secret1_hex.to_string());
+    config.access.users.insert("user2".to_string(), secret2_hex.to_string());
+    config.access.ignore_time_skew = true;
+    config.general.modes.secure = true;
+
+    let replay_checker = ReplayChecker::new(128, Duration::from_secs(60));
+    let peer: SocketAddr = "192.0.2.4:12345".parse().unwrap();
+
+    // Even if we prefer user1, if user2 matches, it should succeed.
+    let res = handle_mtproto_handshake(&base, tokio::io::empty(), tokio::io::sink(), peer, &config, &replay_checker, false, Some("user1")).await;
+    if let HandshakeResult::Success((_, _, success)) = res {
+        assert_eq!(success.user, "user2");
+    } else {
+        panic!("Handshake failed even though user2 matched");
+    }
+}
+
+#[tokio::test]
+async fn mtproto_handshake_concurrent_flood_stability() {
+    let _guard = auth_probe_test_guard();
+    clear_auth_probe_state_for_testing();
+
+    let secret_hex = "00112233445566778899aabbccddeeff";
+    let base = make_valid_mtproto_handshake(secret_hex, ProtoTag::Secure, 1);
+    let mut config = test_config_with_secret_hex(secret_hex);
+    config.access.ignore_time_skew = true;
+    let replay_checker = Arc::new(ReplayChecker::new(1024, Duration::from_secs(60)));
+    let config = Arc::new(config);
+    
+    let mut tasks = Vec::new();
+    for i in 0..50 {
+        let base = base;
+        let config = Arc::clone(&config);
+        let replay_checker = Arc::clone(&replay_checker);
+        let peer: SocketAddr = format!("192.0.2.{}:12345", (i % 254) + 1).parse().unwrap();
+        
+        tasks.push(tokio::spawn(async move {
+            let res = handle_mtproto_handshake(&base, tokio::io::empty(), tokio::io::sink(), peer, &config, &replay_checker, false, None).await;
+            matches!(res, HandshakeResult::Success(_))
+        }));
+    }
+    
+    // Not every task has to succeed: all 50 reuse one handshake, so the replay checker may reject most of them.
+    // The requirement here is only that no task panics or hangs.
+    for task in tasks {
+        let _ = task.await.unwrap();
+    }
+}
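The flood test above discards each task's success flag. A stricter variant could count acceptances. The sketch below illustrates the idea with a hypothetical `SeenDigests` cache standing in for the real `ReplayChecker`, and it assumes replay detection is keyed on the handshake bytes alone (not on the peer address), which this diff does not confirm.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a replay checker: remembers digests it has seen.
struct SeenDigests {
    seen: Mutex<HashSet<[u8; 32]>>,
}

impl SeenDigests {
    fn new() -> Self {
        Self { seen: Mutex::new(HashSet::new()) }
    }

    /// Returns true only for the first caller that presents `digest`.
    fn check_and_insert(&self, digest: [u8; 32]) -> bool {
        // Recover from a poisoned lock instead of panicking.
        let mut guard = self.seen.lock().unwrap_or_else(|p| p.into_inner());
        // HashSet::insert returns false when the value was already present.
        guard.insert(digest)
    }
}

/// Submits one identical digest from `workers` threads and counts acceptances.
fn count_accepted(workers: usize) -> usize {
    let cache = Arc::new(SeenDigests::new());
    let digest = [0x5Au8; 32];
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let cache = Arc::clone(&cache);
            thread::spawn(move || cache.check_and_insert(digest))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker must not panic"))
        .filter(|accepted| *accepted)
        .count()
}
```

Under that assumption, counting the `true` results returned by the spawned tasks and asserting the count is exactly one would turn the stability check into a replay-correctness check as well.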
@@ -0,0 +1,270 @@
+use super::*;
+use crate::config::ProxyConfig;
+use crate::crypto::AesCtr;
+use crate::crypto::sha256;
+use crate::protocol::constants::ProtoTag;
+use crate::stats::ReplayChecker;
+use std::net::SocketAddr;
+use std::sync::MutexGuard;
+use tokio::time::{timeout, Duration as TokioDuration};
+
+fn make_mtproto_handshake_with_proto_bytes(
+    secret_hex: &str,
+    proto_bytes: [u8; 4],
+    dc_idx: i16,
+) -> [u8; HANDSHAKE_LEN] {
+    let secret = hex::decode(secret_hex).expect("secret hex must decode");
+    let mut handshake = [0x5Au8; HANDSHAKE_LEN];
+    for (idx, b) in handshake[SKIP_LEN..SKIP_LEN + PREKEY_LEN + IV_LEN]
+        .iter_mut()
+        .enumerate()
+    {
+        *b = (idx as u8).wrapping_add(1);
+    }
+
+    let dec_prekey = &handshake[SKIP_LEN..SKIP_LEN + PREKEY_LEN];
+    let dec_iv_bytes = &handshake[SKIP_LEN + PREKEY_LEN..SKIP_LEN + PREKEY_LEN + IV_LEN];
+
+    let mut dec_key_input = Vec::with_capacity(PREKEY_LEN + secret.len());
+    dec_key_input.extend_from_slice(dec_prekey);
+    dec_key_input.extend_from_slice(&secret);
+    let dec_key = sha256(&dec_key_input);
+
+    let mut dec_iv_arr = [0u8; IV_LEN];
+    dec_iv_arr.copy_from_slice(dec_iv_bytes);
+    let dec_iv = u128::from_be_bytes(dec_iv_arr);
+
+    let mut stream = AesCtr::new(&dec_key, dec_iv);
+    let keystream = stream.encrypt(&[0u8; HANDSHAKE_LEN]);
+
+    let mut target_plain = [0u8; HANDSHAKE_LEN];
+    target_plain[PROTO_TAG_POS..PROTO_TAG_POS + 4].copy_from_slice(&proto_bytes);
+    target_plain[DC_IDX_POS..DC_IDX_POS + 2].copy_from_slice(&dc_idx.to_le_bytes());
+
+    for idx in PROTO_TAG_POS..HANDSHAKE_LEN {
+        handshake[idx] = target_plain[idx] ^ keystream[idx];
+    }
+
+    handshake
+}
+
+fn make_valid_mtproto_handshake(secret_hex: &str, proto_tag: ProtoTag, dc_idx: i16) -> [u8; HANDSHAKE_LEN] {
+    make_mtproto_handshake_with_proto_bytes(secret_hex, proto_tag.to_bytes(), dc_idx)
+}
+
+fn test_config_with_secret_hex(secret_hex: &str) -> ProxyConfig {
+    let mut cfg = ProxyConfig::default();
+    cfg.access.users.clear();
+    cfg.access.users.insert("user".to_string(), secret_hex.to_string());
+    cfg.access.ignore_time_skew = true;
+    cfg.general.modes.secure = true;
+    cfg
+}
+
+fn auth_probe_test_guard() -> MutexGuard<'static, ()> {
+    auth_probe_test_lock()
+        .lock()
+        .unwrap_or_else(|poisoned| poisoned.into_inner())
+}
+
+#[tokio::test]
+async fn mtproto_handshake_duplicate_digest_is_replayed_on_second_attempt() {
+    let _guard = auth_probe_test_guard();
+    clear_auth_probe_state_for_testing();
+
+    let secret_hex = "11223344556677889900aabbccddeeff";
+    let base = make_valid_mtproto_handshake(secret_hex, ProtoTag::Secure, 2);
+    let config = test_config_with_secret_hex(secret_hex);
+    let replay_checker = ReplayChecker::new(128, TokioDuration::from_secs(60));
+    let peer: SocketAddr = "192.0.2.1:12345".parse().unwrap();
+
+    let first = handle_mtproto_handshake(
+        &base,
+        tokio::io::empty(),
+        tokio::io::sink(),
+        peer,
+        &config,
+        &replay_checker,
+        false,
+        None,
+    )
+    .await;
+    assert!(matches!(first, HandshakeResult::Success(_)));
+
+    let second = handle_mtproto_handshake(
+        &base,
+        tokio::io::empty(),
+        tokio::io::sink(),
+        peer,
+        &config,
+        &replay_checker,
+        false,
+        None,
+    )
+    .await;
+    assert!(matches!(second, HandshakeResult::BadClient { .. }));
+
+    clear_auth_probe_state_for_testing();
+}
+
+#[tokio::test]
+async fn mtproto_handshake_fuzz_corpus_never_panics_and_stays_fail_closed() {
+    let _guard = auth_probe_test_guard();
+    clear_auth_probe_state_for_testing();
+
+    let secret_hex = "00112233445566778899aabbccddeeff";
+    let base = make_valid_mtproto_handshake(secret_hex, ProtoTag::Secure, 1);
+    let config = test_config_with_secret_hex(secret_hex);
+    let replay_checker = ReplayChecker::new(128, TokioDuration::from_secs(60));
+    let peer: SocketAddr = "192.0.2.2:54321".parse().unwrap();
+
+    let mut corpus = Vec::<[u8; HANDSHAKE_LEN]>::new();
+
+    corpus.push(make_mtproto_handshake_with_proto_bytes(
+        secret_hex,
+        [0x00, 0x00, 0x00, 0x00],
+        1,
+    ));
+    corpus.push(make_mtproto_handshake_with_proto_bytes(
+        secret_hex,
+        [0xff, 0xff, 0xff, 0xff],
+        1,
+    ));
+    corpus.push(make_valid_mtproto_handshake(
+        "ffeeddccbbaa99887766554433221100",
+        ProtoTag::Secure,
+        1,
+    ));
+
+    let mut seed = 0xF0F0_F00D_BAAD_CAFEu64;
+    for _ in 0..32 {
+        let mut mutated = base;
+        for _ in 0..4 {
+            seed = seed.wrapping_mul(2862933555777941757).wrapping_add(3037000493);
+            let idx = SKIP_LEN + (seed as usize % (PREKEY_LEN + IV_LEN));
+            mutated[idx] ^= ((seed >> 19) as u8).wrapping_add(1);
+        }
+        corpus.push(mutated);
+    }
+
+    for (idx, input) in corpus.into_iter().enumerate() {
+        let result = timeout(
+            TokioDuration::from_secs(1),
+            handle_mtproto_handshake(
+                &input,
+                tokio::io::empty(),
+                tokio::io::sink(),
+                peer,
+                &config,
+                &replay_checker,
+                false,
+                None,
+            ),
+        )
+        .await
+        .expect("fuzzed handshake must complete in time");
+
+        assert!(
+            matches!(result, HandshakeResult::BadClient { .. }),
+            "corpus item {idx} must fail closed"
+        );
+    }
+
+    clear_auth_probe_state_for_testing();
+}
+
+#[tokio::test]
+async fn mtproto_handshake_mixed_corpus_never_panics_and_exact_duplicates_are_rejected() {
+    let _guard = auth_probe_test_guard();
+    clear_auth_probe_state_for_testing();
+
+    let secret_hex = "99887766554433221100ffeeddccbbaa";
+    let base = make_valid_mtproto_handshake(secret_hex, ProtoTag::Secure, 4);
+    let config = test_config_with_secret_hex(secret_hex);
+    let replay_checker = ReplayChecker::new(256, TokioDuration::from_secs(60));
+    let peer: SocketAddr = "192.0.2.44:45444".parse().unwrap();
+
+    let first = timeout(
+        TokioDuration::from_secs(1),
+        handle_mtproto_handshake(
+            &base,
+            tokio::io::empty(),
+            tokio::io::sink(),
+            peer,
+            &config,
+            &replay_checker,
+            false,
+            None,
+        ),
+    )
+    .await
+    .expect("base handshake must not hang");
+    assert!(matches!(first, HandshakeResult::Success(_)));
+
+    let replay = timeout(
+        TokioDuration::from_secs(1),
+        handle_mtproto_handshake(
+            &base,
+            tokio::io::empty(),
+            tokio::io::sink(),
+            peer,
+            &config,
+            &replay_checker,
+            false,
+            None,
+        ),
+    )
+    .await
+    .expect("duplicate handshake must not hang");
+    assert!(matches!(replay, HandshakeResult::BadClient { .. }));
+
+    let mut corpus = Vec::<[u8; HANDSHAKE_LEN]>::new();
+
+    let mut prekey_flip = base;
+    prekey_flip[SKIP_LEN] ^= 0x80;
+    corpus.push(prekey_flip);
+
+    let mut iv_flip = base;
+    iv_flip[SKIP_LEN + PREKEY_LEN] ^= 0x01;
+    corpus.push(iv_flip);
+
+    let mut tail_flip = base;
+    tail_flip[SKIP_LEN + PREKEY_LEN + IV_LEN - 1] ^= 0x40;
+    corpus.push(tail_flip);
+
+    let mut seed = 0xBADC_0FFE_EE11_4242u64;
+    for _ in 0..24 {
+        let mut mutated = base;
+        for _ in 0..3 {
+            seed = seed.wrapping_mul(6364136223846793005).wrapping_add(1);
+            let idx = SKIP_LEN + (seed as usize % (PREKEY_LEN + IV_LEN));
+            mutated[idx] ^= ((seed >> 16) as u8).wrapping_add(1);
+        }
+        corpus.push(mutated);
+    }
+
+    for (idx, input) in corpus.iter().enumerate() {
+        let result = timeout(
+            TokioDuration::from_secs(1),
+            handle_mtproto_handshake(
+                input,
+                tokio::io::empty(),
+                tokio::io::sink(),
+                peer,
+                &config,
+                &replay_checker,
+                false,
+                None,
+            ),
+        )
+        .await
+        .expect("fuzzed handshake must complete in time");
+
+        assert!(
+            matches!(result, HandshakeResult::BadClient { .. }),
+            "mixed corpus item {idx} must fail closed"
+        );
+    }
+
+    clear_auth_probe_state_for_testing();
+}
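Both fuzz loops above derive their corpus from a linear congruential generator. One detail worth noting: the XOR mask `((seed >> N) as u8).wrapping_add(1)` is zero whenever the shifted byte is `0xFF`, so a flip can be a no-op. The sketch below is a standalone variant that forces a non-zero mask with `| 1`. The layout constants are assumed values for illustration, not taken from the crate, and `mutated_corpus` and `diff_positions` are hypothetical helpers.

```rust
// Assumed layout constants; the real values live in the crate's protocol module.
const HANDSHAKE_LEN: usize = 64;
const SKIP_LEN: usize = 8;
const PREKEY_LEN: usize = 32;
const IV_LEN: usize = 16;

/// Builds `count` mutated copies of `base`, applying `flips` XORs per copy,
/// all inside the prekey + IV window. The same seed yields the same corpus.
fn mutated_corpus(
    base: [u8; HANDSHAKE_LEN],
    mut seed: u64,
    count: usize,
    flips: usize,
) -> Vec<[u8; HANDSHAKE_LEN]> {
    let window = (PREKEY_LEN + IV_LEN) as u64;
    let mut corpus = Vec::with_capacity(count);
    for _ in 0..count {
        let mut mutated = base;
        for _ in 0..flips {
            // Same LCG step as the mixed-corpus test in the diff.
            seed = seed.wrapping_mul(6364136223846793005).wrapping_add(1);
            let idx = SKIP_LEN + (seed % window) as usize;
            // `| 1` keeps the mask non-zero, so a single flip always changes the byte.
            mutated[idx] ^= ((seed >> 16) as u8) | 1;
        }
        corpus.push(mutated);
    }
    corpus
}

/// Lists the byte positions where two handshakes differ.
fn diff_positions(a: &[u8; HANDSHAKE_LEN], b: &[u8; HANDSHAKE_LEN]) -> Vec<usize> {
    (0..HANDSHAKE_LEN).filter(|&i| a[i] != b[i]).collect()
}
```

This does not rule out two flips landing on the same index and cancelling each other, so a test that requires every item to differ from `base` would still need an explicit check.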
@@ -7,19 +7,90 @@ use tokio::net::TcpStream;
 #[cfg(unix)]
 use tokio::net::UnixStream;
 use tokio::io::{AsyncRead, AsyncWrite, AsyncReadExt, AsyncWriteExt};
-use tokio::time::timeout;
+use tokio::time::{Instant, timeout};
 use tracing::debug;
 use crate::config::ProxyConfig;
 use crate::network::dns_overrides::resolve_socket_addr;
 use crate::stats::beobachten::BeobachtenStore;
 use crate::transport::proxy_protocol::{ProxyProtocolV1Builder, ProxyProtocolV2Builder};

+#[cfg(not(test))]
 const MASK_TIMEOUT: Duration = Duration::from_secs(5);
+#[cfg(test)]
+const MASK_TIMEOUT: Duration = Duration::from_millis(50);
 /// Maximum duration for the entire masking relay.
 /// Limits resource consumption from slow-loris attacks and port scanners.
+#[cfg(not(test))]
 const MASK_RELAY_TIMEOUT: Duration = Duration::from_secs(60);
+#[cfg(test)]
+const MASK_RELAY_TIMEOUT: Duration = Duration::from_millis(200);
+#[cfg(not(test))]
+const MASK_RELAY_IDLE_TIMEOUT: Duration = Duration::from_secs(5);
+#[cfg(test)]
+const MASK_RELAY_IDLE_TIMEOUT: Duration = Duration::from_millis(100);
 const MASK_BUFFER_SIZE: usize = 8192;

+async fn copy_with_idle_timeout<R, W>(reader: &mut R, writer: &mut W)
+where
+    R: AsyncRead + Unpin,
+    W: AsyncWrite + Unpin,
+{
+    let mut buf = [0u8; MASK_BUFFER_SIZE];
+    loop {
+        let read_res = timeout(MASK_RELAY_IDLE_TIMEOUT, reader.read(&mut buf)).await;
+        let n = match read_res {
+            Ok(Ok(n)) => n,
+            Ok(Err(_)) | Err(_) => break,
+        };
+        if n == 0 {
+            break;
+        }
+
+        let write_res = timeout(MASK_RELAY_IDLE_TIMEOUT, writer.write_all(&buf[..n])).await;
+        match write_res {
+            Ok(Ok(())) => {}
+            Ok(Err(_)) | Err(_) => break,
+        }
+    }
+}
+
+async fn write_proxy_header_with_timeout<W>(mask_write: &mut W, header: &[u8]) -> bool
+where
+    W: AsyncWrite + Unpin,
+{
+    match timeout(MASK_TIMEOUT, mask_write.write_all(header)).await {
+        Ok(Ok(())) => true,
+        Ok(Err(_)) => false,
+        Err(_) => {
+            debug!("Timeout writing proxy protocol header to mask backend");
+            false
+        }
+    }
+}
+
+async fn consume_client_data_with_timeout<R>(reader: R)
+where
+    R: AsyncRead + Unpin,
+{
+    if timeout(MASK_RELAY_TIMEOUT, consume_client_data(reader)).await.is_err() {
+        debug!("Timed out while consuming client data on masking fallback path");
+    }
+}
+
+async fn wait_mask_connect_budget(started: Instant) {
+    let elapsed = started.elapsed();
+    if elapsed < MASK_TIMEOUT {
+        tokio::time::sleep(MASK_TIMEOUT - elapsed).await;
+    }
+}
+
+async fn wait_mask_outcome_budget(started: Instant) {
+    let elapsed = started.elapsed();
+    if elapsed < MASK_TIMEOUT {
+        tokio::time::sleep(MASK_TIMEOUT - elapsed).await;
+    }
+}
+
 /// Detect client type based on initial data
 fn detect_client_type(data: &[u8]) -> &'static str {
    // Check for HTTP request
@@ -71,13 +142,15 @@ where

    if !config.censorship.mask {
        // Masking disabled, just consume data
-        consume_client_data(reader).await;
+        consume_client_data_with_timeout(reader).await;
        return;
    }

    // Connect via Unix socket or TCP
    #[cfg(unix)]
    if let Some(ref sock_path) = config.censorship.mask_unix_sock {
+        let outcome_started = Instant::now();
+        let connect_started = Instant::now();
        debug!(
            client_type = client_type,
            sock = %sock_path,
@@ -107,21 +180,26 @@ where
                    }
                };
                if let Some(header) = proxy_header {
-                    if mask_write.write_all(&header).await.is_err() {
+                    if !write_proxy_header_with_timeout(&mut mask_write, &header).await {
+                        wait_mask_outcome_budget(outcome_started).await;
                        return;
                    }
                }
                if timeout(MASK_RELAY_TIMEOUT, relay_to_mask(reader, writer, mask_read, mask_write, initial_data)).await.is_err() {
                    debug!("Mask relay timed out (unix socket)");
                }
+                wait_mask_outcome_budget(outcome_started).await;
            }
            Ok(Err(e)) => {
+                wait_mask_connect_budget(connect_started).await;
                debug!(error = %e, "Failed to connect to mask unix socket");
-                consume_client_data(reader).await;
+                consume_client_data_with_timeout(reader).await;
+                wait_mask_outcome_budget(outcome_started).await;
            }
            Err(_) => {
                debug!("Timeout connecting to mask unix socket");
-                consume_client_data(reader).await;
+                consume_client_data_with_timeout(reader).await;
+                wait_mask_outcome_budget(outcome_started).await;
            }
        }
        return;
@@ -143,6 +221,8 @@ where
    let mask_addr = resolve_socket_addr(mask_host, mask_port)
        .map(|addr| addr.to_string())
        .unwrap_or_else(|| format!("{}:{}", mask_host, mask_port));
+    let outcome_started = Instant::now();
+    let connect_started = Instant::now();
    let connect_result = timeout(MASK_TIMEOUT, TcpStream::connect(&mask_addr)).await;
    match connect_result {
        Ok(Ok(stream)) => {
@@ -166,21 +246,26 @@ where

            let (mask_read, mut mask_write) = stream.into_split();
            if let Some(header) = proxy_header {
-                if mask_write.write_all(&header).await.is_err() {
+                if !write_proxy_header_with_timeout(&mut mask_write, &header).await {
+                    wait_mask_outcome_budget(outcome_started).await;
                    return;
                }
            }
            if timeout(MASK_RELAY_TIMEOUT, relay_to_mask(reader, writer, mask_read, mask_write, initial_data)).await.is_err() {
                debug!("Mask relay timed out");
            }
+            wait_mask_outcome_budget(outcome_started).await;
        }
        Ok(Err(e)) => {
+            wait_mask_connect_budget(connect_started).await;
            debug!(error = %e, "Failed to connect to mask host");
-            consume_client_data(reader).await;
+            consume_client_data_with_timeout(reader).await;
+            wait_mask_outcome_budget(outcome_started).await;
        }
        Err(_) => {
            debug!("Timeout connecting to mask host");
-            consume_client_data(reader).await;
+            consume_client_data_with_timeout(reader).await;
+            wait_mask_outcome_budget(outcome_started).await;
        }
    }
 }
@@ -203,47 +288,20 @@ where
    if mask_write.write_all(initial_data).await.is_err() {
        return;
    }
-
-    // Relay traffic
-    let c2m = tokio::spawn(async move {
-        let mut buf = vec![0u8; MASK_BUFFER_SIZE];
-        loop {
-            match reader.read(&mut buf).await {
-                Ok(0) | Err(_) => {
-                    let _ = mask_write.shutdown().await;
-                    break;
-                }
-                Ok(n) => {
-                    if mask_write.write_all(&buf[..n]).await.is_err() {
-                        break;
-                    }
-                }
-            }
-        }
-    });
-
-    let m2c = tokio::spawn(async move {
-        let mut buf = vec![0u8; MASK_BUFFER_SIZE];
-        loop {
-            match mask_read.read(&mut buf).await {
-                Ok(0) | Err(_) => {
-                    let _ = writer.shutdown().await;
-                    break;
-                }
-                Ok(n) => {
-                    if writer.write_all(&buf[..n]).await.is_err() {
-                        break;
-                    }
-                }
-            }
-        }
-    });
-
-    // Wait for either to complete
-    tokio::select! {
-        _ = c2m => {}
-        _ = m2c => {}
+    if mask_write.flush().await.is_err() {
+        return;
    }
+
+    let _ = tokio::join!(
+        async {
+            copy_with_idle_timeout(&mut reader, &mut mask_write).await;
+            let _ = mask_write.shutdown().await;
+        },
+        async {
+            copy_with_idle_timeout(&mut mask_read, &mut writer).await;
+            let _ = writer.shutdown().await;
+        }
+    );
 }

 /// Just consume all data from client without responding
@@ -255,3 +313,11 @@ async fn consume_client_data<R: AsyncRead + Unpin>(mut reader: R) {
        }
    }
 }
+
+#[cfg(test)]
+#[path = "masking_security_tests.rs"]
+mod security_tests;
+
+#[cfg(test)]
+#[path = "masking_adversarial_tests.rs"]
+mod adversarial_tests;
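The `wait_mask_connect_budget` and `wait_mask_outcome_budget` helpers in this file have identical bodies: each pads the elapsed time up to `MASK_TIMEOUT` so that a fast failure is not observably faster than a slow one. A minimal synchronous sketch of that timing-floor idea follows. It uses `std::thread::sleep` rather than `tokio::time::sleep`, and `wait_budget` and `run_with_floor` are hypothetical names, not part of the PR.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Blocks until at least `budget` has elapsed since `started`.
fn wait_budget(started: Instant, budget: Duration) {
    // saturating_sub yields zero when the budget is already spent.
    let remaining = budget.saturating_sub(started.elapsed());
    if !remaining.is_zero() {
        thread::sleep(remaining);
    }
}

/// Runs `op` and pads its duration up to `budget`; returns the result and
/// the total time observed by the caller.
fn run_with_floor<T>(budget: Duration, op: impl FnOnce() -> T) -> (T, Duration) {
    let started = Instant::now();
    let out = op();
    wait_budget(started, budget);
    (out, started.elapsed())
}
```

Taking the budget as a parameter would let one helper serve both the connect path and the outcome path, if the two budgets are ever meant to differ.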
@@ -0,0 +1,213 @@
+use super::*;
+use std::sync::Arc;
+use tokio::io::duplex;
+use tokio::net::TcpListener;
+use tokio::time::{Instant, Duration};
+use crate::config::ProxyConfig;
+use crate::stats::beobachten::BeobachtenStore;
+
+// ------------------------------------------------------------------
+// Probing Indistinguishability (OWASP ASVS 5.1.7)
+// ------------------------------------------------------------------
+
+#[tokio::test]
+async fn masking_probes_indistinguishable_timing() {
+    let mut config = ProxyConfig::default();
+    config.censorship.mask = true;
+    config.censorship.mask_host = Some("127.0.0.1".to_string());
+    config.censorship.mask_port = 80; // Assumes nothing listens on 127.0.0.1:80, so the connect is refused or times out
+    
+    let peer: SocketAddr = "192.0.2.10:443".parse().unwrap();
+    let local_addr: SocketAddr = "127.0.0.1:443".parse().unwrap();
+    let beobachten = BeobachtenStore::new();
+
+    // Test different probe types
+    let probes = vec![
+        (b"GET / HTTP/1.1\r\nHost: x\r\n\r\n".to_vec(), "HTTP"),
+        (b"SSH-2.0-probe".to_vec(), "SSH"),
+        (vec![0x16, 0x03, 0x03, 0x00, 0x05, 0x01, 0x00, 0x00, 0x01, 0x00], "TLS-scanner"),
+        (vec![0x42; 5], "port-scanner"),
+    ];
+
+    for (probe, type_name) in probes {
+        let (client_reader, _client_writer) = duplex(256);
+        let (_client_visible_reader, client_visible_writer) = duplex(256);
+        
+        let start = Instant::now();
+        handle_bad_client(
+            client_reader,
+            client_visible_writer,
+            &probe,
+            peer,
+            local_addr,
+            &config,
+            &beobachten,
+        ).await;
+        
+        let elapsed = start.elapsed();
+        
+        // Every outcome should take about MASK_TIMEOUT (50 ms in tests), so timing does not reveal
+        // whether the backend was reachable or refused. The 30 ms floor below is a looser bound.
+        assert!(elapsed >= Duration::from_millis(30), "Probe {type_name} finished too fast: {elapsed:?}");
+    }
+}
+
+// ------------------------------------------------------------------
+// Masking Budget Stress Tests (OWASP ASVS 5.1.6)
+// ------------------------------------------------------------------
+
+#[tokio::test]
+async fn masking_budget_stress_under_load() {
+    let mut config = ProxyConfig::default();
+    config.censorship.mask = true;
+    config.censorship.mask_host = Some("127.0.0.1".to_string());
+    config.censorship.mask_port = 1; // Unlikely port
+
+    let peer: SocketAddr = "192.0.2.20:443".parse().unwrap();
+    let local_addr: SocketAddr = "127.0.0.1:443".parse().unwrap();
+    let beobachten = Arc::new(BeobachtenStore::new());
+
+    let mut tasks = Vec::new();
+    for _ in 0..50 {
+        let (client_reader, _client_writer) = duplex(256);
+        let (_client_visible_reader, client_visible_writer) = duplex(256);
+        let config = config.clone();
+        let beobachten = Arc::clone(&beobachten);
+        
+        tasks.push(tokio::spawn(async move {
+            let start = Instant::now();
+            handle_bad_client(
+                client_reader,
+                client_visible_writer,
+                b"probe",
+                peer,
+                local_addr,
+                &config,
+                &beobachten,
+            ).await;
+            start.elapsed()
+        }));
+    }
+
+    for task in tasks {
+        let elapsed = task.await.unwrap();
+        assert!(elapsed >= Duration::from_millis(30), "Stress probe finished too fast: {elapsed:?}");
+    }
+}
+
+// ------------------------------------------------------------------
+// detect_client_type Fingerprint Check
+// ------------------------------------------------------------------
+
+#[test]
+fn test_detect_client_type_boundary_cases() {
+    // 9 bytes = port-scanner
+    assert_eq!(detect_client_type(&[0x42; 9]), "port-scanner");
+    // 10 bytes = unknown
+    assert_eq!(detect_client_type(&[0x42; 10]), "unknown");
+    
+    // An HTTP verb only counts when followed by a space; otherwise short input falls through to the length check
+    assert_eq!(detect_client_type(b"GET/"), "port-scanner"); // no space after the verb, and len < 10
+    assert_eq!(detect_client_type(b"GET /path"), "HTTP");
+}
+
+// ------------------------------------------------------------------
+// Priority 2: Slowloris and Slow Read Attacks (OWASP ASVS 5.1.5)
+// ------------------------------------------------------------------
+
+#[tokio::test]
+async fn masking_slowloris_client_idle_timeout_rejected() {
+    let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
+    let backend_addr = listener.local_addr().unwrap();
+    let initial = b"GET / HTTP/1.1\r\nHost: front.example\r\n\r\n".to_vec();
+
+    let accept_task = tokio::spawn({
+        let initial = initial.clone();
+        async move {
+            let (mut stream, _) = listener.accept().await.unwrap();
+            let mut observed = vec![0u8; initial.len()];
+            stream.read_exact(&mut observed).await.unwrap();
+            assert_eq!(observed, initial);
+
+            let mut drip = [0u8; 1];
+            let drip_read = tokio::time::timeout(Duration::from_millis(220), stream.read_exact(&mut drip)).await;
+            assert!(
+                drip_read.is_err() || drip_read.unwrap().is_err(),
+                "backend must not receive post-timeout slowloris drip bytes"
+            );
+        }
+    });
+
+    let mut config = ProxyConfig::default();
+    config.censorship.mask = true;
+    config.censorship.mask_host = Some("127.0.0.1".to_string());
+    config.censorship.mask_port = backend_addr.port();
+
+    let beobachten = BeobachtenStore::new();
+    let peer: SocketAddr = "192.0.2.10:12345".parse().unwrap();
+    let local: SocketAddr = "192.0.2.1:443".parse().unwrap();
+
+    let (mut client_writer, client_reader) = duplex(1024);
+    let (_client_visible_reader, client_visible_writer) = duplex(1024);
+
+    let handle = tokio::spawn(async move {
+        handle_bad_client(
+            client_reader,
+            client_visible_writer,
+            &initial,
+            peer,
+            local,
+            &config,
+            &beobachten,
+        )
+        .await;
+    });
+
+    tokio::time::sleep(Duration::from_millis(160)).await;
+    let _ = client_writer.write_all(b"X").await;
+
+    handle.await.unwrap();
+    accept_task.await.unwrap();
+}
+
+// ------------------------------------------------------------------
+// Priority 2: Fallback Server Down / Fingerprinting (OWASP ASVS 5.1.7)
+// ------------------------------------------------------------------
+
+#[tokio::test]
+async fn masking_fallback_down_mimics_timeout() {
+    let mut config = ProxyConfig::default();
+    config.censorship.mask = true;
+    config.censorship.mask_host = Some("127.0.0.1".to_string());
+    config.censorship.mask_port = 1; // Unlikely port
+    
+    let (server_reader, server_writer) = duplex(1024);
+    let beobachten = BeobachtenStore::new();
+    let peer: SocketAddr = "192.0.2.12:12345".parse().unwrap();
+    let local: SocketAddr = "192.0.2.1:443".parse().unwrap();
+
+    let start = Instant::now();
+    handle_bad_client(server_reader, server_writer, b"GET / HTTP/1.1\r\n", peer, local, &config, &beobachten).await;
+    
+    let elapsed = start.elapsed();
+    // The call should last about MASK_TIMEOUT (50 ms in tests) even if the connection is refused immediately
+    assert!(elapsed >= Duration::from_millis(40), "Must respect connect budget even on failure: {elapsed:?}");
+}
+
+// ------------------------------------------------------------------
+// Priority 2: SSRF Prevention (OWASP ASVS 5.1.2)
+// ------------------------------------------------------------------
+
+#[tokio::test]
+async fn masking_ssrf_resolve_internal_ranges_blocked() {
+    use crate::network::dns_overrides::resolve_socket_addr;
+
+    let blocked_ips = ["127.0.0.1", "169.254.169.254", "10.0.0.1", "192.168.1.1", "0.0.0.0"];
+
+    for ip in blocked_ips {
+        assert!(
+            resolve_socket_addr(ip, 80).is_none(),
+            "runtime DNS overrides must not resolve unconfigured literal host targets"
+        );
+    }
+}
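The SSRF test above only checks that `resolve_socket_addr` returns `None` for literal addresses with no configured override; it does not classify address ranges itself. If an explicit range check were wanted, it could look like the sketch below. `is_internal_target` and `is_internal_v4` are hypothetical helpers built only on `std::net` predicates and are not part of this PR.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// Returns true when `host` is an IP literal in a loopback, private,
/// link-local, or unspecified range. Hostnames are not resolved here.
fn is_internal_target(host: &str) -> bool {
    match host.parse::<IpAddr>() {
        Ok(IpAddr::V4(v4)) => is_internal_v4(v4),
        Ok(IpAddr::V6(v6)) => v6.is_loopback() || v6.is_unspecified(),
        // Not an IP literal; name resolution is out of scope for this sketch.
        Err(_) => false,
    }
}

/// IPv4 ranges covered: 127.0.0.0/8, RFC 1918 private space,
/// 169.254.0.0/16 (includes cloud metadata endpoints), and 0.0.0.0.
fn is_internal_v4(ip: Ipv4Addr) -> bool {
    ip.is_loopback() || ip.is_private() || ip.is_link_local() || ip.is_unspecified()
}
```

Such a check could not be applied blindly to the operator-configured mask host, since the tests in this file themselves point `mask_host` at `127.0.0.1`.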
@@ -1,14 +1,15 @@
-use std::collections::HashMap;
-use std::collections::hash_map::DefaultHasher;
+use std::collections::hash_map::RandomState;
+use std::hash::BuildHasher;
 use std::hash::{Hash, Hasher};
 use std::net::{IpAddr, SocketAddr};
-use std::sync::atomic::{AtomicU64, Ordering};
+use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
 use std::sync::{Arc, Mutex, OnceLock};
 use std::time::{Duration, Instant};

-use bytes::Bytes;
+use dashmap::DashMap;
 use tokio::io::{AsyncRead, AsyncReadExt, AsyncWrite, AsyncWriteExt};
-use tokio::sync::{mpsc, oneshot, watch};
+use tokio::sync::{mpsc, oneshot, watch, Mutex as AsyncMutex};
+use tokio::time::timeout;
 use tracing::{debug, trace, warn};

 use crate::config::ProxyConfig;
@@ -20,25 +21,38 @@ use crate::proxy::route_mode::{
    RelayRouteMode, RouteCutoverState, ROUTE_SWITCH_ERROR_MSG, affected_cutover_state,
    cutover_stagger_delay,
 };
-use crate::proxy::adaptive_buffers::{self, AdaptiveTier};
-use crate::proxy::session_eviction::SessionLease;
 use crate::stats::Stats;
-use crate::stream::{BufferPool, CryptoReader, CryptoWriter};
+use crate::stream::{BufferPool, CryptoReader, CryptoWriter, PooledBuffer};
 use crate::transport::middle_proxy::{MePool, MeResponse, proto_flags_for_tag};

 enum C2MeCommand {
-    Data { payload: Bytes, flags: u32 },
+    Data { payload: PooledBuffer, flags: u32 },
    Close,
 }

 const DESYNC_DEDUP_WINDOW: Duration = Duration::from_secs(60);
+const DESYNC_DEDUP_MAX_ENTRIES: usize = 65_536;
+const DESYNC_DEDUP_PRUNE_SCAN_LIMIT: usize = 1024;
+const DESYNC_FULL_CACHE_EMIT_MIN_INTERVAL: Duration = Duration::from_secs(1);
 const DESYNC_ERROR_CLASS: &str = "frame_too_large_crypto_desync";
 const C2ME_CHANNEL_CAPACITY_FALLBACK: usize = 128;
 const C2ME_SOFT_PRESSURE_MIN_FREE_SLOTS: usize = 64;
 const C2ME_SENDER_FAIRNESS_BUDGET: usize = 32;
+#[cfg(test)]
+const C2ME_SEND_TIMEOUT: Duration = Duration::from_millis(50);
+#[cfg(not(test))]
+const C2ME_SEND_TIMEOUT: Duration = Duration::from_secs(5);
 const ME_D2C_FLUSH_BATCH_MAX_FRAMES_MIN: usize = 1;
 const ME_D2C_FLUSH_BATCH_MAX_BYTES_MIN: usize = 4096;
-static DESYNC_DEDUP: OnceLock<Mutex<HashMap<u64, Instant>>> = OnceLock::new();
+#[cfg(test)]
+const QUOTA_USER_LOCKS_MAX: usize = 64;
+#[cfg(not(test))]
+const QUOTA_USER_LOCKS_MAX: usize = 4_096;
+static DESYNC_DEDUP: OnceLock<DashMap<u64, Instant>> = OnceLock::new();
+static DESYNC_HASHER: OnceLock<RandomState> = OnceLock::new();
+static DESYNC_FULL_CACHE_LAST_EMIT_AT: OnceLock<Mutex<Option<Instant>>> = OnceLock::new();
+static DESYNC_DEDUP_EVER_SATURATED: OnceLock<AtomicBool> = OnceLock::new();
+static QUOTA_USER_LOCKS: OnceLock<DashMap<String, Arc<AsyncMutex<()>>>> = OnceLock::new();

 struct RelayForensicsState {
    trace_id: u64,
@@ -61,8 +75,8 @@ struct MeD2cFlushPolicy {
 }

 impl MeD2cFlushPolicy {
-    fn from_config(config: &ProxyConfig, tier: AdaptiveTier) -> Self {
-        let base = Self {
+    fn from_config(config: &ProxyConfig) -> Self {
+        Self {
            max_frames: config
                .general
                .me_d2c_flush_batch_max_frames
@@ -73,24 +87,13 @@ impl MeD2cFlushPolicy {
                .max(ME_D2C_FLUSH_BATCH_MAX_BYTES_MIN),
            max_delay: Duration::from_micros(config.general.me_d2c_flush_batch_max_delay_us),
            ack_flush_immediate: config.general.me_d2c_ack_flush_immediate,
-        };
-        let (max_frames, max_bytes, max_delay) = adaptive_buffers::me_flush_policy_for_tier(
-            tier,
-            base.max_frames,
-            base.max_bytes,
-            base.max_delay,
-        );
-        Self {
-            max_frames,
-            max_bytes,
-            max_delay,
-            ack_flush_immediate: base.ack_flush_immediate,
        }
    }
 }

 fn hash_value<T: Hash>(value: &T) -> u64 {
-    let mut hasher = DefaultHasher::new();
+    let state = DESYNC_HASHER.get_or_init(RandomState::new);
+    let mut hasher = state.build_hasher();
    value.hash(&mut hasher);
    hasher.finish()
 }
@@ -104,26 +107,122 @@ fn should_emit_full_desync(key: u64, all_full: bool, now: Instant) -> bool {
        return true;
    }

-    let dedup = DESYNC_DEDUP.get_or_init(|| Mutex::new(HashMap::new()));
-    let mut guard = dedup.lock().expect("desync dedup mutex poisoned");
-    guard.retain(|_, seen_at| now.duration_since(*seen_at) < DESYNC_DEDUP_WINDOW);
+    let dedup = DESYNC_DEDUP.get_or_init(DashMap::new);
+    let saturated_before = dedup.len() >= DESYNC_DEDUP_MAX_ENTRIES;
+    let ever_saturated = DESYNC_DEDUP_EVER_SATURATED.get_or_init(|| AtomicBool::new(false));
+    if saturated_before {
+        ever_saturated.store(true, Ordering::Relaxed);
+    }

-    match guard.get_mut(&key) {
-        Some(seen_at) => {
-            if now.duration_since(*seen_at) >= DESYNC_DEDUP_WINDOW {
-                *seen_at = now;
+    if let Some(mut seen_at) = dedup.get_mut(&key) {
+        if now.duration_since(*seen_at) >= DESYNC_DEDUP_WINDOW {
+            *seen_at = now;
+            return true;
+        }
+        return false;
+    }
+
+    if dedup.len() >= DESYNC_DEDUP_MAX_ENTRIES {
+        let mut stale_keys = Vec::new();
+        let mut oldest_candidate: Option<(u64, Instant)> = None;
+        for entry in dedup.iter().take(DESYNC_DEDUP_PRUNE_SCAN_LIMIT) {
+            let key = *entry.key();
+            let seen_at = *entry.value();
+
+            match oldest_candidate {
+                Some((_, oldest_seen)) if seen_at >= oldest_seen => {}
+                _ => oldest_candidate = Some((key, seen_at)),
+            }
+
+            if now.duration_since(seen_at) >= DESYNC_DEDUP_WINDOW {
+                stale_keys.push(*entry.key());
+            }
+        }
+        for stale_key in stale_keys {
+            dedup.remove(&stale_key);
+        }
+        if dedup.len() >= DESYNC_DEDUP_MAX_ENTRIES {
+            let Some((evict_key, _)) = oldest_candidate else {
+                return false;
+            };
+            dedup.remove(&evict_key);
+            dedup.insert(key, now);
+            return should_emit_full_desync_full_cache(now);
+        }
+    }
+
+    dedup.insert(key, now);
+    let saturated_after = dedup.len() >= DESYNC_DEDUP_MAX_ENTRIES;
+    // Preserve the first sequential insert that reaches capacity as a normal
+    // emit, while still gating concurrent newcomer churn after the cache has
+    // ever been observed at saturation.
+    let was_ever_saturated = if saturated_after {
+        ever_saturated.swap(true, Ordering::Relaxed)
+    } else {
+        ever_saturated.load(Ordering::Relaxed)
+    };
+
+    if saturated_before || (saturated_after && was_ever_saturated) {
+        should_emit_full_desync_full_cache(now)
+    } else {
+        true
+    }
+}
+
+fn should_emit_full_desync_full_cache(now: Instant) -> bool {
+    let gate = DESYNC_FULL_CACHE_LAST_EMIT_AT.get_or_init(|| Mutex::new(None));
+    let Ok(mut last_emit_at) = gate.lock() else {
+        return false;
+    };
+
+    match *last_emit_at {
+        None => {
+            *last_emit_at = Some(now);
+            true
+        }
+        Some(last) => {
+            let Some(elapsed) = now.checked_duration_since(last) else {
+                *last_emit_at = Some(now);
+                return true;
+            };
+            if elapsed >= DESYNC_FULL_CACHE_EMIT_MIN_INTERVAL {
+                *last_emit_at = Some(now);
                true
            } else {
                false
            }
        }
-        None => {
-            guard.insert(key, now);
-            true
+    }
+}
+
+#[cfg(test)]
+fn clear_desync_dedup_for_testing() {
+    if let Some(dedup) = DESYNC_DEDUP.get() {
+        dedup.clear();
+    }
+    if let Some(ever_saturated) = DESYNC_DEDUP_EVER_SATURATED.get() {
+        ever_saturated.store(false, Ordering::Relaxed);
+    }
+    if let Some(last_emit_at) = DESYNC_FULL_CACHE_LAST_EMIT_AT.get() {
+        match last_emit_at.lock() {
+            Ok(mut guard) => {
+                *guard = None;
+            }
+            Err(poisoned) => {
+                let mut guard = poisoned.into_inner();
+                *guard = None;
+                last_emit_at.clear_poison();
+            }
        }
    }
 }

+#[cfg(test)]
+fn desync_dedup_test_lock() -> &'static Mutex<()> {
+    static TEST_LOCK: OnceLock<Mutex<()>> = OnceLock::new();
+    TEST_LOCK.get_or_init(|| Mutex::new(()))
+}
+
 fn report_desync_frame_too_large(
    state: &RelayForensicsState,
    proto_tag: ProtoTag,
@@ -219,6 +318,46 @@ fn should_yield_c2me_sender(sent_since_yield: usize, has_backlog: bool) -> bool
    has_backlog && sent_since_yield >= C2ME_SENDER_FAIRNESS_BUDGET
 }

+fn quota_exceeded_for_user(stats: &Stats, user: &str, quota_limit: Option<u64>) -> bool {
+    quota_limit.is_some_and(|quota| stats.get_user_total_octets(user) >= quota)
+}
+
+fn quota_would_be_exceeded_for_user(
+    stats: &Stats,
+    user: &str,
+    quota_limit: Option<u64>,
+    bytes: u64,
+) -> bool {
+    quota_limit.is_some_and(|quota| {
+        let used = stats.get_user_total_octets(user);
+        used >= quota || bytes > quota.saturating_sub(used)
+    })
+}
+
+fn quota_user_lock(user: &str) -> Arc<AsyncMutex<()>> {
+    let locks = QUOTA_USER_LOCKS.get_or_init(DashMap::new);
+    if let Some(existing) = locks.get(user) {
+        return Arc::clone(existing.value());
+    }
+
+    if locks.len() >= QUOTA_USER_LOCKS_MAX {
+        locks.retain(|_, value| Arc::strong_count(value) > 1);
+    }
+
+    if locks.len() >= QUOTA_USER_LOCKS_MAX {
+        return Arc::new(AsyncMutex::new(()));
+    }
+
+    let created = Arc::new(AsyncMutex::new(()));
+    match locks.entry(user.to_string()) {
+        dashmap::mapref::entry::Entry::Occupied(entry) => Arc::clone(entry.get()),
+        dashmap::mapref::entry::Entry::Vacant(entry) => {
+            entry.insert(Arc::clone(&created));
+            created
+        }
+    }
+}
+
 async fn enqueue_c2me_command(
    tx: &mpsc::Sender<C2MeCommand>,
    cmd: C2MeCommand,
@@ -231,7 +370,14 @@ async fn enqueue_c2me_command(
            if tx.capacity() <= C2ME_SOFT_PRESSURE_MIN_FREE_SLOTS {
                tokio::task::yield_now().await;
            }
-            tx.send(cmd).await
+            match timeout(C2ME_SEND_TIMEOUT, tx.reserve()).await {
+                Ok(Ok(permit)) => {
+                    permit.send(cmd);
+                    Ok(())
+                }
+                Ok(Err(_)) => Err(mpsc::error::SendError(cmd)),
+                Err(_) => Err(mpsc::error::SendError(cmd)),
+            }
        }
    }
 }
@@ -243,23 +389,22 @@ pub(crate) async fn handle_via_middle_proxy<R, W>(
    me_pool: Arc<MePool>,
    stats: Arc<Stats>,
    config: Arc<ProxyConfig>,
-    _buffer_pool: Arc<BufferPool>,
+    buffer_pool: Arc<BufferPool>,
    local_addr: SocketAddr,
    rng: Arc<SecureRandom>,
    mut route_rx: watch::Receiver<RouteCutoverState>,
    route_snapshot: RouteCutoverState,
    session_id: u64,
-    session_lease: SessionLease,
 ) -> Result<()>
 where
    R: AsyncRead + Unpin + Send + 'static,
    W: AsyncWrite + Unpin + Send + 'static,
 {
    let user = success.user.clone();
+    let quota_limit = config.access.user_data_quota.get(&user).copied();
    let peer = success.peer;
    let proto_tag = success.proto_tag;
    let pool_generation = me_pool.current_generation();
-    let seed_tier = adaptive_buffers::seed_tier_for_user(&user);

    debug!(
        user = %user,
@@ -272,7 +417,7 @@ where
    );

    let (conn_id, me_rx) = me_pool.registry().register().await;
-    let trace_id = conn_id;
+    let trace_id = session_id;
    let bytes_me2c = Arc::new(AtomicU64::new(0));
    let mut forensics = RelayForensicsState {
        trace_id,
@@ -287,8 +432,7 @@ where
    };

    stats.increment_user_connects(&user);
-    stats.increment_user_curr_connects(&user);
-    stats.increment_current_connections_me();
+    let _me_connection_lease = stats.acquire_me_connection_lease();

    if let Some(cutover) = affected_cutover_state(
        &route_rx,
@@ -306,20 +450,9 @@ where
        tokio::time::sleep(delay).await;
        let _ = me_pool.send_close(conn_id).await;
        me_pool.registry().unregister(conn_id).await;
-        stats.decrement_current_connections_me();
-        stats.decrement_user_curr_connects(&user);
        return Err(ProxyError::Proxy(ROUTE_SWITCH_ERROR_MSG.to_string()));
    }

-    if session_lease.is_stale() {
-        stats.increment_reconnect_stale_close_total();
-        let _ = me_pool.send_close(conn_id).await;
-        me_pool.registry().unregister(conn_id).await;
-        stats.decrement_current_connections_me();
-        stats.decrement_user_curr_connects(&user);
-        return Err(ProxyError::Proxy("Session evicted by reconnect".to_string()));
-    }
-
    // Per-user ad_tag from access.user_ad_tags; fallback to general.ad_tag (hot-reloadable)
    let user_tag: Option<Vec<u8>> = config
        .access
@@ -393,7 +526,7 @@ where
    let rng_clone = rng.clone();
    let user_clone = user.clone();
    let bytes_me2c_clone = bytes_me2c.clone();
-    let d2c_flush_policy = MeD2cFlushPolicy::from_config(&config, seed_tier);
+    let d2c_flush_policy = MeD2cFlushPolicy::from_config(&config);
    let me_writer = tokio::spawn(async move {
        let mut writer = crypto_writer;
        let mut frame_buf = Vec::with_capacity(16 * 1024);
@@ -417,6 +550,7 @@ where
                        &mut frame_buf,
                        stats_clone.as_ref(),
                        &user_clone,
+                        quota_limit,
                        bytes_me2c_clone.as_ref(),
                        conn_id,
                        d2c_flush_policy.ack_flush_immediate,
@@ -449,6 +583,7 @@ where
                            &mut frame_buf,
                            stats_clone.as_ref(),
                            &user_clone,
+                            quota_limit,
                            bytes_me2c_clone.as_ref(),
                            conn_id,
                            d2c_flush_policy.ack_flush_immediate,
@@ -481,6 +616,7 @@ where
                                    &mut frame_buf,
                                    stats_clone.as_ref(),
                                    &user_clone,
+                                    quota_limit,
                                    bytes_me2c_clone.as_ref(),
                                    conn_id,
                                    d2c_flush_policy.ack_flush_immediate,
@@ -513,6 +649,7 @@ where
                                        &mut frame_buf,
                                        stats_clone.as_ref(),
                                        &user_clone,
+                                        quota_limit,
                                        bytes_me2c_clone.as_ref(),
                                        conn_id,
                                        d2c_flush_policy.ack_flush_immediate,
@@ -553,12 +690,6 @@ where
    let mut frame_counter: u64 = 0;
    let mut route_watch_open = true;
    loop {
-        if session_lease.is_stale() {
-            stats.increment_reconnect_stale_close_total();
-            let _ = enqueue_c2me_command(&c2me_tx, C2MeCommand::Close).await;
-            main_result = Err(ProxyError::Proxy("Session evicted by reconnect".to_string()));
-            break;
-        }
        if let Some(cutover) = affected_cutover_state(
            &route_rx,
            RelayRouteMode::Middle,
@@ -588,6 +719,8 @@ where
                &mut crypto_reader,
                proto_tag,
                frame_limit,
+                Duration::from_secs(config.timeouts.client_handshake.max(1)),
+                &buffer_pool,
                &forensics,
                &mut frame_counter,
                &stats,
@@ -598,7 +731,19 @@ where
                        forensics.bytes_c2me = forensics
                            .bytes_c2me
                            .saturating_add(payload.len() as u64);
-                        stats.add_user_octets_from(&user, payload.len() as u64);
+                        if let Some(limit) = quota_limit {
+                            let quota_lock = quota_user_lock(&user);
+                            let _quota_guard = quota_lock.lock().await;
+                            stats.add_user_octets_from(&user, payload.len() as u64);
+                            if quota_exceeded_for_user(stats.as_ref(), &user, Some(limit)) {
+                                main_result = Err(ProxyError::DataQuotaExceeded {
+                                    user: user.clone(),
+                                });
+                                break;
+                            }
+                        } else {
+                            stats.add_user_octets_from(&user, payload.len() as u64);
+                        }
                        let mut flags = proto_flags;
                        if quickack {
                            flags |= RPC_FLAG_QUICKACK;
@@ -667,10 +812,7 @@ where
        frames_ok = frame_counter,
        "ME relay cleanup"
    );
-    adaptive_buffers::record_user_tier(&user, seed_tier);
    me_pool.registry().unregister(conn_id).await;
-    stats.decrement_current_connections_me();
-    stats.decrement_user_curr_connects(&user);
    result
 }

@@ -678,30 +820,49 @@ async fn read_client_payload<R>(
    client_reader: &mut CryptoReader<R>,
    proto_tag: ProtoTag,
    max_frame: usize,
+    frame_read_timeout: Duration,
+    buffer_pool: &Arc<BufferPool>,
    forensics: &RelayForensicsState,
    frame_counter: &mut u64,
    stats: &Stats,
-) -> Result<Option<(Bytes, bool)>>
+) -> Result<Option<(PooledBuffer, bool)>>
 where
    R: AsyncRead + Unpin + Send + 'static,
 {
+    async fn read_exact_with_timeout<R>(
+        client_reader: &mut CryptoReader<R>,
+        buf: &mut [u8],
+        frame_read_timeout: Duration,
+    ) -> Result<()>
+    where
+        R: AsyncRead + Unpin + Send + 'static,
+    {
+        match timeout(frame_read_timeout, client_reader.read_exact(buf)).await {
+            Ok(Ok(_)) => Ok(()),
+            Ok(Err(e)) => Err(ProxyError::Io(e)),
+            Err(_) => Err(ProxyError::Io(std::io::Error::new(
+                std::io::ErrorKind::TimedOut,
+                "middle-relay client frame read timeout",
+            ))),
+        }
+    }
+
    loop {
        let (len, quickack, raw_len_bytes) = match proto_tag {
            ProtoTag::Abridged => {
                let mut first = [0u8; 1];
-                match client_reader.read_exact(&mut first).await {
-                    Ok(_) => {}
-                    Err(e) if e.kind() == std::io::ErrorKind::UnexpectedEof => return Ok(None),
-                    Err(e) => return Err(ProxyError::Io(e)),
+                match read_exact_with_timeout(client_reader, &mut first, frame_read_timeout).await {
+                    Ok(()) => {}
+                    Err(ProxyError::Io(e)) if e.kind() == std::io::ErrorKind::UnexpectedEof => {
+                        return Ok(None);
+                    }
+                    Err(e) => return Err(e),
                }

                let quickack = (first[0] & 0x80) != 0;
                let len_words = if (first[0] & 0x7f) == 0x7f {
                    let mut ext = [0u8; 3];
-                    client_reader
-                        .read_exact(&mut ext)
-                        .await
-                        .map_err(ProxyError::Io)?;
+                    read_exact_with_timeout(client_reader, &mut ext, frame_read_timeout).await?;
                    u32::from_le_bytes([ext[0], ext[1], ext[2], 0]) as usize
                } else {
                    (first[0] & 0x7f) as usize
@@ -714,10 +875,12 @@ where
            }
            ProtoTag::Intermediate | ProtoTag::Secure => {
                let mut len_buf = [0u8; 4];
-                match client_reader.read_exact(&mut len_buf).await {
-                    Ok(_) => {}
-                    Err(e) if e.kind() == std::io::ErrorKind::UnexpectedEof => return Ok(None),
-                    Err(e) => return Err(ProxyError::Io(e)),
+                match read_exact_with_timeout(client_reader, &mut len_buf, frame_read_timeout).await {
+                    Ok(()) => {}
+                    Err(ProxyError::Io(e)) if e.kind() == std::io::ErrorKind::UnexpectedEof => {
+                        return Ok(None);
+                    }
+                    Err(e) => return Err(e),
                }
                let quickack = (len_buf[3] & 0x80) != 0;
                (
@@ -769,18 +932,21 @@ where
            len
        };

-        let mut payload = vec![0u8; len];
-        client_reader
-            .read_exact(&mut payload)
-            .await
-            .map_err(ProxyError::Io)?;
+        let mut payload = buffer_pool.get();
+        payload.clear();
+        let current_cap = payload.capacity();
+        if current_cap < len {
+            payload.reserve(len);
+        }
+        payload.resize(len, 0);
+        read_exact_with_timeout(client_reader, &mut payload[..len], frame_read_timeout).await?;

        // Secure Intermediate: strip validated trailing padding bytes.
        if proto_tag == ProtoTag::Secure {
            payload.truncate(secure_payload_len);
        }
        *frame_counter += 1;
-        return Ok(Some((Bytes::from(payload), quickack)));
+        return Ok(Some((payload, quickack)));
    }
 }

@@ -801,6 +967,7 @@ async fn process_me_writer_response<W>(
    frame_buf: &mut Vec<u8>,
    stats: &Stats,
    user: &str,
+    quota_limit: Option<u64>,
    bytes_me2c: &AtomicU64,
    conn_id: u64,
    ack_flush_immediate: bool,
@@ -816,17 +983,47 @@ where
            } else {
                trace!(conn_id, bytes = data.len(), flags, "ME->C data");
            }
-            bytes_me2c.fetch_add(data.len() as u64, Ordering::Relaxed);
-            stats.add_user_octets_to(user, data.len() as u64);
-            write_client_payload(
-                client_writer,
-                proto_tag,
-                flags,
-                &data,
-                rng,
-                frame_buf,
-            )
-            .await?;
+            let data_len = data.len() as u64;
+            if let Some(limit) = quota_limit {
+                let quota_lock = quota_user_lock(user);
+                let _quota_guard = quota_lock.lock().await;
+                if quota_would_be_exceeded_for_user(stats, user, Some(limit), data_len) {
+                    return Err(ProxyError::DataQuotaExceeded {
+                        user: user.to_string(),
+                    });
+                }
+                write_client_payload(
+                    client_writer,
+                    proto_tag,
+                    flags,
+                    &data,
+                    rng,
+                    frame_buf,
+                )
+                .await?;
+
+                bytes_me2c.fetch_add(data.len() as u64, Ordering::Relaxed);
+                stats.add_user_octets_to(user, data.len() as u64);
+
+                if quota_exceeded_for_user(stats, user, Some(limit)) {
+                    return Err(ProxyError::DataQuotaExceeded {
+                        user: user.to_string(),
+                    });
+                }
+            } else {
+                write_client_payload(
+                    client_writer,
+                    proto_tag,
+                    flags,
+                    &data,
+                    rng,
+                    frame_buf,
+                )
+                .await?;
+
+                bytes_me2c.fetch_add(data.len() as u64, Ordering::Relaxed);
+                stats.add_user_octets_to(user, data.len() as u64);
+            }

            Ok(MeWriterResponseOutcome::Continue {
                frames: 1,
@@ -972,82 +1169,5 @@ where
 }

 #[cfg(test)]
-mod tests {
-    use super::*;
-    use tokio::time::{Duration as TokioDuration, timeout};
-
-    #[test]
-    fn should_yield_sender_only_on_budget_with_backlog() {
-        assert!(!should_yield_c2me_sender(0, true));
-        assert!(!should_yield_c2me_sender(C2ME_SENDER_FAIRNESS_BUDGET - 1, true));
-        assert!(!should_yield_c2me_sender(C2ME_SENDER_FAIRNESS_BUDGET, false));
-        assert!(should_yield_c2me_sender(C2ME_SENDER_FAIRNESS_BUDGET, true));
-    }
-
-    #[tokio::test]
-    async fn enqueue_c2me_command_uses_try_send_fast_path() {
-        let (tx, mut rx) = mpsc::channel::<C2MeCommand>(2);
-        enqueue_c2me_command(
-            &tx,
-            C2MeCommand::Data {
-                payload: Bytes::from_static(&[1, 2, 3]),
-                flags: 0,
-            },
-        )
-        .await
-        .unwrap();
-
-        let recv = timeout(TokioDuration::from_millis(50), rx.recv())
-            .await
-            .unwrap()
-            .unwrap();
-        match recv {
-            C2MeCommand::Data { payload, flags } => {
-                assert_eq!(payload.as_ref(), &[1, 2, 3]);
-                assert_eq!(flags, 0);
-            }
-            C2MeCommand::Close => panic!("unexpected close command"),
-        }
-    }
-
-    #[tokio::test]
-    async fn enqueue_c2me_command_falls_back_to_send_when_queue_is_full() {
-        let (tx, mut rx) = mpsc::channel::<C2MeCommand>(1);
-        tx.send(C2MeCommand::Data {
-            payload: Bytes::from_static(&[9]),
-            flags: 9,
-        })
-        .await
-        .unwrap();
-
-        let tx2 = tx.clone();
-        let producer = tokio::spawn(async move {
-            enqueue_c2me_command(
-                &tx2,
-                C2MeCommand::Data {
-                    payload: Bytes::from_static(&[7, 7]),
-                    flags: 7,
-                },
-            )
-            .await
-            .unwrap();
-        });
-
-        let _ = timeout(TokioDuration::from_millis(100), rx.recv())
-            .await
-            .unwrap();
-        producer.await.unwrap();
-
-        let recv = timeout(TokioDuration::from_millis(100), rx.recv())
-            .await
-            .unwrap()
-            .unwrap();
-        match recv {
-            C2MeCommand::Data { payload, flags } => {
-                assert_eq!(payload.as_ref(), &[7, 7]);
-                assert_eq!(flags, 7);
-            }
-            C2MeCommand::Close => panic!("unexpected close command"),
-        }
-    }
-}
+#[path = "middle_relay_security_tests.rs"]
+mod security_tests;
@@ -1,6 +1,5 @@
 //! Proxy Defs

-pub mod adaptive_buffers;
 pub mod client;
 pub mod direct_relay;
 pub mod handshake;
@@ -8,7 +7,6 @@ pub mod masking;
 pub mod middle_relay;
 pub mod route_mode;
 pub mod relay;
-pub mod session_eviction;

 pub use client::ClientHandler;
 #[allow(unused_imports)]
@@ -53,20 +53,17 @@

 use std::io;
 use std::pin::Pin;
-use std::sync::Arc;
-use std::sync::atomic::{AtomicU64, Ordering};
+use std::sync::{Arc, Mutex, OnceLock};
+use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
 use std::task::{Context, Poll};
 use std::time::Duration;
+use dashmap::DashMap;
 use tokio::io::{
    AsyncRead, AsyncWrite, AsyncWriteExt, ReadBuf, copy_bidirectional_with_sizes,
 };
 use tokio::time::Instant;
 use tracing::{debug, trace, warn};
-use crate::error::Result;
-use crate::proxy::adaptive_buffers::{
-    self, AdaptiveTier, RelaySignalSample, SessionAdaptiveController, TierTransitionReason,
-};
-use crate::proxy::session_eviction::SessionLease;
+use crate::error::{ProxyError, Result};
 use crate::stats::Stats;
 use crate::stream::BufferPool;

@@ -83,7 +80,6 @@ const ACTIVITY_TIMEOUT: Duration = Duration::from_secs(1800);
 /// 10 seconds gives responsive timeout detection (±10s accuracy)
 /// without measurable overhead from atomic reads.
 const WATCHDOG_INTERVAL: Duration = Duration::from_secs(10);
-const ADAPTIVE_TICK: Duration = Duration::from_millis(250);

 // ============= CombinedStream =============

@@ -160,16 +156,6 @@ struct SharedCounters {
    s2c_ops: AtomicU64,
    /// Milliseconds since relay epoch of last I/O activity
    last_activity_ms: AtomicU64,
-    /// Bytes requested to write to client (S→C direction).
-    s2c_requested_bytes: AtomicU64,
-    /// Total write operations for S→C direction.
-    s2c_write_ops: AtomicU64,
-    /// Number of partial writes to client.
-    s2c_partial_writes: AtomicU64,
-    /// Number of times S→C poll_write returned Pending.
-    s2c_pending_writes: AtomicU64,
-    /// Consecutive pending writes in S→C direction.
-    s2c_consecutive_pending_writes: AtomicU64,
 }

 impl SharedCounters {
@@ -180,11 +166,6 @@ impl SharedCounters {
            c2s_ops: AtomicU64::new(0),
            s2c_ops: AtomicU64::new(0),
            last_activity_ms: AtomicU64::new(0),
-            s2c_requested_bytes: AtomicU64::new(0),
-            s2c_write_ops: AtomicU64::new(0),
-            s2c_partial_writes: AtomicU64::new(0),
-            s2c_pending_writes: AtomicU64::new(0),
-            s2c_consecutive_pending_writes: AtomicU64::new(0),
        }
    }

@@ -225,6 +206,10 @@ struct StatsIo<S> {
    counters: Arc<SharedCounters>,
    stats: Arc<Stats>,
    user: String,
+    quota_limit: Option<u64>,
+    quota_exceeded: Arc<AtomicBool>,
+    quota_read_wake_scheduled: bool,
+    quota_write_wake_scheduled: bool,
    epoch: Instant,
 }

@@ -234,11 +219,64 @@ impl<S> StatsIo<S> {
        counters: Arc<SharedCounters>,
        stats: Arc<Stats>,
        user: String,
+        quota_limit: Option<u64>,
+        quota_exceeded: Arc<AtomicBool>,
        epoch: Instant,
    ) -> Self {
        // Mark initial activity so the watchdog doesn't fire before data flows
        counters.touch(Instant::now(), epoch);
-        Self { inner, counters, stats, user, epoch }
+        Self {
+            inner,
+            counters,
+            stats,
+            user,
+            quota_limit,
+            quota_exceeded,
+            quota_read_wake_scheduled: false,
+            quota_write_wake_scheduled: false,
+            epoch,
+        }
+    }
+}
+
+#[derive(Debug)]
+struct QuotaIoSentinel;
+
+impl std::fmt::Display for QuotaIoSentinel {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        f.write_str("user data quota exceeded")
+    }
+}
+
+impl std::error::Error for QuotaIoSentinel {}
+
+fn quota_io_error() -> io::Error {
+    io::Error::new(io::ErrorKind::PermissionDenied, QuotaIoSentinel)
+}
+
+fn is_quota_io_error(err: &io::Error) -> bool {
+    err.kind() == io::ErrorKind::PermissionDenied
+        && err
+            .get_ref()
+            .and_then(|source| source.downcast_ref::<QuotaIoSentinel>())
+            .is_some()
+}
+
+static QUOTA_USER_LOCKS: OnceLock<DashMap<String, Arc<Mutex<()>>>> = OnceLock::new();
+
+fn quota_user_lock(user: &str) -> Arc<Mutex<()>> {
+    let locks = QUOTA_USER_LOCKS.get_or_init(DashMap::new);
+    if let Some(existing) = locks.get(user) {
+        return Arc::clone(existing.value());
+    }
+
+    let created = Arc::new(Mutex::new(()));
+    match locks.entry(user.to_string()) {
+        dashmap::mapref::entry::Entry::Occupied(entry) => Arc::clone(entry.get()),
+        dashmap::mapref::entry::Entry::Vacant(entry) => {
+            entry.insert(Arc::clone(&created));
+            created
+        }
    }
 }

@@ -249,6 +287,42 @@ impl<S: AsyncRead + Unpin> AsyncRead for StatsIo<S> {
        buf: &mut ReadBuf<'_>,
    ) -> Poll<io::Result<()>> {
        let this = self.get_mut();
+        if this.quota_exceeded.load(Ordering::Relaxed) {
+            return Poll::Ready(Err(quota_io_error()));
+        }
+
+        let quota_lock = this
+            .quota_limit
+            .is_some()
+            .then(|| quota_user_lock(&this.user));
+        let _quota_guard = if let Some(lock) = quota_lock.as_ref() {
+            match lock.try_lock() {
+                Ok(guard) => {
+                    this.quota_read_wake_scheduled = false;
+                    Some(guard)
+                }
+                Err(_) => {
+                    if !this.quota_read_wake_scheduled {
+                        this.quota_read_wake_scheduled = true;
+                        let waker = cx.waker().clone();
+                        tokio::task::spawn(async move {
+                            tokio::task::yield_now().await;
+                            waker.wake();
+                        });
+                    }
+                    return Poll::Pending;
+                }
+            }
+        } else {
+            None
+        };
+
+        if let Some(limit) = this.quota_limit
+            && this.stats.get_user_total_octets(&this.user) >= limit
+        {
+            this.quota_exceeded.store(true, Ordering::Relaxed);
+            return Poll::Ready(Err(quota_io_error()));
+        }
        let before = buf.filled().len();

        match Pin::new(&mut this.inner).poll_read(cx, buf) {
@@ -263,6 +337,13 @@ impl<S: AsyncRead + Unpin> AsyncRead for StatsIo<S> {
                    this.stats.add_user_octets_from(&this.user, n as u64);
                    this.stats.increment_user_msgs_from(&this.user);

+                    if let Some(limit) = this.quota_limit
+                        && this.stats.get_user_total_octets(&this.user) >= limit
+                    {
+                        this.quota_exceeded.store(true, Ordering::Relaxed);
+                        return Poll::Ready(Err(quota_io_error()));
+                    }
+
                    trace!(user = %this.user, bytes = n, "C->S");
                }
                Poll::Ready(Ok(()))
@@ -279,21 +360,57 @@ impl<S: AsyncWrite + Unpin> AsyncWrite for StatsIo<S> {
        buf: &[u8],
    ) -> Poll<io::Result<usize>> {
        let this = self.get_mut();
-        this.counters
-            .s2c_requested_bytes
-            .fetch_add(buf.len() as u64, Ordering::Relaxed);
+        if this.quota_exceeded.load(Ordering::Relaxed) {
+            return Poll::Ready(Err(quota_io_error()));
+        }

-        match Pin::new(&mut this.inner).poll_write(cx, buf) {
-            Poll::Ready(Ok(n)) => {
-                this.counters.s2c_write_ops.fetch_add(1, Ordering::Relaxed);
-                this.counters
-                    .s2c_consecutive_pending_writes
-                    .store(0, Ordering::Relaxed);
-                if n < buf.len() {
-                    this.counters
-                        .s2c_partial_writes
-                        .fetch_add(1, Ordering::Relaxed);
+        let quota_lock = this
+            .quota_limit
+            .is_some()
+            .then(|| quota_user_lock(&this.user));
+        let _quota_guard = if let Some(lock) = quota_lock.as_ref() {
+            match lock.try_lock() {
+                Ok(guard) => {
+                    this.quota_write_wake_scheduled = false;
+                    Some(guard)
                }
+                Err(_) => {
+                    if !this.quota_write_wake_scheduled {
+                        this.quota_write_wake_scheduled = true;
+                        let waker = cx.waker().clone();
+                        tokio::task::spawn(async move {
+                            tokio::task::yield_now().await;
+                            waker.wake();
+                        });
+                    }
+                    return Poll::Pending;
+                }
+            }
+        } else {
+            None
+        };
+
+        let write_buf = if let Some(limit) = this.quota_limit {
+            let used = this.stats.get_user_total_octets(&this.user);
+            if used >= limit {
+                this.quota_exceeded.store(true, Ordering::Relaxed);
+                return Poll::Ready(Err(quota_io_error()));
+            }
+
+            let remaining = (limit - used) as usize;
+            if buf.len() > remaining {
+                // Fail closed: do not emit partial S->C payload when remaining
+                // quota cannot accommodate the pending write request.
+                this.quota_exceeded.store(true, Ordering::Relaxed);
+                return Poll::Ready(Err(quota_io_error()));
+            }
+            buf
+        } else {
+            buf
+        };
+
+        match Pin::new(&mut this.inner).poll_write(cx, write_buf) {
+            Poll::Ready(Ok(n)) => {
                if n > 0 {
                    // S→C: data written to client
                    this.counters.s2c_bytes.fetch_add(n as u64, Ordering::Relaxed);
@@ -303,19 +420,17 @@ impl<S: AsyncWrite + Unpin> AsyncWrite for StatsIo<S> {
                    this.stats.add_user_octets_to(&this.user, n as u64);
                    this.stats.increment_user_msgs_to(&this.user);

+                    if let Some(limit) = this.quota_limit
+                        && this.stats.get_user_total_octets(&this.user) >= limit
+                    {
+                        this.quota_exceeded.store(true, Ordering::Relaxed);
+                        return Poll::Ready(Err(quota_io_error()));
+                    }
+
                    trace!(user = %this.user, bytes = n, "S->C");
                }
                Poll::Ready(Ok(n))
            }
-            Poll::Pending => {
-                this.counters
-                    .s2c_pending_writes
-                    .fetch_add(1, Ordering::Relaxed);
-                this.counters
-                    .s2c_consecutive_pending_writes
-                    .fetch_add(1, Ordering::Relaxed);
-                Poll::Pending
-            }
            other => other,
        }
    }
@@ -348,7 +463,8 @@ impl<S: AsyncWrite + Unpin> AsyncWrite for StatsIo<S> {
 /// - Per-user stats: bytes and ops counted per direction
 /// - Periodic rate logging: every 10 seconds when active
 /// - Clean shutdown: both write sides are shut down on exit
-/// - Error propagation: I/O errors are returned as `ProxyError::Io`
+/// - Error propagation: quota exits return `ProxyError::DataQuotaExceeded`,
+///   other I/O failures are returned as `ProxyError::Io`
 pub async fn relay_bidirectional<CR, CW, SR, SW>(
    client_reader: CR,
    client_writer: CW,
@@ -357,11 +473,9 @@ pub async fn relay_bidirectional<CR, CW, SR, SW>(
    c2s_buf_size: usize,
    s2c_buf_size: usize,
    user: &str,
-    dc_idx: i16,
    stats: Arc<Stats>,
+    quota_limit: Option<u64>,
    _buffer_pool: Arc<BufferPool>,
-    session_lease: SessionLease,
-    seed_tier: AdaptiveTier,
 ) -> Result<()>
 where
    CR: AsyncRead + Unpin + Send + 'static,
@@ -371,6 +485,7 @@ where
 {
    let epoch = Instant::now();
    let counters = Arc::new(SharedCounters::new());
+    let quota_exceeded = Arc::new(AtomicBool::new(false));
    let user_owned = user.to_string();

    // ── Combine split halves into bidirectional streams ──────────────
@@ -383,43 +498,31 @@ where
        Arc::clone(&counters),
        Arc::clone(&stats),
        user_owned.clone(),
+        quota_limit,
+        Arc::clone(&quota_exceeded),
        epoch,
    );

    // ── Watchdog: activity timeout + periodic rate logging ──────────
    let wd_counters = Arc::clone(&counters);
    let wd_user = user_owned.clone();
-    let wd_dc = dc_idx;
-    let wd_stats = Arc::clone(&stats);
-    let wd_session = session_lease.clone();
+    let wd_quota_exceeded = Arc::clone(&quota_exceeded);

    let watchdog = async {
-        let mut prev_c2s_log: u64 = 0;
-        let mut prev_s2c_log: u64 = 0;
-        let mut prev_c2s_sample: u64 = 0;
-        let mut prev_s2c_requested_sample: u64 = 0;
-        let mut prev_s2c_written_sample: u64 = 0;
-        let mut prev_s2c_write_ops_sample: u64 = 0;
-        let mut prev_s2c_partial_sample: u64 = 0;
-        let mut accumulated_log = Duration::ZERO;
-        let mut adaptive = SessionAdaptiveController::new(seed_tier);
+        let mut prev_c2s: u64 = 0;
+        let mut prev_s2c: u64 = 0;

        loop {
-            tokio::time::sleep(ADAPTIVE_TICK).await;
-
-            if wd_session.is_stale() {
-                wd_stats.increment_reconnect_stale_close_total();
-                warn!(
-                    user = %wd_user,
-                    dc = wd_dc,
-                    "Session evicted by reconnect"
-                );
-                return;
-            }
+            tokio::time::sleep(WATCHDOG_INTERVAL).await;

            let now = Instant::now();
            let idle = wd_counters.idle_duration(now, epoch);

+            if wd_quota_exceeded.load(Ordering::Relaxed) {
+                warn!(user = %wd_user, "User data quota reached, closing relay");
+                return;
+            }
+
            // ── Activity timeout ────────────────────────────────────
            if idle >= ACTIVITY_TIMEOUT {
                let c2s = wd_counters.c2s_bytes.load(Ordering::Relaxed);
@@ -434,80 +537,11 @@ where
                return; // Causes select! to cancel copy_bidirectional
            }

-            let c2s_total = wd_counters.c2s_bytes.load(Ordering::Relaxed);
-            let s2c_requested_total = wd_counters
-                .s2c_requested_bytes
-                .load(Ordering::Relaxed);
-            let s2c_written_total = wd_counters.s2c_bytes.load(Ordering::Relaxed);
-            let s2c_write_ops_total = wd_counters
-                .s2c_write_ops
-                .load(Ordering::Relaxed);
-            let s2c_partial_total = wd_counters
-                .s2c_partial_writes
-                .load(Ordering::Relaxed);
-            let consecutive_pending = wd_counters
-                .s2c_consecutive_pending_writes
-                .load(Ordering::Relaxed) as u32;
-
-            let sample = RelaySignalSample {
-                c2s_bytes: c2s_total.saturating_sub(prev_c2s_sample),
-                s2c_requested_bytes: s2c_requested_total
-                    .saturating_sub(prev_s2c_requested_sample),
-                s2c_written_bytes: s2c_written_total
-                    .saturating_sub(prev_s2c_written_sample),
-                s2c_write_ops: s2c_write_ops_total
-                    .saturating_sub(prev_s2c_write_ops_sample),
-                s2c_partial_writes: s2c_partial_total
-                    .saturating_sub(prev_s2c_partial_sample),
-                s2c_consecutive_pending_writes: consecutive_pending,
-            };
-
-            if let Some(transition) = adaptive.observe(sample, ADAPTIVE_TICK.as_secs_f64()) {
-                match transition.reason {
-                    TierTransitionReason::SoftConfirmed => {
-                        wd_stats.increment_relay_adaptive_promotions_total();
-                    }
-                    TierTransitionReason::HardPressure => {
-                        wd_stats.increment_relay_adaptive_promotions_total();
-                        wd_stats.increment_relay_adaptive_hard_promotions_total();
-                    }
-                    TierTransitionReason::QuietDemotion => {
-                        wd_stats.increment_relay_adaptive_demotions_total();
-                    }
-                }
-                adaptive_buffers::record_user_tier(&wd_user, adaptive.max_tier_seen());
-                debug!(
-                    user = %wd_user,
-                    dc = wd_dc,
-                    from_tier = transition.from.as_u8(),
-                    to_tier = transition.to.as_u8(),
-                    reason = ?transition.reason,
-                    throughput_ema_bps = sample
-                        .c2s_bytes
-                        .max(sample.s2c_written_bytes)
-                        .saturating_mul(8)
-                        .saturating_mul(4),
-                    "Adaptive relay tier transition"
-                );
-            }
-
-            prev_c2s_sample = c2s_total;
-            prev_s2c_requested_sample = s2c_requested_total;
-            prev_s2c_written_sample = s2c_written_total;
-            prev_s2c_write_ops_sample = s2c_write_ops_total;
-            prev_s2c_partial_sample = s2c_partial_total;
-
-            accumulated_log = accumulated_log.saturating_add(ADAPTIVE_TICK);
-            if accumulated_log < WATCHDOG_INTERVAL {
-                continue;
-            }
-            accumulated_log = Duration::ZERO;
-
            // ── Periodic rate logging ───────────────────────────────
            let c2s = wd_counters.c2s_bytes.load(Ordering::Relaxed);
            let s2c = wd_counters.s2c_bytes.load(Ordering::Relaxed);
-            let c2s_delta = c2s.saturating_sub(prev_c2s_log);
-            let s2c_delta = s2c.saturating_sub(prev_s2c_log);
+            let c2s_delta = c2s.saturating_sub(prev_c2s);
+            let s2c_delta = s2c.saturating_sub(prev_s2c);

            if c2s_delta > 0 || s2c_delta > 0 {
                let secs = WATCHDOG_INTERVAL.as_secs_f64();
@@ -521,8 +555,8 @@ where
                );
            }

-            prev_c2s_log = c2s;
-            prev_s2c_log = s2c;
+            prev_c2s = c2s;
+            prev_s2c = s2c;
        }
    };

@@ -557,7 +591,6 @@ where
    let c2s_ops = counters.c2s_ops.load(Ordering::Relaxed);
    let s2c_ops = counters.s2c_ops.load(Ordering::Relaxed);
    let duration = epoch.elapsed();
-    adaptive_buffers::record_user_tier(&user_owned, seed_tier);

    match copy_result {
        Some(Ok((c2s, s2c))) => {
@@ -573,6 +606,22 @@ where
            );
            Ok(())
        }
+        Some(Err(e)) if is_quota_io_error(&e) => {
+            let c2s = counters.c2s_bytes.load(Ordering::Relaxed);
+            let s2c = counters.s2c_bytes.load(Ordering::Relaxed);
+            warn!(
+                user = %user_owned,
+                c2s_bytes = c2s,
+                s2c_bytes = s2c,
+                c2s_msgs = c2s_ops,
+                s2c_msgs = s2c_ops,
+                duration_secs = duration.as_secs(),
+                "Data quota reached, closing relay"
+            );
+            Err(ProxyError::DataQuotaExceeded {
+                user: user_owned.clone(),
+            })
+        }
        Some(Err(e)) => {
            // I/O error in one of the directions
            let c2s = counters.c2s_bytes.load(Ordering::Relaxed);
@@ -606,3 +655,10 @@ where
        }
    }
 }
+
+#[cfg(test)]
+#[path = "relay_security_tests.rs"]
+mod security_tests;
+#[cfg(test)]
+#[path = "relay_adversarial_tests.rs"]
+mod adversarial_tests;
@@ -0,0 +1,122 @@
+use super::*;
+use crate::error::ProxyError;
+use crate::stats::Stats;
+use crate::stream::BufferPool;
+use std::sync::Arc;
+use tokio::io::{duplex, AsyncReadExt, AsyncWriteExt};
+use tokio::time::{Duration, Instant, timeout};
+
+// ------------------------------------------------------------------
+// Priority 3: Async Relay HOL Blocking Prevention (OWASP ASVS 5.1.5)
+// ------------------------------------------------------------------
+
+#[tokio::test]
+async fn relay_hol_blocking_prevention_regression() {
+    let stats = Arc::new(Stats::new());
+    let user = "hol-user";
+
+    let (client_peer, relay_client) = duplex(65536);
+    let (relay_server, server_peer) = duplex(65536);
+
+    let (client_reader, client_writer) = tokio::io::split(relay_client);
+    let (server_reader, server_writer) = tokio::io::split(relay_server);
+    let (mut cp_reader, mut cp_writer) = tokio::io::split(client_peer);
+    let (mut sp_reader, mut sp_writer) = tokio::io::split(server_peer);
+
+    let relay_task = tokio::spawn(relay_bidirectional(
+        client_reader,
+        client_writer,
+        server_reader,
+        server_writer,
+        8192,
+        8192,
+        user,
+        Arc::clone(&stats),
+        None,
+        Arc::new(BufferPool::new()),
+    ));
+
+    let payload_size = 1024 * 10;
+    let s2c_payload = vec![0x41; payload_size];
+    let c2s_payload = vec![0x42; payload_size];
+
+    let s2c_handle = tokio::spawn(async move {
+        sp_writer.write_all(&s2c_payload).await.unwrap();
+
+        let mut total_read = 0;
+        let mut buf = [0u8; 10];
+        while total_read < payload_size {
+            let n = cp_reader.read(&mut buf).await.unwrap();
+            total_read += n;
+            tokio::time::sleep(Duration::from_millis(100)).await;
+        }
+    });
+
+    let start = Instant::now();
+    cp_writer.write_all(&c2s_payload).await.unwrap();
+
+    let mut server_buf = vec![0u8; payload_size];
+    sp_reader.read_exact(&mut server_buf).await.unwrap();
+    let elapsed = start.elapsed();
+
+    assert!(elapsed < Duration::from_millis(1000), "C->S must not be blocked by slow S->C (HOL blocking): {:?}", elapsed);
+    assert_eq!(server_buf, c2s_payload);
+
+    s2c_handle.abort();
+    relay_task.abort();
+}
+
+// ------------------------------------------------------------------
+// Priority 3: Data Quota Mid-Session Cutoff (OWASP ASVS 5.1.6)
+// ------------------------------------------------------------------
+
+#[tokio::test]
+async fn relay_quota_mid_session_cutoff() {
+    let stats = Arc::new(Stats::new());
+    let user = "quota-mid-user";
+    let quota = 5000;
+
+    let (client_peer, relay_client) = duplex(8192);
+    let (relay_server, server_peer) = duplex(8192);
+
+    let (client_reader, client_writer) = tokio::io::split(relay_client);
+    let (server_reader, server_writer) = tokio::io::split(relay_server);
+    let (mut _cp_reader, mut cp_writer) = tokio::io::split(client_peer);
+    let (mut sp_reader, _sp_writer) = tokio::io::split(server_peer);
+
+    let relay_task = tokio::spawn(relay_bidirectional(
+        client_reader,
+        client_writer,
+        server_reader,
+        server_writer,
+        1024,
+        1024,
+        user,
+        Arc::clone(&stats),
+        Some(quota),
+        Arc::new(BufferPool::new()),
+    ));
+
+    // Send 4000 bytes (Ok)
+    let buf1 = vec![0x42; 4000];
+    cp_writer.write_all(&buf1).await.unwrap();
+    let mut server_recv = vec![0u8; 4000];
+    sp_reader.read_exact(&mut server_recv).await.unwrap();
+
+    // Send another 2000 bytes (Total 6000 > 5000)
+    let buf2 = vec![0x42; 2000];
+    let _ = cp_writer.write_all(&buf2).await;
+
+    let relay_res = timeout(Duration::from_secs(1), relay_task).await.unwrap();
+
+    match relay_res {
+        Ok(Err(ProxyError::DataQuotaExceeded { .. })) => {
+            // Expected
+        }
+        other => panic!("Expected DataQuotaExceeded error, got: {:?}", other),
+    }
+
+    let mut small_buf = [0u8; 1];
+    let n = sp_reader.read(&mut small_buf).await.unwrap();
+    assert_eq!(n, 0, "Server must see EOF after quota reached");
+}
@@ -1,10 +1,10 @@
 use std::sync::Arc;
-use std::sync::atomic::{AtomicU8, AtomicU64, Ordering};
+use std::sync::atomic::{AtomicU64, Ordering};
 use std::time::{Duration, SystemTime, UNIX_EPOCH};

 use tokio::sync::watch;

-pub(crate) const ROUTE_SWITCH_ERROR_MSG: &str = "Route mode switched by cutover";
+pub(crate) const ROUTE_SWITCH_ERROR_MSG: &str = "Session terminated";

 #[derive(Clone, Copy, Debug, PartialEq, Eq)]
 #[repr(u8)]
@@ -14,17 +14,6 @@ pub(crate) enum RelayRouteMode {
 }

 impl RelayRouteMode {
-    pub(crate) fn as_u8(self) -> u8 {
-        self as u8
-    }
-
-    pub(crate) fn from_u8(value: u8) -> Self {
-        match value {
-            1 => Self::Middle,
-            _ => Self::Direct,
-        }
-    }
-
    pub(crate) fn as_str(self) -> &'static str {
        match self {
            Self::Direct => "direct",
@@ -41,8 +30,6 @@ pub(crate) struct RouteCutoverState {

 #[derive(Clone)]
 pub(crate) struct RouteRuntimeController {
-    mode: Arc<AtomicU8>,
-    generation: Arc<AtomicU64>,
    direct_since_epoch_secs: Arc<AtomicU64>,
    tx: watch::Sender<RouteCutoverState>,
 }
@@ -60,18 +47,13 @@ impl RouteRuntimeController {
            0
        };
        Self {
-            mode: Arc::new(AtomicU8::new(initial_mode.as_u8())),
-            generation: Arc::new(AtomicU64::new(0)),
            direct_since_epoch_secs: Arc::new(AtomicU64::new(direct_since_epoch_secs)),
            tx,
        }
    }

    pub(crate) fn snapshot(&self) -> RouteCutoverState {
-        RouteCutoverState {
-            mode: RelayRouteMode::from_u8(self.mode.load(Ordering::Relaxed)),
-            generation: self.generation.load(Ordering::Relaxed),
-        }
+        *self.tx.borrow()
    }

    pub(crate) fn subscribe(&self) -> watch::Receiver<RouteCutoverState> {
@@ -84,20 +66,29 @@ impl RouteRuntimeController {
    }

    pub(crate) fn set_mode(&self, mode: RelayRouteMode) -> Option<RouteCutoverState> {
-        let previous = self.mode.swap(mode.as_u8(), Ordering::Relaxed);
-        if previous == mode.as_u8() {
+        let mut next = None;
+        let changed = self.tx.send_if_modified(|state| {
+            if state.mode == mode {
+                return false;
+            }
+            state.mode = mode;
+            state.generation = state.generation.saturating_add(1);
+            next = Some(*state);
+            true
+        });
+
+        if !changed {
            return None;
        }
+
        if matches!(mode, RelayRouteMode::Direct) {
            self.direct_since_epoch_secs
                .store(now_epoch_secs(), Ordering::Relaxed);
        } else {
            self.direct_since_epoch_secs.store(0, Ordering::Relaxed);
        }
-        let generation = self.generation.fetch_add(1, Ordering::Relaxed) + 1;
-        let next = RouteCutoverState { mode, generation };
-        self.tx.send_replace(next);
-        Some(next)
+
+        next
    }
 }

@@ -110,10 +101,10 @@ fn now_epoch_secs() -> u64 {

 pub(crate) fn is_session_affected_by_cutover(
    current: RouteCutoverState,
-    _session_mode: RelayRouteMode,
+    session_mode: RelayRouteMode,
    session_generation: u64,
 ) -> bool {
-    current.generation > session_generation
+    current.generation > session_generation && current.mode != session_mode
 }

 pub(crate) fn affected_cutover_state(
@@ -140,3 +131,7 @@ pub(crate) fn cutover_stagger_delay(session_id: u64, generation: u64) -> Duratio
    let ms = 1000 + (value % 1000);
    Duration::from_millis(ms)
 }
+
+#[cfg(test)]
+#[path = "route_mode_security_tests.rs"]
+mod security_tests;
@@ -0,0 +1,406 @@
+use super::*;
+use rand::{Rng, SeedableRng};
+use rand::rngs::StdRng;
+use std::sync::Arc;
+use std::sync::atomic::{AtomicU64, Ordering};
+
+#[test]
+fn cutover_stagger_delay_is_deterministic_for_same_inputs() {
+    let d1 = cutover_stagger_delay(0x0123_4567_89ab_cdef, 42);
+    let d2 = cutover_stagger_delay(0x0123_4567_89ab_cdef, 42);
+    assert_eq!(
+        d1, d2,
+        "stagger delay must be deterministic for identical session/generation inputs"
+    );
+}
+
+#[test]
+fn cutover_stagger_delay_stays_within_budget_bounds() {
+    // Black-hat model: censors trigger many cutovers and correlate disconnect timing.
+    // Keep delay inside a narrow coarse window to avoid long-tail spikes.
+    for generation in [0u64, 1, 2, 3, 16, 128, u32::MAX as u64, u64::MAX] {
+        for session_id in [
+            0u64,
+            1,
+            2,
+            0xdead_beef,
+            0xfeed_face_cafe_babe,
+            u64::MAX,
+        ] {
+            let delay = cutover_stagger_delay(session_id, generation);
+            assert!(
+                (1000..=1999).contains(&delay.as_millis()),
+                "stagger delay must remain in fixed 1000..=1999ms budget"
+            );
+        }
+    }
+}
+
+#[test]
+fn cutover_stagger_delay_changes_with_generation_for_same_session() {
+    let session_id = 0x0123_4567_89ab_cdef;
+    let first = cutover_stagger_delay(session_id, 100);
+    let second = cutover_stagger_delay(session_id, 101);
+    assert_ne!(
+        first, second,
+        "adjacent cutover generations should decorrelate disconnect delays"
+    );
+}
+
+#[test]
+fn route_runtime_set_mode_is_idempotent_for_same_mode() {
+    let runtime = RouteRuntimeController::new(RelayRouteMode::Direct);
+    let first = runtime.snapshot();
+    let changed = runtime.set_mode(RelayRouteMode::Direct);
+    let second = runtime.snapshot();
+
+    assert!(
+        changed.is_none(),
+        "setting already-active mode must not produce a cutover event"
+    );
+    assert_eq!(
+        first.generation, second.generation,
+        "idempotent mode set must not bump generation"
+    );
+}
+
+#[test]
+fn affected_cutover_state_triggers_only_for_newer_generation() {
+    let runtime = RouteRuntimeController::new(RelayRouteMode::Direct);
+    let rx = runtime.subscribe();
+    let initial = runtime.snapshot();
+
+    assert!(
+        affected_cutover_state(&rx, RelayRouteMode::Direct, initial.generation).is_none(),
+        "current generation must not be considered a cutover for existing session"
+    );
+
+    let next = runtime
+        .set_mode(RelayRouteMode::Middle)
+        .expect("mode change must produce cutover state");
+    let seen = affected_cutover_state(&rx, RelayRouteMode::Direct, initial.generation)
+        .expect("newer generation must be observed as cutover");
+
+    assert_eq!(seen.generation, next.generation);
+    assert_eq!(seen.mode, RelayRouteMode::Middle);
+}
+
+#[test]
+fn integration_watch_and_snapshot_follow_same_transition_sequence() {
+    let runtime = RouteRuntimeController::new(RelayRouteMode::Direct);
+    let rx = runtime.subscribe();
+
+    let sequence = [
+        RelayRouteMode::Middle,
+        RelayRouteMode::Middle,
+        RelayRouteMode::Direct,
+        RelayRouteMode::Direct,
+        RelayRouteMode::Middle,
+    ];
+
+    let mut expected_generation = 0u64;
+    let mut expected_mode = RelayRouteMode::Direct;
+
+    for target in sequence {
+        let changed = runtime.set_mode(target);
+        if target == expected_mode {
+            assert!(changed.is_none(), "idempotent transition must return none");
+        } else {
+            expected_mode = target;
+            expected_generation = expected_generation.saturating_add(1);
+            let emitted = changed.expect("real transition must emit cutover state");
+            assert_eq!(emitted.mode, expected_mode);
+            assert_eq!(emitted.generation, expected_generation);
+        }
+
+        let snap = runtime.snapshot();
+        let watched = *rx.borrow();
+        assert_eq!(snap, watched, "snapshot and watch state must stay aligned");
+        assert_eq!(snap.mode, expected_mode);
+        assert_eq!(snap.generation, expected_generation);
+    }
+}
+
+#[test]
+fn session_is_not_affected_when_mode_matches_even_if_generation_advanced() {
+    let session_mode = RelayRouteMode::Direct;
+    let current = RouteCutoverState {
+        mode: RelayRouteMode::Direct,
+        generation: 2,
+    };
+    let session_generation = 0;
+
+    assert!(
+        !is_session_affected_by_cutover(current, session_mode, session_generation),
+        "session on matching final route mode should not be force-cut over on intermediate generation bumps"
+    );
+}
+
+#[test]
+fn cutover_predicate_rejects_equal_generation_even_if_mode_differs() {
+    let current = RouteCutoverState {
+        mode: RelayRouteMode::Middle,
+        generation: 77,
+    };
+    assert!(
+        !is_session_affected_by_cutover(current, RelayRouteMode::Direct, 77),
+        "equal generation must never trigger cutover regardless of mode mismatch"
+    );
+}
+
+#[test]
+fn adversarial_route_oscillation_only_cuts_over_sessions_with_different_final_mode() {
+    let runtime = RouteRuntimeController::new(RelayRouteMode::Direct);
+    let rx = runtime.subscribe();
+    let session_generation = runtime.snapshot().generation;
+
+    runtime
+        .set_mode(RelayRouteMode::Middle)
+        .expect("direct->middle must transition");
+    runtime
+        .set_mode(RelayRouteMode::Direct)
+        .expect("middle->direct must transition");
+
+    assert!(
+        affected_cutover_state(&rx, RelayRouteMode::Direct, session_generation).is_none(),
+        "direct session should survive when final mode returns to direct"
+    );
+    assert!(
+        affected_cutover_state(&rx, RelayRouteMode::Middle, session_generation).is_some(),
+        "middle session should be cut over when final mode is direct"
+    );
+}
+
+#[test]
+fn light_fuzz_cutover_predicate_matches_reference_oracle() {
+    let mut rng = StdRng::seed_from_u64(0xC0DEC0DE5EED);
+    for _ in 0..20_000 {
+        let current = RouteCutoverState {
+            mode: if rng.random::<bool>() {
+                RelayRouteMode::Direct
+            } else {
+                RelayRouteMode::Middle
+            },
+            generation: rng.random_range(0u64..1_000_000),
+        };
+        let session_mode = if rng.random::<bool>() {
+            RelayRouteMode::Direct
+        } else {
+            RelayRouteMode::Middle
+        };
+        let session_generation = rng.random_range(0u64..1_000_000);
+
+        let expected = current.generation > session_generation && current.mode != session_mode;
+        let actual = is_session_affected_by_cutover(current, session_mode, session_generation);
+        assert_eq!(
+            actual, expected,
+            "cutover predicate must match mode-aware generation oracle"
+        );
+    }
+}
+
+#[test]
+fn light_fuzz_set_mode_generation_tracks_only_real_transitions() {
+    let runtime = RouteRuntimeController::new(RelayRouteMode::Direct);
+    let mut rng = StdRng::seed_from_u64(0x0DDC0FFE);
+
+    let mut expected_mode = RelayRouteMode::Direct;
+    let mut expected_generation = 0u64;
+
+    for _ in 0..10_000 {
+        let candidate = if rng.random::<bool>() {
+            RelayRouteMode::Direct
+        } else {
+            RelayRouteMode::Middle
+        };
+        let changed = runtime.set_mode(candidate);
+
+        if candidate == expected_mode {
+            assert!(changed.is_none(), "idempotent set_mode must not emit cutover state");
+        } else {
+            expected_mode = candidate;
+            expected_generation = expected_generation.saturating_add(1);
+            let next = changed.expect("mode transition must emit cutover state");
+            assert_eq!(next.mode, expected_mode);
+            assert_eq!(next.generation, expected_generation);
+        }
+    }
+
+    let final_state = runtime.snapshot();
+    assert_eq!(final_state.mode, expected_mode);
+    assert_eq!(final_state.generation, expected_generation);
+}
+
+#[test]
+fn stress_snapshot_and_watch_state_remain_consistent_under_concurrent_switch_storm() {
+    let runtime = Arc::new(RouteRuntimeController::new(RelayRouteMode::Direct));
+
+    std::thread::scope(|scope| {
+        let mut writers = Vec::new();
+        for worker in 0..4usize {
+            let runtime = Arc::clone(&runtime);
+            writers.push(scope.spawn(move || {
+                for step in 0..20_000usize {
+                    let mode = if (worker + step) % 2 == 0 {
+                        RelayRouteMode::Direct
+                    } else {
+                        RelayRouteMode::Middle
+                    };
+                    let _ = runtime.set_mode(mode);
+                }
+            }));
+        }
+
+        for writer in writers {
+            writer
+                .join()
+                .expect("route mode writer thread must not panic");
+        }
+
+        let rx = runtime.subscribe();
+        for _ in 0..128 {
+            assert_eq!(
+                runtime.snapshot(),
+                *rx.borrow(),
+                "snapshot and watch state must converge after concurrent set_mode churn"
+            );
+            std::thread::yield_now();
+        }
+    });
+}
+
+#[test]
+fn stress_concurrent_transition_count_matches_final_generation() {
+    let runtime = Arc::new(RouteRuntimeController::new(RelayRouteMode::Direct));
+    let successful_transitions = Arc::new(AtomicU64::new(0));
+
+    std::thread::scope(|scope| {
+        let mut workers = Vec::new();
+        for worker in 0..6usize {
+            let runtime = Arc::clone(&runtime);
+            let successful_transitions = Arc::clone(&successful_transitions);
+            workers.push(scope.spawn(move || {
+                let mut state = (worker as u64 + 1).wrapping_mul(0x9E37_79B9_7F4A_7C15);
+                for _ in 0..25_000usize {
+                    state ^= state << 7;
+                    state ^= state >> 9;
+                    state ^= state << 8;
+                    let mode = if (state & 1) == 0 {
+                        RelayRouteMode::Direct
+                    } else {
+                        RelayRouteMode::Middle
+                    };
+                    if runtime.set_mode(mode).is_some() {
+                        successful_transitions.fetch_add(1, Ordering::Relaxed);
+                    }
+                }
+            }));
+        }
+
+        for worker in workers {
+            worker.join().expect("route mode transition worker must not panic");
+        }
+    });
+
+    let final_state = runtime.snapshot();
+    assert_eq!(
+        final_state.generation,
+        successful_transitions.load(Ordering::Relaxed),
+        "final generation must equal number of accepted mode transitions"
+    );
+    assert_eq!(
+        final_state,
+        *runtime.subscribe().borrow(),
+        "watch and snapshot state must match after concurrent transition accounting"
+    );
+}
+
+#[test]
+fn light_fuzz_cutover_stagger_delay_distribution_stays_in_fixed_window() {
+    // Deterministic xorshift fuzzing keeps this test stable across runs.
+    let mut s: u64 = 0x9E37_79B9_7F4A_7C15;
+
+    for _ in 0..20_000 {
+        s ^= s << 7;
+        s ^= s >> 9;
+        s ^= s << 8;
+        let session_id = s;
+
+        s ^= s << 7;
+        s ^= s >> 9;
+        s ^= s << 8;
+        let generation = s;
+
+        let delay = cutover_stagger_delay(session_id, generation);
+        assert!(
+            (1000..=1999).contains(&delay.as_millis()),
+            "fuzzed inputs must always map into fixed stagger window"
+        );
+    }
+}
+
+#[test]
+fn cutover_stagger_delay_distribution_has_no_empty_buckets_under_sequential_sessions() {
+    let mut buckets = [0usize; 1000];
+    let generation = 4242u64;
+
+    for session_id in 0..250_000u64 {
+        let delay_ms = cutover_stagger_delay(session_id, generation).as_millis() as usize;
+        let idx = delay_ms - 1000;
+        buckets[idx] += 1;
+    }
+
+    let empty = buckets.iter().filter(|&&count| count == 0).count();
+    assert_eq!(
+        empty, 0,
+        "all 1000 delay buckets must be exercised to avoid cutover herd clustering"
+    );
+}
+
+#[test]
+fn light_fuzz_cutover_stagger_delay_distribution_stays_reasonably_uniform() {
+    let mut buckets = [0usize; 1000];
+    let mut s: u64 = 0x1BAD_B002_CAFE_F00D;
+
+    for _ in 0..300_000usize {
+        s ^= s << 7;
+        s ^= s >> 9;
+        s ^= s << 8;
+        let session_id = s;
+
+        s ^= s << 7;
+        s ^= s >> 9;
+        s ^= s << 8;
+        let generation = s;
+
+        let delay_ms = cutover_stagger_delay(session_id, generation).as_millis() as usize;
+        buckets[delay_ms - 1000] += 1;
+    }
+
+    let min = *buckets.iter().min().unwrap_or(&0);
+    let max = *buckets.iter().max().unwrap_or(&0);
+    assert!(min > 0, "fuzzed distribution must not leave empty buckets");
+    assert!(
+        max <= min.saturating_mul(3),
+        "bucket skew is too high for anti-herd staggering (max={max}, min={min})"
+    );
+}
+
+#[test]
+fn stress_cutover_stagger_delay_distribution_remains_stable_across_generations() {
+    for generation in [0u64, 1, 7, 31, 255, 1024, u32::MAX as u64, u64::MAX - 1] {
+        let mut buckets = [0usize; 1000];
+        for session_id in 0..100_000u64 {
+            let delay_ms = cutover_stagger_delay(session_id ^ 0x9E37_79B9, generation)
+                .as_millis() as usize;
+            buckets[delay_ms - 1000] += 1;
+        }
+
+        let min = *buckets.iter().min().unwrap_or(&0);
+        let max = *buckets.iter().max().unwrap_or(&0);
+        assert!(
+            max <= min.saturating_mul(4).max(1),
+            "generation={generation}: distribution collapsed (max={max}, min={min})"
+        );
+    }
+}
@@ -1,46 +0,0 @@
-/// Session eviction is intentionally disabled in runtime.
-///
-/// The initial `user+dc` single-lease model caused valid parallel client
-/// connections to evict each other. Keep the API shape for compatibility,
-/// but make it a no-op until a safer policy is introduced.
-
-#[derive(Debug, Clone, Default)]
-pub struct SessionLease;
-
-impl SessionLease {
-    pub fn is_stale(&self) -> bool {
-        false
-    }
-
-    #[allow(dead_code)]
-    pub fn release(&self) {}
-}
-
-pub struct RegistrationResult {
-    pub lease: SessionLease,
-    pub replaced_existing: bool,
-}
-
-pub fn register_session(_user: &str, _dc_idx: i16) -> RegistrationResult {
-    RegistrationResult {
-        lease: SessionLease,
-        replaced_existing: false,
-    }
-}
-
-#[cfg(test)]
-mod tests {
-    use super::*;
-
-    #[test]
-    fn test_session_eviction_disabled_behavior() {
-        let first = register_session("alice", 2);
-        let second = register_session("alice", 2);
-        assert!(!first.replaced_existing);
-        assert!(!second.replaced_existing);
-        assert!(!first.lease.is_stale());
-        assert!(!second.lease.is_stale());
-        first.lease.release();
-        second.lease.release();
-    }
-}
@@ -0,0 +1,265 @@
+use super::*;
+use std::panic::{self, AssertUnwindSafe};
+use std::sync::Arc;
+use std::time::Duration;
+use tokio::sync::Barrier;
+
+#[test]
+fn direct_connection_lease_balances_on_drop() {
+    let stats = Arc::new(Stats::new());
+    assert_eq!(stats.get_current_connections_direct(), 0);
+
+    {
+        let _lease = stats.acquire_direct_connection_lease();
+        assert_eq!(stats.get_current_connections_direct(), 1);
+    }
+
+    assert_eq!(stats.get_current_connections_direct(), 0);
+}
+
+#[test]
+fn middle_connection_lease_balances_on_drop() {
+    let stats = Arc::new(Stats::new());
+    assert_eq!(stats.get_current_connections_me(), 0);
+
+    {
+        let _lease = stats.acquire_me_connection_lease();
+        assert_eq!(stats.get_current_connections_me(), 1);
+    }
+
+    assert_eq!(stats.get_current_connections_me(), 0);
+}
+
+#[test]
+fn connection_lease_disarm_prevents_double_release() {
+    let stats = Arc::new(Stats::new());
+
+    let mut lease = stats.acquire_direct_connection_lease();
+    assert_eq!(stats.get_current_connections_direct(), 1);
+
+    stats.decrement_current_connections_direct();
+    assert_eq!(stats.get_current_connections_direct(), 0);
+
+    lease.disarm();
+    drop(lease);
+
+    assert_eq!(stats.get_current_connections_direct(), 0);
+}
+
+#[test]
+fn direct_connection_lease_balances_on_panic_unwind() {
+    let stats = Arc::new(Stats::new());
+    let stats_for_panic = stats.clone();
+
+    let panic_result = panic::catch_unwind(AssertUnwindSafe(move || {
+        let _lease = stats_for_panic.acquire_direct_connection_lease();
+        panic!("intentional panic to verify lease drop path");
+    }));
+
+    assert!(panic_result.is_err(), "panic must propagate from test closure");
+    assert_eq!(
+        stats.get_current_connections_direct(),
+        0,
+        "panic unwind must release direct route gauge"
+    );
+}
+
+#[test]
+fn middle_connection_lease_balances_on_panic_unwind() {
+    let stats = Arc::new(Stats::new());
+    let stats_for_panic = stats.clone();
+
+    let panic_result = panic::catch_unwind(AssertUnwindSafe(move || {
+        let _lease = stats_for_panic.acquire_me_connection_lease();
+        panic!("intentional panic to verify middle lease drop path");
+    }));
+
+    assert!(panic_result.is_err(), "panic must propagate from test closure");
+    assert_eq!(
+        stats.get_current_connections_me(),
+        0,
+        "panic unwind must release middle route gauge"
+    );
+}
+
+#[tokio::test]
+async fn concurrent_mixed_route_lease_churn_balances_to_zero() {
+    const TASKS: usize = 48;
+    const ITERATIONS_PER_TASK: usize = 256;
+
+    let stats = Arc::new(Stats::new());
+    let barrier = Arc::new(Barrier::new(TASKS));
+    let mut workers = Vec::with_capacity(TASKS);
+
+    for task_idx in 0..TASKS {
+        let stats_for_task = stats.clone();
+        let barrier_for_task = barrier.clone();
+        workers.push(tokio::spawn(async move {
+            barrier_for_task.wait().await;
+            for iter in 0..ITERATIONS_PER_TASK {
+                if (task_idx + iter) % 2 == 0 {
+                    let _lease = stats_for_task.acquire_direct_connection_lease();
+                    tokio::task::yield_now().await;
+                } else {
+                    let _lease = stats_for_task.acquire_me_connection_lease();
+                    tokio::task::yield_now().await;
+                }
+            }
+        }));
+    }
+
+    for worker in workers {
+        worker
+            .await
+            .expect("lease churn worker must not panic");
+    }
+
+    assert_eq!(
+        stats.get_current_connections_direct(),
+        0,
+        "direct route gauge must return to zero after concurrent lease churn"
+    );
+    assert_eq!(
+        stats.get_current_connections_me(),
+        0,
+        "middle route gauge must return to zero after concurrent lease churn"
+    );
+}
+
+#[tokio::test]
+async fn abort_storm_mixed_route_leases_returns_all_gauges_to_zero() {
+    const TASKS: usize = 64;
+
+    let stats = Arc::new(Stats::new());
+    let mut workers = Vec::with_capacity(TASKS);
+
+    for task_idx in 0..TASKS {
+        let stats_for_task = stats.clone();
+        workers.push(tokio::spawn(async move {
+            if task_idx % 2 == 0 {
+                let _lease = stats_for_task.acquire_direct_connection_lease();
+                tokio::time::sleep(Duration::from_secs(60)).await;
+            } else {
+                let _lease = stats_for_task.acquire_me_connection_lease();
+                tokio::time::sleep(Duration::from_secs(60)).await;
+            }
+        }));
+    }
+
+    tokio::time::timeout(Duration::from_secs(2), async {
+        loop {
+            let total = stats.get_current_connections_direct() + stats.get_current_connections_me();
+            if total == TASKS as u64 {
+                break;
+            }
+            tokio::time::sleep(Duration::from_millis(10)).await;
+        }
+    })
+    .await
+    .expect("all storm tasks must acquire route leases before abort");
+
+    for worker in &workers {
+        worker.abort();
+    }
+    for worker in workers {
+        let joined = worker.await;
+        assert!(joined.is_err(), "aborted worker must return join error");
+    }
+
+    tokio::time::timeout(Duration::from_secs(2), async {
+        loop {
+            if stats.get_current_connections_direct() == 0 && stats.get_current_connections_me() == 0 {
+                break;
+            }
+            tokio::time::sleep(Duration::from_millis(10)).await;
+        }
+    })
+    .await
+    .expect("all route gauges must drain to zero after abort storm");
+}
+
+#[test]
+fn saturating_route_decrements_do_not_underflow_under_race() {
+    const THREADS: usize = 16;
+    const DECREMENTS_PER_THREAD: usize = 4096;
+
+    let stats = Arc::new(Stats::new());
+    let mut workers = Vec::with_capacity(THREADS);
+
+    for _ in 0..THREADS {
+        let stats_for_thread = stats.clone();
+        workers.push(std::thread::spawn(move || {
+            for _ in 0..DECREMENTS_PER_THREAD {
+                stats_for_thread.decrement_current_connections_direct();
+                stats_for_thread.decrement_current_connections_me();
+            }
+        }));
+    }
+
+    for worker in workers {
+        worker
+            .join()
+            .expect("decrement race worker must not panic");
+    }
+
+    assert_eq!(
+        stats.get_current_connections_direct(),
+        0,
+        "direct route decrement races must never underflow"
+    );
+    assert_eq!(
+        stats.get_current_connections_me(),
+        0,
+        "middle route decrement races must never underflow"
+    );
+}
+
+#[tokio::test]
+async fn direct_connection_lease_balances_on_task_abort() {
+    let stats = Arc::new(Stats::new());
+    let stats_for_task = stats.clone();
+
+    let task = tokio::spawn(async move {
+        let _lease = stats_for_task.acquire_direct_connection_lease();
+        tokio::time::sleep(Duration::from_secs(60)).await;
+    });
+
+    tokio::time::sleep(Duration::from_millis(20)).await;
+    assert_eq!(stats.get_current_connections_direct(), 1);
+
+    task.abort();
+    let joined = task.await;
+    assert!(joined.is_err(), "aborted task must return a join error");
+
+    tokio::time::sleep(Duration::from_millis(20)).await;
+    assert_eq!(
+        stats.get_current_connections_direct(),
+        0,
+        "aborted task must release direct route gauge"
+    );
+}
+
+#[tokio::test]
+async fn middle_connection_lease_balances_on_task_abort() {
+    let stats = Arc::new(Stats::new());
+    let stats_for_task = stats.clone();
+
+    let task = tokio::spawn(async move {
+        let _lease = stats_for_task.acquire_me_connection_lease();
+        tokio::time::sleep(Duration::from_secs(60)).await;
+    });
+
+    tokio::time::sleep(Duration::from_millis(20)).await;
+    assert_eq!(stats.get_current_connections_me(), 1);
+
+    task.abort();
+    let joined = task.await;
+    assert!(joined.is_err(), "aborted task must return a join error");
+
+    tokio::time::sleep(Duration::from_millis(20)).await;
+    assert_eq!(
+        stats.get_current_connections_me(),
+        0,
+        "aborted task must release middle route gauge"
+    );
+}
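Review note: the tests above all rely on one property, that a lease releases its gauge on every exit path (normal scope exit, panic unwind, task abort) and that a disarmed lease releases nothing. The following is a minimal, self-contained sketch of that RAII pattern. `Gauge` and `GaugeLease` are hypothetical stand-ins, not the project's `Stats` and `RouteConnectionLease`, and the saturating decrement via `fetch_update` is an assumption about how such a decrement could be written, not a copy of `decrement_atomic_saturating`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for one connection gauge inside `Stats`.
#[derive(Default)]
pub struct Gauge {
    current: AtomicU64,
}

impl Gauge {
    pub fn get(&self) -> u64 {
        self.current.load(Ordering::Relaxed)
    }

    /// Saturating decrement: leaves the value at zero instead of wrapping.
    pub fn decrement_saturating(&self) {
        // `checked_sub` yields `None` at zero, which makes `fetch_update`
        // leave the stored value untouched.
        let _ = self
            .current
            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |v| v.checked_sub(1));
    }

    /// Increment the gauge and hand back a guard that undoes it on drop.
    pub fn acquire(self: &Arc<Self>) -> GaugeLease {
        self.current.fetch_add(1, Ordering::Relaxed);
        GaugeLease {
            gauge: Arc::clone(self),
            active: true,
        }
    }
}

#[must_use = "dropping the lease immediately releases the gauge"]
pub struct GaugeLease {
    gauge: Arc<Gauge>,
    active: bool,
}

impl GaugeLease {
    /// Mark the lease as already released so `drop` does nothing.
    pub fn disarm(&mut self) {
        self.active = false;
    }
}

impl Drop for GaugeLease {
    fn drop(&mut self) {
        if self.active {
            self.gauge.decrement_saturating();
        }
    }
}
```

Because the release lives in `Drop`, the compiler runs it on unwinding and when a future holding the lease is dropped by an abort, which is why the panic and abort tests can assert a zero gauge without any explicit cleanup code.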
@@ -6,6 +6,7 @@ pub mod beobachten;
 pub mod telemetry;

 use std::sync::atomic::{AtomicBool, AtomicU8, AtomicU64, Ordering};
+use std::sync::Arc;
 use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};
 use dashmap::DashMap;
 use parking_lot::Mutex;
@@ -19,6 +20,46 @@ use tracing::debug;
 use crate::config::{MeTelemetryLevel, MeWriterPickMode};
 use self::telemetry::TelemetryPolicy;

+#[derive(Clone, Copy)]
+enum RouteConnectionGauge {
+    Direct,
+    Middle,
+}
+
+#[must_use = "RouteConnectionLease must be kept alive to hold the connection gauge increment"]
+pub struct RouteConnectionLease {
+    stats: Arc<Stats>,
+    gauge: RouteConnectionGauge,
+    active: bool,
+}
+
+impl RouteConnectionLease {
+    fn new(stats: Arc<Stats>, gauge: RouteConnectionGauge) -> Self {
+        Self {
+            stats,
+            gauge,
+            active: true,
+        }
+    }
+
+    #[cfg(test)]
+    fn disarm(&mut self) {
+        self.active = false;
+    }
+}
+
+impl Drop for RouteConnectionLease {
+    fn drop(&mut self) {
+        if !self.active {
+            return;
+        }
+        match self.gauge {
+            RouteConnectionGauge::Direct => self.stats.decrement_current_connections_direct(),
+            RouteConnectionGauge::Middle => self.stats.decrement_current_connections_me(),
+        }
+    }
+}
+
 // ============= Stats =============

 #[derive(Default)]
@@ -120,8 +161,6 @@ pub struct Stats {
    pool_swap_total: AtomicU64,
    pool_drain_active: AtomicU64,
    pool_force_close_total: AtomicU64,
-    pool_drain_soft_evict_total: AtomicU64,
-    pool_drain_soft_evict_writer_total: AtomicU64,
    pool_stale_pick_total: AtomicU64,
    me_writer_removed_total: AtomicU64,
    me_writer_removed_unexpected_total: AtomicU64,
@@ -135,11 +174,6 @@ pub struct Stats {
    me_inline_recovery_total: AtomicU64,
    ip_reservation_rollback_tcp_limit_total: AtomicU64,
    ip_reservation_rollback_quota_limit_total: AtomicU64,
-    relay_adaptive_promotions_total: AtomicU64,
-    relay_adaptive_demotions_total: AtomicU64,
-    relay_adaptive_hard_promotions_total: AtomicU64,
-    reconnect_evict_total: AtomicU64,
-    reconnect_stale_close_total: AtomicU64,
    telemetry_core_enabled: AtomicBool,
    telemetry_user_enabled: AtomicBool,
    telemetry_me_level: AtomicU8,
@@ -292,35 +326,15 @@ impl Stats {
    pub fn decrement_current_connections_me(&self) {
        Self::decrement_atomic_saturating(&self.current_connections_me);
    }
-    pub fn increment_relay_adaptive_promotions_total(&self) {
-        if self.telemetry_core_enabled() {
-            self.relay_adaptive_promotions_total
-                .fetch_add(1, Ordering::Relaxed);
-        }
+
+    pub fn acquire_direct_connection_lease(self: &Arc<Self>) -> RouteConnectionLease {
+        self.increment_current_connections_direct();
+        RouteConnectionLease::new(self.clone(), RouteConnectionGauge::Direct)
    }
-    pub fn increment_relay_adaptive_demotions_total(&self) {
-        if self.telemetry_core_enabled() {
-            self.relay_adaptive_demotions_total
-                .fetch_add(1, Ordering::Relaxed);
-        }
-    }
-    pub fn increment_relay_adaptive_hard_promotions_total(&self) {
-        if self.telemetry_core_enabled() {
-            self.relay_adaptive_hard_promotions_total
-                .fetch_add(1, Ordering::Relaxed);
-        }
-    }
-    pub fn increment_reconnect_evict_total(&self) {
-        if self.telemetry_core_enabled() {
-            self.reconnect_evict_total
-                .fetch_add(1, Ordering::Relaxed);
-        }
-    }
-    pub fn increment_reconnect_stale_close_total(&self) {
-        if self.telemetry_core_enabled() {
-            self.reconnect_stale_close_total
-                .fetch_add(1, Ordering::Relaxed);
-        }
+
+    pub fn acquire_me_connection_lease(self: &Arc<Self>) -> RouteConnectionLease {
+        self.increment_current_connections_me();
+        RouteConnectionLease::new(self.clone(), RouteConnectionGauge::Middle)
    }
    pub fn increment_handshake_timeouts(&self) {
        if self.telemetry_core_enabled() {
@@ -717,18 +731,6 @@ impl Stats {
            self.pool_force_close_total.fetch_add(1, Ordering::Relaxed);
        }
    }
-    pub fn increment_pool_drain_soft_evict_total(&self) {
-        if self.telemetry_me_allows_normal() {
-            self.pool_drain_soft_evict_total
-                .fetch_add(1, Ordering::Relaxed);
-        }
-    }
-    pub fn increment_pool_drain_soft_evict_writer_total(&self) {
-        if self.telemetry_me_allows_normal() {
-            self.pool_drain_soft_evict_writer_total
-                .fetch_add(1, Ordering::Relaxed);
-        }
-    }
    pub fn increment_pool_stale_pick_total(&self) {
        if self.telemetry_me_allows_normal() {
            self.pool_stale_pick_total.fetch_add(1, Ordering::Relaxed);
@@ -982,22 +984,6 @@ impl Stats {
        self.get_current_connections_direct()
            .saturating_add(self.get_current_connections_me())
    }
-    pub fn get_relay_adaptive_promotions_total(&self) -> u64 {
-        self.relay_adaptive_promotions_total.load(Ordering::Relaxed)
-    }
-    pub fn get_relay_adaptive_demotions_total(&self) -> u64 {
-        self.relay_adaptive_demotions_total.load(Ordering::Relaxed)
-    }
-    pub fn get_relay_adaptive_hard_promotions_total(&self) -> u64 {
-        self.relay_adaptive_hard_promotions_total
-            .load(Ordering::Relaxed)
-    }
-    pub fn get_reconnect_evict_total(&self) -> u64 {
-        self.reconnect_evict_total.load(Ordering::Relaxed)
-    }
-    pub fn get_reconnect_stale_close_total(&self) -> u64 {
-        self.reconnect_stale_close_total.load(Ordering::Relaxed)
-    }
    pub fn get_me_keepalive_sent(&self) -> u64 { self.me_keepalive_sent.load(Ordering::Relaxed) }
    pub fn get_me_keepalive_failed(&self) -> u64 { self.me_keepalive_failed.load(Ordering::Relaxed) }
    pub fn get_me_keepalive_pong(&self) -> u64 { self.me_keepalive_pong.load(Ordering::Relaxed) }
@@ -1250,12 +1236,6 @@ impl Stats {
    pub fn get_pool_force_close_total(&self) -> u64 {
        self.pool_force_close_total.load(Ordering::Relaxed)
    }
-    pub fn get_pool_drain_soft_evict_total(&self) -> u64 {
-        self.pool_drain_soft_evict_total.load(Ordering::Relaxed)
-    }
-    pub fn get_pool_drain_soft_evict_writer_total(&self) -> u64 {
-        self.pool_drain_soft_evict_writer_total.load(Ordering::Relaxed)
-    }
    pub fn get_pool_stale_pick_total(&self) -> u64 {
        self.pool_stale_pick_total.load(Ordering::Relaxed)
    }
@@ -1327,11 +1307,35 @@ impl Stats {
        Self::touch_user_stats(stats.value());
        stats.curr_connects.fetch_add(1, Ordering::Relaxed);
    }
+
+    pub fn try_acquire_user_curr_connects(&self, user: &str, limit: Option<u64>) -> bool {
+        if !self.telemetry_user_enabled() {
+            return true;
+        }
+
+        self.maybe_cleanup_user_stats();
+        let stats = self.user_stats.entry(user.to_string()).or_default();
+        Self::touch_user_stats(stats.value());
+
+        let counter = &stats.curr_connects;
+        let mut current = counter.load(Ordering::Relaxed);
+        loop {
+            if let Some(max) = limit && current >= max {
+                return false;
+            }
+            match counter.compare_exchange_weak(
+                current,
+                current.saturating_add(1),
+                Ordering::Relaxed,
+                Ordering::Relaxed,
+            ) {
+                Ok(_) => return true,
+                Err(actual) => current = actual,
+            }
+        }
+    }
    
    pub fn decrement_user_curr_connects(&self, user: &str) {
-        if !self.telemetry_user_enabled() {
-            return;
-        }
        self.maybe_cleanup_user_stats();
        if let Some(stats) = self.user_stats.get(user) {
            Self::touch_user_stats(stats.value());
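Review note: `try_acquire_user_curr_connects` folds the limit check and the increment into one compare-and-swap loop, so two racing connections cannot both pass a check made against a stale count. The loop in isolation looks roughly like the sketch below. `try_acquire_slot` is a hypothetical free function written for illustration; it uses a nested `if` instead of the let-chain in the hunk so that it also compiles on editions that lack let-chains.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Try to take one slot from `counter`, refusing once `limit` is reached.
/// `None` means "no limit". Hypothetical helper, not part of the PR.
pub fn try_acquire_slot(counter: &AtomicU64, limit: Option<u64>) -> bool {
    let mut current = counter.load(Ordering::Relaxed);
    loop {
        if let Some(max) = limit {
            if current >= max {
                return false;
            }
        }
        // The swap succeeds only if nobody changed the counter since it was
        // read; on failure, retry against the value actually observed.
        match counter.compare_exchange_weak(
            current,
            current.saturating_add(1),
            Ordering::Relaxed,
            Ordering::Relaxed,
        ) {
            Ok(_) => return true,
            Err(actual) => current = actual,
        }
    }
}
```

A separate "load, compare, then `fetch_add`" sequence would let the counter overshoot the limit under contention; the CAS loop keeps the admitted count at or below `limit`.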
@@ -1504,9 +1508,11 @@ impl Stats {
 // ============= Replay Checker =============

 pub struct ReplayChecker {
-    shards: Vec<Mutex<ReplayShard>>,
+    handshake_shards: Vec<Mutex<ReplayShard>>,
+    tls_shards: Vec<Mutex<ReplayShard>>,
    shard_mask: usize,
    window: Duration,
+    tls_window: Duration,
    checks: AtomicU64,
    hits: AtomicU64,
    additions: AtomicU64,
@@ -1583,19 +1589,24 @@ impl ReplayShard {

 impl ReplayChecker {
    pub fn new(total_capacity: usize, window: Duration) -> Self {
+        const MIN_TLS_REPLAY_WINDOW: Duration = Duration::from_secs(120);
        let num_shards = 64;
        let shard_capacity = (total_capacity / num_shards).max(1);
        let cap = NonZeroUsize::new(shard_capacity).unwrap();

-        let mut shards = Vec::with_capacity(num_shards);
+        let mut handshake_shards = Vec::with_capacity(num_shards);
+        let mut tls_shards = Vec::with_capacity(num_shards);
        for _ in 0..num_shards {
-            shards.push(Mutex::new(ReplayShard::new(cap)));
+            handshake_shards.push(Mutex::new(ReplayShard::new(cap)));
+            tls_shards.push(Mutex::new(ReplayShard::new(cap)));
        }

        Self {
-            shards,
+            handshake_shards,
+            tls_shards,
            shard_mask: num_shards - 1,
            window,
+            tls_window: window.max(MIN_TLS_REPLAY_WINDOW),
            checks: AtomicU64::new(0),
            hits: AtomicU64::new(0),
            additions: AtomicU64::new(0),
@@ -1609,46 +1620,60 @@ impl ReplayChecker {
        (hasher.finish() as usize) & self.shard_mask
    }

-    fn check_and_add_internal(&self, data: &[u8]) -> bool {
+    fn check_and_add_internal(
+        &self,
+        data: &[u8],
+        shards: &[Mutex<ReplayShard>],
+        window: Duration,
+    ) -> bool {
        self.checks.fetch_add(1, Ordering::Relaxed);
        let idx = self.get_shard_idx(data);
-        let mut shard = self.shards[idx].lock();
+        let mut shard = shards[idx].lock();
        let now = Instant::now();
-        let found = shard.check(data, now, self.window);
+        let found = shard.check(data, now, window);
        if found {
            self.hits.fetch_add(1, Ordering::Relaxed);
        } else {
-            shard.add(data, now, self.window);
+            shard.add(data, now, window);
            self.additions.fetch_add(1, Ordering::Relaxed);
        }
        found
    }

-    fn add_only(&self, data: &[u8]) {
+    fn add_only(&self, data: &[u8], shards: &[Mutex<ReplayShard>], window: Duration) {
        self.additions.fetch_add(1, Ordering::Relaxed);
        let idx = self.get_shard_idx(data);
-        let mut shard = self.shards[idx].lock();
-        shard.add(data, Instant::now(), self.window);
+        let mut shard = shards[idx].lock();
+        shard.add(data, Instant::now(), window);
    }

    pub fn check_and_add_handshake(&self, data: &[u8]) -> bool {
-        self.check_and_add_internal(data)
+        self.check_and_add_internal(data, &self.handshake_shards, self.window)
    }

    pub fn check_and_add_tls_digest(&self, data: &[u8]) -> bool {
-        self.check_and_add_internal(data)
+        self.check_and_add_internal(data, &self.tls_shards, self.tls_window)
    }

    // Compatibility helpers (non-atomic split operations) — prefer check_and_add_*.
    pub fn check_handshake(&self, data: &[u8]) -> bool { self.check_and_add_handshake(data) }
-    pub fn add_handshake(&self, data: &[u8]) { self.add_only(data) }
+    pub fn add_handshake(&self, data: &[u8]) {
+        self.add_only(data, &self.handshake_shards, self.window)
+    }
    pub fn check_tls_digest(&self, data: &[u8]) -> bool { self.check_and_add_tls_digest(data) }
-    pub fn add_tls_digest(&self, data: &[u8]) { self.add_only(data) }
+    pub fn add_tls_digest(&self, data: &[u8]) {
+        self.add_only(data, &self.tls_shards, self.tls_window)
+    }
    
    pub fn stats(&self) -> ReplayStats {
        let mut total_entries = 0;
        let mut total_queue_len = 0;
-        for shard in &self.shards {
+        for shard in &self.handshake_shards {
+            let s = shard.lock();
+            total_entries += s.cache.len();
+            total_queue_len += s.queue.len();
+        }
+        for shard in &self.tls_shards {
            let s = shard.lock();
            total_entries += s.cache.len();
            total_queue_len += s.queue.len();
@@ -1661,7 +1686,7 @@ impl ReplayChecker {
            total_hits: self.hits.load(Ordering::Relaxed),
            total_additions: self.additions.load(Ordering::Relaxed),
            total_cleanups: self.cleanups.load(Ordering::Relaxed),
-            num_shards: self.shards.len(),
+            num_shards: self.handshake_shards.len() + self.tls_shards.len(),
            window_secs: self.window.as_secs(),
        }
    }
@@ -1679,13 +1704,20 @@ impl ReplayChecker {
            let now = Instant::now();
            let mut cleaned = 0usize;
            
-            for shard_mutex in &self.shards {
+            for shard_mutex in &self.handshake_shards {
                let mut shard = shard_mutex.lock();
                let before = shard.len();
                shard.cleanup(now, self.window);
                let after = shard.len();
                cleaned += before.saturating_sub(after);
            }
+            for shard_mutex in &self.tls_shards {
+                let mut shard = shard_mutex.lock();
+                let before = shard.len();
+                shard.cleanup(now, self.tls_window);
+                let after = shard.len();
+                cleaned += before.saturating_sub(after);
+            }
            
            self.cleanups.fetch_add(1, Ordering::Relaxed);
            
@@ -1811,7 +1843,7 @@ mod tests {
    fn test_replay_checker_many_keys() {
        let checker = ReplayChecker::new(10_000, Duration::from_secs(60));
        for i in 0..500u32 {
-            checker.add_only(&i.to_le_bytes());
+            checker.add_handshake(&i.to_le_bytes());
        }
        for i in 0..500u32 {
            assert!(checker.check_handshake(&i.to_le_bytes()));
@@ -1819,3 +1851,11 @@ mod tests {
        assert_eq!(checker.stats().total_entries, 500);
    }
 }
+
+#[cfg(test)]
+#[path = "connection_lease_security_tests.rs"]
+mod connection_lease_security_tests;
+
+#[cfg(test)]
+#[path = "replay_checker_security_tests.rs"]
+mod replay_checker_security_tests;
@@ -0,0 +1,80 @@
+use super::*;
+use std::time::Duration;
+
+#[test]
+fn replay_checker_keeps_tls_and_handshake_domains_isolated_for_same_key() {
+    let checker = ReplayChecker::new(128, Duration::from_millis(20));
+    let key = b"same-key-domain-separation";
+
+    assert!(
+        !checker.check_and_add_handshake(key),
+        "first handshake use should be fresh"
+    );
+    assert!(
+        !checker.check_and_add_tls_digest(key),
+        "same bytes in TLS domain should still be fresh"
+    );
+
+    assert!(
+        checker.check_and_add_handshake(key),
+        "second handshake use should be replay-hit"
+    );
+    assert!(
+        checker.check_and_add_tls_digest(key),
+        "second TLS use should be replay-hit independently"
+    );
+}
+
+#[test]
+fn replay_checker_tls_window_is_clamped_beyond_small_handshake_window() {
+    let checker = ReplayChecker::new(128, Duration::from_millis(20));
+    let handshake_key = b"short-window-handshake";
+    let tls_key = b"short-window-tls";
+
+    assert!(!checker.check_and_add_handshake(handshake_key));
+    assert!(!checker.check_and_add_tls_digest(tls_key));
+
+    std::thread::sleep(Duration::from_millis(80));
+
+    assert!(
+        !checker.check_and_add_handshake(handshake_key),
+        "handshake key should expire under short configured window"
+    );
+    assert!(
+        checker.check_and_add_tls_digest(tls_key),
+        "TLS key should still be replay-hit because TLS window is clamped to a secure minimum"
+    );
+}
+
+#[test]
+fn replay_checker_compat_add_paths_do_not_cross_pollute_domains() {
+    let checker = ReplayChecker::new(128, Duration::from_secs(1));
+    let key = b"compat-domain-separation";
+
+    checker.add_handshake(key);
+    assert!(
+        checker.check_and_add_handshake(key),
+        "handshake add helper must populate handshake domain"
+    );
+    assert!(
+        !checker.check_and_add_tls_digest(key),
+        "handshake add helper must not pollute TLS domain"
+    );
+
+    checker.add_tls_digest(key);
+    assert!(
+        checker.check_and_add_tls_digest(key),
+        "TLS add helper must populate TLS domain"
+    );
+}
+
+#[test]
+fn replay_checker_stats_reflect_dual_shard_domains() {
+    let checker = ReplayChecker::new(128, Duration::from_secs(1));
+    let stats = checker.stats();
+
+    assert_eq!(
+        stats.num_shards, 128,
+        "stats should expose both shard domains (handshake + TLS)"
+    );
+}
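Review note: the four tests above pin down two behaviours of the reworked `ReplayChecker`: handshake and TLS keys live in separate domains, and the TLS window is clamped to at least 120 seconds. The sketch below reproduces both in a single-threaded model. `TwoDomainReplay` is a hypothetical type; it replaces the sharded, mutex-protected caches with plain `HashMap`s, takes `now` as a parameter so expiry can be tested without sleeping, and assumes an entry counts as a replay while its age is strictly below the window (the real `ReplayShard::check` may differ).

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical single-threaded model of a two-domain replay cache.
pub struct TwoDomainReplay {
    handshake: HashMap<Vec<u8>, Instant>,
    tls: HashMap<Vec<u8>, Instant>,
    window: Duration,
    tls_window: Duration,
}

impl TwoDomainReplay {
    pub fn new(window: Duration) -> Self {
        // Same clamp as the PR: TLS digests are remembered for >= 120 s.
        const MIN_TLS_WINDOW: Duration = Duration::from_secs(120);
        Self {
            handshake: HashMap::new(),
            tls: HashMap::new(),
            window,
            tls_window: window.max(MIN_TLS_WINDOW),
        }
    }

    /// Returns true if `key` was seen within `window`; otherwise records it.
    fn check_and_add(
        map: &mut HashMap<Vec<u8>, Instant>,
        key: &[u8],
        window: Duration,
        now: Instant,
    ) -> bool {
        let seen = matches!(map.get(key), Some(t) if now.duration_since(*t) < window);
        if !seen {
            map.insert(key.to_vec(), now);
        }
        seen
    }

    pub fn check_and_add_handshake(&mut self, key: &[u8], now: Instant) -> bool {
        Self::check_and_add(&mut self.handshake, key, self.window, now)
    }

    pub fn check_and_add_tls(&mut self, key: &[u8], now: Instant) -> bool {
        Self::check_and_add(&mut self.tls, key, self.tls_window, now)
    }
}
```

With a 20 ms configured window, a key replayed 80 ms later is fresh again in the handshake domain but still a hit in the TLS domain, which is the behaviour `replay_checker_tls_window_is_clamped_beyond_small_handshake_window` asserts.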
@@ -14,7 +14,8 @@ use std::sync::Arc;
 // ============= Configuration =============

 /// Default buffer size
-pub const DEFAULT_BUFFER_SIZE: usize = 64 * 1024;
+/// Reduced from 64 KiB to 16 KiB to match the maximum TLS record payload size and limit bufferbloat.
+pub const DEFAULT_BUFFER_SIZE: usize = 16 * 1024;

 /// Default maximum number of pooled buffers
 pub const DEFAULT_MAX_BUFFERS: usize = 1024;
@@ -513,6 +513,7 @@ impl FrameCodecTrait for SecureCodec {
 #[cfg(test)]
 mod tests {
    use super::*;
+    use std::collections::HashSet;
    use tokio_util::codec::{FramedRead, FramedWrite};
    use tokio::io::duplex;
    use futures::{SinkExt, StreamExt};
@@ -630,4 +631,31 @@ mod tests {
        let result = codec.decode(&mut buf);
        assert!(result.is_err());
    }
+
+    #[test]
+    fn secure_codec_always_adds_padding_and_jitters_wire_length() {
+        let codec = SecureCodec::new(Arc::new(SecureRandom::new()));
+        let payload = Bytes::from_static(&[1, 2, 3, 4, 5, 6, 7, 8]);
+        let mut wire_lens = HashSet::new();
+
+        for _ in 0..64 {
+            let frame = Frame::new(payload.clone());
+            let mut out = BytesMut::new();
+            codec.encode(&frame, &mut out).unwrap();
+
+            assert!(out.len() >= 4 + payload.len() + 1);
+            let wire_len = u32::from_le_bytes([out[0], out[1], out[2], out[3]]) as usize;
+            assert!(
+                (payload.len() + 1..=payload.len() + 3).contains(&wire_len),
+                "Secure wire length must be in payload+1..=payload+3, got {wire_len}"
+            );
+            assert_ne!(wire_len % 4, 0, "Secure wire length must be non-4-aligned");
+            wire_lens.insert(wire_len);
+        }
+
+        assert!(
+            wire_lens.len() >= 2,
+            "Secure padding should create observable wire-length jitter"
+        );
+    }
 }
@@ -117,15 +117,6 @@ pub fn build_emulated_server_hello(
    extensions.extend_from_slice(&0x002bu16.to_be_bytes());
    extensions.extend_from_slice(&(2u16).to_be_bytes());
    extensions.extend_from_slice(&0x0304u16.to_be_bytes());
-    if let Some(alpn_proto) = &alpn {
-        extensions.extend_from_slice(&0x0010u16.to_be_bytes());
-        let list_len: u16 = 1 + alpn_proto.len() as u16;
-        let ext_len: u16 = 2 + list_len;
-        extensions.extend_from_slice(&ext_len.to_be_bytes());
-        extensions.extend_from_slice(&list_len.to_be_bytes());
-        extensions.push(alpn_proto.len() as u8);
-        extensions.extend_from_slice(alpn_proto);
-    }
    let extensions_len = extensions.len() as u16;

    let body_len = 2 + // version
@@ -207,8 +198,22 @@ pub fn build_emulated_server_hello(
    }

    let mut app_data = Vec::new();
+    let alpn_marker = alpn
+        .as_ref()
+        .filter(|p| !p.is_empty() && p.len() <= u8::MAX as usize)
+        .map(|proto| {
+            let proto_list_len = 1usize + proto.len();
+            let ext_data_len = 2usize + proto_list_len;
+            let mut marker = Vec::with_capacity(4 + ext_data_len);
+            marker.extend_from_slice(&0x0010u16.to_be_bytes());
+            marker.extend_from_slice(&(ext_data_len as u16).to_be_bytes());
+            marker.extend_from_slice(&(proto_list_len as u16).to_be_bytes());
+            marker.push(proto.len() as u8);
+            marker.extend_from_slice(proto);
+            marker
+        });
    let mut payload_offset = 0usize;
-    for size in sizes {
+    for (idx, size) in sizes.into_iter().enumerate() {
        let mut rec = Vec::with_capacity(5 + size);
        rec.push(TLS_RECORD_APPLICATION);
        rec.extend_from_slice(&TLS_VERSION);
@@ -233,7 +238,20 @@ pub fn build_emulated_server_hello(
            }
        } else if size > 17 {
            let body_len = size - 17;
-            rec.extend_from_slice(&rng.bytes(body_len));
+            let mut body = Vec::with_capacity(body_len);
+            if idx == 0 && let Some(marker) = &alpn_marker {
+                if marker.len() <= body_len {
+                    body.extend_from_slice(marker);
+                    if body_len > marker.len() {
+                        body.extend_from_slice(&rng.bytes(body_len - marker.len()));
+                    }
+                } else {
+                    body.extend_from_slice(&rng.bytes(body_len));
+                }
+            } else {
+                body.extend_from_slice(&rng.bytes(body_len));
+            }
+            rec.extend_from_slice(&body);
            rec.push(0x16); // inner content type marker (handshake)
            rec.extend_from_slice(&rng.bytes(16)); // AEAD-like tag
        } else {
@@ -245,8 +263,9 @@ pub fn build_emulated_server_hello(
    // --- Combine ---
    // Optional NewSessionTicket mimic records (opaque ApplicationData for fingerprint).
    let mut tickets = Vec::new();
-    if new_session_tickets > 0 {
-        for _ in 0..new_session_tickets {
+    let ticket_count = new_session_tickets.min(4);
+    if ticket_count > 0 {
+        for _ in 0..ticket_count {
            let ticket_len: usize = rng.range(48) + 48;
            let mut rec = Vec::with_capacity(5 + ticket_len);
            rec.push(TLS_RECORD_APPLICATION);
@@ -273,6 +292,10 @@ pub fn build_emulated_server_hello(
    response
 }

+#[cfg(test)]
+#[path = "emulator_security_tests.rs"]
+mod security_tests;
+
 #[cfg(test)]
 mod tests {
    use std::time::SystemTime;
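Review note: the `alpn_marker` built in `build_emulated_server_hello` is a standard ALPN extension body (extension type `0x0010`, extension length, protocol-list length, then one length-prefixed protocol name), and it is dropped for empty names or names longer than 255 bytes. Pulled out into a function, the encoding could look like the sketch below. `encode_alpn_marker` is a hypothetical helper name; the PR builds the marker inline.

```rust
/// Encode an ALPN extension (type 0x0010) carrying a single protocol name.
/// Returns `None` for empty names or names that do not fit a one-byte length.
/// Hypothetical helper mirroring the inline marker construction in the PR.
pub fn encode_alpn_marker(proto: &[u8]) -> Option<Vec<u8>> {
    if proto.is_empty() || proto.len() > u8::MAX as usize {
        return None;
    }
    let proto_list_len = 1 + proto.len(); // one length byte + name
    let ext_data_len = 2 + proto_list_len; // two-byte list length + list
    let mut marker = Vec::with_capacity(4 + ext_data_len);
    marker.extend_from_slice(&0x0010u16.to_be_bytes()); // extension type
    marker.extend_from_slice(&(ext_data_len as u16).to_be_bytes());
    marker.extend_from_slice(&(proto_list_len as u16).to_be_bytes());
    marker.push(proto.len() as u8);
    marker.extend_from_slice(proto);
    Some(marker)
}
```

For `h2` this yields the nine bytes `00 10 00 05 00 03 02 68 32`, the same sequence the security test expects at the start of the first application-data record.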
@@ -0,0 +1,136 @@
+use std::time::SystemTime;
+
+use crate::crypto::SecureRandom;
+use crate::protocol::constants::{TLS_RECORD_APPLICATION, TLS_RECORD_CHANGE_CIPHER, TLS_RECORD_HANDSHAKE};
+use crate::tls_front::emulator::build_emulated_server_hello;
+use crate::tls_front::types::{
+    CachedTlsData, ParsedServerHello, TlsBehaviorProfile, TlsCertPayload, TlsProfileSource,
+};
+
+fn make_cached(cert_payload: Option<TlsCertPayload>) -> CachedTlsData {
+    CachedTlsData {
+        server_hello_template: ParsedServerHello {
+            version: [0x03, 0x03],
+            random: [0u8; 32],
+            session_id: Vec::new(),
+            cipher_suite: [0x13, 0x01],
+            compression: 0,
+            extensions: Vec::new(),
+        },
+        cert_info: None,
+        cert_payload,
+        app_data_records_sizes: vec![64],
+        total_app_data_len: 64,
+        behavior_profile: TlsBehaviorProfile {
+            change_cipher_spec_count: 1,
+            app_data_record_sizes: vec![64],
+            ticket_record_sizes: Vec::new(),
+            source: TlsProfileSource::Default,
+        },
+        fetched_at: SystemTime::now(),
+        domain: "example.com".to_string(),
+    }
+}
+
+fn first_app_data_payload(response: &[u8]) -> &[u8] {
+    let hello_len = u16::from_be_bytes([response[3], response[4]]) as usize;
+    let ccs_start = 5 + hello_len;
+    let ccs_len = u16::from_be_bytes([response[ccs_start + 3], response[ccs_start + 4]]) as usize;
+    let app_start = ccs_start + 5 + ccs_len;
+    let app_len = u16::from_be_bytes([response[app_start + 3], response[app_start + 4]]) as usize;
+    &response[app_start + 5..app_start + 5 + app_len]
+}
+
+#[test]
+fn emulated_server_hello_ignores_oversized_alpn_when_marker_would_not_fit() {
+    let cached = make_cached(None);
+    let rng = SecureRandom::new();
+    let oversized_alpn = vec![0xAB; u8::MAX as usize + 1];
+
+    let response = build_emulated_server_hello(
+        b"secret",
+        &[0x11; 32],
+        &[0x22; 16],
+        &cached,
+        true,
+        &rng,
+        Some(oversized_alpn),
+        0,
+    );
+
+    assert_eq!(response[0], TLS_RECORD_HANDSHAKE);
+    let hello_len = u16::from_be_bytes([response[3], response[4]]) as usize;
+    let ccs_start = 5 + hello_len;
+    assert_eq!(response[ccs_start], TLS_RECORD_CHANGE_CIPHER);
+    let app_start = ccs_start + 6; // assumes a 5-byte record header plus a 1-byte CCS body
+    assert_eq!(response[app_start], TLS_RECORD_APPLICATION);
+
+    let payload = first_app_data_payload(&response);
+    let mut marker_prefix = Vec::new();
+    marker_prefix.extend_from_slice(&0x0010u16.to_be_bytes());
+    marker_prefix.extend_from_slice(&0x0102u16.to_be_bytes());
+    marker_prefix.extend_from_slice(&0x0100u16.to_be_bytes());
+    marker_prefix.push(0xff);
+    marker_prefix.extend_from_slice(&[0xab; 8]);
+    assert!(
+        !payload.starts_with(&marker_prefix),
+        "oversized ALPN must not be partially embedded into the emulated first application record"
+    );
+}
+
+#[test]
+fn emulated_server_hello_embeds_full_alpn_marker_when_body_can_fit() {
+    let cached = make_cached(None);
+    let rng = SecureRandom::new();
+
+    let response = build_emulated_server_hello(
+        b"secret",
+        &[0x31; 32],
+        &[0x41; 16],
+        &cached,
+        true,
+        &rng,
+        Some(b"h2".to_vec()),
+        0,
+    );
+
+    let payload = first_app_data_payload(&response);
+    let expected = [0x00u8, 0x10, 0x00, 0x05, 0x00, 0x03, 0x02, b'h', b'2'];
+    assert!(
+        payload.starts_with(&expected),
+        "when body has enough capacity, emulated first application record must include full ALPN marker"
+    );
+}
+
+#[test]
+fn emulated_server_hello_prefers_cert_payload_over_alpn_marker() {
+    let cert_msg = vec![0x0b, 0x00, 0x00, 0x05, 0x00, 0xaa, 0xbb, 0xcc, 0xdd];
+    let cached = make_cached(Some(TlsCertPayload {
+        cert_chain_der: vec![vec![0x30, 0x01, 0x00]],
+        certificate_message: cert_msg.clone(),
+    }));
+    let rng = SecureRandom::new();
+
+    let response = build_emulated_server_hello(
+        b"secret",
+        &[0x32; 32],
+        &[0x42; 16],
+        &cached,
+        true,
+        &rng,
+        Some(b"h2".to_vec()),
+        0,
+    );
+
+    let payload = first_app_data_payload(&response);
+    let alpn_marker = [0x00u8, 0x10, 0x00, 0x05, 0x00, 0x03, 0x02, b'h', b'2'];
+
+    assert!(
+        payload.starts_with(&cert_msg),
+        "when certificate payload is available, first record must start with cert payload bytes"
+    );
+    assert!(
+        !payload.starts_with(&alpn_marker),
+        "ALPN marker must not displace selected certificate payload"
+    );
+}
@@ -299,11 +299,6 @@ async fn run_update_cycle(
        cfg.general.hardswap,
        cfg.general.me_pool_drain_ttl_secs,
        cfg.general.me_pool_drain_threshold,
-        cfg.general.me_pool_drain_soft_evict_enabled,
-        cfg.general.me_pool_drain_soft_evict_grace_secs,
-        cfg.general.me_pool_drain_soft_evict_per_writer,
-        cfg.general.me_pool_drain_soft_evict_budget_per_core,
-        cfg.general.me_pool_drain_soft_evict_cooldown_ms,
        cfg.general.effective_me_pool_force_close_secs(),
        cfg.general.me_pool_min_fresh_ratio,
        cfg.general.me_hardswap_warmup_delay_min_ms,
@@ -531,11 +526,6 @@ pub async fn me_config_updater(
                    cfg.general.hardswap,
                    cfg.general.me_pool_drain_ttl_secs,
                    cfg.general.me_pool_drain_threshold,
-                    cfg.general.me_pool_drain_soft_evict_enabled,
-                    cfg.general.me_pool_drain_soft_evict_grace_secs,
-                    cfg.general.me_pool_drain_soft_evict_per_writer,
-                    cfg.general.me_pool_drain_soft_evict_budget_per_core,
-                    cfg.general.me_pool_drain_soft_evict_cooldown_ms,
                    cfg.general.effective_me_pool_force_close_secs(),
                    cfg.general.me_pool_min_fresh_ratio,
                    cfg.general.me_hardswap_warmup_delay_min_ms,
@@ -28,8 +28,6 @@ const HEALTH_RECONNECT_BUDGET_MAX: usize = 128;
 const HEALTH_DRAIN_CLOSE_BUDGET_PER_CORE: usize = 16;
 const HEALTH_DRAIN_CLOSE_BUDGET_MIN: usize = 16;
 const HEALTH_DRAIN_CLOSE_BUDGET_MAX: usize = 256;
-const HEALTH_DRAIN_SOFT_EVICT_BUDGET_MIN: usize = 8;
-const HEALTH_DRAIN_SOFT_EVICT_BUDGET_MAX: usize = 256;

 #[derive(Debug, Clone)]
 struct DcFloorPlanEntry {
@@ -68,7 +66,6 @@ pub async fn me_health_monitor(pool: Arc<MePool>, rng: Arc<SecureRandom>, _min_c
    let mut adaptive_recover_until: HashMap<(i32, IpFamily), Instant> = HashMap::new();
    let mut floor_warn_next_allowed: HashMap<(i32, IpFamily), Instant> = HashMap::new();
    let mut drain_warn_next_allowed: HashMap<u64, Instant> = HashMap::new();
-    let mut drain_soft_evict_next_allowed: HashMap<u64, Instant> = HashMap::new();
    let mut degraded_interval = true;
    loop {
        let interval = if degraded_interval {
@@ -78,12 +75,7 @@ pub async fn me_health_monitor(pool: Arc<MePool>, rng: Arc<SecureRandom>, _min_c
        };
        tokio::time::sleep(interval).await;
        pool.prune_closed_writers().await;
-        reap_draining_writers(
-            &pool,
-            &mut drain_warn_next_allowed,
-            &mut drain_soft_evict_next_allowed,
-        )
-        .await;
+        reap_draining_writers(&pool, &mut drain_warn_next_allowed).await;
        let v4_degraded = check_family(
            IpFamily::V4,
            &pool,
@@ -125,7 +117,6 @@ pub async fn me_health_monitor(pool: Arc<MePool>, rng: Arc<SecureRandom>, _min_c
 pub(super) async fn reap_draining_writers(
    pool: &Arc<MePool>,
    warn_next_allowed: &mut HashMap<u64, Instant>,
-    soft_evict_next_allowed: &mut HashMap<u64, Instant>,
 ) {
    let now_epoch_secs = MePool::now_epoch_secs();
    let now = Instant::now();
@@ -133,12 +124,12 @@ pub(super) async fn reap_draining_writers(
    let drain_threshold = pool
        .me_pool_drain_threshold
        .load(std::sync::atomic::Ordering::Relaxed);
-    let writers = pool.writers.read().await.clone();
    let activity = pool.registry.writer_activity_snapshot().await;
-    let mut draining_writers = Vec::new();
+    let mut draining_writers = Vec::<DrainingWriterSnapshot>::new();
    let mut empty_writer_ids = Vec::<u64>::new();
    let mut force_close_writer_ids = Vec::<u64>::new();
-    for writer in writers {
+    let writers = pool.writers.read().await;
+    for writer in writers.iter() {
        if !writer.draining.load(std::sync::atomic::Ordering::Relaxed) {
            continue;
        }
@@ -152,23 +143,38 @@ pub(super) async fn reap_draining_writers(
            empty_writer_ids.push(writer.id);
            continue;
        }
-        draining_writers.push(writer);
+        draining_writers.push(DrainingWriterSnapshot {
+            id: writer.id,
+            writer_dc: writer.writer_dc,
+            addr: writer.addr,
+            generation: writer.generation,
+            created_at: writer.created_at,
+            draining_started_at_epoch_secs: writer
+                .draining_started_at_epoch_secs
+                .load(std::sync::atomic::Ordering::Relaxed),
+            drain_deadline_epoch_secs: writer
+                .drain_deadline_epoch_secs
+                .load(std::sync::atomic::Ordering::Relaxed),
+            allow_drain_fallback: writer
+                .allow_drain_fallback
+                .load(std::sync::atomic::Ordering::Relaxed),
+        });
    }
+    drop(writers);

-    if drain_threshold > 0 && draining_writers.len() > drain_threshold as usize {
+    let overflow = if drain_threshold > 0 && draining_writers.len() > drain_threshold as usize {
+        draining_writers.len().saturating_sub(drain_threshold as usize)
+    } else {
+        0
+    };
+
+    if overflow > 0 {
        draining_writers.sort_by(|left, right| {
-            let left_started = left
-                .draining_started_at_epoch_secs
-                .load(std::sync::atomic::Ordering::Relaxed);
-            let right_started = right
-                .draining_started_at_epoch_secs
-                .load(std::sync::atomic::Ordering::Relaxed);
-            left_started
-                .cmp(&right_started)
+            left.draining_started_at_epoch_secs
+                .cmp(&right.draining_started_at_epoch_secs)
                .then_with(|| left.created_at.cmp(&right.created_at))
                .then_with(|| left.id.cmp(&right.id))
        });
-        let overflow = draining_writers.len().saturating_sub(drain_threshold as usize);
        warn!(
            draining_writers = draining_writers.len(),
            me_pool_drain_threshold = drain_threshold,
@@ -180,15 +186,10 @@ pub(super) async fn reap_draining_writers(
        }
    }

-    let mut active_draining_writer_ids = HashSet::with_capacity(draining_writers.len());
-    for writer in &draining_writers {
-        active_draining_writer_ids.insert(writer.id);
-        let drain_started_at_epoch_secs = writer
-            .draining_started_at_epoch_secs
-            .load(std::sync::atomic::Ordering::Relaxed);
+    for writer in draining_writers {
        if drain_ttl_secs > 0
-            && drain_started_at_epoch_secs != 0
-            && now_epoch_secs.saturating_sub(drain_started_at_epoch_secs) > drain_ttl_secs
+            && writer.draining_started_at_epoch_secs != 0
+            && now_epoch_secs.saturating_sub(writer.draining_started_at_epoch_secs) > drain_ttl_secs
            && should_emit_writer_warn(
                warn_next_allowed,
                writer.id,
@@ -203,99 +204,14 @@ pub(super) async fn reap_draining_writers(
                generation = writer.generation,
                drain_ttl_secs,
                force_close_secs = pool.me_pool_force_close_secs.load(std::sync::atomic::Ordering::Relaxed),
-                allow_drain_fallback = writer.allow_drain_fallback.load(std::sync::atomic::Ordering::Relaxed),
+                allow_drain_fallback = writer.allow_drain_fallback,
                "ME draining writer remains non-empty past drain TTL"
            );
        }
-        let deadline_epoch_secs = writer
-            .drain_deadline_epoch_secs
-            .load(std::sync::atomic::Ordering::Relaxed);
-        if deadline_epoch_secs != 0 && now_epoch_secs >= deadline_epoch_secs {
+        if writer.drain_deadline_epoch_secs != 0 && now_epoch_secs >= writer.drain_deadline_epoch_secs
+        {
            warn!(writer_id = writer.id, "Drain timeout, force-closing");
            force_close_writer_ids.push(writer.id);
-            active_draining_writer_ids.remove(&writer.id);
-        }
-    }
-
-    warn_next_allowed.retain(|writer_id, _| active_draining_writer_ids.contains(writer_id));
-    soft_evict_next_allowed.retain(|writer_id, _| active_draining_writer_ids.contains(writer_id));
-
-    if pool.drain_soft_evict_enabled() && drain_ttl_secs > 0 && !draining_writers.is_empty() {
-        let mut force_close_ids = HashSet::<u64>::with_capacity(force_close_writer_ids.len());
-        for writer_id in &force_close_writer_ids {
-            force_close_ids.insert(*writer_id);
-        }
-        let soft_grace_secs = pool.drain_soft_evict_grace_secs();
-        let soft_trigger_age_secs = drain_ttl_secs.saturating_add(soft_grace_secs);
-        let per_writer_limit = pool.drain_soft_evict_per_writer();
-        let soft_budget = health_drain_soft_evict_budget(pool);
-        let soft_cooldown = pool.drain_soft_evict_cooldown();
-        let mut soft_evicted_total = 0usize;
-
-        for writer in &draining_writers {
-            if soft_evicted_total >= soft_budget {
-                break;
-            }
-            if force_close_ids.contains(&writer.id) {
-                continue;
-            }
-            if pool.writer_accepts_new_binding(writer) {
-                continue;
-            }
-            let started_epoch_secs = writer
-                .draining_started_at_epoch_secs
-                .load(std::sync::atomic::Ordering::Relaxed);
-            if started_epoch_secs == 0
-                || now_epoch_secs.saturating_sub(started_epoch_secs) < soft_trigger_age_secs
-            {
-                continue;
-            }
-            if !should_emit_writer_warn(
-                soft_evict_next_allowed,
-                writer.id,
-                now,
-                soft_cooldown,
-            ) {
-                continue;
-            }
-
-            let remaining_budget = soft_budget.saturating_sub(soft_evicted_total);
-            let limit = per_writer_limit.min(remaining_budget);
-            if limit == 0 {
-                break;
-            }
-            let conn_ids = pool
-                .registry
-                .bound_conn_ids_for_writer_limited(writer.id, limit)
-                .await;
-            if conn_ids.is_empty() {
-                continue;
-            }
-
-            let mut evicted_for_writer = 0usize;
-            for conn_id in conn_ids {
-                if pool.registry.evict_bound_conn_if_writer(conn_id, writer.id).await {
-                    evicted_for_writer = evicted_for_writer.saturating_add(1);
-                    soft_evicted_total = soft_evicted_total.saturating_add(1);
-                    pool.stats.increment_pool_drain_soft_evict_total();
-                    if soft_evicted_total >= soft_budget {
-                        break;
-                    }
-                }
-            }
-
-            if evicted_for_writer > 0 {
-                pool.stats.increment_pool_drain_soft_evict_writer_total();
-                info!(
-                    writer_id = writer.id,
-                    writer_dc = writer.writer_dc,
-                    endpoint = %writer.addr,
-                    drained_connections = evicted_for_writer,
-                    soft_budget,
-                    soft_trigger_age_secs,
-                    "ME draining writer soft-evicted bound clients"
-                );
-            }
        }
    }

@@ -323,7 +239,9 @@ pub(super) async fn reap_draining_writers(
        if !closed_writer_ids.insert(writer_id) {
            continue;
        }
-        pool.remove_writer_and_close_clients(writer_id).await;
+        if !pool.remove_writer_if_empty(writer_id).await {
+            continue;
+        }
        closed_total = closed_total.saturating_add(1);
    }

@@ -336,6 +254,18 @@ pub(super) async fn reap_draining_writers(
            "ME draining close backlog deferred to next health cycle"
        );
    }
+
+    // Keep warn cooldown state for draining writers still present in the pool;
+    // drop state once a writer is removed or is no longer draining.
+    let active_draining_writer_ids = {
+        let writers = pool.writers.read().await;
+        writers
+            .iter()
+            .filter(|writer| writer.draining.load(std::sync::atomic::Ordering::Relaxed))
+            .map(|writer| writer.id)
+            .collect::<HashSet<u64>>()
+    };
+    warn_next_allowed.retain(|writer_id, _| active_draining_writer_ids.contains(writer_id));
 }

 pub(super) fn health_drain_close_budget() -> usize {
@@ -347,17 +277,16 @@ pub(super) fn health_drain_close_budget() -> usize {
        .clamp(HEALTH_DRAIN_CLOSE_BUDGET_MIN, HEALTH_DRAIN_CLOSE_BUDGET_MAX)
 }

-pub(super) fn health_drain_soft_evict_budget(pool: &MePool) -> usize {
-    let cpu_cores = std::thread::available_parallelism()
-        .map(std::num::NonZeroUsize::get)
-        .unwrap_or(1);
-    let per_core = pool.drain_soft_evict_budget_per_core();
-    cpu_cores
-        .saturating_mul(per_core)
-        .clamp(
-            HEALTH_DRAIN_SOFT_EVICT_BUDGET_MIN,
-            HEALTH_DRAIN_SOFT_EVICT_BUDGET_MAX,
-        )
+#[derive(Debug, Clone)]
+struct DrainingWriterSnapshot {
+    id: u64,
+    writer_dc: i32,
+    addr: SocketAddr,
+    generation: u64,
+    created_at: Instant,
+    draining_started_at_epoch_secs: u64,
+    drain_deadline_epoch_secs: u64,
+    allow_drain_fallback: bool,
 }

 fn should_emit_writer_warn(
@@ -1493,6 +1422,15 @@ mod tests {
            me_pool_drain_threshold,
            ..GeneralConfig::default()
        };
+        let mut proxy_map_v4 = HashMap::new();
+        proxy_map_v4.insert(
+            2,
+            vec![(IpAddr::V4(Ipv4Addr::new(203, 0, 113, 10)), 443)],
+        );
+        let decision = NetworkDecision {
+            ipv4_me: true,
+            ..NetworkDecision::default()
+        };
        MePool::new(
            None,
            vec![1u8; 32],
@@ -1504,10 +1442,10 @@ mod tests {
            None,
            12,
            1200,
-            HashMap::new(),
+            proxy_map_v4,
            HashMap::new(),
            None,
-            NetworkDecision::default(),
+            decision,
            None,
            Arc::new(SecureRandom::new()),
            Arc::new(Stats::default()),
@@ -1545,11 +1483,6 @@ mod tests {
            general.hardswap,
            general.me_pool_drain_ttl_secs,
            general.me_pool_drain_threshold,
-            general.me_pool_drain_soft_evict_enabled,
-            general.me_pool_drain_soft_evict_grace_secs,
-            general.me_pool_drain_soft_evict_per_writer,
-            general.me_pool_drain_soft_evict_budget_per_core,
-            general.me_pool_drain_soft_evict_cooldown_ms,
            general.effective_me_pool_force_close_secs(),
            general.me_pool_min_fresh_ratio,
            general.me_hardswap_warmup_delay_min_ms,
@@ -1623,19 +1556,66 @@ mod tests {
        conn_id
    }

+    async fn insert_live_writer(pool: &Arc<MePool>, writer_id: u64, writer_dc: i32) {
+        let (tx, _writer_rx) = mpsc::channel::<WriterCommand>(8);
+        let writer = MeWriter {
+            id: writer_id,
+            addr: SocketAddr::new(
+                IpAddr::V4(Ipv4Addr::new(203, 0, 113, (writer_id as u8).saturating_add(1))),
+                4000 + writer_id as u16,
+            ),
+            source_ip: IpAddr::V4(Ipv4Addr::LOCALHOST),
+            writer_dc,
+            generation: 2,
+            contour: Arc::new(AtomicU8::new(WriterContour::Active.as_u8())),
+            created_at: Instant::now(),
+            tx: tx.clone(),
+            cancel: CancellationToken::new(),
+            degraded: Arc::new(AtomicBool::new(false)),
+            rtt_ema_ms_x10: Arc::new(AtomicU32::new(0)),
+            draining: Arc::new(AtomicBool::new(false)),
+            draining_started_at_epoch_secs: Arc::new(AtomicU64::new(0)),
+            drain_deadline_epoch_secs: Arc::new(AtomicU64::new(0)),
+            allow_drain_fallback: Arc::new(AtomicBool::new(false)),
+        };
+        pool.writers.write().await.push(writer);
+        pool.registry.register_writer(writer_id, tx).await;
+        pool.conn_count.fetch_add(1, Ordering::Relaxed);
+    }
+
    #[tokio::test]
    async fn reap_draining_writers_force_closes_oldest_over_threshold() {
+        let pool = make_pool(2).await;
+        insert_live_writer(&pool, 1, 2).await;
+        let now_epoch_secs = MePool::now_epoch_secs();
+        let conn_a = insert_draining_writer(&pool, 10, now_epoch_secs.saturating_sub(30)).await;
+        let conn_b = insert_draining_writer(&pool, 20, now_epoch_secs.saturating_sub(20)).await;
+        let conn_c = insert_draining_writer(&pool, 30, now_epoch_secs.saturating_sub(10)).await;
+        let mut warn_next_allowed = HashMap::new();
+
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;
+
+        let mut writer_ids: Vec<u64> = pool.writers.read().await.iter().map(|writer| writer.id).collect();
+        writer_ids.sort_unstable();
+        assert_eq!(writer_ids, vec![1, 20, 30]);
+        assert!(pool.registry.get_writer(conn_a).await.is_none());
+        assert_eq!(pool.registry.get_writer(conn_b).await.unwrap().writer_id, 20);
+        assert_eq!(pool.registry.get_writer(conn_c).await.unwrap().writer_id, 30);
+    }
+
+    #[tokio::test]
+    async fn reap_draining_writers_force_closes_overflow_without_replacement() {
        let pool = make_pool(2).await;
        let now_epoch_secs = MePool::now_epoch_secs();
        let conn_a = insert_draining_writer(&pool, 10, now_epoch_secs.saturating_sub(30)).await;
        let conn_b = insert_draining_writer(&pool, 20, now_epoch_secs.saturating_sub(20)).await;
        let conn_c = insert_draining_writer(&pool, 30, now_epoch_secs.saturating_sub(10)).await;
        let mut warn_next_allowed = HashMap::new();
-        let mut soft_evict_next_allowed = HashMap::new();

-        reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;

-        let writer_ids: Vec<u64> = pool.writers.read().await.iter().map(|writer| writer.id).collect();
+        let mut writer_ids: Vec<u64> = pool.writers.read().await.iter().map(|writer| writer.id).collect();
+        writer_ids.sort_unstable();
        assert_eq!(writer_ids, vec![20, 30]);
        assert!(pool.registry.get_writer(conn_a).await.is_none());
        assert_eq!(pool.registry.get_writer(conn_b).await.unwrap().writer_id, 20);
@@ -1650,9 +1630,8 @@ mod tests {
        let conn_b = insert_draining_writer(&pool, 20, now_epoch_secs.saturating_sub(20)).await;
        let conn_c = insert_draining_writer(&pool, 30, now_epoch_secs.saturating_sub(10)).await;
        let mut warn_next_allowed = HashMap::new();
-        let mut soft_evict_next_allowed = HashMap::new();

-        reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;

        let writer_ids: Vec<u64> = pool.writers.read().await.iter().map(|writer| writer.id).collect();
        assert_eq!(writer_ids, vec![10, 20, 30]);
@@ -1,4 +1,5 @@
 use std::collections::HashMap;
+use std::collections::HashSet;
 use std::net::{IpAddr, Ipv4Addr, SocketAddr};
 use std::sync::Arc;
 use std::sync::atomic::{AtomicBool, AtomicU8, AtomicU32, AtomicU64, Ordering};
@@ -82,11 +83,6 @@ async fn make_pool(
        general.hardswap,
        general.me_pool_drain_ttl_secs,
        general.me_pool_drain_threshold,
-        general.me_pool_drain_soft_evict_enabled,
-        general.me_pool_drain_soft_evict_grace_secs,
-        general.me_pool_drain_soft_evict_per_writer,
-        general.me_pool_drain_soft_evict_budget_per_core,
-        general.me_pool_drain_soft_evict_cooldown_ms,
        general.effective_me_pool_force_close_secs(),
        general.me_pool_min_fresh_ratio,
        general.me_hardswap_warmup_delay_min_ms,
@@ -186,15 +182,48 @@ async fn sorted_writer_ids(pool: &Arc<MePool>) -> Vec<u64> {
    ids
 }

+fn lcg_next(state: &mut u64) -> u64 {
+    *state = state.wrapping_mul(6364136223846793005).wrapping_add(1);
+    *state
+}
+
+async fn draining_writer_ids(pool: &Arc<MePool>) -> HashSet<u64> {
+    pool.writers
+        .read()
+        .await
+        .iter()
+        .filter(|writer| writer.draining.load(Ordering::Relaxed))
+        .map(|writer| writer.id)
+        .collect::<HashSet<u64>>()
+}
+
+async fn set_writer_runtime_state(
+    pool: &Arc<MePool>,
+    writer_id: u64,
+    draining: bool,
+    drain_started_at_epoch_secs: u64,
+    drain_deadline_epoch_secs: u64,
+) {
+    let writers = pool.writers.read().await;
+    if let Some(writer) = writers.iter().find(|writer| writer.id == writer_id) {
+        writer.draining.store(draining, Ordering::Relaxed);
+        writer
+            .draining_started_at_epoch_secs
+            .store(drain_started_at_epoch_secs, Ordering::Relaxed);
+        writer
+            .drain_deadline_epoch_secs
+            .store(drain_deadline_epoch_secs, Ordering::Relaxed);
+    }
+}
+
 #[tokio::test]
 async fn reap_draining_writers_clears_warn_state_when_pool_empty() {
    let (pool, _rng) = make_pool(128, 1, 1).await;
    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();
    warn_next_allowed.insert(11, Instant::now() + Duration::from_secs(5));
    warn_next_allowed.insert(22, Instant::now() + Duration::from_secs(5));

-    reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;

    assert!(warn_next_allowed.is_empty());
 }
@@ -203,8 +232,6 @@ async fn reap_draining_writers_clears_warn_state_when_pool_empty() {
 async fn reap_draining_writers_respects_threshold_across_multiple_overflow_cycles() {
    let threshold = 3u64;
    let (pool, _rng) = make_pool(threshold, 1, 1).await;
-    pool.me_pool_drain_soft_evict_enabled
-        .store(false, Ordering::Relaxed);
    let now_epoch_secs = MePool::now_epoch_secs();

    for writer_id in 1..=60u64 {
@@ -219,9 +246,8 @@ async fn reap_draining_writers_respects_threshold_across_multiple_overflow_cycle
    }

    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();
    for _ in 0..64 {
-        reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;
        if writer_count(&pool).await <= threshold as usize {
            break;
        }
@@ -249,12 +275,11 @@ async fn reap_draining_writers_handles_large_empty_writer_population() {
    }

    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();
    for _ in 0..24 {
        if writer_count(&pool).await == 0 {
            break;
        }
-        reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;
    }

    assert_eq!(writer_count(&pool).await, 0);
@@ -278,12 +303,11 @@ async fn reap_draining_writers_processes_mass_deadline_expiry_without_unbounded_
    }

    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();
    for _ in 0..40 {
        if writer_count(&pool).await == 0 {
            break;
        }
-        reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;
    }

    assert_eq!(writer_count(&pool).await, 0);
@@ -294,7 +318,6 @@ async fn reap_draining_writers_maintains_warn_state_subset_property_under_bulk_c
    let (pool, _rng) = make_pool(128, 1, 1).await;
    let now_epoch_secs = MePool::now_epoch_secs();
    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();

    for wave in 0..40u64 {
        for offset in 0..8u64 {
@@ -308,7 +331,7 @@ async fn reap_draining_writers_maintains_warn_state_subset_property_under_bulk_c
            .await;
        }

-        reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;
        assert!(warn_next_allowed.len() <= writer_count(&pool).await);

        let ids = sorted_writer_ids(&pool).await;
@@ -316,7 +339,7 @@ async fn reap_draining_writers_maintains_warn_state_subset_property_under_bulk_c
            let _ = pool.remove_writer_and_close_clients(writer_id).await;
        }

-        reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;
        assert!(warn_next_allowed.len() <= writer_count(&pool).await);
    }
 }
@@ -338,10 +361,9 @@ async fn reap_draining_writers_budgeted_cleanup_never_increases_pool_size() {
    }

    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();
    let mut previous = writer_count(&pool).await;
    for _ in 0..32 {
-        reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;
        let current = writer_count(&pool).await;
        assert!(current <= previous);
        previous = current;
@@ -443,6 +465,149 @@ async fn me_health_monitor_eliminates_mixed_empty_and_deadline_backlog() {
    assert!(writer_count(&pool).await <= threshold as usize);
 }

+#[tokio::test]
+async fn reap_draining_writers_deterministic_mixed_state_churn_preserves_invariants() {
+    let threshold = 9u64;
+    let (pool, _rng) = make_pool(threshold, 1, 1).await;
+    let mut warn_next_allowed = HashMap::new();
+    let mut seed = 0x9E37_79B9_7F4A_7C15u64;
+    let mut next_writer_id = 20_000u64;
+    let now_epoch_secs = MePool::now_epoch_secs();
+
+    for writer_id in 1..=72u64 {
+        let bound_clients = if writer_id % 4 == 0 { 0 } else { 1 };
+        let deadline = if writer_id % 5 == 0 {
+            now_epoch_secs.saturating_sub(1)
+        } else {
+            0
+        };
+        insert_draining_writer(
+            &pool,
+            writer_id,
+            now_epoch_secs.saturating_sub(500).saturating_add(writer_id),
+            bound_clients,
+            deadline,
+        )
+        .await;
+    }
+
+    for _round in 0..90 {
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;
+
+        let draining_ids = draining_writer_ids(&pool).await;
+        assert!(
+            warn_next_allowed.keys().all(|id| draining_ids.contains(id)),
+            "warn-state keys must always be a subset of live draining writers"
+        );
+
+        let writer_ids = sorted_writer_ids(&pool).await;
+        if writer_ids.is_empty() {
+            continue;
+        }
+
+        let remove_n = (lcg_next(&mut seed) % 3) as usize;
+        for writer_id in writer_ids.iter().copied().take(remove_n) {
+            let _ = pool.remove_writer_and_close_clients(writer_id).await;
+        }
+
+        let survivors = sorted_writer_ids(&pool).await;
+        if !survivors.is_empty() {
+            let idx = (lcg_next(&mut seed) as usize) % survivors.len();
+            let target = survivors[idx];
+            set_writer_runtime_state(&pool, target, false, 0, 0).await;
+        }
+
+        let survivors = sorted_writer_ids(&pool).await;
+        if survivors.len() > 1 {
+            let idx = (lcg_next(&mut seed) as usize) % survivors.len();
+            let target = survivors[idx];
+            let expired_deadline = if lcg_next(&mut seed) & 1 == 0 {
+                now_epoch_secs.saturating_sub(1)
+            } else {
+                0
+            };
+            set_writer_runtime_state(
+                &pool,
+                target,
+                true,
+                now_epoch_secs.saturating_sub(120),
+                expired_deadline,
+            )
+            .await;
+        }
+
+        let inject_n = (lcg_next(&mut seed) % 4) as usize;
+        for _ in 0..inject_n {
+            let bound_clients = if lcg_next(&mut seed) & 1 == 0 { 0 } else { 1 };
+            let deadline = if lcg_next(&mut seed) & 1 == 0 {
+                now_epoch_secs.saturating_sub(1)
+            } else {
+                0
+            };
+            insert_draining_writer(
+                &pool,
+                next_writer_id,
+                now_epoch_secs.saturating_sub(240),
+                bound_clients,
+                deadline,
+            )
+            .await;
+            next_writer_id = next_writer_id.saturating_add(1);
+        }
+    }
+
+    for _ in 0..64 {
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;
+        if writer_count(&pool).await <= threshold as usize {
+            break;
+        }
+    }
+
+    assert!(writer_count(&pool).await <= threshold as usize);
+    let draining_ids = draining_writer_ids(&pool).await;
+    assert!(warn_next_allowed.keys().all(|id| draining_ids.contains(id)));
+}
+
+#[tokio::test]
+async fn reap_draining_writers_repeated_draining_flips_never_leave_stale_warn_state() {
+    let (pool, _rng) = make_pool(64, 1, 1).await;
+    let now_epoch_secs = MePool::now_epoch_secs();
+
+    for writer_id in 1..=24u64 {
+        insert_draining_writer(
+            &pool,
+            writer_id,
+            now_epoch_secs.saturating_sub(240),
+            1,
+            0,
+        )
+        .await;
+    }
+
+    let mut warn_next_allowed = HashMap::new();
+    for round in 0..48u64 {
+        for writer_id in 1..=24u64 {
+            let draining = (writer_id + round) % 3 != 0;
+            set_writer_runtime_state(
+                &pool,
+                writer_id,
+                draining,
+                now_epoch_secs.saturating_sub(120),
+                0,
+            )
+            .await;
+        }
+
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;
+
+        let draining_ids = draining_writer_ids(&pool).await;
+        assert!(
+            warn_next_allowed.keys().all(|id| draining_ids.contains(id)),
+            "warn-state map must not retain entries for writers outside draining set"
+        );
+    }
+}
+
 #[test]
 fn health_drain_close_budget_is_within_expected_bounds() {
    let budget = health_drain_close_budget();
@@ -81,11 +81,6 @@ async fn make_pool(
        general.hardswap,
        general.me_pool_drain_ttl_secs,
        general.me_pool_drain_threshold,
-        general.me_pool_drain_soft_evict_enabled,
-        general.me_pool_drain_soft_evict_grace_secs,
-        general.me_pool_drain_soft_evict_per_writer,
-        general.me_pool_drain_soft_evict_budget_per_core,
-        general.me_pool_drain_soft_evict_cooldown_ms,
        general.effective_me_pool_force_close_secs(),
        general.me_pool_min_fresh_ratio,
        general.me_hardswap_warmup_delay_min_ms,
@@ -166,6 +161,20 @@ async fn insert_draining_writer(
    }
 }

+async fn wait_for_pool_empty(pool: &Arc<MePool>, timeout: Duration) {
+    let start = Instant::now();
+    loop {
+        if pool.writers.read().await.is_empty() {
+            return;
+        }
+        assert!(
+            start.elapsed() < timeout,
+            "timed out waiting for pool.writers to become empty"
+        );
+        tokio::time::sleep(Duration::from_millis(5)).await;
+    }
+}
+
 #[tokio::test]
 async fn me_health_monitor_drains_expired_backlog_over_multiple_cycles() {
    let (pool, rng) = make_pool(128, 1, 1).await;
@@ -183,7 +192,7 @@ async fn me_health_monitor_drains_expired_backlog_over_multiple_cycles() {
    }

    let monitor = tokio::spawn(me_health_monitor(pool.clone(), rng, 0));
-    tokio::time::sleep(Duration::from_millis(60)).await;
+    wait_for_pool_empty(&pool, Duration::from_secs(1)).await;
    monitor.abort();
    let _ = monitor.await;

@@ -199,7 +208,7 @@ async fn me_health_monitor_cleans_empty_draining_writers_without_force_close() {
    }

    let monitor = tokio::spawn(me_health_monitor(pool.clone(), rng, 0));
-    tokio::time::sleep(Duration::from_millis(30)).await;
+    wait_for_pool_empty(&pool, Duration::from_secs(1)).await;
    monitor.abort();
    let _ = monitor.await;

@@ -224,7 +233,7 @@ async fn me_health_monitor_converges_retry_like_threshold_backlog_to_empty() {
    }

    let monitor = tokio::spawn(me_health_monitor(pool.clone(), rng, 0));
-    tokio::time::sleep(Duration::from_millis(60)).await;
+    wait_for_pool_empty(&pool, Duration::from_secs(1)).await;
    monitor.abort();
    let _ = monitor.await;
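
Reviewer note: the hunks above replace fixed `tokio::time::sleep` delays with `wait_for_pool_empty`, which polls until the condition holds or a deadline passes. The same pattern can be sketched without an async runtime. This is a minimal synchronous sketch, not code from this PR: the name `wait_until` and its parameters are hypothetical, and it returns a `bool` where the PR's helper asserts.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Polls `condition` every `interval` until it returns `true` or `timeout`
/// has elapsed. Returns `true` if the condition was observed, `false` on
/// timeout. (Hypothetical helper, illustrating the polling pattern only.)
pub fn wait_until<F: FnMut() -> bool>(
    mut condition: F,
    timeout: Duration,
    interval: Duration,
) -> bool {
    let start = Instant::now();
    loop {
        // Check first so an already-satisfied condition returns immediately.
        if condition() {
            return true;
        }
        // Give up once the deadline has passed.
        if start.elapsed() >= timeout {
            return false;
        }
        thread::sleep(interval);
    }
}
```

Compared with a fixed sleep, a polling wait finishes as soon as the condition holds and only pays the full timeout on failure, which tends to make timing-sensitive tests both faster and less flaky.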

@@ -39,7 +39,7 @@ async fn make_pool(me_pool_drain_threshold: u64) -> Arc<MePool> {
        NetworkDecision::default(),
        None,
        Arc::new(SecureRandom::new()),
-        Arc::new(Stats::new()),
+        Arc::new(Stats::default()),
        general.me_keepalive_enabled,
        general.me_keepalive_interval_secs,
        general.me_keepalive_jitter_secs,
@@ -74,11 +74,6 @@ async fn make_pool(me_pool_drain_threshold: u64) -> Arc<MePool> {
        general.hardswap,
        general.me_pool_drain_ttl_secs,
        general.me_pool_drain_threshold,
-        general.me_pool_drain_soft_evict_enabled,
-        general.me_pool_drain_soft_evict_grace_secs,
-        general.me_pool_drain_soft_evict_per_writer,
-        general.me_pool_drain_soft_evict_budget_per_core,
-        general.me_pool_drain_soft_evict_cooldown_ms,
        general.effective_me_pool_force_close_secs(),
        general.me_pool_min_fresh_ratio,
        general.me_hardswap_warmup_delay_min_ms,
@@ -173,6 +168,21 @@ async fn current_writer_ids(pool: &Arc<MePool>) -> Vec<u64> {
    writer_ids
 }

+async fn writer_exists(pool: &Arc<MePool>, writer_id: u64) -> bool {
+    pool.writers
+        .read()
+        .await
+        .iter()
+        .any(|writer| writer.id == writer_id)
+}
+
+async fn set_writer_draining(pool: &Arc<MePool>, writer_id: u64, draining: bool) {
+    let writers = pool.writers.read().await;
+    if let Some(writer) = writers.iter().find(|writer| writer.id == writer_id) {
+        writer.draining.store(draining, Ordering::Relaxed);
+    }
+}
+
 #[tokio::test]
 async fn reap_draining_writers_drops_warn_state_for_removed_writer() {
    let pool = make_pool(128).await;
@@ -180,15 +190,14 @@ async fn reap_draining_writers_drops_warn_state_for_removed_writer() {
    let conn_ids =
        insert_draining_writer(&pool, 7, now_epoch_secs.saturating_sub(180), 1, 0).await;
    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();

-    reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;
    assert!(warn_next_allowed.contains_key(&7));

    let _ = pool.remove_writer_and_close_clients(7).await;
    assert!(pool.registry.get_writer(conn_ids[0]).await.is_none());

-    reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;
    assert!(!warn_next_allowed.contains_key(&7));
 }

@@ -200,9 +209,8 @@ async fn reap_draining_writers_removes_empty_draining_writers() {
    insert_draining_writer(&pool, 2, now_epoch_secs.saturating_sub(30), 0, 0).await;
    insert_draining_writer(&pool, 3, now_epoch_secs.saturating_sub(20), 1, 0).await;
    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();

-    reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;

    assert_eq!(current_writer_ids(&pool).await, vec![3]);
 }
@@ -216,9 +224,8 @@ async fn reap_draining_writers_overflow_closes_oldest_non_empty_writers() {
    insert_draining_writer(&pool, 33, now_epoch_secs.saturating_sub(20), 1, 0).await;
    insert_draining_writer(&pool, 44, now_epoch_secs.saturating_sub(10), 1, 0).await;
    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();

-    reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;

    assert_eq!(current_writer_ids(&pool).await, vec![33, 44]);
 }
@@ -236,9 +243,8 @@ async fn reap_draining_writers_deadline_force_close_applies_under_threshold() {
    )
    .await;
    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();

-    reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;

    assert!(current_writer_ids(&pool).await.is_empty());
 }
@@ -260,13 +266,129 @@ async fn reap_draining_writers_limits_closes_per_health_tick() {
        .await;
    }
    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();

-    reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;

    assert_eq!(pool.writers.read().await.len(), writer_total - close_budget);
 }

+#[tokio::test]
+async fn reap_draining_writers_keeps_warn_state_for_deadline_backlog_writers() {
+    let pool = make_pool(0).await;
+    let now_epoch_secs = MePool::now_epoch_secs();
+    let close_budget = health_drain_close_budget();
+    let writer_total = close_budget.saturating_add(5);
+    for writer_id in 1..=writer_total as u64 {
+        insert_draining_writer(
+            &pool,
+            writer_id,
+            now_epoch_secs.saturating_sub(60),
+            1,
+            now_epoch_secs.saturating_sub(1),
+        )
+        .await;
+    }
+    let target_writer_id = writer_total as u64;
+    let mut warn_next_allowed = HashMap::new();
+    warn_next_allowed.insert(
+        target_writer_id,
+        Instant::now() + Duration::from_secs(300),
+    );
+
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;
+
+    assert!(writer_exists(&pool, target_writer_id).await);
+    assert!(warn_next_allowed.contains_key(&target_writer_id));
+}
+
+#[tokio::test]
+async fn reap_draining_writers_keeps_warn_state_for_overflow_backlog_writers() {
+    let pool = make_pool(1).await;
+    let now_epoch_secs = MePool::now_epoch_secs();
+    let close_budget = health_drain_close_budget();
+    let writer_total = close_budget.saturating_add(6);
+    for writer_id in 1..=writer_total as u64 {
+        insert_draining_writer(
+            &pool,
+            writer_id,
+            now_epoch_secs.saturating_sub(300).saturating_add(writer_id),
+            1,
+            0,
+        )
+        .await;
+    }
+    let target_writer_id = writer_total.saturating_sub(1) as u64;
+    let mut warn_next_allowed = HashMap::new();
+    warn_next_allowed.insert(
+        target_writer_id,
+        Instant::now() + Duration::from_secs(300),
+    );
+
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;
+
+    assert!(writer_exists(&pool, target_writer_id).await);
+    assert!(warn_next_allowed.contains_key(&target_writer_id));
+}
+
+#[tokio::test]
+async fn reap_draining_writers_drops_warn_state_when_writer_exits_draining_state() {
+    let pool = make_pool(128).await;
+    let now_epoch_secs = MePool::now_epoch_secs();
+    insert_draining_writer(&pool, 71, now_epoch_secs.saturating_sub(60), 1, 0).await;
+
+    let mut warn_next_allowed = HashMap::new();
+    warn_next_allowed.insert(71, Instant::now() + Duration::from_secs(300));
+
+    set_writer_draining(&pool, 71, false).await;
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;
+
+    assert!(writer_exists(&pool, 71).await);
+    assert!(
+        !warn_next_allowed.contains_key(&71),
+        "warn cooldown state must be dropped after writer leaves draining state"
+    );
+}
+
+#[tokio::test]
+async fn reap_draining_writers_preserves_warn_state_across_multiple_budget_deferrals() {
+    let pool = make_pool(0).await;
+    let now_epoch_secs = MePool::now_epoch_secs();
+    let close_budget = health_drain_close_budget();
+    let writer_total = close_budget.saturating_mul(2).saturating_add(1);
+    for writer_id in 1..=writer_total as u64 {
+        insert_draining_writer(
+            &pool,
+            writer_id,
+            now_epoch_secs.saturating_sub(120),
+            1,
+            now_epoch_secs.saturating_sub(1),
+        )
+        .await;
+    }
+
+    let tail_writer_id = writer_total as u64;
+    let mut warn_next_allowed = HashMap::new();
+    warn_next_allowed.insert(
+        tail_writer_id,
+        Instant::now() + Duration::from_secs(300),
+    );
+
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;
+    assert!(writer_exists(&pool, tail_writer_id).await);
+    assert!(warn_next_allowed.contains_key(&tail_writer_id));
+
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;
+    assert!(writer_exists(&pool, tail_writer_id).await);
+    assert!(warn_next_allowed.contains_key(&tail_writer_id));
+
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;
+    assert!(!writer_exists(&pool, tail_writer_id).await);
+    assert!(
+        !warn_next_allowed.contains_key(&tail_writer_id),
+        "warn cooldown state must clear once writer is actually removed"
+    );
+}
+
 #[tokio::test]
 async fn reap_draining_writers_backlog_drains_across_ticks() {
    let pool = make_pool(128).await;
@@ -284,13 +406,12 @@ async fn reap_draining_writers_backlog_drains_across_ticks() {
        .await;
    }
    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();

    for _ in 0..8 {
        if pool.writers.read().await.is_empty() {
            break;
        }
-        reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;
    }

    assert!(pool.writers.read().await.is_empty());
@@ -314,10 +435,9 @@ async fn reap_draining_writers_threshold_backlog_converges_to_threshold() {
        .await;
    }
    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();

    for _ in 0..16 {
-        reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;
        if pool.writers.read().await.len() <= threshold as usize {
            break;
        }
@@ -334,9 +454,8 @@ async fn reap_draining_writers_threshold_zero_preserves_non_expired_non_empty_wr
    insert_draining_writer(&pool, 20, now_epoch_secs.saturating_sub(30), 1, 0).await;
    insert_draining_writer(&pool, 30, now_epoch_secs.saturating_sub(20), 1, 0).await;
    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();

-    reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;

    assert_eq!(current_writer_ids(&pool).await, vec![10, 20, 30]);
 }
@@ -359,9 +478,8 @@ async fn reap_draining_writers_prioritizes_force_close_before_empty_cleanup() {
    let empty_writer_id = close_budget as u64 + 1;
    insert_draining_writer(&pool, empty_writer_id, now_epoch_secs.saturating_sub(20), 0, 0).await;
    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();

-    reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;

    assert_eq!(current_writer_ids(&pool).await, vec![empty_writer_id]);
 }
@@ -373,9 +491,8 @@ async fn reap_draining_writers_empty_cleanup_does_not_increment_force_close_metr
    insert_draining_writer(&pool, 1, now_epoch_secs.saturating_sub(60), 0, 0).await;
    insert_draining_writer(&pool, 2, now_epoch_secs.saturating_sub(50), 0, 0).await;
    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();

-    reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;

    assert!(current_writer_ids(&pool).await.is_empty());
    assert_eq!(pool.stats.get_pool_force_close_total(), 0);
@@ -402,9 +519,8 @@ async fn reap_draining_writers_handles_duplicate_force_close_requests_for_same_w
    )
    .await;
    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();

-    reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+    reap_draining_writers(&pool, &mut warn_next_allowed).await;

    assert!(current_writer_ids(&pool).await.is_empty());
 }
@@ -414,7 +530,6 @@ async fn reap_draining_writers_warn_state_never_exceeds_live_draining_population
    let pool = make_pool(128).await;
    let now_epoch_secs = MePool::now_epoch_secs();
    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();

    for wave in 0..12u64 {
        for offset in 0..9u64 {
@@ -427,14 +542,14 @@ async fn reap_draining_writers_warn_state_never_exceeds_live_draining_population
            )
            .await;
        }
-        reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;
        assert!(warn_next_allowed.len() <= pool.writers.read().await.len());

        let existing_writer_ids = current_writer_ids(&pool).await;
        for writer_id in existing_writer_ids.into_iter().take(4) {
            let _ = pool.remove_writer_and_close_clients(writer_id).await;
        }
-        reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;
        assert!(warn_next_allowed.len() <= pool.writers.read().await.len());
    }
 }
@@ -444,7 +559,6 @@ async fn reap_draining_writers_mixed_backlog_converges_without_leaking_warn_stat
    let pool = make_pool(6).await;
    let now_epoch_secs = MePool::now_epoch_secs();
    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();

    for writer_id in 1..=18u64 {
        let bound_clients = if writer_id % 3 == 0 { 0 } else { 1 };
@@ -464,7 +578,7 @@ async fn reap_draining_writers_mixed_backlog_converges_without_leaking_warn_stat
    }

    for _ in 0..16 {
-        reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
+        reap_draining_writers(&pool, &mut warn_next_allowed).await;
        if pool.writers.read().await.len() <= 6 {
            break;
        }
@@ -474,60 +588,71 @@ async fn reap_draining_writers_mixed_backlog_converges_without_leaking_warn_stat
    assert!(warn_next_allowed.len() <= pool.writers.read().await.len());
 }

-#[tokio::test]
-async fn reap_draining_writers_soft_evicts_stuck_writer_with_per_writer_cap() {
-    let pool = make_pool(128).await;
-    pool.me_pool_drain_soft_evict_enabled.store(true, Ordering::Relaxed);
-    pool.me_pool_drain_soft_evict_grace_secs.store(0, Ordering::Relaxed);
-    pool.me_pool_drain_soft_evict_per_writer.store(1, Ordering::Relaxed);
-    pool.me_pool_drain_soft_evict_budget_per_core.store(8, Ordering::Relaxed);
-    pool.me_pool_drain_soft_evict_cooldown_ms
-        .store(1, Ordering::Relaxed);
-
-    let now_epoch_secs = MePool::now_epoch_secs();
-    insert_draining_writer(&pool, 77, now_epoch_secs.saturating_sub(240), 3, 0).await;
-    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();
-
-    reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
-
-    let activity = pool.registry.writer_activity_snapshot().await;
-    assert_eq!(activity.bound_clients_by_writer.get(&77), Some(&2));
-    assert_eq!(pool.stats.get_pool_drain_soft_evict_total(), 1);
-    assert_eq!(pool.stats.get_pool_drain_soft_evict_writer_total(), 1);
-    assert_eq!(current_writer_ids(&pool).await, vec![77]);
-}
-
-#[tokio::test]
-async fn reap_draining_writers_soft_evict_respects_cooldown_per_writer() {
-    let pool = make_pool(128).await;
-    pool.me_pool_drain_soft_evict_enabled.store(true, Ordering::Relaxed);
-    pool.me_pool_drain_soft_evict_grace_secs.store(0, Ordering::Relaxed);
-    pool.me_pool_drain_soft_evict_per_writer.store(1, Ordering::Relaxed);
-    pool.me_pool_drain_soft_evict_budget_per_core.store(8, Ordering::Relaxed);
-    pool.me_pool_drain_soft_evict_cooldown_ms
-        .store(60_000, Ordering::Relaxed);
-
-    let now_epoch_secs = MePool::now_epoch_secs();
-    insert_draining_writer(&pool, 88, now_epoch_secs.saturating_sub(240), 3, 0).await;
-    let mut warn_next_allowed = HashMap::new();
-    let mut soft_evict_next_allowed = HashMap::new();
-
-    reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
-    reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
-
-    let activity = pool.registry.writer_activity_snapshot().await;
-    assert_eq!(activity.bound_clients_by_writer.get(&88), Some(&2));
-    assert_eq!(pool.stats.get_pool_drain_soft_evict_total(), 1);
-    assert_eq!(pool.stats.get_pool_drain_soft_evict_writer_total(), 1);
-}
-
 #[test]
 fn general_config_default_drain_threshold_remains_enabled() {
    assert_eq!(GeneralConfig::default().me_pool_drain_threshold, 128);
-    assert!(GeneralConfig::default().me_pool_drain_soft_evict_enabled);
-    assert_eq!(
-        GeneralConfig::default().me_pool_drain_soft_evict_per_writer,
-        1
-    );
+}
+
+#[tokio::test]
+async fn reap_draining_writers_does_not_close_writer_that_became_non_empty_after_snapshot() {
+    let pool = make_pool(128).await;
+    let now_epoch_secs = MePool::now_epoch_secs();
+
+    let empty_writer_id = 700u64;
+    insert_draining_writer(
+        &pool,
+        empty_writer_id,
+        now_epoch_secs.saturating_sub(60),
+        0,
+        0,
+    )
+    .await;
+
+    let stale_empty_snapshot = vec![empty_writer_id];
+    let (rebound_conn_id, _rx) = pool.registry.register().await;
+    assert!(
+        pool.registry
+            .bind_writer(
+                rebound_conn_id,
+                empty_writer_id,
+                ConnMeta {
+                    target_dc: 2,
+                    client_addr: SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 9050),
+                    our_addr: SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 443),
+                    proto_flags: 0,
+                },
+            )
+            .await,
+        "writer should accept a new bind after stale empty snapshot"
+    );
+
+    for writer_id in stale_empty_snapshot {
+        assert!(
+            !pool.remove_writer_if_empty(writer_id).await,
+            "atomic empty cleanup must reject writers that gained bound clients"
+        );
+    }
+
+    assert!(
+        writer_exists(&pool, empty_writer_id).await,
+        "empty-path cleanup must not remove a writer that gained a bound client"
+    );
+    assert_eq!(
+        pool.registry.get_writer(rebound_conn_id).await.map(|w| w.writer_id),
+        Some(empty_writer_id)
+    );
+
+    let _ = pool.registry.unregister(rebound_conn_id).await;
+}
+
+#[tokio::test]
+async fn prune_closed_writers_closes_bound_clients_when_writer_is_non_empty() {
+    let pool = make_pool(128).await;
+    let now_epoch_secs = MePool::now_epoch_secs();
+    let conn_ids = insert_draining_writer(&pool, 910, now_epoch_secs.saturating_sub(60), 1, 0).await;
+
+    pool.prune_closed_writers().await;
+
+    assert!(!writer_exists(&pool, 910).await);
+    assert!(pool.registry.get_writer(conn_ids[0]).await.is_none());
 }
@@ -27,6 +27,8 @@ mod health_regression_tests;
 mod health_integration_tests;
 #[cfg(test)]
 mod health_adversarial_tests;
+#[cfg(test)]
+mod send_adversarial_tests;

 use bytes::Bytes;

@@ -160,6 +160,7 @@ pub struct MePool {
    pub(super) refill_inflight: Arc<Mutex<HashSet<RefillEndpointKey>>>,
    pub(super) refill_inflight_dc: Arc<Mutex<HashSet<RefillDcKey>>>,
    pub(super) conn_count: AtomicUsize,
+    pub(super) draining_active_runtime: AtomicU64,
    pub(super) stats: Arc<crate::stats::Stats>,
    pub(super) generation: AtomicU64,
    pub(super) active_generation: AtomicU64,
@@ -172,11 +173,6 @@ pub struct MePool {
    pub(super) kdf_material_fingerprint: Arc<RwLock<HashMap<SocketAddr, (u64, u16)>>>,
    pub(super) me_pool_drain_ttl_secs: AtomicU64,
    pub(super) me_pool_drain_threshold: AtomicU64,
-    pub(super) me_pool_drain_soft_evict_enabled: AtomicBool,
-    pub(super) me_pool_drain_soft_evict_grace_secs: AtomicU64,
-    pub(super) me_pool_drain_soft_evict_per_writer: AtomicU8,
-    pub(super) me_pool_drain_soft_evict_budget_per_core: AtomicU32,
-    pub(super) me_pool_drain_soft_evict_cooldown_ms: AtomicU64,
    pub(super) me_pool_force_close_secs: AtomicU64,
    pub(super) me_pool_min_fresh_ratio_permille: AtomicU32,
    pub(super) me_hardswap_warmup_delay_min_ms: AtomicU64,
@@ -278,11 +274,6 @@ impl MePool {
        hardswap: bool,
        me_pool_drain_ttl_secs: u64,
        me_pool_drain_threshold: u64,
-        me_pool_drain_soft_evict_enabled: bool,
-        me_pool_drain_soft_evict_grace_secs: u64,
-        me_pool_drain_soft_evict_per_writer: u8,
-        me_pool_drain_soft_evict_budget_per_core: u16,
-        me_pool_drain_soft_evict_cooldown_ms: u64,
        me_pool_force_close_secs: u64,
        me_pool_min_fresh_ratio: f32,
        me_hardswap_warmup_delay_min_ms: u64,
@@ -448,6 +439,7 @@ impl MePool {
            refill_inflight: Arc::new(Mutex::new(HashSet::new())),
            refill_inflight_dc: Arc::new(Mutex::new(HashSet::new())),
            conn_count: AtomicUsize::new(0),
+            draining_active_runtime: AtomicU64::new(0),
            generation: AtomicU64::new(1),
            active_generation: AtomicU64::new(1),
            warm_generation: AtomicU64::new(0),
@@ -459,17 +451,6 @@ impl MePool {
            kdf_material_fingerprint: Arc::new(RwLock::new(HashMap::new())),
            me_pool_drain_ttl_secs: AtomicU64::new(me_pool_drain_ttl_secs),
            me_pool_drain_threshold: AtomicU64::new(me_pool_drain_threshold),
-            me_pool_drain_soft_evict_enabled: AtomicBool::new(me_pool_drain_soft_evict_enabled),
-            me_pool_drain_soft_evict_grace_secs: AtomicU64::new(me_pool_drain_soft_evict_grace_secs),
-            me_pool_drain_soft_evict_per_writer: AtomicU8::new(
-                me_pool_drain_soft_evict_per_writer.max(1),
-            ),
-            me_pool_drain_soft_evict_budget_per_core: AtomicU32::new(
-                me_pool_drain_soft_evict_budget_per_core.max(1) as u32,
-            ),
-            me_pool_drain_soft_evict_cooldown_ms: AtomicU64::new(
-                me_pool_drain_soft_evict_cooldown_ms.max(1),
-            ),
            me_pool_force_close_secs: AtomicU64::new(me_pool_force_close_secs),
            me_pool_min_fresh_ratio_permille: AtomicU32::new(Self::ratio_to_permille(
                me_pool_min_fresh_ratio,
@@ -517,11 +498,6 @@ impl MePool {
        hardswap: bool,
        drain_ttl_secs: u64,
        pool_drain_threshold: u64,
-        pool_drain_soft_evict_enabled: bool,
-        pool_drain_soft_evict_grace_secs: u64,
-        pool_drain_soft_evict_per_writer: u8,
-        pool_drain_soft_evict_budget_per_core: u16,
-        pool_drain_soft_evict_cooldown_ms: u64,
        force_close_secs: u64,
        min_fresh_ratio: f32,
        hardswap_warmup_delay_min_ms: u64,
@@ -562,18 +538,6 @@ impl MePool {
            .store(drain_ttl_secs, Ordering::Relaxed);
        self.me_pool_drain_threshold
            .store(pool_drain_threshold, Ordering::Relaxed);
-        self.me_pool_drain_soft_evict_enabled
-            .store(pool_drain_soft_evict_enabled, Ordering::Relaxed);
-        self.me_pool_drain_soft_evict_grace_secs
-            .store(pool_drain_soft_evict_grace_secs, Ordering::Relaxed);
-        self.me_pool_drain_soft_evict_per_writer
-            .store(pool_drain_soft_evict_per_writer.max(1), Ordering::Relaxed);
-        self.me_pool_drain_soft_evict_budget_per_core.store(
-            pool_drain_soft_evict_budget_per_core.max(1) as u32,
-            Ordering::Relaxed,
-        );
-        self.me_pool_drain_soft_evict_cooldown_ms
-            .store(pool_drain_soft_evict_cooldown_ms.max(1), Ordering::Relaxed);
        self.me_pool_force_close_secs
            .store(force_close_secs, Ordering::Relaxed);
        self.me_pool_min_fresh_ratio_permille
@@ -728,34 +692,31 @@ impl MePool {
        }
    }

-    pub(super) fn drain_soft_evict_enabled(&self) -> bool {
-        self.me_pool_drain_soft_evict_enabled
-            .load(Ordering::Relaxed)
+    #[allow(dead_code)]
+    pub(super) fn draining_active_runtime(&self) -> u64 {
+        self.draining_active_runtime.load(Ordering::Relaxed)
    }

-    pub(super) fn drain_soft_evict_grace_secs(&self) -> u64 {
-        self.me_pool_drain_soft_evict_grace_secs
-            .load(Ordering::Relaxed)
+    pub(super) fn increment_draining_active_runtime(&self) {
+        self.draining_active_runtime.fetch_add(1, Ordering::Relaxed);
    }

-    pub(super) fn drain_soft_evict_per_writer(&self) -> usize {
-        self.me_pool_drain_soft_evict_per_writer
-            .load(Ordering::Relaxed)
-            .max(1) as usize
-    }
-
-    pub(super) fn drain_soft_evict_budget_per_core(&self) -> usize {
-        self.me_pool_drain_soft_evict_budget_per_core
-            .load(Ordering::Relaxed)
-            .max(1) as usize
-    }
-
-    pub(super) fn drain_soft_evict_cooldown(&self) -> Duration {
-        Duration::from_millis(
-            self.me_pool_drain_soft_evict_cooldown_ms
-                .load(Ordering::Relaxed)
-                .max(1),
-        )
+    pub(super) fn decrement_draining_active_runtime(&self) {
+        let mut current = self.draining_active_runtime.load(Ordering::Relaxed);
+        loop {
+            if current == 0 {
+                break;
+            }
+            match self.draining_active_runtime.compare_exchange_weak(
+                current,
+                current - 1,
+                Ordering::Relaxed,
+                Ordering::Relaxed,
+            ) {
+                Ok(_) => break,
+                Err(actual) => current = actual,
+            }
+        }
    }

    pub(super) async fn key_selector(&self) -> u32 {
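
Reviewer note: `decrement_draining_active_runtime` above uses a manual `compare_exchange_weak` loop so the counter never wraps below zero. As a possible alternative, the standard library's `AtomicU64::fetch_update` expresses the same saturating decrement more compactly. The sketch below is an illustration under that assumption; `DrainingGauge` is a hypothetical stand-in type, not part of `MePool`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for the `draining_active_runtime` counter.
pub struct DrainingGauge {
    active: AtomicU64,
}

impl DrainingGauge {
    pub fn new() -> Self {
        Self {
            active: AtomicU64::new(0),
        }
    }

    pub fn get(&self) -> u64 {
        self.active.load(Ordering::Relaxed)
    }

    pub fn increment(&self) {
        self.active.fetch_add(1, Ordering::Relaxed);
    }

    /// Decrements the counter, leaving it at zero if it is already zero.
    pub fn decrement(&self) {
        // `checked_sub` yields `None` at zero; `fetch_update` then leaves the
        // value untouched and returns `Err`, which is deliberately ignored.
        let _ = self
            .active
            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |v| v.checked_sub(1));
    }
}
```

Both forms retry on contention and never underflow; the choice between them is mostly a matter of readability.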
@@ -70,12 +70,10 @@ impl MePool {

        let mut missing_dc = Vec::<i32>::new();
        let mut covered = 0usize;
-        let mut total = 0usize;
        for (dc, endpoints) in desired_by_dc {
            if endpoints.is_empty() {
                continue;
            }
-            total += 1;
            if endpoints
                .iter()
                .any(|addr| active_writer_addrs.contains(&(*dc, *addr)))
@@ -87,9 +85,7 @@ impl MePool {
        }

        missing_dc.sort_unstable();
-        if total == 0 {
-            return (1.0, missing_dc);
-        }
+        let total = desired_by_dc.len().max(1);
        let ratio = (covered as f32) / (total as f32);
        (ratio, missing_dc)
    }
@@ -145,6 +141,38 @@ impl MePool {
        out
    }

+    pub(super) async fn has_non_draining_writer_per_desired_dc_group(&self) -> bool {
+        let desired_by_dc = self.desired_dc_endpoints().await;
+        let required_dcs: HashSet<i32> = desired_by_dc
+            .iter()
+            .filter_map(|(dc, endpoints)| {
+                if endpoints.is_empty() {
+                    None
+                } else {
+                    Some(*dc)
+                }
+            })
+            .collect();
+        if required_dcs.is_empty() {
+            return true;
+        }
+
+        let ws = self.writers.read().await;
+        let mut covered_dcs = HashSet::<i32>::with_capacity(required_dcs.len());
+        for writer in ws.iter() {
+            if writer.draining.load(Ordering::Relaxed) {
+                continue;
+            }
+            if required_dcs.contains(&writer.writer_dc) {
+                covered_dcs.insert(writer.writer_dc);
+                if covered_dcs.len() == required_dcs.len() {
+                    return true;
+                }
+            }
+        }
+        false
+    }
+
    fn hardswap_warmup_connect_delay_ms(&self) -> u64 {
        let min_ms = self.me_hardswap_warmup_delay_min_ms.load(Ordering::Relaxed);
        let max_ms = self.me_hardswap_warmup_delay_max_ms.load(Ordering::Relaxed);
@@ -403,21 +431,29 @@ impl MePool {
        }

        if hardswap {
-            let fresh_writer_addrs: HashSet<(i32, SocketAddr)> = writers
-                .iter()
-                .filter(|w| !w.draining.load(Ordering::Relaxed))
-                .filter(|w| w.generation == generation)
-                .map(|w| (w.writer_dc, w.addr))
-                .collect();
-            let (fresh_coverage_ratio, fresh_missing_dc) =
-                Self::coverage_ratio(&desired_by_dc, &fresh_writer_addrs);
+            let mut fresh_missing_dc = Vec::<(i32, usize, usize)>::new();
+            for (dc, endpoints) in &desired_by_dc {
+                if endpoints.is_empty() {
+                    continue;
+                }
+                let required = self.required_writers_for_dc(endpoints.len());
+                let fresh_count = writers
+                    .iter()
+                    .filter(|w| !w.draining.load(Ordering::Relaxed))
+                    .filter(|w| w.generation == generation)
+                    .filter(|w| w.writer_dc == *dc)
+                    .filter(|w| endpoints.contains(&w.addr))
+                    .count();
+                if fresh_count < required {
+                    fresh_missing_dc.push((*dc, fresh_count, required));
+                }
+            }
            if !fresh_missing_dc.is_empty() {
                warn!(
                    previous_generation,
                    generation,
-                    fresh_coverage_ratio = format_args!("{fresh_coverage_ratio:.3}"),
                    missing_dc = ?fresh_missing_dc,
-                    "ME hardswap pending: fresh generation DC coverage incomplete"
+                    "ME hardswap pending: fresh generation coverage incomplete"
                );
                return;
            }
@@ -471,12 +507,30 @@ impl MePool {
            coverage_ratio = format_args!("{coverage_ratio:.3}"),
            min_ratio = format_args!("{min_ratio:.3}"),
            drain_timeout_secs,
-            "ME reinit cycle covered; draining stale writers"
+            "ME reinit cycle covered; processing stale writers"
        );
        self.stats.increment_pool_swap_total();
+        let can_drop_with_replacement = self
+            .has_non_draining_writer_per_desired_dc_group()
+            .await;
+        if can_drop_with_replacement {
+            info!(
+                stale_writers = stale_writer_ids.len(),
+                "ME reinit stale writers: replacement coverage ready, force-closing clients for fast rebind"
+            );
+        } else {
+            warn!(
+                stale_writers = stale_writer_ids.len(),
+                "ME reinit stale writers: replacement coverage incomplete, keeping draining fallback"
+            );
+        }
        for writer_id in stale_writer_ids {
            self.mark_writer_draining_with_timeout(writer_id, drain_timeout, !hardswap)
                .await;
+            if can_drop_with_replacement {
+                self.stats.increment_pool_force_close_total();
+                self.remove_writer_and_close_clients(writer_id).await;
+            }
        }
        if hardswap {
            self.clear_pending_hardswap_state();
@@ -487,61 +541,3 @@ impl MePool {
        self.zero_downtime_reinit_after_map_change(rng).await;
    }
 }
-
-#[cfg(test)]
-mod tests {
-    use std::collections::{HashMap, HashSet};
-    use std::net::{IpAddr, Ipv4Addr, SocketAddr};
-
-    use super::MePool;
-
-    fn addr(octet: u8, port: u16) -> SocketAddr {
-        SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 0, octet)), port)
-    }
-
-    #[test]
-    fn coverage_ratio_counts_dc_coverage_not_floor() {
-        let dc1 = addr(1, 2001);
-        let dc2 = addr(2, 2002);
-
-        let mut desired_by_dc = HashMap::<i32, HashSet<SocketAddr>>::new();
-        desired_by_dc.insert(1, HashSet::from([dc1]));
-        desired_by_dc.insert(2, HashSet::from([dc2]));
-
-        let active_writer_addrs = HashSet::from([(1, dc1)]);
-        let (ratio, missing_dc) = MePool::coverage_ratio(&desired_by_dc, &active_writer_addrs);
-
-        assert_eq!(ratio, 0.5);
-        assert_eq!(missing_dc, vec![2]);
-    }
-
-    #[test]
-    fn coverage_ratio_ignores_empty_dc_groups() {
-        let dc1 = addr(1, 2001);
-
-        let mut desired_by_dc = HashMap::<i32, HashSet<SocketAddr>>::new();
-        desired_by_dc.insert(1, HashSet::from([dc1]));
-        desired_by_dc.insert(2, HashSet::new());
-
-        let active_writer_addrs = HashSet::from([(1, dc1)]);
-        let (ratio, missing_dc) = MePool::coverage_ratio(&desired_by_dc, &active_writer_addrs);
-
-        assert_eq!(ratio, 1.0);
-        assert!(missing_dc.is_empty());
-    }
-
-    #[test]
-    fn coverage_ratio_reports_missing_dcs_sorted() {
-        let dc1 = addr(1, 2001);
-        let dc2 = addr(2, 2002);
-
-        let mut desired_by_dc = HashMap::<i32, HashSet<SocketAddr>>::new();
-        desired_by_dc.insert(2, HashSet::from([dc2]));
-        desired_by_dc.insert(1, HashSet::from([dc1]));
-
-        let (ratio, missing_dc) = MePool::coverage_ratio(&desired_by_dc, &HashSet::new());
-
-        assert_eq!(ratio, 0.0);
-        assert_eq!(missing_dc, vec![1, 2]);
-    }
-}
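The denominator change in `coverage_ratio` is easier to see in isolation. The following is a minimal, std-only sketch of the function as it reads after this hunk. The loop body that updates `covered` and `missing_dc` is not shown in the diff and is an assumption here, and `addr` is a hypothetical helper mirroring the one in the removed tests.

```rust
use std::collections::{HashMap, HashSet};
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

/// Returns (covered / total, sorted DCs with no active writer).
/// ASSUMPTION: the `covered` / `missing_dc` bookkeeping is reconstructed,
/// because the diff elides that part of the loop.
fn coverage_ratio(
    desired_by_dc: &HashMap<i32, HashSet<SocketAddr>>,
    active_writer_addrs: &HashSet<(i32, SocketAddr)>,
) -> (f32, Vec<i32>) {
    let mut covered = 0usize;
    let mut missing_dc = Vec::new();
    for (dc, endpoints) in desired_by_dc {
        if endpoints.is_empty() {
            // Empty groups are neither covered nor reported as missing.
            continue;
        }
        if endpoints
            .iter()
            .any(|addr| active_writer_addrs.contains(&(*dc, *addr)))
        {
            covered += 1;
        } else {
            missing_dc.push(*dc);
        }
    }
    missing_dc.sort_unstable();
    // New denominator: every map entry counts, including empty groups.
    let total = desired_by_dc.len().max(1);
    (covered as f32 / total as f32, missing_dc)
}

/// Hypothetical helper for building loopback addresses.
fn addr(octet: u8, port: u16) -> SocketAddr {
    SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 0, octet)), port)
}
```

Under these assumptions, a map with one covered DC and one empty DC group yields a ratio of 0.5, where the removed `coverage_ratio_ignores_empty_dc_groups` test expected 1.0. An entirely empty map yields 0.0 instead of the previous early return of 1.0.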
@@ -40,7 +40,6 @@ pub(crate) struct MeApiDcStatusSnapshot {
    pub floor_max: usize,
    pub floor_capped: bool,
    pub alive_writers: usize,
-    pub coverage_ratio: f64,
    pub coverage_pct: f64,
    pub fresh_alive_writers: usize,
    pub fresh_coverage_pct: f64,
@@ -63,7 +62,6 @@ pub(crate) struct MeApiStatusSnapshot {
    pub available_pct: f64,
    pub required_writers: usize,
    pub alive_writers: usize,
-    pub coverage_ratio: f64,
    pub coverage_pct: f64,
    pub fresh_alive_writers: usize,
    pub fresh_coverage_pct: f64,
@@ -126,11 +124,6 @@ pub(crate) struct MeApiRuntimeSnapshot {
    pub me_reconnect_backoff_cap_ms: u64,
    pub me_reconnect_fast_retry_count: u32,
    pub me_pool_drain_ttl_secs: u64,
-    pub me_pool_drain_soft_evict_enabled: bool,
-    pub me_pool_drain_soft_evict_grace_secs: u64,
-    pub me_pool_drain_soft_evict_per_writer: u8,
-    pub me_pool_drain_soft_evict_budget_per_core: u16,
-    pub me_pool_drain_soft_evict_cooldown_ms: u64,
    pub me_pool_force_close_secs: u64,
    pub me_pool_min_fresh_ratio: f32,
    pub me_bind_stale_mode: &'static str,
@@ -344,8 +337,6 @@ impl MePool {
        let mut available_endpoints = 0usize;
        let mut alive_writers = 0usize;
        let mut fresh_alive_writers = 0usize;
-        let mut coverage_ratio_dcs_total = 0usize;
-        let mut coverage_ratio_dcs_covered = 0usize;
        let floor_mode = self.floor_mode();
        let adaptive_cpu_cores = (self
            .me_adaptive_floor_cpu_cores_effective
@@ -397,12 +388,6 @@ impl MePool {
            available_endpoints += dc_available_endpoints;
            alive_writers += dc_alive_writers;
            fresh_alive_writers += dc_fresh_alive_writers;
-            if endpoint_count > 0 {
-                coverage_ratio_dcs_total += 1;
-                if dc_alive_writers > 0 {
-                    coverage_ratio_dcs_covered += 1;
-                }
-            }

            dcs.push(MeApiDcStatusSnapshot {
                dc,
@@ -425,11 +410,6 @@ impl MePool {
                floor_max,
                floor_capped,
                alive_writers: dc_alive_writers,
-                coverage_ratio: if endpoint_count > 0 && dc_alive_writers > 0 {
-                    100.0
-                } else {
-                    0.0
-                },
                coverage_pct: ratio_pct(dc_alive_writers, dc_required_writers),
                fresh_alive_writers: dc_fresh_alive_writers,
                fresh_coverage_pct: ratio_pct(dc_fresh_alive_writers, dc_required_writers),
@@ -446,7 +426,6 @@ impl MePool {
            available_pct: ratio_pct(available_endpoints, configured_endpoints),
            required_writers,
            alive_writers,
-            coverage_ratio: ratio_pct(coverage_ratio_dcs_covered, coverage_ratio_dcs_total),
            coverage_pct: ratio_pct(alive_writers, required_writers),
            fresh_alive_writers,
            fresh_coverage_pct: ratio_pct(fresh_alive_writers, required_writers),
@@ -583,22 +562,6 @@ impl MePool {
            me_reconnect_backoff_cap_ms: self.me_reconnect_backoff_cap.as_millis() as u64,
            me_reconnect_fast_retry_count: self.me_reconnect_fast_retry_count,
            me_pool_drain_ttl_secs: self.me_pool_drain_ttl_secs.load(Ordering::Relaxed),
-            me_pool_drain_soft_evict_enabled: self
-                .me_pool_drain_soft_evict_enabled
-                .load(Ordering::Relaxed),
-            me_pool_drain_soft_evict_grace_secs: self
-                .me_pool_drain_soft_evict_grace_secs
-                .load(Ordering::Relaxed),
-            me_pool_drain_soft_evict_per_writer: self
-                .me_pool_drain_soft_evict_per_writer
-                .load(Ordering::Relaxed),
-            me_pool_drain_soft_evict_budget_per_core: self
-                .me_pool_drain_soft_evict_budget_per_core
-                .load(Ordering::Relaxed)
-                .min(u16::MAX as u32) as u16,
-            me_pool_drain_soft_evict_cooldown_ms: self
-                .me_pool_drain_soft_evict_cooldown_ms
-                .load(Ordering::Relaxed),
            me_pool_force_close_secs: self.me_pool_force_close_secs.load(Ordering::Relaxed),
            me_pool_min_fresh_ratio: Self::permille_to_ratio(
                self.me_pool_min_fresh_ratio_permille.load(Ordering::Relaxed),
@@ -42,11 +42,10 @@ impl MePool {
        }

        for writer_id in closed_writer_ids {
-            if self.registry.is_writer_empty(writer_id).await {
-                let _ = self.remove_writer_only(writer_id).await;
-            } else {
-                let _ = self.remove_writer_and_close_clients(writer_id).await;
+            if self.remove_writer_if_empty(writer_id).await {
+                continue;
            }
+            let _ = self.remove_writer_and_close_clients(writer_id).await;
        }
    }

@@ -501,6 +500,17 @@ impl MePool {
        }
    }

+    pub(crate) async fn remove_writer_if_empty(self: &Arc<Self>, writer_id: u64) -> bool {
+        if !self.registry.unregister_writer_if_empty(writer_id).await {
+            return false;
+        }
+
+        // The registry empty-check and unregister are atomic with respect to binds,
+        // so remove_writer_only cannot return active bound sessions here.
+        let _ = self.remove_writer_only(writer_id).await;
+        true
+    }
+
    async fn remove_writer_only(self: &Arc<Self>, writer_id: u64) -> Vec<BoundConn> {
        let mut close_tx: Option<mpsc::Sender<WriterCommand>> = None;
        let mut removed_addr: Option<SocketAddr> = None;
@@ -514,6 +524,7 @@ impl MePool {
                let was_draining = w.draining.load(Ordering::Relaxed);
                if was_draining {
                    self.stats.decrement_pool_drain_active();
+                    self.decrement_draining_active_runtime();
                }
                self.stats.increment_me_writer_removed_total();
                w.cancel.cancel();
@@ -572,6 +583,7 @@ impl MePool {
                    .store(drain_deadline_epoch_secs, Ordering::Relaxed);
                if !already_draining {
                    self.stats.increment_pool_drain_active();
+                    self.increment_draining_active_runtime();
                }
                w.contour
                    .store(WriterContour::Draining.as_u8(), Ordering::Relaxed);
@@ -394,56 +394,6 @@ impl ConnRegistry {
        inner.writer_for_conn.keys().copied().collect()
    }

-    pub(super) async fn bound_conn_ids_for_writer_limited(
-        &self,
-        writer_id: u64,
-        limit: usize,
-    ) -> Vec<u64> {
-        if limit == 0 {
-            return Vec::new();
-        }
-        let inner = self.inner.read().await;
-        let Some(conn_ids) = inner.conns_for_writer.get(&writer_id) else {
-            return Vec::new();
-        };
-        let mut out = conn_ids.iter().copied().collect::<Vec<_>>();
-        out.sort_unstable();
-        out.truncate(limit);
-        out
-    }
-
-    pub(super) async fn evict_bound_conn_if_writer(&self, conn_id: u64, writer_id: u64) -> bool {
-        let maybe_client_tx = {
-            let mut inner = self.inner.write().await;
-            if inner.writer_for_conn.get(&conn_id).copied() != Some(writer_id) {
-                return false;
-            }
-
-            let client_tx = inner.map.get(&conn_id).cloned();
-            inner.map.remove(&conn_id);
-            inner.meta.remove(&conn_id);
-            inner.writer_for_conn.remove(&conn_id);
-
-            let became_empty = if let Some(set) = inner.conns_for_writer.get_mut(&writer_id) {
-                set.remove(&conn_id);
-                set.is_empty()
-            } else {
-                false
-            };
-            if became_empty {
-                inner
-                    .writer_idle_since_epoch_secs
-                    .insert(writer_id, Self::now_epoch_secs());
-            }
-            client_tx
-        };
-
-        if let Some(client_tx) = maybe_client_tx {
-            let _ = client_tx.try_send(MeResponse::Close);
-        }
-        true
-    }
-
    pub async fn writer_lost(&self, writer_id: u64) -> Vec<BoundConn> {
        let mut inner = self.inner.write().await;
        inner.writers.remove(&writer_id);
@@ -486,6 +436,37 @@ impl ConnRegistry {
            .map(|s| s.is_empty())
            .unwrap_or(true)
    }
+
+    pub async fn unregister_writer_if_empty(&self, writer_id: u64) -> bool {
+        let mut inner = self.inner.write().await;
+        let Some(conn_ids) = inner.conns_for_writer.get(&writer_id) else {
+            // Writer is already absent from the registry.
+            return true;
+        };
+        if !conn_ids.is_empty() {
+            return false;
+        }
+
+        inner.writers.remove(&writer_id);
+        inner.last_meta_for_writer.remove(&writer_id);
+        inner.writer_idle_since_epoch_secs.remove(&writer_id);
+        inner.conns_for_writer.remove(&writer_id);
+        true
+    }
+
+    #[allow(dead_code)]
+    pub(super) async fn non_empty_writer_ids(&self, writer_ids: &[u64]) -> HashSet<u64> {
+        let inner = self.inner.read().await;
+        let mut out = HashSet::<u64>::with_capacity(writer_ids.len());
+        for writer_id in writer_ids {
+            if let Some(conns) = inner.conns_for_writer.get(writer_id)
+                && !conns.is_empty()
+            {
+                out.insert(*writer_id);
+            }
+        }
+        out
+    }
 }

 #[cfg(test)]
@@ -494,7 +475,6 @@ mod tests {

    use super::ConnMeta;
    use super::ConnRegistry;
-    use super::MeResponse;

    #[tokio::test]
    async fn writer_activity_snapshot_tracks_writer_and_dc_load() {
@@ -687,47 +667,15 @@ mod tests {
    }

    #[tokio::test]
-    async fn bound_conn_ids_for_writer_limited_is_sorted_and_bounded() {
+    async fn non_empty_writer_ids_returns_only_writers_with_bound_clients() {
        let registry = ConnRegistry::new();
-        let (writer_tx, _writer_rx) = tokio::sync::mpsc::channel(8);
-        registry.register_writer(10, writer_tx).await;
-        let addr = SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 443);
-        let mut conn_ids = Vec::new();
-        for _ in 0..5 {
-            let (conn_id, _rx) = registry.register().await;
-            assert!(
-                registry
-                    .bind_writer(
-                        conn_id,
-                        10,
-                        ConnMeta {
-                            target_dc: 2,
-                            client_addr: addr,
-                            our_addr: addr,
-                            proto_flags: 0,
-                        },
-                    )
-                    .await
-            );
-            conn_ids.push(conn_id);
-        }
-        conn_ids.sort_unstable();
-
-        let limited = registry.bound_conn_ids_for_writer_limited(10, 3).await;
-        assert_eq!(limited.len(), 3);
-        assert_eq!(limited, conn_ids.into_iter().take(3).collect::<Vec<_>>());
-    }
-
-    #[tokio::test]
-    async fn evict_bound_conn_if_writer_does_not_touch_rebound_conn() {
-        let registry = ConnRegistry::new();
-        let (conn_id, mut rx) = registry.register().await;
+        let (conn_id, _rx) = registry.register().await;
        let (writer_tx_a, _writer_rx_a) = tokio::sync::mpsc::channel(8);
        let (writer_tx_b, _writer_rx_b) = tokio::sync::mpsc::channel(8);
        registry.register_writer(10, writer_tx_a).await;
        registry.register_writer(20, writer_tx_b).await;
-        let addr = SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 443);

+        let addr = SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 443);
        assert!(
            registry
                .bind_writer(
@@ -742,29 +690,10 @@ mod tests {
                )
                .await
        );
-        assert!(
-            registry
-                .bind_writer(
-                    conn_id,
-                    20,
-                    ConnMeta {
-                        target_dc: 2,
-                        client_addr: addr,
-                        our_addr: addr,
-                        proto_flags: 1,
-                    },
-                )
-                .await
-        );

-        let evicted = registry.evict_bound_conn_if_writer(conn_id, 10).await;
-        assert!(!evicted);
-        assert_eq!(registry.get_writer(conn_id).await.expect("writer").writer_id, 20);
-        assert!(rx.try_recv().is_err());
-
-        let evicted = registry.evict_bound_conn_if_writer(conn_id, 20).await;
-        assert!(evicted);
-        assert!(registry.get_writer(conn_id).await.is_none());
-        assert!(matches!(rx.try_recv(), Ok(MeResponse::Close)));
+        let non_empty = registry.non_empty_writer_ids(&[10, 20, 30]).await;
+        assert!(non_empty.contains(&10));
+        assert!(!non_empty.contains(&20));
+        assert!(!non_empty.contains(&30));
    }
 }
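The point of `unregister_writer_if_empty` is that the emptiness check and the removal happen under one write lock. The following is a simplified, std-only sketch of that idea. `MiniRegistry` is a hypothetical stand-in that keeps only the writer-to-connection index and uses `std::sync::RwLock` rather than the async lock in the real registry.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::RwLock;

/// Hypothetical stand-in for the registry: writer id -> bound connection ids.
#[derive(Default)]
struct MiniRegistry {
    conns_for_writer: RwLock<HashMap<u64, HashSet<u64>>>,
}

impl MiniRegistry {
    fn register_writer(&self, writer_id: u64) {
        let mut map = self.conns_for_writer.write().unwrap();
        map.entry(writer_id).or_default();
    }

    /// Binds a connection; fails if the writer is no longer registered.
    fn bind(&self, conn_id: u64, writer_id: u64) -> bool {
        let mut map = self.conns_for_writer.write().unwrap();
        match map.get_mut(&writer_id) {
            Some(conns) => {
                conns.insert(conn_id);
                true
            }
            None => false,
        }
    }

    /// Checks emptiness and unregisters while holding a single write lock,
    /// so no bind can slip in between the check and the removal.
    fn unregister_writer_if_empty(&self, writer_id: u64) -> bool {
        let mut map = self.conns_for_writer.write().unwrap();
        let is_empty = match map.get(&writer_id) {
            None => return true, // already absent from the registry
            Some(conns) => conns.is_empty(),
        };
        if !is_empty {
            return false;
        }
        map.remove(&writer_id);
        true
    }
}
```

With a separate `is_writer_empty` check followed by a removal, a concurrent bind could land between the two calls; in this sketch a bind either happens before the lock is taken (and the unregister is refused) or after it (and the bind fails).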
@@ -372,17 +372,20 @@ impl MePool {
                }
                let effective_our_addr = SocketAddr::new(w.source_ip, our_addr.port());
                let (payload, meta) = build_routed_payload(effective_our_addr);
-                match w.tx.try_send(WriterCommand::Data(payload.clone())) {
-                    Ok(()) => {
-                        self.stats.increment_me_writer_pick_success_try_total(pick_mode);
+                match w.tx.clone().try_reserve_owned() {
+                    Ok(permit) => {
                        if !self.registry.bind_writer(conn_id, w.id, meta).await {
                            debug!(
                                conn_id,
                                writer_id = w.id,
-                                "ME writer disappeared before bind commit, retrying"
+                                "ME writer disappeared before bind commit, pruning stale writer"
                            );
+                            drop(permit);
+                            self.remove_writer_and_close_clients(w.id).await;
                            continue;
                        }
+                        permit.send(WriterCommand::Data(payload.clone()));
+                        self.stats.increment_me_writer_pick_success_try_total(pick_mode);
                        if w.generation < self.current_generation() {
                            self.stats.increment_pool_stale_pick_total();
                            debug!(
@@ -422,18 +425,21 @@ impl MePool {
            self.stats.increment_me_writer_pick_blocking_fallback_total();
            let effective_our_addr = SocketAddr::new(w.source_ip, our_addr.port());
            let (payload, meta) = build_routed_payload(effective_our_addr);
-            match w.tx.send(WriterCommand::Data(payload.clone())).await {
-                Ok(()) => {
-                    self.stats
-                        .increment_me_writer_pick_success_fallback_total(pick_mode);
+            match w.tx.clone().reserve_owned().await {
+                Ok(permit) => {
                    if !self.registry.bind_writer(conn_id, w.id, meta).await {
                        debug!(
                            conn_id,
                            writer_id = w.id,
-                            "ME writer disappeared before fallback bind commit, retrying"
+                            "ME writer disappeared before fallback bind commit, pruning stale writer"
                        );
+                        drop(permit);
+                        self.remove_writer_and_close_clients(w.id).await;
                        continue;
                    }
+                    permit.send(WriterCommand::Data(payload.clone()));
+                    self.stats
+                        .increment_me_writer_pick_success_fallback_total(pick_mode);
                    if w.generation < self.current_generation() {
                        self.stats.increment_pool_stale_pick_total();
                    }
@@ -0,0 +1,263 @@
+use std::collections::HashMap;
+use std::net::{IpAddr, Ipv4Addr, SocketAddr};
+use std::sync::Arc;
+use std::sync::atomic::{AtomicBool, AtomicU8, AtomicU32, AtomicU64, Ordering};
+use std::time::{Duration, Instant};
+
+use tokio::sync::mpsc;
+use tokio_util::sync::CancellationToken;
+
+use super::codec::WriterCommand;
+use super::pool::{MePool, MeWriter, WriterContour};
+use crate::config::{GeneralConfig, MeRouteNoWriterMode, MeSocksKdfPolicy, MeWriterPickMode};
+use crate::crypto::SecureRandom;
+use crate::network::probe::NetworkDecision;
+use crate::stats::Stats;
+
+async fn make_pool() -> (Arc<MePool>, Arc<SecureRandom>) {
+    let general = GeneralConfig {
+        me_route_no_writer_mode: MeRouteNoWriterMode::AsyncRecoveryFailfast,
+        me_route_no_writer_wait_ms: 50,
+        me_writer_pick_mode: MeWriterPickMode::SortedRr,
+        me_deterministic_writer_sort: true,
+        ..GeneralConfig::default()
+    };
+
+    let rng = Arc::new(SecureRandom::new());
+    let pool = MePool::new(
+        None,
+        vec![1u8; 32],
+        None,
+        false,
+        None,
+        Vec::new(),
+        1,
+        None,
+        12,
+        1200,
+        HashMap::new(),
+        HashMap::new(),
+        None,
+        NetworkDecision::default(),
+        None,
+        rng.clone(),
+        Arc::new(Stats::default()),
+        general.me_keepalive_enabled,
+        general.me_keepalive_interval_secs,
+        general.me_keepalive_jitter_secs,
+        general.me_keepalive_payload_random,
+        general.rpc_proxy_req_every,
+        general.me_warmup_stagger_enabled,
+        general.me_warmup_step_delay_ms,
+        general.me_warmup_step_jitter_ms,
+        general.me_reconnect_max_concurrent_per_dc,
+        general.me_reconnect_backoff_base_ms,
+        general.me_reconnect_backoff_cap_ms,
+        general.me_reconnect_fast_retry_count,
+        general.me_single_endpoint_shadow_writers,
+        general.me_single_endpoint_outage_mode_enabled,
+        general.me_single_endpoint_outage_disable_quarantine,
+        general.me_single_endpoint_outage_backoff_min_ms,
+        general.me_single_endpoint_outage_backoff_max_ms,
+        general.me_single_endpoint_shadow_rotate_every_secs,
+        general.me_floor_mode,
+        general.me_adaptive_floor_idle_secs,
+        general.me_adaptive_floor_min_writers_single_endpoint,
+        general.me_adaptive_floor_min_writers_multi_endpoint,
+        general.me_adaptive_floor_recover_grace_secs,
+        general.me_adaptive_floor_writers_per_core_total,
+        general.me_adaptive_floor_cpu_cores_override,
+        general.me_adaptive_floor_max_extra_writers_single_per_core,
+        general.me_adaptive_floor_max_extra_writers_multi_per_core,
+        general.me_adaptive_floor_max_active_writers_per_core,
+        general.me_adaptive_floor_max_warm_writers_per_core,
+        general.me_adaptive_floor_max_active_writers_global,
+        general.me_adaptive_floor_max_warm_writers_global,
+        general.hardswap,
+        general.me_pool_drain_ttl_secs,
+        general.me_pool_drain_threshold,
+        general.effective_me_pool_force_close_secs(),
+        general.me_pool_min_fresh_ratio,
+        general.me_hardswap_warmup_delay_min_ms,
+        general.me_hardswap_warmup_delay_max_ms,
+        general.me_hardswap_warmup_extra_passes,
+        general.me_hardswap_warmup_pass_backoff_base_ms,
+        general.me_bind_stale_mode,
+        general.me_bind_stale_ttl_secs,
+        general.me_secret_atomic_snapshot,
+        general.me_deterministic_writer_sort,
+        general.me_writer_pick_mode,
+        general.me_writer_pick_sample_size,
+        MeSocksKdfPolicy::default(),
+        general.me_writer_cmd_channel_capacity,
+        general.me_route_channel_capacity,
+        general.me_route_backpressure_base_timeout_ms,
+        general.me_route_backpressure_high_timeout_ms,
+        general.me_route_backpressure_high_watermark_pct,
+        general.me_reader_route_data_wait_ms,
+        general.me_health_interval_ms_unhealthy,
+        general.me_health_interval_ms_healthy,
+        general.me_warn_rate_limit_ms,
+        general.me_route_no_writer_mode,
+        general.me_route_no_writer_wait_ms,
+        general.me_route_inline_recovery_attempts,
+        general.me_route_inline_recovery_wait_ms,
+    );
+
+    (pool, rng)
+}
+
+async fn insert_writer(
+    pool: &Arc<MePool>,
+    writer_id: u64,
+    writer_dc: i32,
+    addr: SocketAddr,
+    register_in_registry: bool,
+) -> mpsc::Receiver<WriterCommand> {
+    let (tx, rx) = mpsc::channel::<WriterCommand>(8);
+    let writer = MeWriter {
+        id: writer_id,
+        addr,
+        source_ip: addr.ip(),
+        writer_dc,
+        generation: pool.current_generation(),
+        contour: Arc::new(AtomicU8::new(WriterContour::Active.as_u8())),
+        created_at: Instant::now(),
+        tx: tx.clone(),
+        cancel: CancellationToken::new(),
+        degraded: Arc::new(AtomicBool::new(false)),
+        rtt_ema_ms_x10: Arc::new(AtomicU32::new(0)),
+        draining: Arc::new(AtomicBool::new(false)),
+        draining_started_at_epoch_secs: Arc::new(AtomicU64::new(0)),
+        drain_deadline_epoch_secs: Arc::new(AtomicU64::new(0)),
+        allow_drain_fallback: Arc::new(AtomicBool::new(false)),
+    };
+
+    pool.writers.write().await.push(writer);
+    {
+        let mut map = pool.proxy_map_v4.write().await;
+        map.entry(writer_dc)
+            .or_insert_with(Vec::new)
+            .push((addr.ip(), addr.port()));
+    }
+    pool.rebuild_endpoint_dc_map().await;
+    if register_in_registry {
+        pool.registry.register_writer(writer_id, tx).await;
+    }
+    rx
+}
+
+async fn recv_data_count(rx: &mut mpsc::Receiver<WriterCommand>, budget: Duration) -> usize {
+    let start = Instant::now();
+    let mut data_count = 0usize;
+    while Instant::now().duration_since(start) < budget {
+        let remaining = budget.saturating_sub(Instant::now().duration_since(start));
+        match tokio::time::timeout(remaining.min(Duration::from_millis(10)), rx.recv()).await {
+            Ok(Some(WriterCommand::Data(_))) => data_count += 1,
+            Ok(Some(WriterCommand::DataAndFlush(_))) => data_count += 1,
+            Ok(Some(WriterCommand::Close)) => {}
+            Ok(None) => break,
+            Err(_) => break,
+        }
+    }
+    data_count
+}
+
+#[tokio::test]
+async fn send_proxy_req_does_not_replay_when_first_bind_commit_fails() {
+    let (pool, _rng) = make_pool().await;
+    pool.rr.store(0, Ordering::Relaxed);
+
+    let (conn_id, _rx) = pool.registry.register().await;
+    let mut stale_rx = insert_writer(
+        &pool,
+        10,
+        2,
+        SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 0, 10)), 443),
+        false,
+    )
+    .await;
+    let mut live_rx = insert_writer(
+        &pool,
+        11,
+        2,
+        SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 0, 11)), 443),
+        true,
+    )
+    .await;
+
+    let result = pool
+        .send_proxy_req(
+            conn_id,
+            2,
+            SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 30000),
+            SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 443),
+            b"hello",
+            0,
+            None,
+        )
+        .await;
+
+    assert!(result.is_ok());
+    assert_eq!(recv_data_count(&mut stale_rx, Duration::from_millis(50)).await, 0);
+    assert_eq!(recv_data_count(&mut live_rx, Duration::from_millis(50)).await, 1);
+
+    let bound = pool.registry.get_writer(conn_id).await;
+    assert!(bound.is_some());
+    assert_eq!(bound.expect("writer should be bound").writer_id, 11);
+}
+
+#[tokio::test]
+async fn send_proxy_req_prunes_iterative_stale_bind_failures_without_data_replay() {
+    let (pool, _rng) = make_pool().await;
+    pool.rr.store(0, Ordering::Relaxed);
+
+    let (conn_id, _rx) = pool.registry.register().await;
+
+    let mut stale_rx_1 = insert_writer(
+        &pool,
+        21,
+        2,
+        SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 1, 21)), 443),
+        false,
+    )
+    .await;
+    let mut stale_rx_2 = insert_writer(
+        &pool,
+        22,
+        2,
+        SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 1, 22)), 443),
+        false,
+    )
+    .await;
+    let mut live_rx = insert_writer(
+        &pool,
+        23,
+        2,
+        SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 1, 23)), 443),
+        true,
+    )
+    .await;
+
+    let result = pool
+        .send_proxy_req(
+            conn_id,
+            2,
+            SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 30001),
+            SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 443),
+            b"storm",
+            0,
+            None,
+        )
+        .await;
+
+    assert!(result.is_ok());
+    assert_eq!(recv_data_count(&mut stale_rx_1, Duration::from_millis(50)).await, 0);
+    assert_eq!(recv_data_count(&mut stale_rx_2, Duration::from_millis(50)).await, 0);
+    assert_eq!(recv_data_count(&mut live_rx, Duration::from_millis(50)).await, 1);
+
+    let writers = pool.writers.read().await;
+    let writer_ids = writers.iter().map(|w| w.id).collect::<Vec<_>>();
+    drop(writers);
+    assert_eq!(writer_ids, vec![23]);
+}
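The tests above exercise the reordering in `send_proxy_req`: a channel slot is reserved first (`try_reserve_owned` / `reserve_owned` in the diff), the bind is committed, and only then is the payload sent through the permit. The following is a single-threaded, std-only model of that ordering. `BoundedQueue`, `Permit`, and `route` are hypothetical names for illustration and are not tokio's API.

```rust
use std::collections::VecDeque;

/// Token proving that one queue slot is held for the caller.
struct Permit;

/// Hypothetical single-threaded model of a bounded channel with reservation.
struct BoundedQueue<T> {
    items: VecDeque<T>,
    reserved: usize,
    capacity: usize,
}

impl<T> BoundedQueue<T> {
    fn new(capacity: usize) -> Self {
        Self { items: VecDeque::new(), reserved: 0, capacity }
    }

    /// Holds one slot without enqueueing anything yet.
    fn try_reserve(&mut self) -> Option<Permit> {
        if self.items.len() + self.reserved < self.capacity {
            self.reserved += 1;
            Some(Permit)
        } else {
            None
        }
    }

    /// Consumes the permit and enqueues the item into the held slot.
    fn send(&mut self, _permit: Permit, item: T) {
        self.reserved -= 1;
        self.items.push_back(item);
    }

    /// Consumes the permit and gives the slot back unused.
    fn release(&mut self, _permit: Permit) {
        self.reserved -= 1;
    }
}

/// Reserve capacity, commit the bind, and only then enqueue the payload.
fn route<T>(queue: &mut BoundedQueue<T>, bind_ok: bool, payload: T) -> bool {
    let Some(permit) = queue.try_reserve() else {
        return false; // queue full: nothing bound, nothing sent
    };
    if !bind_ok {
        // Bind failed: return the slot; the payload never reaches this writer.
        queue.release(permit);
        return false;
    }
    queue.send(permit, payload);
    true
}
```

Because the payload is enqueued only after the bind succeeds, a writer whose bind fails receives no data, which is the property the two `send_proxy_req_*` tests assert with `recv_data_count(...) == 0` on the stale receivers.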
@@ -11,8 +11,6 @@ use tokio::net::TcpStream;
 use socket2::{Socket, TcpKeepalive, Domain, Type, Protocol};
 use tracing::debug;

-const DEFAULT_SOCKET_BUFFER_BYTES: usize = 256 * 1024;
-
 /// Configure TCP socket with recommended settings for proxy use
 #[allow(dead_code)]
 pub fn configure_tcp_socket(
@@ -36,10 +34,10 @@ pub fn configure_tcp_socket(
        
        socket.set_tcp_keepalive(&keepalive)?;
    }
-
-    // Use explicit baseline buffers to reduce slow-start stalls on high RTT links.
-    socket.set_recv_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
-    socket.set_send_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
+    
+    // Socket buffer sizes are deliberately left unset (previously fixed at 256 KiB).
+    // On Linux, setting SO_RCVBUF/SO_SNDBUF explicitly disables kernel buffer autotuning for
+    // that socket; leaving them unset helps mobile clients avoid bufferbloat and stalled uploads.
    
    Ok(())
 }
@@ -64,10 +62,6 @@ pub fn configure_client_socket(
    let keepalive = keepalive.with_interval(Duration::from_secs(keepalive_secs));
    
    socket.set_tcp_keepalive(&keepalive)?;
-
-    // Keep explicit baseline buffers for predictable throughput across busy hosts.
-    socket.set_recv_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
-    socket.set_send_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
    
    // Set TCP user timeout (Linux only)
    // NOTE: iOS does not support TCP_USER_TIMEOUT - application-level timeout 
@@ -130,8 +124,6 @@ pub fn create_outgoing_socket_bound(addr: SocketAddr, bind_addr: Option<IpAddr>)
    
    // Disable Nagle
    socket.set_nodelay(true)?;
-    socket.set_recv_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
-    socket.set_send_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;

    if let Some(bind_ip) = bind_addr {
        let bind_sock_addr = SocketAddr::new(bind_ip, 0);
@@ -1,728 +0,0 @@
-"""
-Telemt Control API Python Client
-Full-coverage client for https://github.com/telemt/telemt
-
-Usage:
-    client = TelemtAPI("http://127.0.0.1:9091", auth_header="your-secret")
-    client.health()
-    client.create_user("alice", max_tcp_conns=10)
-    client.patch_user("alice", data_quota_bytes=1_000_000_000)
-    client.delete_user("alice")
-"""
-
-from __future__ import annotations
-
-import json
-import secrets
-from dataclasses import dataclass, field
-from typing import Any, Dict, List, Optional, Union
-from urllib.error import HTTPError, URLError
-from urllib.request import Request, urlopen
-
-
-# ---------------------------------------------------------------------------
-# Exceptions
-# ---------------------------------------------------------------------------
-
-class TememtAPIError(Exception):
-    """Raised when the API returns an error envelope or a transport error."""
-
-    def __init__(self, message: str, code: str | None = None,
-                 http_status: int | None = None, request_id: int | None = None):
-        super().__init__(message)
-        self.code = code
-        self.http_status = http_status
-        self.request_id = request_id
-
-    def __repr__(self) -> str:
-        return (f"TememtAPIError(message={str(self)!r}, code={self.code!r}, "
-                f"http_status={self.http_status}, request_id={self.request_id})")
-
-
-# ---------------------------------------------------------------------------
-# Response wrapper
-# ---------------------------------------------------------------------------
-
-@dataclass
-class APIResponse:
-    """Wraps a successful API response envelope."""
-    ok: bool
-    data: Any
-    revision: str | None = None
-
-    def __repr__(self) -> str:  # pragma: no cover
-        return f"APIResponse(ok={self.ok}, revision={self.revision!r}, data={self.data!r})"
-
-
-# ---------------------------------------------------------------------------
-# Main client
-# ---------------------------------------------------------------------------
-
-class TememtAPI:
-    """
-    HTTP client for the Telemt Control API.
-
-    Parameters
-    ----------
-    base_url:
-        Scheme + host + port, e.g. ``"http://127.0.0.1:9091"``.
-        Trailing slash is stripped automatically.
-    auth_header:
-        Exact value for the ``Authorization`` header.
-        Leave *None* when ``auth_header`` is not configured server-side.
-    timeout:
-        Socket timeout in seconds for every request (default 10).
-    """
-
-    def __init__(
-        self,
-        base_url: str = "http://127.0.0.1:9091",
-        auth_header: str | None = None,
-        timeout: int = 10,
-    ) -> None:
-        self.base_url = base_url.rstrip("/")
-        self.auth_header = auth_header
-        self.timeout = timeout
-
-    # ------------------------------------------------------------------
-    # Low-level HTTP helpers
-    # ------------------------------------------------------------------
-
-    def _headers(self, extra: dict | None = None) -> dict:
-        h = {"Content-Type": "application/json; charset=utf-8",
-             "Accept": "application/json"}
-        if self.auth_header:
-            h["Authorization"] = self.auth_header
-        if extra:
-            h.update(extra)
-        return h
-
-    def _request(
-        self,
-        method: str,
-        path: str,
-        body: dict | None = None,
-        if_match: str | None = None,
-        query: dict | None = None,
-    ) -> APIResponse:
-        url = self.base_url + path
-        if query:
-            qs = "&".join(f"{k}={v}" for k, v in query.items())
-            url = f"{url}?{qs}"
-
-        raw_body: bytes | None = None
-        if body is not None:
-            raw_body = json.dumps(body).encode()
-
-        extra_headers: dict = {}
-        if if_match is not None:
-            extra_headers["If-Match"] = if_match
-
-        req = Request(
-            url,
-            data=raw_body,
-            headers=self._headers(extra_headers),
-            method=method,
-        )
-
-        try:
-            with urlopen(req, timeout=self.timeout) as resp:
-                payload = json.loads(resp.read())
-        except HTTPError as exc:
-            raw = exc.read()
-            try:
-                payload = json.loads(raw)
-            except Exception:
-                raise TememtAPIError(
-                    str(exc), http_status=exc.code
-                ) from exc
-            err = payload.get("error", {})
-            raise TememtAPIError(
-                err.get("message", str(exc)),
-                code=err.get("code"),
-                http_status=exc.code,
-                request_id=payload.get("request_id"),
-            ) from exc
-        except URLError as exc:
-            raise TememtAPIError(str(exc)) from exc
-
-        if not payload.get("ok"):
-            err = payload.get("error", {})
-            raise TememtAPIError(
-                err.get("message", "unknown error"),
-                code=err.get("code"),
-                request_id=payload.get("request_id"),
-            )
-
-        return APIResponse(
-            ok=True,
-            data=payload.get("data"),
-            revision=payload.get("revision"),
-        )
-
-    def _get(self, path: str, query: dict | None = None) -> APIResponse:
-        return self._request("GET", path, query=query)
-
-    def _post(self, path: str, body: dict | None = None,
-              if_match: str | None = None) -> APIResponse:
-        return self._request("POST", path, body=body, if_match=if_match)
-
-    def _patch(self, path: str, body: dict,
-               if_match: str | None = None) -> APIResponse:
-        return self._request("PATCH", path, body=body, if_match=if_match)
-
-    def _delete(self, path: str, if_match: str | None = None) -> APIResponse:
-        return self._request("DELETE", path, if_match=if_match)
-
-    # ------------------------------------------------------------------
-    # Health & system
-    # ------------------------------------------------------------------
-
-    def health(self) -> APIResponse:
-        """GET /v1/health — liveness probe."""
-        return self._get("/v1/health")
-
-    def system_info(self) -> APIResponse:
-        """GET /v1/system/info — binary version, uptime, config hash."""
-        return self._get("/v1/system/info")
-
-    # ------------------------------------------------------------------
-    # Runtime gates & initialization
-    # ------------------------------------------------------------------
-
-    def runtime_gates(self) -> APIResponse:
-        """GET /v1/runtime/gates — admission gates and startup progress."""
-        return self._get("/v1/runtime/gates")
-
-    def runtime_initialization(self) -> APIResponse:
-        """GET /v1/runtime/initialization — detailed startup timeline."""
-        return self._get("/v1/runtime/initialization")
-
-    # ------------------------------------------------------------------
-    # Limits & security
-    # ------------------------------------------------------------------
-
-    def limits_effective(self) -> APIResponse:
-        """GET /v1/limits/effective — effective timeout/upstream/ME limits."""
-        return self._get("/v1/limits/effective")
-
-    def security_posture(self) -> APIResponse:
-        """GET /v1/security/posture — API auth, telemetry, log-level summary."""
-        return self._get("/v1/security/posture")
-
-    def security_whitelist(self) -> APIResponse:
-        """GET /v1/security/whitelist — current IP whitelist CIDRs."""
-        return self._get("/v1/security/whitelist")
-
-    # ------------------------------------------------------------------
-    # Stats
-    # ------------------------------------------------------------------
-
-    def stats_summary(self) -> APIResponse:
-        """GET /v1/stats/summary — uptime, connection totals, user count."""
-        return self._get("/v1/stats/summary")
-
-    def stats_zero_all(self) -> APIResponse:
-        """GET /v1/stats/zero/all — zero-cost counters (core, upstream, ME, pool, desync)."""
-        return self._get("/v1/stats/zero/all")
-
-    def stats_upstreams(self) -> APIResponse:
-        """GET /v1/stats/upstreams — upstream health + zero counters."""
-        return self._get("/v1/stats/upstreams")
-
-    def stats_minimal_all(self) -> APIResponse:
-        """GET /v1/stats/minimal/all — ME writers + DC snapshot (requires minimal_runtime_enabled)."""
-        return self._get("/v1/stats/minimal/all")
-
-    def stats_me_writers(self) -> APIResponse:
-        """GET /v1/stats/me-writers — per-writer ME status (requires minimal_runtime_enabled)."""
-        return self._get("/v1/stats/me-writers")
-
-    def stats_dcs(self) -> APIResponse:
-        """GET /v1/stats/dcs — per-DC coverage and writer counts (requires minimal_runtime_enabled)."""
-        return self._get("/v1/stats/dcs")
-
-    # ------------------------------------------------------------------
-    # Runtime deep-dive
-    # ------------------------------------------------------------------
-
-    def runtime_me_pool_state(self) -> APIResponse:
-        """GET /v1/runtime/me_pool_state — ME pool generation/writer/refill snapshot."""
-        return self._get("/v1/runtime/me_pool_state")
-
-    def runtime_me_quality(self) -> APIResponse:
-        """GET /v1/runtime/me_quality — ME KDF, route-drop, and per-DC RTT counters."""
-        return self._get("/v1/runtime/me_quality")
-
-    def runtime_upstream_quality(self) -> APIResponse:
-        """GET /v1/runtime/upstream_quality — per-upstream health, latency, DC preferences."""
-        return self._get("/v1/runtime/upstream_quality")
-
-    def runtime_nat_stun(self) -> APIResponse:
-        """GET /v1/runtime/nat_stun — NAT probe state, STUN servers, reflected IPs."""
-        return self._get("/v1/runtime/nat_stun")
-
-    def runtime_me_selftest(self) -> APIResponse:
-        """GET /v1/runtime/me-selftest — KDF/timeskew/IP/PID/BND health state."""
-        return self._get("/v1/runtime/me-selftest")
-
-    def runtime_connections_summary(self) -> APIResponse:
-        """GET /v1/runtime/connections/summary — live connection totals + top-N users (requires runtime_edge_enabled)."""
-        return self._get("/v1/runtime/connections/summary")
-
-    def runtime_events_recent(self, limit: int | None = None) -> APIResponse:
-        """GET /v1/runtime/events/recent — recent ring-buffer events (requires runtime_edge_enabled).
-
-        Parameters
-        ----------
-        limit:
-            Optional cap on returned events (1–1000, server default 50).
-        """
-        query = {"limit": str(limit)} if limit is not None else None
-        return self._get("/v1/runtime/events/recent", query=query)
-
-    # ------------------------------------------------------------------
-    # Users (read)
-    # ------------------------------------------------------------------
-
-    def list_users(self) -> APIResponse:
-        """GET /v1/users — list all users with connection/traffic info."""
-        return self._get("/v1/users")
-
-    def get_user(self, username: str) -> APIResponse:
-        """GET /v1/users/{username} — single user info."""
-        return self._get(f"/v1/users/{_safe(username)}")
-
-    # ------------------------------------------------------------------
-    # Users (write)
-    # ------------------------------------------------------------------
-
-    def create_user(
-        self,
-        username: str,
-        *,
-        secret: str | None = None,
-        user_ad_tag: str | None = None,
-        max_tcp_conns: int | None = None,
-        expiration_rfc3339: str | None = None,
-        data_quota_bytes: int | None = None,
-        max_unique_ips: int | None = None,
-        if_match: str | None = None,
-    ) -> APIResponse:
-        """POST /v1/users — create a new user.
-
-        Parameters
-        ----------
-        username:
-            ``[A-Za-z0-9_.-]``, length 1–64.
-        secret:
-            Exactly 32 hex chars. Auto-generated if omitted.
-        user_ad_tag:
-            Exactly 32 hex chars.
-        max_tcp_conns:
-            Per-user concurrent TCP limit.
-        expiration_rfc3339:
-            RFC3339 expiration timestamp, e.g. ``"2025-12-31T23:59:59Z"``.
-        data_quota_bytes:
-            Per-user traffic quota in bytes.
-        max_unique_ips:
-            Per-user unique source IP limit.
-        if_match:
-            Optional ``If-Match`` revision for optimistic concurrency.
-        """
-        body: Dict[str, Any] = {"username": username}
-        _opt(body, "secret", secret)
-        _opt(body, "user_ad_tag", user_ad_tag)
-        _opt(body, "max_tcp_conns", max_tcp_conns)
-        _opt(body, "expiration_rfc3339", expiration_rfc3339)
-        _opt(body, "data_quota_bytes", data_quota_bytes)
-        _opt(body, "max_unique_ips", max_unique_ips)
-        return self._post("/v1/users", body=body, if_match=if_match)
-
-    def patch_user(
-        self,
-        username: str,
-        *,
-        secret: str | None = None,
-        user_ad_tag: str | None = None,
-        max_tcp_conns: int | None = None,
-        expiration_rfc3339: str | None = None,
-        data_quota_bytes: int | None = None,
-        max_unique_ips: int | None = None,
-        if_match: str | None = None,
-    ) -> APIResponse:
-        """PATCH /v1/users/{username} — partial update; only provided fields change.
-
-        Parameters
-        ----------
-        username:
-            Existing username to update.
-        secret:
-            New secret (32 hex chars).
-        user_ad_tag:
-            New ad tag (32 hex chars).
-        max_tcp_conns:
-            New TCP concurrency limit.
-        expiration_rfc3339:
-            New expiration timestamp.
-        data_quota_bytes:
-            New quota in bytes.
-        max_unique_ips:
-            New unique IP limit.
-        if_match:
-            Optional ``If-Match`` revision.
-        """
-        body: Dict[str, Any] = {}
-        _opt(body, "secret", secret)
-        _opt(body, "user_ad_tag", user_ad_tag)
-        _opt(body, "max_tcp_conns", max_tcp_conns)
-        _opt(body, "expiration_rfc3339", expiration_rfc3339)
-        _opt(body, "data_quota_bytes", data_quota_bytes)
-        _opt(body, "max_unique_ips", max_unique_ips)
-        if not body:
-            raise ValueError("patch_user: at least one field must be provided")
-        return self._patch(f"/v1/users/{_safe(username)}", body=body,
-                           if_match=if_match)
-
-    def delete_user(
-        self,
-        username: str,
-        *,
-        if_match: str | None = None,
-    ) -> APIResponse:
-        """DELETE /v1/users/{username} — remove user; blocks deletion of last user.
-
-        Parameters
-        ----------
-        if_match:
-            Optional ``If-Match`` revision for optimistic concurrency.
-        """
-        return self._delete(f"/v1/users/{_safe(username)}", if_match=if_match)
-
-    # NOTE: POST /v1/users/{username}/rotate-secret currently returns 404
-    # in the route matcher (documented limitation). The method is provided
-    # for completeness and future compatibility.
-    def rotate_secret(
-        self,
-        username: str,
-        *,
-        secret: str | None = None,
-        if_match: str | None = None,
-    ) -> APIResponse:
-        """POST /v1/users/{username}/rotate-secret — rotate user secret.
-
-        .. warning::
-            This endpoint currently returns ``404 not_found`` in all released
-            versions (documented route matcher limitation). The method is
-            included for future compatibility.
-
-        Parameters
-        ----------
-        secret:
-            New secret (32 hex chars). Auto-generated if omitted.
-        """
-        body: Dict[str, Any] = {}
-        _opt(body, "secret", secret)
-        return self._post(f"/v1/users/{_safe(username)}/rotate-secret",
-                          body=body or None, if_match=if_match)
-
-    # ------------------------------------------------------------------
-    # Convenience helpers
-    # ------------------------------------------------------------------
-
-    @staticmethod
-    def generate_secret() -> str:
-        """Generate a random 32-character hex secret suitable for user creation."""
-        return secrets.token_hex(16)  # 16 bytes → 32 hex chars
-
-
-# ---------------------------------------------------------------------------
-# Internal helpers
-# ---------------------------------------------------------------------------
-
-def _safe(username: str) -> str:
-    """Minimal guard: reject obvious path-injection attempts."""
-    if "/" in username or "\\" in username:
-        raise ValueError(f"Invalid username: {username!r}")
-    return username
-
-
-def _opt(d: dict, key: str, value: Any) -> None:
-    """Add key to dict only when value is not None."""
-    if value is not None:
-        d[key] = value
-
-
-# ---------------------------------------------------------------------------
-# CLI
-# ---------------------------------------------------------------------------
-
-def _print(resp: APIResponse) -> None:
-    print(json.dumps(resp.data, indent=2))
-    if resp.revision:
-        print(f"# revision: {resp.revision}", flush=True)
-
-
-def _build_parser():
-    import argparse
-
-    p = argparse.ArgumentParser(
-        prog="telemt_api.py",
-        description="Telemt Control API CLI",
-        formatter_class=argparse.RawDescriptionHelpFormatter,
-        epilog="""
-COMMANDS (read)
-  health                          Liveness check
-  info                            System info (version, uptime, config hash)
-  status                          Runtime gates + startup progress
-  init                            Runtime initialization timeline
-  limits                          Effective limits (timeouts, upstream, ME)
-  posture                         Security posture summary
-  whitelist                       IP whitelist entries
-  summary                         Stats summary (conns, uptime, users)
-  zero                            Zero-cost counters (core/upstream/ME/pool/desync)
-  upstreams                       Upstream health + zero counters
-  minimal                         ME writers + DC snapshot  [minimal_runtime_enabled]
-  me-writers                      Per-writer ME status      [minimal_runtime_enabled]
-  dcs                             Per-DC coverage           [minimal_runtime_enabled]
-  me-pool                         ME pool generation/writer/refill snapshot
-  me-quality                      ME KDF, route-drops, per-DC RTT
-  upstream-quality                Per-upstream health + latency
-  nat-stun                        NAT probe state + STUN servers
-  me-selftest                     KDF/timeskew/IP/PID/BND health
-  connections                     Live connection totals + top-N  [runtime_edge_enabled]
-  events [--limit N]              Recent ring-buffer events       [runtime_edge_enabled]
-
-COMMANDS (users)
-  users                           List all users
-  user <username>                 Get single user
-  create <username> [OPTIONS]     Create user
-  patch  <username> [OPTIONS]     Partial update user
-  delete <username>               Delete user
-  secret <username> [--secret S]  Rotate secret (reserved; returns 404 in current release)
-  gen-secret                      Print a random 32-hex secret and exit
-
-USER OPTIONS (for create / patch)
-  --secret S          32 hex chars
-  --ad-tag S          32 hex chars (ad tag)
-  --max-conns N       Max concurrent TCP connections
-  --expires DATETIME  RFC3339 expiration (e.g. 2026-12-31T23:59:59Z)
-  --quota N           Data quota in bytes
-  --max-ips N         Max unique source IPs
-
-EXAMPLES
-  telemt_api.py health
-  telemt_api.py -u http://10.0.0.1:9091 -a mysecret users
-  telemt_api.py create alice --max-conns 5 --quota 10000000000
-  telemt_api.py patch  alice --expires 2027-01-01T00:00:00Z
-  telemt_api.py delete alice
-  telemt_api.py events --limit 20
-        """,
-    )
-
-    p.add_argument("-u", "--url", default="http://127.0.0.1:9091",
-                   metavar="URL", help="API base URL (default: http://127.0.0.1:9091)")
-    p.add_argument("-a", "--auth", default=None, metavar="TOKEN",
-                   help="Authorization header value")
-    p.add_argument("-t", "--timeout", type=int, default=10, metavar="SEC",
-                   help="Request timeout in seconds (default: 10)")
-
-    p.add_argument("command", nargs="?", default="help",
-                   help="Command to run (see COMMANDS below)")
-    p.add_argument("arg", nargs="?", default=None, metavar="USERNAME",
-                   help="Username for user commands")
-
-    # user create/patch fields
-    p.add_argument("--secret",    default=None)
-    p.add_argument("--ad-tag",    dest="ad_tag", default=None)
-    p.add_argument("--max-conns", dest="max_conns", type=int, default=None)
-    p.add_argument("--expires",   default=None)
-    p.add_argument("--quota",     type=int, default=None)
-    p.add_argument("--max-ips",   dest="max_ips", type=int, default=None)
-
-    # events
-    p.add_argument("--limit", type=int, default=None,
-                   help="Max events for `events` command")
-
-    # optimistic concurrency
-    p.add_argument("--if-match", dest="if_match", default=None,
-                   metavar="REVISION", help="If-Match revision header")
-
-    return p
-
-
-if __name__ == "__main__":
-    import sys
-
-    parser = _build_parser()
-    args = parser.parse_args()
-
-    cmd = (args.command or "help").lower()
-
-    if cmd in ("help", "--help"):
-        parser.print_help()
-        sys.exit(0)
-
-    if cmd == "gen-secret":
-        print(TememtAPI.generate_secret())
-        sys.exit(0)
-
-    api = TememtAPI(args.url, auth_header=args.auth, timeout=args.timeout)
-
-    try:
-        # -- read endpoints --------------------------------------------------
-        if cmd == "health":
-            _print(api.health())
-
-        elif cmd == "info":
-            _print(api.system_info())
-
-        elif cmd == "status":
-            _print(api.runtime_gates())
-
-        elif cmd == "init":
-            _print(api.runtime_initialization())
-
-        elif cmd == "limits":
-            _print(api.limits_effective())
-
-        elif cmd == "posture":
-            _print(api.security_posture())
-
-        elif cmd == "whitelist":
-            _print(api.security_whitelist())
-
-        elif cmd == "summary":
-            _print(api.stats_summary())
-
-        elif cmd == "zero":
-            _print(api.stats_zero_all())
-
-        elif cmd == "upstreams":
-            _print(api.stats_upstreams())
-
-        elif cmd == "minimal":
-            _print(api.stats_minimal_all())
-
-        elif cmd == "me-writers":
-            _print(api.stats_me_writers())
-
-        elif cmd == "dcs":
-            _print(api.stats_dcs())
-
-        elif cmd == "me-pool":
-            _print(api.runtime_me_pool_state())
-
-        elif cmd == "me-quality":
-            _print(api.runtime_me_quality())
-
-        elif cmd == "upstream-quality":
-            _print(api.runtime_upstream_quality())
-
-        elif cmd == "nat-stun":
-            _print(api.runtime_nat_stun())
-
-        elif cmd == "me-selftest":
-            _print(api.runtime_me_selftest())
-
-        elif cmd == "connections":
-            _print(api.runtime_connections_summary())
-
-        elif cmd == "events":
-            _print(api.runtime_events_recent(limit=args.limit))
-
-        # -- user read -------------------------------------------------------
-        elif cmd == "users":
-            resp = api.list_users()
-            users = resp.data or []
-            if not users:
-                print("No users configured.")
-            else:
-                fmt = "{:<24} {:>7}  {:>14}  {}"
-                print(fmt.format("USERNAME", "CONNS", "OCTETS", "LINKS"))
-                print("-" * 72)
-                for u in users:
-                    links = (u.get("links") or {})
-                    all_links = (links.get("classic") or []) + \
-                                (links.get("secure") or []) + \
-                                (links.get("tls") or [])
-                    link_str = all_links[0] if all_links else "-"
-                    print(fmt.format(
-                        u["username"],
-                        u.get("current_connections", 0),
-                        u.get("total_octets", 0),
-                        link_str,
-                    ))
-            if resp.revision:
-                print(f"# revision: {resp.revision}")
-
-        elif cmd == "user":
-            if not args.arg:
-                parser.error("user command requires <username>")
-            _print(api.get_user(args.arg))
-
-        # -- user write ------------------------------------------------------
-        elif cmd == "create":
-            if not args.arg:
-                parser.error("create command requires <username>")
-            resp = api.create_user(
-                args.arg,
-                secret=args.secret,
-                user_ad_tag=args.ad_tag,
-                max_tcp_conns=args.max_conns,
-                expiration_rfc3339=args.expires,
-                data_quota_bytes=args.quota,
-                max_unique_ips=args.max_ips,
-                if_match=args.if_match,
-            )
-            d = resp.data or {}
-            print(f"Created: {d.get('user', {}).get('username')}")
-            print(f"Secret:  {d.get('secret')}")
-            links = (d.get("user") or {}).get("links") or {}
-            for kind, lst in links.items():
-                for link in (lst or []):
-                    print(f"Link ({kind}): {link}")
-            if resp.revision:
-                print(f"# revision: {resp.revision}")
-
-        elif cmd == "patch":
-            if not args.arg:
-                parser.error("patch command requires <username>")
-            if not any([args.secret, args.ad_tag, args.max_conns,
-                        args.expires, args.quota, args.max_ips]):
-                parser.error("patch requires at least one field (--secret, --max-conns, --expires, --quota, --max-ips, --ad-tag)")
-            _print(api.patch_user(
-                args.arg,
-                secret=args.secret,
-                user_ad_tag=args.ad_tag,
-                max_tcp_conns=args.max_conns,
-                expiration_rfc3339=args.expires,
-                data_quota_bytes=args.quota,
-                max_unique_ips=args.max_ips,
-                if_match=args.if_match,
-            ))
-
-        elif cmd == "delete":
-            if not args.arg:
-                parser.error("delete command requires <username>")
-            resp = api.delete_user(args.arg, if_match=args.if_match)
-            print(f"Deleted: {resp.data}")
-            if resp.revision:
-                print(f"# revision: {resp.revision}")
-
-        elif cmd == "secret":
-            if not args.arg:
-                parser.error("secret command requires <username>")
-            _print(api.rotate_secret(args.arg, secret=args.secret,
-                                     if_match=args.if_match))
-
-        else:
-            print(f"Unknown command: {cmd!r}\nRun with 'help' to see available commands.",
-                  file=sys.stderr)
-            sys.exit(1)
-
-    except TememtAPIError as exc:
-        print(f"API error [{exc.http_status}] {exc.code}: {exc}", file=sys.stderr)
-        sys.exit(1)
-    except KeyboardInterrupt:
-        sys.exit(130)
@@ -1165,60 +1165,6 @@ zabbix_export:
              tags:
                - tag: Application
                  value: 'Users connections'
-          graph_prototypes:
-            - uuid: 4199de3dcea943d8a1ec62dc297b2e9f
-              name: 'User {#TELEMT_USER}: Connections'
-              graph_items:
-                - color: 1A7C11
-                  item:
-                    host: Telemt
-                    key: 'telemt.active_conn_[{#TELEMT_USER}]'
-                - color: F63100
-                  sortorder: '1'
-                  item:
-                    host: Telemt
-                    key: 'telemt.total_conn_[{#TELEMT_USER}]'
-            - uuid: 84b8f22d891e49768891f497cac12fb3
-              name: 'User {#TELEMT_USER}: IPs'
-              graph_items:
-                - color: 0080FF
-                  item:
-                    host: Telemt
-                    key: 'telemt.ips_current_[{#TELEMT_USER}]'
-                - color: FF8000
-                  sortorder: '1'
-                  item:
-                    host: Telemt
-                    key: 'telemt.ips_limit_[{#TELEMT_USER}]'
-                - color: AA00FF
-                  sortorder: '2'
-                  item:
-                    host: Telemt
-                    key: 'telemt.ips_utilization_[{#TELEMT_USER}]'
-            - uuid: 09dabe7125114e36a6ce40788a7cb888
-              name: 'User {#TELEMT_USER}: Traffic'
-              graph_items:
-                - color: 00AA00
-                  item:
-                    host: Telemt
-                    key: 'telemt.octets_from_[{#TELEMT_USER}]'
-                - color: AA0000
-                  sortorder: '1'
-                  item:
-                    host: Telemt
-                    key: 'telemt.octets_to_[{#TELEMT_USER}]'
-            - uuid: 367f458962574b0ab3c02278a4cd7ecb
-              name: 'User {#TELEMT_USER}: Messages'
-              graph_items:
-                - color: 00AAFF
-                  item:
-                    host: Telemt
-                    key: 'telemt.msgs_from_[{#TELEMT_USER}]'
-                - color: FF5500
-                  sortorder: '1'
-                  item:
-                    host: Telemt
-                    key: 'telemt.msgs_to_[{#TELEMT_USER}]'
          master_item:
            key: telemt.prom_metrics
          lld_macro_paths:
@@ -1231,206 +1177,3 @@ zabbix_export:
      tags:
        - tag: target
          value: Telemt
-  graphs:
-    - uuid: f162658049ca4f50893c5cc02515ff10
-      name: 'Telemt: Server Connections Overview'
-      graph_items:
-        - color: 1A7C11
-          item:
-            host: Telemt
-            key: telemt.conn_total
-        - color: F63100
-          sortorder: '1'
-          item:
-            host: Telemt
-            key: telemt.conn_bad_total
-        - color: FC6EA3
-          sortorder: '2'
-          item:
-            host: Telemt
-            key: telemt.handshake_timeouts_total
-    - uuid: 759eca5e687142f19248f9d9343e1adf
-      name: 'Telemt: Uptime'
-      graph_items:
-        - color: 0080FF
-          item:
-            host: Telemt
-            key: telemt.uptime
-    - uuid: 0a27dbd0490d4a508c03ed39fa18545d
-      name: 'Telemt: ME Keepalive'
-      graph_items:
-        - color: 1A7C11
-          item:
-            host: Telemt
-            key: telemt.me_keepalive_sent_total
-        - color: 00AA00
-          sortorder: '1'
-          item:
-            host: Telemt
-            key: telemt.me_keepalive_pong_total
-        - color: F63100
-          sortorder: '2'
-          item:
-            host: Telemt
-            key: telemt.me_keepalive_failed_total
-        - color: FF8000
-          sortorder: '3'
-          item:
-            host: Telemt
-            key: telemt.me_keepalive_timeout_total
-    - uuid: 4015e24ff70b49f484e884d1dde687c0
-      name: 'Telemt: ME Reconnects'
-      graph_items:
-        - color: 0080FF
-          item:
-            host: Telemt
-            key: telemt.me_reconnect_attempts_total
-        - color: 1A7C11
-          sortorder: '1'
-          item:
-            host: Telemt
-            key: telemt.me_reconnect_success_total
-    - uuid: f3e3eeb0663c471aa26cf4b6872b0c50
-      name: 'Telemt: ME Route Drops'
-      graph_items:
-        - color: F63100
-          item:
-            host: Telemt
-            key: telemt.me_route_drop_channel_closed_total
-        - color: FF8000
-          sortorder: '1'
-          item:
-            host: Telemt
-            key: telemt.me_route_drop_no_conn_total
-        - color: AA00FF
-          sortorder: '2'
-          item:
-            host: Telemt
-            key: telemt.me_route_drop_queue_full_total
-    - uuid: 49b51ed78a5943bdbd6d1d34fe28bf61
-      name: 'Telemt: ME Writer Pool'
-      graph_items:
-        - color: 0080FF
-          item:
-            host: Telemt
-            key: telemt.pool_drain_active
-        - color: F63100
-          sortorder: '1'
-          item:
-            host: Telemt
-            key: telemt.pool_force_close_total
-        - color: FF8000
-          sortorder: '2'
-          item:
-            host: Telemt
-            key: telemt.pool_stale_pick_total
-        - color: 1A7C11
-          sortorder: '3'
-          item:
-            host: Telemt
-            key: telemt.pool_swap_total
-    - uuid: a0779e6c979f4c1ab7ac4da7123a5ecb
-      name: 'Telemt: ME Writer Removals and Restores'
-      graph_items:
-        - color: F63100
-          item:
-            host: Telemt
-            key: telemt.me_writer_removed_total
-        - color: FF8000
-          sortorder: '1'
-          item:
-            host: Telemt
-            key: telemt.me_writer_removed_unexpected_total
-        - color: FFAA00
-          sortorder: '2'
-          item:
-            host: Telemt
-            key: telemt.me_writer_removed_unexpected_minus_restored_total
-        - color: 1A7C11
-          sortorder: '3'
-          item:
-            host: Telemt
-            key: telemt.me_writer_restored_same_endpoint_total
-        - color: 00AA00
-          sortorder: '4'
-          item:
-            host: Telemt
-            key: telemt.me_writer_restored_fallback_total
-    - uuid: 4fead70290664953b026a228108bee0e
-      name: 'Telemt: Desync Detections'
-      graph_items:
-        - color: F63100
-          item:
-            host: Telemt
-            key: telemt.desync_total
-        - color: 1A7C11
-          sortorder: '1'
-          item:
-            host: Telemt
-            key: telemt.desync_full_logged_total
-        - color: FF8000
-          sortorder: '2'
-          item:
-            host: Telemt
-            key: telemt.desync_suppressed_total
-    - uuid: 9f8c9f48cb534a66ac21b1bba1acb602
-      name: 'Telemt: Upstream Connect Cycles'
-      graph_items:
-        - color: 0080FF
-          item:
-            host: Telemt
-            key: telemt.upstream_connect_attempt_total
-        - color: 1A7C11
-          sortorder: '1'
-          item:
-            host: Telemt
-            key: telemt.upstream_connect_success_total
-        - color: F63100
-          sortorder: '2'
-          item:
-            host: Telemt
-            key: telemt.upstream_connect_fail_total
-        - color: FF8000
-          sortorder: '3'
-          item:
-            host: Telemt
-            key: telemt.upstream_connect_failfast_hard_error_total
-    - uuid: 05182057727547f8b8884b7e71e34f19
-      name: 'Telemt: ME Single-Endpoint Outages'
-      graph_items:
-        - color: F63100
-          item:
-            host: Telemt
-            key: telemt.me_single_endpoint_outage_enter_total
-        - color: 1A7C11
-          sortorder: '1'
-          item:
-            host: Telemt
-            key: telemt.me_single_endpoint_outage_exit_total
-        - color: 0080FF
-          sortorder: '2'
-          item:
-            host: Telemt
-            key: telemt.me_single_endpoint_outage_reconnect_attempt_total
-        - color: 00AA00
-          sortorder: '3'
-          item:
-            host: Telemt
-            key: telemt.me_single_endpoint_outage_reconnect_success_total
-    - uuid: 6892e8b7fbd2445d9ccc0574af58a354
-      name: 'Telemt: ME Refill Activity'
-      graph_items:
-        - color: 0080FF
-          item:
-            host: Telemt
-            key: telemt.me_refill_triggered_total
-        - color: F63100
-          sortorder: '1'
-          item:
-            host: Telemt
-            key: telemt.me_refill_failed_total
-        - color: FF8000
-          sortorder: '2'
-          item:
-            host: Telemt
-            key: telemt.me_refill_skipped_inflight_total
Author	SHA1	Message	Date
Alexey	f8e1e2f2ea	Merge pull request #495 from DavidOsipov/rescue/flow-sec-security Add adversarial and security tests for client, handshake, and relay modules	2026-03-19 17:33:08 +03:00
David Osipov	924c0d32e9	Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-03-19 18:23:36 +04:00
David Osipov	e6ad9e4c7f	Add security tests for connection limits and handshake integrity - Implement a test to ensure that exceeding the user connection limit does not leak the current connections counter. - Add tests for direct relay connection refusal and adversarial scenarios to verify proper error handling. - Introduce fuzz testing for MTProto handshake to ensure robustness against malformed inputs and replay attacks. - Remove obsolete short TLS probe throttle tests and integrate their functionality into existing security tests. - Enhance middle relay tests to validate behavior during connection drops and cutovers, ensuring graceful error handling. - Add a test for half-close scenarios in relay to confirm bidirectional data flow continues as expected.	2026-03-19 17:31:19 +04:00
David Osipov	2a01ca2d6f	Add adversarial tests for client, handshake, masking, and relay modules - Introduced `client_adversarial_tests.rs` to stress test connection limits and IP tracker race conditions. - Added `handshake_adversarial_tests.rs` for mutational bit-flipping tests and timing neutrality checks. - Created `masking_adversarial_tests.rs` to validate probing indistinguishability and SSRF prevention. - Implemented `relay_adversarial_tests.rs` to ensure HOL blocking prevention and data quota enforcement. - Updated respective modules to include new test paths.	2026-03-19 17:31:19 +04:00
Alexey	44376b5652	Merge pull request #463 from DavidOsipov/pr-sec-1 [WIP] Enhance metrics configuration, add health monitoring tests, security hardening, perf optimizations & loads of tests	2026-03-18 23:02:58 +03:00
David Osipov	c7cf37898b	feat: enhance quota user lock management and testing - Adjusted QUOTA_USER_LOCKS_MAX based on test and non-test configurations to improve flexibility. - Implemented logic to retain existing locks when the maximum quota is reached, ensuring efficient memory usage. - Added comprehensive tests for quota user lock functionality, including cache reuse, saturation behavior, and race conditions. - Enhanced StatsIo struct to manage wake scheduling for read and write operations, preventing unnecessary self-wakes. - Introduced separate replay checker domains for handshake and TLS to ensure isolation and prevent cross-pollution of keys. - Added security tests for replay checker to validate domain separation and window clamping behavior.	2026-03-18 23:55:08 +04:00
David Osipov	20e205189c	Enhance TLS Emulator with ALPN Support and Add Adversarial Tests - Modified `build_emulated_server_hello` to accept ALPN (Application-Layer Protocol Negotiation) as an optional parameter, allowing for the embedding of ALPN markers in the application data payload. - Implemented logic to handle oversized ALPN values and ensure they do not interfere with the application data payload. - Added new security tests in `emulator_security_tests.rs` to validate the behavior of the ALPN embedding, including scenarios for oversized ALPN and preference for certificate payloads over ALPN markers. - Introduced `send_adversarial_tests.rs` to cover edge cases and potential issues in the middle proxy's send functionality, ensuring robustness against various failure modes. - Updated `middle_proxy` module to include new test modules and ensure proper handling of writer commands during data transmission.	2026-03-18 17:04:50 +04:00
David Osipov	97d4a1c5c8	Refactor and enhance security in proxy and handshake modules - Updated `direct_relay_security_tests.rs` to ensure sanitized paths are correctly validated against resolved paths. - Added tests for symlink handling in `unknown_dc_log_path_revalidation` to prevent symlink target escape vulnerabilities. - Modified `handshake.rs` to use a more robust hashing strategy for eviction offsets, improving the eviction logic in `auth_probe_record_failure_with_state`. - Introduced new tests in `handshake_security_tests.rs` to validate eviction logic under various conditions, ensuring low fail streak entries are prioritized for eviction. - Simplified `route_mode.rs` by removing unnecessary atomic mode tracking, streamlining the transition logic in `RouteRuntimeController`. - Enhanced `route_mode_security_tests.rs` with comprehensive tests for mode transitions and their effects on session states, ensuring consistency under concurrent modifications. - Cleaned up `emulator.rs` by removing unused ALPN extension handling, improving code clarity and maintainability.	2026-03-18 01:40:38 +04:00
David Osipov	c2443e6f1a	Refactor auth probe eviction logic and improve performance - Simplified eviction candidate selection in `auth_probe_record_failure_with_state` by tracking the oldest candidate directly. - Enhanced the handling of stale entries to ensure newcomers are tracked even under capacity constraints. - Added tests to verify behavior under stress conditions and ensure newcomers are correctly managed. - Updated `decode_user_secrets` to prioritize preferred users based on SNI hints. - Introduced new tests for TLS SNI handling and replay protection mechanisms. - Improved deduplication hash stability and collision resistance in middle relay logic. - Refined cutover handling in route mode to ensure consistent error messaging and session management.	2026-03-18 00:38:59 +04:00
David Osipov	a7cffb547e	Implement idle timeout for masking relay and add corresponding tests - Introduced `copy_with_idle_timeout` function to handle reading and writing with an idle timeout. - Updated the proxy masking logic to use the new idle timeout function. - Added tests to verify that idle relays are closed by the idle timeout before the global relay timeout. - Ensured that connect refusal paths respect the masking budget and that responses followed by silence are cut off by the idle timeout. - Added tests for adversarial scenarios where clients may attempt to drip-feed data beyond the idle timeout.	2026-03-17 22:48:13 +04:00
David Osipov	f0c37f233e	Refactor health management: implement remove_writer_if_empty method for cleaner writer removal logic and update related functions to enhance efficiency in handling closed writers.	2026-03-17 21:38:15 +04:00
David Osipov	60953bcc2c	Refactor user connection limit checks and enhance health monitoring tests: update warning messages, add new tests for draining writers, and improve state management	2026-03-17 20:53:37 +04:00
David Osipov	2c06288b40	Enhance UserConnectionReservation: add runtime handle for cross-thread IP cleanup and implement tests for user expiration and connection limits	2026-03-17 20:21:01 +04:00
David Osipov	0284b9f9e3	Refactor health integration tests to use wait_for_pool_empty for improved readability and timeout handling	2026-03-17 20:14:07 +04:00
David Osipov	4e3f42dce3	Add must_use attribute to UserConnectionReservation and RouteConnectionLease structs for better resource management	2026-03-17 19:55:55 +04:00
David Osipov	50a827e7fd	Merge upstream/flow-sec into pr-sec-1	2026-03-17 19:48:53 +04:00
David Osipov	d81140ccec	Enhance UserConnectionReservation management: add active state and release method, improve cleanup on drop, and implement tests for immediate release and concurrent handling	2026-03-17 19:39:29 +04:00
David Osipov	c540a6657f	Implement user connection reservation management and enhance relay task handling in proxy	2026-03-17 19:05:26 +04:00
David Osipov	4808a30185	Merge upstream/main into flow-sec rehearsal: resolve config and middle-proxy health conflicts	2026-03-17 18:35:54 +04:00
David Osipov	1357f3cc4c	bump version to 3.3.20 and implement connection lease management for direct and middle relays	2026-03-17 18:16:17 +04:00
David Osipov	d9aa6f4956	Merge upstream/main into pr-sec-1	2026-03-17 17:49:10 +04:00
Alexey	4f55d08c51	Merge pull request #454 from DavidOsipov/pr-sec-1 PR-SEC-1: Additional hardening and masking	2026-03-17 15:35:08 +03:00
David Osipov	93caab1aec	feat(proxy): refactor auth probe failure handling and add concurrent failure tests	2026-03-17 16:25:29 +04:00
David Osipov	0c6bb3a641	feat(proxy): implement auth probe eviction logic and corresponding tests	2026-03-17 15:43:07 +04:00
David Osipov	b2e15327fe	feat(proxy): enhance auth probe handling with IPv6 normalization and eviction logic	2026-03-17 15:15:12 +04:00
Alexey	2e8be87ccf	ME Writer Draining-state fixes	2026-03-17 13:58:01 +03:00
Alexey	d78360982c	Hot-Reload fixes	2026-03-17 13:02:12 +03:00
Alexey	822bcbf7a5	Update Cargo.toml	2026-03-17 11:21:35 +03:00
Alexey	b25ec97a43	Merge pull request #447 from DavidOsipov/pr-sec-1 PR-SEC-1 (WIP): First PR with a narrow batch of security and masking fixes. The focus is on /src/proxy	2026-03-17 11:20:36 +03:00
David Osipov	8821e38013	feat(proxy): enhance auth probe capacity with stale entry pruning and new tests	2026-03-17 02:19:14 +04:00
David Osipov	a1caebbe6f	feat(proxy): implement timeout handling for client payload reads and add corresponding tests	2026-03-17 01:53:44 +04:00
David Osipov	e0d821c6b6	Merge remote-tracking branch 'upstream/main' into pr-sec-1	2026-03-17 01:51:35 +04:00
David Osipov	205fc88718	feat(proxy): enhance logging and deduplication for unknown datacenters - Implemented a mechanism to log unknown datacenter indices with a distinct limit to avoid excessive logging. - Introduced tests to ensure that logging is deduplicated per datacenter index and respects the distinct limit. - Updated the fallback logic for datacenter resolution to prevent panics when only a single datacenter is available. feat(proxy): add authentication probe throttling - Added a pre-authentication probe throttling mechanism to limit the rate of invalid TLS and MTProto handshake attempts. - Introduced a backoff strategy for repeated failures and ensured that successful handshakes reset the failure count. - Implemented tests to validate the behavior of the authentication probe under various conditions. fix(proxy): ensure proper flushing of masked writes - Added a flush operation after writing initial data to the mask writer to ensure data integrity. refactor(proxy): optimize desynchronization deduplication - Replaced the Mutex-based deduplication structure with a DashMap for improved concurrency and performance. - Implemented a bounded cache for deduplication to limit memory usage and prevent stale entries from persisting. test(proxy): enhance security tests for middle relay and handshake - Added comprehensive tests for the middle relay and handshake processes, including scenarios for deduplication and authentication probe behavior. - Ensured that the tests cover edge cases and validate the expected behavior of the system under load.	2026-03-17 01:29:30 +04:00
David Osipov	e4a50f9286	feat(tls): add boot time timestamp constant and validation for SNI hostnames - Introduced `BOOT_TIME_MAX_SECS` constant to define the maximum accepted boot-time timestamp. - Updated `validate_tls_handshake_at_time` to utilize the new boot time constant for timestamp validation. - Enhanced `extract_sni_from_client_hello` to validate SNI hostnames against specified criteria, rejecting invalid hostnames. - Added tests to ensure proper handling of boot time timestamps and SNI validation. feat(handshake): improve user secret decoding and ALPN enforcement - Refactored user secret decoding to provide better error handling and logging for invalid secrets. - Added tests for concurrent identical handshakes to ensure replay protection works as expected. - Implemented ALPN enforcement in handshake processing, rejecting unsupported protocols and allowing valid ones. fix(masking): implement timeout handling for masking operations - Added timeout handling for writing proxy headers and consuming client data in masking. - Adjusted timeout durations for testing to ensure faster feedback during unit tests. - Introduced tests to verify behavior when masking is disabled and when proxy header writes exceed the timeout. test(masking): add tests for slowloris connections and proxy header timeouts - Created tests to validate that slowloris connections are closed by consume timeout when masking is disabled. - Added a test for proxy header write timeout to ensure it returns false when the write operation does not complete.	2026-03-16 21:37:59 +04:00
David Osipov	213ce4555a	Merge remote-tracking branch 'upstream/main' into pr-sec-1	2026-03-16 20:51:53 +04:00
David Osipov	5a16e68487	Enhance TLS record handling and security tests - Enforce TLS record length constraints in client handling to comply with RFC 8446, rejecting records outside the range of 512 to 16,384 bytes. - Update security tests to validate behavior for oversized and undersized TLS records, ensuring they are correctly masked or rejected. - Introduce new tests to verify the handling of TLS records in both generic and client handler pipelines. - Refactor handshake logic to enforce mode restrictions based on transport type, preventing misuse of secure tags. - Add tests for nonce generation and encryption consistency, ensuring correct behavior for different configurations. - Improve masking tests to ensure proper logging and detection of client types, including SSH and unknown probes.	2026-03-16 20:43:49 +04:00
David Osipov	6ffbc51fb0	security: harden handshake/masking flows and add adversarial regressions - forward valid-TLS/invalid-MTProto clients to mask backend in both client paths - harden TLS validation against timing and clock edge cases - move replay tracking behind successful authentication to avoid cache pollution - tighten secret decoding and key-material handling paths - add dedicated security test modules for tls/client/handshake/masking - include production-path regression for ClientHandler fallback behavior	2026-03-16 20:04:41 +04:00
David Osipov	dcab19a64f	ci: remove CI workflow changes (deferred to later PR)	2026-03-16 13:56:46 +04:00
David Osipov	f10ca192fa	chore: merge upstream/main (`92972ab`) into pr-sec-1	2026-03-16 13:50:46 +04:00
David Osipov	2bd9036908	ci: add security policy, cargo-deny configuration, and audit workflow - Add deny.toml with license/advisory policy for cargo-deny - Add security.yml GitHub Actions workflow for automated audit - Update rust.yml with hardened clippy lint enforcement - Update Cargo.toml/Cargo.lock with audit-related dependency additions - Fix clippy lint placement in config.toml (Clippy lints must not live in rustflags) Part of PR-SEC-1: no Rust source changes, establishes CI gates for all subsequent PRs.	2026-03-15 00:30:36 +04:00