This commit introduces a comprehensive set of improvements to enhance
the security, reliability, and configurability of the proxy server,
specifically targeting adversarial resilience and high-load concurrency.
Security & Cryptography:
- Zeroize MTProto cryptographic key material (`dec_key`, `enc_key`)
immediately after use to prevent memory leakage on early returns.
- Move TLS handshake replay tracking after full policy/ALPN validation
to prevent cache poisoning by unauthenticated probes.
- Add `proxy_protocol_trusted_cidrs` configuration to restrict PROXY
protocol headers to trusted networks, rejecting spoofed IPs.
Adversarial Resilience & DoS Mitigation:
- Implement "Tiny Frame Debt" tracking in the middle-relay to prevent
CPU exhaustion from malicious 0-byte or 1-byte frame floods.
- Add `mask_relay_max_bytes` to strictly bound unauthenticated fallback
connections, preventing the proxy from being abused as an open relay.
- Add a 5ms prefetch window (`mask_classifier_prefetch_timeout_ms`) to
correctly assemble and classify fragmented HTTP/1.1 and HTTP/2 probes
(e.g., `PRI * HTTP/2.0`) before routing them to masking heuristics.
- Prevent recursive masking loops (FD exhaustion) by verifying the mask
target is not the proxy's own listener via local interface enumeration.
Concurrency & Reliability:
- Eliminate executor waker storms during quota lock contention by replacing
the spin-waker task with inline `Sleep` and exponential backoff.
- Roll back user quota reservations (`rollback_me2c_quota_reservation`)
if a network write fails, preventing Head-of-Line (HoL) blocking from
permanently burning data quotas.
- Recover gracefully from idle-registry `Mutex` poisoning instead of
panicking, ensuring isolated thread failures do not break the proxy.
- Fix `auth_probe_scan_start_offset` modulo logic to ensure bounds safety.
Testing:
- Add extensive adversarial, timing, fuzzing, and invariant test suites
for both the client and handshake modules.
- Bump versions of several dependencies in Cargo.toml for improved functionality and security, including:
- socket2 to 0.6
- nix to 0.31
- toml to 1.0
- x509-parser to 0.18
- dashmap to 6.1
- rand to 0.10
- reqwest to 0.13
- notify to 8.2
- ipnetwork to 0.21
- webpki-roots to 1.0
- criterion to 0.8
- Introduce `OnceLock` for secure random number generation in multiple modules to ensure thread safety and reduce overhead.
- Refactor random number generation calls to use the new `RngExt` trait methods for consistency and clarity.
- Add new PNG files for architectural documentation.
- Introduced red-team expected-fail tests for client masking shape hardening.
- Added integration tests for masking AB envelope blur to improve obfuscation.
- Implemented masking security tests to validate the behavior of masking under various conditions.
- Created tests for masking shape above-cap blur to ensure proper functionality.
- Developed adversarial tests for masking shape hardening to evaluate robustness against attacks.
- Added timing normalization security tests to assess the effectiveness of timing obfuscation.
- Implemented red-team expected-fail tests for timing side-channel vulnerabilities.
- Removed assertions for expected client hello messages in multiple TLS fallback tests to streamline the test logic.
- Updated the tests to focus on verifying the trailing TLS records received after the fallback.
- Enhanced the masking functionality by adding shape hardening features, including dynamic padding based on sent data size.
- Modified the relay_to_mask function to accommodate new parameters for shape hardening.
- Updated masking security tests to reflect changes in the relay_to_mask function signature.
- Introduced multiple adversarial tests for MTProto handshake to ensure robustness against replay attacks, invalid mutations, and concurrent flooding.
- Implemented a function to build proxy headers based on the specified version, improving the handling of masking protocols.
- Added tests to validate the behavior of the masking functionality under various conditions, including unknown proxy protocol versions and oversized payloads.
- Enhanced relay tests to ensure stability and performance under high load and half-close scenarios.
- Introduced a new test module for middle relay idle policy security tests, covering various scenarios including soft mark, hard close, and grace periods.
- Implemented functions to create crypto readers and encrypt data for testing.
- Enhanced the Stats struct to include counters for relay idle soft marks, hard closes, pressure evictions, and protocol desync closes.
- Added corresponding increment and retrieval methods for the new stats fields.
- Simplified eviction candidate selection in `auth_probe_record_failure_with_state` by tracking the oldest candidate directly.
- Enhanced the handling of stale entries to ensure newcomers are tracked even under capacity constraints.
- Added tests to verify behavior under stress conditions and ensure newcomers are correctly managed.
- Updated `decode_user_secrets` to prioritize preferred users based on SNI hints.
- Introduced new tests for TLS SNI handling and replay protection mechanisms.
- Improved deduplication hash stability and collision resistance in middle relay logic.
- Refined cutover handling in route mode to ensure consistent error messaging and session management.
- Introduced `copy_with_idle_timeout` function to handle reading and writing with an idle timeout.
- Updated the proxy masking logic to use the new idle timeout function.
- Added tests to verify that idle relays are closed by the idle timeout before the global relay timeout.
- Ensured that connect refusal paths respect the masking budget and that responses followed by silence are cut off by the idle timeout.
- Added tests for adversarial scenarios where clients may attempt to drip-feed data beyond the idle timeout.
- Bump telemt dependency version from 3.3.15 to 3.3.19.
- Add `metrics_listen` option to `config.toml` for specifying a custom address for the metrics endpoint.
- Update `ServerConfig` struct to include `metrics_listen` and adjust logic in `spawn_metrics_if_configured` to prioritize this new option over `metrics_port`.
- Enhance error handling for invalid listen addresses in metrics setup.
- Implemented a mechanism to log unknown datacenter indices with a distinct limit to avoid excessive logging.
- Introduced tests to ensure that logging is deduplicated per datacenter index and respects the distinct limit.
- Updated the fallback logic for datacenter resolution to prevent panics when only a single datacenter is available.
feat(proxy): add authentication probe throttling
- Added a pre-authentication probe throttling mechanism to limit the rate of invalid TLS and MTProto handshake attempts.
- Introduced a backoff strategy for repeated failures and ensured that successful handshakes reset the failure count.
- Implemented tests to validate the behavior of the authentication probe under various conditions.
fix(proxy): ensure proper flushing of masked writes
- Added a flush operation after writing initial data to the mask writer to ensure data integrity.
refactor(proxy): optimize desynchronization deduplication
- Replaced the Mutex-based deduplication structure with a DashMap for improved concurrency and performance.
- Implemented a bounded cache for deduplication to limit memory usage and prevent stale entries from persisting.
test(proxy): enhance security tests for middle relay and handshake
- Added comprehensive tests for the middle relay and handshake processes, including scenarios for deduplication and authentication probe behavior.
- Ensured that the tests cover edge cases and validate the expected behavior of the system under load.
This commit adds support for configuring the data path via a
configuration file or command-line option. This may be useful
on systems without systemd, such as OpenWrt or Alpine Linux.
Signed-off-by: Maxim Anisimov <maxim.anisimov.ua@gmail.com>