mirror of https://github.com/telemt/telemt.git
Integration test merge: upstream/main into flow-sec security branch (prefer flow-sec on conflicts)
This commit is contained in:
commit
7b44496706
|
|
@ -0,0 +1,208 @@
|
|||
# Code of Conduct
|
||||
|
||||
## 1. Purpose
|
||||
|
||||
Telemt exists to solve technical problems.
|
||||
|
||||
Telemt is open to contributors who want to learn, improve and build meaningful systems together.
|
||||
|
||||
It is a place for building, testing, reasoning, documenting, and improving systems.
|
||||
|
||||
Discussions that advance this work are in scope. Discussions that divert it are not.
|
||||
|
||||
Technology has consequences. Responsibility is inherent.
|
||||
|
||||
> **Zweck bestimmt die Form.**
|
||||
|
||||
> Purpose defines form.
|
||||
|
||||
---
|
||||
|
||||
## 2. Principles
|
||||
|
||||
* **Technical over emotional**
|
||||
Arguments are grounded in data, logs, reproducible cases, or clear reasoning.
|
||||
|
||||
* **Clarity over noise**
|
||||
Communication is structured, concise, and relevant.
|
||||
|
||||
* **Openness with standards**
|
||||
Participation is open. The work remains disciplined.
|
||||
|
||||
* **Independence of judgment**
|
||||
Claims are evaluated on technical merit, not affiliation or posture.
|
||||
|
||||
* **Responsibility over capability**
|
||||
Capability does not justify careless use.
|
||||
|
||||
* **Cooperation over friction**
|
||||
Progress depends on coordination, mutual support, and honest review.
|
||||
|
||||
* **Good intent, rigorous method**
|
||||
Assume good intent, but require rigor.
|
||||
|
||||
> **Aussagen gelten nach ihrer Begründung.**
|
||||
|
||||
> Claims are weighed by evidence.
|
||||
|
||||
---
|
||||
|
||||
## 3. Expected Behavior
|
||||
|
||||
Participants are expected to:
|
||||
|
||||
* Communicate directly and respectfully
|
||||
* Support claims with evidence
|
||||
* Stay within technical scope
|
||||
* Accept critique and provide it constructively
|
||||
* Reduce noise, duplication, and ambiguity
|
||||
* Help others reach correct and reproducible outcomes
|
||||
* Act in a way that improves the system as a whole
|
||||
|
||||
Precision is learned.
|
||||
|
||||
New contributors are welcome. They are expected to grow into these standards. Existing contributors are expected to make that growth possible.
|
||||
|
||||
> **Wer behauptet, belegt.**
|
||||
|
||||
> Whoever claims, proves.
|
||||
|
||||
---
|
||||
|
||||
## 4. Unacceptable Behavior
|
||||
|
||||
The following is not allowed:
|
||||
|
||||
* Personal attacks, insults, harassment, or intimidation
|
||||
* Repeatedly derailing discussion away from Telemt’s purpose
|
||||
* Spam, flooding, or repeated low-quality input
|
||||
* Misinformation presented as fact
|
||||
* Attempts to degrade, destabilize, or exhaust Telemt or its participants
|
||||
* Use of Telemt or its spaces to enable harm
|
||||
|
||||
Telemt is not a venue for disputes that displace technical work.
|
||||
Such discussions may be closed, removed, or redirected.
|
||||
|
||||
> **Störung ist kein Beitrag.**
|
||||
|
||||
> Disruption is not contribution.
|
||||
|
||||
---
|
||||
|
||||
## 5. Security and Misuse
|
||||
|
||||
Telemt is intended for responsible use.
|
||||
|
||||
* Do not use it to plan, coordinate, or execute harm
|
||||
* Do not publish vulnerabilities without responsible disclosure
|
||||
* Report security issues privately where possible
|
||||
|
||||
Security is both technical and behavioral.
|
||||
|
||||
> **Verantwortung endet nicht am Code.**
|
||||
|
||||
> Responsibility does not end at the code.
|
||||
|
||||
---
|
||||
|
||||
## 6. Openness
|
||||
|
||||
Telemt is open to contributors of different backgrounds, experience levels, and working styles.
|
||||
|
||||
Standards are public, legible, and applied to the work itself.
|
||||
|
||||
Questions are welcome. Careful disagreement is welcome. Honest correction is welcome.
|
||||
|
||||
Gatekeeping by obscurity, status signaling, or hostility is not.
|
||||
|
||||
---
|
||||
|
||||
## 7. Scope
|
||||
|
||||
This Code of Conduct applies to all official spaces:
|
||||
|
||||
* Source repositories (issues, pull requests, discussions)
|
||||
* Documentation
|
||||
* Communication channels associated with Telemt
|
||||
|
||||
---
|
||||
|
||||
## 8. Maintainer Stewardship
|
||||
|
||||
Maintainers are responsible for final decisions in matters of conduct, scope, and direction.
|
||||
|
||||
This responsibility is stewardship: preserving continuity, protecting signal, maintaining standards, and keeping Telemt workable for others.
|
||||
|
||||
Judgment should be exercised with restraint, consistency, and institutional responsibility.
|
||||
|
||||
Not every decision requires extended debate.
|
||||
Not every intervention requires public explanation.
|
||||
|
||||
All decisions are expected to serve the durability, clarity, and integrity of Telemt.
|
||||
|
||||
> **Ordnung ist Voraussetzung der Funktion.**
|
||||
|
||||
> Order is the precondition of function.
|
||||
|
||||
---
|
||||
|
||||
## 9. Enforcement
|
||||
|
||||
Maintainers may act to preserve the integrity of Telemt, including by:
|
||||
|
||||
* Removing content
|
||||
* Locking discussions
|
||||
* Rejecting contributions
|
||||
* Restricting or banning participants
|
||||
|
||||
Actions are taken to maintain function, continuity, and signal quality.
|
||||
|
||||
Where possible, correction is preferred to exclusion.
|
||||
|
||||
Where necessary, exclusion is preferred to decay.
|
||||
|
||||
---
|
||||
|
||||
## 10. Final
|
||||
|
||||
Telemt is built on discipline, structure, and shared intent.
|
||||
|
||||
Signal over noise.
|
||||
Facts over opinion.
|
||||
Systems over rhetoric.
|
||||
|
||||
Work is collective.
|
||||
Outcomes are shared.
|
||||
Responsibility is distributed.
|
||||
|
||||
Precision is learned.
|
||||
Rigor is expected.
|
||||
Help is part of the work.
|
||||
|
||||
> **Ordnung ist Voraussetzung der Freiheit.**
|
||||
|
||||
If you contribute — contribute with care.
|
||||
If you speak — speak with substance.
|
||||
If you engage — engage constructively.
|
||||
|
||||
---
|
||||
|
||||
## 11. After All
|
||||
|
||||
Systems outlive intentions.
|
||||
|
||||
What is built will be used.
|
||||
What is released will propagate.
|
||||
What is maintained will define the future state.
|
||||
|
||||
There is no neutral infrastructure, only infrastructure shaped well or poorly.
|
||||
|
||||
> **Jedes System trägt Verantwortung.**
|
||||
|
||||
> Every system carries responsibility.
|
||||
|
||||
Stability requires discipline.
|
||||
Freedom requires structure.
|
||||
Trust requires honesty.
|
||||
|
||||
In the end, the system reflects its contributors.
|
||||
|
|
@ -1,6 +1,6 @@
|
|||
[package]
|
||||
name = "telemt"
|
||||
version = "3.3.20"
|
||||
version = "3.3.24"
|
||||
edition = "2024"
|
||||
|
||||
[dependencies]
|
||||
|
|
|
|||
|
|
@ -0,0 +1,294 @@
|
|||
# Telemt Config Parameters Reference
|
||||
|
||||
This document lists all configuration keys accepted by `config.toml`.
|
||||
|
||||
> [!WARNING]
|
||||
>
|
||||
> The configuration parameters detailed in this document are intended for advanced users and fine-tuning purposes. Modifying these settings without a clear understanding of their function may lead to application instability or other unexpected behavior. Please proceed with caution and at your own risk.
|
||||
|
||||
## Top-level keys
|
||||
|
||||
| Parameter | Type | Default | Constraints / validation | Description |
|
||||
|---|---|---|---|---|
|
||||
| include | `String` (special directive) | `null` | — | Includes another TOML file with `include = "relative/or/absolute/path.toml"`; includes are processed recursively before parsing. |
|
||||
| show_link | `"*" \| String[]` | `[]` (`ShowLink::None`) | — | Legacy top-level link visibility selector (`"*"` for all users or explicit usernames list). |
|
||||
| dc_overrides | `Map<String, String[]>` | `{}` | — | Overrides DC endpoints for non-standard DCs; key is DC id string, value is `ip:port` list. |
|
||||
| default_dc | `u8 \| null` | `null` (effective fallback: `2` in ME routing) | — | Default DC index used for unmapped non-standard DCs. |
|
||||
|
||||
## [general]
|
||||
|
||||
| Parameter | Type | Default | Constraints / validation | Description |
|
||||
|---|---|---|---|---|
|
||||
| data_path | `String \| null` | `null` | — | Optional runtime data directory path. |
|
||||
| prefer_ipv6 | `bool` | `false` | — | Prefer IPv6 where applicable in runtime logic. |
|
||||
| fast_mode | `bool` | `true` | — | Enables fast-path optimizations for traffic processing. |
|
||||
| use_middle_proxy | `bool` | `true` | none | Enables ME transport mode; if `false`, runtime falls back to direct DC routing. |
|
||||
| proxy_secret_path | `String \| null` | `"proxy-secret"` | Path may be `null`. | Path to Telegram infrastructure proxy-secret file used by ME handshake logic. |
|
||||
| proxy_config_v4_cache_path | `String \| null` | `"cache/proxy-config-v4.txt"` | — | Optional cache path for raw `getProxyConfig` (IPv4) snapshot. |
|
||||
| proxy_config_v6_cache_path | `String \| null` | `"cache/proxy-config-v6.txt"` | — | Optional cache path for raw `getProxyConfigV6` (IPv6) snapshot. |
|
||||
| ad_tag | `String \| null` | `null` | — | Global fallback ad tag (32 hex characters). |
|
||||
| middle_proxy_nat_ip | `IpAddr \| null` | `null` | Must be a valid IP when set. | Manual public NAT IP override used as ME address material when set. |
|
||||
| middle_proxy_nat_probe | `bool` | `true` | Auto-forced to `true` when `use_middle_proxy = true`. | Enables ME NAT probing; runtime may force it on when ME mode is active. |
|
||||
| middle_proxy_nat_stun | `String \| null` | `null` | Deprecated. Use `network.stun_servers`. | Deprecated legacy single STUN server for NAT probing. |
|
||||
| middle_proxy_nat_stun_servers | `String[]` | `[]` | Deprecated. Use `network.stun_servers`. | Deprecated legacy STUN list for NAT probing fallback. |
|
||||
| stun_nat_probe_concurrency | `usize` | `8` | Must be `> 0`. | Maximum number of parallel STUN probes during NAT/public endpoint discovery. |
|
||||
| middle_proxy_pool_size | `usize` | `8` | none | Target size of active ME writer pool. |
|
||||
| middle_proxy_warm_standby | `usize` | `16` | none | Reserved compatibility field in current runtime revision. |
|
||||
| me_init_retry_attempts | `u32` | `0` | `0..=1_000_000`. | Startup retries for ME pool initialization (`0` means unlimited). |
|
||||
| me2dc_fallback | `bool` | `true` | — | Allows fallback from ME mode to direct DC when ME startup fails. |
|
||||
| me_keepalive_enabled | `bool` | `true` | none | Enables periodic ME keepalive/ping traffic. |
|
||||
| me_keepalive_interval_secs | `u64` | `8` | none | Base ME keepalive interval in seconds. |
|
||||
| me_keepalive_jitter_secs | `u64` | `2` | none | Keepalive jitter in seconds to reduce synchronized bursts. |
|
||||
| me_keepalive_payload_random | `bool` | `true` | none | Randomizes keepalive payload bytes instead of fixed zero payload. |
|
||||
| rpc_proxy_req_every | `u64` | `0` | `0` or `10..=300`. | Interval for service `RPC_PROXY_REQ` activity signals (`0` disables). |
|
||||
| me_writer_cmd_channel_capacity | `usize` | `4096` | Must be `> 0`. | Capacity of per-writer command channel. |
|
||||
| me_route_channel_capacity | `usize` | `768` | Must be `> 0`. | Capacity of per-connection ME response route channel. |
|
||||
| me_c2me_channel_capacity | `usize` | `1024` | Must be `> 0`. | Capacity of per-client command queue (client reader -> ME sender). |
|
||||
| me_reader_route_data_wait_ms | `u64` | `2` | `0..=20`. | Bounded wait for routing ME DATA to per-connection queue (`0` = no wait). |
|
||||
| me_d2c_flush_batch_max_frames | `usize` | `32` | `1..=512`. | Max ME->client frames coalesced before flush. |
|
||||
| me_d2c_flush_batch_max_bytes | `usize` | `131072` | `4096..=2_097_152`. | Max ME->client payload bytes coalesced before flush. |
|
||||
| me_d2c_flush_batch_max_delay_us | `u64` | `500` | `0..=5000`. | Max microsecond wait for coalescing more ME->client frames (`0` disables timed coalescing). |
|
||||
| me_d2c_ack_flush_immediate | `bool` | `true` | — | Flushes client writer immediately after quick-ack write. |
|
||||
| direct_relay_copy_buf_c2s_bytes | `usize` | `65536` | `4096..=1_048_576`. | Copy buffer size for client->DC direction in direct relay. |
|
||||
| direct_relay_copy_buf_s2c_bytes | `usize` | `262144` | `8192..=2_097_152`. | Copy buffer size for DC->client direction in direct relay. |
|
||||
| crypto_pending_buffer | `usize` | `262144` | — | Max pending ciphertext buffer per client writer (bytes). |
|
||||
| max_client_frame | `usize` | `16777216` | — | Maximum allowed client MTProto frame size (bytes). |
|
||||
| desync_all_full | `bool` | `false` | — | Emits full crypto-desync forensic logs for every event. |
|
||||
| beobachten | `bool` | `true` | — | Enables per-IP forensic observation buckets. |
|
||||
| beobachten_minutes | `u64` | `10` | Must be `> 0`. | Retention window (minutes) for per-IP observation buckets. |
|
||||
| beobachten_flush_secs | `u64` | `15` | Must be `> 0`. | Snapshot flush interval (seconds) for observation output file. |
|
||||
| beobachten_file | `String` | `"cache/beobachten.txt"` | — | Observation snapshot output file path. |
|
||||
| hardswap | `bool` | `true` | none | Enables generation-based ME hardswap strategy. |
|
||||
| me_warmup_stagger_enabled | `bool` | `true` | none | Staggers extra ME warmup dials to avoid connection spikes. |
|
||||
| me_warmup_step_delay_ms | `u64` | `500` | none | Base delay in milliseconds between warmup dial steps. |
|
||||
| me_warmup_step_jitter_ms | `u64` | `300` | none | Additional random delay in milliseconds for warmup steps. |
|
||||
| me_reconnect_max_concurrent_per_dc | `u32` | `8` | none | Limits concurrent reconnect workers per DC during health recovery. |
|
||||
| me_reconnect_backoff_base_ms | `u64` | `500` | none | Initial reconnect backoff in milliseconds. |
|
||||
| me_reconnect_backoff_cap_ms | `u64` | `30000` | none | Maximum reconnect backoff cap in milliseconds. |
|
||||
| me_reconnect_fast_retry_count | `u32` | `16` | none | Immediate retry budget before long backoff behavior applies. |
|
||||
| me_single_endpoint_shadow_writers | `u8` | `2` | `0..=32`. | Additional reserve writers for one-endpoint DC groups. |
|
||||
| me_single_endpoint_outage_mode_enabled | `bool` | `true` | — | Enables aggressive outage recovery for one-endpoint DC groups. |
|
||||
| me_single_endpoint_outage_disable_quarantine | `bool` | `true` | — | Ignores endpoint quarantine in one-endpoint outage mode. |
|
||||
| me_single_endpoint_outage_backoff_min_ms | `u64` | `250` | Must be `> 0`; also `<= me_single_endpoint_outage_backoff_max_ms`. | Minimum reconnect backoff in outage mode (ms). |
|
||||
| me_single_endpoint_outage_backoff_max_ms | `u64` | `3000` | Must be `> 0`; also `>= me_single_endpoint_outage_backoff_min_ms`. | Maximum reconnect backoff in outage mode (ms). |
|
||||
| me_single_endpoint_shadow_rotate_every_secs | `u64` | `900` | — | Periodic shadow writer rotation interval (`0` disables). |
|
||||
| me_floor_mode | `"static" \| "adaptive"` | `"adaptive"` | — | Writer floor policy mode. |
|
||||
| me_adaptive_floor_idle_secs | `u64` | `90` | — | Idle time before adaptive floor may reduce one-endpoint target. |
|
||||
| me_adaptive_floor_min_writers_single_endpoint | `u8` | `1` | `1..=32`. | Minimum adaptive writer target for one-endpoint DC groups. |
|
||||
| me_adaptive_floor_min_writers_multi_endpoint | `u8` | `1` | `1..=32`. | Minimum adaptive writer target for multi-endpoint DC groups. |
|
||||
| me_adaptive_floor_recover_grace_secs | `u64` | `180` | — | Grace period to hold static floor after activity. |
|
||||
| me_adaptive_floor_writers_per_core_total | `u16` | `48` | Must be `> 0`. | Global writer budget per logical CPU core in adaptive mode. |
|
||||
| me_adaptive_floor_cpu_cores_override | `u16` | `0` | — | Manual CPU core count override (`0` uses auto-detection). |
|
||||
| me_adaptive_floor_max_extra_writers_single_per_core | `u16` | `1` | — | Per-core max extra writers above base floor for one-endpoint DCs. |
|
||||
| me_adaptive_floor_max_extra_writers_multi_per_core | `u16` | `2` | — | Per-core max extra writers above base floor for multi-endpoint DCs. |
|
||||
| me_adaptive_floor_max_active_writers_per_core | `u16` | `64` | Must be `> 0`. | Hard cap for active ME writers per logical CPU core. |
|
||||
| me_adaptive_floor_max_warm_writers_per_core | `u16` | `64` | Must be `> 0`. | Hard cap for warm ME writers per logical CPU core. |
|
||||
| me_adaptive_floor_max_active_writers_global | `u32` | `256` | Must be `> 0`. | Hard global cap for active ME writers. |
|
||||
| me_adaptive_floor_max_warm_writers_global | `u32` | `256` | Must be `> 0`. | Hard global cap for warm ME writers. |
|
||||
| upstream_connect_retry_attempts | `u32` | `2` | Must be `> 0`. | Connect attempts for selected upstream before error/fallback. |
|
||||
| upstream_connect_retry_backoff_ms | `u64` | `100` | — | Delay between upstream connect attempts (ms). |
|
||||
| upstream_connect_budget_ms | `u64` | `3000` | Must be `> 0`. | Total wall-clock budget for one upstream connect request (ms). |
|
||||
| upstream_unhealthy_fail_threshold | `u32` | `5` | Must be `> 0`. | Consecutive failed requests before upstream is marked unhealthy. |
|
||||
| upstream_connect_failfast_hard_errors | `bool` | `false` | — | Skips additional retries for hard non-transient connect errors. |
|
||||
| stun_iface_mismatch_ignore | `bool` | `false` | none | Reserved compatibility flag in current runtime revision. |
|
||||
| unknown_dc_log_path | `String \| null` | `"unknown-dc.txt"` | — | File path for unknown-DC request logging (`null` disables file path). |
|
||||
| unknown_dc_file_log_enabled | `bool` | `false` | — | Enables unknown-DC file logging. |
|
||||
| log_level | `"debug" \| "verbose" \| "normal" \| "silent"` | `"normal"` | — | Runtime logging verbosity. |
|
||||
| disable_colors | `bool` | `false` | — | Disables ANSI colors in logs. |
|
||||
| me_socks_kdf_policy | `"strict" \| "compat"` | `"strict"` | — | SOCKS-bound KDF fallback policy for ME handshake. |
|
||||
| me_route_backpressure_base_timeout_ms | `u64` | `25` | Must be `> 0`. | Base backpressure timeout for route-channel send (ms). |
|
||||
| me_route_backpressure_high_timeout_ms | `u64` | `120` | Must be `>= me_route_backpressure_base_timeout_ms`. | High backpressure timeout when queue occupancy exceeds watermark (ms). |
|
||||
| me_route_backpressure_high_watermark_pct | `u8` | `80` | `1..=100`. | Queue occupancy threshold (%) for high timeout mode. |
|
||||
| me_health_interval_ms_unhealthy | `u64` | `1000` | Must be `> 0`. | Health monitor interval while writer coverage is degraded (ms). |
|
||||
| me_health_interval_ms_healthy | `u64` | `3000` | Must be `> 0`. | Health monitor interval while writer coverage is healthy (ms). |
|
||||
| me_admission_poll_ms | `u64` | `1000` | Must be `> 0`. | Poll interval for conditional-admission checks (ms). |
|
||||
| me_warn_rate_limit_ms | `u64` | `5000` | Must be `> 0`. | Cooldown for repetitive ME warning logs (ms). |
|
||||
| me_route_no_writer_mode | `"async_recovery_failfast" \| "inline_recovery_legacy" \| "hybrid_async_persistent"` | `"hybrid_async_persistent"` | — | Route behavior when no writer is immediately available. |
|
||||
| me_route_no_writer_wait_ms | `u64` | `250` | `10..=5000`. | Max wait in async-recovery failfast mode (ms). |
|
||||
| me_route_inline_recovery_attempts | `u32` | `3` | Must be `> 0`. | Inline recovery attempts in legacy mode. |
|
||||
| me_route_inline_recovery_wait_ms | `u64` | `3000` | `10..=30000`. | Max inline recovery wait in legacy mode (ms). |
|
||||
| fast_mode_min_tls_record | `usize` | `0` | — | Minimum TLS record size when fast-mode coalescing is enabled (`0` disables). |
|
||||
| update_every | `u64 \| null` | `300` | If set: must be `> 0`; if `null`: legacy fallback path is used. | Unified refresh interval for ME config and proxy-secret updater tasks. |
|
||||
| me_reinit_every_secs | `u64` | `900` | Must be `> 0`. | Periodic interval for zero-downtime ME reinit cycle. |
|
||||
| me_hardswap_warmup_delay_min_ms | `u64` | `1000` | Must be `<= me_hardswap_warmup_delay_max_ms`. | Lower bound for hardswap warmup dial spacing. |
|
||||
| me_hardswap_warmup_delay_max_ms | `u64` | `2000` | Must be `> 0`. | Upper bound for hardswap warmup dial spacing. |
|
||||
| me_hardswap_warmup_extra_passes | `u8` | `3` | Must be within `[0, 10]`. | Additional warmup passes after the base pass in one hardswap cycle. |
|
||||
| me_hardswap_warmup_pass_backoff_base_ms | `u64` | `500` | Must be `> 0`. | Base backoff between extra hardswap warmup passes. |
|
||||
| me_config_stable_snapshots | `u8` | `2` | Must be `> 0`. | Number of identical ME config snapshots required before apply. |
|
||||
| me_config_apply_cooldown_secs | `u64` | `300` | none | Cooldown between applied ME endpoint-map updates. |
|
||||
| me_snapshot_require_http_2xx | `bool` | `true` | — | Requires 2xx HTTP responses for applying config snapshots. |
|
||||
| me_snapshot_reject_empty_map | `bool` | `true` | — | Rejects empty config snapshots. |
|
||||
| me_snapshot_min_proxy_for_lines | `u32` | `1` | Must be `> 0`. | Minimum parsed `proxy_for` rows required to accept snapshot. |
|
||||
| proxy_secret_stable_snapshots | `u8` | `2` | Must be `> 0`. | Number of identical proxy-secret snapshots required before rotation. |
|
||||
| proxy_secret_rotate_runtime | `bool` | `true` | none | Enables runtime proxy-secret rotation from updater snapshots. |
|
||||
| me_secret_atomic_snapshot | `bool` | `true` | — | Keeps selector and secret bytes from the same snapshot atomically. |
|
||||
| proxy_secret_len_max | `usize` | `256` | Must be within `[32, 4096]`. | Upper length limit for accepted proxy-secret bytes. |
|
||||
| me_pool_drain_ttl_secs | `u64` | `90` | none | Time window where stale writers remain fallback-eligible after map change. |
|
||||
| me_pool_drain_threshold | `u64` | `128` | — | Max draining stale writers before batch force-close (`0` disables threshold cleanup). |
|
||||
| me_pool_drain_soft_evict_enabled | `bool` | `true` | — | Enables gradual soft-eviction of stale writers during drain/reinit instead of immediate hard close. |
|
||||
| me_pool_drain_soft_evict_grace_secs | `u64` | `30` | `0..=3600`. | Grace period before stale writers become soft-evict candidates. |
|
||||
| me_pool_drain_soft_evict_per_writer | `u8` | `1` | `1..=16`. | Maximum stale routes soft-evicted per writer in one eviction pass. |
|
||||
| me_pool_drain_soft_evict_budget_per_core | `u16` | `8` | `1..=64`. | Per-core budget limiting aggregate soft-eviction work per pass. |
|
||||
| me_pool_drain_soft_evict_cooldown_ms | `u64` | `5000` | Must be `> 0`. | Cooldown between consecutive soft-eviction passes (ms). |
|
||||
| me_bind_stale_mode | `"never" \| "ttl" \| "always"` | `"ttl"` | — | Policy for new binds on stale draining writers. |
|
||||
| me_bind_stale_ttl_secs | `u64` | `90` | — | TTL for stale bind allowance when stale mode is `ttl`. |
|
||||
| me_pool_min_fresh_ratio | `f32` | `0.8` | Must be within `[0.0, 1.0]`. | Minimum fresh desired-DC coverage ratio before stale writers are drained. |
|
||||
| me_reinit_drain_timeout_secs | `u64` | `120` | `0` disables force-close; if `> 0` and `< me_pool_drain_ttl_secs`, runtime bumps it to TTL. | Force-close timeout for draining stale writers (`0` keeps indefinite draining). |
|
||||
| proxy_secret_auto_reload_secs | `u64` | `3600` | Deprecated. Use `general.update_every`. | Deprecated legacy secret reload interval (fallback when `update_every` is not set). |
|
||||
| proxy_config_auto_reload_secs | `u64` | `3600` | Deprecated. Use `general.update_every`. | Deprecated legacy config reload interval (fallback when `update_every` is not set). |
|
||||
| me_reinit_singleflight | `bool` | `true` | — | Serializes ME reinit cycles across trigger sources. |
|
||||
| me_reinit_trigger_channel | `usize` | `64` | Must be `> 0`. | Trigger queue capacity for reinit scheduler. |
|
||||
| me_reinit_coalesce_window_ms | `u64` | `200` | — | Trigger coalescing window before starting reinit (ms). |
|
||||
| me_deterministic_writer_sort | `bool` | `true` | — | Enables deterministic candidate sort for writer binding path. |
|
||||
| me_writer_pick_mode | `"sorted_rr" \| "p2c"` | `"p2c"` | — | Writer selection mode for route bind path. |
|
||||
| me_writer_pick_sample_size | `u8` | `3` | `2..=4`. | Number of candidates sampled by picker in `p2c` mode. |
|
||||
| ntp_check | `bool` | `true` | — | Enables NTP drift check at startup. |
|
||||
| ntp_servers | `String[]` | `["pool.ntp.org"]` | — | NTP servers used for drift check. |
|
||||
| auto_degradation_enabled | `bool` | `true` | none | Reserved compatibility flag in current runtime revision. |
|
||||
| degradation_min_unavailable_dc_groups | `u8` | `2` | none | Reserved compatibility threshold in current runtime revision. |
|
||||
|
||||
## [general.modes]
|
||||
|
||||
| Parameter | Type | Default | Constraints / validation | Description |
|
||||
|---|---|---|---|---|
|
||||
| classic | `bool` | `false` | — | Enables classic MTProxy mode. |
|
||||
| secure | `bool` | `false` | — | Enables secure mode. |
|
||||
| tls | `bool` | `true` | — | Enables TLS mode. |
|
||||
|
||||
## [general.links]
|
||||
|
||||
| Parameter | Type | Default | Constraints / validation | Description |
|
||||
|---|---|---|---|---|
|
||||
| show | `"*" \| String[]` | `"*"` | — | Selects users whose tg:// links are shown at startup. |
|
||||
| public_host | `String \| null` | `null` | — | Public hostname/IP override for generated tg:// links. |
|
||||
| public_port | `u16 \| null` | `null` | — | Public port override for generated tg:// links. |
|
||||
|
||||
## [general.telemetry]
|
||||
|
||||
| Parameter | Type | Default | Constraints / validation | Description |
|
||||
|---|---|---|---|---|
|
||||
| core_enabled | `bool` | `true` | — | Enables core hot-path telemetry counters. |
|
||||
| user_enabled | `bool` | `true` | — | Enables per-user telemetry counters. |
|
||||
| me_level | `"silent" \| "normal" \| "debug"` | `"normal"` | — | Middle-End telemetry verbosity level. |
|
||||
|
||||
## [network]
|
||||
|
||||
| Parameter | Type | Default | Constraints / validation | Description |
|
||||
|---|---|---|---|---|
|
||||
| ipv4 | `bool` | `true` | — | Enables IPv4 networking. |
|
||||
| ipv6 | `bool` | `false` | — | Enables/disables IPv6 when set |
|
||||
| prefer | `u8` | `4` | Must be `4` or `6`. | Preferred IP family for selection (`4` or `6`). |
|
||||
| multipath | `bool` | `false` | — | Enables multipath behavior where supported. |
|
||||
| stun_use | `bool` | `true` | none | Global STUN switch; when `false`, STUN probing path is disabled. |
|
||||
| stun_servers | `String[]` | Built-in STUN list (13 hosts) | Deduplicated; empty values are removed. | Primary STUN server list for NAT/public endpoint discovery. |
|
||||
| stun_tcp_fallback | `bool` | `true` | none | Enables TCP fallback for STUN when UDP path is blocked. |
|
||||
| http_ip_detect_urls | `String[]` | `["https://ifconfig.me/ip", "https://api.ipify.org"]` | none | HTTP fallback endpoints for public IP detection when STUN is unavailable. |
|
||||
| cache_public_ip_path | `String` | `"cache/public_ip.txt"` | — | File path for caching detected public IP. |
|
||||
| dns_overrides | `String[]` | `[]` | Must match `host:port:ip`; IPv6 must be bracketed. | Runtime DNS overrides in `host:port:ip` format. |
|
||||
|
||||
## [server]
|
||||
|
||||
| Parameter | Type | Default | Constraints / validation | Description |
|
||||
|---|---|---|---|---|
|
||||
| port | `u16` | `443` | — | Main proxy listen port. |
|
||||
| listen_addr_ipv4 | `String \| null` | `"0.0.0.0"` | — | IPv4 bind address for TCP listener. |
|
||||
| listen_addr_ipv6 | `String \| null` | `"::"` | — | IPv6 bind address for TCP listener. |
|
||||
| listen_unix_sock | `String \| null` | `null` | — | Unix socket path for listener. |
|
||||
| listen_unix_sock_perm | `String \| null` | `null` | — | Unix socket permissions in octal string (e.g., `"0666"`). |
|
||||
| listen_tcp | `bool \| null` | `null` (auto) | — | Explicit TCP listener enable/disable override. |
|
||||
| proxy_protocol | `bool` | `false` | — | Enables HAProxy PROXY protocol parsing on incoming client connections. |
|
||||
| proxy_protocol_header_timeout_ms | `u64` | `500` | Must be `> 0`. | Timeout for PROXY protocol header read/parse (ms). |
|
||||
| metrics_port | `u16 \| null` | `null` | — | Metrics endpoint port (enables metrics listener). |
|
||||
| metrics_listen | `String \| null` | `null` | — | Full metrics bind address (`IP:PORT`), overrides `metrics_port`. |
|
||||
| metrics_whitelist | `IpNetwork[]` | `["127.0.0.1/32", "::1/128"]` | — | CIDR whitelist for metrics endpoint access. |
|
||||
| max_connections | `u32` | `10000` | — | Max concurrent client connections (`0` = unlimited). |
|
||||
|
||||
## [server.api]
|
||||
|
||||
| Parameter | Type | Default | Constraints / validation | Description |
|
||||
|---|---|---|---|---|
|
||||
| enabled | `bool` | `true` | — | Enables control-plane REST API. |
|
||||
| listen | `String` | `"0.0.0.0:9091"` | Must be valid `IP:PORT`. | API bind address in `IP:PORT` format. |
|
||||
| whitelist | `IpNetwork[]` | `["127.0.0.0/8"]` | — | CIDR whitelist allowed to access API. |
|
||||
| auth_header | `String` | `""` | — | Exact expected `Authorization` header value (empty = disabled). |
|
||||
| request_body_limit_bytes | `usize` | `65536` | Must be `> 0`. | Maximum accepted HTTP request body size. |
|
||||
| minimal_runtime_enabled | `bool` | `true` | — | Enables minimal runtime snapshots endpoint logic. |
|
||||
| minimal_runtime_cache_ttl_ms | `u64` | `1000` | `0..=60000`. | Cache TTL for minimal runtime snapshots (ms; `0` disables cache). |
|
||||
| runtime_edge_enabled | `bool` | `false` | — | Enables runtime edge endpoints. |
|
||||
| runtime_edge_cache_ttl_ms | `u64` | `1000` | `0..=60000`. | Cache TTL for runtime edge aggregation payloads (ms). |
|
||||
| runtime_edge_top_n | `usize` | `10` | `1..=1000`. | Top-N size for edge connection leaderboard. |
|
||||
| runtime_edge_events_capacity | `usize` | `256` | `16..=4096`. | Ring-buffer capacity for runtime edge events. |
|
||||
| read_only | `bool` | `false` | — | Rejects mutating API endpoints when enabled. |
|
||||
|
||||
## [[server.listeners]]
|
||||
|
||||
| Parameter | Type | Default | Constraints / validation | Description |
|
||||
|---|---|---|---|---|
|
||||
| ip | `IpAddr` | — | — | Listener bind IP. |
|
||||
| announce | `String \| null` | — | — | Public IP/domain announced in proxy links (priority over `announce_ip`). |
|
||||
| announce_ip | `IpAddr \| null` | — | — | Deprecated legacy announce IP (migrated to `announce` if needed). |
|
||||
| proxy_protocol | `bool \| null` | `null` | — | Per-listener override for PROXY protocol enable flag. |
|
||||
| reuse_allow | `bool` | `false` | — | Enables `SO_REUSEPORT` for multi-instance bind sharing. |
|
||||
|
||||
## [timeouts]
|
||||
|
||||
| Parameter | Type | Default | Constraints / validation | Description |
|
||||
|---|---|---|---|---|
|
||||
| client_handshake | `u64` | `30` | — | Client handshake timeout. |
|
||||
| tg_connect | `u64` | `10` | — | Upstream Telegram connect timeout. |
|
||||
| client_keepalive | `u64` | `15` | — | Client keepalive timeout. |
|
||||
| client_ack | `u64` | `90` | — | Client ACK timeout. |
|
||||
| me_one_retry | `u8` | `12` | none | Fast reconnect attempts budget for single-endpoint DC scenarios. |
|
||||
| me_one_timeout_ms | `u64` | `1200` | none | Timeout in milliseconds for each quick single-endpoint reconnect attempt. |
|
||||
|
||||
## [censorship]
|
||||
|
||||
| Parameter | Type | Default | Constraints / validation | Description |
|
||||
|---|---|---|---|---|
|
||||
| tls_domain | `String` | `"petrovich.ru"` | — | Primary TLS domain used in fake TLS handshake profile. |
|
||||
| tls_domains | `String[]` | `[]` | — | Additional TLS domains for generating multiple links. |
|
||||
| mask | `bool` | `true` | — | Enables masking/fronting relay mode. |
|
||||
| mask_host | `String \| null` | `null` | — | Upstream mask host for TLS fronting relay. |
|
||||
| mask_port | `u16` | `443` | — | Upstream mask port for TLS fronting relay. |
|
||||
| mask_unix_sock | `String \| null` | `null` | — | Unix socket path for mask backend instead of TCP host/port. |
|
||||
| fake_cert_len | `usize` | `2048` | — | Length of synthetic certificate payload when emulation data is unavailable. |
|
||||
| tls_emulation | `bool` | `true` | — | Enables certificate/TLS behavior emulation from cached real fronts. |
|
||||
| tls_front_dir | `String` | `"tlsfront"` | — | Directory path for TLS front cache storage. |
|
||||
| server_hello_delay_min_ms | `u64` | `0` | — | Minimum server_hello delay for anti-fingerprint behavior (ms). |
|
||||
| server_hello_delay_max_ms | `u64` | `0` | — | Maximum server_hello delay for anti-fingerprint behavior (ms). |
|
||||
| tls_new_session_tickets | `u8` | `0` | — | Number of `NewSessionTicket` messages to emit after handshake. |
|
||||
| tls_full_cert_ttl_secs | `u64` | `90` | — | TTL for sending full cert payload per (domain, client IP) tuple. |
|
||||
| alpn_enforce | `bool` | `true` | — | Enforces ALPN echo behavior based on client preference. |
|
||||
| mask_proxy_protocol | `u8` | `0` | — | PROXY protocol mode for mask backend (`0` disabled, `1` v1, `2` v2). |
|
||||
|
||||
## [access]
|
||||
|
||||
| Parameter | Type | Default | Constraints / validation | TOML shape example | Description |
|
||||
|---|---|---|---|---|---|
|
||||
| users | `Map<String, String>` | `{"default": "000…000"}` | Secret must be 32 hex characters. | `[access.users]`<br>`user = "32-hex secret"`<br>`user2 = "32-hex secret"` | User credentials map used for client authentication. |
|
||||
| user_ad_tags | `Map<String, String>` | `{}` | Every value must be exactly 32 hex characters. | `[access.user_ad_tags]`<br>`user = "32-hex ad_tag"` | Per-user ad tags used as override over `general.ad_tag`. |
|
||||
| user_max_tcp_conns | `Map<String, usize>` | `{}` | — | `[access.user_max_tcp_conns]`<br>`user = 500` | Per-user maximum concurrent TCP connections. |
|
||||
| user_expirations | `Map<String, DateTime<Utc>>` | `{}` | Timestamp must be valid RFC3339/ISO-8601 datetime. | `[access.user_expirations]`<br>`user = "2026-12-31T23:59:59Z"` | Per-user account expiration timestamps. |
|
||||
| user_data_quota | `Map<String, u64>` | `{}` | — | `[access.user_data_quota]`<br>`user = 1073741824` | Per-user traffic quota in bytes. |
|
||||
| user_max_unique_ips | `Map<String, usize>` | `{}` | — | `[access.user_max_unique_ips]`<br>`user = 16` | Per-user unique source IP limits. |
|
||||
| user_max_unique_ips_global_each | `usize` | `0` | — | `user_max_unique_ips_global_each = 0` | Global fallback used when `[access.user_max_unique_ips]` has no per-user override. |
|
||||
| user_max_unique_ips_mode | `"active_window" \| "time_window" \| "combined"` | `"active_window"` | — | `user_max_unique_ips_mode = "active_window"` | Unique source IP limit accounting mode. |
|
||||
| user_max_unique_ips_window_secs | `u64` | `30` | Must be `> 0`. | `user_max_unique_ips_window_secs = 30` | Window size (seconds) used by unique-IP accounting modes that use time windows. |
|
||||
| replay_check_len | `usize` | `65536` | — | `replay_check_len = 65536` | Replay-protection storage length. |
|
||||
| replay_window_secs | `u64` | `1800` | — | `replay_window_secs = 1800` | Replay-protection window in seconds. |
|
||||
| ignore_time_skew | `bool` | `false` | — | `ignore_time_skew = false` | Disables client/server timestamp skew checks in replay validation when enabled. |
|
||||
|
||||
## [[upstreams]]
|
||||
|
||||
| Parameter | Type | Default | Constraints / validation | Description |
|
||||
|---|---|---|---|---|
|
||||
| type | `"direct" \| "socks4" \| "socks5"` | — | Required field. | Upstream transport type selector. |
|
||||
| weight | `u16` | `1` | none | Base weight used by weighted-random upstream selection. |
|
||||
| enabled | `bool` | `true` | none | Disabled entries are excluded from upstream selection at runtime. |
|
||||
| scopes | `String` | `""` | none | Comma-separated scope tags used for request-level upstream filtering. |
|
||||
| interface | `String \| null` | `null` | Optional; type-specific runtime rules apply. | Optional outbound interface/local bind hint (supported with type-specific rules). |
|
||||
| bind_addresses | `String[] \| null` | `null` | Applies to `type = "direct"`. | Optional explicit local source bind addresses for `type = "direct"`. |
|
||||
| address | `String` | — | Required for `type = "socks4"` and `type = "socks5"`. | SOCKS server endpoint (`host:port` or `ip:port`) for SOCKS upstream types. |
|
||||
| user_id | `String \| null` | `null` | Only for `type = "socks4"`. | SOCKS4 CONNECT user ID (`type = "socks4"` only). |
|
||||
| username | `String \| null` | `null` | Only for `type = "socks5"`. | SOCKS5 username (`type = "socks5"` only). |
|
||||
| password | `String \| null` | `null` | Only for `type = "socks5"`. | SOCKS5 password (`type = "socks5"` only). |
|
||||
568
install.sh
568
install.sh
|
|
@ -1,115 +1,525 @@
|
|||
#!/bin/sh
|
||||
set -eu
|
||||
|
||||
# --- Global Configurations ---
|
||||
REPO="${REPO:-telemt/telemt}"
|
||||
BIN_NAME="${BIN_NAME:-telemt}"
|
||||
VERSION="${1:-${VERSION:-latest}}"
|
||||
INSTALL_DIR="${INSTALL_DIR:-/usr/local/bin}"
|
||||
INSTALL_DIR="${INSTALL_DIR:-/bin}"
|
||||
CONFIG_DIR="${CONFIG_DIR:-/etc/telemt}"
|
||||
CONFIG_FILE="${CONFIG_FILE:-${CONFIG_DIR}/telemt.toml}"
|
||||
WORK_DIR="${WORK_DIR:-/opt/telemt}"
|
||||
SERVICE_NAME="telemt"
|
||||
TEMP_DIR=""
|
||||
SUDO=""
|
||||
|
||||
say() {
|
||||
printf '%s\n' "$*"
|
||||
}
|
||||
# --- Argument Parsing ---
|
||||
ACTION="install"
|
||||
TARGET_VERSION="${VERSION:-latest}"
|
||||
|
||||
die() {
|
||||
printf 'Error: %s\n' "$*" >&2
|
||||
exit 1
|
||||
}
|
||||
|
||||
need_cmd() {
|
||||
command -v "$1" >/dev/null 2>&1 || die "required command not found: $1"
|
||||
}
|
||||
|
||||
detect_os() {
|
||||
os="$(uname -s)"
|
||||
case "$os" in
|
||||
Linux) printf 'linux\n' ;;
|
||||
OpenBSD) printf 'openbsd\n' ;;
|
||||
*) printf '%s\n' "$os" ;;
|
||||
while [ $# -gt 0 ]; do
|
||||
case "$1" in
|
||||
-h|--help)
|
||||
ACTION="help"
|
||||
shift
|
||||
;;
|
||||
uninstall|--uninstall)
|
||||
[ "$ACTION" != "purge" ] && ACTION="uninstall"
|
||||
shift
|
||||
;;
|
||||
--purge)
|
||||
ACTION="purge"
|
||||
shift
|
||||
;;
|
||||
install|--install)
|
||||
ACTION="install"
|
||||
shift
|
||||
;;
|
||||
-*)
|
||||
printf '[ERROR] Unknown option: %s\n' "$1" >&2
|
||||
exit 1
|
||||
;;
|
||||
*)
|
||||
if [ "$ACTION" = "install" ]; then
|
||||
TARGET_VERSION="$1"
|
||||
fi
|
||||
shift
|
||||
;;
|
||||
esac
|
||||
done
|
||||
|
||||
# --- Core Functions ---
|
||||
say() { printf '[INFO] %s\n' "$*"; }
|
||||
die() { printf '[ERROR] %s\n' "$*" >&2; exit 1; }
|
||||
|
||||
cleanup() {
|
||||
if [ -n "${TEMP_DIR:-}" ] && [ -d "$TEMP_DIR" ]; then
|
||||
rm -rf -- "$TEMP_DIR"
|
||||
fi
|
||||
}
|
||||
|
||||
trap cleanup EXIT INT TERM
|
||||
|
||||
show_help() {
|
||||
say "Usage: $0 [version | install | uninstall | --purge | --help]"
|
||||
say " version Install specific version (e.g. 1.0.0, default: latest)"
|
||||
say " uninstall Remove the binary and service (keeps config)"
|
||||
say " --purge Remove everything including configuration"
|
||||
exit 0
|
||||
}
|
||||
|
||||
user_exists() {
|
||||
if command -v getent >/dev/null 2>&1; then
|
||||
getent passwd "$1" >/dev/null 2>&1
|
||||
else
|
||||
grep -q "^${1}:" /etc/passwd 2>/dev/null
|
||||
fi
|
||||
}
|
||||
|
||||
group_exists() {
|
||||
if command -v getent >/dev/null 2>&1; then
|
||||
getent group "$1" >/dev/null 2>&1
|
||||
else
|
||||
grep -q "^${1}:" /etc/group 2>/dev/null
|
||||
fi
|
||||
}
|
||||
|
||||
verify_common() {
|
||||
[ -z "$BIN_NAME" ] && die "BIN_NAME cannot be empty."
|
||||
[ -z "$INSTALL_DIR" ] && die "INSTALL_DIR cannot be empty."
|
||||
[ -z "$CONFIG_DIR" ] && die "CONFIG_DIR cannot be empty."
|
||||
|
||||
if [ "$(id -u)" -eq 0 ]; then
|
||||
SUDO=""
|
||||
else
|
||||
if ! command -v sudo >/dev/null 2>&1; then
|
||||
die "This script requires root or sudo. Neither found."
|
||||
fi
|
||||
SUDO="sudo"
|
||||
say "sudo is available. Caching credentials..."
|
||||
if ! sudo -v; then
|
||||
die "Failed to cache sudo credentials"
|
||||
fi
|
||||
fi
|
||||
|
||||
case "${INSTALL_DIR}${CONFIG_DIR}${WORK_DIR}" in
|
||||
*[!a-zA-Z0-9_./-]*)
|
||||
die "Invalid characters in path variables. Only alphanumeric, _, ., -, and / are allowed."
|
||||
;;
|
||||
esac
|
||||
|
||||
case "$BIN_NAME" in
|
||||
*[!a-zA-Z0-9_-]*) die "Invalid characters in BIN_NAME: $BIN_NAME" ;;
|
||||
esac
|
||||
|
||||
for path in "$CONFIG_DIR" "$WORK_DIR"; do
|
||||
check_path="$path"
|
||||
|
||||
while [ "$check_path" != "/" ] && [ "${check_path%"/"}" != "$check_path" ]; do
|
||||
check_path="${check_path%"/"}"
|
||||
done
|
||||
[ -z "$check_path" ] && check_path="/"
|
||||
|
||||
case "$check_path" in
|
||||
/|/bin|/sbin|/usr|/usr/bin|/usr/local|/etc|/opt|/var|/home|/root|/tmp)
|
||||
die "Safety check failed: '$path' is a critical system directory."
|
||||
;;
|
||||
esac
|
||||
done
|
||||
|
||||
for cmd in uname grep find rm chown chmod mv head mktemp; do
|
||||
command -v "$cmd" >/dev/null 2>&1 || die "Required command not found: $cmd"
|
||||
done
|
||||
}
|
||||
|
||||
verify_install_deps() {
|
||||
if ! command -v curl >/dev/null 2>&1 && ! command -v wget >/dev/null 2>&1; then
|
||||
die "Neither curl nor wget is installed."
|
||||
fi
|
||||
command -v tar >/dev/null 2>&1 || die "Required command not found: tar"
|
||||
command -v gzip >/dev/null 2>&1 || die "Required command not found: gzip"
|
||||
command -v cp >/dev/null 2>&1 || command -v install >/dev/null 2>&1 || die "Need cp or install"
|
||||
|
||||
if ! command -v setcap >/dev/null 2>&1; then
|
||||
say "setcap is missing. Installing required capability tools..."
|
||||
if command -v apk >/dev/null 2>&1; then
|
||||
$SUDO apk add --no-cache libcap || die "Failed to install libcap"
|
||||
elif command -v apt-get >/dev/null 2>&1; then
|
||||
$SUDO apt-get update -qq && $SUDO apt-get install -y -qq libcap2-bin || die "Failed to install libcap2-bin"
|
||||
elif command -v dnf >/dev/null 2>&1 || command -v yum >/dev/null 2>&1; then
|
||||
$SUDO ${YUM_CMD:-yum} install -y -q libcap || die "Failed to install libcap"
|
||||
else
|
||||
die "Cannot install 'setcap'. Package manager not found. Please install libcap manually."
|
||||
fi
|
||||
fi
|
||||
}
|
||||
|
||||
detect_arch() {
|
||||
arch="$(uname -m)"
|
||||
case "$arch" in
|
||||
x86_64|amd64) printf 'x86_64\n' ;;
|
||||
aarch64|arm64) printf 'aarch64\n' ;;
|
||||
*) die "unsupported architecture: $arch" ;;
|
||||
sys_arch="$(uname -m)"
|
||||
case "$sys_arch" in
|
||||
x86_64|amd64) echo "x86_64" ;;
|
||||
aarch64|arm64) echo "aarch64" ;;
|
||||
*) die "Unsupported architecture: $sys_arch" ;;
|
||||
esac
|
||||
}
|
||||
|
||||
detect_libc() {
|
||||
case "$(ldd --version 2>&1 || true)" in
|
||||
*musl*) printf 'musl\n' ;;
|
||||
*) printf 'gnu\n' ;;
|
||||
esac
|
||||
if command -v ldd >/dev/null 2>&1 && ldd --version 2>&1 | grep -qi musl; then
|
||||
echo "musl"; return 0
|
||||
fi
|
||||
|
||||
if grep -q '^ID=alpine' /etc/os-release 2>/dev/null || grep -q '^ID="alpine"' /etc/os-release 2>/dev/null; then
|
||||
echo "musl"; return 0
|
||||
fi
|
||||
for f in /lib/ld-musl-*.so.* /lib64/ld-musl-*.so.*; do
|
||||
if [ -e "$f" ]; then
|
||||
echo "musl"; return 0
|
||||
fi
|
||||
done
|
||||
echo "gnu"
|
||||
}
|
||||
|
||||
fetch_to_stdout() {
|
||||
url="$1"
|
||||
fetch_file() {
|
||||
fetch_url="$1"
|
||||
fetch_out="$2"
|
||||
|
||||
if command -v curl >/dev/null 2>&1; then
|
||||
curl -fsSL "$url"
|
||||
curl -fsSL "$fetch_url" -o "$fetch_out" || return 1
|
||||
elif command -v wget >/dev/null 2>&1; then
|
||||
wget -qO- "$url"
|
||||
wget -qO "$fetch_out" "$fetch_url" || return 1
|
||||
else
|
||||
die "neither curl nor wget is installed"
|
||||
die "curl or wget required"
|
||||
fi
|
||||
}
|
||||
|
||||
ensure_user_group() {
|
||||
nologin_bin="/bin/false"
|
||||
|
||||
cmd_nologin="$(command -v nologin 2>/dev/null || true)"
|
||||
if [ -n "$cmd_nologin" ] && [ -x "$cmd_nologin" ]; then
|
||||
nologin_bin="$cmd_nologin"
|
||||
else
|
||||
for bin in /sbin/nologin /usr/sbin/nologin; do
|
||||
if [ -x "$bin" ]; then
|
||||
nologin_bin="$bin"
|
||||
break
|
||||
fi
|
||||
done
|
||||
fi
|
||||
|
||||
if ! group_exists telemt; then
|
||||
if command -v groupadd >/dev/null 2>&1; then
|
||||
$SUDO groupadd -r telemt || die "Failed to create group via groupadd"
|
||||
elif command -v addgroup >/dev/null 2>&1; then
|
||||
$SUDO addgroup -S telemt || die "Failed to create group via addgroup"
|
||||
else
|
||||
die "Cannot create group: neither groupadd nor addgroup found"
|
||||
fi
|
||||
fi
|
||||
|
||||
if ! user_exists telemt; then
|
||||
if command -v useradd >/dev/null 2>&1; then
|
||||
$SUDO useradd -r -g telemt -d "$WORK_DIR" -s "$nologin_bin" -c "Telemt Proxy" telemt || die "Failed to create user via useradd"
|
||||
elif command -v adduser >/dev/null 2>&1; then
|
||||
$SUDO adduser -S -D -H -h "$WORK_DIR" -s "$nologin_bin" -G telemt telemt || die "Failed to create user via adduser"
|
||||
else
|
||||
die "Cannot create user: neither useradd nor adduser found"
|
||||
fi
|
||||
fi
|
||||
}
|
||||
|
||||
setup_dirs() {
|
||||
say "Setting up directories..."
|
||||
$SUDO mkdir -p "$WORK_DIR" "$CONFIG_DIR" || die "Failed to create directories"
|
||||
$SUDO chown telemt:telemt "$WORK_DIR" || die "Failed to set owner on WORK_DIR"
|
||||
$SUDO chmod 750 "$WORK_DIR" || die "Failed to set permissions on WORK_DIR"
|
||||
}
|
||||
|
||||
stop_service() {
|
||||
say "Stopping service if running..."
|
||||
if command -v systemctl >/dev/null 2>&1 && [ -d /run/systemd/system ]; then
|
||||
$SUDO systemctl stop "$SERVICE_NAME" 2>/dev/null || true
|
||||
elif command -v rc-service >/dev/null 2>&1; then
|
||||
$SUDO rc-service "$SERVICE_NAME" stop 2>/dev/null || true
|
||||
fi
|
||||
}
|
||||
|
||||
install_binary() {
|
||||
src="$1"
|
||||
dst="$2"
|
||||
bin_src="$1"
|
||||
bin_dst="$2"
|
||||
|
||||
if [ -w "$INSTALL_DIR" ] || { [ ! -e "$INSTALL_DIR" ] && [ -w "$(dirname "$INSTALL_DIR")" ]; }; then
|
||||
mkdir -p "$INSTALL_DIR"
|
||||
install -m 0755 "$src" "$dst"
|
||||
elif command -v sudo >/dev/null 2>&1; then
|
||||
sudo mkdir -p "$INSTALL_DIR"
|
||||
sudo install -m 0755 "$src" "$dst"
|
||||
$SUDO mkdir -p "$INSTALL_DIR" || die "Failed to create install directory"
|
||||
if command -v install >/dev/null 2>&1; then
|
||||
$SUDO install -m 0755 "$bin_src" "$bin_dst" || die "Failed to install binary"
|
||||
else
|
||||
die "cannot write to $INSTALL_DIR and sudo is not available"
|
||||
$SUDO rm -f "$bin_dst"
|
||||
$SUDO cp "$bin_src" "$bin_dst" || die "Failed to copy binary"
|
||||
$SUDO chmod 0755 "$bin_dst" || die "Failed to set permissions"
|
||||
fi
|
||||
|
||||
if [ ! -x "$bin_dst" ]; then
|
||||
die "Failed to install binary or it is not executable: $bin_dst"
|
||||
fi
|
||||
|
||||
say "Granting network bind capabilities to bind port 443..."
|
||||
if ! $SUDO setcap cap_net_bind_service=+ep "$bin_dst" 2>/dev/null; then
|
||||
say "[WARNING] Failed to apply setcap. The service will NOT be able to open port 443!"
|
||||
say "[WARNING] This usually happens inside unprivileged Docker/LXC containers."
|
||||
fi
|
||||
}
|
||||
|
||||
need_cmd uname
|
||||
need_cmd tar
|
||||
need_cmd mktemp
|
||||
need_cmd grep
|
||||
need_cmd install
|
||||
generate_secret() {
|
||||
if command -v openssl >/dev/null 2>&1; then
|
||||
secret="$(openssl rand -hex 16 2>/dev/null)" && [ -n "$secret" ] && { echo "$secret"; return 0; }
|
||||
fi
|
||||
if command -v xxd >/dev/null 2>&1; then
|
||||
secret="$(dd if=/dev/urandom bs=1 count=16 2>/dev/null | xxd -p | tr -d '\n')" && [ -n "$secret" ] && { echo "$secret"; return 0; }
|
||||
fi
|
||||
secret="$(dd if=/dev/urandom bs=1 count=16 2>/dev/null | od -An -tx1 | tr -d ' \n')" && [ -n "$secret" ] && { echo "$secret"; return 0; }
|
||||
return 1
|
||||
}
|
||||
|
||||
ARCH="$(detect_arch)"
|
||||
OS="$(detect_os)"
|
||||
generate_config_content() {
|
||||
cat <<EOF
|
||||
[general]
|
||||
use_middle_proxy = false
|
||||
|
||||
if [ "$OS" != "linux" ]; then
|
||||
case "$OS" in
|
||||
openbsd)
|
||||
die "install.sh installs only Linux release artifacts. On OpenBSD, build from source (see docs/OPENBSD.en.md)."
|
||||
;;
|
||||
*)
|
||||
die "unsupported operating system for install.sh: $OS"
|
||||
;;
|
||||
esac
|
||||
fi
|
||||
[general.modes]
|
||||
classic = false
|
||||
secure = false
|
||||
tls = true
|
||||
|
||||
LIBC="$(detect_libc)"
|
||||
[server]
|
||||
port = 443
|
||||
|
||||
case "$VERSION" in
|
||||
latest)
|
||||
URL="https://github.com/$REPO/releases/latest/download/${BIN_NAME}-${ARCH}-linux-${LIBC}.tar.gz"
|
||||
[server.api]
|
||||
enabled = true
|
||||
listen = "127.0.0.1:9091"
|
||||
whitelist = ["127.0.0.1/32"]
|
||||
|
||||
[censorship]
|
||||
tls_domain = "petrovich.ru"
|
||||
|
||||
[access.users]
|
||||
hello = "$1"
|
||||
EOF
|
||||
}
|
||||
|
||||
install_config() {
|
||||
config_exists=0
|
||||
|
||||
if [ -n "$SUDO" ]; then
|
||||
$SUDO sh -c "[ -f '$CONFIG_FILE' ]" 2>/dev/null && config_exists=1 || true
|
||||
else
|
||||
[ -f "$CONFIG_FILE" ] && config_exists=1 || true
|
||||
fi
|
||||
|
||||
if [ "$config_exists" -eq 1 ]; then
|
||||
say "Config already exists, skipping generation."
|
||||
return 0
|
||||
fi
|
||||
|
||||
toml_secret="$(generate_secret)" || die "Failed to generate secret"
|
||||
say "Creating config at $CONFIG_FILE..."
|
||||
|
||||
tmp_conf="$(mktemp "${TEMP_DIR:-/tmp}/telemt_conf.XXXXXX")" || die "Failed to create temp config"
|
||||
generate_config_content "$toml_secret" > "$tmp_conf" || die "Failed to write temp config"
|
||||
|
||||
$SUDO mv "$tmp_conf" "$CONFIG_FILE" || die "Failed to install config file"
|
||||
$SUDO chown root:telemt "$CONFIG_FILE" || die "Failed to set owner"
|
||||
$SUDO chmod 640 "$CONFIG_FILE" || die "Failed to set config permissions"
|
||||
|
||||
say "Secret for user 'hello': $toml_secret"
|
||||
}
|
||||
|
||||
generate_systemd_content() {
|
||||
cat <<EOF
|
||||
[Unit]
|
||||
Description=Telemt Proxy Service
|
||||
After=network-online.target
|
||||
|
||||
[Service]
|
||||
Type=simple
|
||||
User=telemt
|
||||
Group=telemt
|
||||
WorkingDirectory=$WORK_DIR
|
||||
ExecStart=${INSTALL_DIR}/${BIN_NAME} ${CONFIG_FILE}
|
||||
Restart=on-failure
|
||||
LimitNOFILE=65536
|
||||
AmbientCapabilities=CAP_NET_BIND_SERVICE
|
||||
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
|
||||
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
EOF
|
||||
}
|
||||
|
||||
generate_openrc_content() {
|
||||
cat <<EOF
|
||||
#!/sbin/openrc-run
|
||||
name="$SERVICE_NAME"
|
||||
description="Telemt Proxy Service"
|
||||
command="${INSTALL_DIR}/${BIN_NAME}"
|
||||
command_args="${CONFIG_FILE}"
|
||||
command_background=true
|
||||
command_user="telemt:telemt"
|
||||
pidfile="/run/\${RC_SVCNAME}.pid"
|
||||
directory="${WORK_DIR}"
|
||||
rc_ulimit="-n 65536"
|
||||
depend() { need net; use logger; }
|
||||
EOF
|
||||
}
|
||||
|
||||
install_service() {
|
||||
if command -v systemctl >/dev/null 2>&1 && [ -d /run/systemd/system ]; then
|
||||
say "Installing systemd service..."
|
||||
tmp_svc="$(mktemp "${TEMP_DIR:-/tmp}/${SERVICE_NAME}.service.XXXXXX")" || die "Failed to create temp service"
|
||||
generate_systemd_content > "$tmp_svc" || die "Failed to generate service content"
|
||||
|
||||
$SUDO mv "$tmp_svc" "/etc/systemd/system/${SERVICE_NAME}.service" || die "Failed to move service file"
|
||||
$SUDO chown root:root "/etc/systemd/system/${SERVICE_NAME}.service"
|
||||
$SUDO chmod 644 "/etc/systemd/system/${SERVICE_NAME}.service"
|
||||
|
||||
$SUDO systemctl daemon-reload || die "Failed to reload systemd"
|
||||
$SUDO systemctl enable "$SERVICE_NAME" || die "Failed to enable service"
|
||||
$SUDO systemctl start "$SERVICE_NAME" || die "Failed to start service"
|
||||
|
||||
elif command -v rc-update >/dev/null 2>&1; then
|
||||
say "Installing OpenRC service..."
|
||||
tmp_svc="$(mktemp "${TEMP_DIR:-/tmp}/${SERVICE_NAME}.init.XXXXXX")" || die "Failed to create temp file"
|
||||
generate_openrc_content > "$tmp_svc" || die "Failed to generate init content"
|
||||
|
||||
$SUDO mv "$tmp_svc" "/etc/init.d/${SERVICE_NAME}" || die "Failed to move service file"
|
||||
$SUDO chown root:root "/etc/init.d/${SERVICE_NAME}"
|
||||
$SUDO chmod 0755 "/etc/init.d/${SERVICE_NAME}"
|
||||
|
||||
$SUDO rc-update add "$SERVICE_NAME" default 2>/dev/null || die "Failed to register service"
|
||||
$SUDO rc-service "$SERVICE_NAME" start 2>/dev/null || die "Failed to start OpenRC service"
|
||||
else
|
||||
say "No service manager found. You can start it manually with:"
|
||||
if [ -n "$SUDO" ]; then
|
||||
say " sudo -u telemt ${INSTALL_DIR}/${BIN_NAME} ${CONFIG_FILE}"
|
||||
else
|
||||
say " su -s /bin/sh telemt -c '${INSTALL_DIR}/${BIN_NAME} ${CONFIG_FILE}'"
|
||||
fi
|
||||
fi
|
||||
}
|
||||
|
||||
kill_user_procs() {
|
||||
say "Ensuring $BIN_NAME processes are killed..."
|
||||
|
||||
if pkill_cmd="$(command -v pkill 2>/dev/null)"; then
|
||||
$SUDO "$pkill_cmd" -u telemt "$BIN_NAME" 2>/dev/null || true
|
||||
sleep 1
|
||||
$SUDO "$pkill_cmd" -9 -u telemt "$BIN_NAME" 2>/dev/null || true
|
||||
elif killall_cmd="$(command -v killall 2>/dev/null)"; then
|
||||
$SUDO "$killall_cmd" "$BIN_NAME" 2>/dev/null || true
|
||||
sleep 1
|
||||
$SUDO "$killall_cmd" -9 "$BIN_NAME" 2>/dev/null || true
|
||||
fi
|
||||
}
|
||||
|
||||
uninstall() {
|
||||
purge_data=0
|
||||
[ "$ACTION" = "purge" ] && purge_data=1
|
||||
|
||||
say "Uninstalling $BIN_NAME..."
|
||||
stop_service
|
||||
|
||||
if command -v systemctl >/dev/null 2>&1 && [ -d /run/systemd/system ]; then
|
||||
$SUDO systemctl disable "$SERVICE_NAME" 2>/dev/null || true
|
||||
$SUDO rm -f "/etc/systemd/system/${SERVICE_NAME}.service"
|
||||
$SUDO systemctl daemon-reload || true
|
||||
elif command -v rc-update >/dev/null 2>&1; then
|
||||
$SUDO rc-update del "$SERVICE_NAME" 2>/dev/null || true
|
||||
$SUDO rm -f "/etc/init.d/${SERVICE_NAME}"
|
||||
fi
|
||||
|
||||
kill_user_procs
|
||||
|
||||
$SUDO rm -f "${INSTALL_DIR}/${BIN_NAME}"
|
||||
|
||||
$SUDO userdel telemt 2>/dev/null || $SUDO deluser telemt 2>/dev/null || true
|
||||
$SUDO groupdel telemt 2>/dev/null || $SUDO delgroup telemt 2>/dev/null || true
|
||||
|
||||
if [ "$purge_data" -eq 1 ]; then
|
||||
say "Purging configuration and data..."
|
||||
$SUDO rm -rf "$CONFIG_DIR" "$WORK_DIR"
|
||||
else
|
||||
say "Note: Configuration in $CONFIG_DIR was kept. Run with '--purge' to remove it."
|
||||
fi
|
||||
|
||||
say "Uninstallation complete."
|
||||
exit 0
|
||||
}
|
||||
|
||||
# ============================================================================
|
||||
# Main Entry Point
|
||||
# ============================================================================
|
||||
|
||||
case "$ACTION" in
|
||||
help)
|
||||
show_help
|
||||
;;
|
||||
*)
|
||||
URL="https://github.com/$REPO/releases/download/${VERSION}/${BIN_NAME}-${ARCH}-linux-${LIBC}.tar.gz"
|
||||
uninstall|purge)
|
||||
verify_common
|
||||
uninstall
|
||||
;;
|
||||
install)
|
||||
say "Starting installation..."
|
||||
verify_common
|
||||
verify_install_deps
|
||||
|
||||
ARCH="$(detect_arch)"
|
||||
LIBC="$(detect_libc)"
|
||||
say "Detected system: $ARCH-linux-$LIBC"
|
||||
|
||||
FILE_NAME="${BIN_NAME}-${ARCH}-linux-${LIBC}.tar.gz"
|
||||
FILE_NAME="$(printf '%s' "$FILE_NAME" | tr -d ' \t\n\r')"
|
||||
|
||||
if [ "$TARGET_VERSION" = "latest" ]; then
|
||||
DL_URL="https://github.com/${REPO}/releases/latest/download/${FILE_NAME}"
|
||||
else
|
||||
DL_URL="https://github.com/${REPO}/releases/download/${TARGET_VERSION}/${FILE_NAME}"
|
||||
fi
|
||||
|
||||
TEMP_DIR="$(mktemp -d)" || die "Failed to create temp directory"
|
||||
if [ -z "$TEMP_DIR" ] || [ ! -d "$TEMP_DIR" ]; then
|
||||
die "Temp directory creation failed"
|
||||
fi
|
||||
|
||||
say "Downloading from $DL_URL..."
|
||||
fetch_file "$DL_URL" "${TEMP_DIR}/archive.tar.gz" || die "Download failed (check version or network)"
|
||||
|
||||
gzip -dc "${TEMP_DIR}/archive.tar.gz" | tar -xf - -C "$TEMP_DIR" || die "Extraction failed"
|
||||
|
||||
EXTRACTED_BIN="$(find "$TEMP_DIR" -type f -name "$BIN_NAME" -print 2>/dev/null | head -n 1)"
|
||||
[ -z "$EXTRACTED_BIN" ] && die "Binary '$BIN_NAME' not found in archive"
|
||||
|
||||
ensure_user_group
|
||||
setup_dirs
|
||||
stop_service
|
||||
|
||||
say "Installing binary..."
|
||||
install_binary "$EXTRACTED_BIN" "${INSTALL_DIR}/${BIN_NAME}"
|
||||
|
||||
install_config
|
||||
install_service
|
||||
|
||||
say ""
|
||||
say "============================================="
|
||||
say "Installation complete!"
|
||||
say "============================================="
|
||||
if command -v systemctl >/dev/null 2>&1 && [ -d /run/systemd/system ]; then
|
||||
say "To check the logs, run:"
|
||||
say " journalctl -u $SERVICE_NAME -f"
|
||||
say ""
|
||||
fi
|
||||
say "To get user connection links, run:"
|
||||
if command -v jq >/dev/null 2>&1; then
|
||||
say " curl -s http://127.0.0.1:9091/v1/users | jq -r '.data[] | \"User: \\(.username)\\n\\(.links.tls[0] // empty)\"'"
|
||||
else
|
||||
say " curl -s http://127.0.0.1:9091/v1/users"
|
||||
say " (Note: Install 'jq' package to see the links nicely formatted)"
|
||||
fi
|
||||
;;
|
||||
esac
|
||||
|
||||
TMPDIR="$(mktemp -d)"
|
||||
trap 'rm -rf "$TMPDIR"' EXIT INT TERM
|
||||
|
||||
say "Installing $BIN_NAME ($VERSION) for $ARCH-linux-$LIBC..."
|
||||
fetch_to_stdout "$URL" | tar -xzf - -C "$TMPDIR"
|
||||
|
||||
[ -f "$TMPDIR/$BIN_NAME" ] || die "archive did not contain $BIN_NAME"
|
||||
|
||||
install_binary "$TMPDIR/$BIN_NAME" "$INSTALL_DIR/$BIN_NAME"
|
||||
|
||||
say "Installed: $INSTALL_DIR/$BIN_NAME"
|
||||
"$INSTALL_DIR/$BIN_NAME" --version 2>/dev/null || true
|
||||
|
|
|
|||
|
|
@ -195,6 +195,8 @@ pub(super) struct ZeroPoolData {
|
|||
pub(super) pool_swap_total: u64,
|
||||
pub(super) pool_drain_active: u64,
|
||||
pub(super) pool_force_close_total: u64,
|
||||
pub(super) pool_drain_soft_evict_total: u64,
|
||||
pub(super) pool_drain_soft_evict_writer_total: u64,
|
||||
pub(super) pool_stale_pick_total: u64,
|
||||
pub(super) writer_removed_total: u64,
|
||||
pub(super) writer_removed_unexpected_total: u64,
|
||||
|
|
@ -235,6 +237,7 @@ pub(super) struct MeWritersSummary {
|
|||
pub(super) available_pct: f64,
|
||||
pub(super) required_writers: usize,
|
||||
pub(super) alive_writers: usize,
|
||||
pub(super) coverage_ratio: f64,
|
||||
pub(super) coverage_pct: f64,
|
||||
pub(super) fresh_alive_writers: usize,
|
||||
pub(super) fresh_coverage_pct: f64,
|
||||
|
|
@ -283,6 +286,7 @@ pub(super) struct DcStatus {
|
|||
pub(super) floor_max: usize,
|
||||
pub(super) floor_capped: bool,
|
||||
pub(super) alive_writers: usize,
|
||||
pub(super) coverage_ratio: f64,
|
||||
pub(super) coverage_pct: f64,
|
||||
pub(super) fresh_alive_writers: usize,
|
||||
pub(super) fresh_coverage_pct: f64,
|
||||
|
|
@ -360,6 +364,11 @@ pub(super) struct MinimalMeRuntimeData {
|
|||
pub(super) me_reconnect_backoff_cap_ms: u64,
|
||||
pub(super) me_reconnect_fast_retry_count: u32,
|
||||
pub(super) me_pool_drain_ttl_secs: u64,
|
||||
pub(super) me_pool_drain_soft_evict_enabled: bool,
|
||||
pub(super) me_pool_drain_soft_evict_grace_secs: u64,
|
||||
pub(super) me_pool_drain_soft_evict_per_writer: u8,
|
||||
pub(super) me_pool_drain_soft_evict_budget_per_core: u16,
|
||||
pub(super) me_pool_drain_soft_evict_cooldown_ms: u64,
|
||||
pub(super) me_pool_force_close_secs: u64,
|
||||
pub(super) me_pool_min_fresh_ratio: f32,
|
||||
pub(super) me_bind_stale_mode: &'static str,
|
||||
|
|
|
|||
|
|
@ -113,6 +113,7 @@ pub(super) struct RuntimeMeQualityDcRttData {
|
|||
pub(super) rtt_ema_ms: Option<f64>,
|
||||
pub(super) alive_writers: usize,
|
||||
pub(super) required_writers: usize,
|
||||
pub(super) coverage_ratio: f64,
|
||||
pub(super) coverage_pct: f64,
|
||||
}
|
||||
|
||||
|
|
@ -388,6 +389,7 @@ pub(super) async fn build_runtime_me_quality_data(shared: &ApiShared) -> Runtime
|
|||
rtt_ema_ms: dc.rtt_ms,
|
||||
alive_writers: dc.alive_writers,
|
||||
required_writers: dc.required_writers,
|
||||
coverage_ratio: dc.coverage_ratio,
|
||||
coverage_pct: dc.coverage_pct,
|
||||
})
|
||||
.collect(),
|
||||
|
|
|
|||
|
|
@ -96,6 +96,8 @@ pub(super) fn build_zero_all_data(stats: &Stats, configured_users: usize) -> Zer
|
|||
pool_swap_total: stats.get_pool_swap_total(),
|
||||
pool_drain_active: stats.get_pool_drain_active(),
|
||||
pool_force_close_total: stats.get_pool_force_close_total(),
|
||||
pool_drain_soft_evict_total: stats.get_pool_drain_soft_evict_total(),
|
||||
pool_drain_soft_evict_writer_total: stats.get_pool_drain_soft_evict_writer_total(),
|
||||
pool_stale_pick_total: stats.get_pool_stale_pick_total(),
|
||||
writer_removed_total: stats.get_me_writer_removed_total(),
|
||||
writer_removed_unexpected_total: stats.get_me_writer_removed_unexpected_total(),
|
||||
|
|
@ -313,6 +315,7 @@ async fn get_minimal_payload_cached(
|
|||
available_pct: status.available_pct,
|
||||
required_writers: status.required_writers,
|
||||
alive_writers: status.alive_writers,
|
||||
coverage_ratio: status.coverage_ratio,
|
||||
coverage_pct: status.coverage_pct,
|
||||
fresh_alive_writers: status.fresh_alive_writers,
|
||||
fresh_coverage_pct: status.fresh_coverage_pct,
|
||||
|
|
@ -370,6 +373,7 @@ async fn get_minimal_payload_cached(
|
|||
floor_max: entry.floor_max,
|
||||
floor_capped: entry.floor_capped,
|
||||
alive_writers: entry.alive_writers,
|
||||
coverage_ratio: entry.coverage_ratio,
|
||||
coverage_pct: entry.coverage_pct,
|
||||
fresh_alive_writers: entry.fresh_alive_writers,
|
||||
fresh_coverage_pct: entry.fresh_coverage_pct,
|
||||
|
|
@ -427,6 +431,11 @@ async fn get_minimal_payload_cached(
|
|||
me_reconnect_backoff_cap_ms: runtime.me_reconnect_backoff_cap_ms,
|
||||
me_reconnect_fast_retry_count: runtime.me_reconnect_fast_retry_count,
|
||||
me_pool_drain_ttl_secs: runtime.me_pool_drain_ttl_secs,
|
||||
me_pool_drain_soft_evict_enabled: runtime.me_pool_drain_soft_evict_enabled,
|
||||
me_pool_drain_soft_evict_grace_secs: runtime.me_pool_drain_soft_evict_grace_secs,
|
||||
me_pool_drain_soft_evict_per_writer: runtime.me_pool_drain_soft_evict_per_writer,
|
||||
me_pool_drain_soft_evict_budget_per_core: runtime.me_pool_drain_soft_evict_budget_per_core,
|
||||
me_pool_drain_soft_evict_cooldown_ms: runtime.me_pool_drain_soft_evict_cooldown_ms,
|
||||
me_pool_force_close_secs: runtime.me_pool_force_close_secs,
|
||||
me_pool_min_fresh_ratio: runtime.me_pool_min_fresh_ratio,
|
||||
me_bind_stale_mode: runtime.me_bind_stale_mode,
|
||||
|
|
@ -495,6 +504,7 @@ fn disabled_me_writers(now_epoch_secs: u64, reason: &'static str) -> MeWritersDa
|
|||
available_pct: 0.0,
|
||||
required_writers: 0,
|
||||
alive_writers: 0,
|
||||
coverage_ratio: 0.0,
|
||||
coverage_pct: 0.0,
|
||||
fresh_alive_writers: 0,
|
||||
fresh_coverage_pct: 0.0,
|
||||
|
|
|
|||
|
|
@ -27,8 +27,8 @@ const DEFAULT_ME_C2ME_CHANNEL_CAPACITY: usize = 1024;
|
|||
const DEFAULT_ME_READER_ROUTE_DATA_WAIT_MS: u64 = 2;
|
||||
const DEFAULT_ME_D2C_FLUSH_BATCH_MAX_FRAMES: usize = 32;
|
||||
const DEFAULT_ME_D2C_FLUSH_BATCH_MAX_BYTES: usize = 128 * 1024;
|
||||
const DEFAULT_ME_D2C_FLUSH_BATCH_MAX_DELAY_US: u64 = 1500;
|
||||
const DEFAULT_ME_D2C_ACK_FLUSH_IMMEDIATE: bool = false;
|
||||
const DEFAULT_ME_D2C_FLUSH_BATCH_MAX_DELAY_US: u64 = 500;
|
||||
const DEFAULT_ME_D2C_ACK_FLUSH_IMMEDIATE: bool = true;
|
||||
const DEFAULT_DIRECT_RELAY_COPY_BUF_C2S_BYTES: usize = 64 * 1024;
|
||||
const DEFAULT_DIRECT_RELAY_COPY_BUF_S2C_BYTES: usize = 256 * 1024;
|
||||
const DEFAULT_ME_WRITER_PICK_SAMPLE_SIZE: u8 = 3;
|
||||
|
|
@ -36,7 +36,16 @@ const DEFAULT_ME_HEALTH_INTERVAL_MS_UNHEALTHY: u64 = 1000;
|
|||
const DEFAULT_ME_HEALTH_INTERVAL_MS_HEALTHY: u64 = 3000;
|
||||
const DEFAULT_ME_ADMISSION_POLL_MS: u64 = 1000;
|
||||
const DEFAULT_ME_WARN_RATE_LIMIT_MS: u64 = 5000;
|
||||
const DEFAULT_ME_ROUTE_HYBRID_MAX_WAIT_MS: u64 = 3000;
|
||||
const DEFAULT_ME_ROUTE_BLOCKING_SEND_TIMEOUT_MS: u64 = 250;
|
||||
const DEFAULT_ME_C2ME_SEND_TIMEOUT_MS: u64 = 4000;
|
||||
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_ENABLED: bool = true;
|
||||
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_GRACE_SECS: u64 = 30;
|
||||
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_PER_WRITER: u8 = 1;
|
||||
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_BUDGET_PER_CORE: u16 = 8;
|
||||
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_COOLDOWN_MS: u64 = 5000;
|
||||
const DEFAULT_USER_MAX_UNIQUE_IPS_WINDOW_SECS: u64 = 30;
|
||||
const DEFAULT_ACCEPT_PERMIT_TIMEOUT_MS: u64 = 250;
|
||||
const DEFAULT_UPSTREAM_CONNECT_RETRY_ATTEMPTS: u32 = 2;
|
||||
const DEFAULT_UPSTREAM_UNHEALTHY_FAIL_THRESHOLD: u32 = 5;
|
||||
const DEFAULT_UPSTREAM_CONNECT_BUDGET_MS: u64 = 3000;
|
||||
|
|
@ -87,11 +96,11 @@ pub(crate) fn default_connect_timeout() -> u64 {
|
|||
}
|
||||
|
||||
pub(crate) fn default_keepalive() -> u64 {
|
||||
60
|
||||
15
|
||||
}
|
||||
|
||||
pub(crate) fn default_ack_timeout() -> u64 {
|
||||
300
|
||||
90
|
||||
}
|
||||
pub(crate) fn default_me_one_retry() -> u8 {
|
||||
12
|
||||
|
|
@ -153,6 +162,10 @@ pub(crate) fn default_server_max_connections() -> u32 {
|
|||
10_000
|
||||
}
|
||||
|
||||
pub(crate) fn default_accept_permit_timeout_ms() -> u64 {
|
||||
DEFAULT_ACCEPT_PERMIT_TIMEOUT_MS
|
||||
}
|
||||
|
||||
pub(crate) fn default_prefer_4() -> u8 {
|
||||
4
|
||||
}
|
||||
|
|
@ -377,6 +390,18 @@ pub(crate) fn default_me_warn_rate_limit_ms() -> u64 {
|
|||
DEFAULT_ME_WARN_RATE_LIMIT_MS
|
||||
}
|
||||
|
||||
pub(crate) fn default_me_route_hybrid_max_wait_ms() -> u64 {
|
||||
DEFAULT_ME_ROUTE_HYBRID_MAX_WAIT_MS
|
||||
}
|
||||
|
||||
pub(crate) fn default_me_route_blocking_send_timeout_ms() -> u64 {
|
||||
DEFAULT_ME_ROUTE_BLOCKING_SEND_TIMEOUT_MS
|
||||
}
|
||||
|
||||
pub(crate) fn default_me_c2me_send_timeout_ms() -> u64 {
|
||||
DEFAULT_ME_C2ME_SEND_TIMEOUT_MS
|
||||
}
|
||||
|
||||
pub(crate) fn default_upstream_connect_retry_attempts() -> u32 {
|
||||
DEFAULT_UPSTREAM_CONNECT_RETRY_ATTEMPTS
|
||||
}
|
||||
|
|
@ -594,6 +619,26 @@ pub(crate) fn default_me_pool_drain_threshold() -> u64 {
|
|||
128
|
||||
}
|
||||
|
||||
pub(crate) fn default_me_pool_drain_soft_evict_enabled() -> bool {
|
||||
DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_ENABLED
|
||||
}
|
||||
|
||||
pub(crate) fn default_me_pool_drain_soft_evict_grace_secs() -> u64 {
|
||||
DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_GRACE_SECS
|
||||
}
|
||||
|
||||
pub(crate) fn default_me_pool_drain_soft_evict_per_writer() -> u8 {
|
||||
DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_PER_WRITER
|
||||
}
|
||||
|
||||
pub(crate) fn default_me_pool_drain_soft_evict_budget_per_core() -> u16 {
|
||||
DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_BUDGET_PER_CORE
|
||||
}
|
||||
|
||||
pub(crate) fn default_me_pool_drain_soft_evict_cooldown_ms() -> u64 {
|
||||
DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_COOLDOWN_MS
|
||||
}
|
||||
|
||||
pub(crate) fn default_me_bind_stale_ttl_secs() -> u64 {
|
||||
default_me_pool_drain_ttl_secs()
|
||||
}
|
||||
|
|
|
|||
|
|
@ -346,6 +346,12 @@ impl ProxyConfig {
|
|||
));
|
||||
}
|
||||
|
||||
if config.general.me_c2me_send_timeout_ms > 60_000 {
|
||||
return Err(ProxyError::Config(
|
||||
"general.me_c2me_send_timeout_ms must be within [0, 60000]".to_string(),
|
||||
));
|
||||
}
|
||||
|
||||
if config.general.me_reader_route_data_wait_ms > 20 {
|
||||
return Err(ProxyError::Config(
|
||||
"general.me_reader_route_data_wait_ms must be within [0, 20]".to_string(),
|
||||
|
|
@ -406,6 +412,35 @@ impl ProxyConfig {
|
|||
));
|
||||
}
|
||||
|
||||
if config.general.me_pool_drain_soft_evict_grace_secs > 3600 {
|
||||
return Err(ProxyError::Config(
|
||||
"general.me_pool_drain_soft_evict_grace_secs must be within [0, 3600]".to_string(),
|
||||
));
|
||||
}
|
||||
|
||||
if config.general.me_pool_drain_soft_evict_per_writer == 0
|
||||
|| config.general.me_pool_drain_soft_evict_per_writer > 16
|
||||
{
|
||||
return Err(ProxyError::Config(
|
||||
"general.me_pool_drain_soft_evict_per_writer must be within [1, 16]".to_string(),
|
||||
));
|
||||
}
|
||||
|
||||
if config.general.me_pool_drain_soft_evict_budget_per_core == 0
|
||||
|| config.general.me_pool_drain_soft_evict_budget_per_core > 64
|
||||
{
|
||||
return Err(ProxyError::Config(
|
||||
"general.me_pool_drain_soft_evict_budget_per_core must be within [1, 64]"
|
||||
.to_string(),
|
||||
));
|
||||
}
|
||||
|
||||
if config.general.me_pool_drain_soft_evict_cooldown_ms == 0 {
|
||||
return Err(ProxyError::Config(
|
||||
"general.me_pool_drain_soft_evict_cooldown_ms must be > 0".to_string(),
|
||||
));
|
||||
}
|
||||
|
||||
if config.access.user_max_unique_ips_window_secs == 0 {
|
||||
return Err(ProxyError::Config(
|
||||
"access.user_max_unique_ips_window_secs must be > 0".to_string(),
|
||||
|
|
@ -577,6 +612,11 @@ impl ProxyConfig {
|
|||
"general.me_route_backpressure_base_timeout_ms must be > 0".to_string(),
|
||||
));
|
||||
}
|
||||
if config.general.me_route_backpressure_base_timeout_ms > 5000 {
|
||||
return Err(ProxyError::Config(
|
||||
"general.me_route_backpressure_base_timeout_ms must be within [1, 5000]".to_string(),
|
||||
));
|
||||
}
|
||||
|
||||
if config.general.me_route_backpressure_high_timeout_ms
|
||||
< config.general.me_route_backpressure_base_timeout_ms
|
||||
|
|
@ -585,6 +625,11 @@ impl ProxyConfig {
|
|||
"general.me_route_backpressure_high_timeout_ms must be >= general.me_route_backpressure_base_timeout_ms".to_string(),
|
||||
));
|
||||
}
|
||||
if config.general.me_route_backpressure_high_timeout_ms > 5000 {
|
||||
return Err(ProxyError::Config(
|
||||
"general.me_route_backpressure_high_timeout_ms must be within [1, 5000]".to_string(),
|
||||
));
|
||||
}
|
||||
|
||||
if !(1..=100).contains(&config.general.me_route_backpressure_high_watermark_pct) {
|
||||
return Err(ProxyError::Config(
|
||||
|
|
@ -598,6 +643,18 @@ impl ProxyConfig {
|
|||
));
|
||||
}
|
||||
|
||||
if !(50..=60_000).contains(&config.general.me_route_hybrid_max_wait_ms) {
|
||||
return Err(ProxyError::Config(
|
||||
"general.me_route_hybrid_max_wait_ms must be within [50, 60000]".to_string(),
|
||||
));
|
||||
}
|
||||
|
||||
if config.general.me_route_blocking_send_timeout_ms > 5000 {
|
||||
return Err(ProxyError::Config(
|
||||
"general.me_route_blocking_send_timeout_ms must be within [0, 5000]".to_string(),
|
||||
));
|
||||
}
|
||||
|
||||
if !(2..=4).contains(&config.general.me_writer_pick_sample_size) {
|
||||
return Err(ProxyError::Config(
|
||||
"general.me_writer_pick_sample_size must be within [2, 4]".to_string(),
|
||||
|
|
@ -658,6 +715,12 @@ impl ProxyConfig {
|
|||
));
|
||||
}
|
||||
|
||||
if config.server.accept_permit_timeout_ms > 60_000 {
|
||||
return Err(ProxyError::Config(
|
||||
"server.accept_permit_timeout_ms must be within [0, 60000]".to_string(),
|
||||
));
|
||||
}
|
||||
|
||||
if config.general.effective_me_pool_force_close_secs() > 0
|
||||
&& config.general.effective_me_pool_force_close_secs()
|
||||
< config.general.me_pool_drain_ttl_secs
|
||||
|
|
@ -1571,6 +1634,47 @@ mod tests {
|
|||
let _ = std::fs::remove_file(path_valid);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn me_route_backpressure_base_timeout_ms_out_of_range_is_rejected() {
|
||||
let toml = r#"
|
||||
[general]
|
||||
me_route_backpressure_base_timeout_ms = 5001
|
||||
|
||||
[censorship]
|
||||
tls_domain = "example.com"
|
||||
|
||||
[access.users]
|
||||
user = "00000000000000000000000000000000"
|
||||
"#;
|
||||
let dir = std::env::temp_dir();
|
||||
let path = dir.join("telemt_me_route_backpressure_base_timeout_ms_out_of_range_test.toml");
|
||||
std::fs::write(&path, toml).unwrap();
|
||||
let err = ProxyConfig::load(&path).unwrap_err().to_string();
|
||||
assert!(err.contains("general.me_route_backpressure_base_timeout_ms must be within [1, 5000]"));
|
||||
let _ = std::fs::remove_file(path);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn me_route_backpressure_high_timeout_ms_out_of_range_is_rejected() {
|
||||
let toml = r#"
|
||||
[general]
|
||||
me_route_backpressure_base_timeout_ms = 100
|
||||
me_route_backpressure_high_timeout_ms = 5001
|
||||
|
||||
[censorship]
|
||||
tls_domain = "example.com"
|
||||
|
||||
[access.users]
|
||||
user = "00000000000000000000000000000000"
|
||||
"#;
|
||||
let dir = std::env::temp_dir();
|
||||
let path = dir.join("telemt_me_route_backpressure_high_timeout_ms_out_of_range_test.toml");
|
||||
std::fs::write(&path, toml).unwrap();
|
||||
let err = ProxyConfig::load(&path).unwrap_err().to_string();
|
||||
assert!(err.contains("general.me_route_backpressure_high_timeout_ms must be within [1, 5000]"));
|
||||
let _ = std::fs::remove_file(path);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn me_route_no_writer_wait_ms_out_of_range_is_rejected() {
|
||||
let toml = r#"
|
||||
|
|
|
|||
|
|
@ -462,6 +462,11 @@ pub struct GeneralConfig {
|
|||
#[serde(default = "default_me_c2me_channel_capacity")]
|
||||
pub me_c2me_channel_capacity: usize,
|
||||
|
||||
/// Maximum wait in milliseconds for enqueueing C2ME commands when the queue is full.
|
||||
/// `0` keeps legacy unbounded wait behavior.
|
||||
#[serde(default = "default_me_c2me_send_timeout_ms")]
|
||||
pub me_c2me_send_timeout_ms: u64,
|
||||
|
||||
/// Bounded wait in milliseconds for routing ME DATA to per-connection queue.
|
||||
/// `0` keeps legacy no-wait behavior.
|
||||
#[serde(default = "default_me_reader_route_data_wait_ms")]
|
||||
|
|
@ -716,6 +721,15 @@ pub struct GeneralConfig {
|
|||
#[serde(default = "default_me_route_no_writer_wait_ms")]
|
||||
pub me_route_no_writer_wait_ms: u64,
|
||||
|
||||
/// Maximum cumulative wait in milliseconds for hybrid no-writer mode before failfast.
|
||||
#[serde(default = "default_me_route_hybrid_max_wait_ms")]
|
||||
pub me_route_hybrid_max_wait_ms: u64,
|
||||
|
||||
/// Maximum wait in milliseconds for blocking ME writer channel send fallback.
|
||||
/// `0` keeps legacy unbounded wait behavior.
|
||||
#[serde(default = "default_me_route_blocking_send_timeout_ms")]
|
||||
pub me_route_blocking_send_timeout_ms: u64,
|
||||
|
||||
/// Number of inline recovery attempts in legacy mode.
|
||||
#[serde(default = "default_me_route_inline_recovery_attempts")]
|
||||
pub me_route_inline_recovery_attempts: u32,
|
||||
|
|
@ -803,6 +817,26 @@ pub struct GeneralConfig {
|
|||
#[serde(default = "default_me_pool_drain_threshold")]
|
||||
pub me_pool_drain_threshold: u64,
|
||||
|
||||
/// Enable staged client eviction for draining ME writers that remain non-empty past TTL.
|
||||
#[serde(default = "default_me_pool_drain_soft_evict_enabled")]
|
||||
pub me_pool_drain_soft_evict_enabled: bool,
|
||||
|
||||
/// Extra grace in seconds after drain TTL before soft-eviction stage starts.
|
||||
#[serde(default = "default_me_pool_drain_soft_evict_grace_secs")]
|
||||
pub me_pool_drain_soft_evict_grace_secs: u64,
|
||||
|
||||
/// Maximum number of client sessions to evict from one draining writer per health tick.
|
||||
#[serde(default = "default_me_pool_drain_soft_evict_per_writer")]
|
||||
pub me_pool_drain_soft_evict_per_writer: u8,
|
||||
|
||||
/// Soft-eviction budget per CPU core for one health tick.
|
||||
#[serde(default = "default_me_pool_drain_soft_evict_budget_per_core")]
|
||||
pub me_pool_drain_soft_evict_budget_per_core: u16,
|
||||
|
||||
/// Cooldown for repetitive soft-eviction on the same writer in milliseconds.
|
||||
#[serde(default = "default_me_pool_drain_soft_evict_cooldown_ms")]
|
||||
pub me_pool_drain_soft_evict_cooldown_ms: u64,
|
||||
|
||||
/// Policy for new binds on stale draining writers.
|
||||
#[serde(default)]
|
||||
pub me_bind_stale_mode: MeBindStaleMode,
|
||||
|
|
@ -901,6 +935,7 @@ impl Default for GeneralConfig {
|
|||
me_writer_cmd_channel_capacity: default_me_writer_cmd_channel_capacity(),
|
||||
me_route_channel_capacity: default_me_route_channel_capacity(),
|
||||
me_c2me_channel_capacity: default_me_c2me_channel_capacity(),
|
||||
me_c2me_send_timeout_ms: default_me_c2me_send_timeout_ms(),
|
||||
me_reader_route_data_wait_ms: default_me_reader_route_data_wait_ms(),
|
||||
me_d2c_flush_batch_max_frames: default_me_d2c_flush_batch_max_frames(),
|
||||
me_d2c_flush_batch_max_bytes: default_me_d2c_flush_batch_max_bytes(),
|
||||
|
|
@ -955,6 +990,8 @@ impl Default for GeneralConfig {
|
|||
me_warn_rate_limit_ms: default_me_warn_rate_limit_ms(),
|
||||
me_route_no_writer_mode: MeRouteNoWriterMode::default(),
|
||||
me_route_no_writer_wait_ms: default_me_route_no_writer_wait_ms(),
|
||||
me_route_hybrid_max_wait_ms: default_me_route_hybrid_max_wait_ms(),
|
||||
me_route_blocking_send_timeout_ms: default_me_route_blocking_send_timeout_ms(),
|
||||
me_route_inline_recovery_attempts: default_me_route_inline_recovery_attempts(),
|
||||
me_route_inline_recovery_wait_ms: default_me_route_inline_recovery_wait_ms(),
|
||||
links: LinksConfig::default(),
|
||||
|
|
@ -984,6 +1021,13 @@ impl Default for GeneralConfig {
|
|||
proxy_secret_len_max: default_proxy_secret_len_max(),
|
||||
me_pool_drain_ttl_secs: default_me_pool_drain_ttl_secs(),
|
||||
me_pool_drain_threshold: default_me_pool_drain_threshold(),
|
||||
me_pool_drain_soft_evict_enabled: default_me_pool_drain_soft_evict_enabled(),
|
||||
me_pool_drain_soft_evict_grace_secs: default_me_pool_drain_soft_evict_grace_secs(),
|
||||
me_pool_drain_soft_evict_per_writer: default_me_pool_drain_soft_evict_per_writer(),
|
||||
me_pool_drain_soft_evict_budget_per_core:
|
||||
default_me_pool_drain_soft_evict_budget_per_core(),
|
||||
me_pool_drain_soft_evict_cooldown_ms:
|
||||
default_me_pool_drain_soft_evict_cooldown_ms(),
|
||||
me_bind_stale_mode: MeBindStaleMode::default(),
|
||||
me_bind_stale_ttl_secs: default_me_bind_stale_ttl_secs(),
|
||||
me_pool_min_fresh_ratio: default_me_pool_min_fresh_ratio(),
|
||||
|
|
@ -1187,6 +1231,11 @@ pub struct ServerConfig {
|
|||
/// 0 means unlimited.
|
||||
#[serde(default = "default_server_max_connections")]
|
||||
pub max_connections: u32,
|
||||
|
||||
/// Maximum wait in milliseconds while acquiring a connection slot permit.
|
||||
/// `0` keeps legacy unbounded wait behavior.
|
||||
#[serde(default = "default_accept_permit_timeout_ms")]
|
||||
pub accept_permit_timeout_ms: u64,
|
||||
}
|
||||
|
||||
impl Default for ServerConfig {
|
||||
|
|
@ -1207,6 +1256,7 @@ impl Default for ServerConfig {
|
|||
api: ApiConfig::default(),
|
||||
listeners: Vec::new(),
|
||||
max_connections: default_server_max_connections(),
|
||||
accept_permit_timeout_ms: default_accept_permit_timeout_ms(),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
|
|||
|
|
@ -253,6 +253,7 @@ pub(crate) fn format_uptime(total_secs: u64) -> String {
|
|||
format!("{} / {} seconds", parts.join(", "), total_secs)
|
||||
}
|
||||
|
||||
#[allow(dead_code)]
|
||||
pub(crate) async fn wait_until_admission_open(admission_rx: &mut watch::Receiver<bool>) -> bool {
|
||||
loop {
|
||||
if *admission_rx.borrow() {
|
||||
|
|
|
|||
|
|
@ -24,7 +24,7 @@ use crate::transport::{
|
|||
ListenOptions, UpstreamManager, create_listener, find_listener_processes,
|
||||
};
|
||||
|
||||
use super::helpers::{is_expected_handshake_eof, print_proxy_links, wait_until_admission_open};
|
||||
use super::helpers::{is_expected_handshake_eof, print_proxy_links};
|
||||
|
||||
pub(crate) struct BoundListeners {
|
||||
pub(crate) listeners: Vec<(TcpListener, bool)>,
|
||||
|
|
@ -195,7 +195,7 @@ pub(crate) async fn bind_listeners(
|
|||
has_unix_listener = true;
|
||||
|
||||
let mut config_rx_unix: watch::Receiver<Arc<ProxyConfig>> = config_rx.clone();
|
||||
let mut admission_rx_unix = admission_rx.clone();
|
||||
let admission_rx_unix = admission_rx.clone();
|
||||
let stats = stats.clone();
|
||||
let upstream_manager = upstream_manager.clone();
|
||||
let replay_checker = replay_checker.clone();
|
||||
|
|
@ -212,17 +212,44 @@ pub(crate) async fn bind_listeners(
|
|||
let unix_conn_counter = Arc::new(std::sync::atomic::AtomicU64::new(1));
|
||||
|
||||
loop {
|
||||
if !wait_until_admission_open(&mut admission_rx_unix).await {
|
||||
warn!("Conditional-admission gate channel closed for unix listener");
|
||||
break;
|
||||
}
|
||||
match unix_listener.accept().await {
|
||||
Ok((stream, _)) => {
|
||||
let permit = match max_connections_unix.clone().acquire_owned().await {
|
||||
Ok(permit) => permit,
|
||||
Err(_) => {
|
||||
error!("Connection limiter is closed");
|
||||
break;
|
||||
if !*admission_rx_unix.borrow() {
|
||||
drop(stream);
|
||||
continue;
|
||||
}
|
||||
let accept_permit_timeout_ms = config_rx_unix
|
||||
.borrow()
|
||||
.server
|
||||
.accept_permit_timeout_ms;
|
||||
let permit = if accept_permit_timeout_ms == 0 {
|
||||
match max_connections_unix.clone().acquire_owned().await {
|
||||
Ok(permit) => permit,
|
||||
Err(_) => {
|
||||
error!("Connection limiter is closed");
|
||||
break;
|
||||
}
|
||||
}
|
||||
} else {
|
||||
match tokio::time::timeout(
|
||||
Duration::from_millis(accept_permit_timeout_ms),
|
||||
max_connections_unix.clone().acquire_owned(),
|
||||
)
|
||||
.await
|
||||
{
|
||||
Ok(Ok(permit)) => permit,
|
||||
Ok(Err(_)) => {
|
||||
error!("Connection limiter is closed");
|
||||
break;
|
||||
}
|
||||
Err(_) => {
|
||||
debug!(
|
||||
timeout_ms = accept_permit_timeout_ms,
|
||||
"Dropping accepted unix connection: permit wait timeout"
|
||||
);
|
||||
drop(stream);
|
||||
continue;
|
||||
}
|
||||
}
|
||||
};
|
||||
let conn_id =
|
||||
|
|
@ -312,7 +339,7 @@ pub(crate) fn spawn_tcp_accept_loops(
|
|||
) {
|
||||
for (listener, listener_proxy_protocol) in listeners {
|
||||
let mut config_rx: watch::Receiver<Arc<ProxyConfig>> = config_rx.clone();
|
||||
let mut admission_rx_tcp = admission_rx.clone();
|
||||
let admission_rx_tcp = admission_rx.clone();
|
||||
let stats = stats.clone();
|
||||
let upstream_manager = upstream_manager.clone();
|
||||
let replay_checker = replay_checker.clone();
|
||||
|
|
@ -327,17 +354,46 @@ pub(crate) fn spawn_tcp_accept_loops(
|
|||
|
||||
tokio::spawn(async move {
|
||||
loop {
|
||||
if !wait_until_admission_open(&mut admission_rx_tcp).await {
|
||||
warn!("Conditional-admission gate channel closed for tcp listener");
|
||||
break;
|
||||
}
|
||||
match listener.accept().await {
|
||||
Ok((stream, peer_addr)) => {
|
||||
let permit = match max_connections_tcp.clone().acquire_owned().await {
|
||||
Ok(permit) => permit,
|
||||
Err(_) => {
|
||||
error!("Connection limiter is closed");
|
||||
break;
|
||||
if !*admission_rx_tcp.borrow() {
|
||||
debug!(peer = %peer_addr, "Admission gate closed, dropping connection");
|
||||
drop(stream);
|
||||
continue;
|
||||
}
|
||||
let accept_permit_timeout_ms = config_rx
|
||||
.borrow()
|
||||
.server
|
||||
.accept_permit_timeout_ms;
|
||||
let permit = if accept_permit_timeout_ms == 0 {
|
||||
match max_connections_tcp.clone().acquire_owned().await {
|
||||
Ok(permit) => permit,
|
||||
Err(_) => {
|
||||
error!("Connection limiter is closed");
|
||||
break;
|
||||
}
|
||||
}
|
||||
} else {
|
||||
match tokio::time::timeout(
|
||||
Duration::from_millis(accept_permit_timeout_ms),
|
||||
max_connections_tcp.clone().acquire_owned(),
|
||||
)
|
||||
.await
|
||||
{
|
||||
Ok(Ok(permit)) => permit,
|
||||
Ok(Err(_)) => {
|
||||
error!("Connection limiter is closed");
|
||||
break;
|
||||
}
|
||||
Err(_) => {
|
||||
debug!(
|
||||
peer = %peer_addr,
|
||||
timeout_ms = accept_permit_timeout_ms,
|
||||
"Dropping accepted connection: permit wait timeout"
|
||||
);
|
||||
drop(stream);
|
||||
continue;
|
||||
}
|
||||
}
|
||||
};
|
||||
let config = config_rx.borrow_and_update().clone();
|
||||
|
|
|
|||
|
|
@ -238,6 +238,11 @@ pub(crate) async fn initialize_me_pool(
|
|||
config.general.hardswap,
|
||||
config.general.me_pool_drain_ttl_secs,
|
||||
config.general.me_pool_drain_threshold,
|
||||
config.general.me_pool_drain_soft_evict_enabled,
|
||||
config.general.me_pool_drain_soft_evict_grace_secs,
|
||||
config.general.me_pool_drain_soft_evict_per_writer,
|
||||
config.general.me_pool_drain_soft_evict_budget_per_core,
|
||||
config.general.me_pool_drain_soft_evict_cooldown_ms,
|
||||
config.general.effective_me_pool_force_close_secs(),
|
||||
config.general.me_pool_min_fresh_ratio,
|
||||
config.general.me_hardswap_warmup_delay_min_ms,
|
||||
|
|
@ -262,6 +267,8 @@ pub(crate) async fn initialize_me_pool(
|
|||
config.general.me_warn_rate_limit_ms,
|
||||
config.general.me_route_no_writer_mode,
|
||||
config.general.me_route_no_writer_wait_ms,
|
||||
config.general.me_route_hybrid_max_wait_ms,
|
||||
config.general.me_route_blocking_send_timeout_ms,
|
||||
config.general.me_route_inline_recovery_attempts,
|
||||
config.general.me_route_inline_recovery_wait_ms,
|
||||
);
|
||||
|
|
|
|||
|
|
@ -484,7 +484,7 @@ pub async fn run() -> std::result::Result<(), Box<dyn std::error::Error>> {
|
|||
Duration::from_secs(config.access.replay_window_secs),
|
||||
));
|
||||
|
||||
let buffer_pool = Arc::new(BufferPool::with_config(16 * 1024, 4096));
|
||||
let buffer_pool = Arc::new(BufferPool::with_config(64 * 1024, 4096));
|
||||
|
||||
connectivity::run_startup_connectivity(
|
||||
&config,
|
||||
|
|
|
|||
209
src/metrics.rs
209
src/metrics.rs
|
|
@ -292,6 +292,109 @@ async fn render_metrics(stats: &Stats, config: &ProxyConfig, ip_tracker: &UserIp
|
|||
"telemt_connections_bad_total {}",
|
||||
if core_enabled { stats.get_connects_bad() } else { 0 }
|
||||
);
|
||||
let _ = writeln!(out, "# HELP telemt_connections_current Current active connections");
|
||||
let _ = writeln!(out, "# TYPE telemt_connections_current gauge");
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"telemt_connections_current {}",
|
||||
if core_enabled {
|
||||
stats.get_current_connections_total()
|
||||
} else {
|
||||
0
|
||||
}
|
||||
);
|
||||
let _ = writeln!(out, "# HELP telemt_connections_direct_current Current active direct connections");
|
||||
let _ = writeln!(out, "# TYPE telemt_connections_direct_current gauge");
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"telemt_connections_direct_current {}",
|
||||
if core_enabled {
|
||||
stats.get_current_connections_direct()
|
||||
} else {
|
||||
0
|
||||
}
|
||||
);
|
||||
let _ = writeln!(out, "# HELP telemt_connections_me_current Current active middle-end connections");
|
||||
let _ = writeln!(out, "# TYPE telemt_connections_me_current gauge");
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"telemt_connections_me_current {}",
|
||||
if core_enabled {
|
||||
stats.get_current_connections_me()
|
||||
} else {
|
||||
0
|
||||
}
|
||||
);
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"# HELP telemt_relay_adaptive_promotions_total Adaptive relay tier promotions"
|
||||
);
|
||||
let _ = writeln!(out, "# TYPE telemt_relay_adaptive_promotions_total counter");
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"telemt_relay_adaptive_promotions_total {}",
|
||||
if core_enabled {
|
||||
stats.get_relay_adaptive_promotions_total()
|
||||
} else {
|
||||
0
|
||||
}
|
||||
);
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"# HELP telemt_relay_adaptive_demotions_total Adaptive relay tier demotions"
|
||||
);
|
||||
let _ = writeln!(out, "# TYPE telemt_relay_adaptive_demotions_total counter");
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"telemt_relay_adaptive_demotions_total {}",
|
||||
if core_enabled {
|
||||
stats.get_relay_adaptive_demotions_total()
|
||||
} else {
|
||||
0
|
||||
}
|
||||
);
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"# HELP telemt_relay_adaptive_hard_promotions_total Adaptive relay hard promotions triggered by write pressure"
|
||||
);
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"# TYPE telemt_relay_adaptive_hard_promotions_total counter"
|
||||
);
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"telemt_relay_adaptive_hard_promotions_total {}",
|
||||
if core_enabled {
|
||||
stats.get_relay_adaptive_hard_promotions_total()
|
||||
} else {
|
||||
0
|
||||
}
|
||||
);
|
||||
let _ = writeln!(out, "# HELP telemt_reconnect_evict_total Reconnect-driven session evictions");
|
||||
let _ = writeln!(out, "# TYPE telemt_reconnect_evict_total counter");
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"telemt_reconnect_evict_total {}",
|
||||
if core_enabled {
|
||||
stats.get_reconnect_evict_total()
|
||||
} else {
|
||||
0
|
||||
}
|
||||
);
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"# HELP telemt_reconnect_stale_close_total Sessions closed because they became stale after reconnect"
|
||||
);
|
||||
let _ = writeln!(out, "# TYPE telemt_reconnect_stale_close_total counter");
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"telemt_reconnect_stale_close_total {}",
|
||||
if core_enabled {
|
||||
stats.get_reconnect_stale_close_total()
|
||||
} else {
|
||||
0
|
||||
}
|
||||
);
|
||||
|
||||
let _ = writeln!(out, "# HELP telemt_handshake_timeouts_total Handshake timeouts");
|
||||
let _ = writeln!(out, "# TYPE telemt_handshake_timeouts_total counter");
|
||||
|
|
@ -1547,6 +1650,36 @@ async fn render_metrics(stats: &Stats, config: &ProxyConfig, ip_tracker: &UserIp
|
|||
}
|
||||
);
|
||||
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"# HELP telemt_pool_drain_soft_evict_total Soft-evicted client sessions on stuck draining writers"
|
||||
);
|
||||
let _ = writeln!(out, "# TYPE telemt_pool_drain_soft_evict_total counter");
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"telemt_pool_drain_soft_evict_total {}",
|
||||
if me_allows_normal {
|
||||
stats.get_pool_drain_soft_evict_total()
|
||||
} else {
|
||||
0
|
||||
}
|
||||
);
|
||||
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"# HELP telemt_pool_drain_soft_evict_writer_total Draining writers with at least one soft eviction"
|
||||
);
|
||||
let _ = writeln!(out, "# TYPE telemt_pool_drain_soft_evict_writer_total counter");
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"telemt_pool_drain_soft_evict_writer_total {}",
|
||||
if me_allows_normal {
|
||||
stats.get_pool_drain_soft_evict_writer_total()
|
||||
} else {
|
||||
0
|
||||
}
|
||||
);
|
||||
|
||||
let _ = writeln!(out, "# HELP telemt_pool_stale_pick_total Stale writer fallback picks for new binds");
|
||||
let _ = writeln!(out, "# TYPE telemt_pool_stale_pick_total counter");
|
||||
let _ = writeln!(
|
||||
|
|
@ -1559,6 +1692,57 @@ async fn render_metrics(stats: &Stats, config: &ProxyConfig, ip_tracker: &UserIp
|
|||
}
|
||||
);
|
||||
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"# HELP telemt_me_writer_close_signal_drop_total Close-signal drops for already-removed ME writers"
|
||||
);
|
||||
let _ = writeln!(out, "# TYPE telemt_me_writer_close_signal_drop_total counter");
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"telemt_me_writer_close_signal_drop_total {}",
|
||||
if me_allows_normal {
|
||||
stats.get_me_writer_close_signal_drop_total()
|
||||
} else {
|
||||
0
|
||||
}
|
||||
);
|
||||
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"# HELP telemt_me_writer_close_signal_channel_full_total Close-signal drops caused by full writer command channels"
|
||||
);
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"# TYPE telemt_me_writer_close_signal_channel_full_total counter"
|
||||
);
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"telemt_me_writer_close_signal_channel_full_total {}",
|
||||
if me_allows_normal {
|
||||
stats.get_me_writer_close_signal_channel_full_total()
|
||||
} else {
|
||||
0
|
||||
}
|
||||
);
|
||||
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"# HELP telemt_me_draining_writers_reap_progress_total Draining-writer removals processed by reap cleanup"
|
||||
);
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"# TYPE telemt_me_draining_writers_reap_progress_total counter"
|
||||
);
|
||||
let _ = writeln!(
|
||||
out,
|
||||
"telemt_me_draining_writers_reap_progress_total {}",
|
||||
if me_allows_normal {
|
||||
stats.get_me_draining_writers_reap_progress_total()
|
||||
} else {
|
||||
0
|
||||
}
|
||||
);
|
||||
|
||||
let _ = writeln!(out, "# HELP telemt_me_writer_removed_total Total ME writer removals");
|
||||
let _ = writeln!(out, "# TYPE telemt_me_writer_removed_total counter");
|
||||
let _ = writeln!(
|
||||
|
|
@ -1864,6 +2048,8 @@ mod tests {
|
|||
stats.increment_connects_all();
|
||||
stats.increment_connects_all();
|
||||
stats.increment_connects_bad();
|
||||
stats.increment_current_connections_direct();
|
||||
stats.increment_current_connections_me();
|
||||
stats.increment_handshake_timeouts();
|
||||
stats.increment_upstream_connect_attempt_total();
|
||||
stats.increment_upstream_connect_attempt_total();
|
||||
|
|
@ -1895,6 +2081,9 @@ mod tests {
|
|||
|
||||
assert!(output.contains("telemt_connections_total 2"));
|
||||
assert!(output.contains("telemt_connections_bad_total 1"));
|
||||
assert!(output.contains("telemt_connections_current 2"));
|
||||
assert!(output.contains("telemt_connections_direct_current 1"));
|
||||
assert!(output.contains("telemt_connections_me_current 1"));
|
||||
assert!(output.contains("telemt_handshake_timeouts_total 1"));
|
||||
assert!(output.contains("telemt_upstream_connect_attempt_total 2"));
|
||||
assert!(output.contains("telemt_upstream_connect_success_total 1"));
|
||||
|
|
@ -1937,6 +2126,9 @@ mod tests {
|
|||
let output = render_metrics(&stats, &config, &tracker).await;
|
||||
assert!(output.contains("telemt_connections_total 0"));
|
||||
assert!(output.contains("telemt_connections_bad_total 0"));
|
||||
assert!(output.contains("telemt_connections_current 0"));
|
||||
assert!(output.contains("telemt_connections_direct_current 0"));
|
||||
assert!(output.contains("telemt_connections_me_current 0"));
|
||||
assert!(output.contains("telemt_handshake_timeouts_total 0"));
|
||||
assert!(output.contains("telemt_user_unique_ips_current{user="));
|
||||
assert!(output.contains("telemt_user_unique_ips_recent_window{user="));
|
||||
|
|
@ -1970,11 +2162,28 @@ mod tests {
|
|||
assert!(output.contains("# TYPE telemt_uptime_seconds gauge"));
|
||||
assert!(output.contains("# TYPE telemt_connections_total counter"));
|
||||
assert!(output.contains("# TYPE telemt_connections_bad_total counter"));
|
||||
assert!(output.contains("# TYPE telemt_connections_current gauge"));
|
||||
assert!(output.contains("# TYPE telemt_connections_direct_current gauge"));
|
||||
assert!(output.contains("# TYPE telemt_connections_me_current gauge"));
|
||||
assert!(output.contains("# TYPE telemt_relay_adaptive_promotions_total counter"));
|
||||
assert!(output.contains("# TYPE telemt_relay_adaptive_demotions_total counter"));
|
||||
assert!(output.contains("# TYPE telemt_relay_adaptive_hard_promotions_total counter"));
|
||||
assert!(output.contains("# TYPE telemt_reconnect_evict_total counter"));
|
||||
assert!(output.contains("# TYPE telemt_reconnect_stale_close_total counter"));
|
||||
assert!(output.contains("# TYPE telemt_handshake_timeouts_total counter"));
|
||||
assert!(output.contains("# TYPE telemt_upstream_connect_attempt_total counter"));
|
||||
assert!(output.contains("# TYPE telemt_me_rpc_proxy_req_signal_sent_total counter"));
|
||||
assert!(output.contains("# TYPE telemt_me_idle_close_by_peer_total counter"));
|
||||
assert!(output.contains("# TYPE telemt_me_writer_removed_total counter"));
|
||||
assert!(output.contains("# TYPE telemt_me_writer_close_signal_drop_total counter"));
|
||||
assert!(output.contains(
|
||||
"# TYPE telemt_me_writer_close_signal_channel_full_total counter"
|
||||
));
|
||||
assert!(output.contains(
|
||||
"# TYPE telemt_me_draining_writers_reap_progress_total counter"
|
||||
));
|
||||
assert!(output.contains("# TYPE telemt_pool_drain_soft_evict_total counter"));
|
||||
assert!(output.contains("# TYPE telemt_pool_drain_soft_evict_writer_total counter"));
|
||||
assert!(output.contains(
|
||||
"# TYPE telemt_me_writer_removed_unexpected_minus_restored_total gauge"
|
||||
));
|
||||
|
|
|
|||
|
|
@ -0,0 +1,383 @@
|
|||
use dashmap::DashMap;
|
||||
use std::cmp::max;
|
||||
use std::sync::OnceLock;
|
||||
use std::time::{Duration, Instant};
|
||||
|
||||
const EMA_ALPHA: f64 = 0.2;
|
||||
const PROFILE_TTL: Duration = Duration::from_secs(300);
|
||||
const THROUGHPUT_UP_BPS: f64 = 8_000_000.0;
|
||||
const THROUGHPUT_DOWN_BPS: f64 = 2_000_000.0;
|
||||
const RATIO_CONFIRM_THRESHOLD: f64 = 1.12;
|
||||
const TIER1_HOLD_TICKS: u32 = 8;
|
||||
const TIER2_HOLD_TICKS: u32 = 4;
|
||||
const QUIET_DEMOTE_TICKS: u32 = 480;
|
||||
const HARD_COOLDOWN_TICKS: u32 = 20;
|
||||
const HARD_PENDING_THRESHOLD: u32 = 3;
|
||||
const HARD_PARTIAL_RATIO_THRESHOLD: f64 = 0.25;
|
||||
const DIRECT_C2S_CAP_BYTES: usize = 128 * 1024;
|
||||
const DIRECT_S2C_CAP_BYTES: usize = 512 * 1024;
|
||||
const ME_FRAMES_CAP: usize = 96;
|
||||
const ME_BYTES_CAP: usize = 384 * 1024;
|
||||
const ME_DELAY_MIN_US: u64 = 150;
|
||||
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
|
||||
pub enum AdaptiveTier {
|
||||
Base = 0,
|
||||
Tier1 = 1,
|
||||
Tier2 = 2,
|
||||
Tier3 = 3,
|
||||
}
|
||||
|
||||
impl AdaptiveTier {
|
||||
pub fn promote(self) -> Self {
|
||||
match self {
|
||||
Self::Base => Self::Tier1,
|
||||
Self::Tier1 => Self::Tier2,
|
||||
Self::Tier2 => Self::Tier3,
|
||||
Self::Tier3 => Self::Tier3,
|
||||
}
|
||||
}
|
||||
|
||||
pub fn demote(self) -> Self {
|
||||
match self {
|
||||
Self::Base => Self::Base,
|
||||
Self::Tier1 => Self::Base,
|
||||
Self::Tier2 => Self::Tier1,
|
||||
Self::Tier3 => Self::Tier2,
|
||||
}
|
||||
}
|
||||
|
||||
fn ratio(self) -> (usize, usize) {
|
||||
match self {
|
||||
Self::Base => (1, 1),
|
||||
Self::Tier1 => (5, 4),
|
||||
Self::Tier2 => (3, 2),
|
||||
Self::Tier3 => (2, 1),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn as_u8(self) -> u8 {
|
||||
self as u8
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
|
||||
pub enum TierTransitionReason {
|
||||
SoftConfirmed,
|
||||
HardPressure,
|
||||
QuietDemotion,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
|
||||
pub struct TierTransition {
|
||||
pub from: AdaptiveTier,
|
||||
pub to: AdaptiveTier,
|
||||
pub reason: TierTransitionReason,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Copy, Default)]
|
||||
pub struct RelaySignalSample {
|
||||
pub c2s_bytes: u64,
|
||||
pub s2c_requested_bytes: u64,
|
||||
pub s2c_written_bytes: u64,
|
||||
pub s2c_write_ops: u64,
|
||||
pub s2c_partial_writes: u64,
|
||||
pub s2c_consecutive_pending_writes: u32,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
pub struct SessionAdaptiveController {
|
||||
tier: AdaptiveTier,
|
||||
max_tier_seen: AdaptiveTier,
|
||||
throughput_ema_bps: f64,
|
||||
incoming_ema_bps: f64,
|
||||
outgoing_ema_bps: f64,
|
||||
tier1_hold_ticks: u32,
|
||||
tier2_hold_ticks: u32,
|
||||
quiet_ticks: u32,
|
||||
hard_cooldown_ticks: u32,
|
||||
}
|
||||
|
||||
impl SessionAdaptiveController {
|
||||
pub fn new(initial_tier: AdaptiveTier) -> Self {
|
||||
Self {
|
||||
tier: initial_tier,
|
||||
max_tier_seen: initial_tier,
|
||||
throughput_ema_bps: 0.0,
|
||||
incoming_ema_bps: 0.0,
|
||||
outgoing_ema_bps: 0.0,
|
||||
tier1_hold_ticks: 0,
|
||||
tier2_hold_ticks: 0,
|
||||
quiet_ticks: 0,
|
||||
hard_cooldown_ticks: 0,
|
||||
}
|
||||
}
|
||||
|
||||
pub fn max_tier_seen(&self) -> AdaptiveTier {
|
||||
self.max_tier_seen
|
||||
}
|
||||
|
||||
pub fn observe(&mut self, sample: RelaySignalSample, tick_secs: f64) -> Option<TierTransition> {
|
||||
if tick_secs <= f64::EPSILON {
|
||||
return None;
|
||||
}
|
||||
|
||||
if self.hard_cooldown_ticks > 0 {
|
||||
self.hard_cooldown_ticks -= 1;
|
||||
}
|
||||
|
||||
let c2s_bps = (sample.c2s_bytes as f64 * 8.0) / tick_secs;
|
||||
let incoming_bps = (sample.s2c_requested_bytes as f64 * 8.0) / tick_secs;
|
||||
let outgoing_bps = (sample.s2c_written_bytes as f64 * 8.0) / tick_secs;
|
||||
let throughput = c2s_bps.max(outgoing_bps);
|
||||
|
||||
self.throughput_ema_bps = ema(self.throughput_ema_bps, throughput);
|
||||
self.incoming_ema_bps = ema(self.incoming_ema_bps, incoming_bps);
|
||||
self.outgoing_ema_bps = ema(self.outgoing_ema_bps, outgoing_bps);
|
||||
|
||||
let tier1_now = self.throughput_ema_bps >= THROUGHPUT_UP_BPS;
|
||||
if tier1_now {
|
||||
self.tier1_hold_ticks = self.tier1_hold_ticks.saturating_add(1);
|
||||
} else {
|
||||
self.tier1_hold_ticks = 0;
|
||||
}
|
||||
|
||||
let ratio = if self.outgoing_ema_bps <= f64::EPSILON {
|
||||
0.0
|
||||
} else {
|
||||
self.incoming_ema_bps / self.outgoing_ema_bps
|
||||
};
|
||||
let tier2_now = ratio >= RATIO_CONFIRM_THRESHOLD;
|
||||
if tier2_now {
|
||||
self.tier2_hold_ticks = self.tier2_hold_ticks.saturating_add(1);
|
||||
} else {
|
||||
self.tier2_hold_ticks = 0;
|
||||
}
|
||||
|
||||
let partial_ratio = if sample.s2c_write_ops == 0 {
|
||||
0.0
|
||||
} else {
|
||||
sample.s2c_partial_writes as f64 / sample.s2c_write_ops as f64
|
||||
};
|
||||
let hard_now = sample.s2c_consecutive_pending_writes >= HARD_PENDING_THRESHOLD
|
||||
|| partial_ratio >= HARD_PARTIAL_RATIO_THRESHOLD;
|
||||
|
||||
if hard_now && self.hard_cooldown_ticks == 0 {
|
||||
return self.promote(TierTransitionReason::HardPressure, HARD_COOLDOWN_TICKS);
|
||||
}
|
||||
|
||||
if self.tier1_hold_ticks >= TIER1_HOLD_TICKS && self.tier2_hold_ticks >= TIER2_HOLD_TICKS {
|
||||
return self.promote(TierTransitionReason::SoftConfirmed, 0);
|
||||
}
|
||||
|
||||
let demote_candidate = self.throughput_ema_bps < THROUGHPUT_DOWN_BPS && !tier2_now && !hard_now;
|
||||
if demote_candidate {
|
||||
self.quiet_ticks = self.quiet_ticks.saturating_add(1);
|
||||
if self.quiet_ticks >= QUIET_DEMOTE_TICKS {
|
||||
self.quiet_ticks = 0;
|
||||
return self.demote(TierTransitionReason::QuietDemotion);
|
||||
}
|
||||
} else {
|
||||
self.quiet_ticks = 0;
|
||||
}
|
||||
|
||||
None
|
||||
}
|
||||
|
||||
fn promote(
|
||||
&mut self,
|
||||
reason: TierTransitionReason,
|
||||
hard_cooldown_ticks: u32,
|
||||
) -> Option<TierTransition> {
|
||||
let from = self.tier;
|
||||
let to = from.promote();
|
||||
if from == to {
|
||||
return None;
|
||||
}
|
||||
self.tier = to;
|
||||
self.max_tier_seen = max(self.max_tier_seen, to);
|
||||
self.hard_cooldown_ticks = hard_cooldown_ticks;
|
||||
self.tier1_hold_ticks = 0;
|
||||
self.tier2_hold_ticks = 0;
|
||||
self.quiet_ticks = 0;
|
||||
Some(TierTransition { from, to, reason })
|
||||
}
|
||||
|
||||
fn demote(&mut self, reason: TierTransitionReason) -> Option<TierTransition> {
|
||||
let from = self.tier;
|
||||
let to = from.demote();
|
||||
if from == to {
|
||||
return None;
|
||||
}
|
||||
self.tier = to;
|
||||
self.tier1_hold_ticks = 0;
|
||||
self.tier2_hold_ticks = 0;
|
||||
Some(TierTransition { from, to, reason })
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
struct UserAdaptiveProfile {
|
||||
tier: AdaptiveTier,
|
||||
seen_at: Instant,
|
||||
}
|
||||
|
||||
fn profiles() -> &'static DashMap<String, UserAdaptiveProfile> {
|
||||
static USER_PROFILES: OnceLock<DashMap<String, UserAdaptiveProfile>> = OnceLock::new();
|
||||
USER_PROFILES.get_or_init(DashMap::new)
|
||||
}
|
||||
|
||||
pub fn seed_tier_for_user(user: &str) -> AdaptiveTier {
|
||||
let now = Instant::now();
|
||||
if let Some(entry) = profiles().get(user) {
|
||||
let value = entry.value();
|
||||
if now.duration_since(value.seen_at) <= PROFILE_TTL {
|
||||
return value.tier;
|
||||
}
|
||||
}
|
||||
AdaptiveTier::Base
|
||||
}
|
||||
|
||||
pub fn record_user_tier(user: &str, tier: AdaptiveTier) {
|
||||
let now = Instant::now();
|
||||
if let Some(mut entry) = profiles().get_mut(user) {
|
||||
let existing = *entry;
|
||||
let effective = if now.duration_since(existing.seen_at) > PROFILE_TTL {
|
||||
tier
|
||||
} else {
|
||||
max(existing.tier, tier)
|
||||
};
|
||||
*entry = UserAdaptiveProfile {
|
||||
tier: effective,
|
||||
seen_at: now,
|
||||
};
|
||||
return;
|
||||
}
|
||||
profiles().insert(
|
||||
user.to_string(),
|
||||
UserAdaptiveProfile { tier, seen_at: now },
|
||||
);
|
||||
}
|
||||
|
||||
pub fn direct_copy_buffers_for_tier(
|
||||
tier: AdaptiveTier,
|
||||
base_c2s: usize,
|
||||
base_s2c: usize,
|
||||
) -> (usize, usize) {
|
||||
let (num, den) = tier.ratio();
|
||||
(
|
||||
scale(base_c2s, num, den, DIRECT_C2S_CAP_BYTES),
|
||||
scale(base_s2c, num, den, DIRECT_S2C_CAP_BYTES),
|
||||
)
|
||||
}
|
||||
|
||||
pub fn me_flush_policy_for_tier(
|
||||
tier: AdaptiveTier,
|
||||
base_frames: usize,
|
||||
base_bytes: usize,
|
||||
base_delay: Duration,
|
||||
) -> (usize, usize, Duration) {
|
||||
let (num, den) = tier.ratio();
|
||||
let frames = scale(base_frames, num, den, ME_FRAMES_CAP).max(1);
|
||||
let bytes = scale(base_bytes, num, den, ME_BYTES_CAP).max(4096);
|
||||
let delay_us = base_delay.as_micros() as u64;
|
||||
let adjusted_delay_us = match tier {
|
||||
AdaptiveTier::Base => delay_us,
|
||||
AdaptiveTier::Tier1 => (delay_us.saturating_mul(7)).saturating_div(10),
|
||||
AdaptiveTier::Tier2 => delay_us.saturating_div(2),
|
||||
AdaptiveTier::Tier3 => (delay_us.saturating_mul(3)).saturating_div(10),
|
||||
}
|
||||
.max(ME_DELAY_MIN_US)
|
||||
.min(delay_us.max(ME_DELAY_MIN_US));
|
||||
(frames, bytes, Duration::from_micros(adjusted_delay_us))
|
||||
}
|
||||
|
||||
fn ema(prev: f64, value: f64) -> f64 {
|
||||
if prev <= f64::EPSILON {
|
||||
value
|
||||
} else {
|
||||
(prev * (1.0 - EMA_ALPHA)) + (value * EMA_ALPHA)
|
||||
}
|
||||
}
|
||||
|
||||
fn scale(base: usize, numerator: usize, denominator: usize, cap: usize) -> usize {
|
||||
let scaled = base
|
||||
.saturating_mul(numerator)
|
||||
.saturating_div(denominator.max(1));
|
||||
scaled.min(cap).max(1)
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
fn sample(
|
||||
c2s_bytes: u64,
|
||||
s2c_requested_bytes: u64,
|
||||
s2c_written_bytes: u64,
|
||||
s2c_write_ops: u64,
|
||||
s2c_partial_writes: u64,
|
||||
s2c_consecutive_pending_writes: u32,
|
||||
) -> RelaySignalSample {
|
||||
RelaySignalSample {
|
||||
c2s_bytes,
|
||||
s2c_requested_bytes,
|
||||
s2c_written_bytes,
|
||||
s2c_write_ops,
|
||||
s2c_partial_writes,
|
||||
s2c_consecutive_pending_writes,
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_soft_promotion_requires_tier1_and_tier2() {
|
||||
let mut ctrl = SessionAdaptiveController::new(AdaptiveTier::Base);
|
||||
let tick_secs = 0.25;
|
||||
let mut promoted = None;
|
||||
for _ in 0..8 {
|
||||
promoted = ctrl.observe(
|
||||
sample(
|
||||
300_000, // ~9.6 Mbps
|
||||
320_000, // incoming > outgoing to confirm tier2
|
||||
250_000,
|
||||
10,
|
||||
0,
|
||||
0,
|
||||
),
|
||||
tick_secs,
|
||||
);
|
||||
}
|
||||
|
||||
let transition = promoted.expect("expected soft promotion");
|
||||
assert_eq!(transition.from, AdaptiveTier::Base);
|
||||
assert_eq!(transition.to, AdaptiveTier::Tier1);
|
||||
assert_eq!(transition.reason, TierTransitionReason::SoftConfirmed);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_hard_promotion_on_pending_pressure() {
|
||||
let mut ctrl = SessionAdaptiveController::new(AdaptiveTier::Base);
|
||||
let transition = ctrl
|
||||
.observe(
|
||||
sample(10_000, 20_000, 10_000, 4, 1, 3),
|
||||
0.25,
|
||||
)
|
||||
.expect("expected hard promotion");
|
||||
assert_eq!(transition.reason, TierTransitionReason::HardPressure);
|
||||
assert_eq!(transition.to, AdaptiveTier::Tier1);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_quiet_demotion_is_slow_and_stepwise() {
|
||||
let mut ctrl = SessionAdaptiveController::new(AdaptiveTier::Tier2);
|
||||
let mut demotion = None;
|
||||
for _ in 0..QUIET_DEMOTE_TICKS {
|
||||
demotion = ctrl.observe(sample(1, 1, 1, 1, 0, 0), 0.25);
|
||||
}
|
||||
|
||||
let transition = demotion.expect("expected quiet demotion");
|
||||
assert_eq!(transition.from, AdaptiveTier::Tier2);
|
||||
assert_eq!(transition.to, AdaptiveTier::Tier1);
|
||||
assert_eq!(transition.reason, TierTransitionReason::QuietDemotion);
|
||||
}
|
||||
}
|
||||
|
|
@ -84,6 +84,7 @@ use crate::proxy::handshake::{HandshakeSuccess, handle_mtproto_handshake, handle
|
|||
use crate::proxy::masking::handle_bad_client;
|
||||
use crate::proxy::middle_relay::handle_via_middle_proxy;
|
||||
use crate::proxy::route_mode::{RelayRouteMode, RouteRuntimeController};
|
||||
use crate::proxy::session_eviction::register_session;
|
||||
|
||||
fn beobachten_ttl(config: &ProxyConfig) -> Duration {
|
||||
let minutes = config.general.beobachten_minutes;
|
||||
|
|
@ -866,6 +867,17 @@ impl RunningClientHandler {
|
|||
}
|
||||
};
|
||||
|
||||
let registration = register_session(&user, success.dc_idx);
|
||||
if registration.replaced_existing {
|
||||
stats.increment_reconnect_evict_total();
|
||||
warn!(
|
||||
user = %user,
|
||||
dc = success.dc_idx,
|
||||
"Reconnect detected: replacing active session for user+dc"
|
||||
);
|
||||
}
|
||||
let session_lease = registration.lease;
|
||||
|
||||
let route_snapshot = route_runtime.snapshot();
|
||||
let session_id = rng.u64();
|
||||
let relay_result = if config.general.use_middle_proxy
|
||||
|
|
@ -885,6 +897,7 @@ impl RunningClientHandler {
|
|||
route_runtime.subscribe(),
|
||||
route_snapshot,
|
||||
session_id,
|
||||
session_lease.clone(),
|
||||
)
|
||||
.await
|
||||
} else {
|
||||
|
|
@ -901,6 +914,7 @@ impl RunningClientHandler {
|
|||
route_runtime.subscribe(),
|
||||
route_snapshot,
|
||||
session_id,
|
||||
session_lease.clone(),
|
||||
)
|
||||
.await
|
||||
}
|
||||
|
|
@ -918,6 +932,7 @@ impl RunningClientHandler {
|
|||
route_runtime.subscribe(),
|
||||
route_snapshot,
|
||||
session_id,
|
||||
session_lease.clone(),
|
||||
)
|
||||
.await
|
||||
};
|
||||
|
|
|
|||
|
|
@ -22,6 +22,8 @@ use crate::proxy::route_mode::{
|
|||
RelayRouteMode, RouteCutoverState, ROUTE_SWITCH_ERROR_MSG, affected_cutover_state,
|
||||
cutover_stagger_delay,
|
||||
};
|
||||
use crate::proxy::adaptive_buffers;
|
||||
use crate::proxy::session_eviction::SessionLease;
|
||||
use crate::stats::Stats;
|
||||
use crate::stream::{BufferPool, CryptoReader, CryptoWriter};
|
||||
use crate::transport::UpstreamManager;
|
||||
|
|
@ -183,6 +185,7 @@ pub(crate) async fn handle_via_direct<R, W>(
|
|||
mut route_rx: watch::Receiver<RouteCutoverState>,
|
||||
route_snapshot: RouteCutoverState,
|
||||
session_id: u64,
|
||||
session_lease: SessionLease,
|
||||
) -> Result<()>
|
||||
where
|
||||
R: AsyncRead + Unpin + Send + 'static,
|
||||
|
|
@ -222,17 +225,27 @@ where
|
|||
stats.increment_user_connects(user);
|
||||
let _direct_connection_lease = stats.acquire_direct_connection_lease();
|
||||
|
||||
let seed_tier = adaptive_buffers::seed_tier_for_user(user);
|
||||
let (c2s_copy_buf, s2c_copy_buf) = adaptive_buffers::direct_copy_buffers_for_tier(
|
||||
seed_tier,
|
||||
config.general.direct_relay_copy_buf_c2s_bytes,
|
||||
config.general.direct_relay_copy_buf_s2c_bytes,
|
||||
);
|
||||
|
||||
let relay_result = relay_bidirectional(
|
||||
client_reader,
|
||||
client_writer,
|
||||
tg_reader,
|
||||
tg_writer,
|
||||
config.general.direct_relay_copy_buf_c2s_bytes,
|
||||
config.general.direct_relay_copy_buf_s2c_bytes,
|
||||
c2s_copy_buf,
|
||||
s2c_copy_buf,
|
||||
user,
|
||||
success.dc_idx,
|
||||
Arc::clone(&stats),
|
||||
config.access.user_data_quota.get(user).copied(),
|
||||
buffer_pool,
|
||||
session_lease,
|
||||
seed_tier,
|
||||
);
|
||||
tokio::pin!(relay_result);
|
||||
let relay_result = loop {
|
||||
|
|
|
|||
|
|
@ -1,5 +1,6 @@
|
|||
//! Proxy Defs
|
||||
|
||||
pub mod adaptive_buffers;
|
||||
pub mod client;
|
||||
pub mod direct_relay;
|
||||
pub mod handshake;
|
||||
|
|
@ -7,6 +8,7 @@ pub mod masking;
|
|||
pub mod middle_relay;
|
||||
pub mod route_mode;
|
||||
pub mod relay;
|
||||
pub mod session_eviction;
|
||||
|
||||
pub use client::ClientHandler;
|
||||
#[allow(unused_imports)]
|
||||
|
|
|
|||
|
|
@ -0,0 +1,46 @@
|
|||
/// Session eviction is intentionally disabled in runtime.
|
||||
///
|
||||
/// The initial `user+dc` single-lease model caused valid parallel client
|
||||
/// connections to evict each other. Keep the API shape for compatibility,
|
||||
/// but make it a no-op until a safer policy is introduced.
|
||||
|
||||
#[derive(Debug, Clone, Default)]
|
||||
pub struct SessionLease;
|
||||
|
||||
impl SessionLease {
|
||||
pub fn is_stale(&self) -> bool {
|
||||
false
|
||||
}
|
||||
|
||||
#[allow(dead_code)]
|
||||
pub fn release(&self) {}
|
||||
}
|
||||
|
||||
pub struct RegistrationResult {
|
||||
pub lease: SessionLease,
|
||||
pub replaced_existing: bool,
|
||||
}
|
||||
|
||||
pub fn register_session(_user: &str, _dc_idx: i16) -> RegistrationResult {
|
||||
RegistrationResult {
|
||||
lease: SessionLease,
|
||||
replaced_existing: false,
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
#[test]
|
||||
fn test_session_eviction_disabled_behavior() {
|
||||
let first = register_session("alice", 2);
|
||||
let second = register_session("alice", 2);
|
||||
assert!(!first.replaced_existing);
|
||||
assert!(!second.replaced_existing);
|
||||
assert!(!first.lease.is_stale());
|
||||
assert!(!second.lease.is_stale());
|
||||
first.lease.release();
|
||||
second.lease.release();
|
||||
}
|
||||
}
|
||||
|
|
@ -14,8 +14,7 @@ use std::sync::Arc;
|
|||
// ============= Configuration =============
|
||||
|
||||
/// Default buffer size
|
||||
/// CHANGED: Reduced from 64KB to 16KB to match TLS record size and prevent bufferbloat.
|
||||
pub const DEFAULT_BUFFER_SIZE: usize = 16 * 1024;
|
||||
pub const DEFAULT_BUFFER_SIZE: usize = 64 * 1024;
|
||||
|
||||
/// Default maximum number of pooled buffers
|
||||
pub const DEFAULT_MAX_BUFFERS: usize = 1024;
|
||||
|
|
|
|||
|
|
@ -299,6 +299,11 @@ async fn run_update_cycle(
|
|||
cfg.general.hardswap,
|
||||
cfg.general.me_pool_drain_ttl_secs,
|
||||
cfg.general.me_pool_drain_threshold,
|
||||
cfg.general.me_pool_drain_soft_evict_enabled,
|
||||
cfg.general.me_pool_drain_soft_evict_grace_secs,
|
||||
cfg.general.me_pool_drain_soft_evict_per_writer,
|
||||
cfg.general.me_pool_drain_soft_evict_budget_per_core,
|
||||
cfg.general.me_pool_drain_soft_evict_cooldown_ms,
|
||||
cfg.general.effective_me_pool_force_close_secs(),
|
||||
cfg.general.me_pool_min_fresh_ratio,
|
||||
cfg.general.me_hardswap_warmup_delay_min_ms,
|
||||
|
|
@ -526,6 +531,11 @@ pub async fn me_config_updater(
|
|||
cfg.general.hardswap,
|
||||
cfg.general.me_pool_drain_ttl_secs,
|
||||
cfg.general.me_pool_drain_threshold,
|
||||
cfg.general.me_pool_drain_soft_evict_enabled,
|
||||
cfg.general.me_pool_drain_soft_evict_grace_secs,
|
||||
cfg.general.me_pool_drain_soft_evict_per_writer,
|
||||
cfg.general.me_pool_drain_soft_evict_budget_per_core,
|
||||
cfg.general.me_pool_drain_soft_evict_cooldown_ms,
|
||||
cfg.general.effective_me_pool_force_close_secs(),
|
||||
cfg.general.me_pool_min_fresh_ratio,
|
||||
cfg.general.me_hardswap_warmup_delay_min_ms,
|
||||
|
|
|
|||
|
|
@ -83,6 +83,11 @@ async fn make_pool(
|
|||
general.hardswap,
|
||||
general.me_pool_drain_ttl_secs,
|
||||
general.me_pool_drain_threshold,
|
||||
general.me_pool_drain_soft_evict_enabled,
|
||||
general.me_pool_drain_soft_evict_grace_secs,
|
||||
general.me_pool_drain_soft_evict_per_writer,
|
||||
general.me_pool_drain_soft_evict_budget_per_core,
|
||||
general.me_pool_drain_soft_evict_cooldown_ms,
|
||||
general.effective_me_pool_force_close_secs(),
|
||||
general.me_pool_min_fresh_ratio,
|
||||
general.me_hardswap_warmup_delay_min_ms,
|
||||
|
|
@ -107,6 +112,8 @@ async fn make_pool(
|
|||
general.me_warn_rate_limit_ms,
|
||||
MeRouteNoWriterMode::default(),
|
||||
general.me_route_no_writer_wait_ms,
|
||||
general.me_route_hybrid_max_wait_ms,
|
||||
general.me_route_blocking_send_timeout_ms,
|
||||
general.me_route_inline_recovery_attempts,
|
||||
general.me_route_inline_recovery_wait_ms,
|
||||
);
|
||||
|
|
@ -220,10 +227,11 @@ async fn set_writer_runtime_state(
|
|||
async fn reap_draining_writers_clears_warn_state_when_pool_empty() {
|
||||
let (pool, _rng) = make_pool(128, 1, 1).await;
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
warn_next_allowed.insert(11, Instant::now() + Duration::from_secs(5));
|
||||
warn_next_allowed.insert(22, Instant::now() + Duration::from_secs(5));
|
||||
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
|
||||
assert!(warn_next_allowed.is_empty());
|
||||
}
|
||||
|
|
@ -232,6 +240,8 @@ async fn reap_draining_writers_clears_warn_state_when_pool_empty() {
|
|||
async fn reap_draining_writers_respects_threshold_across_multiple_overflow_cycles() {
|
||||
let threshold = 3u64;
|
||||
let (pool, _rng) = make_pool(threshold, 1, 1).await;
|
||||
pool.me_pool_drain_soft_evict_enabled
|
||||
.store(false, Ordering::Relaxed);
|
||||
let now_epoch_secs = MePool::now_epoch_secs();
|
||||
|
||||
for writer_id in 1..=60u64 {
|
||||
|
|
@ -246,8 +256,9 @@ async fn reap_draining_writers_respects_threshold_across_multiple_overflow_cycle
|
|||
}
|
||||
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
for _ in 0..64 {
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
if writer_count(&pool).await <= threshold as usize {
|
||||
break;
|
||||
}
|
||||
|
|
@ -275,11 +286,12 @@ async fn reap_draining_writers_handles_large_empty_writer_population() {
|
|||
}
|
||||
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
for _ in 0..24 {
|
||||
if writer_count(&pool).await == 0 {
|
||||
break;
|
||||
}
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
}
|
||||
|
||||
assert_eq!(writer_count(&pool).await, 0);
|
||||
|
|
@ -303,11 +315,12 @@ async fn reap_draining_writers_processes_mass_deadline_expiry_without_unbounded_
|
|||
}
|
||||
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
for _ in 0..40 {
|
||||
if writer_count(&pool).await == 0 {
|
||||
break;
|
||||
}
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
}
|
||||
|
||||
assert_eq!(writer_count(&pool).await, 0);
|
||||
|
|
@ -318,6 +331,7 @@ async fn reap_draining_writers_maintains_warn_state_subset_property_under_bulk_c
|
|||
let (pool, _rng) = make_pool(128, 1, 1).await;
|
||||
let now_epoch_secs = MePool::now_epoch_secs();
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
for wave in 0..40u64 {
|
||||
for offset in 0..8u64 {
|
||||
|
|
@ -331,7 +345,7 @@ async fn reap_draining_writers_maintains_warn_state_subset_property_under_bulk_c
|
|||
.await;
|
||||
}
|
||||
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
assert!(warn_next_allowed.len() <= writer_count(&pool).await);
|
||||
|
||||
let ids = sorted_writer_ids(&pool).await;
|
||||
|
|
@ -339,7 +353,7 @@ async fn reap_draining_writers_maintains_warn_state_subset_property_under_bulk_c
|
|||
let _ = pool.remove_writer_and_close_clients(writer_id).await;
|
||||
}
|
||||
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
assert!(warn_next_allowed.len() <= writer_count(&pool).await);
|
||||
}
|
||||
}
|
||||
|
|
@ -361,9 +375,10 @@ async fn reap_draining_writers_budgeted_cleanup_never_increases_pool_size() {
|
|||
}
|
||||
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
let mut previous = writer_count(&pool).await;
|
||||
for _ in 0..32 {
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
let current = writer_count(&pool).await;
|
||||
assert!(current <= previous);
|
||||
previous = current;
|
||||
|
|
|
|||
|
|
@ -81,6 +81,11 @@ async fn make_pool(
|
|||
general.hardswap,
|
||||
general.me_pool_drain_ttl_secs,
|
||||
general.me_pool_drain_threshold,
|
||||
general.me_pool_drain_soft_evict_enabled,
|
||||
general.me_pool_drain_soft_evict_grace_secs,
|
||||
general.me_pool_drain_soft_evict_per_writer,
|
||||
general.me_pool_drain_soft_evict_budget_per_core,
|
||||
general.me_pool_drain_soft_evict_cooldown_ms,
|
||||
general.effective_me_pool_force_close_secs(),
|
||||
general.me_pool_min_fresh_ratio,
|
||||
general.me_hardswap_warmup_delay_min_ms,
|
||||
|
|
@ -105,6 +110,8 @@ async fn make_pool(
|
|||
general.me_warn_rate_limit_ms,
|
||||
MeRouteNoWriterMode::default(),
|
||||
general.me_route_no_writer_wait_ms,
|
||||
general.me_route_hybrid_max_wait_ms,
|
||||
general.me_route_blocking_send_timeout_ms,
|
||||
general.me_route_inline_recovery_attempts,
|
||||
general.me_route_inline_recovery_wait_ms,
|
||||
);
|
||||
|
|
|
|||
|
|
@ -4,6 +4,7 @@ use std::sync::Arc;
|
|||
use std::sync::atomic::{AtomicBool, AtomicU8, AtomicU32, AtomicU64, Ordering};
|
||||
use std::time::{Duration, Instant};
|
||||
|
||||
use bytes::Bytes;
|
||||
use tokio::sync::mpsc;
|
||||
use tokio_util::sync::CancellationToken;
|
||||
|
||||
|
|
@ -39,7 +40,7 @@ async fn make_pool(me_pool_drain_threshold: u64) -> Arc<MePool> {
|
|||
NetworkDecision::default(),
|
||||
None,
|
||||
Arc::new(SecureRandom::new()),
|
||||
Arc::new(Stats::default()),
|
||||
Arc::new(Stats::new()),
|
||||
general.me_keepalive_enabled,
|
||||
general.me_keepalive_interval_secs,
|
||||
general.me_keepalive_jitter_secs,
|
||||
|
|
@ -74,6 +75,11 @@ async fn make_pool(me_pool_drain_threshold: u64) -> Arc<MePool> {
|
|||
general.hardswap,
|
||||
general.me_pool_drain_ttl_secs,
|
||||
general.me_pool_drain_threshold,
|
||||
general.me_pool_drain_soft_evict_enabled,
|
||||
general.me_pool_drain_soft_evict_grace_secs,
|
||||
general.me_pool_drain_soft_evict_per_writer,
|
||||
general.me_pool_drain_soft_evict_budget_per_core,
|
||||
general.me_pool_drain_soft_evict_cooldown_ms,
|
||||
general.effective_me_pool_force_close_secs(),
|
||||
general.me_pool_min_fresh_ratio,
|
||||
general.me_hardswap_warmup_delay_min_ms,
|
||||
|
|
@ -98,6 +104,8 @@ async fn make_pool(me_pool_drain_threshold: u64) -> Arc<MePool> {
|
|||
general.me_warn_rate_limit_ms,
|
||||
MeRouteNoWriterMode::default(),
|
||||
general.me_route_no_writer_wait_ms,
|
||||
general.me_route_hybrid_max_wait_ms,
|
||||
general.me_route_blocking_send_timeout_ms,
|
||||
general.me_route_inline_recovery_attempts,
|
||||
general.me_route_inline_recovery_wait_ms,
|
||||
)
|
||||
|
|
@ -190,14 +198,15 @@ async fn reap_draining_writers_drops_warn_state_for_removed_writer() {
|
|||
let conn_ids =
|
||||
insert_draining_writer(&pool, 7, now_epoch_secs.saturating_sub(180), 1, 0).await;
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
assert!(warn_next_allowed.contains_key(&7));
|
||||
|
||||
let _ = pool.remove_writer_and_close_clients(7).await;
|
||||
assert!(pool.registry.get_writer(conn_ids[0]).await.is_none());
|
||||
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
assert!(!warn_next_allowed.contains_key(&7));
|
||||
}
|
||||
|
||||
|
|
@ -209,12 +218,96 @@ async fn reap_draining_writers_removes_empty_draining_writers() {
|
|||
insert_draining_writer(&pool, 2, now_epoch_secs.saturating_sub(30), 0, 0).await;
|
||||
insert_draining_writer(&pool, 3, now_epoch_secs.saturating_sub(20), 1, 0).await;
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
|
||||
assert_eq!(current_writer_ids(&pool).await, vec![3]);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn reap_draining_writers_does_not_block_on_stuck_writer_close_signal() {
|
||||
let pool = make_pool(128).await;
|
||||
let now_epoch_secs = MePool::now_epoch_secs();
|
||||
|
||||
let (blocked_tx, blocked_rx) = mpsc::channel::<WriterCommand>(1);
|
||||
assert!(
|
||||
blocked_tx
|
||||
.try_send(WriterCommand::Data(Bytes::from_static(b"stuck")))
|
||||
.is_ok()
|
||||
);
|
||||
let blocked_rx_guard = tokio::spawn(async move {
|
||||
let _hold_rx = blocked_rx;
|
||||
tokio::time::sleep(Duration::from_secs(30)).await;
|
||||
});
|
||||
|
||||
let blocked_writer_id = 90u64;
|
||||
let blocked_writer = MeWriter {
|
||||
id: blocked_writer_id,
|
||||
addr: SocketAddr::new(
|
||||
IpAddr::V4(Ipv4Addr::LOCALHOST),
|
||||
4500 + blocked_writer_id as u16,
|
||||
),
|
||||
source_ip: IpAddr::V4(Ipv4Addr::LOCALHOST),
|
||||
writer_dc: 2,
|
||||
generation: 1,
|
||||
contour: Arc::new(AtomicU8::new(WriterContour::Draining.as_u8())),
|
||||
created_at: Instant::now() - Duration::from_secs(blocked_writer_id),
|
||||
tx: blocked_tx.clone(),
|
||||
cancel: CancellationToken::new(),
|
||||
degraded: Arc::new(AtomicBool::new(false)),
|
||||
rtt_ema_ms_x10: Arc::new(AtomicU32::new(0)),
|
||||
draining: Arc::new(AtomicBool::new(true)),
|
||||
draining_started_at_epoch_secs: Arc::new(AtomicU64::new(
|
||||
now_epoch_secs.saturating_sub(120),
|
||||
)),
|
||||
drain_deadline_epoch_secs: Arc::new(AtomicU64::new(0)),
|
||||
allow_drain_fallback: Arc::new(AtomicBool::new(false)),
|
||||
};
|
||||
pool.writers.write().await.push(blocked_writer);
|
||||
pool.registry
|
||||
.register_writer(blocked_writer_id, blocked_tx)
|
||||
.await;
|
||||
pool.conn_count.fetch_add(1, Ordering::Relaxed);
|
||||
|
||||
insert_draining_writer(&pool, 91, now_epoch_secs.saturating_sub(110), 0, 0).await;
|
||||
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
let reap_res = tokio::time::timeout(
|
||||
Duration::from_millis(500),
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed),
|
||||
)
|
||||
.await;
|
||||
blocked_rx_guard.abort();
|
||||
|
||||
assert!(reap_res.is_ok(), "reap should not block on close signal");
|
||||
assert!(current_writer_ids(&pool).await.is_empty());
|
||||
assert_eq!(pool.stats.get_me_writer_close_signal_drop_total(), 2);
|
||||
assert_eq!(pool.stats.get_me_writer_close_signal_channel_full_total(), 1);
|
||||
assert_eq!(pool.stats.get_me_draining_writers_reap_progress_total(), 2);
|
||||
let activity = pool.registry.writer_activity_snapshot().await;
|
||||
assert!(!activity.bound_clients_by_writer.contains_key(&blocked_writer_id));
|
||||
assert!(!activity.bound_clients_by_writer.contains_key(&91));
|
||||
let (probe_conn_id, _rx) = pool.registry.register().await;
|
||||
assert!(
|
||||
!pool.registry
|
||||
.bind_writer(
|
||||
probe_conn_id,
|
||||
blocked_writer_id,
|
||||
ConnMeta {
|
||||
target_dc: 2,
|
||||
client_addr: SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 6400),
|
||||
our_addr: SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), 443),
|
||||
proto_flags: 0,
|
||||
},
|
||||
)
|
||||
.await
|
||||
);
|
||||
let _ = pool.registry.unregister(probe_conn_id).await;
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn reap_draining_writers_overflow_closes_oldest_non_empty_writers() {
|
||||
let pool = make_pool(2).await;
|
||||
|
|
@ -224,8 +317,9 @@ async fn reap_draining_writers_overflow_closes_oldest_non_empty_writers() {
|
|||
insert_draining_writer(&pool, 33, now_epoch_secs.saturating_sub(20), 1, 0).await;
|
||||
insert_draining_writer(&pool, 44, now_epoch_secs.saturating_sub(10), 1, 0).await;
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
|
||||
assert_eq!(current_writer_ids(&pool).await, vec![33, 44]);
|
||||
}
|
||||
|
|
@ -243,8 +337,9 @@ async fn reap_draining_writers_deadline_force_close_applies_under_threshold() {
|
|||
)
|
||||
.await;
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
|
||||
assert!(current_writer_ids(&pool).await.is_empty());
|
||||
}
|
||||
|
|
@ -266,8 +361,9 @@ async fn reap_draining_writers_limits_closes_per_health_tick() {
|
|||
.await;
|
||||
}
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
|
||||
assert_eq!(pool.writers.read().await.len(), writer_total - close_budget);
|
||||
}
|
||||
|
|
@ -406,12 +502,13 @@ async fn reap_draining_writers_backlog_drains_across_ticks() {
|
|||
.await;
|
||||
}
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
for _ in 0..8 {
|
||||
if pool.writers.read().await.is_empty() {
|
||||
break;
|
||||
}
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
}
|
||||
|
||||
assert!(pool.writers.read().await.is_empty());
|
||||
|
|
@ -435,9 +532,10 @@ async fn reap_draining_writers_threshold_backlog_converges_to_threshold() {
|
|||
.await;
|
||||
}
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
for _ in 0..16 {
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
if pool.writers.read().await.len() <= threshold as usize {
|
||||
break;
|
||||
}
|
||||
|
|
@ -454,8 +552,9 @@ async fn reap_draining_writers_threshold_zero_preserves_non_expired_non_empty_wr
|
|||
insert_draining_writer(&pool, 20, now_epoch_secs.saturating_sub(30), 1, 0).await;
|
||||
insert_draining_writer(&pool, 30, now_epoch_secs.saturating_sub(20), 1, 0).await;
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
|
||||
assert_eq!(current_writer_ids(&pool).await, vec![10, 20, 30]);
|
||||
}
|
||||
|
|
@ -478,8 +577,9 @@ async fn reap_draining_writers_prioritizes_force_close_before_empty_cleanup() {
|
|||
let empty_writer_id = close_budget as u64 + 1;
|
||||
insert_draining_writer(&pool, empty_writer_id, now_epoch_secs.saturating_sub(20), 0, 0).await;
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
|
||||
assert_eq!(current_writer_ids(&pool).await, vec![empty_writer_id]);
|
||||
}
|
||||
|
|
@ -491,8 +591,9 @@ async fn reap_draining_writers_empty_cleanup_does_not_increment_force_close_metr
|
|||
insert_draining_writer(&pool, 1, now_epoch_secs.saturating_sub(60), 0, 0).await;
|
||||
insert_draining_writer(&pool, 2, now_epoch_secs.saturating_sub(50), 0, 0).await;
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
|
||||
assert!(current_writer_ids(&pool).await.is_empty());
|
||||
assert_eq!(pool.stats.get_pool_force_close_total(), 0);
|
||||
|
|
@ -519,8 +620,9 @@ async fn reap_draining_writers_handles_duplicate_force_close_requests_for_same_w
|
|||
)
|
||||
.await;
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
|
||||
assert!(current_writer_ids(&pool).await.is_empty());
|
||||
}
|
||||
|
|
@ -530,6 +632,7 @@ async fn reap_draining_writers_warn_state_never_exceeds_live_draining_population
|
|||
let pool = make_pool(128).await;
|
||||
let now_epoch_secs = MePool::now_epoch_secs();
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
for wave in 0..12u64 {
|
||||
for offset in 0..9u64 {
|
||||
|
|
@ -542,14 +645,14 @@ async fn reap_draining_writers_warn_state_never_exceeds_live_draining_population
|
|||
)
|
||||
.await;
|
||||
}
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
assert!(warn_next_allowed.len() <= pool.writers.read().await.len());
|
||||
|
||||
let existing_writer_ids = current_writer_ids(&pool).await;
|
||||
for writer_id in existing_writer_ids.into_iter().take(4) {
|
||||
let _ = pool.remove_writer_and_close_clients(writer_id).await;
|
||||
}
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
assert!(warn_next_allowed.len() <= pool.writers.read().await.len());
|
||||
}
|
||||
}
|
||||
|
|
@ -559,6 +662,7 @@ async fn reap_draining_writers_mixed_backlog_converges_without_leaking_warn_stat
|
|||
let pool = make_pool(6).await;
|
||||
let now_epoch_secs = MePool::now_epoch_secs();
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
for writer_id in 1..=18u64 {
|
||||
let bound_clients = if writer_id % 3 == 0 { 0 } else { 1 };
|
||||
|
|
@ -578,7 +682,7 @@ async fn reap_draining_writers_mixed_backlog_converges_without_leaking_warn_stat
|
|||
}
|
||||
|
||||
for _ in 0..16 {
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
if pool.writers.read().await.len() <= 6 {
|
||||
break;
|
||||
}
|
||||
|
|
@ -588,9 +692,62 @@ async fn reap_draining_writers_mixed_backlog_converges_without_leaking_warn_stat
|
|||
assert!(warn_next_allowed.len() <= pool.writers.read().await.len());
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn reap_draining_writers_soft_evicts_stuck_writer_with_per_writer_cap() {
|
||||
let pool = make_pool(128).await;
|
||||
pool.me_pool_drain_soft_evict_enabled.store(true, Ordering::Relaxed);
|
||||
pool.me_pool_drain_soft_evict_grace_secs.store(0, Ordering::Relaxed);
|
||||
pool.me_pool_drain_soft_evict_per_writer.store(1, Ordering::Relaxed);
|
||||
pool.me_pool_drain_soft_evict_budget_per_core.store(8, Ordering::Relaxed);
|
||||
pool.me_pool_drain_soft_evict_cooldown_ms
|
||||
.store(1, Ordering::Relaxed);
|
||||
|
||||
let now_epoch_secs = MePool::now_epoch_secs();
|
||||
insert_draining_writer(&pool, 77, now_epoch_secs.saturating_sub(240), 3, 0).await;
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
|
||||
let activity = pool.registry.writer_activity_snapshot().await;
|
||||
assert_eq!(activity.bound_clients_by_writer.get(&77), Some(&2));
|
||||
assert_eq!(pool.stats.get_pool_drain_soft_evict_total(), 1);
|
||||
assert_eq!(pool.stats.get_pool_drain_soft_evict_writer_total(), 1);
|
||||
assert_eq!(current_writer_ids(&pool).await, vec![77]);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn reap_draining_writers_soft_evict_respects_cooldown_per_writer() {
|
||||
let pool = make_pool(128).await;
|
||||
pool.me_pool_drain_soft_evict_enabled.store(true, Ordering::Relaxed);
|
||||
pool.me_pool_drain_soft_evict_grace_secs.store(0, Ordering::Relaxed);
|
||||
pool.me_pool_drain_soft_evict_per_writer.store(1, Ordering::Relaxed);
|
||||
pool.me_pool_drain_soft_evict_budget_per_core.store(8, Ordering::Relaxed);
|
||||
pool.me_pool_drain_soft_evict_cooldown_ms
|
||||
.store(60_000, Ordering::Relaxed);
|
||||
|
||||
let now_epoch_secs = MePool::now_epoch_secs();
|
||||
insert_draining_writer(&pool, 88, now_epoch_secs.saturating_sub(240), 3, 0).await;
|
||||
let mut warn_next_allowed = HashMap::new();
|
||||
let mut soft_evict_next_allowed = HashMap::new();
|
||||
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
|
||||
|
||||
let activity = pool.registry.writer_activity_snapshot().await;
|
||||
assert_eq!(activity.bound_clients_by_writer.get(&88), Some(&2));
|
||||
assert_eq!(pool.stats.get_pool_drain_soft_evict_total(), 1);
|
||||
assert_eq!(pool.stats.get_pool_drain_soft_evict_writer_total(), 1);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn general_config_default_drain_threshold_remains_enabled() {
|
||||
assert_eq!(GeneralConfig::default().me_pool_drain_threshold, 128);
|
||||
assert!(GeneralConfig::default().me_pool_drain_soft_evict_enabled);
|
||||
assert_eq!(
|
||||
GeneralConfig::default().me_pool_drain_soft_evict_per_writer,
|
||||
1
|
||||
);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
|
|
|
|||
|
|
@ -70,10 +70,12 @@ impl MePool {
|
|||
|
||||
let mut missing_dc = Vec::<i32>::new();
|
||||
let mut covered = 0usize;
|
||||
let mut total = 0usize;
|
||||
for (dc, endpoints) in desired_by_dc {
|
||||
if endpoints.is_empty() {
|
||||
continue;
|
||||
}
|
||||
total += 1;
|
||||
if endpoints
|
||||
.iter()
|
||||
.any(|addr| active_writer_addrs.contains(&(*dc, *addr)))
|
||||
|
|
@ -85,7 +87,9 @@ impl MePool {
|
|||
}
|
||||
|
||||
missing_dc.sort_unstable();
|
||||
let total = desired_by_dc.len().max(1);
|
||||
if total == 0 {
|
||||
return (1.0, missing_dc);
|
||||
}
|
||||
let ratio = (covered as f32) / (total as f32);
|
||||
(ratio, missing_dc)
|
||||
}
|
||||
|
|
@ -431,29 +435,21 @@ impl MePool {
|
|||
}
|
||||
|
||||
if hardswap {
|
||||
let mut fresh_missing_dc = Vec::<(i32, usize, usize)>::new();
|
||||
for (dc, endpoints) in &desired_by_dc {
|
||||
if endpoints.is_empty() {
|
||||
continue;
|
||||
}
|
||||
let required = self.required_writers_for_dc(endpoints.len());
|
||||
let fresh_count = writers
|
||||
.iter()
|
||||
.filter(|w| !w.draining.load(Ordering::Relaxed))
|
||||
.filter(|w| w.generation == generation)
|
||||
.filter(|w| w.writer_dc == *dc)
|
||||
.filter(|w| endpoints.contains(&w.addr))
|
||||
.count();
|
||||
if fresh_count < required {
|
||||
fresh_missing_dc.push((*dc, fresh_count, required));
|
||||
}
|
||||
}
|
||||
let fresh_writer_addrs: HashSet<(i32, SocketAddr)> = writers
|
||||
.iter()
|
||||
.filter(|w| !w.draining.load(Ordering::Relaxed))
|
||||
.filter(|w| w.generation == generation)
|
||||
.map(|w| (w.writer_dc, w.addr))
|
||||
.collect();
|
||||
let (fresh_coverage_ratio, fresh_missing_dc) =
|
||||
Self::coverage_ratio(&desired_by_dc, &fresh_writer_addrs);
|
||||
if !fresh_missing_dc.is_empty() {
|
||||
warn!(
|
||||
previous_generation,
|
||||
generation,
|
||||
fresh_coverage_ratio = format_args!("{fresh_coverage_ratio:.3}"),
|
||||
missing_dc = ?fresh_missing_dc,
|
||||
"ME hardswap pending: fresh generation coverage incomplete"
|
||||
"ME hardswap pending: fresh generation DC coverage incomplete"
|
||||
);
|
||||
return;
|
||||
}
|
||||
|
|
@ -541,3 +537,61 @@ impl MePool {
|
|||
self.zero_downtime_reinit_after_map_change(rng).await;
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use std::collections::{HashMap, HashSet};
|
||||
use std::net::{IpAddr, Ipv4Addr, SocketAddr};
|
||||
|
||||
use super::MePool;
|
||||
|
||||
fn addr(octet: u8, port: u16) -> SocketAddr {
|
||||
SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 0, octet)), port)
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn coverage_ratio_counts_dc_coverage_not_floor() {
|
||||
let dc1 = addr(1, 2001);
|
||||
let dc2 = addr(2, 2002);
|
||||
|
||||
let mut desired_by_dc = HashMap::<i32, HashSet<SocketAddr>>::new();
|
||||
desired_by_dc.insert(1, HashSet::from([dc1]));
|
||||
desired_by_dc.insert(2, HashSet::from([dc2]));
|
||||
|
||||
let active_writer_addrs = HashSet::from([(1, dc1)]);
|
||||
let (ratio, missing_dc) = MePool::coverage_ratio(&desired_by_dc, &active_writer_addrs);
|
||||
|
||||
assert_eq!(ratio, 0.5);
|
||||
assert_eq!(missing_dc, vec![2]);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn coverage_ratio_ignores_empty_dc_groups() {
|
||||
let dc1 = addr(1, 2001);
|
||||
|
||||
let mut desired_by_dc = HashMap::<i32, HashSet<SocketAddr>>::new();
|
||||
desired_by_dc.insert(1, HashSet::from([dc1]));
|
||||
desired_by_dc.insert(2, HashSet::new());
|
||||
|
||||
let active_writer_addrs = HashSet::from([(1, dc1)]);
|
||||
let (ratio, missing_dc) = MePool::coverage_ratio(&desired_by_dc, &active_writer_addrs);
|
||||
|
||||
assert_eq!(ratio, 1.0);
|
||||
assert!(missing_dc.is_empty());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn coverage_ratio_reports_missing_dcs_sorted() {
|
||||
let dc1 = addr(1, 2001);
|
||||
let dc2 = addr(2, 2002);
|
||||
|
||||
let mut desired_by_dc = HashMap::<i32, HashSet<SocketAddr>>::new();
|
||||
desired_by_dc.insert(2, HashSet::from([dc2]));
|
||||
desired_by_dc.insert(1, HashSet::from([dc1]));
|
||||
|
||||
let (ratio, missing_dc) = MePool::coverage_ratio(&desired_by_dc, &HashSet::new());
|
||||
|
||||
assert_eq!(ratio, 0.0);
|
||||
assert_eq!(missing_dc, vec![1, 2]);
|
||||
}
|
||||
}
|
||||
|
|
|
|||
|
|
@ -40,6 +40,7 @@ pub(crate) struct MeApiDcStatusSnapshot {
|
|||
pub floor_max: usize,
|
||||
pub floor_capped: bool,
|
||||
pub alive_writers: usize,
|
||||
pub coverage_ratio: f64,
|
||||
pub coverage_pct: f64,
|
||||
pub fresh_alive_writers: usize,
|
||||
pub fresh_coverage_pct: f64,
|
||||
|
|
@ -62,6 +63,7 @@ pub(crate) struct MeApiStatusSnapshot {
|
|||
pub available_pct: f64,
|
||||
pub required_writers: usize,
|
||||
pub alive_writers: usize,
|
||||
pub coverage_ratio: f64,
|
||||
pub coverage_pct: f64,
|
||||
pub fresh_alive_writers: usize,
|
||||
pub fresh_coverage_pct: f64,
|
||||
|
|
@ -124,6 +126,11 @@ pub(crate) struct MeApiRuntimeSnapshot {
|
|||
pub me_reconnect_backoff_cap_ms: u64,
|
||||
pub me_reconnect_fast_retry_count: u32,
|
||||
pub me_pool_drain_ttl_secs: u64,
|
||||
pub me_pool_drain_soft_evict_enabled: bool,
|
||||
pub me_pool_drain_soft_evict_grace_secs: u64,
|
||||
pub me_pool_drain_soft_evict_per_writer: u8,
|
||||
pub me_pool_drain_soft_evict_budget_per_core: u16,
|
||||
pub me_pool_drain_soft_evict_cooldown_ms: u64,
|
||||
pub me_pool_force_close_secs: u64,
|
||||
pub me_pool_min_fresh_ratio: f32,
|
||||
pub me_bind_stale_mode: &'static str,
|
||||
|
|
@ -337,6 +344,8 @@ impl MePool {
|
|||
let mut available_endpoints = 0usize;
|
||||
let mut alive_writers = 0usize;
|
||||
let mut fresh_alive_writers = 0usize;
|
||||
let mut coverage_ratio_dcs_total = 0usize;
|
||||
let mut coverage_ratio_dcs_covered = 0usize;
|
||||
let floor_mode = self.floor_mode();
|
||||
let adaptive_cpu_cores = (self
|
||||
.me_adaptive_floor_cpu_cores_effective
|
||||
|
|
@ -388,6 +397,12 @@ impl MePool {
|
|||
available_endpoints += dc_available_endpoints;
|
||||
alive_writers += dc_alive_writers;
|
||||
fresh_alive_writers += dc_fresh_alive_writers;
|
||||
if endpoint_count > 0 {
|
||||
coverage_ratio_dcs_total += 1;
|
||||
if dc_alive_writers > 0 {
|
||||
coverage_ratio_dcs_covered += 1;
|
||||
}
|
||||
}
|
||||
|
||||
dcs.push(MeApiDcStatusSnapshot {
|
||||
dc,
|
||||
|
|
@ -410,6 +425,11 @@ impl MePool {
|
|||
floor_max,
|
||||
floor_capped,
|
||||
alive_writers: dc_alive_writers,
|
||||
coverage_ratio: if endpoint_count > 0 && dc_alive_writers > 0 {
|
||||
100.0
|
||||
} else {
|
||||
0.0
|
||||
},
|
||||
coverage_pct: ratio_pct(dc_alive_writers, dc_required_writers),
|
||||
fresh_alive_writers: dc_fresh_alive_writers,
|
||||
fresh_coverage_pct: ratio_pct(dc_fresh_alive_writers, dc_required_writers),
|
||||
|
|
@ -426,6 +446,7 @@ impl MePool {
|
|||
available_pct: ratio_pct(available_endpoints, configured_endpoints),
|
||||
required_writers,
|
||||
alive_writers,
|
||||
coverage_ratio: ratio_pct(coverage_ratio_dcs_covered, coverage_ratio_dcs_total),
|
||||
coverage_pct: ratio_pct(alive_writers, required_writers),
|
||||
fresh_alive_writers,
|
||||
fresh_coverage_pct: ratio_pct(fresh_alive_writers, required_writers),
|
||||
|
|
@ -562,6 +583,22 @@ impl MePool {
|
|||
me_reconnect_backoff_cap_ms: self.me_reconnect_backoff_cap.as_millis() as u64,
|
||||
me_reconnect_fast_retry_count: self.me_reconnect_fast_retry_count,
|
||||
me_pool_drain_ttl_secs: self.me_pool_drain_ttl_secs.load(Ordering::Relaxed),
|
||||
me_pool_drain_soft_evict_enabled: self
|
||||
.me_pool_drain_soft_evict_enabled
|
||||
.load(Ordering::Relaxed),
|
||||
me_pool_drain_soft_evict_grace_secs: self
|
||||
.me_pool_drain_soft_evict_grace_secs
|
||||
.load(Ordering::Relaxed),
|
||||
me_pool_drain_soft_evict_per_writer: self
|
||||
.me_pool_drain_soft_evict_per_writer
|
||||
.load(Ordering::Relaxed),
|
||||
me_pool_drain_soft_evict_budget_per_core: self
|
||||
.me_pool_drain_soft_evict_budget_per_core
|
||||
.load(Ordering::Relaxed)
|
||||
.min(u16::MAX as u32) as u16,
|
||||
me_pool_drain_soft_evict_cooldown_ms: self
|
||||
.me_pool_drain_soft_evict_cooldown_ms
|
||||
.load(Ordering::Relaxed),
|
||||
me_pool_force_close_secs: self.me_pool_force_close_secs.load(Ordering::Relaxed),
|
||||
me_pool_min_fresh_ratio: Self::permille_to_ratio(
|
||||
self.me_pool_min_fresh_ratio_permille.load(Ordering::Relaxed),
|
||||
|
|
|
|||
|
|
@ -8,6 +8,7 @@ use bytes::Bytes;
|
|||
use bytes::BytesMut;
|
||||
use rand::Rng;
|
||||
use tokio::sync::mpsc;
|
||||
use tokio::sync::mpsc::error::TrySendError;
|
||||
use tokio_util::sync::CancellationToken;
|
||||
use tracing::{debug, info, warn};
|
||||
|
||||
|
|
@ -311,41 +312,28 @@ impl MePool {
|
|||
let mut p = Vec::with_capacity(12);
|
||||
p.extend_from_slice(&RPC_PING_U32.to_le_bytes());
|
||||
p.extend_from_slice(&sent_id.to_le_bytes());
|
||||
{
|
||||
let mut tracker = ping_tracker_ping.lock().await;
|
||||
let now_epoch_ms = std::time::SystemTime::now()
|
||||
.duration_since(std::time::UNIX_EPOCH)
|
||||
.unwrap_or_default()
|
||||
.as_millis() as u64;
|
||||
let mut run_cleanup = false;
|
||||
if let Some(pool) = pool_ping.upgrade() {
|
||||
let last_cleanup_ms = pool
|
||||
let now_epoch_ms = std::time::SystemTime::now()
|
||||
.duration_since(std::time::UNIX_EPOCH)
|
||||
.unwrap_or_default()
|
||||
.as_millis() as u64;
|
||||
let mut run_cleanup = false;
|
||||
if let Some(pool) = pool_ping.upgrade() {
|
||||
let last_cleanup_ms = pool
|
||||
.ping_tracker_last_cleanup_epoch_ms
|
||||
.load(Ordering::Relaxed);
|
||||
if now_epoch_ms.saturating_sub(last_cleanup_ms) >= 30_000
|
||||
&& pool
|
||||
.ping_tracker_last_cleanup_epoch_ms
|
||||
.load(Ordering::Relaxed);
|
||||
if now_epoch_ms.saturating_sub(last_cleanup_ms) >= 30_000
|
||||
&& pool
|
||||
.ping_tracker_last_cleanup_epoch_ms
|
||||
.compare_exchange(
|
||||
last_cleanup_ms,
|
||||
now_epoch_ms,
|
||||
Ordering::AcqRel,
|
||||
Ordering::Relaxed,
|
||||
)
|
||||
.is_ok()
|
||||
{
|
||||
run_cleanup = true;
|
||||
}
|
||||
.compare_exchange(
|
||||
last_cleanup_ms,
|
||||
now_epoch_ms,
|
||||
Ordering::AcqRel,
|
||||
Ordering::Relaxed,
|
||||
)
|
||||
.is_ok()
|
||||
{
|
||||
run_cleanup = true;
|
||||
}
|
||||
|
||||
if run_cleanup {
|
||||
let before = tracker.len();
|
||||
tracker.retain(|_, (ts, _)| ts.elapsed() < Duration::from_secs(120));
|
||||
let expired = before.saturating_sub(tracker.len());
|
||||
if expired > 0 {
|
||||
stats_ping.increment_me_keepalive_timeout_by(expired as u64);
|
||||
}
|
||||
}
|
||||
tracker.insert(sent_id, (std::time::Instant::now(), writer_id));
|
||||
}
|
||||
ping_id = ping_id.wrapping_add(1);
|
||||
stats_ping.increment_me_keepalive_sent();
|
||||
|
|
@ -366,6 +354,16 @@ impl MePool {
|
|||
}
|
||||
break;
|
||||
}
|
||||
let mut tracker = ping_tracker_ping.lock().await;
|
||||
if run_cleanup {
|
||||
let before = tracker.len();
|
||||
tracker.retain(|_, (ts, _)| ts.elapsed() < Duration::from_secs(120));
|
||||
let expired = before.saturating_sub(tracker.len());
|
||||
if expired > 0 {
|
||||
stats_ping.increment_me_keepalive_timeout_by(expired as u64);
|
||||
}
|
||||
}
|
||||
tracker.insert(sent_id, (std::time::Instant::now(), writer_id));
|
||||
}
|
||||
});
|
||||
|
||||
|
|
@ -493,11 +491,9 @@ impl MePool {
|
|||
}
|
||||
|
||||
pub(crate) async fn remove_writer_and_close_clients(self: &Arc<Self>, writer_id: u64) {
|
||||
let conns = self.remove_writer_only(writer_id).await;
|
||||
for bound in conns {
|
||||
let _ = self.registry.route(bound.conn_id, super::MeResponse::Close).await;
|
||||
let _ = self.registry.unregister(bound.conn_id).await;
|
||||
}
|
||||
// Full client cleanup now happens inside `registry.writer_lost` to keep
|
||||
// writer reap/remove paths strictly non-blocking per connection.
|
||||
let _ = self.remove_writer_only(writer_id).await;
|
||||
}
|
||||
|
||||
pub(crate) async fn remove_writer_if_empty(self: &Arc<Self>, writer_id: u64) -> bool {
|
||||
|
|
@ -539,6 +535,11 @@ impl MePool {
|
|||
self.conn_count.fetch_sub(1, Ordering::Relaxed);
|
||||
}
|
||||
}
|
||||
// State invariant:
|
||||
// - writer is removed from `self.writers` (pool visibility),
|
||||
// - writer is removed from registry routing/binding maps via `writer_lost`.
|
||||
// The close command below is only a best-effort accelerator for task shutdown.
|
||||
// Cleanup progress must never depend on command-channel availability.
|
||||
let conns = self.registry.writer_lost(writer_id).await;
|
||||
{
|
||||
let mut tracker = self.ping_tracker.lock().await;
|
||||
|
|
@ -546,7 +547,25 @@ impl MePool {
|
|||
}
|
||||
self.rtt_stats.lock().await.remove(&writer_id);
|
||||
if let Some(tx) = close_tx {
|
||||
let _ = tx.send(WriterCommand::Close).await;
|
||||
match tx.try_send(WriterCommand::Close) {
|
||||
Ok(()) => {}
|
||||
Err(TrySendError::Full(_)) => {
|
||||
self.stats.increment_me_writer_close_signal_drop_total();
|
||||
self.stats
|
||||
.increment_me_writer_close_signal_channel_full_total();
|
||||
debug!(
|
||||
writer_id,
|
||||
"Skipping close signal for removed writer: command channel is full"
|
||||
);
|
||||
}
|
||||
Err(TrySendError::Closed(_)) => {
|
||||
self.stats.increment_me_writer_close_signal_drop_total();
|
||||
debug!(
|
||||
writer_id,
|
||||
"Skipping close signal for removed writer: command channel is closed"
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
if trigger_refill
|
||||
&& let Some(addr) = removed_addr
|
||||
|
|
|
|||
|
|
@ -8,6 +8,7 @@ use bytes::{Bytes, BytesMut};
|
|||
use tokio::io::AsyncReadExt;
|
||||
use tokio::net::TcpStream;
|
||||
use tokio::sync::{Mutex, mpsc};
|
||||
use tokio::sync::mpsc::error::TrySendError;
|
||||
use tokio_util::sync::CancellationToken;
|
||||
use tracing::{debug, trace, warn};
|
||||
|
||||
|
|
@ -173,12 +174,12 @@ pub(crate) async fn reader_loop(
|
|||
} else if pt == RPC_CLOSE_EXT_U32 && body.len() >= 8 {
|
||||
let cid = u64::from_le_bytes(body[0..8].try_into().unwrap());
|
||||
debug!(cid, "RPC_CLOSE_EXT from ME");
|
||||
reg.route(cid, MeResponse::Close).await;
|
||||
let _ = reg.route_nowait(cid, MeResponse::Close).await;
|
||||
reg.unregister(cid).await;
|
||||
} else if pt == RPC_CLOSE_CONN_U32 && body.len() >= 8 {
|
||||
let cid = u64::from_le_bytes(body[0..8].try_into().unwrap());
|
||||
debug!(cid, "RPC_CLOSE_CONN from ME");
|
||||
reg.route(cid, MeResponse::Close).await;
|
||||
let _ = reg.route_nowait(cid, MeResponse::Close).await;
|
||||
reg.unregister(cid).await;
|
||||
} else if pt == RPC_PING_U32 && body.len() >= 8 {
|
||||
let ping_id = i64::from_le_bytes(body[0..8].try_into().unwrap());
|
||||
|
|
@ -186,13 +187,15 @@ pub(crate) async fn reader_loop(
|
|||
let mut pong = Vec::with_capacity(12);
|
||||
pong.extend_from_slice(&RPC_PONG_U32.to_le_bytes());
|
||||
pong.extend_from_slice(&ping_id.to_le_bytes());
|
||||
if tx
|
||||
.send(WriterCommand::DataAndFlush(Bytes::from(pong)))
|
||||
.await
|
||||
.is_err()
|
||||
{
|
||||
warn!("PONG send failed");
|
||||
break;
|
||||
match tx.try_send(WriterCommand::DataAndFlush(Bytes::from(pong))) {
|
||||
Ok(()) => {}
|
||||
Err(TrySendError::Full(_)) => {
|
||||
debug!(ping_id, "PONG dropped: writer command channel is full");
|
||||
}
|
||||
Err(TrySendError::Closed(_)) => {
|
||||
warn!("PONG send failed: writer channel closed");
|
||||
break;
|
||||
}
|
||||
}
|
||||
} else if pt == RPC_PONG_U32 && body.len() >= 8 {
|
||||
let ping_id = i64::from_le_bytes(body[0..8].try_into().unwrap());
|
||||
|
|
@ -232,6 +235,13 @@ async fn send_close_conn(tx: &mpsc::Sender<WriterCommand>, conn_id: u64) {
|
|||
let mut p = Vec::with_capacity(12);
|
||||
p.extend_from_slice(&RPC_CLOSE_CONN_U32.to_le_bytes());
|
||||
p.extend_from_slice(&conn_id.to_le_bytes());
|
||||
|
||||
let _ = tx.send(WriterCommand::DataAndFlush(Bytes::from(p))).await;
|
||||
match tx.try_send(WriterCommand::DataAndFlush(Bytes::from(p))) {
|
||||
Ok(()) => {}
|
||||
Err(TrySendError::Full(_)) => {
|
||||
debug!(conn_id, "ME close_conn signal skipped: writer command channel is full");
|
||||
}
|
||||
Err(TrySendError::Closed(_)) => {
|
||||
debug!(conn_id, "ME close_conn signal skipped: writer command channel is closed");
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
|
|||
|
|
@ -11,6 +11,8 @@ use tokio::net::TcpStream;
|
|||
use socket2::{Socket, TcpKeepalive, Domain, Type, Protocol};
|
||||
use tracing::debug;
|
||||
|
||||
const DEFAULT_SOCKET_BUFFER_BYTES: usize = 256 * 1024;
|
||||
|
||||
/// Configure TCP socket with recommended settings for proxy use
|
||||
#[allow(dead_code)]
|
||||
pub fn configure_tcp_socket(
|
||||
|
|
@ -34,10 +36,10 @@ pub fn configure_tcp_socket(
|
|||
|
||||
socket.set_tcp_keepalive(&keepalive)?;
|
||||
}
|
||||
|
||||
// CHANGED: Removed manual buffer size setting (was 256KB).
|
||||
// Allowing the OS kernel to handle TCP window scaling (Autotuning) is critical
|
||||
// for mobile clients to avoid bufferbloat and stalled connections during uploads.
|
||||
|
||||
// Use explicit baseline buffers to reduce slow-start stalls on high RTT links.
|
||||
socket.set_recv_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
|
||||
socket.set_send_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
|
@ -62,6 +64,10 @@ pub fn configure_client_socket(
|
|||
let keepalive = keepalive.with_interval(Duration::from_secs(keepalive_secs));
|
||||
|
||||
socket.set_tcp_keepalive(&keepalive)?;
|
||||
|
||||
// Keep explicit baseline buffers for predictable throughput across busy hosts.
|
||||
socket.set_recv_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
|
||||
socket.set_send_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
|
||||
|
||||
// Set TCP user timeout (Linux only)
|
||||
// NOTE: iOS does not support TCP_USER_TIMEOUT - application-level timeout
|
||||
|
|
@ -124,6 +130,8 @@ pub fn create_outgoing_socket_bound(addr: SocketAddr, bind_addr: Option<IpAddr>)
|
|||
|
||||
// Disable Nagle
|
||||
socket.set_nodelay(true)?;
|
||||
socket.set_recv_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
|
||||
socket.set_send_buffer_size(DEFAULT_SOCKET_BUFFER_BYTES)?;
|
||||
|
||||
if let Some(bind_ip) = bind_addr {
|
||||
let bind_sock_addr = SocketAddr::new(bind_ip, 0);
|
||||
|
|
|
|||
|
|
@ -0,0 +1,728 @@
|
|||
"""
|
||||
Telemt Control API Python Client
|
||||
Full-coverage client for https://github.com/telemt/telemt
|
||||
|
||||
Usage:
|
||||
client = TelemtAPI("http://127.0.0.1:9091", auth_header="your-secret")
|
||||
client.health()
|
||||
client.create_user("alice", max_tcp_conns=10)
|
||||
client.patch_user("alice", data_quota_bytes=1_000_000_000)
|
||||
client.delete_user("alice")
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import secrets
|
||||
from dataclasses import dataclass, field
|
||||
from typing import Any, Dict, List, Optional, Union
|
||||
from urllib.error import HTTPError, URLError
|
||||
from urllib.request import Request, urlopen
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Exceptions
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TememtAPIError(Exception):
|
||||
"""Raised when the API returns an error envelope or a transport error."""
|
||||
|
||||
def __init__(self, message: str, code: str | None = None,
|
||||
http_status: int | None = None, request_id: int | None = None):
|
||||
super().__init__(message)
|
||||
self.code = code
|
||||
self.http_status = http_status
|
||||
self.request_id = request_id
|
||||
|
||||
def __repr__(self) -> str:
|
||||
return (f"TememtAPIError(message={str(self)!r}, code={self.code!r}, "
|
||||
f"http_status={self.http_status}, request_id={self.request_id})")
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Response wrapper
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@dataclass
|
||||
class APIResponse:
|
||||
"""Wraps a successful API response envelope."""
|
||||
ok: bool
|
||||
data: Any
|
||||
revision: str | None = None
|
||||
|
||||
def __repr__(self) -> str: # pragma: no cover
|
||||
return f"APIResponse(ok={self.ok}, revision={self.revision!r}, data={self.data!r})"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Main client
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TememtAPI:
|
||||
"""
|
||||
HTTP client for the Telemt Control API.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
base_url:
|
||||
Scheme + host + port, e.g. ``"http://127.0.0.1:9091"``.
|
||||
Trailing slash is stripped automatically.
|
||||
auth_header:
|
||||
Exact value for the ``Authorization`` header.
|
||||
Leave *None* when ``auth_header`` is not configured server-side.
|
||||
timeout:
|
||||
Socket timeout in seconds for every request (default 10).
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
base_url: str = "http://127.0.0.1:9091",
|
||||
auth_header: str | None = None,
|
||||
timeout: int = 10,
|
||||
) -> None:
|
||||
self.base_url = base_url.rstrip("/")
|
||||
self.auth_header = auth_header
|
||||
self.timeout = timeout
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Low-level HTTP helpers
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _headers(self, extra: dict | None = None) -> dict:
|
||||
h = {"Content-Type": "application/json; charset=utf-8",
|
||||
"Accept": "application/json"}
|
||||
if self.auth_header:
|
||||
h["Authorization"] = self.auth_header
|
||||
if extra:
|
||||
h.update(extra)
|
||||
return h
|
||||
|
||||
def _request(
|
||||
self,
|
||||
method: str,
|
||||
path: str,
|
||||
body: dict | None = None,
|
||||
if_match: str | None = None,
|
||||
query: dict | None = None,
|
||||
) -> APIResponse:
|
||||
url = self.base_url + path
|
||||
if query:
|
||||
qs = "&".join(f"{k}={v}" for k, v in query.items())
|
||||
url = f"{url}?{qs}"
|
||||
|
||||
raw_body: bytes | None = None
|
||||
if body is not None:
|
||||
raw_body = json.dumps(body).encode()
|
||||
|
||||
extra_headers: dict = {}
|
||||
if if_match is not None:
|
||||
extra_headers["If-Match"] = if_match
|
||||
|
||||
req = Request(
|
||||
url,
|
||||
data=raw_body,
|
||||
headers=self._headers(extra_headers),
|
||||
method=method,
|
||||
)
|
||||
|
||||
try:
|
||||
with urlopen(req, timeout=self.timeout) as resp:
|
||||
payload = json.loads(resp.read())
|
||||
except HTTPError as exc:
|
||||
raw = exc.read()
|
||||
try:
|
||||
payload = json.loads(raw)
|
||||
except Exception:
|
||||
raise TememtAPIError(
|
||||
str(exc), http_status=exc.code
|
||||
) from exc
|
||||
err = payload.get("error", {})
|
||||
raise TememtAPIError(
|
||||
err.get("message", str(exc)),
|
||||
code=err.get("code"),
|
||||
http_status=exc.code,
|
||||
request_id=payload.get("request_id"),
|
||||
) from exc
|
||||
except URLError as exc:
|
||||
raise TememtAPIError(str(exc)) from exc
|
||||
|
||||
if not payload.get("ok"):
|
||||
err = payload.get("error", {})
|
||||
raise TememtAPIError(
|
||||
err.get("message", "unknown error"),
|
||||
code=err.get("code"),
|
||||
request_id=payload.get("request_id"),
|
||||
)
|
||||
|
||||
return APIResponse(
|
||||
ok=True,
|
||||
data=payload.get("data"),
|
||||
revision=payload.get("revision"),
|
||||
)
|
||||
|
||||
def _get(self, path: str, query: dict | None = None) -> APIResponse:
|
||||
return self._request("GET", path, query=query)
|
||||
|
||||
def _post(self, path: str, body: dict | None = None,
|
||||
if_match: str | None = None) -> APIResponse:
|
||||
return self._request("POST", path, body=body, if_match=if_match)
|
||||
|
||||
def _patch(self, path: str, body: dict,
|
||||
if_match: str | None = None) -> APIResponse:
|
||||
return self._request("PATCH", path, body=body, if_match=if_match)
|
||||
|
||||
def _delete(self, path: str, if_match: str | None = None) -> APIResponse:
|
||||
return self._request("DELETE", path, if_match=if_match)
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Health & system
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def health(self) -> APIResponse:
|
||||
"""GET /v1/health — liveness probe."""
|
||||
return self._get("/v1/health")
|
||||
|
||||
def system_info(self) -> APIResponse:
|
||||
"""GET /v1/system/info — binary version, uptime, config hash."""
|
||||
return self._get("/v1/system/info")
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Runtime gates & initialization
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def runtime_gates(self) -> APIResponse:
|
||||
"""GET /v1/runtime/gates — admission gates and startup progress."""
|
||||
return self._get("/v1/runtime/gates")
|
||||
|
||||
def runtime_initialization(self) -> APIResponse:
|
||||
"""GET /v1/runtime/initialization — detailed startup timeline."""
|
||||
return self._get("/v1/runtime/initialization")
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Limits & security
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def limits_effective(self) -> APIResponse:
|
||||
"""GET /v1/limits/effective — effective timeout/upstream/ME limits."""
|
||||
return self._get("/v1/limits/effective")
|
||||
|
||||
def security_posture(self) -> APIResponse:
|
||||
"""GET /v1/security/posture — API auth, telemetry, log-level summary."""
|
||||
return self._get("/v1/security/posture")
|
||||
|
||||
def security_whitelist(self) -> APIResponse:
|
||||
"""GET /v1/security/whitelist — current IP whitelist CIDRs."""
|
||||
return self._get("/v1/security/whitelist")
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Stats
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def stats_summary(self) -> APIResponse:
|
||||
"""GET /v1/stats/summary — uptime, connection totals, user count."""
|
||||
return self._get("/v1/stats/summary")
|
||||
|
||||
def stats_zero_all(self) -> APIResponse:
|
||||
"""GET /v1/stats/zero/all — zero-cost counters (core, upstream, ME, pool, desync)."""
|
||||
return self._get("/v1/stats/zero/all")
|
||||
|
||||
def stats_upstreams(self) -> APIResponse:
|
||||
"""GET /v1/stats/upstreams — upstream health + zero counters."""
|
||||
return self._get("/v1/stats/upstreams")
|
||||
|
||||
def stats_minimal_all(self) -> APIResponse:
|
||||
"""GET /v1/stats/minimal/all — ME writers + DC snapshot (requires minimal_runtime_enabled)."""
|
||||
return self._get("/v1/stats/minimal/all")
|
||||
|
||||
def stats_me_writers(self) -> APIResponse:
|
||||
"""GET /v1/stats/me-writers — per-writer ME status (requires minimal_runtime_enabled)."""
|
||||
return self._get("/v1/stats/me-writers")
|
||||
|
||||
def stats_dcs(self) -> APIResponse:
|
||||
"""GET /v1/stats/dcs — per-DC coverage and writer counts (requires minimal_runtime_enabled)."""
|
||||
return self._get("/v1/stats/dcs")
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Runtime deep-dive
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def runtime_me_pool_state(self) -> APIResponse:
|
||||
"""GET /v1/runtime/me_pool_state — ME pool generation/writer/refill snapshot."""
|
||||
return self._get("/v1/runtime/me_pool_state")
|
||||
|
||||
def runtime_me_quality(self) -> APIResponse:
|
||||
"""GET /v1/runtime/me_quality — ME KDF, route-drop, and per-DC RTT counters."""
|
||||
return self._get("/v1/runtime/me_quality")
|
||||
|
||||
def runtime_upstream_quality(self) -> APIResponse:
|
||||
"""GET /v1/runtime/upstream_quality — per-upstream health, latency, DC preferences."""
|
||||
return self._get("/v1/runtime/upstream_quality")
|
||||
|
||||
def runtime_nat_stun(self) -> APIResponse:
|
||||
"""GET /v1/runtime/nat_stun — NAT probe state, STUN servers, reflected IPs."""
|
||||
return self._get("/v1/runtime/nat_stun")
|
||||
|
||||
def runtime_me_selftest(self) -> APIResponse:
|
||||
"""GET /v1/runtime/me-selftest — KDF/timeskew/IP/PID/BND health state."""
|
||||
return self._get("/v1/runtime/me-selftest")
|
||||
|
||||
def runtime_connections_summary(self) -> APIResponse:
|
||||
"""GET /v1/runtime/connections/summary — live connection totals + top-N users (requires runtime_edge_enabled)."""
|
||||
return self._get("/v1/runtime/connections/summary")
|
||||
|
||||
def runtime_events_recent(self, limit: int | None = None) -> APIResponse:
|
||||
"""GET /v1/runtime/events/recent — recent ring-buffer events (requires runtime_edge_enabled).
|
||||
|
||||
Parameters
|
||||
----------
|
||||
limit:
|
||||
Optional cap on returned events (1–1000, server default 50).
|
||||
"""
|
||||
query = {"limit": str(limit)} if limit is not None else None
|
||||
return self._get("/v1/runtime/events/recent", query=query)
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Users (read)
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def list_users(self) -> APIResponse:
|
||||
"""GET /v1/users — list all users with connection/traffic info."""
|
||||
return self._get("/v1/users")
|
||||
|
||||
def get_user(self, username: str) -> APIResponse:
|
||||
"""GET /v1/users/{username} — single user info."""
|
||||
return self._get(f"/v1/users/{_safe(username)}")
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Users (write)
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def create_user(
|
||||
self,
|
||||
username: str,
|
||||
*,
|
||||
secret: str | None = None,
|
||||
user_ad_tag: str | None = None,
|
||||
max_tcp_conns: int | None = None,
|
||||
expiration_rfc3339: str | None = None,
|
||||
data_quota_bytes: int | None = None,
|
||||
max_unique_ips: int | None = None,
|
||||
if_match: str | None = None,
|
||||
) -> APIResponse:
|
||||
"""POST /v1/users — create a new user.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
username:
|
||||
``[A-Za-z0-9_.-]``, length 1–64.
|
||||
secret:
|
||||
Exactly 32 hex chars. Auto-generated if omitted.
|
||||
user_ad_tag:
|
||||
Exactly 32 hex chars.
|
||||
max_tcp_conns:
|
||||
Per-user concurrent TCP limit.
|
||||
expiration_rfc3339:
|
||||
RFC3339 expiration timestamp, e.g. ``"2025-12-31T23:59:59Z"``.
|
||||
data_quota_bytes:
|
||||
Per-user traffic quota in bytes.
|
||||
max_unique_ips:
|
||||
Per-user unique source IP limit.
|
||||
if_match:
|
||||
Optional ``If-Match`` revision for optimistic concurrency.
|
||||
"""
|
||||
body: Dict[str, Any] = {"username": username}
|
||||
_opt(body, "secret", secret)
|
||||
_opt(body, "user_ad_tag", user_ad_tag)
|
||||
_opt(body, "max_tcp_conns", max_tcp_conns)
|
||||
_opt(body, "expiration_rfc3339", expiration_rfc3339)
|
||||
_opt(body, "data_quota_bytes", data_quota_bytes)
|
||||
_opt(body, "max_unique_ips", max_unique_ips)
|
||||
return self._post("/v1/users", body=body, if_match=if_match)
|
||||
|
||||
def patch_user(
|
||||
self,
|
||||
username: str,
|
||||
*,
|
||||
secret: str | None = None,
|
||||
user_ad_tag: str | None = None,
|
||||
max_tcp_conns: int | None = None,
|
||||
expiration_rfc3339: str | None = None,
|
||||
data_quota_bytes: int | None = None,
|
||||
max_unique_ips: int | None = None,
|
||||
if_match: str | None = None,
|
||||
) -> APIResponse:
|
||||
"""PATCH /v1/users/{username} — partial update; only provided fields change.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
username:
|
||||
Existing username to update.
|
||||
secret:
|
||||
New secret (32 hex chars).
|
||||
user_ad_tag:
|
||||
New ad tag (32 hex chars).
|
||||
max_tcp_conns:
|
||||
New TCP concurrency limit.
|
||||
expiration_rfc3339:
|
||||
New expiration timestamp.
|
||||
data_quota_bytes:
|
||||
New quota in bytes.
|
||||
max_unique_ips:
|
||||
New unique IP limit.
|
||||
if_match:
|
||||
Optional ``If-Match`` revision.
|
||||
"""
|
||||
body: Dict[str, Any] = {}
|
||||
_opt(body, "secret", secret)
|
||||
_opt(body, "user_ad_tag", user_ad_tag)
|
||||
_opt(body, "max_tcp_conns", max_tcp_conns)
|
||||
_opt(body, "expiration_rfc3339", expiration_rfc3339)
|
||||
_opt(body, "data_quota_bytes", data_quota_bytes)
|
||||
_opt(body, "max_unique_ips", max_unique_ips)
|
||||
if not body:
|
||||
raise ValueError("patch_user: at least one field must be provided")
|
||||
return self._patch(f"/v1/users/{_safe(username)}", body=body,
|
||||
if_match=if_match)
|
||||
|
||||
def delete_user(
|
||||
self,
|
||||
username: str,
|
||||
*,
|
||||
if_match: str | None = None,
|
||||
) -> APIResponse:
|
||||
"""DELETE /v1/users/{username} — remove user; blocks deletion of last user.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
if_match:
|
||||
Optional ``If-Match`` revision for optimistic concurrency.
|
||||
"""
|
||||
return self._delete(f"/v1/users/{_safe(username)}", if_match=if_match)
|
||||
|
||||
# NOTE: POST /v1/users/{username}/rotate-secret currently returns 404
|
||||
# in the route matcher (documented limitation). The method is provided
|
||||
# for completeness and future compatibility.
|
||||
def rotate_secret(
|
||||
self,
|
||||
username: str,
|
||||
*,
|
||||
secret: str | None = None,
|
||||
if_match: str | None = None,
|
||||
) -> APIResponse:
|
||||
"""POST /v1/users/{username}/rotate-secret — rotate user secret.
|
||||
|
||||
.. warning::
|
||||
This endpoint currently returns ``404 not_found`` in all released
|
||||
versions (documented route matcher limitation). The method is
|
||||
included for future compatibility.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
secret:
|
||||
New secret (32 hex chars). Auto-generated if omitted.
|
||||
"""
|
||||
body: Dict[str, Any] = {}
|
||||
_opt(body, "secret", secret)
|
||||
return self._post(f"/v1/users/{_safe(username)}/rotate-secret",
|
||||
body=body or None, if_match=if_match)
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Convenience helpers
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
@staticmethod
|
||||
def generate_secret() -> str:
|
||||
"""Generate a random 32-character hex secret suitable for user creation."""
|
||||
return secrets.token_hex(16) # 16 bytes → 32 hex chars
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Internal helpers
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _safe(username: str) -> str:
|
||||
"""Minimal guard: reject obvious path-injection attempts."""
|
||||
if "/" in username or "\\" in username:
|
||||
raise ValueError(f"Invalid username: {username!r}")
|
||||
return username
|
||||
|
||||
|
||||
def _opt(d: dict, key: str, value: Any) -> None:
|
||||
"""Add key to dict only when value is not None."""
|
||||
if value is not None:
|
||||
d[key] = value
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# CLI
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _print(resp: APIResponse) -> None:
|
||||
print(json.dumps(resp.data, indent=2))
|
||||
if resp.revision:
|
||||
print(f"# revision: {resp.revision}", flush=True)
|
||||
|
||||
|
||||
def _build_parser():
|
||||
import argparse
|
||||
|
||||
p = argparse.ArgumentParser(
|
||||
prog="telemt_api.py",
|
||||
description="Telemt Control API CLI",
|
||||
formatter_class=argparse.RawDescriptionHelpFormatter,
|
||||
epilog="""
|
||||
COMMANDS (read)
|
||||
health Liveness check
|
||||
info System info (version, uptime, config hash)
|
||||
status Runtime gates + startup progress
|
||||
init Runtime initialization timeline
|
||||
limits Effective limits (timeouts, upstream, ME)
|
||||
posture Security posture summary
|
||||
whitelist IP whitelist entries
|
||||
summary Stats summary (conns, uptime, users)
|
||||
zero Zero-cost counters (core/upstream/ME/pool/desync)
|
||||
upstreams Upstream health + zero counters
|
||||
minimal ME writers + DC snapshot [minimal_runtime_enabled]
|
||||
me-writers Per-writer ME status [minimal_runtime_enabled]
|
||||
dcs Per-DC coverage [minimal_runtime_enabled]
|
||||
me-pool ME pool generation/writer/refill snapshot
|
||||
me-quality ME KDF, route-drops, per-DC RTT
|
||||
upstream-quality Per-upstream health + latency
|
||||
nat-stun NAT probe state + STUN servers
|
||||
me-selftest KDF/timeskew/IP/PID/BND health
|
||||
connections Live connection totals + top-N [runtime_edge_enabled]
|
||||
events [--limit N] Recent ring-buffer events [runtime_edge_enabled]
|
||||
|
||||
COMMANDS (users)
|
||||
users List all users
|
||||
user <username> Get single user
|
||||
create <username> [OPTIONS] Create user
|
||||
patch <username> [OPTIONS] Partial update user
|
||||
delete <username> Delete user
|
||||
secret <username> [--secret S] Rotate secret (reserved; returns 404 in current release)
|
||||
gen-secret Print a random 32-hex secret and exit
|
||||
|
||||
USER OPTIONS (for create / patch)
|
||||
--secret S 32 hex chars
|
||||
--ad-tag S 32 hex chars (ad tag)
|
||||
--max-conns N Max concurrent TCP connections
|
||||
--expires DATETIME RFC3339 expiration (e.g. 2026-12-31T23:59:59Z)
|
||||
--quota N Data quota in bytes
|
||||
--max-ips N Max unique source IPs
|
||||
|
||||
EXAMPLES
|
||||
telemt_api.py health
|
||||
telemt_api.py -u http://10.0.0.1:9091 -a mysecret users
|
||||
telemt_api.py create alice --max-conns 5 --quota 10000000000
|
||||
telemt_api.py patch alice --expires 2027-01-01T00:00:00Z
|
||||
telemt_api.py delete alice
|
||||
telemt_api.py events --limit 20
|
||||
""",
|
||||
)
|
||||
|
||||
p.add_argument("-u", "--url", default="http://127.0.0.1:9091",
|
||||
metavar="URL", help="API base URL (default: http://127.0.0.1:9091)")
|
||||
p.add_argument("-a", "--auth", default=None, metavar="TOKEN",
|
||||
help="Authorization header value")
|
||||
p.add_argument("-t", "--timeout", type=int, default=10, metavar="SEC",
|
||||
help="Request timeout in seconds (default: 10)")
|
||||
|
||||
p.add_argument("command", nargs="?", default="help",
|
||||
help="Command to run (see COMMANDS below)")
|
||||
p.add_argument("arg", nargs="?", default=None, metavar="USERNAME",
|
||||
help="Username for user commands")
|
||||
|
||||
# user create/patch fields
|
||||
p.add_argument("--secret", default=None)
|
||||
p.add_argument("--ad-tag", dest="ad_tag", default=None)
|
||||
p.add_argument("--max-conns", dest="max_conns", type=int, default=None)
|
||||
p.add_argument("--expires", default=None)
|
||||
p.add_argument("--quota", type=int, default=None)
|
||||
p.add_argument("--max-ips", dest="max_ips", type=int, default=None)
|
||||
|
||||
# events
|
||||
p.add_argument("--limit", type=int, default=None,
|
||||
help="Max events for `events` command")
|
||||
|
||||
# optimistic concurrency
|
||||
p.add_argument("--if-match", dest="if_match", default=None,
|
||||
metavar="REVISION", help="If-Match revision header")
|
||||
|
||||
return p
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
import sys
|
||||
|
||||
parser = _build_parser()
|
||||
args = parser.parse_args()
|
||||
|
||||
cmd = (args.command or "help").lower()
|
||||
|
||||
if cmd in ("help", "--help"):
|
||||
parser.print_help()
|
||||
sys.exit(0)
|
||||
|
||||
if cmd == "gen-secret":
|
||||
print(TememtAPI.generate_secret())
|
||||
sys.exit(0)
|
||||
|
||||
api = TememtAPI(args.url, auth_header=args.auth, timeout=args.timeout)
|
||||
|
||||
try:
|
||||
# -- read endpoints --------------------------------------------------
|
||||
if cmd == "health":
|
||||
_print(api.health())
|
||||
|
||||
elif cmd == "info":
|
||||
_print(api.system_info())
|
||||
|
||||
elif cmd == "status":
|
||||
_print(api.runtime_gates())
|
||||
|
||||
elif cmd == "init":
|
||||
_print(api.runtime_initialization())
|
||||
|
||||
elif cmd == "limits":
|
||||
_print(api.limits_effective())
|
||||
|
||||
elif cmd == "posture":
|
||||
_print(api.security_posture())
|
||||
|
||||
elif cmd == "whitelist":
|
||||
_print(api.security_whitelist())
|
||||
|
||||
elif cmd == "summary":
|
||||
_print(api.stats_summary())
|
||||
|
||||
elif cmd == "zero":
|
||||
_print(api.stats_zero_all())
|
||||
|
||||
elif cmd == "upstreams":
|
||||
_print(api.stats_upstreams())
|
||||
|
||||
elif cmd == "minimal":
|
||||
_print(api.stats_minimal_all())
|
||||
|
||||
elif cmd == "me-writers":
|
||||
_print(api.stats_me_writers())
|
||||
|
||||
elif cmd == "dcs":
|
||||
_print(api.stats_dcs())
|
||||
|
||||
elif cmd == "me-pool":
|
||||
_print(api.runtime_me_pool_state())
|
||||
|
||||
elif cmd == "me-quality":
|
||||
_print(api.runtime_me_quality())
|
||||
|
||||
elif cmd == "upstream-quality":
|
||||
_print(api.runtime_upstream_quality())
|
||||
|
||||
elif cmd == "nat-stun":
|
||||
_print(api.runtime_nat_stun())
|
||||
|
||||
elif cmd == "me-selftest":
|
||||
_print(api.runtime_me_selftest())
|
||||
|
||||
elif cmd == "connections":
|
||||
_print(api.runtime_connections_summary())
|
||||
|
||||
elif cmd == "events":
|
||||
_print(api.runtime_events_recent(limit=args.limit))
|
||||
|
||||
# -- user read -------------------------------------------------------
|
||||
elif cmd == "users":
|
||||
resp = api.list_users()
|
||||
users = resp.data or []
|
||||
if not users:
|
||||
print("No users configured.")
|
||||
else:
|
||||
fmt = "{:<24} {:>7} {:>14} {}"
|
||||
print(fmt.format("USERNAME", "CONNS", "OCTETS", "LINKS"))
|
||||
print("-" * 72)
|
||||
for u in users:
|
||||
links = (u.get("links") or {})
|
||||
all_links = (links.get("classic") or []) + \
|
||||
(links.get("secure") or []) + \
|
||||
(links.get("tls") or [])
|
||||
link_str = all_links[0] if all_links else "-"
|
||||
print(fmt.format(
|
||||
u["username"],
|
||||
u.get("current_connections", 0),
|
||||
u.get("total_octets", 0),
|
||||
link_str,
|
||||
))
|
||||
if resp.revision:
|
||||
print(f"# revision: {resp.revision}")
|
||||
|
||||
elif cmd == "user":
|
||||
if not args.arg:
|
||||
parser.error("user command requires <username>")
|
||||
_print(api.get_user(args.arg))
|
||||
|
||||
# -- user write ------------------------------------------------------
|
||||
elif cmd == "create":
|
||||
if not args.arg:
|
||||
parser.error("create command requires <username>")
|
||||
resp = api.create_user(
|
||||
args.arg,
|
||||
secret=args.secret,
|
||||
user_ad_tag=args.ad_tag,
|
||||
max_tcp_conns=args.max_conns,
|
||||
expiration_rfc3339=args.expires,
|
||||
data_quota_bytes=args.quota,
|
||||
max_unique_ips=args.max_ips,
|
||||
if_match=args.if_match,
|
||||
)
|
||||
d = resp.data or {}
|
||||
print(f"Created: {d.get('user', {}).get('username')}")
|
||||
print(f"Secret: {d.get('secret')}")
|
||||
links = (d.get("user") or {}).get("links") or {}
|
||||
for kind, lst in links.items():
|
||||
for link in (lst or []):
|
||||
print(f"Link ({kind}): {link}")
|
||||
if resp.revision:
|
||||
print(f"# revision: {resp.revision}")
|
||||
|
||||
elif cmd == "patch":
|
||||
if not args.arg:
|
||||
parser.error("patch command requires <username>")
|
||||
if not any([args.secret, args.ad_tag, args.max_conns,
|
||||
args.expires, args.quota, args.max_ips]):
|
||||
parser.error("patch requires at least one field (--secret, --max-conns, --expires, --quota, --max-ips, --ad-tag)")
|
||||
_print(api.patch_user(
|
||||
args.arg,
|
||||
secret=args.secret,
|
||||
user_ad_tag=args.ad_tag,
|
||||
max_tcp_conns=args.max_conns,
|
||||
expiration_rfc3339=args.expires,
|
||||
data_quota_bytes=args.quota,
|
||||
max_unique_ips=args.max_ips,
|
||||
if_match=args.if_match,
|
||||
))
|
||||
|
||||
elif cmd == "delete":
|
||||
if not args.arg:
|
||||
parser.error("delete command requires <username>")
|
||||
resp = api.delete_user(args.arg, if_match=args.if_match)
|
||||
print(f"Deleted: {resp.data}")
|
||||
if resp.revision:
|
||||
print(f"# revision: {resp.revision}")
|
||||
|
||||
elif cmd == "secret":
|
||||
if not args.arg:
|
||||
parser.error("secret command requires <username>")
|
||||
_print(api.rotate_secret(args.arg, secret=args.secret,
|
||||
if_match=args.if_match))
|
||||
|
||||
else:
|
||||
print(f"Unknown command: {cmd!r}\nRun with 'help' to see available commands.",
|
||||
file=sys.stderr)
|
||||
sys.exit(1)
|
||||
|
||||
except TememtAPIError as exc:
|
||||
print(f"API error [{exc.http_status}] {exc.code}: {exc}", file=sys.stderr)
|
||||
sys.exit(1)
|
||||
except KeyboardInterrupt:
|
||||
sys.exit(130)
|
||||
|
|
@ -1165,6 +1165,60 @@ zabbix_export:
|
|||
tags:
|
||||
- tag: Application
|
||||
value: 'Users connections'
|
||||
graph_prototypes:
|
||||
- uuid: 4199de3dcea943d8a1ec62dc297b2e9f
|
||||
name: 'User {#TELEMT_USER}: Connections'
|
||||
graph_items:
|
||||
- color: 1A7C11
|
||||
item:
|
||||
host: Telemt
|
||||
key: 'telemt.active_conn_[{#TELEMT_USER}]'
|
||||
- color: F63100
|
||||
sortorder: '1'
|
||||
item:
|
||||
host: Telemt
|
||||
key: 'telemt.total_conn_[{#TELEMT_USER}]'
|
||||
- uuid: 84b8f22d891e49768891f497cac12fb3
|
||||
name: 'User {#TELEMT_USER}: IPs'
|
||||
graph_items:
|
||||
- color: 0080FF
|
||||
item:
|
||||
host: Telemt
|
||||
key: 'telemt.ips_current_[{#TELEMT_USER}]'
|
||||
- color: FF8000
|
||||
sortorder: '1'
|
||||
item:
|
||||
host: Telemt
|
||||
key: 'telemt.ips_limit_[{#TELEMT_USER}]'
|
||||
- color: AA00FF
|
||||
sortorder: '2'
|
||||
item:
|
||||
host: Telemt
|
||||
key: 'telemt.ips_utilization_[{#TELEMT_USER}]'
|
||||
- uuid: 09dabe7125114e36a6ce40788a7cb888
|
||||
name: 'User {#TELEMT_USER}: Traffic'
|
||||
graph_items:
|
||||
- color: 00AA00
|
||||
item:
|
||||
host: Telemt
|
||||
key: 'telemt.octets_from_[{#TELEMT_USER}]'
|
||||
- color: AA0000
|
||||
sortorder: '1'
|
||||
item:
|
||||
host: Telemt
|
||||
key: 'telemt.octets_to_[{#TELEMT_USER}]'
|
||||
- uuid: 367f458962574b0ab3c02278a4cd7ecb
|
||||
name: 'User {#TELEMT_USER}: Messages'
|
||||
graph_items:
|
||||
- color: 00AAFF
|
||||
item:
|
||||
host: Telemt
|
||||
key: 'telemt.msgs_from_[{#TELEMT_USER}]'
|
||||
- color: FF5500
|
||||
sortorder: '1'
|
||||
item:
|
||||
host: Telemt
|
||||
key: 'telemt.msgs_to_[{#TELEMT_USER}]'
|
||||
master_item:
|
||||
key: telemt.prom_metrics
|
||||
lld_macro_paths:
|
||||
|
|
@ -1177,3 +1231,206 @@ zabbix_export:
|
|||
tags:
|
||||
- tag: target
|
||||
value: Telemt
|
||||
graphs:
|
||||
- uuid: f162658049ca4f50893c5cc02515ff10
|
||||
name: 'Telemt: Server Connections Overview'
|
||||
graph_items:
|
||||
- color: 1A7C11
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.conn_total
|
||||
- color: F63100
|
||||
sortorder: '1'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.conn_bad_total
|
||||
- color: FC6EA3
|
||||
sortorder: '2'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.handshake_timeouts_total
|
||||
- uuid: 759eca5e687142f19248f9d9343e1adf
|
||||
name: 'Telemt: Uptime'
|
||||
graph_items:
|
||||
- color: 0080FF
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.uptime
|
||||
- uuid: 0a27dbd0490d4a508c03ed39fa18545d
|
||||
name: 'Telemt: ME Keepalive'
|
||||
graph_items:
|
||||
- color: 1A7C11
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_keepalive_sent_total
|
||||
- color: 00AA00
|
||||
sortorder: '1'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_keepalive_pong_total
|
||||
- color: F63100
|
||||
sortorder: '2'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_keepalive_failed_total
|
||||
- color: FF8000
|
||||
sortorder: '3'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_keepalive_timeout_total
|
||||
- uuid: 4015e24ff70b49f484e884d1dde687c0
|
||||
name: 'Telemt: ME Reconnects'
|
||||
graph_items:
|
||||
- color: 0080FF
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_reconnect_attempts_total
|
||||
- color: 1A7C11
|
||||
sortorder: '1'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_reconnect_success_total
|
||||
- uuid: f3e3eeb0663c471aa26cf4b6872b0c50
|
||||
name: 'Telemt: ME Route Drops'
|
||||
graph_items:
|
||||
- color: F63100
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_route_drop_channel_closed_total
|
||||
- color: FF8000
|
||||
sortorder: '1'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_route_drop_no_conn_total
|
||||
- color: AA00FF
|
||||
sortorder: '2'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_route_drop_queue_full_total
|
||||
- uuid: 49b51ed78a5943bdbd6d1d34fe28bf61
|
||||
name: 'Telemt: ME Writer Pool'
|
||||
graph_items:
|
||||
- color: 0080FF
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.pool_drain_active
|
||||
- color: F63100
|
||||
sortorder: '1'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.pool_force_close_total
|
||||
- color: FF8000
|
||||
sortorder: '2'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.pool_stale_pick_total
|
||||
- color: 1A7C11
|
||||
sortorder: '3'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.pool_swap_total
|
||||
- uuid: a0779e6c979f4c1ab7ac4da7123a5ecb
|
||||
name: 'Telemt: ME Writer Removals and Restores'
|
||||
graph_items:
|
||||
- color: F63100
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_writer_removed_total
|
||||
- color: FF8000
|
||||
sortorder: '1'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_writer_removed_unexpected_total
|
||||
- color: FFAA00
|
||||
sortorder: '2'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_writer_removed_unexpected_minus_restored_total
|
||||
- color: 1A7C11
|
||||
sortorder: '3'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_writer_restored_same_endpoint_total
|
||||
- color: 00AA00
|
||||
sortorder: '4'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_writer_restored_fallback_total
|
||||
- uuid: 4fead70290664953b026a228108bee0e
|
||||
name: 'Telemt: Desync Detections'
|
||||
graph_items:
|
||||
- color: F63100
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.desync_total
|
||||
- color: 1A7C11
|
||||
sortorder: '1'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.desync_full_logged_total
|
||||
- color: FF8000
|
||||
sortorder: '2'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.desync_suppressed_total
|
||||
- uuid: 9f8c9f48cb534a66ac21b1bba1acb602
|
||||
name: 'Telemt: Upstream Connect Cycles'
|
||||
graph_items:
|
||||
- color: 0080FF
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.upstream_connect_attempt_total
|
||||
- color: 1A7C11
|
||||
sortorder: '1'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.upstream_connect_success_total
|
||||
- color: F63100
|
||||
sortorder: '2'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.upstream_connect_fail_total
|
||||
- color: FF8000
|
||||
sortorder: '3'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.upstream_connect_failfast_hard_error_total
|
||||
- uuid: 05182057727547f8b8884b7e71e34f19
|
||||
name: 'Telemt: ME Single-Endpoint Outages'
|
||||
graph_items:
|
||||
- color: F63100
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_single_endpoint_outage_enter_total
|
||||
- color: 1A7C11
|
||||
sortorder: '1'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_single_endpoint_outage_exit_total
|
||||
- color: 0080FF
|
||||
sortorder: '2'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_single_endpoint_outage_reconnect_attempt_total
|
||||
- color: 00AA00
|
||||
sortorder: '3'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_single_endpoint_outage_reconnect_success_total
|
||||
- uuid: 6892e8b7fbd2445d9ccc0574af58a354
|
||||
name: 'Telemt: ME Refill Activity'
|
||||
graph_items:
|
||||
- color: 0080FF
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_refill_triggered_total
|
||||
- color: F63100
|
||||
sortorder: '1'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_refill_failed_total
|
||||
- color: FF8000
|
||||
sortorder: '2'
|
||||
item:
|
||||
host: Telemt
|
||||
key: telemt.me_refill_skipped_inflight_total
|
||||
|
|
|
|||
Loading…
Reference in New Issue