Compare commits

..

48 Commits

Author SHA1 Message Date
Vladimir Krivopalov
d9f729db26 Merge dc088ea5d2 into f3598cf309 2026-03-21 10:37:09 +00:00
Vladimir Krivopalov
dc088ea5d2 Add multi-destination logging: syslog and file support
Implement logging infrastructure for non-systemd platforms:

- Add src/logging.rs with syslog and file logging support
- New CLI flags: --syslog, --log-file, --log-file-daily
- Syslog uses libc directly with LOG_DAEMON facility
- File logging via tracing-appender with optional daily rotation

Update service scripts:
- OpenRC and FreeBSD rc.d now use --syslog by default
- Ensures logs are captured on platforms without journald

Default (stderr) behavior unchanged for systemd compatibility.
Log destination is selected at startup based on CLI flags.

Signed-off-by: Vladimir Krivopalov <argenet@yandex.ru>
2026-03-21 12:36:59 +02:00
Vladimir Krivopalov
dd548d1049 Add multi-platform service manager integration
Implement automatic init system detection and service file generation
for systemd, OpenRC (Alpine/Gentoo), and FreeBSD rc.d:

- Add src/service module with init system detection and generators
- Auto-detect init system via filesystem probes
- Generate platform-appropriate service files during --init

systemd enhancements:
- ExecReload for SIGHUP config reload
- PIDFile directive
- Comprehensive security hardening (ProtectKernelTunables,
  RestrictAddressFamilies, MemoryDenyWriteExecute, etc.)
- CAP_NET_BIND_SERVICE for privileged ports

OpenRC support:
- Standard openrc-run script with depend/reload functions
- Directory setup in start_pre

FreeBSD rc.d support:
- rc.subr integration with rc.conf variables
- reload extra command

The --init command now detects the init system and runs the
appropriate enable/start commands (systemctl, rc-update, sysrc).

Signed-off-by: Vladimir Krivopalov <argenet@yandex.ru>
2026-03-21 12:36:59 +02:00
Vladimir Krivopalov
e48331e344 Add daemon lifecycle subcommands: start, stop, reload, status
Implement CLI subcommands for managing telemt as a daemon:

- `start [config.toml]` - Start as background daemon (implies --daemon)
- `stop` - Stop running daemon by sending SIGTERM
- `reload` - Reload configuration by sending SIGHUP
- `status` - Check if daemon is running via PID file

Subcommands use the PID file (default /var/run/telemt.pid) to locate
the running daemon. Stop command waits up to 10 seconds for graceful
shutdown. Status cleans up stale PID files automatically.

Updated help text with subcommand documentation and usage examples.
Exit codes follow Unix convention: 0 for success, 1 for not running
or error.

Signed-off-by: Vladimir Krivopalov <argenet@yandex.ru>
2026-03-21 12:36:59 +02:00
Vladimir Krivopalov
dd78d4eca3 Add comprehensive Unix signal handling for daemon mode
Enhance signal handling to support proper daemon operation:

- SIGTERM: Graceful shutdown (same behavior as SIGINT)
- SIGQUIT: Graceful shutdown with full statistics dump
- SIGUSR1: Log rotation acknowledgment for external tools
- SIGUSR2: Dump runtime status to log without stopping

Statistics dump includes connection counts, ME keepalive metrics,
and relay adaptive tuning counters. SIGHUP config reload unchanged
(handled in hot_reload.rs).

Signals are handled via tokio::signal::unix with async select!
to avoid blocking the runtime. Non-shutdown signals (USR1/USR2)
run in a background task spawned at startup.

Signed-off-by: Vladimir Krivopalov <argenet@yandex.ru>
2026-03-21 12:36:59 +02:00
Vladimir Krivopalov
be2b0104fd Add Unix daemon mode with PID file and privilege dropping
Implement core daemon infrastructure for running telemt as a background
  service on Unix platforms (Linux, FreeBSD, etc.):

  - Add src/daemon module with classic double-fork daemonization
  - Implement flock-based PID file management to prevent duplicate instances
  - Add privilege dropping (setuid/setgid) after socket binding
  - New CLI flags: --daemon, --foreground, --pid-file, --run-as-user,
    --run-as-group, --working-dir

  Daemonization occurs before tokio runtime starts to ensure clean fork.
  PID file uses exclusive locking to detect already-running instances.
  Privilege dropping happens after bind_listeners() to allow binding
  to privileged ports (< 1024) before switching to unprivileged user.

Signed-off-by: Vladimir Krivopalov <argenet@yandex.ru>
2026-03-21 12:36:59 +02:00
Alexey
f3598cf309 Merge pull request #514 from M1h4n1k/patch-1
docs: fix typo in ru QUICK_START
2026-03-21 10:22:52 +03:00
Michael Karpov
777b15b1da Update section title for Docker usage
Изменено название раздела с 'Запуск в Docker Compose' на 'Запуск без Docker Compose'.
2026-03-20 22:23:36 +02:00
Alexey
99ba2f7bbc Add Shadowsocks upstream support: merge pull request #430 from hunmar/feat/shadowsocks-upstream
Add Shadowsocks upstream support
2026-03-20 18:35:28 +03:00
Maxim Myalin
e14dd07220 Merge branch 'main' into feat/shadowsocks-upstream 2026-03-20 17:08:47 +03:00
Maxim Myalin
d93a4fbd53 Merge remote-tracking branch 'origin/main' into feat/shadowsocks-upstream
# Conflicts:
#	src/tls_front/fetcher.rs
2026-03-20 17:07:47 +03:00
Alexey
2798039ab8 Merge pull request #507 from dzhus/patch-2
Fix typo in systemd service metadata
2026-03-20 17:04:41 +03:00
Alexey
342b0119dd Merge pull request #509 from telemt/bump
Update Cargo.toml
2026-03-20 16:27:39 +03:00
Alexey
2605929b93 Update Cargo.toml 2026-03-20 16:26:57 +03:00
Alexey
36814b6355 ME Draining on Dual-Stack + TLS Fetcher Upstream Selection: merge pull request #508 from telemt/flow
ME Draining on Dual-Stack + TLS Fetcher Upstream Selection
2026-03-20 16:24:17 +03:00
Alexey
269ba537ad ME Draining on Dual-Stack
Co-Authored-By: brekotis <93345790+brekotis@users.noreply.github.com>
2026-03-20 16:07:12 +03:00
Alexey
5c0eb6dbe8 TLS Fetcher Upstream Selection
Co-Authored-By: brekotis <93345790+brekotis@users.noreply.github.com>
2026-03-20 16:05:24 +03:00
Maxim Myalin
66867d3f5b Merge branch 'main' into feat/shadowsocks-upstream
# Conflicts:
#	Cargo.lock
#	src/api/runtime_stats.rs
2026-03-20 15:22:36 +03:00
Dmitry Dzhus
db36945293 Fix typo in systemd service metadata 2026-03-20 12:00:41 +00:00
Alexey
dd07fa9453 Merge pull request #505 from telemt/flow-me
Teardown Monitoring in API and Metrics
2026-03-20 12:59:39 +03:00
Alexey
bb1a372ac4 Merge branch 'main' into flow-me 2026-03-20 12:59:32 +03:00
Alexey
6661401a34 Merge pull request #506 from telemt/about-releases
Update README.md
2026-03-20 12:59:09 +03:00
Alexey
cd65fb432b Update README.md 2026-03-20 12:58:55 +03:00
Alexey
caf0717789 Merge branch 'main' into flow-me 2026-03-20 12:57:27 +03:00
Alexey
4a610d83a3 Update Cargo.toml
Co-Authored-By: brekotis <93345790+brekotis@users.noreply.github.com>
2026-03-20 12:56:13 +03:00
Alexey
aba4205dcc Teardown Monitoring in Metrics
Co-Authored-By: brekotis <93345790+brekotis@users.noreply.github.com>
2026-03-20 12:46:35 +03:00
Alexey
ef9b7b1492 Teardown Monitoring in API
Co-Authored-By: brekotis <93345790+brekotis@users.noreply.github.com>
2026-03-20 12:45:53 +03:00
Alexey
d112f15b90 ME Writers Anti-stuck + Quarantine fixes + ME Writers Advanced Cleanup + Authoritative Teardown + Orphan Watchdog + Force-Close Safery Policy: merge pull request #504 from telemt/flow-me
ME Writers Anti-stuck + Quarantine fixes + ME Writers Advanced Cleanup + Authoritative Teardown + Orphan Watchdog + Force-Close Safery Policy
2026-03-20 12:41:45 +03:00
Alexey
b55b264345 Merge branch 'main' into flow-me 2026-03-20 12:20:51 +03:00
Alexey
f61d25ebe0 Authoritative Teardown + Orphan Watchdog + Force-Close Safery Policy
Co-Authored-By: brekotis <93345790+brekotis@users.noreply.github.com>
2026-03-20 12:11:47 +03:00
Alexey
ed4d1167dd ME Writers Advanced Cleanup
Co-Authored-By: brekotis <93345790+brekotis@users.noreply.github.com>
2026-03-20 12:09:23 +03:00
Alexey
dc6948cf39 Merge pull request #502 from telemt/about-releases
Update README.md
2026-03-20 11:25:19 +03:00
Alexey
4f11aa0772 Update README.md 2026-03-20 11:25:07 +03:00
Alexey
e40361b171 Cargo.toml + Cargo.lock
Co-Authored-By: brekotis <93345790+brekotis@users.noreply.github.com>
2026-03-20 00:45:04 +03:00
Alexey
1c6c73beda ME Writers Anti-stuck and Quarantine fixes
Co-Authored-By: Nook Scheel <nook@live.ru>
2026-03-20 00:41:40 +03:00
Alexey
67dc1e8d18 Merge pull request #498 from telemt/bump
Update Cargo.toml
2026-03-19 18:25:14 +03:00
Alexey
ad8ada33c9 Update Cargo.toml 2026-03-19 18:24:01 +03:00
Alexey
bbb201b433 Instadrain + Hard-remove for long draining-state: merge pull request #497 from telemt/flow-stuck-writer
Instadrain + Hard-remove for long draining-state
2026-03-19 18:23:38 +03:00
Alexey
8d1faece60 Instadrain + Hard-remove for long draining-state
Co-Authored-By: brekotis <93345790+brekotis@users.noreply.github.com>
2026-03-19 17:45:17 +03:00
Alexey
a603505f90 Merge pull request #492 from temandroid/main
fix(docker): expose port 9091 and allow external API access
2026-03-19 17:32:49 +03:00
Alexey
f8c42c324f Merge pull request #494 from Dimasssss/patch-1
Update install.sh
2026-03-19 17:32:05 +03:00
Alexey
dd8ef4d996 Merge branch 'main' into feat/shadowsocks-upstream 2026-03-19 17:19:01 +03:00
Dimasssss
dc3363aa0d Update install.sh 2026-03-19 16:23:32 +03:00
Alexey
f655924323 Update health.rs
Co-Authored-By: brekotis <93345790+brekotis@users.noreply.github.com>
2026-03-19 16:15:00 +03:00
TEMAndroid
05c066c676 fix(docker): expose port 9091 and allow external API access
Add 9091 port mapping to compose.yml to make the REST API reachable
from outside the container. Previously only port 9090 (metrics) was
published, making the documented curl commands non-functional.

fixes #434
2026-03-19 15:54:01 +03:00
Maxim Myalin
062464175e Merge branch 'main' into feat/shadowsocks-upstream 2026-03-18 12:38:23 +03:00
Maxim Myalin
a5983c17d3 Add Docker build context ignore file 2026-03-18 12:36:48 +03:00
Maxim Myalin
def42f0baa Add Shadowsocks upstream support 2026-03-18 12:36:44 +03:00
54 changed files with 6282 additions and 1000 deletions

8
.dockerignore Normal file
View File

@@ -0,0 +1,8 @@
.git
.github
target
.kilocode
cache
tlsfront
*.tar
*.tar.gz

916
Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@@ -1,6 +1,6 @@
[package]
name = "telemt"
version = "3.3.24"
version = "3.3.28"
edition = "2024"
[dependencies]
@@ -25,7 +25,8 @@ zeroize = { version = "1.8", features = ["derive"] }
# Network
socket2 = { version = "0.5", features = ["all"] }
nix = { version = "0.28", default-features = false, features = ["net"] }
nix = { version = "0.28", default-features = false, features = ["net", "user", "process", "fs", "signal"] }
shadowsocks = { version = "1.24", features = ["aead-cipher-2022"] }
# Serialization
serde = { version = "1.0", features = ["derive"] }
@@ -38,8 +39,10 @@ bytes = "1.9"
thiserror = "2.0"
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
tracing-appender = "0.2"
parking_lot = "0.12"
dashmap = "5.5"
arc-swap = "1.7"
lru = "0.16"
rand = "0.9"
chrono = { version = "0.4", features = ["serde"] }

View File

@@ -38,6 +38,7 @@ USER telemt
EXPOSE 443
EXPOSE 9090
EXPOSE 9091
ENTRYPOINT ["/app/telemt"]
CMD ["config.toml"]

View File

@@ -19,9 +19,9 @@
### 🇷🇺 RU
#### Релиз 3.3.15 Semistable
#### О релизах
[3.3.15](https://github.com/telemt/telemt/releases/tag/3.3.15) по итогам работы в продакшн признан одним из самых стабильных и рекомендуется к использованию, когда cutting-edge фичи некритичны!
[3.3.27](https://github.com/telemt/telemt/releases/tag/3.3.27) даёт баланс стабильности и передового функционала, а так же последние исправления по безопасности и багам
Будем рады вашему фидбеку и предложениям по улучшению — особенно в части **API**, **статистики**, **UX**
@@ -40,9 +40,9 @@
### 🇬🇧 EN
#### Release 3.3.15 Semistable
#### About releases
[3.3.15](https://github.com/telemt/telemt/releases/tag/3.3.15) is, based on the results of his work in production, recognized as one of the most stable and recommended for use when cutting-edge features are not so necessary!
[3.3.27](https://github.com/telemt/telemt/releases/tag/3.3.27) provides a balance of stability and advanced functionality, as well as the latest security and bug fixes
We are looking forward to your feedback and improvement proposals — especially regarding **API**, **statistics**, **UX**

View File

@@ -7,6 +7,7 @@ services:
ports:
- "443:443"
- "127.0.0.1:9090:9090"
- "127.0.0.1:9091:9091"
# Allow caching 'proxy-secret' in read-only container
working_dir: /run/telemt
volumes:

View File

@@ -497,13 +497,14 @@ Note: the request contract is defined, but the corresponding route currently ret
| `direct_total` | `usize` | Direct-route upstream entries. |
| `socks4_total` | `usize` | SOCKS4 upstream entries. |
| `socks5_total` | `usize` | SOCKS5 upstream entries. |
| `shadowsocks_total` | `usize` | Shadowsocks upstream entries. |
#### `RuntimeUpstreamQualityUpstreamData`
| Field | Type | Description |
| --- | --- | --- |
| `upstream_id` | `usize` | Runtime upstream index. |
| `route_kind` | `string` | `direct`, `socks4`, `socks5`. |
| `address` | `string` | Upstream address (`direct` literal for direct route kind). |
| `route_kind` | `string` | `direct`, `socks4`, `socks5`, `shadowsocks`. |
| `address` | `string` | Upstream address (`direct` literal for direct route kind, `host:port` only for proxied upstreams). |
| `weight` | `u16` | Selection weight. |
| `scopes` | `string` | Configured scope selector. |
| `healthy` | `bool` | Current health flag. |
@@ -757,13 +758,14 @@ Note: the request contract is defined, but the corresponding route currently ret
| `direct_total` | `usize` | Number of direct upstream entries. |
| `socks4_total` | `usize` | Number of SOCKS4 upstream entries. |
| `socks5_total` | `usize` | Number of SOCKS5 upstream entries. |
| `shadowsocks_total` | `usize` | Number of Shadowsocks upstream entries. |
#### `UpstreamStatus`
| Field | Type | Description |
| --- | --- | --- |
| `upstream_id` | `usize` | Runtime upstream index. |
| `route_kind` | `string` | Upstream route kind: `direct`, `socks4`, `socks5`. |
| `address` | `string` | Upstream address (`direct` for direct route kind). Authentication fields are intentionally omitted. |
| `route_kind` | `string` | Upstream route kind: `direct`, `socks4`, `socks5`, `shadowsocks`. |
| `address` | `string` | Upstream address (`direct` for direct route kind, `host:port` for Shadowsocks). Authentication fields are intentionally omitted. |
| `weight` | `u16` | Selection weight. |
| `scopes` | `string` | Configured scope selector string. |
| `healthy` | `bool` | Current health flag. |

View File

@@ -120,3 +120,17 @@ password = "pass" # Password for Auth on SOCKS-server
weight = 1 # Set Weight for Scenarios
enabled = true
```
#### Shadowsocks as Upstream
Requires `use_middle_proxy = false`.
```toml
[general]
use_middle_proxy = false
[[upstreams]]
type = "shadowsocks"
url = "ss://2022-blake3-aes-256-gcm:BASE64_KEY@1.2.3.4:8388"
weight = 1
enabled = true
```

View File

@@ -121,3 +121,16 @@ weight = 1 # Set Weight for Scenarios
enabled = true
```
#### Shadowsocks как Upstream
Требует `use_middle_proxy = false`.
```toml
[general]
use_middle_proxy = false
[[upstreams]]
type = "shadowsocks"
url = "ss://2022-blake3-aes-256-gcm:BASE64_KEY@1.2.3.4:8388"
weight = 1
enabled = true
```

View File

@@ -181,6 +181,8 @@ docker compose down
docker build -t telemt:local .
docker run --name telemt --restart unless-stopped \
-p 443:443 \
-p 9090:9090 \
-p 9091:9091 \
-e RUST_LOG=info \
-v "$PWD/config.toml:/app/config.toml:ro" \
--read-only \

View File

@@ -178,11 +178,13 @@ docker compose down
> - По умолчанию публикуются порты 443:443, а контейнер запускается со сброшенными привилегиями (добавлена только `NET_BIND_SERVICE`)
> - Если вам действительно нужна сеть хоста (обычно это требуется только для некоторых конфигураций IPv6), раскомментируйте `network_mode: host`
**Запуск в Docker Compose**
**Запуск без Docker Compose**
```bash
docker build -t telemt:local .
docker run --name telemt --restart unless-stopped \
-p 443:443 \
-p 9090:9090 \
-p 9091:9091 \
-e RUST_LOG=info \
-v "$PWD/config.toml:/app/config.toml:ro" \
--read-only \

View File

@@ -82,7 +82,7 @@ Die unten angegebenen `Default`-Werte sind Code-Defaults (bei fehlendem Schlüss
| Feld | Gilt für | Typ | Pflicht | Default | Bedeutung |
|---|---|---|---|---|---|
| `[[upstreams]].type` | alle Upstreams | `"direct" \| "socks4" \| "socks5"` | ja | n/a | Upstream-Transporttyp. |
| `[[upstreams]].type` | alle Upstreams | `"direct" \| "socks4" \| "socks5" \| "shadowsocks"` | ja | n/a | Upstream-Transporttyp. |
| `[[upstreams]].weight` | alle Upstreams | `u16` | nein | `1` | Basisgewicht für weighted-random Auswahl. |
| `[[upstreams]].enabled` | alle Upstreams | `bool` | nein | `true` | Deaktivierte Einträge werden beim Start ignoriert. |
| `[[upstreams]].scopes` | alle Upstreams | `String` | nein | `""` | Komma-separierte Scope-Tags für Request-Routing. |
@@ -95,6 +95,8 @@ Die unten angegebenen `Default`-Werte sind Code-Defaults (bei fehlendem Schlüss
| `interface` | `socks5` | `Option<String>` | nein | `null` | Wird nur genutzt, wenn `address` als `ip:port` angegeben ist. |
| `username` | `socks5` | `Option<String>` | nein | `null` | SOCKS5 Benutzername. |
| `password` | `socks5` | `Option<String>` | nein | `null` | SOCKS5 Passwort. |
| `url` | `shadowsocks` | `String` | ja | n/a | Shadowsocks-SIP002-URL (`ss://...`). In Runtime-APIs wird nur `host:port` offengelegt. |
| `interface` | `shadowsocks` | `Option<String>` | nein | `null` | Optionales ausgehendes Bind-Interface oder lokale Literal-IP. |
### Runtime-Regeln (wichtig)
@@ -115,6 +117,7 @@ Die unten angegebenen `Default`-Werte sind Code-Defaults (bei fehlendem Schlüss
8. Im ME-Modus wird der gewählte Upstream auch für den ME-TCP-Dial-Pfad verwendet.
9. Im ME-Modus ist bei `direct` mit bind/interface die STUN-Reflection bind-aware für KDF-Adressmaterial.
10. Im ME-Modus werden bei SOCKS-Upstream `BND.ADDR/BND.PORT` für KDF verwendet, wenn gültig/öffentlich und gleiche IP-Familie.
11. `shadowsocks`-Upstreams erfordern `general.use_middle_proxy = false`. Mit aktiviertem ME-Modus schlägt das Laden der Config sofort fehl.
## Upstream-Konfigurationsbeispiele
@@ -150,7 +153,20 @@ weight = 2
enabled = true
```
### Beispiel 4: Gemischte Upstreams mit Scopes
### Beispiel 4: Shadowsocks-Upstream
```toml
[general]
use_middle_proxy = false
[[upstreams]]
type = "shadowsocks"
url = "ss://2022-blake3-aes-256-gcm:BASE64_KEY@198.51.100.50:8388"
weight = 2
enabled = true
```
### Beispiel 5: Gemischte Upstreams mit Scopes
```toml
[[upstreams]]

View File

@@ -82,7 +82,7 @@ Defaults below are code defaults (used when a key is omitted), not necessarily v
| Field | Applies to | Type | Required | Default | Meaning |
|---|---|---|---|---|---|
| `[[upstreams]].type` | all upstreams | `"direct" \| "socks4" \| "socks5"` | yes | n/a | Upstream transport type. |
| `[[upstreams]].type` | all upstreams | `"direct" \| "socks4" \| "socks5" \| "shadowsocks"` | yes | n/a | Upstream transport type. |
| `[[upstreams]].weight` | all upstreams | `u16` | no | `1` | Base weight for weighted-random selection. |
| `[[upstreams]].enabled` | all upstreams | `bool` | no | `true` | Disabled entries are ignored at startup. |
| `[[upstreams]].scopes` | all upstreams | `String` | no | `""` | Comma-separated scope tags for request-level routing. |
@@ -95,6 +95,8 @@ Defaults below are code defaults (used when a key is omitted), not necessarily v
| `interface` | `socks5` | `Option<String>` | no | `null` | Used only for SOCKS server `ip:port` dial path. |
| `username` | `socks5` | `Option<String>` | no | `null` | SOCKS5 username auth. |
| `password` | `socks5` | `Option<String>` | no | `null` | SOCKS5 password auth. |
| `url` | `shadowsocks` | `String` | yes | n/a | Shadowsocks SIP002 URL (`ss://...`). Only `host:port` is exposed in runtime APIs. |
| `interface` | `shadowsocks` | `Option<String>` | no | `null` | Optional outgoing bind interface or literal local IP. |
### Runtime rules (important)
@@ -115,6 +117,7 @@ Defaults below are code defaults (used when a key is omitted), not necessarily v
8. In ME mode, the selected upstream is also used for ME TCP dial path.
9. In ME mode for `direct` upstream with bind/interface, STUN reflection logic is bind-aware for KDF source material.
10. In ME mode for SOCKS upstream, SOCKS `BND.ADDR/BND.PORT` is used for KDF when it is valid/public for the same family.
11. `shadowsocks` upstreams require `general.use_middle_proxy = false`. Config load fails fast if ME mode is enabled.
## Upstream Configuration Examples
@@ -150,7 +153,20 @@ weight = 2
enabled = true
```
### Example 4: Mixed upstreams with scopes
### Example 4: Shadowsocks upstream
```toml
[general]
use_middle_proxy = false
[[upstreams]]
type = "shadowsocks"
url = "ss://2022-blake3-aes-256-gcm:BASE64_KEY@198.51.100.50:8388"
weight = 2
enabled = true
```
### Example 5: Mixed upstreams with scopes
```toml
[[upstreams]]

View File

@@ -82,7 +82,7 @@
| Поле | Применимость | Тип | Обязательно | Default | Назначение |
|---|---|---|---|---|---|
| `[[upstreams]].type` | все upstream | `"direct" \| "socks4" \| "socks5"` | да | n/a | Тип upstream транспорта. |
| `[[upstreams]].type` | все upstream | `"direct" \| "socks4" \| "socks5" \| "shadowsocks"` | да | n/a | Тип upstream транспорта. |
| `[[upstreams]].weight` | все upstream | `u16` | нет | `1` | Базовый вес в weighted-random выборе. |
| `[[upstreams]].enabled` | все upstream | `bool` | нет | `true` | Выключенные записи игнорируются на старте. |
| `[[upstreams]].scopes` | все upstream | `String` | нет | `""` | Список scope-токенов через запятую для маршрутизации. |
@@ -95,6 +95,8 @@
| `interface` | `socks5` | `Option<String>` | нет | `null` | Используется только если `address` задан как `ip:port`. |
| `username` | `socks5` | `Option<String>` | нет | `null` | Логин SOCKS5 auth. |
| `password` | `socks5` | `Option<String>` | нет | `null` | Пароль SOCKS5 auth. |
| `url` | `shadowsocks` | `String` | да | n/a | Shadowsocks SIP002 URL (`ss://...`). В runtime API раскрывается только `host:port`. |
| `interface` | `shadowsocks` | `Option<String>` | нет | `null` | Необязательный исходящий bind-интерфейс или literal локальный IP. |
### Runtime-правила
@@ -115,6 +117,7 @@
8. В ME-режиме выбранный upstream также используется для ME TCP dial path.
9. В ME-режиме для `direct` upstream с bind/interface STUN-рефлексия выполняется bind-aware для KDF материала.
10. В ME-режиме для SOCKS upstream используются `BND.ADDR/BND.PORT` для KDF, если адрес валиден/публичен и соответствует IP family.
11. `shadowsocks` upstream требует `general.use_middle_proxy = false`. При включенном ME-режиме конфиг отклоняется при загрузке.
## Примеры конфигурации Upstreams
@@ -150,7 +153,20 @@ weight = 2
enabled = true
```
### Пример 4: смешанные upstream с scopes
### Пример 4: Shadowsocks upstream
```toml
[general]
use_middle_proxy = false
[[upstreams]]
type = "shadowsocks"
url = "ss://2022-blake3-aes-256-gcm:BASE64_KEY@198.51.100.50:8388"
weight = 2
enabled = true
```
### Пример 5: смешанные upstream с scopes
```toml
[[upstreams]]

View File

@@ -1,154 +1,190 @@
#!/bin/sh
set -eu
# --- Global Configurations ---
REPO="${REPO:-telemt/telemt}"
BIN_NAME="${BIN_NAME:-telemt}"
INSTALL_DIR="${INSTALL_DIR:-/bin}"
CONFIG_DIR="${CONFIG_DIR:-/etc/telemt}"
CONFIG_FILE="${CONFIG_FILE:-${CONFIG_DIR}/telemt.toml}"
WORK_DIR="${WORK_DIR:-/opt/telemt}"
TLS_DOMAIN="${TLS_DOMAIN:-petrovich.ru}"
SERVICE_NAME="telemt"
TEMP_DIR=""
SUDO=""
CONFIG_PARENT_DIR=""
SERVICE_START_FAILED=0
# --- Argument Parsing ---
ACTION="install"
TARGET_VERSION="${VERSION:-latest}"
while [ $# -gt 0 ]; do
case "$1" in
-h|--help)
ACTION="help"
shift
;;
-h|--help) ACTION="help"; shift ;;
uninstall|--uninstall)
[ "$ACTION" != "purge" ] && ACTION="uninstall"
shift
;;
--purge)
ACTION="purge"
shift
;;
install|--install)
ACTION="install"
shift
;;
-*)
printf '[ERROR] Unknown option: %s\n' "$1" >&2
exit 1
;;
if [ "$ACTION" != "purge" ]; then ACTION="uninstall"; fi
shift ;;
purge|--purge) ACTION="purge"; shift ;;
install|--install) ACTION="install"; shift ;;
-*) printf '[ERROR] Unknown option: %s\n' "$1" >&2; exit 1 ;;
*)
if [ "$ACTION" = "install" ]; then
TARGET_VERSION="$1"
fi
shift
;;
if [ "$ACTION" = "install" ]; then TARGET_VERSION="$1"
else printf '[WARNING] Ignoring extra argument: %s\n' "$1" >&2; fi
shift ;;
esac
done
# --- Core Functions ---
say() { printf '[INFO] %s\n' "$*"; }
say() {
if [ "$#" -eq 0 ] || [ -z "${1:-}" ]; then
printf '\n'
else
printf '[INFO] %s\n' "$*"
fi
}
die() { printf '[ERROR] %s\n' "$*" >&2; exit 1; }
write_root() { $SUDO sh -c 'cat > "$1"' _ "$1"; }
cleanup() {
if [ -n "${TEMP_DIR:-}" ] && [ -d "$TEMP_DIR" ]; then
rm -rf -- "$TEMP_DIR"
fi
}
trap cleanup EXIT INT TERM
show_help() {
say "Usage: $0 [version | install | uninstall | --purge | --help]"
say " version Install specific version (e.g. 1.0.0, default: latest)"
say " uninstall Remove the binary and service (keeps config)"
say " --purge Remove everything including configuration"
say "Usage: $0 [ <version> | install | uninstall | purge | --help ]"
say " <version> Install specific version (e.g. 3.3.15, default: latest)"
say " install Install the latest version"
say " uninstall Remove the binary and service (keeps config and user)"
say " purge Remove everything including configuration, data, and user"
exit 0
}
user_exists() {
if command -v getent >/dev/null 2>&1; then
getent passwd "$1" >/dev/null 2>&1
check_os_entity() {
if command -v getent >/dev/null 2>&1; then getent "$1" "$2" >/dev/null 2>&1
else grep -q "^${2}:" "/etc/$1" 2>/dev/null; fi
}
normalize_path() {
printf '%s\n' "$1" | tr -s '/' | sed 's|/$||; s|^$|/|'
}
get_realpath() {
path_in="$1"
case "$path_in" in /*) ;; *) path_in="$(pwd)/$path_in" ;; esac
if command -v realpath >/dev/null 2>&1; then
if realpath_out="$(realpath -m "$path_in" 2>/dev/null)"; then
printf '%s\n' "$realpath_out"
return
fi
fi
if command -v readlink >/dev/null 2>&1; then
resolved_path="$(readlink -f "$path_in" 2>/dev/null || true)"
if [ -n "$resolved_path" ]; then
printf '%s\n' "$resolved_path"
return
fi
fi
d="${path_in%/*}"; b="${path_in##*/}"
if [ -z "$d" ]; then d="/"; fi
if [ "$d" = "$path_in" ]; then d="/"; b="$path_in"; fi
if [ -d "$d" ]; then
abs_d="$(cd "$d" >/dev/null 2>&1 && pwd || true)"
if [ -n "$abs_d" ]; then
if [ "$b" = "." ] || [ -z "$b" ]; then printf '%s\n' "$abs_d"
elif [ "$abs_d" = "/" ]; then printf '/%s\n' "$b"
else printf '%s/%s\n' "$abs_d" "$b"; fi
else
normalize_path "$path_in"
fi
else
grep -q "^${1}:" /etc/passwd 2>/dev/null
normalize_path "$path_in"
fi
}
group_exists() {
if command -v getent >/dev/null 2>&1; then
getent group "$1" >/dev/null 2>&1
else
grep -q "^${1}:" /etc/group 2>/dev/null
fi
get_svc_mgr() {
if command -v systemctl >/dev/null 2>&1 && [ -d /run/systemd/system ]; then echo "systemd"
elif command -v rc-service >/dev/null 2>&1; then echo "openrc"
else echo "none"; fi
}
verify_common() {
[ -z "$BIN_NAME" ] && die "BIN_NAME cannot be empty."
[ -z "$INSTALL_DIR" ] && die "INSTALL_DIR cannot be empty."
[ -z "$CONFIG_DIR" ] && die "CONFIG_DIR cannot be empty."
[ -n "$BIN_NAME" ] || die "BIN_NAME cannot be empty."
[ -n "$INSTALL_DIR" ] || die "INSTALL_DIR cannot be empty."
[ -n "$CONFIG_DIR" ] || die "CONFIG_DIR cannot be empty."
[ -n "$CONFIG_FILE" ] || die "CONFIG_FILE cannot be empty."
case "${INSTALL_DIR}${CONFIG_DIR}${WORK_DIR}${CONFIG_FILE}" in
*[!a-zA-Z0-9_./-]*) die "Invalid characters in paths. Only alphanumeric, _, ., -, and / allowed." ;;
esac
case "$TARGET_VERSION" in *[!a-zA-Z0-9_.-]*) die "Invalid characters in version." ;; esac
case "$BIN_NAME" in *[!a-zA-Z0-9_-]*) die "Invalid characters in BIN_NAME." ;; esac
INSTALL_DIR="$(get_realpath "$INSTALL_DIR")"
CONFIG_DIR="$(get_realpath "$CONFIG_DIR")"
WORK_DIR="$(get_realpath "$WORK_DIR")"
CONFIG_FILE="$(get_realpath "$CONFIG_FILE")"
CONFIG_PARENT_DIR="${CONFIG_FILE%/*}"
if [ -z "$CONFIG_PARENT_DIR" ]; then CONFIG_PARENT_DIR="/"; fi
if [ "$CONFIG_PARENT_DIR" = "$CONFIG_FILE" ]; then CONFIG_PARENT_DIR="."; fi
if [ "$(id -u)" -eq 0 ]; then
SUDO=""
else
if ! command -v sudo >/dev/null 2>&1; then
die "This script requires root or sudo. Neither found."
fi
command -v sudo >/dev/null 2>&1 || die "This script requires root or sudo. Neither found."
SUDO="sudo"
say "sudo is available. Caching credentials..."
if ! sudo -v; then
die "Failed to cache sudo credentials"
if ! sudo -n true 2>/dev/null; then
if ! [ -t 0 ]; then
die "sudo requires a password, but no TTY detected. Aborting to prevent hang."
fi
fi
fi
case "${INSTALL_DIR}${CONFIG_DIR}${WORK_DIR}" in
*[!a-zA-Z0-9_./-]*)
die "Invalid characters in path variables. Only alphanumeric, _, ., -, and / are allowed."
;;
esac
case "$BIN_NAME" in
*[!a-zA-Z0-9_-]*) die "Invalid characters in BIN_NAME: $BIN_NAME" ;;
esac
for path in "$CONFIG_DIR" "$WORK_DIR"; do
check_path="$path"
while [ "$check_path" != "/" ] && [ "${check_path%"/"}" != "$check_path" ]; do
check_path="${check_path%"/"}"
done
[ -z "$check_path" ] && check_path="/"
if [ -n "$SUDO" ]; then
if $SUDO sh -c '[ -d "$1" ]' _ "$CONFIG_FILE"; then
die "Safety check failed: CONFIG_FILE '$CONFIG_FILE' is a directory."
fi
elif [ -d "$CONFIG_FILE" ]; then
die "Safety check failed: CONFIG_FILE '$CONFIG_FILE' is a directory."
fi
for path in "$CONFIG_DIR" "$CONFIG_PARENT_DIR" "$WORK_DIR"; do
check_path="$(get_realpath "$path")"
case "$check_path" in
/|/bin|/sbin|/usr|/usr/bin|/usr/local|/etc|/opt|/var|/home|/root|/tmp)
die "Safety check failed: '$path' is a critical system directory."
;;
/|/bin|/sbin|/usr|/usr/bin|/usr/sbin|/usr/local|/usr/local/bin|/usr/local/sbin|/usr/local/etc|/usr/local/share|/etc|/var|/var/lib|/var/log|/var/run|/home|/root|/tmp|/lib|/lib64|/opt|/run|/boot|/dev|/sys|/proc)
die "Safety check failed: '$path' (resolved to '$check_path') is a critical system directory." ;;
esac
done
for cmd in uname grep find rm chown chmod mv head mktemp; do
check_install_dir="$(get_realpath "$INSTALL_DIR")"
case "$check_install_dir" in
/|/etc|/var|/home|/root|/tmp|/usr|/usr/local|/opt|/boot|/dev|/sys|/proc|/run)
die "Safety check failed: INSTALL_DIR '$INSTALL_DIR' is a critical system directory." ;;
esac
for cmd in id uname grep find rm chown chmod mv mktemp mkdir tr dd sed ps head sleep cat tar gzip rmdir; do
command -v "$cmd" >/dev/null 2>&1 || die "Required command not found: $cmd"
done
}
verify_install_deps() {
if ! command -v curl >/dev/null 2>&1 && ! command -v wget >/dev/null 2>&1; then
die "Neither curl nor wget is installed."
fi
command -v tar >/dev/null 2>&1 || die "Required command not found: tar"
command -v gzip >/dev/null 2>&1 || die "Required command not found: gzip"
command -v curl >/dev/null 2>&1 || command -v wget >/dev/null 2>&1 || die "Neither curl nor wget is installed."
command -v cp >/dev/null 2>&1 || command -v install >/dev/null 2>&1 || die "Need cp or install"
if ! command -v setcap >/dev/null 2>&1; then
say "setcap is missing. Installing required capability tools..."
if command -v apk >/dev/null 2>&1; then
$SUDO apk add --no-cache libcap || die "Failed to install libcap"
$SUDO apk add --no-cache libcap-utils >/dev/null 2>&1 || $SUDO apk add --no-cache libcap >/dev/null 2>&1 || true
elif command -v apt-get >/dev/null 2>&1; then
$SUDO apt-get update -qq && $SUDO apt-get install -y -qq libcap2-bin || die "Failed to install libcap2-bin"
elif command -v dnf >/dev/null 2>&1 || command -v yum >/dev/null 2>&1; then
$SUDO ${YUM_CMD:-yum} install -y -q libcap || die "Failed to install libcap"
else
die "Cannot install 'setcap'. Package manager not found. Please install libcap manually."
$SUDO apt-get update -q >/dev/null 2>&1 || true
$SUDO apt-get install -y -q libcap2-bin >/dev/null 2>&1 || true
elif command -v dnf >/dev/null 2>&1; then $SUDO dnf install -y -q libcap >/dev/null 2>&1 || true
elif command -v yum >/dev/null 2>&1; then $SUDO yum install -y -q libcap >/dev/null 2>&1 || true
fi
fi
}
@@ -163,122 +199,96 @@ detect_arch() {
}
detect_libc() {
if command -v ldd >/dev/null 2>&1 && ldd --version 2>&1 | grep -qi musl; then
echo "musl"; return 0
fi
if grep -q '^ID=alpine' /etc/os-release 2>/dev/null || grep -q '^ID="alpine"' /etc/os-release 2>/dev/null; then
echo "musl"; return 0
fi
for f in /lib/ld-musl-*.so.* /lib64/ld-musl-*.so.*; do
if [ -e "$f" ]; then
echo "musl"; return 0
fi
if [ -e "$f" ]; then echo "musl"; return 0; fi
done
if grep -qE '^ID="?alpine"?' /etc/os-release 2>/dev/null; then echo "musl"; return 0; fi
if command -v ldd >/dev/null 2>&1 && (ldd --version 2>&1 || true) | grep -qi musl; then echo "musl"; return 0; fi
echo "gnu"
}
fetch_file() {
fetch_url="$1"
fetch_out="$2"
if command -v curl >/dev/null 2>&1; then
curl -fsSL "$fetch_url" -o "$fetch_out" || return 1
elif command -v wget >/dev/null 2>&1; then
wget -qO "$fetch_out" "$fetch_url" || return 1
else
die "curl or wget required"
fi
if command -v curl >/dev/null 2>&1; then curl -fsSL "$1" -o "$2"
else wget -q -O "$2" "$1"; fi
}
ensure_user_group() {
nologin_bin="/bin/false"
nologin_bin="$(command -v nologin 2>/dev/null || command -v false 2>/dev/null || echo /bin/false)"
cmd_nologin="$(command -v nologin 2>/dev/null || true)"
if [ -n "$cmd_nologin" ] && [ -x "$cmd_nologin" ]; then
nologin_bin="$cmd_nologin"
else
for bin in /sbin/nologin /usr/sbin/nologin; do
if [ -x "$bin" ]; then
nologin_bin="$bin"
break
fi
done
if ! check_os_entity group telemt; then
if command -v groupadd >/dev/null 2>&1; then $SUDO groupadd -r telemt
elif command -v addgroup >/dev/null 2>&1; then $SUDO addgroup -S telemt
else die "Cannot create group"; fi
fi
if ! group_exists telemt; then
if command -v groupadd >/dev/null 2>&1; then
$SUDO groupadd -r telemt || die "Failed to create group via groupadd"
elif command -v addgroup >/dev/null 2>&1; then
$SUDO addgroup -S telemt || die "Failed to create group via addgroup"
else
die "Cannot create group: neither groupadd nor addgroup found"
fi
fi
if ! user_exists telemt; then
if ! check_os_entity passwd telemt; then
if command -v useradd >/dev/null 2>&1; then
$SUDO useradd -r -g telemt -d "$WORK_DIR" -s "$nologin_bin" -c "Telemt Proxy" telemt || die "Failed to create user via useradd"
$SUDO useradd -r -g telemt -d "$WORK_DIR" -s "$nologin_bin" -c "Telemt Proxy" telemt
elif command -v adduser >/dev/null 2>&1; then
$SUDO adduser -S -D -H -h "$WORK_DIR" -s "$nologin_bin" -G telemt telemt || die "Failed to create user via adduser"
else
die "Cannot create user: neither useradd nor adduser found"
fi
if adduser --help 2>&1 | grep -q -- '-S'; then
$SUDO adduser -S -D -H -h "$WORK_DIR" -s "$nologin_bin" -G telemt telemt
else
$SUDO adduser --system --home "$WORK_DIR" --shell "$nologin_bin" --no-create-home --ingroup telemt --disabled-password telemt
fi
else die "Cannot create user"; fi
fi
}
setup_dirs() {
say "Setting up directories..."
$SUDO mkdir -p "$WORK_DIR" "$CONFIG_DIR" || die "Failed to create directories"
$SUDO chown telemt:telemt "$WORK_DIR" || die "Failed to set owner on WORK_DIR"
$SUDO chmod 750 "$WORK_DIR" || die "Failed to set permissions on WORK_DIR"
$SUDO mkdir -p "$WORK_DIR" "$CONFIG_DIR" "$CONFIG_PARENT_DIR" || die "Failed to create directories"
$SUDO chown telemt:telemt "$WORK_DIR" && $SUDO chmod 750 "$WORK_DIR"
$SUDO chown root:telemt "$CONFIG_DIR" && $SUDO chmod 750 "$CONFIG_DIR"
if [ "$CONFIG_PARENT_DIR" != "$CONFIG_DIR" ] && [ "$CONFIG_PARENT_DIR" != "." ] && [ "$CONFIG_PARENT_DIR" != "/" ]; then
$SUDO chown root:telemt "$CONFIG_PARENT_DIR" && $SUDO chmod 750 "$CONFIG_PARENT_DIR"
fi
}
stop_service() {
say "Stopping service if running..."
if command -v systemctl >/dev/null 2>&1 && [ -d /run/systemd/system ]; then
svc="$(get_svc_mgr)"
if [ "$svc" = "systemd" ] && systemctl is-active --quiet "$SERVICE_NAME" 2>/dev/null; then
$SUDO systemctl stop "$SERVICE_NAME" 2>/dev/null || true
elif command -v rc-service >/dev/null 2>&1; then
elif [ "$svc" = "openrc" ] && rc-service "$SERVICE_NAME" status >/dev/null 2>&1; then
$SUDO rc-service "$SERVICE_NAME" stop 2>/dev/null || true
fi
}
install_binary() {
bin_src="$1"
bin_dst="$2"
bin_src="$1"; bin_dst="$2"
if [ -e "$INSTALL_DIR" ] && [ ! -d "$INSTALL_DIR" ]; then
die "'$INSTALL_DIR' is not a directory."
fi
$SUDO mkdir -p "$INSTALL_DIR" || die "Failed to create install directory"
if command -v install >/dev/null 2>&1; then
$SUDO install -m 0755 "$bin_src" "$bin_dst" || die "Failed to install binary"
else
$SUDO rm -f "$bin_dst"
$SUDO cp "$bin_src" "$bin_dst" || die "Failed to copy binary"
$SUDO chmod 0755 "$bin_dst" || die "Failed to set permissions"
$SUDO rm -f "$bin_dst" 2>/dev/null || true
$SUDO cp "$bin_src" "$bin_dst" && $SUDO chmod 0755 "$bin_dst" || die "Failed to copy binary"
fi
if [ ! -x "$bin_dst" ]; then
die "Failed to install binary or it is not executable: $bin_dst"
fi
$SUDO sh -c '[ -x "$1" ]' _ "$bin_dst" || die "Binary not executable: $bin_dst"
say "Granting network bind capabilities to bind port 443..."
if ! $SUDO setcap cap_net_bind_service=+ep "$bin_dst" 2>/dev/null; then
say "[WARNING] Failed to apply setcap. The service will NOT be able to open port 443!"
say "[WARNING] This usually happens inside unprivileged Docker/LXC containers."
if command -v setcap >/dev/null 2>&1; then
$SUDO setcap cap_net_bind_service=+ep "$bin_dst" 2>/dev/null || true
fi
}
generate_secret() {
if command -v openssl >/dev/null 2>&1; then
secret="$(openssl rand -hex 16 2>/dev/null)" && [ -n "$secret" ] && { echo "$secret"; return 0; }
secret="$(command -v openssl >/dev/null 2>&1 && openssl rand -hex 16 2>/dev/null || true)"
if [ -z "$secret" ] || [ "${#secret}" -ne 32 ]; then
if command -v od >/dev/null 2>&1; then secret="$(dd if=/dev/urandom bs=16 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')"
elif command -v hexdump >/dev/null 2>&1; then secret="$(dd if=/dev/urandom bs=16 count=1 2>/dev/null | hexdump -e '1/1 "%02x"')"
elif command -v xxd >/dev/null 2>&1; then secret="$(dd if=/dev/urandom bs=16 count=1 2>/dev/null | xxd -p | tr -d '\n')"
fi
fi
if command -v xxd >/dev/null 2>&1; then
secret="$(dd if=/dev/urandom bs=1 count=16 2>/dev/null | xxd -p | tr -d '\n')" && [ -n "$secret" ] && { echo "$secret"; return 0; }
fi
secret="$(dd if=/dev/urandom bs=1 count=16 2>/dev/null | od -An -tx1 | tr -d ' \n')" && [ -n "$secret" ] && { echo "$secret"; return 0; }
return 1
if [ "${#secret}" -eq 32 ]; then echo "$secret"; else return 1; fi
}
generate_config_content() {
escaped_tls_domain="$(printf '%s\n' "$TLS_DOMAIN" | tr -d '[:cntrl:]' | sed 's/\\/\\\\/g; s/"/\\"/g')"
cat <<EOF
[general]
use_middle_proxy = false
@@ -297,7 +307,7 @@ listen = "127.0.0.1:9091"
whitelist = ["127.0.0.1/32"]
[censorship]
tls_domain = "petrovich.ru"
tls_domain = "${escaped_tls_domain}"
[access.users]
hello = "$1"
@@ -305,44 +315,38 @@ EOF
}
install_config() {
config_exists=0
if [ -n "$SUDO" ]; then
$SUDO sh -c "[ -f '$CONFIG_FILE' ]" 2>/dev/null && config_exists=1 || true
else
[ -f "$CONFIG_FILE" ] && config_exists=1 || true
fi
if [ "$config_exists" -eq 1 ]; then
say "Config already exists, skipping generation."
if $SUDO sh -c '[ -f "$1" ]' _ "$CONFIG_FILE"; then
say " -> Config already exists at $CONFIG_FILE. Skipping creation."
return 0
fi
elif [ -f "$CONFIG_FILE" ]; then
say " -> Config already exists at $CONFIG_FILE. Skipping creation."
return 0
fi
toml_secret="$(generate_secret)" || die "Failed to generate secret"
say "Creating config at $CONFIG_FILE..."
toml_secret="$(generate_secret)" || die "Failed to generate secret."
tmp_conf="$(mktemp "${TEMP_DIR:-/tmp}/telemt_conf.XXXXXX")" || die "Failed to create temp config"
generate_config_content "$toml_secret" > "$tmp_conf" || die "Failed to write temp config"
generate_config_content "$toml_secret" | write_root "$CONFIG_FILE" || die "Failed to install config"
$SUDO chown root:telemt "$CONFIG_FILE" && $SUDO chmod 640 "$CONFIG_FILE"
$SUDO mv "$tmp_conf" "$CONFIG_FILE" || die "Failed to install config file"
$SUDO chown root:telemt "$CONFIG_FILE" || die "Failed to set owner"
$SUDO chmod 640 "$CONFIG_FILE" || die "Failed to set config permissions"
say "Secret for user 'hello': $toml_secret"
say " -> Config created successfully."
say " -> Generated secret for default user 'hello': $toml_secret"
}
generate_systemd_content() {
cat <<EOF
[Unit]
Description=Telemt Proxy Service
Description=Telemt
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=telemt
Group=telemt
WorkingDirectory=$WORK_DIR
ExecStart=${INSTALL_DIR}/${BIN_NAME} ${CONFIG_FILE}
ExecStart="${INSTALL_DIR}/${BIN_NAME}" "${CONFIG_FILE}"
Restart=on-failure
LimitNOFILE=65536
AmbientCapabilities=CAP_NET_BIND_SERVICE
@@ -370,156 +374,183 @@ EOF
}
install_service() {
if command -v systemctl >/dev/null 2>&1 && [ -d /run/systemd/system ]; then
say "Installing systemd service..."
tmp_svc="$(mktemp "${TEMP_DIR:-/tmp}/${SERVICE_NAME}.service.XXXXXX")" || die "Failed to create temp service"
generate_systemd_content > "$tmp_svc" || die "Failed to generate service content"
svc="$(get_svc_mgr)"
if [ "$svc" = "systemd" ]; then
generate_systemd_content | write_root "/etc/systemd/system/${SERVICE_NAME}.service"
$SUDO chown root:root "/etc/systemd/system/${SERVICE_NAME}.service" && $SUDO chmod 644 "/etc/systemd/system/${SERVICE_NAME}.service"
$SUDO mv "$tmp_svc" "/etc/systemd/system/${SERVICE_NAME}.service" || die "Failed to move service file"
$SUDO chown root:root "/etc/systemd/system/${SERVICE_NAME}.service"
$SUDO chmod 644 "/etc/systemd/system/${SERVICE_NAME}.service"
$SUDO systemctl daemon-reload || true
$SUDO systemctl enable "$SERVICE_NAME" || true
if ! $SUDO systemctl start "$SERVICE_NAME"; then
say "[WARNING] Failed to start service"
SERVICE_START_FAILED=1
fi
elif [ "$svc" = "openrc" ]; then
generate_openrc_content | write_root "/etc/init.d/${SERVICE_NAME}"
$SUDO chown root:root "/etc/init.d/${SERVICE_NAME}" && $SUDO chmod 0755 "/etc/init.d/${SERVICE_NAME}"
$SUDO systemctl daemon-reload || die "Failed to reload systemd"
$SUDO systemctl enable "$SERVICE_NAME" || die "Failed to enable service"
$SUDO systemctl start "$SERVICE_NAME" || die "Failed to start service"
elif command -v rc-update >/dev/null 2>&1; then
say "Installing OpenRC service..."
tmp_svc="$(mktemp "${TEMP_DIR:-/tmp}/${SERVICE_NAME}.init.XXXXXX")" || die "Failed to create temp file"
generate_openrc_content > "$tmp_svc" || die "Failed to generate init content"
$SUDO mv "$tmp_svc" "/etc/init.d/${SERVICE_NAME}" || die "Failed to move service file"
$SUDO chown root:root "/etc/init.d/${SERVICE_NAME}"
$SUDO chmod 0755 "/etc/init.d/${SERVICE_NAME}"
$SUDO rc-update add "$SERVICE_NAME" default 2>/dev/null || die "Failed to register service"
$SUDO rc-service "$SERVICE_NAME" start 2>/dev/null || die "Failed to start OpenRC service"
$SUDO rc-update add "$SERVICE_NAME" default 2>/dev/null || true
if ! $SUDO rc-service "$SERVICE_NAME" start 2>/dev/null; then
say "[WARNING] Failed to start service"
SERVICE_START_FAILED=1
fi
else
say "No service manager found. You can start it manually with:"
if [ -n "$SUDO" ]; then
say " sudo -u telemt ${INSTALL_DIR}/${BIN_NAME} ${CONFIG_FILE}"
else
say " su -s /bin/sh telemt -c '${INSTALL_DIR}/${BIN_NAME} ${CONFIG_FILE}'"
cmd="\"${INSTALL_DIR}/${BIN_NAME}\" \"${CONFIG_FILE}\""
if [ -n "$SUDO" ]; then
say " -> Service manager not found. Start manually: sudo -u telemt $cmd"
else
say " -> Service manager not found. Start manually: su -s /bin/sh telemt -c '$cmd'"
fi
fi
}
kill_user_procs() {
say "Ensuring $BIN_NAME processes are killed..."
if pkill_cmd="$(command -v pkill 2>/dev/null)"; then
$SUDO "$pkill_cmd" -u telemt "$BIN_NAME" 2>/dev/null || true
if command -v pkill >/dev/null 2>&1; then
$SUDO pkill -u telemt "$BIN_NAME" 2>/dev/null || true
sleep 1
$SUDO "$pkill_cmd" -9 -u telemt "$BIN_NAME" 2>/dev/null || true
elif killall_cmd="$(command -v killall 2>/dev/null)"; then
$SUDO "$killall_cmd" "$BIN_NAME" 2>/dev/null || true
sleep 1
$SUDO "$killall_cmd" -9 "$BIN_NAME" 2>/dev/null || true
$SUDO pkill -9 -u telemt "$BIN_NAME" 2>/dev/null || true
else
if command -v pgrep >/dev/null 2>&1; then
pids="$(pgrep -u telemt 2>/dev/null || true)"
else
pids="$(ps -u telemt -o pid= 2>/dev/null || true)"
fi
if [ -n "$pids" ]; then
for pid in $pids; do
case "$pid" in ''|*[!0-9]*) continue ;; *) $SUDO kill "$pid" 2>/dev/null || true ;; esac
done
sleep 1
for pid in $pids; do
case "$pid" in ''|*[!0-9]*) continue ;; *) $SUDO kill -9 "$pid" 2>/dev/null || true ;; esac
done
fi
fi
}
uninstall() {
purge_data=0
[ "$ACTION" = "purge" ] && purge_data=1
say "Starting uninstallation of $BIN_NAME..."
say "Uninstalling $BIN_NAME..."
say ">>> Stage 1: Stopping services"
stop_service
if command -v systemctl >/dev/null 2>&1 && [ -d /run/systemd/system ]; then
say ">>> Stage 2: Removing service configuration"
svc="$(get_svc_mgr)"
if [ "$svc" = "systemd" ]; then
$SUDO systemctl disable "$SERVICE_NAME" 2>/dev/null || true
$SUDO rm -f "/etc/systemd/system/${SERVICE_NAME}.service"
$SUDO systemctl daemon-reload || true
elif command -v rc-update >/dev/null 2>&1; then
$SUDO systemctl daemon-reload 2>/dev/null || true
elif [ "$svc" = "openrc" ]; then
$SUDO rc-update del "$SERVICE_NAME" 2>/dev/null || true
$SUDO rm -f "/etc/init.d/${SERVICE_NAME}"
fi
say ">>> Stage 3: Terminating user processes"
kill_user_procs
say ">>> Stage 4: Removing binary"
$SUDO rm -f "${INSTALL_DIR}/${BIN_NAME}"
$SUDO userdel telemt 2>/dev/null || $SUDO deluser telemt 2>/dev/null || true
$SUDO groupdel telemt 2>/dev/null || $SUDO delgroup telemt 2>/dev/null || true
if [ "$purge_data" -eq 1 ]; then
say "Purging configuration and data..."
if [ "$ACTION" = "purge" ]; then
say ">>> Stage 5: Purging configuration, data, and user"
$SUDO rm -rf "$CONFIG_DIR" "$WORK_DIR"
$SUDO rm -f "$CONFIG_FILE"
if [ "$CONFIG_PARENT_DIR" != "$CONFIG_DIR" ] && [ "$CONFIG_PARENT_DIR" != "." ] && [ "$CONFIG_PARENT_DIR" != "/" ]; then
$SUDO rmdir "$CONFIG_PARENT_DIR" 2>/dev/null || true
fi
$SUDO userdel telemt 2>/dev/null || $SUDO deluser telemt 2>/dev/null || true
$SUDO groupdel telemt 2>/dev/null || $SUDO delgroup telemt 2>/dev/null || true
else
say "Note: Configuration in $CONFIG_DIR was kept. Run with '--purge' to remove it."
say "Note: Configuration and user kept. Run with 'purge' to remove completely."
fi
say "Uninstallation complete."
printf '\n====================================================================\n'
printf ' UNINSTALLATION COMPLETE\n'
printf '====================================================================\n\n'
exit 0
}
# ============================================================================
# Main Entry Point
# ============================================================================
case "$ACTION" in
help)
show_help
;;
uninstall|purge)
verify_common
uninstall
;;
help) show_help ;;
uninstall|purge) verify_common; uninstall ;;
install)
say "Starting installation..."
verify_common
verify_install_deps
say "Starting installation of $BIN_NAME (Version: $TARGET_VERSION)"
ARCH="$(detect_arch)"
LIBC="$(detect_libc)"
say "Detected system: $ARCH-linux-$LIBC"
say ">>> Stage 1: Verifying environment and dependencies"
verify_common; verify_install_deps
if [ "$TARGET_VERSION" != "latest" ]; then
TARGET_VERSION="${TARGET_VERSION#v}"
fi
ARCH="$(detect_arch)"; LIBC="$(detect_libc)"
FILE_NAME="${BIN_NAME}-${ARCH}-linux-${LIBC}.tar.gz"
FILE_NAME="$(printf '%s' "$FILE_NAME" | tr -d ' \t\n\r')"
if [ "$TARGET_VERSION" = "latest" ]; then
DL_URL="https://github.com/${REPO}/releases/latest/download/${FILE_NAME}"
else
else
DL_URL="https://github.com/${REPO}/releases/download/${TARGET_VERSION}/${FILE_NAME}"
fi
TEMP_DIR="$(mktemp -d)" || die "Failed to create temp directory"
say ">>> Stage 2: Downloading archive"
TEMP_DIR="$(mktemp -d)" || die "Temp directory creation failed"
if [ -z "$TEMP_DIR" ] || [ ! -d "$TEMP_DIR" ]; then
die "Temp directory creation failed"
die "Temp directory is invalid or was not created"
fi
say "Downloading from $DL_URL..."
fetch_file "$DL_URL" "${TEMP_DIR}/archive.tar.gz" || die "Download failed (check version or network)"
fetch_file "$DL_URL" "${TEMP_DIR}/${FILE_NAME}" || die "Download failed"
gzip -dc "${TEMP_DIR}/archive.tar.gz" | tar -xf - -C "$TEMP_DIR" || die "Extraction failed"
say ">>> Stage 3: Extracting archive"
if ! gzip -dc "${TEMP_DIR}/${FILE_NAME}" | tar -xf - -C "$TEMP_DIR" 2>/dev/null; then
die "Extraction failed (downloaded archive might be invalid or 404)."
fi
EXTRACTED_BIN="$(find "$TEMP_DIR" -type f -name "$BIN_NAME" -print 2>/dev/null | head -n 1)"
[ -z "$EXTRACTED_BIN" ] && die "Binary '$BIN_NAME' not found in archive"
EXTRACTED_BIN="$(find "$TEMP_DIR" -type f -name "$BIN_NAME" -print 2>/dev/null | head -n 1 || true)"
[ -n "$EXTRACTED_BIN" ] || die "Binary '$BIN_NAME' not found in archive"
ensure_user_group
setup_dirs
stop_service
say "Installing binary..."
say ">>> Stage 4: Setting up environment (User, Group, Directories)"
ensure_user_group; setup_dirs; stop_service
say ">>> Stage 5: Installing binary"
install_binary "$EXTRACTED_BIN" "${INSTALL_DIR}/${BIN_NAME}"
say ">>> Stage 6: Generating configuration"
install_config
say ">>> Stage 7: Installing and starting service"
install_service
say ""
say "============================================="
say "Installation complete!"
say "============================================="
if command -v systemctl >/dev/null 2>&1 && [ -d /run/systemd/system ]; then
say "To check the logs, run:"
say " journalctl -u $SERVICE_NAME -f"
say ""
fi
say "To get user connection links, run:"
if command -v jq >/dev/null 2>&1; then
say " curl -s http://127.0.0.1:9091/v1/users | jq -r '.data[] | \"User: \\(.username)\\n\\(.links.tls[0] // empty)\"'"
if [ "${SERVICE_START_FAILED:-0}" -eq 1 ]; then
printf '\n====================================================================\n'
printf ' INSTALLATION COMPLETED WITH WARNINGS\n'
printf '====================================================================\n\n'
printf 'The service was installed but failed to start automatically.\n'
printf 'Please check the logs to determine the issue.\n\n'
else
say " curl -s http://127.0.0.1:9091/v1/users"
say " (Note: Install 'jq' package to see the links nicely formatted)"
printf '\n====================================================================\n'
printf ' INSTALLATION SUCCESS\n'
printf '====================================================================\n\n'
fi
svc="$(get_svc_mgr)"
if [ "$svc" = "systemd" ]; then
printf 'To check the status of your proxy service, run:\n'
printf ' systemctl status %s\n\n' "$SERVICE_NAME"
elif [ "$svc" = "openrc" ]; then
printf 'To check the status of your proxy service, run:\n'
printf ' rc-service %s status\n\n' "$SERVICE_NAME"
fi
printf 'To get your user connection links (for Telegram), run:\n'
if command -v jq >/dev/null 2>&1; then
printf ' curl -s http://127.0.0.1:9091/v1/users | jq -r '\''.data[] | "User: \\(.username)\\n\\(.links.tls[0] // empty)\\n"'\''\n'
else
printf ' curl -s http://127.0.0.1:9091/v1/users\n'
printf ' (Tip: Install '\''jq'\'' for a much cleaner output)\n'
fi
printf '\n====================================================================\n'
;;
esac

View File

@@ -134,6 +134,7 @@ pub(super) struct UpstreamSummaryData {
pub(super) direct_total: usize,
pub(super) socks4_total: usize,
pub(super) socks5_total: usize,
pub(super) shadowsocks_total: usize,
}
#[derive(Serialize, Clone)]
@@ -205,6 +206,16 @@ pub(super) struct ZeroPoolData {
pub(super) refill_failed_total: u64,
pub(super) writer_restored_same_endpoint_total: u64,
pub(super) writer_restored_fallback_total: u64,
pub(super) teardown_attempt_total_normal: u64,
pub(super) teardown_attempt_total_hard_detach: u64,
pub(super) teardown_success_total_normal: u64,
pub(super) teardown_success_total_hard_detach: u64,
pub(super) teardown_timeout_total: u64,
pub(super) teardown_escalation_total: u64,
pub(super) teardown_noop_total: u64,
pub(super) teardown_cleanup_side_effect_failures_total: u64,
pub(super) teardown_duration_count_total: u64,
pub(super) teardown_duration_sum_seconds_total: f64,
}
#[derive(Serialize, Clone)]
@@ -364,6 +375,7 @@ pub(super) struct MinimalMeRuntimeData {
pub(super) me_reconnect_backoff_cap_ms: u64,
pub(super) me_reconnect_fast_retry_count: u32,
pub(super) me_pool_drain_ttl_secs: u64,
pub(super) me_instadrain: bool,
pub(super) me_pool_drain_soft_evict_enabled: bool,
pub(super) me_pool_drain_soft_evict_grace_secs: u64,
pub(super) me_pool_drain_soft_evict_per_writer: u8,

View File

@@ -4,6 +4,9 @@ use std::time::{SystemTime, UNIX_EPOCH};
use serde::Serialize;
use crate::config::ProxyConfig;
use crate::stats::{
MeWriterCleanupSideEffectStep, MeWriterTeardownMode, MeWriterTeardownReason, Stats,
};
use super::ApiShared;
@@ -98,6 +101,50 @@ pub(super) struct RuntimeMeQualityCountersData {
pub(super) reconnect_success_total: u64,
}
#[derive(Serialize)]
pub(super) struct RuntimeMeQualityTeardownAttemptData {
pub(super) reason: &'static str,
pub(super) mode: &'static str,
pub(super) total: u64,
}
#[derive(Serialize)]
pub(super) struct RuntimeMeQualityTeardownSuccessData {
pub(super) mode: &'static str,
pub(super) total: u64,
}
#[derive(Serialize)]
pub(super) struct RuntimeMeQualityTeardownSideEffectData {
pub(super) step: &'static str,
pub(super) total: u64,
}
#[derive(Serialize)]
pub(super) struct RuntimeMeQualityTeardownDurationBucketData {
pub(super) le_seconds: &'static str,
pub(super) total: u64,
}
#[derive(Serialize)]
pub(super) struct RuntimeMeQualityTeardownDurationData {
pub(super) mode: &'static str,
pub(super) count: u64,
pub(super) sum_seconds: f64,
pub(super) buckets: Vec<RuntimeMeQualityTeardownDurationBucketData>,
}
#[derive(Serialize)]
pub(super) struct RuntimeMeQualityTeardownData {
pub(super) attempts: Vec<RuntimeMeQualityTeardownAttemptData>,
pub(super) success: Vec<RuntimeMeQualityTeardownSuccessData>,
pub(super) timeout_total: u64,
pub(super) escalation_total: u64,
pub(super) noop_total: u64,
pub(super) cleanup_side_effect_failures: Vec<RuntimeMeQualityTeardownSideEffectData>,
pub(super) duration: Vec<RuntimeMeQualityTeardownDurationData>,
}
#[derive(Serialize)]
pub(super) struct RuntimeMeQualityRouteDropData {
pub(super) no_conn_total: u64,
@@ -107,6 +154,25 @@ pub(super) struct RuntimeMeQualityRouteDropData {
pub(super) queue_full_high_total: u64,
}
#[derive(Serialize)]
pub(super) struct RuntimeMeQualityFamilyStateData {
pub(super) family: &'static str,
pub(super) state: &'static str,
pub(super) state_since_epoch_secs: u64,
#[serde(skip_serializing_if = "Option::is_none")]
pub(super) suppressed_until_epoch_secs: Option<u64>,
pub(super) fail_streak: u32,
pub(super) recover_success_streak: u32,
}
#[derive(Serialize)]
pub(super) struct RuntimeMeQualityDrainGateData {
pub(super) route_quorum_ok: bool,
pub(super) redundancy_ok: bool,
pub(super) block_reason: &'static str,
pub(super) updated_at_epoch_secs: u64,
}
#[derive(Serialize)]
pub(super) struct RuntimeMeQualityDcRttData {
pub(super) dc: i16,
@@ -120,7 +186,10 @@ pub(super) struct RuntimeMeQualityDcRttData {
#[derive(Serialize)]
pub(super) struct RuntimeMeQualityPayload {
pub(super) counters: RuntimeMeQualityCountersData,
pub(super) teardown: RuntimeMeQualityTeardownData,
pub(super) route_drops: RuntimeMeQualityRouteDropData,
pub(super) family_states: Vec<RuntimeMeQualityFamilyStateData>,
pub(super) drain_gate: RuntimeMeQualityDrainGateData,
pub(super) dc_rtt: Vec<RuntimeMeQualityDcRttData>,
}
@@ -159,6 +228,7 @@ pub(super) struct RuntimeUpstreamQualitySummaryData {
pub(super) direct_total: usize,
pub(super) socks4_total: usize,
pub(super) socks5_total: usize,
pub(super) shadowsocks_total: usize,
}
#[derive(Serialize)]
@@ -361,6 +431,19 @@ pub(super) async fn build_runtime_me_quality_data(shared: &ApiShared) -> Runtime
};
let status = pool.api_status_snapshot().await;
let family_states = pool
.api_family_state_snapshot()
.into_iter()
.map(|entry| RuntimeMeQualityFamilyStateData {
family: entry.family,
state: entry.state,
state_since_epoch_secs: entry.state_since_epoch_secs,
suppressed_until_epoch_secs: entry.suppressed_until_epoch_secs,
fail_streak: entry.fail_streak,
recover_success_streak: entry.recover_success_streak,
})
.collect();
let drain_gate_snapshot = pool.api_drain_gate_snapshot();
RuntimeMeQualityData {
enabled: true,
reason: None,
@@ -374,6 +457,7 @@ pub(super) async fn build_runtime_me_quality_data(shared: &ApiShared) -> Runtime
reconnect_attempt_total: shared.stats.get_me_reconnect_attempts(),
reconnect_success_total: shared.stats.get_me_reconnect_success(),
},
teardown: build_runtime_me_teardown_data(shared),
route_drops: RuntimeMeQualityRouteDropData {
no_conn_total: shared.stats.get_me_route_drop_no_conn(),
channel_closed_total: shared.stats.get_me_route_drop_channel_closed(),
@@ -381,6 +465,13 @@ pub(super) async fn build_runtime_me_quality_data(shared: &ApiShared) -> Runtime
queue_full_base_total: shared.stats.get_me_route_drop_queue_full_base(),
queue_full_high_total: shared.stats.get_me_route_drop_queue_full_high(),
},
family_states,
drain_gate: RuntimeMeQualityDrainGateData {
route_quorum_ok: drain_gate_snapshot.route_quorum_ok,
redundancy_ok: drain_gate_snapshot.redundancy_ok,
block_reason: drain_gate_snapshot.block_reason,
updated_at_epoch_secs: drain_gate_snapshot.updated_at_epoch_secs,
},
dc_rtt: status
.dcs
.into_iter()
@@ -397,6 +488,81 @@ pub(super) async fn build_runtime_me_quality_data(shared: &ApiShared) -> Runtime
}
}
fn build_runtime_me_teardown_data(shared: &ApiShared) -> RuntimeMeQualityTeardownData {
let attempts = MeWriterTeardownReason::ALL
.iter()
.copied()
.flat_map(|reason| {
MeWriterTeardownMode::ALL
.iter()
.copied()
.map(move |mode| RuntimeMeQualityTeardownAttemptData {
reason: reason.as_str(),
mode: mode.as_str(),
total: shared.stats.get_me_writer_teardown_attempt_total(reason, mode),
})
})
.collect();
let success = MeWriterTeardownMode::ALL
.iter()
.copied()
.map(|mode| RuntimeMeQualityTeardownSuccessData {
mode: mode.as_str(),
total: shared.stats.get_me_writer_teardown_success_total(mode),
})
.collect();
let cleanup_side_effect_failures = MeWriterCleanupSideEffectStep::ALL
.iter()
.copied()
.map(|step| RuntimeMeQualityTeardownSideEffectData {
step: step.as_str(),
total: shared
.stats
.get_me_writer_cleanup_side_effect_failures_total(step),
})
.collect();
let duration = MeWriterTeardownMode::ALL
.iter()
.copied()
.map(|mode| {
let count = shared.stats.get_me_writer_teardown_duration_count(mode);
let mut buckets: Vec<RuntimeMeQualityTeardownDurationBucketData> = Stats::me_writer_teardown_duration_bucket_labels()
.iter()
.enumerate()
.map(|(bucket_idx, label)| RuntimeMeQualityTeardownDurationBucketData {
le_seconds: label,
total: shared
.stats
.get_me_writer_teardown_duration_bucket_total(mode, bucket_idx),
})
.collect();
buckets.push(RuntimeMeQualityTeardownDurationBucketData {
le_seconds: "+Inf",
total: count,
});
RuntimeMeQualityTeardownDurationData {
mode: mode.as_str(),
count,
sum_seconds: shared.stats.get_me_writer_teardown_duration_sum_seconds(mode),
buckets,
}
})
.collect();
RuntimeMeQualityTeardownData {
attempts,
success,
timeout_total: shared.stats.get_me_writer_teardown_timeout_total(),
escalation_total: shared.stats.get_me_writer_teardown_escalation_total(),
noop_total: shared.stats.get_me_writer_teardown_noop_total(),
cleanup_side_effect_failures,
duration,
}
}
pub(super) async fn build_runtime_upstream_quality_data(
shared: &ApiShared,
) -> RuntimeUpstreamQualityData {
@@ -406,7 +572,9 @@ pub(super) async fn build_runtime_upstream_quality_data(
connect_attempt_total: shared.stats.get_upstream_connect_attempt_total(),
connect_success_total: shared.stats.get_upstream_connect_success_total(),
connect_fail_total: shared.stats.get_upstream_connect_fail_total(),
connect_failfast_hard_error_total: shared.stats.get_upstream_connect_failfast_hard_error_total(),
connect_failfast_hard_error_total: shared
.stats
.get_upstream_connect_failfast_hard_error_total(),
};
let Some(snapshot) = shared.upstream_manager.try_api_snapshot() else {
@@ -446,6 +614,7 @@ pub(super) async fn build_runtime_upstream_quality_data(
direct_total: snapshot.summary.direct_total,
socks4_total: snapshot.summary.socks4_total,
socks5_total: snapshot.summary.socks5_total,
shadowsocks_total: snapshot.summary.shadowsocks_total,
}),
upstreams: Some(
snapshot
@@ -457,6 +626,7 @@ pub(super) async fn build_runtime_upstream_quality_data(
crate::transport::UpstreamRouteKind::Direct => "direct",
crate::transport::UpstreamRouteKind::Socks4 => "socks4",
crate::transport::UpstreamRouteKind::Socks5 => "socks5",
crate::transport::UpstreamRouteKind::Shadowsocks => "shadowsocks",
},
address: upstream.address,
weight: upstream.weight,
@@ -476,7 +646,9 @@ pub(super) async fn build_runtime_upstream_quality_data(
crate::transport::upstream::IpPreference::PreferV6 => "prefer_v6",
crate::transport::upstream::IpPreference::PreferV4 => "prefer_v4",
crate::transport::upstream::IpPreference::BothWork => "both_work",
crate::transport::upstream::IpPreference::Unavailable => "unavailable",
crate::transport::upstream::IpPreference::Unavailable => {
"unavailable"
}
},
})
.collect(),
@@ -514,14 +686,18 @@ pub(super) async fn build_runtime_nat_stun_data(shared: &ApiShared) -> RuntimeNa
live_total: snapshot.live_servers.len(),
},
reflection: RuntimeNatStunReflectionBlockData {
v4: snapshot.reflection_v4.map(|entry| RuntimeNatStunReflectionData {
addr: entry.addr.to_string(),
age_secs: entry.age_secs,
}),
v6: snapshot.reflection_v6.map(|entry| RuntimeNatStunReflectionData {
addr: entry.addr.to_string(),
age_secs: entry.age_secs,
}),
v4: snapshot
.reflection_v4
.map(|entry| RuntimeNatStunReflectionData {
addr: entry.addr.to_string(),
age_secs: entry.age_secs,
}),
v6: snapshot
.reflection_v6
.map(|entry| RuntimeNatStunReflectionData {
addr: entry.addr.to_string(),
age_secs: entry.age_secs,
}),
},
stun_backoff_remaining_ms: snapshot.stun_backoff_remaining_ms,
}),

View File

@@ -1,5 +1,5 @@
use std::net::IpAddr;
use std::collections::HashMap;
use std::net::IpAddr;
use std::sync::{Mutex, OnceLock};
use std::time::{SystemTime, UNIX_EPOCH};
@@ -7,8 +7,8 @@ use serde::Serialize;
use crate::config::{ProxyConfig, UpstreamType};
use crate::network::probe::{detect_interface_ipv4, detect_interface_ipv6, is_bogon};
use crate::transport::middle_proxy::{bnd_snapshot, timeskew_snapshot, upstream_bnd_snapshots};
use crate::transport::UpstreamRouteKind;
use crate::transport::middle_proxy::{bnd_snapshot, timeskew_snapshot, upstream_bnd_snapshots};
use super::ApiShared;
@@ -262,8 +262,8 @@ fn update_kdf_ewma(now_epoch_secs: u64, total_errors: u64) -> f64 {
let delta_errors = total_errors.saturating_sub(guard.last_total_errors);
let instant_rate_per_min = (delta_errors as f64) * 60.0 / (dt_secs as f64);
let alpha = 1.0 - f64::exp(-(dt_secs as f64) / KDF_EWMA_TAU_SECS);
guard.ewma_errors_per_min = guard.ewma_errors_per_min
+ alpha * (instant_rate_per_min - guard.ewma_errors_per_min);
guard.ewma_errors_per_min =
guard.ewma_errors_per_min + alpha * (instant_rate_per_min - guard.ewma_errors_per_min);
guard.last_epoch_secs = now_epoch_secs;
guard.last_total_errors = total_errors;
guard.ewma_errors_per_min
@@ -284,6 +284,7 @@ fn map_route_kind(value: UpstreamRouteKind) -> &'static str {
UpstreamRouteKind::Direct => "direct",
UpstreamRouteKind::Socks4 => "socks4",
UpstreamRouteKind::Socks5 => "socks5",
UpstreamRouteKind::Shadowsocks => "shadowsocks",
}
}

View File

@@ -1,7 +1,7 @@
use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};
use crate::config::ApiConfig;
use crate::stats::Stats;
use crate::stats::{MeWriterTeardownMode, Stats};
use crate::transport::upstream::IpPreference;
use crate::transport::UpstreamRouteKind;
@@ -106,6 +106,29 @@ pub(super) fn build_zero_all_data(stats: &Stats, configured_users: usize) -> Zer
refill_failed_total: stats.get_me_refill_failed_total(),
writer_restored_same_endpoint_total: stats.get_me_writer_restored_same_endpoint_total(),
writer_restored_fallback_total: stats.get_me_writer_restored_fallback_total(),
teardown_attempt_total_normal: stats
.get_me_writer_teardown_attempt_total_by_mode(MeWriterTeardownMode::Normal),
teardown_attempt_total_hard_detach: stats
.get_me_writer_teardown_attempt_total_by_mode(MeWriterTeardownMode::HardDetach),
teardown_success_total_normal: stats
.get_me_writer_teardown_success_total(MeWriterTeardownMode::Normal),
teardown_success_total_hard_detach: stats
.get_me_writer_teardown_success_total(MeWriterTeardownMode::HardDetach),
teardown_timeout_total: stats.get_me_writer_teardown_timeout_total(),
teardown_escalation_total: stats.get_me_writer_teardown_escalation_total(),
teardown_noop_total: stats.get_me_writer_teardown_noop_total(),
teardown_cleanup_side_effect_failures_total: stats
.get_me_writer_cleanup_side_effect_failures_total_all(),
teardown_duration_count_total: stats
.get_me_writer_teardown_duration_count(MeWriterTeardownMode::Normal)
.saturating_add(
stats.get_me_writer_teardown_duration_count(MeWriterTeardownMode::HardDetach),
),
teardown_duration_sum_seconds_total: stats
.get_me_writer_teardown_duration_sum_seconds(MeWriterTeardownMode::Normal)
+ stats.get_me_writer_teardown_duration_sum_seconds(
MeWriterTeardownMode::HardDetach,
),
},
desync: ZeroDesyncData {
secure_padding_invalid_total: stats.get_secure_padding_invalid(),
@@ -138,7 +161,8 @@ fn build_zero_upstream_data(stats: &Stats) -> ZeroUpstreamData {
.get_upstream_connect_duration_success_bucket_501_1000ms(),
connect_duration_success_bucket_gt_1000ms: stats
.get_upstream_connect_duration_success_bucket_gt_1000ms(),
connect_duration_fail_bucket_le_100ms: stats.get_upstream_connect_duration_fail_bucket_le_100ms(),
connect_duration_fail_bucket_le_100ms: stats
.get_upstream_connect_duration_fail_bucket_le_100ms(),
connect_duration_fail_bucket_101_500ms: stats
.get_upstream_connect_duration_fail_bucket_101_500ms(),
connect_duration_fail_bucket_501_1000ms: stats
@@ -180,6 +204,7 @@ pub(super) fn build_upstreams_data(shared: &ApiShared, api_cfg: &ApiConfig) -> U
direct_total: snapshot.summary.direct_total,
socks4_total: snapshot.summary.socks4_total,
socks5_total: snapshot.summary.socks5_total,
shadowsocks_total: snapshot.summary.shadowsocks_total,
};
let upstreams = snapshot
.upstreams
@@ -395,8 +420,7 @@ async fn get_minimal_payload_cached(
adaptive_floor_min_writers_multi_endpoint: runtime
.adaptive_floor_min_writers_multi_endpoint,
adaptive_floor_recover_grace_secs: runtime.adaptive_floor_recover_grace_secs,
adaptive_floor_writers_per_core_total: runtime
.adaptive_floor_writers_per_core_total,
adaptive_floor_writers_per_core_total: runtime.adaptive_floor_writers_per_core_total,
adaptive_floor_cpu_cores_override: runtime.adaptive_floor_cpu_cores_override,
adaptive_floor_max_extra_writers_single_per_core: runtime
.adaptive_floor_max_extra_writers_single_per_core,
@@ -404,12 +428,9 @@ async fn get_minimal_payload_cached(
.adaptive_floor_max_extra_writers_multi_per_core,
adaptive_floor_max_active_writers_per_core: runtime
.adaptive_floor_max_active_writers_per_core,
adaptive_floor_max_warm_writers_per_core: runtime
.adaptive_floor_max_warm_writers_per_core,
adaptive_floor_max_active_writers_global: runtime
.adaptive_floor_max_active_writers_global,
adaptive_floor_max_warm_writers_global: runtime
.adaptive_floor_max_warm_writers_global,
adaptive_floor_max_warm_writers_per_core: runtime.adaptive_floor_max_warm_writers_per_core,
adaptive_floor_max_active_writers_global: runtime.adaptive_floor_max_active_writers_global,
adaptive_floor_max_warm_writers_global: runtime.adaptive_floor_max_warm_writers_global,
adaptive_floor_cpu_cores_detected: runtime.adaptive_floor_cpu_cores_detected,
adaptive_floor_cpu_cores_effective: runtime.adaptive_floor_cpu_cores_effective,
adaptive_floor_global_cap_raw: runtime.adaptive_floor_global_cap_raw,
@@ -431,6 +452,7 @@ async fn get_minimal_payload_cached(
me_reconnect_backoff_cap_ms: runtime.me_reconnect_backoff_cap_ms,
me_reconnect_fast_retry_count: runtime.me_reconnect_fast_retry_count,
me_pool_drain_ttl_secs: runtime.me_pool_drain_ttl_secs,
me_instadrain: runtime.me_instadrain,
me_pool_drain_soft_evict_enabled: runtime.me_pool_drain_soft_evict_enabled,
me_pool_drain_soft_evict_grace_secs: runtime.me_pool_drain_soft_evict_grace_secs,
me_pool_drain_soft_evict_per_writer: runtime.me_pool_drain_soft_evict_per_writer,
@@ -527,6 +549,7 @@ fn map_route_kind(value: UpstreamRouteKind) -> &'static str {
UpstreamRouteKind::Direct => "direct",
UpstreamRouteKind::Socks4 => "socks4",
UpstreamRouteKind::Socks5 => "socks5",
UpstreamRouteKind::Shadowsocks => "shadowsocks",
}
}

View File

@@ -1,11 +1,270 @@
//! CLI commands: --init (fire-and-forget setup)
//! CLI commands: --init (fire-and-forget setup), daemon options, subcommands
//!
//! Subcommands:
//! - `start [OPTIONS] [config.toml]` - Start the daemon
//! - `stop [--pid-file PATH]` - Stop a running daemon
//! - `reload [--pid-file PATH]` - Reload configuration (SIGHUP)
//! - `status [--pid-file PATH]` - Check daemon status
//! - `run [OPTIONS] [config.toml]` - Run in foreground (default behavior)
use std::fs;
use std::path::{Path, PathBuf};
use std::process::Command;
use rand::Rng;
#[cfg(unix)]
use crate::daemon::{self, DaemonOptions, DEFAULT_PID_FILE};
/// CLI subcommand to execute.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Subcommand {
/// Run the proxy (default, or explicit `run` subcommand).
Run,
/// Start as daemon (`start` subcommand).
Start,
/// Stop a running daemon (`stop` subcommand).
Stop,
/// Reload configuration (`reload` subcommand).
Reload,
/// Check daemon status (`status` subcommand).
Status,
/// Fire-and-forget setup (`--init`).
Init,
}
/// Parsed subcommand with its options.
#[derive(Debug)]
pub struct ParsedCommand {
pub subcommand: Subcommand,
pub pid_file: PathBuf,
pub config_path: String,
#[cfg(unix)]
pub daemon_opts: DaemonOptions,
pub init_opts: Option<InitOptions>,
}
impl Default for ParsedCommand {
fn default() -> Self {
Self {
subcommand: Subcommand::Run,
#[cfg(unix)]
pid_file: PathBuf::from(DEFAULT_PID_FILE),
#[cfg(not(unix))]
pid_file: PathBuf::from("/var/run/telemt.pid"),
config_path: "config.toml".to_string(),
#[cfg(unix)]
daemon_opts: DaemonOptions::default(),
init_opts: None,
}
}
}
/// Parse CLI arguments into a command structure.
pub fn parse_command(args: &[String]) -> ParsedCommand {
let mut cmd = ParsedCommand::default();
// Check for --init first (legacy form)
if args.iter().any(|a| a == "--init") {
cmd.subcommand = Subcommand::Init;
cmd.init_opts = parse_init_args(args);
return cmd;
}
// Check for subcommand as first argument
if let Some(first) = args.first() {
match first.as_str() {
"start" => {
cmd.subcommand = Subcommand::Start;
#[cfg(unix)]
{
cmd.daemon_opts = parse_daemon_args(args);
// Force daemonize for start command
cmd.daemon_opts.daemonize = true;
}
}
"stop" => {
cmd.subcommand = Subcommand::Stop;
}
"reload" => {
cmd.subcommand = Subcommand::Reload;
}
"status" => {
cmd.subcommand = Subcommand::Status;
}
"run" => {
cmd.subcommand = Subcommand::Run;
#[cfg(unix)]
{
cmd.daemon_opts = parse_daemon_args(args);
}
}
_ => {
// No subcommand, default to Run
#[cfg(unix)]
{
cmd.daemon_opts = parse_daemon_args(args);
}
}
}
}
// Parse remaining options
let mut i = 0;
while i < args.len() {
match args[i].as_str() {
// Skip subcommand names
"start" | "stop" | "reload" | "status" | "run" => {}
// PID file option (for stop/reload/status)
"--pid-file" => {
i += 1;
if i < args.len() {
cmd.pid_file = PathBuf::from(&args[i]);
#[cfg(unix)]
{
cmd.daemon_opts.pid_file = Some(cmd.pid_file.clone());
}
}
}
s if s.starts_with("--pid-file=") => {
cmd.pid_file = PathBuf::from(s.trim_start_matches("--pid-file="));
#[cfg(unix)]
{
cmd.daemon_opts.pid_file = Some(cmd.pid_file.clone());
}
}
// Config path (positional, non-flag argument)
s if !s.starts_with('-') => {
cmd.config_path = s.to_string();
}
_ => {}
}
i += 1;
}
cmd
}
/// Execute a subcommand that doesn't require starting the server.
/// Returns `Some(exit_code)` if the command was handled, `None` if server should start.
#[cfg(unix)]
pub fn execute_subcommand(cmd: &ParsedCommand) -> Option<i32> {
match cmd.subcommand {
Subcommand::Stop => Some(cmd_stop(&cmd.pid_file)),
Subcommand::Reload => Some(cmd_reload(&cmd.pid_file)),
Subcommand::Status => Some(cmd_status(&cmd.pid_file)),
Subcommand::Init => {
if let Some(opts) = cmd.init_opts.clone() {
match run_init(opts) {
Ok(()) => Some(0),
Err(e) => {
eprintln!("[telemt] Init failed: {}", e);
Some(1)
}
}
} else {
Some(1)
}
}
// Run and Start need the server
Subcommand::Run | Subcommand::Start => None,
}
}
#[cfg(not(unix))]
pub fn execute_subcommand(cmd: &ParsedCommand) -> Option<i32> {
match cmd.subcommand {
Subcommand::Stop | Subcommand::Reload | Subcommand::Status => {
eprintln!("[telemt] Subcommand not supported on this platform");
Some(1)
}
Subcommand::Init => {
if let Some(opts) = cmd.init_opts.clone() {
match run_init(opts) {
Ok(()) => Some(0),
Err(e) => {
eprintln!("[telemt] Init failed: {}", e);
Some(1)
}
}
} else {
Some(1)
}
}
Subcommand::Run | Subcommand::Start => None,
}
}
/// Stop command: send SIGTERM to the running daemon.
#[cfg(unix)]
fn cmd_stop(pid_file: &Path) -> i32 {
use nix::sys::signal::Signal;
println!("Stopping telemt daemon...");
match daemon::signal_pid_file(pid_file, Signal::SIGTERM) {
Ok(()) => {
println!("Stop signal sent successfully");
// Wait for process to exit (up to 10 seconds)
for _ in 0..20 {
std::thread::sleep(std::time::Duration::from_millis(500));
if let daemon::DaemonStatus::NotRunning = daemon::check_status(pid_file) {
println!("Daemon stopped");
return 0;
}
}
println!("Daemon may still be shutting down");
0
}
Err(e) => {
eprintln!("Failed to stop daemon: {}", e);
1
}
}
}
/// Reload command: send SIGHUP to trigger config reload.
#[cfg(unix)]
fn cmd_reload(pid_file: &Path) -> i32 {
use nix::sys::signal::Signal;
println!("Reloading telemt configuration...");
match daemon::signal_pid_file(pid_file, Signal::SIGHUP) {
Ok(()) => {
println!("Reload signal sent successfully");
0
}
Err(e) => {
eprintln!("Failed to reload daemon: {}", e);
1
}
}
}
/// Status command: check if daemon is running.
#[cfg(unix)]
fn cmd_status(pid_file: &Path) -> i32 {
match daemon::check_status(pid_file) {
daemon::DaemonStatus::Running(pid) => {
println!("telemt is running (pid {})", pid);
0
}
daemon::DaemonStatus::Stale(pid) => {
println!("telemt is not running (stale pid file, was pid {})", pid);
// Clean up stale PID file
let _ = std::fs::remove_file(pid_file);
1
}
daemon::DaemonStatus::NotRunning => {
println!("telemt is not running");
1
}
}
}
/// Options for the init command
#[derive(Debug, Clone)]
pub struct InitOptions {
pub port: u16,
pub domain: String,
@@ -15,6 +274,64 @@ pub struct InitOptions {
pub no_start: bool,
}
/// Parse daemon-related options from CLI args.
#[cfg(unix)]
pub fn parse_daemon_args(args: &[String]) -> DaemonOptions {
let mut opts = DaemonOptions::default();
let mut i = 0;
while i < args.len() {
match args[i].as_str() {
"--daemon" | "-d" => {
opts.daemonize = true;
}
"--foreground" | "-f" => {
opts.foreground = true;
}
"--pid-file" => {
i += 1;
if i < args.len() {
opts.pid_file = Some(PathBuf::from(&args[i]));
}
}
s if s.starts_with("--pid-file=") => {
opts.pid_file = Some(PathBuf::from(s.trim_start_matches("--pid-file=")));
}
"--run-as-user" => {
i += 1;
if i < args.len() {
opts.user = Some(args[i].clone());
}
}
s if s.starts_with("--run-as-user=") => {
opts.user = Some(s.trim_start_matches("--run-as-user=").to_string());
}
"--run-as-group" => {
i += 1;
if i < args.len() {
opts.group = Some(args[i].clone());
}
}
s if s.starts_with("--run-as-group=") => {
opts.group = Some(s.trim_start_matches("--run-as-group=").to_string());
}
"--working-dir" => {
i += 1;
if i < args.len() {
opts.working_dir = Some(PathBuf::from(&args[i]));
}
}
s if s.starts_with("--working-dir=") => {
opts.working_dir = Some(PathBuf::from(s.trim_start_matches("--working-dir=")));
}
_ => {}
}
i += 1;
}
opts
}
impl Default for InitOptions {
fn default() -> Self {
Self {
@@ -84,10 +401,16 @@ pub fn parse_init_args(args: &[String]) -> Option<InitOptions> {
/// Run the fire-and-forget setup.
pub fn run_init(opts: InitOptions) -> Result<(), Box<dyn std::error::Error>> {
use crate::service::{self, InitSystem, ServiceOptions};
eprintln!("[telemt] Fire-and-forget setup");
eprintln!();
// 1. Generate or validate secret
// 1. Detect init system
let init_system = service::detect_init_system();
eprintln!("[+] Detected init system: {}", init_system);
// 2. Generate or validate secret
let secret = match opts.secret {
Some(s) => {
if s.len() != 32 || !s.chars().all(|c| c.is_ascii_hexdigit()) {
@@ -98,80 +421,134 @@ pub fn run_init(opts: InitOptions) -> Result<(), Box<dyn std::error::Error>> {
}
None => generate_secret(),
};
eprintln!("[+] Secret: {}", secret);
eprintln!("[+] User: {}", opts.username);
eprintln!("[+] Port: {}", opts.port);
eprintln!("[+] Domain: {}", opts.domain);
// 2. Create config directory
// 3. Create config directory
fs::create_dir_all(&opts.config_dir)?;
let config_path = opts.config_dir.join("config.toml");
// 3. Write config
// 4. Write config
let config_content = generate_config(&opts.username, &secret, opts.port, &opts.domain);
fs::write(&config_path, &config_content)?;
eprintln!("[+] Config written to {}", config_path.display());
// 4. Write systemd unit
// 5. Generate and write service file
let exe_path = std::env::current_exe()
.unwrap_or_else(|_| PathBuf::from("/usr/local/bin/telemt"));
let unit_path = Path::new("/etc/systemd/system/telemt.service");
let unit_content = generate_systemd_unit(&exe_path, &config_path);
match fs::write(unit_path, &unit_content) {
let service_opts = ServiceOptions {
exe_path: &exe_path,
config_path: &config_path,
user: None, // Let systemd/init handle user
group: None,
pid_file: "/var/run/telemt.pid",
working_dir: Some("/var/lib/telemt"),
description: "Telemt MTProxy - Telegram MTProto Proxy",
};
let service_path = service::service_file_path(init_system);
let service_content = service::generate_service_file(init_system, &service_opts);
// Ensure parent directory exists
if let Some(parent) = Path::new(service_path).parent() {
let _ = fs::create_dir_all(parent);
}
match fs::write(service_path, &service_content) {
Ok(()) => {
eprintln!("[+] Systemd unit written to {}", unit_path.display());
eprintln!("[+] Service file written to {}", service_path);
// Make script executable for OpenRC/FreeBSD
#[cfg(unix)]
if init_system == InitSystem::OpenRC || init_system == InitSystem::FreeBSDRc {
use std::os::unix::fs::PermissionsExt;
let mut perms = fs::metadata(service_path)?.permissions();
perms.set_mode(0o755);
fs::set_permissions(service_path, perms)?;
}
}
Err(e) => {
eprintln!("[!] Cannot write systemd unit (run as root?): {}", e);
eprintln!("[!] Manual unit file content:");
eprintln!("{}", unit_content);
// Still print links and config
eprintln!("[!] Cannot write service file (run as root?): {}", e);
eprintln!("[!] Manual service file content:");
eprintln!("{}", service_content);
// Still print links and installation instructions
eprintln!();
eprintln!("{}", service::installation_instructions(init_system));
print_links(&opts.username, &secret, opts.port, &opts.domain);
return Ok(());
}
}
// 5. Reload systemd
run_cmd("systemctl", &["daemon-reload"]);
// 6. Enable service
run_cmd("systemctl", &["enable", "telemt.service"]);
eprintln!("[+] Service enabled");
// 7. Start service (unless --no-start)
if !opts.no_start {
run_cmd("systemctl", &["start", "telemt.service"]);
eprintln!("[+] Service started");
// Brief delay then check status
std::thread::sleep(std::time::Duration::from_secs(1));
let status = Command::new("systemctl")
.args(["is-active", "telemt.service"])
.output();
match status {
Ok(out) if out.status.success() => {
eprintln!("[+] Service is running");
}
_ => {
eprintln!("[!] Service may not have started correctly");
eprintln!("[!] Check: journalctl -u telemt.service -n 20");
// 6. Install and enable service based on init system
match init_system {
InitSystem::Systemd => {
run_cmd("systemctl", &["daemon-reload"]);
run_cmd("systemctl", &["enable", "telemt.service"]);
eprintln!("[+] Service enabled");
if !opts.no_start {
run_cmd("systemctl", &["start", "telemt.service"]);
eprintln!("[+] Service started");
std::thread::sleep(std::time::Duration::from_secs(1));
let status = Command::new("systemctl")
.args(["is-active", "telemt.service"])
.output();
match status {
Ok(out) if out.status.success() => {
eprintln!("[+] Service is running");
}
_ => {
eprintln!("[!] Service may not have started correctly");
eprintln!("[!] Check: journalctl -u telemt.service -n 20");
}
}
} else {
eprintln!("[+] Service not started (--no-start)");
eprintln!("[+] Start manually: systemctl start telemt.service");
}
}
} else {
eprintln!("[+] Service not started (--no-start)");
eprintln!("[+] Start manually: systemctl start telemt.service");
InitSystem::OpenRC => {
run_cmd("rc-update", &["add", "telemt", "default"]);
eprintln!("[+] Service enabled");
if !opts.no_start {
run_cmd("rc-service", &["telemt", "start"]);
eprintln!("[+] Service started");
} else {
eprintln!("[+] Service not started (--no-start)");
eprintln!("[+] Start manually: rc-service telemt start");
}
}
InitSystem::FreeBSDRc => {
run_cmd("sysrc", &["telemt_enable=YES"]);
eprintln!("[+] Service enabled");
if !opts.no_start {
run_cmd("service", &["telemt", "start"]);
eprintln!("[+] Service started");
} else {
eprintln!("[+] Service not started (--no-start)");
eprintln!("[+] Start manually: service telemt start");
}
}
InitSystem::Unknown => {
eprintln!("[!] Unknown init system - service file written but not installed");
eprintln!("[!] You may need to install it manually");
}
}
eprintln!();
// 8. Print links
// 7. Print links
print_links(&opts.username, &secret, opts.port, &opts.domain);
Ok(())
}
@@ -198,8 +575,15 @@ desync_all_full = false
update_every = 43200
hardswap = false
me_pool_drain_ttl_secs = 90
me_instadrain = false
me_pool_drain_threshold = 32
me_pool_drain_soft_evict_grace_secs = 10
me_pool_drain_soft_evict_per_writer = 2
me_pool_drain_soft_evict_budget_per_core = 16
me_pool_drain_soft_evict_cooldown_ms = 1000
me_bind_stale_mode = "never"
me_pool_min_fresh_ratio = 0.8
me_reinit_drain_timeout_secs = 120
me_reinit_drain_timeout_secs = 90
[network]
ipv4 = true
@@ -257,35 +641,6 @@ weight = 10
)
}
fn generate_systemd_unit(exe_path: &Path, config_path: &Path) -> String {
format!(
r#"[Unit]
Description=Telemt MTProxy
Documentation=https://github.com/nicepkg/telemt
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
ExecStart={exe} {config}
Restart=always
RestartSec=5
LimitNOFILE=65535
# Security hardening
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/etc/telemt
PrivateTmp=true
[Install]
WantedBy=multi-user.target
"#,
exe = exe_path.display(),
config = config_path.display(),
)
}
fn run_cmd(cmd: &str, args: &[&str]) {
match Command::new(cmd).args(args).output() {
Ok(output) => {

View File

@@ -40,10 +40,10 @@ const DEFAULT_ME_ROUTE_HYBRID_MAX_WAIT_MS: u64 = 3000;
const DEFAULT_ME_ROUTE_BLOCKING_SEND_TIMEOUT_MS: u64 = 250;
const DEFAULT_ME_C2ME_SEND_TIMEOUT_MS: u64 = 4000;
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_ENABLED: bool = true;
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_GRACE_SECS: u64 = 30;
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_PER_WRITER: u8 = 1;
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_BUDGET_PER_CORE: u16 = 8;
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_COOLDOWN_MS: u64 = 5000;
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_GRACE_SECS: u64 = 10;
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_PER_WRITER: u8 = 2;
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_BUDGET_PER_CORE: u16 = 16;
const DEFAULT_ME_POOL_DRAIN_SOFT_EVICT_COOLDOWN_MS: u64 = 1000;
const DEFAULT_USER_MAX_UNIQUE_IPS_WINDOW_SECS: u64 = 30;
const DEFAULT_ACCEPT_PERMIT_TIMEOUT_MS: u64 = 250;
const DEFAULT_UPSTREAM_CONNECT_RETRY_ATTEMPTS: u32 = 2;
@@ -65,6 +65,10 @@ pub(crate) fn default_tls_domain() -> String {
"petrovich.ru".to_string()
}
pub(crate) fn default_tls_fetch_scope() -> String {
String::new()
}
pub(crate) fn default_mask_port() -> u16 {
443
}
@@ -606,15 +610,19 @@ pub(crate) fn default_proxy_secret_len_max() -> usize {
}
pub(crate) fn default_me_reinit_drain_timeout_secs() -> u64 {
120
90
}
pub(crate) fn default_me_pool_drain_ttl_secs() -> u64 {
90
}
pub(crate) fn default_me_instadrain() -> bool {
false
}
pub(crate) fn default_me_pool_drain_threshold() -> u64 {
128
32
}
pub(crate) fn default_me_pool_drain_soft_evict_enabled() -> bool {

View File

@@ -56,6 +56,7 @@ pub struct HotFields {
pub me_reinit_coalesce_window_ms: u64,
pub hardswap: bool,
pub me_pool_drain_ttl_secs: u64,
pub me_instadrain: bool,
pub me_pool_drain_threshold: u64,
pub me_pool_drain_soft_evict_enabled: bool,
pub me_pool_drain_soft_evict_grace_secs: u64,
@@ -143,6 +144,7 @@ impl HotFields {
me_reinit_coalesce_window_ms: cfg.general.me_reinit_coalesce_window_ms,
hardswap: cfg.general.hardswap,
me_pool_drain_ttl_secs: cfg.general.me_pool_drain_ttl_secs,
me_instadrain: cfg.general.me_instadrain,
me_pool_drain_threshold: cfg.general.me_pool_drain_threshold,
me_pool_drain_soft_evict_enabled: cfg.general.me_pool_drain_soft_evict_enabled,
me_pool_drain_soft_evict_grace_secs: cfg.general.me_pool_drain_soft_evict_grace_secs,
@@ -477,6 +479,7 @@ fn overlay_hot_fields(old: &ProxyConfig, new: &ProxyConfig) -> ProxyConfig {
cfg.general.me_reinit_coalesce_window_ms = new.general.me_reinit_coalesce_window_ms;
cfg.general.hardswap = new.general.hardswap;
cfg.general.me_pool_drain_ttl_secs = new.general.me_pool_drain_ttl_secs;
cfg.general.me_instadrain = new.general.me_instadrain;
cfg.general.me_pool_drain_threshold = new.general.me_pool_drain_threshold;
cfg.general.me_pool_drain_soft_evict_enabled = new.general.me_pool_drain_soft_evict_enabled;
cfg.general.me_pool_drain_soft_evict_grace_secs =
@@ -620,6 +623,7 @@ fn warn_non_hot_changes(old: &ProxyConfig, new: &ProxyConfig, non_hot_changed: b
}
if old.censorship.tls_domain != new.censorship.tls_domain
|| old.censorship.tls_domains != new.censorship.tls_domains
|| old.censorship.tls_fetch_scope != new.censorship.tls_fetch_scope
|| old.censorship.mask != new.censorship.mask
|| old.censorship.mask_host != new.censorship.mask_host
|| old.censorship.mask_port != new.censorship.mask_port
@@ -869,6 +873,12 @@ fn log_changes(
old_hot.me_pool_drain_ttl_secs, new_hot.me_pool_drain_ttl_secs,
);
}
if old_hot.me_instadrain != new_hot.me_instadrain {
info!(
"config reload: me_instadrain: {} → {}",
old_hot.me_instadrain, new_hot.me_instadrain,
);
}
if old_hot.me_pool_drain_threshold != new_hot.me_pool_drain_threshold {
info!(

View File

@@ -6,8 +6,9 @@ use std::net::{IpAddr, SocketAddr};
use std::path::{Path, PathBuf};
use rand::Rng;
use serde::{Deserialize, Serialize};
use shadowsocks::config::ServerConfig as ShadowsocksServerConfig;
use tracing::warn;
use serde::{Serialize, Deserialize};
use crate::error::{ProxyError, Result};
@@ -122,13 +123,37 @@ fn sanitize_ad_tag(ad_tag: &mut Option<String>) {
};
if !is_valid_ad_tag(tag) {
warn!(
"Invalid general.ad_tag value, expected exactly 32 hex chars; ad_tag is disabled"
);
warn!("Invalid general.ad_tag value, expected exactly 32 hex chars; ad_tag is disabled");
*ad_tag = None;
}
}
fn validate_upstreams(config: &ProxyConfig) -> Result<()> {
let has_enabled_shadowsocks = config.upstreams.iter().any(|upstream| {
upstream.enabled && matches!(upstream.upstream_type, UpstreamType::Shadowsocks { .. })
});
if has_enabled_shadowsocks && config.general.use_middle_proxy {
return Err(ProxyError::Config(
"shadowsocks upstreams require general.use_middle_proxy = false".to_string(),
));
}
for upstream in &config.upstreams {
if let UpstreamType::Shadowsocks { url, .. } = &upstream.upstream_type {
let parsed = ShadowsocksServerConfig::from_url(url)
.map_err(|error| ProxyError::Config(format!("invalid shadowsocks url: {error}")))?;
if parsed.plugin().is_some() {
return Err(ProxyError::Config(
"shadowsocks plugins are not supported".to_string(),
));
}
}
}
Ok(())
}
// ============= Main Config =============
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
@@ -180,7 +205,8 @@ impl ProxyConfig {
pub(crate) fn load_with_metadata<P: AsRef<Path>>(path: P) -> Result<LoadedConfig> {
let path = path.as_ref();
let content = std::fs::read_to_string(path).map_err(|e| ProxyError::Config(e.to_string()))?;
let content =
std::fs::read_to_string(path).map_err(|e| ProxyError::Config(e.to_string()))?;
let base_dir = path.parent().unwrap_or(Path::new("."));
let mut source_files = BTreeSet::new();
source_files.insert(normalize_config_path(path));
@@ -207,15 +233,17 @@ impl ProxyConfig {
.map(|table| table.contains_key("stun_servers"))
.unwrap_or(false);
let mut config: ProxyConfig =
parsed_toml.try_into().map_err(|e| ProxyError::Config(e.to_string()))?;
let mut config: ProxyConfig = parsed_toml
.try_into()
.map_err(|e| ProxyError::Config(e.to_string()))?;
if !update_every_is_explicit && (legacy_secret_is_explicit || legacy_config_is_explicit) {
config.general.update_every = None;
}
let legacy_nat_stun = config.general.middle_proxy_nat_stun.take();
let legacy_nat_stun_servers = std::mem::take(&mut config.general.middle_proxy_nat_stun_servers);
let legacy_nat_stun_servers =
std::mem::take(&mut config.general.middle_proxy_nat_stun_servers);
let legacy_nat_stun_used = legacy_nat_stun.is_some() || !legacy_nat_stun_servers.is_empty();
if stun_servers_is_explicit {
let mut explicit_stun_servers = Vec::new();
@@ -225,7 +253,9 @@ impl ProxyConfig {
config.network.stun_servers = explicit_stun_servers;
if legacy_nat_stun_used {
warn!("general.middle_proxy_nat_stun and general.middle_proxy_nat_stun_servers are ignored because network.stun_servers is explicitly set");
warn!(
"general.middle_proxy_nat_stun and general.middle_proxy_nat_stun_servers are ignored because network.stun_servers is explicitly set"
);
}
} else {
// Keep the default STUN pool unless network.stun_servers is explicitly overridden.
@@ -240,7 +270,9 @@ impl ProxyConfig {
config.network.stun_servers = unified_stun_servers;
if legacy_nat_stun_used {
warn!("general.middle_proxy_nat_stun and general.middle_proxy_nat_stun_servers are deprecated; use network.stun_servers");
warn!(
"general.middle_proxy_nat_stun and general.middle_proxy_nat_stun_servers are deprecated; use network.stun_servers"
);
}
}
@@ -378,13 +410,15 @@ impl ProxyConfig {
if !(4096..=1024 * 1024).contains(&config.general.direct_relay_copy_buf_c2s_bytes) {
return Err(ProxyError::Config(
"general.direct_relay_copy_buf_c2s_bytes must be within [4096, 1048576]".to_string(),
"general.direct_relay_copy_buf_c2s_bytes must be within [4096, 1048576]"
.to_string(),
));
}
if !(8192..=2 * 1024 * 1024).contains(&config.general.direct_relay_copy_buf_s2c_bytes) {
return Err(ProxyError::Config(
"general.direct_relay_copy_buf_s2c_bytes must be within [8192, 2097152]".to_string(),
"general.direct_relay_copy_buf_s2c_bytes must be within [8192, 2097152]"
.to_string(),
));
}
@@ -633,7 +667,8 @@ impl ProxyConfig {
if !(1..=100).contains(&config.general.me_route_backpressure_high_watermark_pct) {
return Err(ProxyError::Config(
"general.me_route_backpressure_high_watermark_pct must be within [1, 100]".to_string(),
"general.me_route_backpressure_high_watermark_pct must be within [1, 100]"
.to_string(),
));
}
@@ -779,6 +814,9 @@ impl ProxyConfig {
config.censorship.mask_host = Some(config.censorship.tls_domain.clone());
}
// Normalize optional TLS fetch scope: whitespace-only values disable scoped routing.
config.censorship.tls_fetch_scope = config.censorship.tls_fetch_scope.trim().to_string();
// Merge primary + extra TLS domains, deduplicate (primary always first).
if !config.censorship.tls_domains.is_empty() {
let mut all = Vec::with_capacity(1 + config.censorship.tls_domains.len());
@@ -813,11 +851,15 @@ impl ProxyConfig {
crate::network::dns_overrides::validate_entries(&config.network.dns_overrides)?;
if config.general.use_middle_proxy && config.network.ipv6 == Some(true) {
warn!("IPv6 with Middle Proxy is experimental and may cause KDF address mismatch; consider disabling IPv6 or ME");
warn!(
"IPv6 with Middle Proxy is experimental and may cause KDF address mismatch; consider disabling IPv6 or ME"
);
}
// Random fake_cert_len only when default is in use.
if !config.censorship.tls_emulation && config.censorship.fake_cert_len == default_fake_cert_len() {
if !config.censorship.tls_emulation
&& config.censorship.fake_cert_len == default_fake_cert_len()
{
config.censorship.fake_cert_len = rand::rng().gen_range(1024..4096);
}
@@ -827,8 +869,7 @@ impl ProxyConfig {
let listen_tcp = config.server.listen_tcp.unwrap_or_else(|| {
if config.server.listen_unix_sock.is_some() {
// Unix socket present: TCP only if user explicitly set addresses or listeners.
config.server.listen_addr_ipv4.is_some()
|| !config.server.listeners.is_empty()
config.server.listen_addr_ipv4.is_some() || !config.server.listeners.is_empty()
} else {
true
}
@@ -836,7 +877,9 @@ impl ProxyConfig {
// Migration: Populate listeners if empty (skip when listen_tcp = false).
if config.server.listeners.is_empty() && listen_tcp {
let ipv4_str = config.server.listen_addr_ipv4
let ipv4_str = config
.server
.listen_addr_ipv4
.as_deref()
.unwrap_or("0.0.0.0");
if let Ok(ipv4) = ipv4_str.parse::<IpAddr>() {
@@ -878,7 +921,10 @@ impl ProxyConfig {
// Migration: Populate upstreams if empty (Default Direct).
if config.upstreams.is_empty() {
config.upstreams.push(UpstreamConfig {
upstream_type: UpstreamType::Direct { interface: None, bind_addresses: None },
upstream_type: UpstreamType::Direct {
interface: None,
bind_addresses: None,
},
weight: 1,
enabled: true,
scopes: String::new(),
@@ -892,6 +938,8 @@ impl ProxyConfig {
.entry("203".to_string())
.or_insert_with(|| vec!["91.105.192.100:443".to_string()]);
validate_upstreams(&config)?;
Ok(LoadedConfig {
config,
source_files: source_files.into_iter().collect(),
@@ -938,6 +986,9 @@ impl ProxyConfig {
mod tests {
use super::*;
const TEST_SHADOWSOCKS_URL: &str =
"ss://2022-blake3-aes-256-gcm:MDEyMzQ1Njc4OTAxMjM0NTY3ODkwMTIzNDU2Nzg5MDE=@127.0.0.1:8388";
#[test]
fn serde_defaults_remain_unchanged_for_present_sections() {
let toml = r#"
@@ -967,10 +1018,7 @@ mod tests {
cfg.general.me_init_retry_attempts,
default_me_init_retry_attempts()
);
assert_eq!(
cfg.general.me2dc_fallback,
default_me2dc_fallback()
);
assert_eq!(cfg.general.me2dc_fallback, default_me2dc_fallback());
assert_eq!(
cfg.general.proxy_config_v4_cache_path,
default_proxy_config_v4_cache_path()
@@ -1279,11 +1327,12 @@ mod tests {
let path = dir.join("telemt_dc_override_test.toml");
std::fs::write(&path, toml).unwrap();
let cfg = ProxyConfig::load(&path).unwrap();
assert!(cfg
.dc_overrides
.get("203")
.map(|v| v.contains(&"91.105.192.100:443".to_string()))
.unwrap_or(false));
assert!(
cfg.dc_overrides
.get("203")
.map(|v| v.contains(&"91.105.192.100:443".to_string()))
.unwrap_or(false)
);
let _ = std::fs::remove_file(path);
}
@@ -1470,11 +1519,9 @@ mod tests {
let path = dir.join("telemt_me_adaptive_floor_min_writers_out_of_range_test.toml");
std::fs::write(&path, toml).unwrap();
let err = ProxyConfig::load(&path).unwrap_err().to_string();
assert!(
err.contains(
"general.me_adaptive_floor_min_writers_single_endpoint must be within [1, 32]"
)
);
assert!(err.contains(
"general.me_adaptive_floor_min_writers_single_endpoint must be within [1, 32]"
));
let _ = std::fs::remove_file(path);
}
@@ -2037,6 +2084,45 @@ mod tests {
let _ = std::fs::remove_file(path);
}
#[test]
fn force_close_default_matches_drain_ttl() {
let toml = r#"
[censorship]
tls_domain = "example.com"
[access.users]
user = "00000000000000000000000000000000"
"#;
let dir = std::env::temp_dir();
let path = dir.join("telemt_force_close_default_test.toml");
std::fs::write(&path, toml).unwrap();
let cfg = ProxyConfig::load(&path).unwrap();
assert_eq!(cfg.general.me_reinit_drain_timeout_secs, 90);
assert_eq!(cfg.general.effective_me_pool_force_close_secs(), 90);
let _ = std::fs::remove_file(path);
}
#[test]
fn force_close_zero_uses_runtime_safety_fallback() {
let toml = r#"
[general]
me_reinit_drain_timeout_secs = 0
[censorship]
tls_domain = "example.com"
[access.users]
user = "00000000000000000000000000000000"
"#;
let dir = std::env::temp_dir();
let path = dir.join("telemt_force_close_zero_fallback_test.toml");
std::fs::write(&path, toml).unwrap();
let cfg = ProxyConfig::load(&path).unwrap();
assert_eq!(cfg.general.me_reinit_drain_timeout_secs, 0);
assert_eq!(cfg.general.effective_me_pool_force_close_secs(), 300);
let _ = std::fs::remove_file(path);
}
#[test]
fn force_close_bumped_when_below_drain_ttl() {
let toml = r#"
@@ -2058,6 +2144,59 @@ mod tests {
let _ = std::fs::remove_file(path);
}
#[test]
fn tls_fetch_scope_default_is_empty() {
let toml = r#"
[censorship]
tls_domain = "example.com"
[access.users]
user = "00000000000000000000000000000000"
"#;
let dir = std::env::temp_dir();
let path = dir.join("telemt_tls_fetch_scope_default_test.toml");
std::fs::write(&path, toml).unwrap();
let cfg = ProxyConfig::load(&path).unwrap();
assert!(cfg.censorship.tls_fetch_scope.is_empty());
let _ = std::fs::remove_file(path);
}
#[test]
fn tls_fetch_scope_is_trimmed_during_load() {
let toml = r#"
[censorship]
tls_domain = "example.com"
tls_fetch_scope = " me "
[access.users]
user = "00000000000000000000000000000000"
"#;
let dir = std::env::temp_dir();
let path = dir.join("telemt_tls_fetch_scope_trim_test.toml");
std::fs::write(&path, toml).unwrap();
let cfg = ProxyConfig::load(&path).unwrap();
assert_eq!(cfg.censorship.tls_fetch_scope, "me");
let _ = std::fs::remove_file(path);
}
#[test]
fn tls_fetch_scope_whitespace_becomes_empty() {
let toml = r#"
[censorship]
tls_domain = "example.com"
tls_fetch_scope = " "
[access.users]
user = "00000000000000000000000000000000"
"#;
let dir = std::env::temp_dir();
let path = dir.join("telemt_tls_fetch_scope_blank_test.toml");
std::fs::write(&path, toml).unwrap();
let cfg = ProxyConfig::load(&path).unwrap();
assert!(cfg.censorship.tls_fetch_scope.is_empty());
let _ = std::fs::remove_file(path);
}
#[test]
fn invalid_ad_tag_is_disabled_during_load() {
let toml = r#"
@@ -2101,6 +2240,124 @@ mod tests {
let _ = std::fs::remove_file(path);
}
#[test]
fn shadowsocks_upstream_url_loads_successfully() {
let toml = format!(
r#"
[general]
use_middle_proxy = false
[censorship]
tls_domain = "example.com"
[access.users]
user = "00000000000000000000000000000000"
[[upstreams]]
type = "shadowsocks"
url = "{url}"
interface = "127.0.0.2"
"#,
url = TEST_SHADOWSOCKS_URL,
);
let dir = std::env::temp_dir();
let path = dir.join("telemt_shadowsocks_valid_test.toml");
std::fs::write(&path, toml).unwrap();
let cfg = ProxyConfig::load(&path).unwrap();
assert!(matches!(
&cfg.upstreams[0].upstream_type,
UpstreamType::Shadowsocks { url, interface }
if url == TEST_SHADOWSOCKS_URL && interface.as_deref() == Some("127.0.0.2")
));
let _ = std::fs::remove_file(path);
}
#[test]
fn shadowsocks_requires_direct_mode() {
let toml = format!(
r#"
[general]
use_middle_proxy = true
[censorship]
tls_domain = "example.com"
[access.users]
user = "00000000000000000000000000000000"
[[upstreams]]
type = "shadowsocks"
url = "{url}"
"#,
url = TEST_SHADOWSOCKS_URL,
);
let dir = std::env::temp_dir();
let path = dir.join("telemt_shadowsocks_me_reject_test.toml");
std::fs::write(&path, toml).unwrap();
let err = ProxyConfig::load(&path).unwrap_err().to_string();
assert!(err.contains("shadowsocks upstreams require general.use_middle_proxy = false"));
let _ = std::fs::remove_file(path);
}
#[test]
fn invalid_shadowsocks_url_is_rejected() {
let toml = r#"
[general]
use_middle_proxy = false
[censorship]
tls_domain = "example.com"
[access.users]
user = "00000000000000000000000000000000"
[[upstreams]]
type = "shadowsocks"
url = "not-a-valid-ss-url"
"#;
let dir = std::env::temp_dir();
let path = dir.join("telemt_shadowsocks_invalid_url_test.toml");
std::fs::write(&path, toml).unwrap();
let err = ProxyConfig::load(&path).unwrap_err().to_string();
assert!(err.contains("invalid shadowsocks url"));
let _ = std::fs::remove_file(path);
}
#[test]
fn shadowsocks_plugins_are_rejected() {
let toml = format!(
r#"
[general]
use_middle_proxy = false
[censorship]
tls_domain = "example.com"
[access.users]
user = "00000000000000000000000000000000"
[[upstreams]]
type = "shadowsocks"
url = "{url}?plugin=obfs-local%3Bobfs%3Dhttp"
"#,
url = TEST_SHADOWSOCKS_URL,
);
let dir = std::env::temp_dir();
let path = dir.join("telemt_shadowsocks_plugin_reject_test.toml");
std::fs::write(&path, toml).unwrap();
let err = ProxyConfig::load(&path).unwrap_err().to_string();
assert!(err.contains("shadowsocks plugins are not supported"));
let _ = std::fs::remove_file(path);
}
#[test]
fn invalid_user_ad_tag_reports_access_user_ad_tags_key() {
let toml = r#"

View File

@@ -135,8 +135,8 @@ impl MeSocksKdfPolicy {
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize, Default)]
#[serde(rename_all = "lowercase")]
pub enum MeBindStaleMode {
Never,
#[default]
Never,
Ttl,
Always,
}
@@ -812,6 +812,10 @@ pub struct GeneralConfig {
#[serde(default = "default_me_pool_drain_ttl_secs")]
pub me_pool_drain_ttl_secs: u64,
/// Force-remove any draining writer on the next cleanup tick, regardless of age/deadline.
#[serde(default = "default_me_instadrain")]
pub me_instadrain: bool,
/// Maximum allowed number of draining ME writers before oldest ones are force-closed in batches.
/// Set to 0 to disable threshold-based draining cleanup and keep timeout-only behavior.
#[serde(default = "default_me_pool_drain_threshold")]
@@ -851,7 +855,7 @@ pub struct GeneralConfig {
pub me_pool_min_fresh_ratio: f32,
/// Drain timeout in seconds for stale ME writers after endpoint map changes.
/// Set to 0 to keep stale writers draining indefinitely (no force-close).
/// Set to 0 to use the runtime safety fallback timeout.
#[serde(default = "default_me_reinit_drain_timeout_secs")]
pub me_reinit_drain_timeout_secs: u64,
@@ -951,24 +955,38 @@ impl Default for GeneralConfig {
me_reconnect_backoff_cap_ms: default_reconnect_backoff_cap_ms(),
me_reconnect_fast_retry_count: default_me_reconnect_fast_retry_count(),
me_single_endpoint_shadow_writers: default_me_single_endpoint_shadow_writers(),
me_single_endpoint_outage_mode_enabled: default_me_single_endpoint_outage_mode_enabled(),
me_single_endpoint_outage_disable_quarantine: default_me_single_endpoint_outage_disable_quarantine(),
me_single_endpoint_outage_backoff_min_ms: default_me_single_endpoint_outage_backoff_min_ms(),
me_single_endpoint_outage_backoff_max_ms: default_me_single_endpoint_outage_backoff_max_ms(),
me_single_endpoint_shadow_rotate_every_secs: default_me_single_endpoint_shadow_rotate_every_secs(),
me_single_endpoint_outage_mode_enabled: default_me_single_endpoint_outage_mode_enabled(
),
me_single_endpoint_outage_disable_quarantine:
default_me_single_endpoint_outage_disable_quarantine(),
me_single_endpoint_outage_backoff_min_ms:
default_me_single_endpoint_outage_backoff_min_ms(),
me_single_endpoint_outage_backoff_max_ms:
default_me_single_endpoint_outage_backoff_max_ms(),
me_single_endpoint_shadow_rotate_every_secs:
default_me_single_endpoint_shadow_rotate_every_secs(),
me_floor_mode: MeFloorMode::default(),
me_adaptive_floor_idle_secs: default_me_adaptive_floor_idle_secs(),
me_adaptive_floor_min_writers_single_endpoint: default_me_adaptive_floor_min_writers_single_endpoint(),
me_adaptive_floor_min_writers_multi_endpoint: default_me_adaptive_floor_min_writers_multi_endpoint(),
me_adaptive_floor_min_writers_single_endpoint:
default_me_adaptive_floor_min_writers_single_endpoint(),
me_adaptive_floor_min_writers_multi_endpoint:
default_me_adaptive_floor_min_writers_multi_endpoint(),
me_adaptive_floor_recover_grace_secs: default_me_adaptive_floor_recover_grace_secs(),
me_adaptive_floor_writers_per_core_total: default_me_adaptive_floor_writers_per_core_total(),
me_adaptive_floor_writers_per_core_total:
default_me_adaptive_floor_writers_per_core_total(),
me_adaptive_floor_cpu_cores_override: default_me_adaptive_floor_cpu_cores_override(),
me_adaptive_floor_max_extra_writers_single_per_core: default_me_adaptive_floor_max_extra_writers_single_per_core(),
me_adaptive_floor_max_extra_writers_multi_per_core: default_me_adaptive_floor_max_extra_writers_multi_per_core(),
me_adaptive_floor_max_active_writers_per_core: default_me_adaptive_floor_max_active_writers_per_core(),
me_adaptive_floor_max_warm_writers_per_core: default_me_adaptive_floor_max_warm_writers_per_core(),
me_adaptive_floor_max_active_writers_global: default_me_adaptive_floor_max_active_writers_global(),
me_adaptive_floor_max_warm_writers_global: default_me_adaptive_floor_max_warm_writers_global(),
me_adaptive_floor_max_extra_writers_single_per_core:
default_me_adaptive_floor_max_extra_writers_single_per_core(),
me_adaptive_floor_max_extra_writers_multi_per_core:
default_me_adaptive_floor_max_extra_writers_multi_per_core(),
me_adaptive_floor_max_active_writers_per_core:
default_me_adaptive_floor_max_active_writers_per_core(),
me_adaptive_floor_max_warm_writers_per_core:
default_me_adaptive_floor_max_warm_writers_per_core(),
me_adaptive_floor_max_active_writers_global:
default_me_adaptive_floor_max_active_writers_global(),
me_adaptive_floor_max_warm_writers_global:
default_me_adaptive_floor_max_warm_writers_global(),
upstream_connect_retry_attempts: default_upstream_connect_retry_attempts(),
upstream_connect_retry_backoff_ms: default_upstream_connect_retry_backoff_ms(),
upstream_connect_budget_ms: default_upstream_connect_budget_ms(),
@@ -983,7 +1001,8 @@ impl Default for GeneralConfig {
me_socks_kdf_policy: MeSocksKdfPolicy::Strict,
me_route_backpressure_base_timeout_ms: default_me_route_backpressure_base_timeout_ms(),
me_route_backpressure_high_timeout_ms: default_me_route_backpressure_high_timeout_ms(),
me_route_backpressure_high_watermark_pct: default_me_route_backpressure_high_watermark_pct(),
me_route_backpressure_high_watermark_pct:
default_me_route_backpressure_high_watermark_pct(),
me_health_interval_ms_unhealthy: default_me_health_interval_ms_unhealthy(),
me_health_interval_ms_healthy: default_me_health_interval_ms_healthy(),
me_admission_poll_ms: default_me_admission_poll_ms(),
@@ -1009,7 +1028,8 @@ impl Default for GeneralConfig {
me_hardswap_warmup_delay_min_ms: default_me_hardswap_warmup_delay_min_ms(),
me_hardswap_warmup_delay_max_ms: default_me_hardswap_warmup_delay_max_ms(),
me_hardswap_warmup_extra_passes: default_me_hardswap_warmup_extra_passes(),
me_hardswap_warmup_pass_backoff_base_ms: default_me_hardswap_warmup_pass_backoff_base_ms(),
me_hardswap_warmup_pass_backoff_base_ms:
default_me_hardswap_warmup_pass_backoff_base_ms(),
me_config_stable_snapshots: default_me_config_stable_snapshots(),
me_config_apply_cooldown_secs: default_me_config_apply_cooldown_secs(),
me_snapshot_require_http_2xx: default_me_snapshot_require_http_2xx(),
@@ -1020,6 +1040,7 @@ impl Default for GeneralConfig {
me_secret_atomic_snapshot: default_me_secret_atomic_snapshot(),
proxy_secret_len_max: default_proxy_secret_len_max(),
me_pool_drain_ttl_secs: default_me_pool_drain_ttl_secs(),
me_instadrain: default_me_instadrain(),
me_pool_drain_threshold: default_me_pool_drain_threshold(),
me_pool_drain_soft_evict_enabled: default_me_pool_drain_soft_evict_enabled(),
me_pool_drain_soft_evict_grace_secs: default_me_pool_drain_soft_evict_grace_secs(),
@@ -1052,8 +1073,10 @@ impl GeneralConfig {
/// Resolve the active updater interval for ME infrastructure refresh tasks.
/// `update_every` has priority, otherwise legacy proxy_*_auto_reload_secs are used.
pub fn effective_update_every_secs(&self) -> u64 {
self.update_every
.unwrap_or_else(|| self.proxy_secret_auto_reload_secs.min(self.proxy_config_auto_reload_secs))
self.update_every.unwrap_or_else(|| {
self.proxy_secret_auto_reload_secs
.min(self.proxy_config_auto_reload_secs)
})
}
/// Resolve periodic zero-downtime reinit interval for ME writers.
@@ -1063,8 +1086,13 @@ impl GeneralConfig {
/// Resolve force-close timeout for stale writers.
/// `me_reinit_drain_timeout_secs` remains backward-compatible alias.
/// A configured `0` uses the runtime safety fallback (300s).
pub fn effective_me_pool_force_close_secs(&self) -> u64 {
self.me_reinit_drain_timeout_secs
if self.me_reinit_drain_timeout_secs == 0 {
300
} else {
self.me_reinit_drain_timeout_secs
}
}
}
@@ -1298,6 +1326,11 @@ pub struct AntiCensorshipConfig {
#[serde(default)]
pub tls_domains: Vec<String>,
/// Upstream scope used for TLS front metadata fetches.
/// Empty value keeps default upstream routing behavior.
#[serde(default = "default_tls_fetch_scope")]
pub tls_fetch_scope: String,
#[serde(default = "default_true")]
pub mask: bool,
@@ -1355,6 +1388,7 @@ impl Default for AntiCensorshipConfig {
Self {
tls_domain: default_tls_domain(),
tls_domains: Vec::new(),
tls_fetch_scope: default_tls_fetch_scope(),
mask: default_true(),
mask_host: None,
mask_port: default_mask_port(),
@@ -1460,6 +1494,11 @@ pub enum UpstreamType {
#[serde(default)]
password: Option<String>,
},
Shadowsocks {
url: String,
#[serde(default)]
interface: Option<String>,
},
}
#[derive(Debug, Clone, Serialize, Deserialize)]
@@ -1540,7 +1579,10 @@ impl ShowLink {
}
impl Serialize for ShowLink {
fn serialize<S: serde::Serializer>(&self, serializer: S) -> std::result::Result<S::Ok, S::Error> {
fn serialize<S: serde::Serializer>(
&self,
serializer: S,
) -> std::result::Result<S::Ok, S::Error> {
match self {
ShowLink::None => Vec::<String>::new().serialize(serializer),
ShowLink::All => serializer.serialize_str("*"),
@@ -1550,7 +1592,9 @@ impl Serialize for ShowLink {
}
impl<'de> Deserialize<'de> for ShowLink {
fn deserialize<D: serde::Deserializer<'de>>(deserializer: D) -> std::result::Result<Self, D::Error> {
fn deserialize<D: serde::Deserializer<'de>>(
deserializer: D,
) -> std::result::Result<Self, D::Error> {
use serde::de;
struct ShowLinkVisitor;
@@ -1566,14 +1610,14 @@ impl<'de> Deserialize<'de> for ShowLink {
if v == "*" {
Ok(ShowLink::All)
} else {
Err(de::Error::invalid_value(
de::Unexpected::Str(v),
&r#""*""#,
))
Err(de::Error::invalid_value(de::Unexpected::Str(v), &r#""*""#))
}
}
fn visit_seq<A: de::SeqAccess<'de>>(self, mut seq: A) -> std::result::Result<ShowLink, A::Error> {
fn visit_seq<A: de::SeqAccess<'de>>(
self,
mut seq: A,
) -> std::result::Result<ShowLink, A::Error> {
let mut names = Vec::new();
while let Some(name) = seq.next_element::<String>()? {
names.push(name);

523
src/daemon/mod.rs Normal file
View File

@@ -0,0 +1,523 @@
//! Unix daemon support for telemt.
//!
//! Provides classic Unix daemonization (double-fork), PID file management,
//! and privilege dropping for running telemt as a background service.
use std::fs::{self, File, OpenOptions};
use std::io::{self, Read, Write};
use std::os::unix::fs::OpenOptionsExt;
use std::path::{Path, PathBuf};
use nix::fcntl::{Flock, FlockArg};
use nix::unistd::{self, ForkResult, Gid, Pid, Uid, chdir, close, dup2, fork, getpid, setsid};
use tracing::{debug, info, warn};
/// Default PID file location.
pub const DEFAULT_PID_FILE: &str = "/var/run/telemt.pid";
/// Daemon configuration options parsed from CLI.
#[derive(Debug, Clone, Default)]
pub struct DaemonOptions {
/// Run as daemon (fork to background).
pub daemonize: bool,
/// Path to PID file.
pub pid_file: Option<PathBuf>,
/// User to run as after binding sockets.
pub user: Option<String>,
/// Group to run as after binding sockets.
pub group: Option<String>,
/// Working directory for the daemon.
pub working_dir: Option<PathBuf>,
/// Explicit foreground mode (for systemd Type=simple).
pub foreground: bool,
}
impl DaemonOptions {
/// Returns the effective PID file path.
pub fn pid_file_path(&self) -> &Path {
self.pid_file
.as_deref()
.unwrap_or(Path::new(DEFAULT_PID_FILE))
}
/// Returns true if we should actually daemonize.
/// Foreground flag takes precedence.
pub fn should_daemonize(&self) -> bool {
self.daemonize && !self.foreground
}
}
/// Error types for daemon operations.
#[derive(Debug, thiserror::Error)]
pub enum DaemonError {
#[error("fork failed: {0}")]
ForkFailed(#[source] nix::Error),
#[error("setsid failed: {0}")]
SetsidFailed(#[source] nix::Error),
#[error("chdir failed: {0}")]
ChdirFailed(#[source] nix::Error),
#[error("failed to open /dev/null: {0}")]
DevNullFailed(#[source] io::Error),
#[error("failed to redirect stdio: {0}")]
RedirectFailed(#[source] nix::Error),
#[error("PID file error: {0}")]
PidFile(String),
#[error("another instance is already running (pid {0})")]
AlreadyRunning(i32),
#[error("user '{0}' not found")]
UserNotFound(String),
#[error("group '{0}' not found")]
GroupNotFound(String),
#[error("failed to set uid/gid: {0}")]
PrivilegeDrop(#[source] nix::Error),
#[error("io error: {0}")]
Io(#[from] io::Error),
}
/// Result of a successful daemonize() call.
#[derive(Debug)]
pub enum DaemonizeResult {
/// We are the parent process and should exit.
Parent,
/// We are the daemon child process and should continue.
Child,
}
/// Performs classic Unix double-fork daemonization.
///
/// This detaches the process from the controlling terminal:
/// 1. First fork - parent exits, child continues
/// 2. setsid() - become session leader
/// 3. Second fork - ensure we can never acquire a controlling terminal
/// 4. chdir("/") - don't hold any directory open
/// 5. Redirect stdin/stdout/stderr to /dev/null
///
/// Returns `DaemonizeResult::Parent` in the original parent (which should exit),
/// or `DaemonizeResult::Child` in the final daemon child.
pub fn daemonize(working_dir: Option<&Path>) -> Result<DaemonizeResult, DaemonError> {
// First fork
match unsafe { fork() } {
Ok(ForkResult::Parent { .. }) => {
// Parent exits
return Ok(DaemonizeResult::Parent);
}
Ok(ForkResult::Child) => {
// Child continues
}
Err(e) => return Err(DaemonError::ForkFailed(e)),
}
// Create new session, become session leader
setsid().map_err(DaemonError::SetsidFailed)?;
// Second fork to ensure we can never acquire a controlling terminal
match unsafe { fork() } {
Ok(ForkResult::Parent { .. }) => {
// Intermediate parent exits
std::process::exit(0);
}
Ok(ForkResult::Child) => {
// Final daemon child continues
}
Err(e) => return Err(DaemonError::ForkFailed(e)),
}
// Change working directory
let target_dir = working_dir.unwrap_or(Path::new("/"));
chdir(target_dir).map_err(DaemonError::ChdirFailed)?;
// Redirect stdin, stdout, stderr to /dev/null
redirect_stdio_to_devnull()?;
Ok(DaemonizeResult::Child)
}
/// Redirects stdin, stdout, and stderr to /dev/null.
fn redirect_stdio_to_devnull() -> Result<(), DaemonError> {
let devnull = File::options()
.read(true)
.write(true)
.open("/dev/null")
.map_err(DaemonError::DevNullFailed)?;
let devnull_fd = std::os::unix::io::AsRawFd::as_raw_fd(&devnull);
// Redirect stdin (fd 0)
dup2(devnull_fd, 0).map_err(DaemonError::RedirectFailed)?;
// Redirect stdout (fd 1)
dup2(devnull_fd, 1).map_err(DaemonError::RedirectFailed)?;
// Redirect stderr (fd 2)
dup2(devnull_fd, 2).map_err(DaemonError::RedirectFailed)?;
// Close original devnull fd if it's not one of the standard fds
if devnull_fd > 2 {
let _ = close(devnull_fd);
}
Ok(())
}
/// PID file manager with flock-based locking.
pub struct PidFile {
path: PathBuf,
file: Option<File>,
locked: bool,
}
impl PidFile {
/// Creates a new PID file manager for the given path.
pub fn new<P: AsRef<Path>>(path: P) -> Self {
Self {
path: path.as_ref().to_path_buf(),
file: None,
locked: false,
}
}
/// Checks if another instance is already running.
///
/// Returns the PID of the running instance if one exists.
pub fn check_running(&self) -> Result<Option<i32>, DaemonError> {
if !self.path.exists() {
return Ok(None);
}
// Try to read existing PID
let mut contents = String::new();
File::open(&self.path)
.and_then(|mut f| f.read_to_string(&mut contents))
.map_err(|e| DaemonError::PidFile(format!("cannot read {}: {}", self.path.display(), e)))?;
let pid: i32 = contents
.trim()
.parse()
.map_err(|_| DaemonError::PidFile(format!("invalid PID in {}", self.path.display())))?;
// Check if process is still running
if is_process_running(pid) {
Ok(Some(pid))
} else {
// Stale PID file
debug!(pid, path = %self.path.display(), "Removing stale PID file");
let _ = fs::remove_file(&self.path);
Ok(None)
}
}
/// Acquires the PID file lock and writes the current PID.
///
/// Fails if another instance is already running.
pub fn acquire(&mut self) -> Result<(), DaemonError> {
// Check for running instance first
if let Some(pid) = self.check_running()? {
return Err(DaemonError::AlreadyRunning(pid));
}
// Ensure parent directory exists
if let Some(parent) = self.path.parent() {
if !parent.exists() {
fs::create_dir_all(parent).map_err(|e| {
DaemonError::PidFile(format!(
"cannot create directory {}: {}",
parent.display(),
e
))
})?;
}
}
// Open/create PID file with exclusive lock
let file = OpenOptions::new()
.write(true)
.create(true)
.truncate(true)
.mode(0o644)
.open(&self.path)
.map_err(|e| {
DaemonError::PidFile(format!("cannot open {}: {}", self.path.display(), e))
})?;
// Try to acquire exclusive lock (non-blocking)
let flock = Flock::lock(file, FlockArg::LockExclusiveNonblock).map_err(|(_, errno)| {
// Check if another instance grabbed the lock
if let Some(pid) = self.check_running().ok().flatten() {
DaemonError::AlreadyRunning(pid)
} else {
DaemonError::PidFile(format!("cannot lock {}: {}", self.path.display(), errno))
}
})?;
// Write our PID
let pid = getpid();
let mut file = flock.unlock().map_err(|(_, errno)| {
DaemonError::PidFile(format!("unlock failed: {}", errno))
})?;
writeln!(file, "{}", pid).map_err(|e| {
DaemonError::PidFile(format!("cannot write PID to {}: {}", self.path.display(), e))
})?;
// Re-acquire lock and keep it
let flock = Flock::lock(file, FlockArg::LockExclusiveNonblock).map_err(|(_, errno)| {
DaemonError::PidFile(format!("cannot re-lock {}: {}", self.path.display(), errno))
})?;
self.file = Some(flock.unlock().map_err(|(_, errno)| {
DaemonError::PidFile(format!("unlock for storage failed: {}", errno))
})?);
self.locked = true;
info!(pid = pid.as_raw(), path = %self.path.display(), "PID file created");
Ok(())
}
/// Releases the PID file lock and removes the file.
pub fn release(&mut self) -> Result<(), DaemonError> {
if let Some(file) = self.file.take() {
drop(file);
}
self.locked = false;
if self.path.exists() {
fs::remove_file(&self.path).map_err(|e| {
DaemonError::PidFile(format!("cannot remove {}: {}", self.path.display(), e))
})?;
debug!(path = %self.path.display(), "PID file removed");
}
Ok(())
}
/// Returns the path to this PID file.
#[allow(dead_code)]
pub fn path(&self) -> &Path {
&self.path
}
}
impl Drop for PidFile {
fn drop(&mut self) {
if self.locked {
if let Err(e) = self.release() {
warn!(error = %e, "Failed to clean up PID file on drop");
}
}
}
}
/// Checks if a process with the given PID is running.
fn is_process_running(pid: i32) -> bool {
// kill(pid, 0) checks if process exists without sending a signal
nix::sys::signal::kill(Pid::from_raw(pid), None).is_ok()
}
/// Drops privileges to the specified user and group.
///
/// This should be called after binding privileged ports but before
/// entering the main event loop.
pub fn drop_privileges(user: Option<&str>, group: Option<&str>) -> Result<(), DaemonError> {
// Look up group first (need to do this while still root)
let target_gid = if let Some(group_name) = group {
Some(lookup_group(group_name)?)
} else if let Some(user_name) = user {
// If no group specified but user is, use user's primary group
Some(lookup_user_primary_gid(user_name)?)
} else {
None
};
// Look up user
let target_uid = if let Some(user_name) = user {
Some(lookup_user(user_name)?)
} else {
None
};
// Drop privileges: set GID first, then UID
// (Setting UID first would prevent us from setting GID)
if let Some(gid) = target_gid {
unistd::setgid(gid).map_err(DaemonError::PrivilegeDrop)?;
// Also set supplementary groups to just this one
unistd::setgroups(&[gid]).map_err(DaemonError::PrivilegeDrop)?;
info!(gid = gid.as_raw(), "Dropped group privileges");
}
if let Some(uid) = target_uid {
unistd::setuid(uid).map_err(DaemonError::PrivilegeDrop)?;
info!(uid = uid.as_raw(), "Dropped user privileges");
}
Ok(())
}
/// Looks up a user by name and returns their UID.
fn lookup_user(name: &str) -> Result<Uid, DaemonError> {
// Use libc getpwnam
let c_name = std::ffi::CString::new(name).map_err(|_| DaemonError::UserNotFound(name.to_string()))?;
unsafe {
let pwd = libc::getpwnam(c_name.as_ptr());
if pwd.is_null() {
Err(DaemonError::UserNotFound(name.to_string()))
} else {
Ok(Uid::from_raw((*pwd).pw_uid))
}
}
}
/// Looks up a user's primary GID by username.
fn lookup_user_primary_gid(name: &str) -> Result<Gid, DaemonError> {
let c_name = std::ffi::CString::new(name).map_err(|_| DaemonError::UserNotFound(name.to_string()))?;
unsafe {
let pwd = libc::getpwnam(c_name.as_ptr());
if pwd.is_null() {
Err(DaemonError::UserNotFound(name.to_string()))
} else {
Ok(Gid::from_raw((*pwd).pw_gid))
}
}
}
/// Looks up a group by name and returns its GID.
fn lookup_group(name: &str) -> Result<Gid, DaemonError> {
let c_name = std::ffi::CString::new(name).map_err(|_| DaemonError::GroupNotFound(name.to_string()))?;
unsafe {
let grp = libc::getgrnam(c_name.as_ptr());
if grp.is_null() {
Err(DaemonError::GroupNotFound(name.to_string()))
} else {
Ok(Gid::from_raw((*grp).gr_gid))
}
}
}
/// Reads PID from a PID file.
#[allow(dead_code)]
pub fn read_pid_file<P: AsRef<Path>>(path: P) -> Result<i32, DaemonError> {
let path = path.as_ref();
let mut contents = String::new();
File::open(path)
.and_then(|mut f| f.read_to_string(&mut contents))
.map_err(|e| DaemonError::PidFile(format!("cannot read {}: {}", path.display(), e)))?;
contents
.trim()
.parse()
.map_err(|_| DaemonError::PidFile(format!("invalid PID in {}", path.display())))
}
/// Sends a signal to the process specified in a PID file.
#[allow(dead_code)]
pub fn signal_pid_file<P: AsRef<Path>>(
path: P,
signal: nix::sys::signal::Signal,
) -> Result<(), DaemonError> {
let pid = read_pid_file(&path)?;
if !is_process_running(pid) {
return Err(DaemonError::PidFile(format!(
"process {} from {} is not running",
pid,
path.as_ref().display()
)));
}
nix::sys::signal::kill(Pid::from_raw(pid), signal).map_err(|e| {
DaemonError::PidFile(format!("cannot signal process {}: {}", pid, e))
})?;
Ok(())
}
/// Returns the status of the daemon based on PID file.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DaemonStatus {
/// Daemon is running with the given PID.
Running(i32),
/// PID file exists but process is not running.
Stale(i32),
/// No PID file exists.
NotRunning,
}
/// Checks the daemon status from a PID file.
#[allow(dead_code)]
pub fn check_status<P: AsRef<Path>>(path: P) -> DaemonStatus {
let path = path.as_ref();
if !path.exists() {
return DaemonStatus::NotRunning;
}
match read_pid_file(path) {
Ok(pid) => {
if is_process_running(pid) {
DaemonStatus::Running(pid)
} else {
DaemonStatus::Stale(pid)
}
}
Err(_) => DaemonStatus::NotRunning,
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_daemon_options_default() {
let opts = DaemonOptions::default();
assert!(!opts.daemonize);
assert!(!opts.should_daemonize());
assert_eq!(opts.pid_file_path(), Path::new(DEFAULT_PID_FILE));
}
#[test]
fn test_daemon_options_foreground_overrides() {
let opts = DaemonOptions {
daemonize: true,
foreground: true,
..Default::default()
};
assert!(!opts.should_daemonize());
}
#[test]
fn test_check_status_not_running() {
let path = "/tmp/telemt_test_nonexistent.pid";
assert_eq!(check_status(path), DaemonStatus::NotRunning);
}
#[test]
fn test_pid_file_basic() {
let path = "/tmp/telemt_test_pidfile.pid";
let _ = fs::remove_file(path);
let mut pf = PidFile::new(path);
assert!(pf.check_running().unwrap().is_none());
pf.acquire().unwrap();
assert!(Path::new(path).exists());
// Read it back
let pid = read_pid_file(path).unwrap();
assert_eq!(pid, std::process::id() as i32);
pf.release().unwrap();
assert!(!Path::new(path).exists());
}
}

291
src/logging.rs Normal file
View File

@@ -0,0 +1,291 @@
//! Logging configuration for telemt.
//!
//! Supports multiple log destinations:
//! - stderr (default, works with systemd journald)
//! - syslog (Unix only, for traditional init systems)
//! - file (with optional rotation)
#![allow(dead_code)] // Infrastructure module - used via CLI flags
use std::path::Path;
use tracing_subscriber::layer::SubscriberExt;
use tracing_subscriber::util::SubscriberInitExt;
use tracing_subscriber::{EnvFilter, fmt, reload};
/// Log destination configuration.
#[derive(Debug, Clone, Default)]
pub enum LogDestination {
/// Log to stderr (default, captured by systemd journald).
#[default]
Stderr,
/// Log to syslog (Unix only).
#[cfg(unix)]
Syslog,
/// Log to a file with optional rotation.
File {
path: String,
/// Rotate daily if true.
rotate_daily: bool,
},
}
/// Logging options parsed from CLI/config.
#[derive(Debug, Clone, Default)]
pub struct LoggingOptions {
/// Where to send logs.
pub destination: LogDestination,
/// Disable ANSI colors.
pub disable_colors: bool,
}
/// Guard that must be held to keep file logging active.
/// When dropped, flushes and closes log files.
pub struct LoggingGuard {
_guard: Option<tracing_appender::non_blocking::WorkerGuard>,
}
impl LoggingGuard {
fn new(guard: Option<tracing_appender::non_blocking::WorkerGuard>) -> Self {
Self { _guard: guard }
}
/// Creates a no-op guard for stderr/syslog logging.
pub fn noop() -> Self {
Self { _guard: None }
}
}
/// Initialize the tracing subscriber with the specified options.
///
/// Returns a reload handle for dynamic log level changes and a guard
/// that must be kept alive for file logging.
pub fn init_logging(
opts: &LoggingOptions,
initial_filter: &str,
) -> (reload::Handle<EnvFilter, impl tracing::Subscriber + Send + Sync>, LoggingGuard) {
let (filter_layer, filter_handle) = reload::Layer::new(EnvFilter::new(initial_filter));
match &opts.destination {
LogDestination::Stderr => {
let fmt_layer = fmt::Layer::default()
.with_ansi(!opts.disable_colors)
.with_target(true);
tracing_subscriber::registry()
.with(filter_layer)
.with(fmt_layer)
.init();
(filter_handle, LoggingGuard::noop())
}
#[cfg(unix)]
LogDestination::Syslog => {
// Use a custom fmt layer that writes to syslog
let fmt_layer = fmt::Layer::default()
.with_ansi(false)
.with_target(true)
.with_writer(SyslogWriter::new);
tracing_subscriber::registry()
.with(filter_layer)
.with(fmt_layer)
.init();
(filter_handle, LoggingGuard::noop())
}
LogDestination::File { path, rotate_daily } => {
let (non_blocking, guard) = if *rotate_daily {
// Extract directory and filename prefix
let path = Path::new(path);
let dir = path.parent().unwrap_or(Path::new("/var/log"));
let prefix = path.file_name()
.and_then(|s| s.to_str())
.unwrap_or("telemt");
let file_appender = tracing_appender::rolling::daily(dir, prefix);
tracing_appender::non_blocking(file_appender)
} else {
let file = std::fs::OpenOptions::new()
.create(true)
.append(true)
.open(path)
.expect("Failed to open log file");
tracing_appender::non_blocking(file)
};
let fmt_layer = fmt::Layer::default()
.with_ansi(false)
.with_target(true)
.with_writer(non_blocking);
tracing_subscriber::registry()
.with(filter_layer)
.with(fmt_layer)
.init();
(filter_handle, LoggingGuard::new(Some(guard)))
}
}
}
/// Syslog writer for tracing.
#[cfg(unix)]
struct SyslogWriter {
_private: (),
}
#[cfg(unix)]
impl SyslogWriter {
fn new() -> Self {
// Open syslog connection on first use
static INIT: std::sync::Once = std::sync::Once::new();
INIT.call_once(|| {
unsafe {
// Open syslog with ident "telemt", LOG_PID, LOG_DAEMON facility
let ident = b"telemt\0".as_ptr() as *const libc::c_char;
libc::openlog(ident, libc::LOG_PID | libc::LOG_NDELAY, libc::LOG_DAEMON);
}
});
Self { _private: () }
}
}
#[cfg(unix)]
impl std::io::Write for SyslogWriter {
fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
// Convert to C string, stripping newlines
let msg = String::from_utf8_lossy(buf);
let msg = msg.trim_end();
if msg.is_empty() {
return Ok(buf.len());
}
// Determine priority based on log level in the message
let priority = if msg.contains(" ERROR ") || msg.contains(" error ") {
libc::LOG_ERR
} else if msg.contains(" WARN ") || msg.contains(" warn ") {
libc::LOG_WARNING
} else if msg.contains(" INFO ") || msg.contains(" info ") {
libc::LOG_INFO
} else if msg.contains(" DEBUG ") || msg.contains(" debug ") {
libc::LOG_DEBUG
} else {
libc::LOG_INFO
};
// Write to syslog
let c_msg = std::ffi::CString::new(msg.as_bytes())
.unwrap_or_else(|_| std::ffi::CString::new("(invalid utf8)").unwrap());
unsafe {
libc::syslog(priority, b"%s\0".as_ptr() as *const libc::c_char, c_msg.as_ptr());
}
Ok(buf.len())
}
fn flush(&mut self) -> std::io::Result<()> {
Ok(())
}
}
#[cfg(unix)]
impl<'a> tracing_subscriber::fmt::MakeWriter<'a> for SyslogWriter {
type Writer = SyslogWriter;
fn make_writer(&'a self) -> Self::Writer {
SyslogWriter::new()
}
}
/// Parse log destination from CLI arguments.
pub fn parse_log_destination(args: &[String]) -> LogDestination {
let mut i = 0;
while i < args.len() {
match args[i].as_str() {
#[cfg(unix)]
"--syslog" => {
return LogDestination::Syslog;
}
"--log-file" => {
i += 1;
if i < args.len() {
return LogDestination::File {
path: args[i].clone(),
rotate_daily: false,
};
}
}
s if s.starts_with("--log-file=") => {
return LogDestination::File {
path: s.trim_start_matches("--log-file=").to_string(),
rotate_daily: false,
};
}
"--log-file-daily" => {
i += 1;
if i < args.len() {
return LogDestination::File {
path: args[i].clone(),
rotate_daily: true,
};
}
}
s if s.starts_with("--log-file-daily=") => {
return LogDestination::File {
path: s.trim_start_matches("--log-file-daily=").to_string(),
rotate_daily: true,
};
}
_ => {}
}
i += 1;
}
LogDestination::Stderr
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_parse_log_destination_default() {
let args: Vec<String> = vec![];
assert!(matches!(parse_log_destination(&args), LogDestination::Stderr));
}
#[test]
fn test_parse_log_destination_file() {
let args = vec!["--log-file".to_string(), "/var/log/telemt.log".to_string()];
match parse_log_destination(&args) {
LogDestination::File { path, rotate_daily } => {
assert_eq!(path, "/var/log/telemt.log");
assert!(!rotate_daily);
}
_ => panic!("Expected File destination"),
}
}
#[test]
fn test_parse_log_destination_file_daily() {
let args = vec!["--log-file-daily=/var/log/telemt".to_string()];
match parse_log_destination(&args) {
LogDestination::File { path, rotate_daily } => {
assert_eq!(path, "/var/log/telemt");
assert!(rotate_daily);
}
_ => panic!("Expected File destination"),
}
}
#[cfg(unix)]
#[test]
fn test_parse_log_destination_syslog() {
let args = vec!["--syslog".to_string()];
assert!(matches!(parse_log_destination(&args), LogDestination::Syslog));
}
}

View File

@@ -6,11 +6,21 @@ use tracing::{debug, error, info, warn};
use crate::cli;
use crate::config::ProxyConfig;
use crate::logging::LogDestination;
use crate::transport::middle_proxy::{
ProxyConfigData, fetch_proxy_config_with_raw, load_proxy_config_cache, save_proxy_config_cache,
};
pub(crate) fn parse_cli() -> (String, Option<PathBuf>, bool, Option<String>) {
/// Parsed CLI arguments.
pub(crate) struct CliArgs {
pub config_path: String,
pub data_path: Option<PathBuf>,
pub silent: bool,
pub log_level: Option<String>,
pub log_destination: LogDestination,
}
pub(crate) fn parse_cli() -> CliArgs {
let mut config_path = "config.toml".to_string();
let mut data_path: Option<PathBuf> = None;
let mut silent = false;
@@ -18,6 +28,9 @@ pub(crate) fn parse_cli() -> (String, Option<PathBuf>, bool, Option<String>) {
let args: Vec<String> = std::env::args().skip(1).collect();
// Parse log destination
let log_destination = crate::logging::parse_log_destination(&args);
// Check for --init first (handled before tokio)
if let Some(init_opts) = cli::parse_init_args(&args) {
if let Err(e) = cli::run_init(init_opts) {
@@ -55,34 +68,35 @@ pub(crate) fn parse_cli() -> (String, Option<PathBuf>, bool, Option<String>) {
log_level = Some(s.trim_start_matches("--log-level=").to_string());
}
"--help" | "-h" => {
eprintln!("Usage: telemt [config.toml] [OPTIONS]");
eprintln!();
eprintln!("Options:");
eprintln!(" --data-path <DIR> Set data directory (absolute path; overrides config value)");
eprintln!(" --silent, -s Suppress info logs");
eprintln!(" --log-level <LEVEL> debug|verbose|normal|silent");
eprintln!(" --help, -h Show this help");
eprintln!();
eprintln!("Setup (fire-and-forget):");
eprintln!(
" --init Generate config, install systemd service, start"
);
eprintln!(" --port <PORT> Listen port (default: 443)");
eprintln!(
" --domain <DOMAIN> TLS domain for masking (default: www.google.com)"
);
eprintln!(
" --secret <HEX> 32-char hex secret (auto-generated if omitted)"
);
eprintln!(" --user <NAME> Username (default: user)");
eprintln!(" --config-dir <DIR> Config directory (default: /etc/telemt)");
eprintln!(" --no-start Don't start the service after install");
print_help();
std::process::exit(0);
}
"--version" | "-V" => {
println!("telemt {}", env!("CARGO_PKG_VERSION"));
std::process::exit(0);
}
// Skip daemon-related flags (already parsed)
"--daemon" | "-d" | "--foreground" | "-f" => {}
s if s.starts_with("--pid-file") => {
if !s.contains('=') {
i += 1; // skip value
}
}
s if s.starts_with("--run-as-user") => {
if !s.contains('=') {
i += 1;
}
}
s if s.starts_with("--run-as-group") => {
if !s.contains('=') {
i += 1;
}
}
s if s.starts_with("--working-dir") => {
if !s.contains('=') {
i += 1;
}
}
s if !s.starts_with('-') => {
config_path = s.to_string();
}
@@ -93,7 +107,77 @@ pub(crate) fn parse_cli() -> (String, Option<PathBuf>, bool, Option<String>) {
i += 1;
}
(config_path, data_path, silent, log_level)
CliArgs {
config_path,
data_path,
silent,
log_level,
log_destination,
}
}
fn print_help() {
eprintln!("Usage: telemt [COMMAND] [OPTIONS] [config.toml]");
eprintln!();
eprintln!("Commands:");
eprintln!(" run Run in foreground (default if no command given)");
#[cfg(unix)]
{
eprintln!(" start Start as background daemon");
eprintln!(" stop Stop a running daemon");
eprintln!(" reload Reload configuration (send SIGHUP)");
eprintln!(" status Check if daemon is running");
}
eprintln!();
eprintln!("Options:");
eprintln!(" --data-path <DIR> Set data directory (absolute path; overrides config value)");
eprintln!(" --silent, -s Suppress info logs");
eprintln!(" --log-level <LEVEL> debug|verbose|normal|silent");
eprintln!(" --help, -h Show this help");
eprintln!(" --version, -V Show version");
eprintln!();
eprintln!("Logging options:");
eprintln!(" --log-file <PATH> Log to file (default: stderr)");
eprintln!(" --log-file-daily <PATH> Log to file with daily rotation");
#[cfg(unix)]
eprintln!(" --syslog Log to syslog (Unix only)");
eprintln!();
#[cfg(unix)]
{
eprintln!("Daemon options (Unix only):");
eprintln!(" --daemon, -d Fork to background (daemonize)");
eprintln!(" --foreground, -f Explicit foreground mode (for systemd)");
eprintln!(" --pid-file <PATH> PID file path (default: /var/run/telemt.pid)");
eprintln!(" --run-as-user <USER> Drop privileges to this user after binding");
eprintln!(" --run-as-group <GROUP> Drop privileges to this group after binding");
eprintln!(" --working-dir <DIR> Working directory for daemon mode");
eprintln!();
}
eprintln!("Setup (fire-and-forget):");
eprintln!(
" --init Generate config, install systemd service, start"
);
eprintln!(" --port <PORT> Listen port (default: 443)");
eprintln!(
" --domain <DOMAIN> TLS domain for masking (default: www.google.com)"
);
eprintln!(
" --secret <HEX> 32-char hex secret (auto-generated if omitted)"
);
eprintln!(" --user <NAME> Username (default: user)");
eprintln!(" --config-dir <DIR> Config directory (default: /etc/telemt)");
eprintln!(" --no-start Don't start the service after install");
#[cfg(unix)]
{
eprintln!();
eprintln!("Examples:");
eprintln!(" telemt config.toml Run in foreground");
eprintln!(" telemt start config.toml Start as daemon");
eprintln!(" telemt start --pid-file /tmp/t.pid Start with custom PID file");
eprintln!(" telemt stop Stop daemon");
eprintln!(" telemt reload Reload configuration");
eprintln!(" telemt status Check daemon status");
}
}
pub(crate) fn print_proxy_links(host: &str, port: u16, config: &ProxyConfig) {

View File

@@ -237,6 +237,7 @@ pub(crate) async fn initialize_me_pool(
config.general.me_adaptive_floor_max_warm_writers_global,
config.general.hardswap,
config.general.me_pool_drain_ttl_secs,
config.general.me_instadrain,
config.general.me_pool_drain_threshold,
config.general.me_pool_drain_soft_evict_enabled,
config.general.me_pool_drain_soft_evict_grace_secs,
@@ -331,18 +332,76 @@ pub(crate) async fn initialize_me_pool(
"Middle-End pool initialized successfully"
);
let pool_health = pool_bg.clone();
let rng_health = rng_bg.clone();
let min_conns = pool_size;
tokio::spawn(async move {
crate::transport::middle_proxy::me_health_monitor(
pool_health,
rng_health,
min_conns,
)
.await;
});
break;
// ── Supervised background tasks ──────────────────
// Each task runs inside a nested tokio::spawn so
// that a panic is caught via JoinHandle and the
// outer loop restarts the task automatically.
let pool_health = pool_bg.clone();
let rng_health = rng_bg.clone();
let min_conns = pool_size;
tokio::spawn(async move {
loop {
let p = pool_health.clone();
let r = rng_health.clone();
let res = tokio::spawn(async move {
crate::transport::middle_proxy::me_health_monitor(
p, r, min_conns,
)
.await;
})
.await;
match res {
Ok(()) => warn!("me_health_monitor exited unexpectedly, restarting"),
Err(e) => {
error!(error = %e, "me_health_monitor panicked, restarting in 1s");
tokio::time::sleep(Duration::from_secs(1)).await;
}
}
}
});
let pool_drain_enforcer = pool_bg.clone();
tokio::spawn(async move {
loop {
let p = pool_drain_enforcer.clone();
let res = tokio::spawn(async move {
crate::transport::middle_proxy::me_drain_timeout_enforcer(p).await;
})
.await;
match res {
Ok(()) => warn!("me_drain_timeout_enforcer exited unexpectedly, restarting"),
Err(e) => {
error!(error = %e, "me_drain_timeout_enforcer panicked, restarting in 1s");
tokio::time::sleep(Duration::from_secs(1)).await;
}
}
}
});
let pool_watchdog = pool_bg.clone();
tokio::spawn(async move {
loop {
let p = pool_watchdog.clone();
let res = tokio::spawn(async move {
crate::transport::middle_proxy::me_zombie_writer_watchdog(p).await;
})
.await;
match res {
Ok(()) => warn!("me_zombie_writer_watchdog exited unexpectedly, restarting"),
Err(e) => {
error!(error = %e, "me_zombie_writer_watchdog panicked, restarting in 1s");
tokio::time::sleep(Duration::from_secs(1)).await;
}
}
}
});
// CRITICAL: keep the current-thread runtime
// alive. Without this, block_on() returns,
// the Runtime is dropped, and ALL spawned
// background tasks (health monitor, drain
// enforcer, zombie watchdog) are silently
// cancelled — causing the draining-writer
// leak that brought us here.
std::future::pending::<()>().await;
unreachable!();
}
Err(e) => {
startup_tracker_bg.set_me_last_error(Some(e.to_string())).await;
@@ -400,16 +459,65 @@ pub(crate) async fn initialize_me_pool(
"Middle-End pool initialized successfully"
);
let pool_clone = pool.clone();
let rng_clone = rng.clone();
let min_conns = pool_size;
tokio::spawn(async move {
crate::transport::middle_proxy::me_health_monitor(
pool_clone, rng_clone, min_conns,
)
.await;
});
// ── Supervised background tasks ──────────────────
let pool_clone = pool.clone();
let rng_clone = rng.clone();
let min_conns = pool_size;
tokio::spawn(async move {
loop {
let p = pool_clone.clone();
let r = rng_clone.clone();
let res = tokio::spawn(async move {
crate::transport::middle_proxy::me_health_monitor(
p, r, min_conns,
)
.await;
})
.await;
match res {
Ok(()) => warn!("me_health_monitor exited unexpectedly, restarting"),
Err(e) => {
error!(error = %e, "me_health_monitor panicked, restarting in 1s");
tokio::time::sleep(Duration::from_secs(1)).await;
}
}
}
});
let pool_drain_enforcer = pool.clone();
tokio::spawn(async move {
loop {
let p = pool_drain_enforcer.clone();
let res = tokio::spawn(async move {
crate::transport::middle_proxy::me_drain_timeout_enforcer(p).await;
})
.await;
match res {
Ok(()) => warn!("me_drain_timeout_enforcer exited unexpectedly, restarting"),
Err(e) => {
error!(error = %e, "me_drain_timeout_enforcer panicked, restarting in 1s");
tokio::time::sleep(Duration::from_secs(1)).await;
}
}
}
});
let pool_watchdog = pool.clone();
tokio::spawn(async move {
loop {
let p = pool_watchdog.clone();
let res = tokio::spawn(async move {
crate::transport::middle_proxy::me_zombie_writer_watchdog(p).await;
})
.await;
match res {
Ok(()) => warn!("me_zombie_writer_watchdog exited unexpectedly, restarting"),
Err(e) => {
error!(error = %e, "me_zombie_writer_watchdog panicked, restarting in 1s");
tokio::time::sleep(Duration::from_secs(1)).await;
}
}
}
});
break Some(pool);
}
Err(e) => {

View File

@@ -47,8 +47,56 @@ use crate::transport::middle_proxy::MePool;
use crate::transport::UpstreamManager;
use helpers::parse_cli;
#[cfg(unix)]
use crate::daemon::{DaemonOptions, PidFile, drop_privileges};
/// Runs the full telemt runtime startup pipeline and blocks until shutdown.
///
/// On Unix, daemon options should be handled before calling this function
/// (daemonization must happen before tokio runtime starts).
#[cfg(unix)]
pub async fn run_with_daemon(
daemon_opts: DaemonOptions,
) -> std::result::Result<(), Box<dyn std::error::Error>> {
run_inner(daemon_opts).await
}
/// Runs the full telemt runtime startup pipeline and blocks until shutdown.
///
/// This is the main entry point for non-daemon mode or when called as a library.
#[allow(dead_code)]
pub async fn run() -> std::result::Result<(), Box<dyn std::error::Error>> {
#[cfg(unix)]
{
// Parse CLI to get daemon options even in simple run() path
let args: Vec<String> = std::env::args().skip(1).collect();
let daemon_opts = crate::cli::parse_daemon_args(&args);
run_inner(daemon_opts).await
}
#[cfg(not(unix))]
{
run_inner().await
}
}
#[cfg(unix)]
async fn run_inner(
daemon_opts: DaemonOptions,
) -> std::result::Result<(), Box<dyn std::error::Error>> {
// Acquire PID file if daemonizing or if explicitly requested
// Keep it alive until shutdown (underscore prefix = intentionally kept for RAII cleanup)
let _pid_file = if daemon_opts.daemonize || daemon_opts.pid_file.is_some() {
let mut pf = PidFile::new(daemon_opts.pid_file_path());
if let Err(e) = pf.acquire() {
eprintln!("[telemt] {}", e);
std::process::exit(1);
}
Some(pf)
} else {
None
};
let process_started_at = Instant::now();
let process_started_at_epoch_secs = SystemTime::now()
.duration_since(UNIX_EPOCH)
@@ -58,7 +106,12 @@ pub async fn run() -> std::result::Result<(), Box<dyn std::error::Error>> {
startup_tracker
.start_component(COMPONENT_CONFIG_LOAD, Some("load and validate config".to_string()))
.await;
let (config_path, data_path, cli_silent, cli_log_level) = parse_cli();
let cli_args = parse_cli();
let config_path = cli_args.config_path;
let data_path = cli_args.data_path;
let cli_silent = cli_args.silent;
let cli_log_level = cli_args.log_level;
let log_destination = cli_args.log_destination;
let mut config = match ProxyConfig::load(&config_path) {
Ok(c) => c,
@@ -130,17 +183,43 @@ pub async fn run() -> std::result::Result<(), Box<dyn std::error::Error>> {
.start_component(COMPONENT_TRACING_INIT, Some("initialize tracing subscriber".to_string()))
.await;
// Configure color output based on config
let fmt_layer = if config.general.disable_colors {
fmt::Layer::default().with_ansi(false)
} else {
fmt::Layer::default().with_ansi(true)
};
// Initialize logging based on destination
let _logging_guard: Option<crate::logging::LoggingGuard>;
match log_destination {
crate::logging::LogDestination::Stderr => {
// Default: log to stderr (works with systemd journald)
let fmt_layer = if config.general.disable_colors {
fmt::Layer::default().with_ansi(false)
} else {
fmt::Layer::default().with_ansi(true)
};
tracing_subscriber::registry()
.with(filter_layer)
.with(fmt_layer)
.init();
_logging_guard = None;
}
#[cfg(unix)]
crate::logging::LogDestination::Syslog => {
// Syslog: for OpenRC/FreeBSD
let logging_opts = crate::logging::LoggingOptions {
destination: log_destination,
disable_colors: true,
};
let (_, guard) = crate::logging::init_logging(&logging_opts, "info");
_logging_guard = Some(guard);
}
crate::logging::LogDestination::File { .. } => {
// File logging with optional rotation
let logging_opts = crate::logging::LoggingOptions {
destination: log_destination,
disable_colors: true,
};
let (_, guard) = crate::logging::init_logging(&logging_opts, "info");
_logging_guard = Some(guard);
}
}
tracing_subscriber::registry()
.with(filter_layer)
.with(fmt_layer)
.init();
startup_tracker
.complete_component(COMPONENT_TRACING_INIT, Some("tracing initialized".to_string()))
.await;
@@ -555,6 +634,17 @@ pub async fn run() -> std::result::Result<(), Box<dyn std::error::Error>> {
std::process::exit(1);
}
// Drop privileges after binding sockets (which may require root for port < 1024)
if daemon_opts.user.is_some() || daemon_opts.group.is_some() {
if let Err(e) = drop_privileges(
daemon_opts.user.as_deref(),
daemon_opts.group.as_deref(),
) {
error!(error = %e, "Failed to drop privileges");
std::process::exit(1);
}
}
runtime_tasks::apply_runtime_log_filter(
has_rust_log,
&effective_log_level,
@@ -575,6 +665,9 @@ pub async fn run() -> std::result::Result<(), Box<dyn std::error::Error>> {
runtime_tasks::mark_runtime_ready(&startup_tracker).await;
// Spawn signal handlers for SIGUSR1/SIGUSR2 (non-shutdown signals)
shutdown::spawn_signal_handlers(stats.clone(), process_started_at);
listeners::spawn_tcp_accept_loops(
listeners,
config_rx.clone(),
@@ -592,7 +685,7 @@ pub async fn run() -> std::result::Result<(), Box<dyn std::error::Error>> {
max_connections.clone(),
);
shutdown::wait_for_shutdown(process_started_at, me_pool).await;
shutdown::wait_for_shutdown(process_started_at, me_pool, stats).await;
Ok(())
}

View File

@@ -1,42 +1,211 @@
//! Shutdown and signal handling for telemt.
//!
//! Handles graceful shutdown on various signals:
//! - SIGINT (Ctrl+C) / SIGTERM: Graceful shutdown
//! - SIGQUIT: Graceful shutdown with stats dump
//! - SIGUSR1: Reserved for log rotation (logs acknowledgment)
//! - SIGUSR2: Dump runtime status to log
//!
//! SIGHUP is handled separately in config/hot_reload.rs for config reload.
use std::sync::Arc;
use std::time::{Duration, Instant};
#[cfg(unix)]
use tokio::signal::unix::{SignalKind, signal};
#[cfg(not(unix))]
use tokio::signal;
use tracing::{error, info, warn};
use tracing::{info, warn};
use crate::stats::Stats;
use crate::transport::middle_proxy::MePool;
use super::helpers::{format_uptime, unit_label};
pub(crate) async fn wait_for_shutdown(process_started_at: Instant, me_pool: Option<Arc<MePool>>) {
match signal::ctrl_c().await {
Ok(()) => {
let shutdown_started_at = Instant::now();
info!("Shutting down...");
let uptime_secs = process_started_at.elapsed().as_secs();
info!("Uptime: {}", format_uptime(uptime_secs));
if let Some(pool) = &me_pool {
match tokio::time::timeout(Duration::from_secs(2), pool.shutdown_send_close_conn_all())
.await
{
Ok(total) => {
info!(
close_conn_sent = total,
"ME shutdown: RPC_CLOSE_CONN broadcast completed"
);
}
Err(_) => {
warn!("ME shutdown: RPC_CLOSE_CONN broadcast timed out");
}
}
}
let shutdown_secs = shutdown_started_at.elapsed().as_secs();
info!(
"Shutdown completed successfully in {} {}.",
shutdown_secs,
unit_label(shutdown_secs, "second", "seconds")
);
/// Signal that triggered shutdown.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ShutdownSignal {
/// SIGINT (Ctrl+C)
Interrupt,
/// SIGTERM
Terminate,
/// SIGQUIT (with stats dump)
Quit,
}
impl std::fmt::Display for ShutdownSignal {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
ShutdownSignal::Interrupt => write!(f, "SIGINT"),
ShutdownSignal::Terminate => write!(f, "SIGTERM"),
ShutdownSignal::Quit => write!(f, "SIGQUIT"),
}
Err(e) => error!("Signal error: {}", e),
}
}
/// Waits for a shutdown signal and performs graceful shutdown.
pub(crate) async fn wait_for_shutdown(
process_started_at: Instant,
me_pool: Option<Arc<MePool>>,
stats: Arc<Stats>,
) {
let signal = wait_for_shutdown_signal().await;
perform_shutdown(signal, process_started_at, me_pool, &stats).await;
}
/// Waits for any shutdown signal (SIGINT, SIGTERM, SIGQUIT).
#[cfg(unix)]
async fn wait_for_shutdown_signal() -> ShutdownSignal {
let mut sigint = signal(SignalKind::interrupt()).expect("Failed to register SIGINT handler");
let mut sigterm = signal(SignalKind::terminate()).expect("Failed to register SIGTERM handler");
let mut sigquit = signal(SignalKind::quit()).expect("Failed to register SIGQUIT handler");
tokio::select! {
_ = sigint.recv() => ShutdownSignal::Interrupt,
_ = sigterm.recv() => ShutdownSignal::Terminate,
_ = sigquit.recv() => ShutdownSignal::Quit,
}
}
#[cfg(not(unix))]
async fn wait_for_shutdown_signal() -> ShutdownSignal {
signal::ctrl_c().await.expect("Failed to listen for Ctrl+C");
ShutdownSignal::Interrupt
}
/// Performs graceful shutdown sequence.
async fn perform_shutdown(
signal: ShutdownSignal,
process_started_at: Instant,
me_pool: Option<Arc<MePool>>,
stats: &Stats,
) {
let shutdown_started_at = Instant::now();
info!(signal = %signal, "Received shutdown signal");
// Dump stats if SIGQUIT
if signal == ShutdownSignal::Quit {
dump_stats(stats, process_started_at);
}
info!("Shutting down...");
let uptime_secs = process_started_at.elapsed().as_secs();
info!("Uptime: {}", format_uptime(uptime_secs));
// Graceful ME pool shutdown
if let Some(pool) = &me_pool {
match tokio::time::timeout(Duration::from_secs(2), pool.shutdown_send_close_conn_all()).await
{
Ok(total) => {
info!(
close_conn_sent = total,
"ME shutdown: RPC_CLOSE_CONN broadcast completed"
);
}
Err(_) => {
warn!("ME shutdown: RPC_CLOSE_CONN broadcast timed out");
}
}
}
let shutdown_secs = shutdown_started_at.elapsed().as_secs();
info!(
"Shutdown completed successfully in {} {}.",
shutdown_secs,
unit_label(shutdown_secs, "second", "seconds")
);
}
/// Dumps runtime statistics to the log.
fn dump_stats(stats: &Stats, process_started_at: Instant) {
let uptime_secs = process_started_at.elapsed().as_secs();
info!("=== Runtime Statistics Dump ===");
info!("Uptime: {}", format_uptime(uptime_secs));
// Connection stats
info!(
"Connections: total={}, current={} (direct={}, me={}), bad={}",
stats.get_connects_all(),
stats.get_current_connections_total(),
stats.get_current_connections_direct(),
stats.get_current_connections_me(),
stats.get_connects_bad(),
);
// ME pool stats
info!(
"ME keepalive: sent={}, pong={}, failed={}, timeout={}",
stats.get_me_keepalive_sent(),
stats.get_me_keepalive_pong(),
stats.get_me_keepalive_failed(),
stats.get_me_keepalive_timeout(),
);
// Relay stats
info!(
"Relay adaptive: promotions={}, demotions={}, hard_promotions={}",
stats.get_relay_adaptive_promotions_total(),
stats.get_relay_adaptive_demotions_total(),
stats.get_relay_adaptive_hard_promotions_total(),
);
info!("=== End Statistics Dump ===");
}
/// Spawns a background task to handle operational signals (SIGUSR1, SIGUSR2).
///
/// These signals don't trigger shutdown but perform specific actions:
/// - SIGUSR1: Log rotation acknowledgment (for external log rotation tools)
/// - SIGUSR2: Dump runtime status to log
#[cfg(unix)]
pub(crate) fn spawn_signal_handlers(
stats: Arc<Stats>,
process_started_at: Instant,
) {
tokio::spawn(async move {
let mut sigusr1 = signal(SignalKind::user_defined1())
.expect("Failed to register SIGUSR1 handler");
let mut sigusr2 = signal(SignalKind::user_defined2())
.expect("Failed to register SIGUSR2 handler");
loop {
tokio::select! {
_ = sigusr1.recv() => {
handle_sigusr1();
}
_ = sigusr2.recv() => {
handle_sigusr2(&stats, process_started_at);
}
}
}
});
}
/// No-op on non-Unix platforms.
#[cfg(not(unix))]
pub(crate) fn spawn_signal_handlers(
_stats: Arc<Stats>,
_process_started_at: Instant,
) {
// No SIGUSR1/SIGUSR2 on non-Unix
}
/// Handles SIGUSR1 - log rotation signal.
///
/// This signal is typically sent by logrotate or similar tools after
/// rotating log files. Since tracing-subscriber doesn't natively support
/// reopening files, we just acknowledge the signal. If file logging is
/// added in the future, this would reopen log file handles.
#[cfg(unix)]
fn handle_sigusr1() {
info!("SIGUSR1 received - log rotation acknowledged");
// Future: If using file-based logging, reopen file handles here
}
/// Handles SIGUSR2 - dump runtime status.
#[cfg(unix)]
fn handle_sigusr2(stats: &Stats, process_started_at: Instant) {
info!("SIGUSR2 received - dumping runtime status");
dump_stats(stats, process_started_at);
}

View File

@@ -38,12 +38,15 @@ pub(crate) async fn bootstrap_tls_front(
.clone()
.unwrap_or_else(|| config.censorship.tls_domain.clone());
let mask_unix_sock = config.censorship.mask_unix_sock.clone();
let tls_fetch_scope = (!config.censorship.tls_fetch_scope.is_empty())
.then(|| config.censorship.tls_fetch_scope.clone());
let fetch_timeout = Duration::from_secs(5);
let cache_initial = cache.clone();
let domains_initial = tls_domains.to_vec();
let host_initial = mask_host.clone();
let unix_sock_initial = mask_unix_sock.clone();
let scope_initial = tls_fetch_scope.clone();
let upstream_initial = upstream_manager.clone();
tokio::spawn(async move {
let mut join = tokio::task::JoinSet::new();
@@ -51,6 +54,7 @@ pub(crate) async fn bootstrap_tls_front(
let cache_domain = cache_initial.clone();
let host_domain = host_initial.clone();
let unix_sock_domain = unix_sock_initial.clone();
let scope_domain = scope_initial.clone();
let upstream_domain = upstream_initial.clone();
join.spawn(async move {
match crate::tls_front::fetcher::fetch_real_tls(
@@ -59,6 +63,7 @@ pub(crate) async fn bootstrap_tls_front(
&domain,
fetch_timeout,
Some(upstream_domain),
scope_domain.as_deref(),
proxy_protocol,
unix_sock_domain.as_deref(),
)
@@ -100,6 +105,7 @@ pub(crate) async fn bootstrap_tls_front(
let domains_refresh = tls_domains.to_vec();
let host_refresh = mask_host.clone();
let unix_sock_refresh = mask_unix_sock.clone();
let scope_refresh = tls_fetch_scope.clone();
let upstream_refresh = upstream_manager.clone();
tokio::spawn(async move {
loop {
@@ -112,6 +118,7 @@ pub(crate) async fn bootstrap_tls_front(
let cache_domain = cache_refresh.clone();
let host_domain = host_refresh.clone();
let unix_sock_domain = unix_sock_refresh.clone();
let scope_domain = scope_refresh.clone();
let upstream_domain = upstream_refresh.clone();
join.spawn(async move {
match crate::tls_front::fetcher::fetch_real_tls(
@@ -120,6 +127,7 @@ pub(crate) async fn bootstrap_tls_front(
&domain,
fetch_timeout,
Some(upstream_domain),
scope_domain.as_deref(),
proxy_protocol,
unix_sock_domain.as_deref(),
)

View File

@@ -4,8 +4,12 @@ mod api;
mod cli;
mod config;
mod crypto;
#[cfg(unix)]
mod daemon;
mod error;
mod ip_tracker;
mod logging;
mod service;
#[cfg(test)]
mod ip_tracker_regression_tests;
mod maestro;
@@ -20,7 +24,49 @@ mod tls_front;
mod transport;
mod util;
#[tokio::main]
async fn main() -> std::result::Result<(), Box<dyn std::error::Error>> {
maestro::run().await
fn main() -> std::result::Result<(), Box<dyn std::error::Error>> {
let args: Vec<String> = std::env::args().skip(1).collect();
let cmd = cli::parse_command(&args);
// Handle subcommands that don't need the server (stop, reload, status, init)
if let Some(exit_code) = cli::execute_subcommand(&cmd) {
std::process::exit(exit_code);
}
// On Unix, handle daemonization before starting tokio runtime
#[cfg(unix)]
{
let daemon_opts = cmd.daemon_opts;
// Daemonize if requested (must happen before tokio runtime starts)
if daemon_opts.should_daemonize() {
match daemon::daemonize(daemon_opts.working_dir.as_deref()) {
Ok(daemon::DaemonizeResult::Parent) => {
// Parent process exits successfully
std::process::exit(0);
}
Ok(daemon::DaemonizeResult::Child) => {
// Continue as daemon child
}
Err(e) => {
eprintln!("[telemt] Daemonization failed: {}", e);
std::process::exit(1);
}
}
}
// Now start tokio runtime and run the server
tokio::runtime::Builder::new_multi_thread()
.enable_all()
.build()?
.block_on(maestro::run_with_daemon(daemon_opts))
}
#[cfg(not(unix))]
{
tokio::runtime::Builder::new_multi_thread()
.enable_all()
.build()?
.block_on(maestro::run())
}
}

View File

@@ -16,7 +16,9 @@ use tracing::{info, warn, debug};
use crate::config::ProxyConfig;
use crate::ip_tracker::UserIpTracker;
use crate::stats::beobachten::BeobachtenStore;
use crate::stats::Stats;
use crate::stats::{
MeWriterCleanupSideEffectStep, MeWriterTeardownMode, MeWriterTeardownReason, Stats,
};
use crate::transport::{ListenOptions, create_listener};
pub async fn serve(
@@ -1770,6 +1772,169 @@ async fn render_metrics(stats: &Stats, config: &ProxyConfig, ip_tracker: &UserIp
}
);
let _ = writeln!(
out,
"# HELP telemt_me_writer_teardown_attempt_total ME writer teardown attempts by reason and mode"
);
let _ = writeln!(out, "# TYPE telemt_me_writer_teardown_attempt_total counter");
for reason in MeWriterTeardownReason::ALL {
for mode in MeWriterTeardownMode::ALL {
let _ = writeln!(
out,
"telemt_me_writer_teardown_attempt_total{{reason=\"{}\",mode=\"{}\"}} {}",
reason.as_str(),
mode.as_str(),
if me_allows_normal {
stats.get_me_writer_teardown_attempt_total(reason, mode)
} else {
0
}
);
}
}
let _ = writeln!(
out,
"# HELP telemt_me_writer_teardown_success_total ME writer teardown successes by mode"
);
let _ = writeln!(out, "# TYPE telemt_me_writer_teardown_success_total counter");
for mode in MeWriterTeardownMode::ALL {
let _ = writeln!(
out,
"telemt_me_writer_teardown_success_total{{mode=\"{}\"}} {}",
mode.as_str(),
if me_allows_normal {
stats.get_me_writer_teardown_success_total(mode)
} else {
0
}
);
}
let _ = writeln!(
out,
"# HELP telemt_me_writer_teardown_timeout_total Teardown operations that timed out"
);
let _ = writeln!(out, "# TYPE telemt_me_writer_teardown_timeout_total counter");
let _ = writeln!(
out,
"telemt_me_writer_teardown_timeout_total {}",
if me_allows_normal {
stats.get_me_writer_teardown_timeout_total()
} else {
0
}
);
let _ = writeln!(
out,
"# HELP telemt_me_writer_teardown_escalation_total Watchdog teardown escalations to hard detach"
);
let _ = writeln!(
out,
"# TYPE telemt_me_writer_teardown_escalation_total counter"
);
let _ = writeln!(
out,
"telemt_me_writer_teardown_escalation_total {}",
if me_allows_normal {
stats.get_me_writer_teardown_escalation_total()
} else {
0
}
);
let _ = writeln!(
out,
"# HELP telemt_me_writer_teardown_noop_total Teardown operations that became no-op"
);
let _ = writeln!(out, "# TYPE telemt_me_writer_teardown_noop_total counter");
let _ = writeln!(
out,
"telemt_me_writer_teardown_noop_total {}",
if me_allows_normal {
stats.get_me_writer_teardown_noop_total()
} else {
0
}
);
let _ = writeln!(
out,
"# HELP telemt_me_writer_teardown_duration_seconds ME writer teardown latency histogram by mode"
);
let _ = writeln!(
out,
"# TYPE telemt_me_writer_teardown_duration_seconds histogram"
);
let bucket_labels = Stats::me_writer_teardown_duration_bucket_labels();
for mode in MeWriterTeardownMode::ALL {
for (bucket_idx, label) in bucket_labels.iter().enumerate() {
let _ = writeln!(
out,
"telemt_me_writer_teardown_duration_seconds_bucket{{mode=\"{}\",le=\"{}\"}} {}",
mode.as_str(),
label,
if me_allows_normal {
stats.get_me_writer_teardown_duration_bucket_total(mode, bucket_idx)
} else {
0
}
);
}
let _ = writeln!(
out,
"telemt_me_writer_teardown_duration_seconds_bucket{{mode=\"{}\",le=\"+Inf\"}} {}",
mode.as_str(),
if me_allows_normal {
stats.get_me_writer_teardown_duration_count(mode)
} else {
0
}
);
let _ = writeln!(
out,
"telemt_me_writer_teardown_duration_seconds_sum{{mode=\"{}\"}} {:.6}",
mode.as_str(),
if me_allows_normal {
stats.get_me_writer_teardown_duration_sum_seconds(mode)
} else {
0.0
}
);
let _ = writeln!(
out,
"telemt_me_writer_teardown_duration_seconds_count{{mode=\"{}\"}} {}",
mode.as_str(),
if me_allows_normal {
stats.get_me_writer_teardown_duration_count(mode)
} else {
0
}
);
}
let _ = writeln!(
out,
"# HELP telemt_me_writer_cleanup_side_effect_failures_total Failed cleanup side effects by step"
);
let _ = writeln!(
out,
"# TYPE telemt_me_writer_cleanup_side_effect_failures_total counter"
);
for step in MeWriterCleanupSideEffectStep::ALL {
let _ = writeln!(
out,
"telemt_me_writer_cleanup_side_effect_failures_total{{step=\"{}\"}} {}",
step.as_str(),
if me_allows_normal {
stats.get_me_writer_cleanup_side_effect_failures_total(step)
} else {
0
}
);
}
let _ = writeln!(out, "# HELP telemt_me_refill_triggered_total Immediate ME refill runs started");
let _ = writeln!(out, "# TYPE telemt_me_refill_triggered_total counter");
let _ = writeln!(
@@ -2175,6 +2340,17 @@ mod tests {
assert!(output.contains("# TYPE telemt_me_rpc_proxy_req_signal_sent_total counter"));
assert!(output.contains("# TYPE telemt_me_idle_close_by_peer_total counter"));
assert!(output.contains("# TYPE telemt_me_writer_removed_total counter"));
assert!(output.contains("# TYPE telemt_me_writer_teardown_attempt_total counter"));
assert!(output.contains("# TYPE telemt_me_writer_teardown_success_total counter"));
assert!(output.contains("# TYPE telemt_me_writer_teardown_timeout_total counter"));
assert!(output.contains("# TYPE telemt_me_writer_teardown_escalation_total counter"));
assert!(output.contains("# TYPE telemt_me_writer_teardown_noop_total counter"));
assert!(output.contains(
"# TYPE telemt_me_writer_teardown_duration_seconds histogram"
));
assert!(output.contains(
"# TYPE telemt_me_writer_cleanup_side_effect_failures_total counter"
));
assert!(output.contains("# TYPE telemt_me_writer_close_signal_drop_total counter"));
assert!(output.contains(
"# TYPE telemt_me_writer_close_signal_channel_full_total counter"

View File

@@ -3,8 +3,7 @@ use std::io::Write;
use std::net::SocketAddr;
use std::sync::Arc;
use tokio::io::{AsyncRead, AsyncWrite, AsyncWriteExt};
use tokio::net::TcpStream;
use tokio::io::{AsyncRead, AsyncWrite, AsyncWriteExt, ReadHalf, WriteHalf, split};
use tokio::sync::watch;
use tracing::{debug, info, warn};
@@ -15,7 +14,7 @@ use crate::protocol::constants::*;
use crate::proxy::handshake::{HandshakeSuccess, encrypt_tg_nonce_with_ciphers, generate_tg_nonce};
use crate::proxy::relay::relay_bidirectional;
use crate::proxy::route_mode::{
RelayRouteMode, RouteCutoverState, ROUTE_SWITCH_ERROR_MSG, affected_cutover_state,
ROUTE_SWITCH_ERROR_MSG, RelayRouteMode, RouteCutoverState, affected_cutover_state,
cutover_stagger_delay,
};
use crate::proxy::adaptive_buffers;
@@ -56,7 +55,11 @@ where
);
let tg_stream = upstream_manager
.connect(dc_addr, Some(success.dc_idx), user.strip_prefix("scope_").filter(|s| !s.is_empty()))
.connect(
dc_addr,
Some(success.dc_idx),
user.strip_prefix("scope_").filter(|s| !s.is_empty()),
)
.await?;
debug!(peer = %success.peer, dc_addr = %dc_addr, "Connected, performing TG handshake");
@@ -93,11 +96,9 @@ where
);
tokio::pin!(relay_result);
let relay_result = loop {
if let Some(cutover) = affected_cutover_state(
&route_rx,
RelayRouteMode::Direct,
route_snapshot.generation,
) {
if let Some(cutover) =
affected_cutover_state(&route_rx, RelayRouteMode::Direct, route_snapshot.generation)
{
let delay = cutover_stagger_delay(session_id, cutover.generation);
warn!(
user = %user,
@@ -148,7 +149,9 @@ fn get_dc_addr_static(dc_idx: i16, config: &ProxyConfig) -> Result<SocketAddr> {
for addr_str in addrs {
match addr_str.parse::<SocketAddr>() {
Ok(addr) => parsed.push(addr),
Err(_) => warn!(dc_idx = dc_idx, addr_str = %addr_str, "Invalid DC override address in config, ignoring"),
Err(_) => {
warn!(dc_idx = dc_idx, addr_str = %addr_str, "Invalid DC override address in config, ignoring")
}
}
}
@@ -170,7 +173,10 @@ fn get_dc_addr_static(dc_idx: i16, config: &ProxyConfig) -> Result<SocketAddr> {
// Unknown DC requested by client without override: log and fall back.
if !config.dc_overrides.contains_key(&dc_key) {
warn!(dc_idx = dc_idx, "Requested non-standard DC with no override; falling back to default cluster");
warn!(
dc_idx = dc_idx,
"Requested non-standard DC with no override; falling back to default cluster"
);
if config.general.unknown_dc_file_log_enabled
&& let Some(path) = &config.general.unknown_dc_log_path
&& let Ok(handle) = tokio::runtime::Handle::try_current()
@@ -204,15 +210,15 @@ fn get_dc_addr_static(dc_idx: i16, config: &ProxyConfig) -> Result<SocketAddr> {
))
}
async fn do_tg_handshake_static(
mut stream: TcpStream,
async fn do_tg_handshake_static<S>(
mut stream: S,
success: &HandshakeSuccess,
config: &ProxyConfig,
rng: &SecureRandom,
) -> Result<(
CryptoReader<tokio::net::tcp::OwnedReadHalf>,
CryptoWriter<tokio::net::tcp::OwnedWriteHalf>,
)> {
) -> Result<(CryptoReader<ReadHalf<S>>, CryptoWriter<WriteHalf<S>>)>
where
S: AsyncRead + AsyncWrite + Unpin,
{
let (nonce, _tg_enc_key, _tg_enc_iv, _tg_dec_key, _tg_dec_iv) = generate_tg_nonce(
success.proto_tag,
success.dc_idx,
@@ -235,7 +241,7 @@ async fn do_tg_handshake_static(
stream.write_all(&encrypted_nonce).await?;
stream.flush().await?;
let (read_half, write_half) = stream.into_split();
let (read_half, write_half) = split(stream);
let max_pending = config.general.crypto_pending_buffer;
Ok((

376
src/service/mod.rs Normal file
View File

@@ -0,0 +1,376 @@
//! Service manager integration for telemt.
//!
//! Supports generating service files for:
//! - systemd (Linux)
//! - OpenRC (Alpine, Gentoo)
//! - rc.d (FreeBSD)
use std::path::Path;
/// Detected init/service system.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InitSystem {
/// systemd (most modern Linux distributions)
Systemd,
/// OpenRC (Alpine, Gentoo, some BSDs)
OpenRC,
/// FreeBSD rc.d
FreeBSDRc,
/// No known init system detected
Unknown,
}
impl std::fmt::Display for InitSystem {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
InitSystem::Systemd => write!(f, "systemd"),
InitSystem::OpenRC => write!(f, "OpenRC"),
InitSystem::FreeBSDRc => write!(f, "FreeBSD rc.d"),
InitSystem::Unknown => write!(f, "unknown"),
}
}
}
/// Detects the init system in use on the current host.
pub fn detect_init_system() -> InitSystem {
// Check for systemd first (most common on Linux)
if Path::new("/run/systemd/system").exists() {
return InitSystem::Systemd;
}
// Check for OpenRC
if Path::new("/sbin/openrc-run").exists() || Path::new("/sbin/openrc").exists() {
return InitSystem::OpenRC;
}
// Check for FreeBSD rc.d
if Path::new("/etc/rc.subr").exists() && Path::new("/etc/rc.d").exists() {
return InitSystem::FreeBSDRc;
}
// Fallback: check if systemctl exists even without /run/systemd
if Path::new("/usr/bin/systemctl").exists() || Path::new("/bin/systemctl").exists() {
return InitSystem::Systemd;
}
InitSystem::Unknown
}
/// Returns the default service file path for the given init system.
pub fn service_file_path(init_system: InitSystem) -> &'static str {
match init_system {
InitSystem::Systemd => "/etc/systemd/system/telemt.service",
InitSystem::OpenRC => "/etc/init.d/telemt",
InitSystem::FreeBSDRc => "/usr/local/etc/rc.d/telemt",
InitSystem::Unknown => "/etc/init.d/telemt",
}
}
/// Options for generating service files.
pub struct ServiceOptions<'a> {
/// Path to the telemt executable
pub exe_path: &'a Path,
/// Path to the configuration file
pub config_path: &'a Path,
/// User to run as (optional)
pub user: Option<&'a str>,
/// Group to run as (optional)
pub group: Option<&'a str>,
/// PID file path
pub pid_file: &'a str,
/// Working directory
pub working_dir: Option<&'a str>,
/// Description
pub description: &'a str,
}
impl<'a> Default for ServiceOptions<'a> {
fn default() -> Self {
Self {
exe_path: Path::new("/usr/local/bin/telemt"),
config_path: Path::new("/etc/telemt/config.toml"),
user: Some("telemt"),
group: Some("telemt"),
pid_file: "/var/run/telemt.pid",
working_dir: Some("/var/lib/telemt"),
description: "Telemt MTProxy - Telegram MTProto Proxy",
}
}
}
/// Generates a service file for the given init system.
pub fn generate_service_file(init_system: InitSystem, opts: &ServiceOptions) -> String {
match init_system {
InitSystem::Systemd => generate_systemd_unit(opts),
InitSystem::OpenRC => generate_openrc_script(opts),
InitSystem::FreeBSDRc => generate_freebsd_rc_script(opts),
InitSystem::Unknown => generate_systemd_unit(opts), // Default to systemd format
}
}
/// Generates an enhanced systemd unit file.
fn generate_systemd_unit(opts: &ServiceOptions) -> String {
let user_line = opts.user.map(|u| format!("User={}", u)).unwrap_or_default();
let group_line = opts.group.map(|g| format!("Group={}", g)).unwrap_or_default();
let working_dir = opts.working_dir.map(|d| format!("WorkingDirectory={}", d)).unwrap_or_default();
format!(
r#"[Unit]
Description={description}
Documentation=https://github.com/telemt/telemt
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
ExecStart={exe} --foreground --pid-file {pid_file} {config}
ExecReload=/bin/kill -HUP $MAINPID
PIDFile={pid_file}
Restart=always
RestartSec=5
{user}
{group}
{working_dir}
# Resource limits
LimitNOFILE=65535
LimitNPROC=4096
# Security hardening
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
PrivateDevices=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
RestrictNamespaces=true
RestrictRealtime=true
RestrictSUIDSGID=true
MemoryDenyWriteExecute=true
LockPersonality=true
# Allow binding to privileged ports and writing to specific paths
AmbientCapabilities=CAP_NET_BIND_SERVICE
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
ReadWritePaths=/etc/telemt /var/run /var/lib/telemt
[Install]
WantedBy=multi-user.target
"#,
description = opts.description,
exe = opts.exe_path.display(),
config = opts.config_path.display(),
pid_file = opts.pid_file,
user = user_line,
group = group_line,
working_dir = working_dir,
)
}
/// Generates an OpenRC init script.
fn generate_openrc_script(opts: &ServiceOptions) -> String {
let user = opts.user.unwrap_or("root");
let group = opts.group.unwrap_or("root");
format!(
r#"#!/sbin/openrc-run
# OpenRC init script for telemt
description="{description}"
command="{exe}"
command_args="--daemon --syslog --pid-file {pid_file} {config}"
command_user="{user}:{group}"
pidfile="{pid_file}"
depend() {{
need net
use logger
after firewall
}}
start_pre() {{
checkpath --directory --owner {user}:{group} --mode 0755 /var/run
checkpath --directory --owner {user}:{group} --mode 0755 /var/lib/telemt
checkpath --directory --owner {user}:{group} --mode 0755 /var/log/telemt
}}
reload() {{
ebegin "Reloading ${{RC_SVCNAME}}"
start-stop-daemon --signal HUP --pidfile "${{pidfile}}"
eend $?
}}
"#,
description = opts.description,
exe = opts.exe_path.display(),
config = opts.config_path.display(),
pid_file = opts.pid_file,
user = user,
group = group,
)
}
/// Generates a FreeBSD rc.d script.
fn generate_freebsd_rc_script(opts: &ServiceOptions) -> String {
let user = opts.user.unwrap_or("root");
let group = opts.group.unwrap_or("wheel");
format!(
r#"#!/bin/sh
#
# PROVIDE: telemt
# REQUIRE: LOGIN NETWORKING
# KEYWORD: shutdown
#
# Add the following lines to /etc/rc.conf to enable telemt:
#
# telemt_enable="YES"
# telemt_config="/etc/telemt/config.toml" # optional
# telemt_user="telemt" # optional
# telemt_group="telemt" # optional
#
. /etc/rc.subr
name="telemt"
rcvar="telemt_enable"
desc="{description}"
load_rc_config $name
: ${{telemt_enable:="NO"}}
: ${{telemt_config:="{config}"}}
: ${{telemt_user:="{user}"}}
: ${{telemt_group:="{group}"}}
: ${{telemt_pidfile:="{pid_file}"}}
pidfile="${{telemt_pidfile}}"
command="{exe}"
command_args="--daemon --syslog --pid-file ${{telemt_pidfile}} ${{telemt_config}}"
start_precmd="telemt_prestart"
reload_cmd="telemt_reload"
extra_commands="reload"
telemt_prestart() {{
install -d -o ${{telemt_user}} -g ${{telemt_group}} -m 755 /var/run
install -d -o ${{telemt_user}} -g ${{telemt_group}} -m 755 /var/lib/telemt
}}
telemt_reload() {{
if [ -f "${{pidfile}}" ]; then
echo "Reloading ${{name}} configuration."
kill -HUP $(cat ${{pidfile}})
else
echo "${{name}} is not running."
return 1
fi
}}
run_rc_command "$1"
"#,
description = opts.description,
exe = opts.exe_path.display(),
config = opts.config_path.display(),
pid_file = opts.pid_file,
user = user,
group = group,
)
}
/// Installation instructions for each init system.
pub fn installation_instructions(init_system: InitSystem) -> &'static str {
match init_system {
InitSystem::Systemd => {
r#"To install and enable the service:
sudo systemctl daemon-reload
sudo systemctl enable telemt
sudo systemctl start telemt
To check status:
sudo systemctl status telemt
To view logs:
journalctl -u telemt -f
To reload configuration:
sudo systemctl reload telemt
"#
}
InitSystem::OpenRC => {
r#"To install and enable the service:
sudo chmod +x /etc/init.d/telemt
sudo rc-update add telemt default
sudo rc-service telemt start
To check status:
sudo rc-service telemt status
To reload configuration:
sudo rc-service telemt reload
"#
}
InitSystem::FreeBSDRc => {
r#"To install and enable the service:
sudo chmod +x /usr/local/etc/rc.d/telemt
sudo sysrc telemt_enable="YES"
sudo service telemt start
To check status:
sudo service telemt status
To reload configuration:
sudo service telemt reload
"#
}
InitSystem::Unknown => {
r#"No supported init system detected.
You may need to create a service file manually or run telemt directly:
telemt start /etc/telemt/config.toml
"#
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_systemd_unit_generation() {
let opts = ServiceOptions::default();
let unit = generate_systemd_unit(&opts);
assert!(unit.contains("[Unit]"));
assert!(unit.contains("[Service]"));
assert!(unit.contains("[Install]"));
assert!(unit.contains("ExecReload="));
assert!(unit.contains("PIDFile="));
}
#[test]
fn test_openrc_script_generation() {
let opts = ServiceOptions::default();
let script = generate_openrc_script(&opts);
assert!(script.contains("#!/sbin/openrc-run"));
assert!(script.contains("depend()"));
assert!(script.contains("reload()"));
}
#[test]
fn test_freebsd_rc_script_generation() {
let opts = ServiceOptions::default();
let script = generate_freebsd_rc_script(&opts);
assert!(script.contains("#!/bin/sh"));
assert!(script.contains("PROVIDE: telemt"));
assert!(script.contains("run_rc_command"));
}
#[test]
fn test_service_file_paths() {
assert_eq!(service_file_path(InitSystem::Systemd), "/etc/systemd/system/telemt.service");
assert_eq!(service_file_path(InitSystem::OpenRC), "/etc/init.d/telemt");
assert_eq!(service_file_path(InitSystem::FreeBSDRc), "/usr/local/etc/rc.d/telemt");
}
}

View File

@@ -19,6 +19,137 @@ use tracing::debug;
use crate::config::{MeTelemetryLevel, MeWriterPickMode};
use self::telemetry::TelemetryPolicy;
const ME_WRITER_TEARDOWN_MODE_COUNT: usize = 2;
const ME_WRITER_TEARDOWN_REASON_COUNT: usize = 11;
const ME_WRITER_CLEANUP_SIDE_EFFECT_STEP_COUNT: usize = 2;
const ME_WRITER_TEARDOWN_DURATION_BUCKET_COUNT: usize = 12;
const ME_WRITER_TEARDOWN_DURATION_BUCKET_BOUNDS_MICROS: [u64; ME_WRITER_TEARDOWN_DURATION_BUCKET_COUNT] = [
1_000,
5_000,
10_000,
25_000,
50_000,
100_000,
250_000,
500_000,
1_000_000,
2_500_000,
5_000_000,
10_000_000,
];
const ME_WRITER_TEARDOWN_DURATION_BUCKET_LABELS: [&str; ME_WRITER_TEARDOWN_DURATION_BUCKET_COUNT] = [
"0.001",
"0.005",
"0.01",
"0.025",
"0.05",
"0.1",
"0.25",
"0.5",
"1",
"2.5",
"5",
"10",
];
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
#[repr(u8)]
pub enum MeWriterTeardownMode {
Normal = 0,
HardDetach = 1,
}
impl MeWriterTeardownMode {
pub const ALL: [Self; ME_WRITER_TEARDOWN_MODE_COUNT] =
[Self::Normal, Self::HardDetach];
pub const fn as_str(self) -> &'static str {
match self {
Self::Normal => "normal",
Self::HardDetach => "hard_detach",
}
}
const fn idx(self) -> usize {
self as usize
}
}
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
#[repr(u8)]
pub enum MeWriterTeardownReason {
ReaderExit = 0,
WriterTaskExit = 1,
PingSendFail = 2,
SignalSendFail = 3,
RouteChannelClosed = 4,
CloseRpcChannelClosed = 5,
PruneClosedWriter = 6,
ReapTimeoutExpired = 7,
ReapThresholdForce = 8,
ReapEmpty = 9,
WatchdogStuckDraining = 10,
}
impl MeWriterTeardownReason {
pub const ALL: [Self; ME_WRITER_TEARDOWN_REASON_COUNT] = [
Self::ReaderExit,
Self::WriterTaskExit,
Self::PingSendFail,
Self::SignalSendFail,
Self::RouteChannelClosed,
Self::CloseRpcChannelClosed,
Self::PruneClosedWriter,
Self::ReapTimeoutExpired,
Self::ReapThresholdForce,
Self::ReapEmpty,
Self::WatchdogStuckDraining,
];
pub const fn as_str(self) -> &'static str {
match self {
Self::ReaderExit => "reader_exit",
Self::WriterTaskExit => "writer_task_exit",
Self::PingSendFail => "ping_send_fail",
Self::SignalSendFail => "signal_send_fail",
Self::RouteChannelClosed => "route_channel_closed",
Self::CloseRpcChannelClosed => "close_rpc_channel_closed",
Self::PruneClosedWriter => "prune_closed_writer",
Self::ReapTimeoutExpired => "reap_timeout_expired",
Self::ReapThresholdForce => "reap_threshold_force",
Self::ReapEmpty => "reap_empty",
Self::WatchdogStuckDraining => "watchdog_stuck_draining",
}
}
const fn idx(self) -> usize {
self as usize
}
}
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
#[repr(u8)]
pub enum MeWriterCleanupSideEffectStep {
CloseSignalChannelFull = 0,
CloseSignalChannelClosed = 1,
}
impl MeWriterCleanupSideEffectStep {
pub const ALL: [Self; ME_WRITER_CLEANUP_SIDE_EFFECT_STEP_COUNT] =
[Self::CloseSignalChannelFull, Self::CloseSignalChannelClosed];
pub const fn as_str(self) -> &'static str {
match self {
Self::CloseSignalChannelFull => "close_signal_channel_full",
Self::CloseSignalChannelClosed => "close_signal_channel_closed",
}
}
const fn idx(self) -> usize {
self as usize
}
}
// ============= Stats =============
#[derive(Default)]
@@ -128,6 +259,18 @@ pub struct Stats {
me_draining_writers_reap_progress_total: AtomicU64,
me_writer_removed_total: AtomicU64,
me_writer_removed_unexpected_total: AtomicU64,
me_writer_teardown_attempt_total:
[[AtomicU64; ME_WRITER_TEARDOWN_MODE_COUNT]; ME_WRITER_TEARDOWN_REASON_COUNT],
me_writer_teardown_success_total: [AtomicU64; ME_WRITER_TEARDOWN_MODE_COUNT],
me_writer_teardown_timeout_total: AtomicU64,
me_writer_teardown_escalation_total: AtomicU64,
me_writer_teardown_noop_total: AtomicU64,
me_writer_cleanup_side_effect_failures_total:
[AtomicU64; ME_WRITER_CLEANUP_SIDE_EFFECT_STEP_COUNT],
me_writer_teardown_duration_bucket_hits:
[[AtomicU64; ME_WRITER_TEARDOWN_DURATION_BUCKET_COUNT + 1]; ME_WRITER_TEARDOWN_MODE_COUNT],
me_writer_teardown_duration_sum_micros: [AtomicU64; ME_WRITER_TEARDOWN_MODE_COUNT],
me_writer_teardown_duration_count: [AtomicU64; ME_WRITER_TEARDOWN_MODE_COUNT],
me_refill_triggered_total: AtomicU64,
me_refill_skipped_inflight_total: AtomicU64,
me_refill_failed_total: AtomicU64,
@@ -765,6 +908,74 @@ impl Stats {
self.me_writer_removed_unexpected_total.fetch_add(1, Ordering::Relaxed);
}
}
pub fn increment_me_writer_teardown_attempt_total(
&self,
reason: MeWriterTeardownReason,
mode: MeWriterTeardownMode,
) {
if self.telemetry_me_allows_normal() {
self.me_writer_teardown_attempt_total[reason.idx()][mode.idx()]
.fetch_add(1, Ordering::Relaxed);
}
}
pub fn increment_me_writer_teardown_success_total(&self, mode: MeWriterTeardownMode) {
if self.telemetry_me_allows_normal() {
self.me_writer_teardown_success_total[mode.idx()].fetch_add(1, Ordering::Relaxed);
}
}
pub fn increment_me_writer_teardown_timeout_total(&self) {
if self.telemetry_me_allows_normal() {
self.me_writer_teardown_timeout_total
.fetch_add(1, Ordering::Relaxed);
}
}
pub fn increment_me_writer_teardown_escalation_total(&self) {
if self.telemetry_me_allows_normal() {
self.me_writer_teardown_escalation_total
.fetch_add(1, Ordering::Relaxed);
}
}
pub fn increment_me_writer_teardown_noop_total(&self) {
if self.telemetry_me_allows_normal() {
self.me_writer_teardown_noop_total
.fetch_add(1, Ordering::Relaxed);
}
}
pub fn increment_me_writer_cleanup_side_effect_failures_total(
&self,
step: MeWriterCleanupSideEffectStep,
) {
if self.telemetry_me_allows_normal() {
self.me_writer_cleanup_side_effect_failures_total[step.idx()]
.fetch_add(1, Ordering::Relaxed);
}
}
pub fn observe_me_writer_teardown_duration(
&self,
mode: MeWriterTeardownMode,
duration: Duration,
) {
if !self.telemetry_me_allows_normal() {
return;
}
let duration_micros = duration.as_micros().min(u64::MAX as u128) as u64;
let mut bucket_idx = ME_WRITER_TEARDOWN_DURATION_BUCKET_COUNT;
for (idx, upper_bound_micros) in ME_WRITER_TEARDOWN_DURATION_BUCKET_BOUNDS_MICROS
.iter()
.copied()
.enumerate()
{
if duration_micros <= upper_bound_micros {
bucket_idx = idx;
break;
}
}
self.me_writer_teardown_duration_bucket_hits[mode.idx()][bucket_idx]
.fetch_add(1, Ordering::Relaxed);
self.me_writer_teardown_duration_sum_micros[mode.idx()]
.fetch_add(duration_micros, Ordering::Relaxed);
self.me_writer_teardown_duration_count[mode.idx()].fetch_add(1, Ordering::Relaxed);
}
pub fn increment_me_refill_triggered_total(&self) {
if self.telemetry_me_allows_debug() {
self.me_refill_triggered_total.fetch_add(1, Ordering::Relaxed);
@@ -1297,6 +1508,79 @@ impl Stats {
pub fn get_me_writer_removed_unexpected_total(&self) -> u64 {
self.me_writer_removed_unexpected_total.load(Ordering::Relaxed)
}
pub fn get_me_writer_teardown_attempt_total(
&self,
reason: MeWriterTeardownReason,
mode: MeWriterTeardownMode,
) -> u64 {
self.me_writer_teardown_attempt_total[reason.idx()][mode.idx()]
.load(Ordering::Relaxed)
}
pub fn get_me_writer_teardown_attempt_total_by_mode(&self, mode: MeWriterTeardownMode) -> u64 {
MeWriterTeardownReason::ALL
.iter()
.copied()
.map(|reason| self.get_me_writer_teardown_attempt_total(reason, mode))
.sum()
}
pub fn get_me_writer_teardown_success_total(&self, mode: MeWriterTeardownMode) -> u64 {
self.me_writer_teardown_success_total[mode.idx()].load(Ordering::Relaxed)
}
pub fn get_me_writer_teardown_timeout_total(&self) -> u64 {
self.me_writer_teardown_timeout_total.load(Ordering::Relaxed)
}
pub fn get_me_writer_teardown_escalation_total(&self) -> u64 {
self.me_writer_teardown_escalation_total
.load(Ordering::Relaxed)
}
pub fn get_me_writer_teardown_noop_total(&self) -> u64 {
self.me_writer_teardown_noop_total.load(Ordering::Relaxed)
}
pub fn get_me_writer_cleanup_side_effect_failures_total(
&self,
step: MeWriterCleanupSideEffectStep,
) -> u64 {
self.me_writer_cleanup_side_effect_failures_total[step.idx()]
.load(Ordering::Relaxed)
}
pub fn get_me_writer_cleanup_side_effect_failures_total_all(&self) -> u64 {
MeWriterCleanupSideEffectStep::ALL
.iter()
.copied()
.map(|step| self.get_me_writer_cleanup_side_effect_failures_total(step))
.sum()
}
pub fn me_writer_teardown_duration_bucket_labels(
) -> &'static [&'static str; ME_WRITER_TEARDOWN_DURATION_BUCKET_COUNT] {
&ME_WRITER_TEARDOWN_DURATION_BUCKET_LABELS
}
pub fn get_me_writer_teardown_duration_bucket_hits(
&self,
mode: MeWriterTeardownMode,
bucket_idx: usize,
) -> u64 {
self.me_writer_teardown_duration_bucket_hits[mode.idx()][bucket_idx]
.load(Ordering::Relaxed)
}
pub fn get_me_writer_teardown_duration_bucket_total(
&self,
mode: MeWriterTeardownMode,
bucket_idx: usize,
) -> u64 {
let capped_idx = bucket_idx.min(ME_WRITER_TEARDOWN_DURATION_BUCKET_COUNT);
let mut total = 0u64;
for idx in 0..=capped_idx {
total = total.saturating_add(self.get_me_writer_teardown_duration_bucket_hits(mode, idx));
}
total
}
pub fn get_me_writer_teardown_duration_count(&self, mode: MeWriterTeardownMode) -> u64 {
self.me_writer_teardown_duration_count[mode.idx()].load(Ordering::Relaxed)
}
pub fn get_me_writer_teardown_duration_sum_seconds(&self, mode: MeWriterTeardownMode) -> f64 {
self.me_writer_teardown_duration_sum_micros[mode.idx()].load(Ordering::Relaxed) as f64
/ 1_000_000.0
}
pub fn get_me_refill_triggered_total(&self) -> u64 {
self.me_refill_triggered_total.load(Ordering::Relaxed)
}
@@ -1800,6 +2084,79 @@ mod tests {
assert_eq!(stats.get_me_keepalive_sent(), 0);
assert_eq!(stats.get_me_route_drop_queue_full(), 0);
}
#[test]
fn test_teardown_counters_and_duration() {
let stats = Stats::new();
stats.increment_me_writer_teardown_attempt_total(
MeWriterTeardownReason::ReaderExit,
MeWriterTeardownMode::Normal,
);
stats.increment_me_writer_teardown_success_total(MeWriterTeardownMode::Normal);
stats.observe_me_writer_teardown_duration(
MeWriterTeardownMode::Normal,
Duration::from_millis(3),
);
stats.increment_me_writer_cleanup_side_effect_failures_total(
MeWriterCleanupSideEffectStep::CloseSignalChannelFull,
);
assert_eq!(
stats.get_me_writer_teardown_attempt_total(
MeWriterTeardownReason::ReaderExit,
MeWriterTeardownMode::Normal
),
1
);
assert_eq!(
stats.get_me_writer_teardown_success_total(MeWriterTeardownMode::Normal),
1
);
assert_eq!(
stats.get_me_writer_teardown_duration_count(MeWriterTeardownMode::Normal),
1
);
assert!(
stats.get_me_writer_teardown_duration_sum_seconds(MeWriterTeardownMode::Normal) > 0.0
);
assert_eq!(
stats.get_me_writer_cleanup_side_effect_failures_total(
MeWriterCleanupSideEffectStep::CloseSignalChannelFull
),
1
);
}
#[test]
fn test_teardown_counters_respect_me_silent() {
let stats = Stats::new();
stats.apply_telemetry_policy(TelemetryPolicy {
core_enabled: true,
user_enabled: true,
me_level: MeTelemetryLevel::Silent,
});
stats.increment_me_writer_teardown_attempt_total(
MeWriterTeardownReason::ReaderExit,
MeWriterTeardownMode::Normal,
);
stats.increment_me_writer_teardown_timeout_total();
stats.observe_me_writer_teardown_duration(
MeWriterTeardownMode::Normal,
Duration::from_millis(1),
);
assert_eq!(
stats.get_me_writer_teardown_attempt_total(
MeWriterTeardownReason::ReaderExit,
MeWriterTeardownMode::Normal
),
0
);
assert_eq!(stats.get_me_writer_teardown_timeout_total(), 0);
assert_eq!(
stats.get_me_writer_teardown_duration_count(MeWriterTeardownMode::Normal),
0
);
}
#[test]
fn test_replay_checker_basic() {

View File

@@ -7,33 +7,29 @@ use tokio::net::TcpStream;
#[cfg(unix)]
use tokio::net::UnixStream;
use tokio::time::timeout;
use tokio_rustls::client::TlsStream;
use tokio_rustls::TlsConnector;
use tokio_rustls::client::TlsStream;
use tracing::{debug, warn};
use rustls::client::danger::{HandshakeSignatureValid, ServerCertVerified, ServerCertVerifier};
use rustls::client::ClientConfig;
use rustls::client::danger::{HandshakeSignatureValid, ServerCertVerified, ServerCertVerifier};
use rustls::pki_types::{CertificateDer, ServerName, UnixTime};
use rustls::{DigitallySignedStruct, Error as RustlsError};
use x509_parser::prelude::FromDer;
use x509_parser::certificate::X509Certificate;
use x509_parser::prelude::FromDer;
use crate::crypto::SecureRandom;
use crate::network::dns_overrides::resolve_socket_addr;
use crate::protocol::constants::{
TLS_RECORD_APPLICATION, TLS_RECORD_CHANGE_CIPHER, TLS_RECORD_HANDSHAKE,
};
use crate::transport::proxy_protocol::{ProxyProtocolV1Builder, ProxyProtocolV2Builder};
use crate::tls_front::types::{
ParsedCertificateInfo,
ParsedServerHello,
TlsBehaviorProfile,
TlsCertPayload,
TlsExtension,
TlsFetchResult,
TlsProfileSource,
ParsedCertificateInfo, ParsedServerHello, TlsBehaviorProfile, TlsCertPayload, TlsExtension,
TlsFetchResult, TlsProfileSource,
};
use crate::transport::UpstreamStream;
use crate::transport::proxy_protocol::{ProxyProtocolV1Builder, ProxyProtocolV2Builder};
/// No-op verifier: accept any certificate (we only need lengths and metadata).
#[derive(Debug)]
@@ -144,21 +140,27 @@ fn build_client_hello(sni: &str, rng: &SecureRandom) -> Vec<u8> {
exts.extend_from_slice(&0x000au16.to_be_bytes());
exts.extend_from_slice(&((2 + groups.len() * 2) as u16).to_be_bytes());
exts.extend_from_slice(&(groups.len() as u16 * 2).to_be_bytes());
for g in groups { exts.extend_from_slice(&g.to_be_bytes()); }
for g in groups {
exts.extend_from_slice(&g.to_be_bytes());
}
// signature_algorithms
let sig_algs: [u16; 4] = [0x0804, 0x0805, 0x0403, 0x0503]; // rsa_pss_rsae_sha256/384, ecdsa_secp256r1_sha256, rsa_pkcs1_sha256
exts.extend_from_slice(&0x000du16.to_be_bytes());
exts.extend_from_slice(&((2 + sig_algs.len() * 2) as u16).to_be_bytes());
exts.extend_from_slice(&(sig_algs.len() as u16 * 2).to_be_bytes());
for a in sig_algs { exts.extend_from_slice(&a.to_be_bytes()); }
for a in sig_algs {
exts.extend_from_slice(&a.to_be_bytes());
}
// supported_versions (TLS1.3 + TLS1.2)
let versions: [u16; 2] = [0x0304, 0x0303];
exts.extend_from_slice(&0x002bu16.to_be_bytes());
exts.extend_from_slice(&((1 + versions.len() * 2) as u16).to_be_bytes());
exts.push((versions.len() * 2) as u8);
for v in versions { exts.extend_from_slice(&v.to_be_bytes()); }
for v in versions {
exts.extend_from_slice(&v.to_be_bytes());
}
// key_share (x25519)
let key = gen_key_share(rng);
@@ -273,7 +275,10 @@ fn parse_server_hello(body: &[u8]) -> Option<ParsedServerHello> {
pos += 4;
let data = body.get(pos..pos + elen)?.to_vec();
pos += elen;
extensions.push(TlsExtension { ext_type: etype, data });
extensions.push(TlsExtension {
ext_type: etype,
data,
});
}
Some(ParsedServerHello {
@@ -394,37 +399,42 @@ async fn connect_tcp_with_upstream(
port: u16,
connect_timeout: Duration,
upstream: Option<std::sync::Arc<crate::transport::UpstreamManager>>,
) -> Result<TcpStream> {
scope: Option<&str>,
) -> Result<UpstreamStream> {
if let Some(manager) = upstream {
if let Some(addr) = resolve_socket_addr(host, port) {
match manager.connect(addr, None, None).await {
match manager.connect(addr, None, scope).await {
Ok(stream) => return Ok(stream),
Err(e) => {
warn!(
host = %host,
port = port,
scope = ?scope,
error = %e,
"Upstream connect failed, using direct connect"
);
}
}
} else if let Ok(mut addrs) = tokio::net::lookup_host((host, port)).await {
if let Some(addr) = addrs.find(|a| a.is_ipv4()) {
match manager.connect(addr, None, None).await {
Ok(stream) => return Ok(stream),
Err(e) => {
warn!(
host = %host,
port = port,
error = %e,
"Upstream connect failed, using direct connect"
);
}
} else if let Ok(mut addrs) = tokio::net::lookup_host((host, port)).await
&& let Some(addr) = addrs.find(|a| a.is_ipv4())
{
match manager.connect(addr, None, scope).await {
Ok(stream) => return Ok(stream),
Err(e) => {
warn!(
host = %host,
port = port,
scope = ?scope,
error = %e,
"Upstream connect failed, using direct connect"
);
}
}
}
}
connect_with_dns_override(host, port, connect_timeout).await
Ok(UpstreamStream::Tcp(
connect_with_dns_override(host, port, connect_timeout).await?,
))
}
fn encode_tls13_certificate_message(cert_chain_der: &[Vec<u8>]) -> Option<Vec<u8>> {
@@ -443,9 +453,7 @@ fn encode_tls13_certificate_message(cert_chain_der: &[Vec<u8>]) -> Option<Vec<u8
}
// Certificate = context_len(1) + certificate_list_len(3) + entries
let body_len = 1usize
.checked_add(3)?
.checked_add(certificate_list.len())?;
let body_len = 1usize.checked_add(3)?.checked_add(certificate_list.len())?;
let mut message = Vec::with_capacity(4 + body_len);
message.push(0x0b); // HandshakeType::certificate
@@ -537,6 +545,7 @@ async fn fetch_via_raw_tls(
sni: &str,
connect_timeout: Duration,
upstream: Option<std::sync::Arc<crate::transport::UpstreamManager>>,
scope: Option<&str>,
proxy_protocol: u8,
unix_sock: Option<&str>,
) -> Result<TlsFetchResult> {
@@ -549,7 +558,8 @@ async fn fetch_via_raw_tls(
sock = %sock_path,
"Raw TLS fetch using mask unix socket"
);
return fetch_via_raw_tls_stream(stream, sni, connect_timeout, proxy_protocol).await;
return fetch_via_raw_tls_stream(stream, sni, connect_timeout, proxy_protocol)
.await;
}
Ok(Err(e)) => {
warn!(
@@ -572,7 +582,7 @@ async fn fetch_via_raw_tls(
#[cfg(not(unix))]
let _ = unix_sock;
let stream = connect_tcp_with_upstream(host, port, connect_timeout, upstream).await?;
let stream = connect_tcp_with_upstream(host, port, connect_timeout, upstream, scope).await?;
fetch_via_raw_tls_stream(stream, sni, connect_timeout, proxy_protocol).await
}
@@ -616,12 +626,13 @@ where
.map(|slice| slice.to_vec())
.unwrap_or_default();
let cert_chain_der: Vec<Vec<u8>> = certs.iter().map(|c| c.as_ref().to_vec()).collect();
let cert_payload = encode_tls13_certificate_message(&cert_chain_der).map(|certificate_message| {
TlsCertPayload {
cert_chain_der: cert_chain_der.clone(),
certificate_message,
}
});
let cert_payload =
encode_tls13_certificate_message(&cert_chain_der).map(|certificate_message| {
TlsCertPayload {
cert_chain_der: cert_chain_der.clone(),
certificate_message,
}
});
let total_cert_len = cert_payload
.as_ref()
@@ -675,6 +686,7 @@ async fn fetch_via_rustls(
sni: &str,
connect_timeout: Duration,
upstream: Option<std::sync::Arc<crate::transport::UpstreamManager>>,
scope: Option<&str>,
proxy_protocol: u8,
unix_sock: Option<&str>,
) -> Result<TlsFetchResult> {
@@ -710,7 +722,7 @@ async fn fetch_via_rustls(
#[cfg(not(unix))]
let _ = unix_sock;
let stream = connect_tcp_with_upstream(host, port, connect_timeout, upstream).await?;
let stream = connect_tcp_with_upstream(host, port, connect_timeout, upstream, scope).await?;
fetch_via_rustls_stream(stream, host, sni, proxy_protocol).await
}
@@ -726,6 +738,7 @@ pub async fn fetch_real_tls(
sni: &str,
connect_timeout: Duration,
upstream: Option<std::sync::Arc<crate::transport::UpstreamManager>>,
scope: Option<&str>,
proxy_protocol: u8,
unix_sock: Option<&str>,
) -> Result<TlsFetchResult> {
@@ -735,6 +748,7 @@ pub async fn fetch_real_tls(
sni,
connect_timeout,
upstream.clone(),
scope,
proxy_protocol,
unix_sock,
)
@@ -753,6 +767,7 @@ pub async fn fetch_real_tls(
sni,
connect_timeout,
upstream,
scope,
proxy_protocol,
unix_sock,
)

View File

@@ -298,6 +298,7 @@ async fn run_update_cycle(
pool.update_runtime_reinit_policy(
cfg.general.hardswap,
cfg.general.me_pool_drain_ttl_secs,
cfg.general.me_instadrain,
cfg.general.me_pool_drain_threshold,
cfg.general.me_pool_drain_soft_evict_enabled,
cfg.general.me_pool_drain_soft_evict_grace_secs,
@@ -530,6 +531,7 @@ pub async fn me_config_updater(
pool.update_runtime_reinit_policy(
cfg.general.hardswap,
cfg.general.me_pool_drain_ttl_secs,
cfg.general.me_instadrain,
cfg.general.me_pool_drain_threshold,
cfg.general.me_pool_drain_soft_evict_enabled,
cfg.general.me_pool_drain_soft_evict_grace_secs,

View File

@@ -10,8 +10,10 @@ use tracing::{debug, info, warn};
use crate::config::MeFloorMode;
use crate::crypto::SecureRandom;
use crate::network::IpFamily;
use crate::stats::MeWriterTeardownReason;
use super::MePool;
use super::pool::{MeFamilyRuntimeState, MeWriter};
const JITTER_FRAC_NUM: u64 = 2; // jitter up to 50% of backoff
#[allow(dead_code)]
@@ -30,6 +32,35 @@ const HEALTH_DRAIN_CLOSE_BUDGET_MIN: usize = 16;
const HEALTH_DRAIN_CLOSE_BUDGET_MAX: usize = 256;
const HEALTH_DRAIN_SOFT_EVICT_BUDGET_MIN: usize = 8;
const HEALTH_DRAIN_SOFT_EVICT_BUDGET_MAX: usize = 256;
const HEALTH_DRAIN_REAP_OPPORTUNISTIC_INTERVAL_SECS: u64 = 1;
const HEALTH_DRAIN_TIMEOUT_ENFORCER_INTERVAL_SECS: u64 = 1;
const FAMILY_SUPPRESS_FAIL_STREAK_THRESHOLD: u32 = 6;
const FAMILY_SUPPRESS_WINDOW_SECS: u64 = 120;
const FAMILY_RECOVER_PROBE_INTERVAL_SECS: u64 = 5;
const FAMILY_RECOVER_SUCCESS_STREAK_REQUIRED: u32 = 3;
#[derive(Debug, Clone)]
struct FamilyCircuitState {
state: MeFamilyRuntimeState,
state_since_at: Instant,
suppressed_until: Option<Instant>,
next_probe_at: Instant,
fail_streak: u32,
recover_success_streak: u32,
}
impl FamilyCircuitState {
fn new(now: Instant) -> Self {
Self {
state: MeFamilyRuntimeState::Healthy,
state_since_at: now,
suppressed_until: None,
next_probe_at: now,
fail_streak: 0,
recover_success_streak: 0,
}
}
}
#[derive(Debug, Clone)]
struct DcFloorPlanEntry {
@@ -69,6 +100,25 @@ pub async fn me_health_monitor(pool: Arc<MePool>, rng: Arc<SecureRandom>, _min_c
let mut floor_warn_next_allowed: HashMap<(i32, IpFamily), Instant> = HashMap::new();
let mut drain_warn_next_allowed: HashMap<u64, Instant> = HashMap::new();
let mut drain_soft_evict_next_allowed: HashMap<u64, Instant> = HashMap::new();
let mut family_v4_circuit = FamilyCircuitState::new(Instant::now());
let mut family_v6_circuit = FamilyCircuitState::new(Instant::now());
let init_epoch_secs = MePool::now_epoch_secs();
pool.set_family_runtime_state(
IpFamily::V4,
family_v4_circuit.state,
init_epoch_secs,
0,
family_v4_circuit.fail_streak,
family_v4_circuit.recover_success_streak,
);
pool.set_family_runtime_state(
IpFamily::V6,
family_v6_circuit.state,
init_epoch_secs,
0,
family_v6_circuit.fail_streak,
family_v6_circuit.recover_success_streak,
);
let mut degraded_interval = true;
loop {
let interval = if degraded_interval {
@@ -84,7 +134,9 @@ pub async fn me_health_monitor(pool: Arc<MePool>, rng: Arc<SecureRandom>, _min_c
&mut drain_soft_evict_next_allowed,
)
.await;
let v4_degraded = check_family(
let now = Instant::now();
let now_epoch_secs = MePool::now_epoch_secs();
let v4_degraded_raw = check_family(
IpFamily::V4,
&pool,
&rng,
@@ -99,29 +151,252 @@ pub async fn me_health_monitor(pool: Arc<MePool>, rng: Arc<SecureRandom>, _min_c
&mut adaptive_idle_since,
&mut adaptive_recover_until,
&mut floor_warn_next_allowed,
&mut drain_warn_next_allowed,
&mut drain_soft_evict_next_allowed,
)
.await;
let v6_degraded = check_family(
IpFamily::V6,
let v4_degraded = apply_family_circuit_result(
&pool,
&rng,
&mut backoff,
&mut next_attempt,
&mut inflight,
&mut outage_backoff,
&mut outage_next_attempt,
&mut single_endpoint_outage,
&mut shadow_rotate_deadline,
&mut idle_refresh_next_attempt,
&mut adaptive_idle_since,
&mut adaptive_recover_until,
&mut floor_warn_next_allowed,
)
.await;
IpFamily::V4,
&mut family_v4_circuit,
Some(v4_degraded_raw),
false,
now,
now_epoch_secs,
);
let v6_check_ran = should_run_family_check(&mut family_v6_circuit, now);
let v6_degraded_raw = if v6_check_ran {
check_family(
IpFamily::V6,
&pool,
&rng,
&mut backoff,
&mut next_attempt,
&mut inflight,
&mut outage_backoff,
&mut outage_next_attempt,
&mut single_endpoint_outage,
&mut shadow_rotate_deadline,
&mut idle_refresh_next_attempt,
&mut adaptive_idle_since,
&mut adaptive_recover_until,
&mut floor_warn_next_allowed,
&mut drain_warn_next_allowed,
&mut drain_soft_evict_next_allowed,
)
.await
} else {
false
};
let v6_degraded = apply_family_circuit_result(
&pool,
IpFamily::V6,
&mut family_v6_circuit,
if v6_check_ran {
Some(v6_degraded_raw)
} else {
None
},
true,
now,
now_epoch_secs,
);
degraded_interval = v4_degraded || v6_degraded;
}
}
pub async fn me_drain_timeout_enforcer(pool: Arc<MePool>) {
let mut drain_warn_next_allowed: HashMap<u64, Instant> = HashMap::new();
let mut drain_soft_evict_next_allowed: HashMap<u64, Instant> = HashMap::new();
loop {
tokio::time::sleep(Duration::from_secs(
HEALTH_DRAIN_TIMEOUT_ENFORCER_INTERVAL_SECS,
))
.await;
reap_draining_writers(
&pool,
&mut drain_warn_next_allowed,
&mut drain_soft_evict_next_allowed,
)
.await;
}
}
fn should_run_family_check(circuit: &mut FamilyCircuitState, now: Instant) -> bool {
match circuit.state {
MeFamilyRuntimeState::Suppressed => {
if now < circuit.next_probe_at {
return false;
}
circuit.next_probe_at =
now + Duration::from_secs(FAMILY_RECOVER_PROBE_INTERVAL_SECS);
true
}
_ => true,
}
}
fn apply_family_circuit_result(
pool: &Arc<MePool>,
family: IpFamily,
circuit: &mut FamilyCircuitState,
degraded: Option<bool>,
allow_suppress: bool,
now: Instant,
now_epoch_secs: u64,
) -> bool {
let Some(degraded) = degraded else {
// Preserve suppression state when probe tick is intentionally skipped.
return false;
};
let previous_state = circuit.state;
match circuit.state {
MeFamilyRuntimeState::Suppressed => {
if degraded {
circuit.fail_streak = circuit.fail_streak.saturating_add(1);
circuit.recover_success_streak = 0;
let until = now + Duration::from_secs(FAMILY_SUPPRESS_WINDOW_SECS);
circuit.suppressed_until = Some(until);
circuit.state_since_at = now;
warn!(
?family,
fail_streak = circuit.fail_streak,
suppress_secs = FAMILY_SUPPRESS_WINDOW_SECS,
"ME family remains suppressed due to ongoing failures"
);
} else {
circuit.fail_streak = 0;
circuit.recover_success_streak = 1;
circuit.state = MeFamilyRuntimeState::Recovering;
}
}
MeFamilyRuntimeState::Recovering => {
if degraded {
circuit.fail_streak = circuit.fail_streak.saturating_add(1);
if allow_suppress {
circuit.state = MeFamilyRuntimeState::Suppressed;
let until = now + Duration::from_secs(FAMILY_SUPPRESS_WINDOW_SECS);
circuit.suppressed_until = Some(until);
circuit.next_probe_at =
now + Duration::from_secs(FAMILY_RECOVER_PROBE_INTERVAL_SECS);
warn!(
?family,
fail_streak = circuit.fail_streak,
suppress_secs = FAMILY_SUPPRESS_WINDOW_SECS,
"ME family temporarily suppressed after repeated degradation"
);
} else {
circuit.state = MeFamilyRuntimeState::Degraded;
}
} else {
circuit.recover_success_streak = circuit.recover_success_streak.saturating_add(1);
if circuit.recover_success_streak >= FAMILY_RECOVER_SUCCESS_STREAK_REQUIRED {
circuit.fail_streak = 0;
circuit.recover_success_streak = 0;
circuit.suppressed_until = None;
circuit.state = MeFamilyRuntimeState::Healthy;
info!(
?family,
"ME family suppression lifted after stable recovery probes"
);
}
}
}
_ => {
if degraded {
circuit.fail_streak = circuit.fail_streak.saturating_add(1);
circuit.recover_success_streak = 0;
circuit.state = MeFamilyRuntimeState::Degraded;
if allow_suppress && circuit.fail_streak >= FAMILY_SUPPRESS_FAIL_STREAK_THRESHOLD {
circuit.state = MeFamilyRuntimeState::Suppressed;
let until = now + Duration::from_secs(FAMILY_SUPPRESS_WINDOW_SECS);
circuit.suppressed_until = Some(until);
circuit.next_probe_at =
now + Duration::from_secs(FAMILY_RECOVER_PROBE_INTERVAL_SECS);
warn!(
?family,
fail_streak = circuit.fail_streak,
suppress_secs = FAMILY_SUPPRESS_WINDOW_SECS,
"ME family temporarily suppressed after repeated degradation"
);
}
} else {
circuit.fail_streak = 0;
circuit.recover_success_streak = 0;
circuit.suppressed_until = None;
circuit.state = MeFamilyRuntimeState::Healthy;
}
}
}
if previous_state != circuit.state {
circuit.state_since_at = now;
}
let suppressed_until_epoch_secs = circuit
.suppressed_until
.and_then(|until| {
if until > now {
Some(
now_epoch_secs
.saturating_add(until.saturating_duration_since(now).as_secs()),
)
} else {
None
}
})
.unwrap_or(0);
let state_since_epoch_secs = if previous_state == circuit.state {
pool.family_runtime_state_since_epoch_secs(family)
} else {
now_epoch_secs
};
pool.set_family_runtime_state(
family,
circuit.state,
state_since_epoch_secs,
suppressed_until_epoch_secs,
circuit.fail_streak,
circuit.recover_success_streak,
);
!matches!(circuit.state, MeFamilyRuntimeState::Suppressed) && degraded
}
fn draining_writer_timeout_expired(
pool: &MePool,
writer: &MeWriter,
now_epoch_secs: u64,
drain_ttl_secs: u64,
) -> bool {
if pool
.me_instadrain
.load(std::sync::atomic::Ordering::Relaxed)
{
return true;
}
let deadline_epoch_secs = writer
.drain_deadline_epoch_secs
.load(std::sync::atomic::Ordering::Relaxed);
if deadline_epoch_secs != 0 {
return now_epoch_secs >= deadline_epoch_secs;
}
if drain_ttl_secs == 0 {
return false;
}
let drain_started_at_epoch_secs = writer
.draining_started_at_epoch_secs
.load(std::sync::atomic::Ordering::Relaxed);
if drain_started_at_epoch_secs == 0 {
return false;
}
now_epoch_secs.saturating_sub(drain_started_at_epoch_secs) > drain_ttl_secs
}
pub(super) async fn reap_draining_writers(
pool: &Arc<MePool>,
warn_next_allowed: &mut HashMap<u64, Instant>,
@@ -137,11 +412,16 @@ pub(super) async fn reap_draining_writers(
let activity = pool.registry.writer_activity_snapshot().await;
let mut draining_writers = Vec::new();
let mut empty_writer_ids = Vec::<u64>::new();
let mut timeout_expired_writer_ids = Vec::<u64>::new();
let mut force_close_writer_ids = Vec::<u64>::new();
for writer in writers {
if !writer.draining.load(std::sync::atomic::Ordering::Relaxed) {
continue;
}
if draining_writer_timeout_expired(pool, &writer, now_epoch_secs, drain_ttl_secs) {
timeout_expired_writer_ids.push(writer.id);
continue;
}
if activity
.bound_clients_by_writer
.get(&writer.id)
@@ -207,14 +487,6 @@ pub(super) async fn reap_draining_writers(
"ME draining writer remains non-empty past drain TTL"
);
}
let deadline_epoch_secs = writer
.drain_deadline_epoch_secs
.load(std::sync::atomic::Ordering::Relaxed);
if deadline_epoch_secs != 0 && now_epoch_secs >= deadline_epoch_secs {
warn!(writer_id = writer.id, "Drain timeout, force-closing");
force_close_writer_ids.push(writer.id);
active_draining_writer_ids.remove(&writer.id);
}
}
warn_next_allowed.retain(|writer_id, _| active_draining_writer_ids.contains(writer_id));
@@ -299,11 +571,22 @@ pub(super) async fn reap_draining_writers(
}
}
let close_budget = health_drain_close_budget();
let mut closed_writer_ids = HashSet::<u64>::new();
for writer_id in timeout_expired_writer_ids {
if !closed_writer_ids.insert(writer_id) {
continue;
}
pool.stats.increment_pool_force_close_total();
pool.remove_writer_and_close_clients(writer_id, MeWriterTeardownReason::ReapTimeoutExpired)
.await;
pool.stats
.increment_me_draining_writers_reap_progress_total();
}
let requested_force_close = force_close_writer_ids.len();
let requested_empty_close = empty_writer_ids.len();
let requested_close_total = requested_force_close.saturating_add(requested_empty_close);
let mut closed_writer_ids = HashSet::<u64>::new();
let close_budget = health_drain_close_budget();
let mut closed_total = 0usize;
for writer_id in force_close_writer_ids {
if closed_total >= close_budget {
@@ -313,7 +596,8 @@ pub(super) async fn reap_draining_writers(
continue;
}
pool.stats.increment_pool_force_close_total();
pool.remove_writer_and_close_clients(writer_id).await;
pool.remove_writer_and_close_clients(writer_id, MeWriterTeardownReason::ReapThresholdForce)
.await;
pool.stats
.increment_me_draining_writers_reap_progress_total();
closed_total = closed_total.saturating_add(1);
@@ -325,7 +609,8 @@ pub(super) async fn reap_draining_writers(
if !closed_writer_ids.insert(writer_id) {
continue;
}
pool.remove_writer_and_close_clients(writer_id).await;
pool.remove_writer_and_close_clients(writer_id, MeWriterTeardownReason::ReapEmpty)
.await;
pool.stats
.increment_me_draining_writers_reap_progress_total();
closed_total = closed_total.saturating_add(1);
@@ -396,6 +681,8 @@ async fn check_family(
adaptive_idle_since: &mut HashMap<(i32, IpFamily), Instant>,
adaptive_recover_until: &mut HashMap<(i32, IpFamily), Instant>,
floor_warn_next_allowed: &mut HashMap<(i32, IpFamily), Instant>,
drain_warn_next_allowed: &mut HashMap<u64, Instant>,
drain_soft_evict_next_allowed: &mut HashMap<u64, Instant>,
) -> bool {
let enabled = match family {
IpFamily::V4 => pool.decision.ipv4_me,
@@ -476,8 +763,15 @@ async fn check_family(
floor_plan.active_writers_current,
floor_plan.warm_writers_current,
);
let mut next_drain_reap_at = Instant::now();
for (dc, endpoints) in dc_endpoints {
if Instant::now() >= next_drain_reap_at {
reap_draining_writers(pool, drain_warn_next_allowed, drain_soft_evict_next_allowed)
.await;
next_drain_reap_at = Instant::now()
+ Duration::from_secs(HEALTH_DRAIN_REAP_OPPORTUNISTIC_INTERVAL_SECS);
}
if endpoints.is_empty() {
continue;
}
@@ -621,6 +915,12 @@ async fn check_family(
let mut restored = 0usize;
for _ in 0..missing {
if Instant::now() >= next_drain_reap_at {
reap_draining_writers(pool, drain_warn_next_allowed, drain_soft_evict_next_allowed)
.await;
next_drain_reap_at = Instant::now()
+ Duration::from_secs(HEALTH_DRAIN_REAP_OPPORTUNISTIC_INTERVAL_SECS);
}
if reconnect_budget == 0 {
break;
}
@@ -1472,6 +1772,187 @@ async fn maybe_rotate_single_endpoint_shadow(
);
}
/// Last-resort safety net for draining writers stuck past their deadline.
///
/// Runs every `TICK_SECS` and force-closes any draining writer whose
/// `drain_deadline_epoch_secs` has been exceeded by more than a threshold.
///
/// Two thresholds:
/// - `SOFT_THRESHOLD_SECS` (60s): writers with no bound clients
/// - `HARD_THRESHOLD_SECS` (300s): writers WITH bound clients (unconditional)
///
/// Intentionally kept trivial and independent of pool config to minimise
/// the probability of panicking itself. Uses `SystemTime` directly
/// as a fallback clock source and timeouts on every lock acquisition
/// and writer removal so one stuck writer cannot block the rest.
pub async fn me_zombie_writer_watchdog(pool: Arc<MePool>) {
use std::time::{SystemTime, UNIX_EPOCH};
const TICK_SECS: u64 = 30;
const SOFT_THRESHOLD_SECS: u64 = 60;
const HARD_THRESHOLD_SECS: u64 = 300;
const LOCK_TIMEOUT_SECS: u64 = 5;
const REMOVE_TIMEOUT_SECS: u64 = 10;
const HARD_DETACH_TIMEOUT_STREAK: u8 = 3;
let mut removal_timeout_streak = HashMap::<u64, u8>::new();
loop {
tokio::time::sleep(Duration::from_secs(TICK_SECS)).await;
let now = match SystemTime::now().duration_since(UNIX_EPOCH) {
Ok(d) => d.as_secs(),
Err(_) => continue,
};
// Phase 1: collect zombie IDs under a short read-lock with timeout.
let zombie_ids_with_meta: Vec<(u64, bool)> = {
let Ok(ws) = tokio::time::timeout(
Duration::from_secs(LOCK_TIMEOUT_SECS),
pool.writers.read(),
)
.await
else {
warn!("zombie_watchdog: writers read-lock timeout, skipping tick");
continue;
};
ws.iter()
.filter(|w| w.draining.load(std::sync::atomic::Ordering::Relaxed))
.filter_map(|w| {
let deadline = w
.drain_deadline_epoch_secs
.load(std::sync::atomic::Ordering::Relaxed);
if deadline == 0 {
return None;
}
let overdue = now.saturating_sub(deadline);
if overdue == 0 {
return None;
}
let started = w
.draining_started_at_epoch_secs
.load(std::sync::atomic::Ordering::Relaxed);
let drain_age = now.saturating_sub(started);
if drain_age > HARD_THRESHOLD_SECS {
return Some((w.id, true));
}
if overdue > SOFT_THRESHOLD_SECS {
return Some((w.id, false));
}
None
})
.collect()
};
// read lock released here
if zombie_ids_with_meta.is_empty() {
removal_timeout_streak.clear();
continue;
}
let mut active_zombie_ids = HashSet::<u64>::with_capacity(zombie_ids_with_meta.len());
for (writer_id, _) in &zombie_ids_with_meta {
active_zombie_ids.insert(*writer_id);
}
removal_timeout_streak.retain(|writer_id, _| active_zombie_ids.contains(writer_id));
warn!(
zombie_count = zombie_ids_with_meta.len(),
soft_threshold_secs = SOFT_THRESHOLD_SECS,
hard_threshold_secs = HARD_THRESHOLD_SECS,
"Zombie draining writers detected by watchdog, force-closing"
);
// Phase 2: remove each writer individually with a timeout.
// One stuck removal cannot block the rest.
for (writer_id, had_clients) in &zombie_ids_with_meta {
let result = tokio::time::timeout(
Duration::from_secs(REMOVE_TIMEOUT_SECS),
pool.remove_writer_and_close_clients(
*writer_id,
MeWriterTeardownReason::WatchdogStuckDraining,
),
)
.await;
match result {
Ok(true) => {
removal_timeout_streak.remove(writer_id);
pool.stats.increment_pool_force_close_total();
pool.stats
.increment_me_draining_writers_reap_progress_total();
info!(
writer_id,
had_clients,
"Zombie writer removed by watchdog"
);
}
Ok(false) => {
removal_timeout_streak.remove(writer_id);
debug!(
writer_id,
had_clients,
"Zombie writer watchdog removal became no-op"
);
}
Err(_) => {
pool.stats.increment_me_writer_teardown_timeout_total();
let streak = removal_timeout_streak
.entry(*writer_id)
.and_modify(|value| *value = value.saturating_add(1))
.or_insert(1);
warn!(
writer_id,
had_clients,
timeout_streak = *streak,
"Zombie writer removal timed out"
);
if *streak < HARD_DETACH_TIMEOUT_STREAK {
continue;
}
pool.stats.increment_me_writer_teardown_escalation_total();
let hard_detach = tokio::time::timeout(
Duration::from_secs(REMOVE_TIMEOUT_SECS),
pool.remove_draining_writer_hard_detach(
*writer_id,
MeWriterTeardownReason::WatchdogStuckDraining,
),
)
.await;
match hard_detach {
Ok(true) => {
removal_timeout_streak.remove(writer_id);
pool.stats.increment_pool_force_close_total();
pool.stats
.increment_me_draining_writers_reap_progress_total();
info!(
writer_id,
had_clients,
"Zombie writer hard-detached after repeated timeouts"
);
}
Ok(false) => {
removal_timeout_streak.remove(writer_id);
debug!(
writer_id,
had_clients,
"Zombie hard-detach skipped (writer already gone or no longer draining)"
);
}
Err(_) => {
pool.stats.increment_me_writer_teardown_timeout_total();
warn!(
writer_id,
had_clients,
"Zombie hard-detach timed out, will retry next tick"
);
}
}
}
}
}
}
}
#[cfg(test)]
mod tests {
use std::collections::HashMap;
@@ -1483,13 +1964,19 @@ mod tests {
use tokio::sync::mpsc;
use tokio_util::sync::CancellationToken;
use super::reap_draining_writers;
use super::{
FamilyCircuitState, apply_family_circuit_result, reap_draining_writers,
should_run_family_check,
};
use crate::config::{GeneralConfig, MeRouteNoWriterMode, MeSocksKdfPolicy, MeWriterPickMode};
use crate::crypto::SecureRandom;
use crate::network::IpFamily;
use crate::network::probe::NetworkDecision;
use crate::stats::Stats;
use crate::transport::middle_proxy::codec::WriterCommand;
use crate::transport::middle_proxy::pool::{MePool, MeWriter, WriterContour};
use crate::transport::middle_proxy::pool::{
MeFamilyRuntimeState, MePool, MeWriter, WriterContour,
};
use crate::transport::middle_proxy::registry::ConnMeta;
async fn make_pool(me_pool_drain_threshold: u64) -> Arc<MePool> {
@@ -1548,6 +2035,7 @@ mod tests {
general.me_adaptive_floor_max_warm_writers_global,
general.hardswap,
general.me_pool_drain_ttl_secs,
general.me_instadrain,
general.me_pool_drain_threshold,
general.me_pool_drain_soft_evict_enabled,
general.me_pool_drain_soft_evict_grace_secs,
@@ -1666,4 +2154,47 @@ mod tests {
assert_eq!(pool.registry.get_writer(conn_b).await.unwrap().writer_id, 20);
assert_eq!(pool.registry.get_writer(conn_c).await.unwrap().writer_id, 30);
}
#[tokio::test]
async fn suppressed_family_probe_skip_preserves_suppressed_state() {
let pool = make_pool(0).await;
let now = Instant::now();
let now_epoch_secs = MePool::now_epoch_secs();
let suppressed_until_epoch_secs = now_epoch_secs.saturating_add(60);
pool.set_family_runtime_state(
IpFamily::V6,
MeFamilyRuntimeState::Suppressed,
now_epoch_secs,
suppressed_until_epoch_secs,
7,
0,
);
let mut circuit = FamilyCircuitState {
state: MeFamilyRuntimeState::Suppressed,
state_since_at: now,
suppressed_until: Some(now + Duration::from_secs(60)),
next_probe_at: now + Duration::from_secs(5),
fail_streak: 7,
recover_success_streak: 0,
};
assert!(!should_run_family_check(&mut circuit, now));
assert!(!apply_family_circuit_result(
&pool,
IpFamily::V6,
&mut circuit,
None,
true,
now,
now_epoch_secs,
));
assert_eq!(circuit.state, MeFamilyRuntimeState::Suppressed);
assert_eq!(circuit.fail_streak, 7);
assert_eq!(circuit.recover_success_streak, 0);
assert_eq!(
pool.family_runtime_state(IpFamily::V6),
MeFamilyRuntimeState::Suppressed,
);
}
}

View File

@@ -81,6 +81,7 @@ async fn make_pool(
general.me_adaptive_floor_max_warm_writers_global,
general.hardswap,
general.me_pool_drain_ttl_secs,
general.me_instadrain,
general.me_pool_drain_threshold,
general.me_pool_drain_soft_evict_enabled,
general.me_pool_drain_soft_evict_grace_secs,
@@ -213,7 +214,7 @@ async fn reap_draining_writers_respects_threshold_across_multiple_overflow_cycle
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(600).saturating_add(writer_id),
now_epoch_secs.saturating_sub(20),
1,
0,
)
@@ -230,7 +231,7 @@ async fn reap_draining_writers_respects_threshold_across_multiple_overflow_cycle
}
assert_eq!(writer_count(&pool).await, threshold as usize);
assert_eq!(sorted_writer_ids(&pool).await, vec![58, 59, 60]);
assert_eq!(sorted_writer_ids(&pool).await, vec![1, 2, 3]);
}
#[tokio::test]
@@ -315,7 +316,12 @@ async fn reap_draining_writers_maintains_warn_state_subset_property_under_bulk_c
let ids = sorted_writer_ids(&pool).await;
for writer_id in ids.into_iter().take(3) {
let _ = pool.remove_writer_and_close_clients(writer_id).await;
let _ = pool
.remove_writer_and_close_clients(
writer_id,
crate::stats::MeWriterTeardownReason::ReapEmpty,
)
.await;
}
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;

View File

@@ -80,6 +80,7 @@ async fn make_pool(
general.me_adaptive_floor_max_warm_writers_global,
general.hardswap,
general.me_pool_drain_ttl_secs,
general.me_instadrain,
general.me_pool_drain_threshold,
general.me_pool_drain_soft_evict_enabled,
general.me_pool_drain_soft_evict_grace_secs,

View File

@@ -12,7 +12,9 @@ use super::codec::WriterCommand;
use super::health::{health_drain_close_budget, reap_draining_writers};
use super::pool::{MePool, MeWriter, WriterContour};
use super::registry::ConnMeta;
use crate::config::{GeneralConfig, MeRouteNoWriterMode, MeSocksKdfPolicy, MeWriterPickMode};
use crate::config::{
GeneralConfig, MeBindStaleMode, MeRouteNoWriterMode, MeSocksKdfPolicy, MeWriterPickMode,
};
use crate::crypto::SecureRandom;
use crate::network::probe::NetworkDecision;
use crate::stats::Stats;
@@ -74,6 +76,7 @@ async fn make_pool(me_pool_drain_threshold: u64) -> Arc<MePool> {
general.me_adaptive_floor_max_warm_writers_global,
general.hardswap,
general.me_pool_drain_ttl_secs,
general.me_instadrain,
general.me_pool_drain_threshold,
general.me_pool_drain_soft_evict_enabled,
general.me_pool_drain_soft_evict_grace_secs,
@@ -180,15 +183,23 @@ async fn current_writer_ids(pool: &Arc<MePool>) -> Vec<u64> {
async fn reap_draining_writers_drops_warn_state_for_removed_writer() {
let pool = make_pool(128).await;
let now_epoch_secs = MePool::now_epoch_secs();
let conn_ids =
insert_draining_writer(&pool, 7, now_epoch_secs.saturating_sub(180), 1, 0).await;
let conn_ids = insert_draining_writer(
&pool,
7,
now_epoch_secs.saturating_sub(180),
1,
now_epoch_secs.saturating_add(3_600),
)
.await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert!(warn_next_allowed.contains_key(&7));
let _ = pool.remove_writer_and_close_clients(7).await;
let _ = pool
.remove_writer_and_close_clients(7, crate::stats::MeWriterTeardownReason::ReapEmpty)
.await;
assert!(pool.registry.get_writer(conn_ids[0]).await.is_none());
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
@@ -331,17 +342,17 @@ async fn reap_draining_writers_deadline_force_close_applies_under_threshold() {
#[tokio::test]
async fn reap_draining_writers_limits_closes_per_health_tick() {
let pool = make_pool(128).await;
let pool = make_pool(1).await;
let now_epoch_secs = MePool::now_epoch_secs();
let close_budget = health_drain_close_budget();
let writer_total = close_budget.saturating_add(19);
let writer_total = close_budget.saturating_add(20);
for writer_id in 1..=writer_total as u64 {
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(20),
1,
now_epoch_secs.saturating_sub(1),
0,
)
.await;
}
@@ -364,8 +375,8 @@ async fn reap_draining_writers_backlog_drains_across_ticks() {
&pool,
writer_id,
now_epoch_secs.saturating_sub(20),
1,
now_epoch_secs.saturating_sub(1),
0,
0,
)
.await;
}
@@ -393,7 +404,7 @@ async fn reap_draining_writers_threshold_backlog_converges_to_threshold() {
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(200).saturating_add(writer_id),
now_epoch_secs.saturating_sub(20),
1,
0,
)
@@ -429,27 +440,27 @@ async fn reap_draining_writers_threshold_zero_preserves_non_expired_non_empty_wr
#[tokio::test]
async fn reap_draining_writers_prioritizes_force_close_before_empty_cleanup() {
let pool = make_pool(128).await;
let pool = make_pool(1).await;
let now_epoch_secs = MePool::now_epoch_secs();
let close_budget = health_drain_close_budget();
for writer_id in 1..=close_budget as u64 {
for writer_id in 1..=close_budget.saturating_add(1) as u64 {
insert_draining_writer(
&pool,
writer_id,
now_epoch_secs.saturating_sub(20),
1,
now_epoch_secs.saturating_sub(1),
0,
)
.await;
}
let empty_writer_id = close_budget as u64 + 1;
let empty_writer_id = close_budget.saturating_add(2) as u64;
insert_draining_writer(&pool, empty_writer_id, now_epoch_secs.saturating_sub(20), 0, 0).await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert_eq!(current_writer_ids(&pool).await, vec![empty_writer_id]);
assert_eq!(current_writer_ids(&pool).await, vec![1, empty_writer_id]);
}
#[tokio::test]
@@ -518,7 +529,12 @@ async fn reap_draining_writers_warn_state_never_exceeds_live_draining_population
let existing_writer_ids = current_writer_ids(&pool).await;
for writer_id in existing_writer_ids.into_iter().take(4) {
let _ = pool.remove_writer_and_close_clients(writer_id).await;
let _ = pool
.remove_writer_and_close_clients(
writer_id,
crate::stats::MeWriterTeardownReason::ReapEmpty,
)
.await;
}
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert!(warn_next_allowed.len() <= pool.writers.read().await.len());
@@ -571,7 +587,14 @@ async fn reap_draining_writers_soft_evicts_stuck_writer_with_per_writer_cap() {
.store(1, Ordering::Relaxed);
let now_epoch_secs = MePool::now_epoch_secs();
insert_draining_writer(&pool, 77, now_epoch_secs.saturating_sub(240), 3, 0).await;
insert_draining_writer(
&pool,
77,
now_epoch_secs.saturating_sub(240),
3,
now_epoch_secs.saturating_add(3_600),
)
.await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
@@ -595,7 +618,14 @@ async fn reap_draining_writers_soft_evict_respects_cooldown_per_writer() {
.store(60_000, Ordering::Relaxed);
let now_epoch_secs = MePool::now_epoch_secs();
insert_draining_writer(&pool, 88, now_epoch_secs.saturating_sub(240), 3, 0).await;
insert_draining_writer(
&pool,
88,
now_epoch_secs.saturating_sub(240),
3,
now_epoch_secs.saturating_add(3_600),
)
.await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
@@ -608,12 +638,40 @@ async fn reap_draining_writers_soft_evict_respects_cooldown_per_writer() {
assert_eq!(pool.stats.get_pool_drain_soft_evict_writer_total(), 1);
}
#[tokio::test]
async fn reap_draining_writers_instadrain_removes_non_expired_writers_immediately() {
let pool = make_pool(0).await;
pool.me_instadrain.store(true, Ordering::Relaxed);
let now_epoch_secs = MePool::now_epoch_secs();
insert_draining_writer(&pool, 101, now_epoch_secs.saturating_sub(5), 1, 0).await;
insert_draining_writer(&pool, 102, now_epoch_secs.saturating_sub(4), 1, 0).await;
let mut warn_next_allowed = HashMap::new();
let mut soft_evict_next_allowed = HashMap::new();
reap_draining_writers(&pool, &mut warn_next_allowed, &mut soft_evict_next_allowed).await;
assert!(current_writer_ids(&pool).await.is_empty());
}
#[test]
fn general_config_default_drain_threshold_remains_enabled() {
assert_eq!(GeneralConfig::default().me_pool_drain_threshold, 128);
assert_eq!(GeneralConfig::default().me_pool_drain_threshold, 32);
assert!(GeneralConfig::default().me_pool_drain_soft_evict_enabled);
assert_eq!(
GeneralConfig::default().me_pool_drain_soft_evict_per_writer,
1
GeneralConfig::default().me_pool_drain_soft_evict_grace_secs,
10
);
assert_eq!(
GeneralConfig::default().me_pool_drain_soft_evict_per_writer,
2
);
assert_eq!(
GeneralConfig::default().me_pool_drain_soft_evict_budget_per_core,
16
);
assert_eq!(
GeneralConfig::default().me_pool_drain_soft_evict_cooldown_ms,
1000
);
assert_eq!(GeneralConfig::default().me_bind_stale_mode, MeBindStaleMode::Never);
}

View File

@@ -30,7 +30,7 @@ mod health_adversarial_tests;
use bytes::Bytes;
pub use health::me_health_monitor;
pub use health::{me_drain_timeout_enforcer, me_health_monitor, me_zombie_writer_watchdog};
#[allow(unused_imports)]
pub use ping::{run_me_ping, format_sample_line, format_me_route, MePingReport, MePingSample, MePingFamily};
pub use pool::MePool;

View File

@@ -7,6 +7,7 @@ use tokio::net::UdpSocket;
use crate::config::{UpstreamConfig, UpstreamType};
use crate::crypto::SecureRandom;
use crate::error::ProxyError;
use crate::transport::shadowsocks::sanitize_shadowsocks_url;
use crate::transport::{UpstreamEgressInfo, UpstreamRouteKind};
use super::MePool;
@@ -40,7 +41,11 @@ pub fn format_sample_line(sample: &MePingSample) -> String {
let sign = if sample.dc >= 0 { "+" } else { "-" };
let addr = format!("{}:{}", sample.addr.ip(), sample.addr.port());
match (sample.connect_ms, sample.handshake_ms.as_ref(), sample.error.as_ref()) {
match (
sample.connect_ms,
sample.handshake_ms.as_ref(),
sample.error.as_ref(),
) {
(Some(conn), Some(hs), None) => format!(
" {sign} {addr}\tPing: {:.0} ms / RPC: {:.0} ms / OK",
conn, hs
@@ -121,6 +126,7 @@ fn route_from_egress(egress: Option<UpstreamEgressInfo>) -> Option<String> {
None => route,
})
}
UpstreamRouteKind::Shadowsocks => Some("shadowsocks".to_string()),
}
}
@@ -232,6 +238,9 @@ pub async fn format_me_route(
}
UpstreamType::Socks4 { address, .. } => format!("socks4://{address}"),
UpstreamType::Socks5 { address, .. } => format!("socks5://{address}"),
UpstreamType::Shadowsocks { url, .. } => sanitize_shadowsocks_url(url)
.map(|address| format!("shadowsocks://{address}"))
.unwrap_or_else(|_| "shadowsocks://invalid".to_string()),
};
}
@@ -254,6 +263,12 @@ pub async fn format_me_route(
if has_socks5 {
kinds.push("socks5");
}
if enabled_upstreams
.iter()
.any(|u| matches!(u.upstream_type, UpstreamType::Shadowsocks { .. }))
{
kinds.push("shadowsocks");
}
format!("mixed upstreams ({})", kinds.join(", "))
}
@@ -335,7 +350,10 @@ pub async fn run_me_ping(pool: &Arc<MePool>, rng: &SecureRandom) -> Vec<MePingRe
Ok((stream, conn_rtt, upstream_egress)) => {
connect_ms = Some(conn_rtt);
route = route_from_egress(upstream_egress);
match pool.handshake_only(stream, addr, upstream_egress, rng).await {
match pool
.handshake_only(stream, addr, upstream_egress, rng)
.await
{
Ok(hs) => {
handshake_ms = Some(hs.handshake_ms);
// drop halves to close

View File

@@ -18,6 +18,8 @@ use crate::transport::UpstreamManager;
use super::ConnRegistry;
use super::codec::WriterCommand;
const ME_FORCE_CLOSE_SAFETY_FALLBACK_SECS: u64 = 300;
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub(super) struct RefillDcKey {
pub dc: i32,
@@ -72,6 +74,64 @@ impl WriterContour {
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub(crate) enum MeFamilyRuntimeState {
Healthy = 0,
Degraded = 1,
Suppressed = 2,
Recovering = 3,
}
impl MeFamilyRuntimeState {
pub(crate) fn from_u8(value: u8) -> Self {
match value {
1 => Self::Degraded,
2 => Self::Suppressed,
3 => Self::Recovering,
_ => Self::Healthy,
}
}
pub(crate) fn as_str(self) -> &'static str {
match self {
Self::Healthy => "healthy",
Self::Degraded => "degraded",
Self::Suppressed => "suppressed",
Self::Recovering => "recovering",
}
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub(crate) enum MeDrainGateReason {
Open = 0,
CoverageQuorum = 1,
Redundancy = 2,
SuppressionActive = 3,
}
impl MeDrainGateReason {
pub(crate) fn from_u8(value: u8) -> Self {
match value {
1 => Self::CoverageQuorum,
2 => Self::Redundancy,
3 => Self::SuppressionActive,
_ => Self::Open,
}
}
pub(crate) fn as_str(self) -> &'static str {
match self {
Self::Open => "open",
Self::CoverageQuorum => "coverage_quorum",
Self::Redundancy => "redundancy",
Self::SuppressionActive => "suppression_active",
}
}
}
#[derive(Debug, Clone)]
pub struct SecretSnapshot {
pub epoch: u64,
@@ -171,6 +231,7 @@ pub struct MePool {
pub(super) endpoint_quarantine: Arc<Mutex<HashMap<SocketAddr, Instant>>>,
pub(super) kdf_material_fingerprint: Arc<RwLock<HashMap<SocketAddr, (u64, u16)>>>,
pub(super) me_pool_drain_ttl_secs: AtomicU64,
pub(super) me_instadrain: AtomicBool,
pub(super) me_pool_drain_threshold: AtomicU64,
pub(super) me_pool_drain_soft_evict_enabled: AtomicBool,
pub(super) me_pool_drain_soft_evict_grace_secs: AtomicU64,
@@ -200,6 +261,20 @@ pub struct MePool {
pub(super) me_health_interval_ms_unhealthy: AtomicU64,
pub(super) me_health_interval_ms_healthy: AtomicU64,
pub(super) me_warn_rate_limit_ms: AtomicU64,
pub(super) me_family_v4_runtime_state: AtomicU8,
pub(super) me_family_v6_runtime_state: AtomicU8,
pub(super) me_family_v4_state_since_epoch_secs: AtomicU64,
pub(super) me_family_v6_state_since_epoch_secs: AtomicU64,
pub(super) me_family_v4_suppressed_until_epoch_secs: AtomicU64,
pub(super) me_family_v6_suppressed_until_epoch_secs: AtomicU64,
pub(super) me_family_v4_fail_streak: AtomicU32,
pub(super) me_family_v6_fail_streak: AtomicU32,
pub(super) me_family_v4_recover_success_streak: AtomicU32,
pub(super) me_family_v6_recover_success_streak: AtomicU32,
pub(super) me_last_drain_gate_route_quorum_ok: AtomicBool,
pub(super) me_last_drain_gate_redundancy_ok: AtomicBool,
pub(super) me_last_drain_gate_block_reason: AtomicU8,
pub(super) me_last_drain_gate_updated_at_epoch_secs: AtomicU64,
pub(super) runtime_ready: AtomicBool,
pool_size: usize,
pub(super) preferred_endpoints_by_dc: Arc<RwLock<HashMap<i32, Vec<SocketAddr>>>>,
@@ -228,6 +303,14 @@ impl MePool {
.as_secs()
}
fn normalize_force_close_secs(force_close_secs: u64) -> u64 {
if force_close_secs == 0 {
ME_FORCE_CLOSE_SAFETY_FALLBACK_SECS
} else {
force_close_secs
}
}
pub fn new(
proxy_tag: Option<Vec<u8>>,
proxy_secret: Vec<u8>,
@@ -279,6 +362,7 @@ impl MePool {
me_adaptive_floor_max_warm_writers_global: u32,
hardswap: bool,
me_pool_drain_ttl_secs: u64,
me_instadrain: bool,
me_pool_drain_threshold: u64,
me_pool_drain_soft_evict_enabled: bool,
me_pool_drain_soft_evict_grace_secs: u64,
@@ -462,6 +546,7 @@ impl MePool {
endpoint_quarantine: Arc::new(Mutex::new(HashMap::new())),
kdf_material_fingerprint: Arc::new(RwLock::new(HashMap::new())),
me_pool_drain_ttl_secs: AtomicU64::new(me_pool_drain_ttl_secs),
me_instadrain: AtomicBool::new(me_instadrain),
me_pool_drain_threshold: AtomicU64::new(me_pool_drain_threshold),
me_pool_drain_soft_evict_enabled: AtomicBool::new(me_pool_drain_soft_evict_enabled),
me_pool_drain_soft_evict_grace_secs: AtomicU64::new(me_pool_drain_soft_evict_grace_secs),
@@ -474,7 +559,9 @@ impl MePool {
me_pool_drain_soft_evict_cooldown_ms: AtomicU64::new(
me_pool_drain_soft_evict_cooldown_ms.max(1),
),
me_pool_force_close_secs: AtomicU64::new(me_pool_force_close_secs),
me_pool_force_close_secs: AtomicU64::new(Self::normalize_force_close_secs(
me_pool_force_close_secs,
)),
me_pool_min_fresh_ratio_permille: AtomicU32::new(Self::ratio_to_permille(
me_pool_min_fresh_ratio,
)),
@@ -503,6 +590,20 @@ impl MePool {
me_health_interval_ms_unhealthy: AtomicU64::new(me_health_interval_ms_unhealthy.max(1)),
me_health_interval_ms_healthy: AtomicU64::new(me_health_interval_ms_healthy.max(1)),
me_warn_rate_limit_ms: AtomicU64::new(me_warn_rate_limit_ms.max(1)),
me_family_v4_runtime_state: AtomicU8::new(MeFamilyRuntimeState::Healthy as u8),
me_family_v6_runtime_state: AtomicU8::new(MeFamilyRuntimeState::Healthy as u8),
me_family_v4_state_since_epoch_secs: AtomicU64::new(Self::now_epoch_secs()),
me_family_v6_state_since_epoch_secs: AtomicU64::new(Self::now_epoch_secs()),
me_family_v4_suppressed_until_epoch_secs: AtomicU64::new(0),
me_family_v6_suppressed_until_epoch_secs: AtomicU64::new(0),
me_family_v4_fail_streak: AtomicU32::new(0),
me_family_v6_fail_streak: AtomicU32::new(0),
me_family_v4_recover_success_streak: AtomicU32::new(0),
me_family_v6_recover_success_streak: AtomicU32::new(0),
me_last_drain_gate_route_quorum_ok: AtomicBool::new(false),
me_last_drain_gate_redundancy_ok: AtomicBool::new(false),
me_last_drain_gate_block_reason: AtomicU8::new(MeDrainGateReason::Open as u8),
me_last_drain_gate_updated_at_epoch_secs: AtomicU64::new(Self::now_epoch_secs()),
runtime_ready: AtomicBool::new(false),
preferred_endpoints_by_dc: Arc::new(RwLock::new(preferred_endpoints_by_dc)),
})
@@ -520,10 +621,158 @@ impl MePool {
self.runtime_ready.load(Ordering::Relaxed)
}
pub(super) fn set_family_runtime_state(
&self,
family: IpFamily,
state: MeFamilyRuntimeState,
state_since_epoch_secs: u64,
suppressed_until_epoch_secs: u64,
fail_streak: u32,
recover_success_streak: u32,
) {
match family {
IpFamily::V4 => {
self.me_family_v4_runtime_state
.store(state as u8, Ordering::Relaxed);
self.me_family_v4_state_since_epoch_secs
.store(state_since_epoch_secs, Ordering::Relaxed);
self.me_family_v4_suppressed_until_epoch_secs
.store(suppressed_until_epoch_secs, Ordering::Relaxed);
self.me_family_v4_fail_streak
.store(fail_streak, Ordering::Relaxed);
self.me_family_v4_recover_success_streak
.store(recover_success_streak, Ordering::Relaxed);
}
IpFamily::V6 => {
self.me_family_v6_runtime_state
.store(state as u8, Ordering::Relaxed);
self.me_family_v6_state_since_epoch_secs
.store(state_since_epoch_secs, Ordering::Relaxed);
self.me_family_v6_suppressed_until_epoch_secs
.store(suppressed_until_epoch_secs, Ordering::Relaxed);
self.me_family_v6_fail_streak
.store(fail_streak, Ordering::Relaxed);
self.me_family_v6_recover_success_streak
.store(recover_success_streak, Ordering::Relaxed);
}
}
}
pub(crate) fn family_runtime_state(&self, family: IpFamily) -> MeFamilyRuntimeState {
match family {
IpFamily::V4 => MeFamilyRuntimeState::from_u8(
self.me_family_v4_runtime_state.load(Ordering::Relaxed),
),
IpFamily::V6 => MeFamilyRuntimeState::from_u8(
self.me_family_v6_runtime_state.load(Ordering::Relaxed),
),
}
}
pub(crate) fn family_runtime_state_since_epoch_secs(&self, family: IpFamily) -> u64 {
match family {
IpFamily::V4 => self
.me_family_v4_state_since_epoch_secs
.load(Ordering::Relaxed),
IpFamily::V6 => self
.me_family_v6_state_since_epoch_secs
.load(Ordering::Relaxed),
}
}
pub(crate) fn family_suppressed_until_epoch_secs(&self, family: IpFamily) -> u64 {
match family {
IpFamily::V4 => self
.me_family_v4_suppressed_until_epoch_secs
.load(Ordering::Relaxed),
IpFamily::V6 => self
.me_family_v6_suppressed_until_epoch_secs
.load(Ordering::Relaxed),
}
}
pub(crate) fn family_fail_streak(&self, family: IpFamily) -> u32 {
match family {
IpFamily::V4 => self.me_family_v4_fail_streak.load(Ordering::Relaxed),
IpFamily::V6 => self.me_family_v6_fail_streak.load(Ordering::Relaxed),
}
}
pub(crate) fn family_recover_success_streak(&self, family: IpFamily) -> u32 {
match family {
IpFamily::V4 => self
.me_family_v4_recover_success_streak
.load(Ordering::Relaxed),
IpFamily::V6 => self
.me_family_v6_recover_success_streak
.load(Ordering::Relaxed),
}
}
pub(crate) fn is_family_temporarily_suppressed(
&self,
family: IpFamily,
now_epoch_secs: u64,
) -> bool {
self.family_suppressed_until_epoch_secs(family) > now_epoch_secs
}
pub(super) fn family_enabled_for_drain_coverage(
&self,
family: IpFamily,
now_epoch_secs: u64,
) -> bool {
let configured = match family {
IpFamily::V4 => self.decision.ipv4_me,
IpFamily::V6 => self.decision.ipv6_me,
};
configured && !self.is_family_temporarily_suppressed(family, now_epoch_secs)
}
pub(super) fn set_last_drain_gate(
&self,
route_quorum_ok: bool,
redundancy_ok: bool,
block_reason: MeDrainGateReason,
updated_at_epoch_secs: u64,
) {
self.me_last_drain_gate_route_quorum_ok
.store(route_quorum_ok, Ordering::Relaxed);
self.me_last_drain_gate_redundancy_ok
.store(redundancy_ok, Ordering::Relaxed);
self.me_last_drain_gate_block_reason
.store(block_reason as u8, Ordering::Relaxed);
self.me_last_drain_gate_updated_at_epoch_secs
.store(updated_at_epoch_secs, Ordering::Relaxed);
}
pub(crate) fn last_drain_gate_route_quorum_ok(&self) -> bool {
self.me_last_drain_gate_route_quorum_ok
.load(Ordering::Relaxed)
}
pub(crate) fn last_drain_gate_redundancy_ok(&self) -> bool {
self.me_last_drain_gate_redundancy_ok
.load(Ordering::Relaxed)
}
pub(crate) fn last_drain_gate_block_reason(&self) -> MeDrainGateReason {
MeDrainGateReason::from_u8(
self.me_last_drain_gate_block_reason
.load(Ordering::Relaxed),
)
}
pub(crate) fn last_drain_gate_updated_at_epoch_secs(&self) -> u64 {
self.me_last_drain_gate_updated_at_epoch_secs
.load(Ordering::Relaxed)
}
pub fn update_runtime_reinit_policy(
&self,
hardswap: bool,
drain_ttl_secs: u64,
instadrain: bool,
pool_drain_threshold: u64,
pool_drain_soft_evict_enabled: bool,
pool_drain_soft_evict_grace_secs: u64,
@@ -568,6 +817,7 @@ impl MePool {
self.hardswap.store(hardswap, Ordering::Relaxed);
self.me_pool_drain_ttl_secs
.store(drain_ttl_secs, Ordering::Relaxed);
self.me_instadrain.store(instadrain, Ordering::Relaxed);
self.me_pool_drain_threshold
.store(pool_drain_threshold, Ordering::Relaxed);
self.me_pool_drain_soft_evict_enabled
@@ -582,8 +832,10 @@ impl MePool {
);
self.me_pool_drain_soft_evict_cooldown_ms
.store(pool_drain_soft_evict_cooldown_ms.max(1), Ordering::Relaxed);
self.me_pool_force_close_secs
.store(force_close_secs, Ordering::Relaxed);
self.me_pool_force_close_secs.store(
Self::normalize_force_close_secs(force_close_secs),
Ordering::Relaxed,
);
self.me_pool_min_fresh_ratio_permille
.store(Self::ratio_to_permille(min_fresh_ratio), Ordering::Relaxed);
self.me_hardswap_warmup_delay_min_ms
@@ -728,12 +980,9 @@ impl MePool {
}
pub(super) fn force_close_timeout(&self) -> Option<Duration> {
let secs = self.me_pool_force_close_secs.load(Ordering::Relaxed);
if secs == 0 {
None
} else {
Some(Duration::from_secs(secs))
}
let secs =
Self::normalize_force_close_secs(self.me_pool_force_close_secs.load(Ordering::Relaxed));
Some(Duration::from_secs(secs))
}
pub(super) fn drain_soft_evict_enabled(&self) -> bool {
@@ -1005,9 +1254,10 @@ impl MePool {
}
pub(super) async fn active_coverage_required_total(&self) -> usize {
let now_epoch_secs = Self::now_epoch_secs();
let mut endpoints_by_dc = HashMap::<i32, HashSet<SocketAddr>>::new();
if self.decision.ipv4_me {
if self.family_enabled_for_drain_coverage(IpFamily::V4, now_epoch_secs) {
let map = self.proxy_map_v4.read().await;
for (dc, addrs) in map.iter() {
let entry = endpoints_by_dc.entry(*dc).or_default();
@@ -1017,7 +1267,7 @@ impl MePool {
}
}
if self.decision.ipv6_me {
if self.family_enabled_for_drain_coverage(IpFamily::V6, now_epoch_secs) {
let map = self.proxy_map_v6.read().await;
for (dc, addrs) in map.iter() {
let entry = endpoints_by_dc.entry(*dc).or_default();

View File

@@ -74,9 +74,8 @@ impl MePool {
debug!(
%addr,
wait_ms = expiry.saturating_duration_since(now).as_millis(),
"All ME endpoints are quarantined for the DC group; retrying earliest one"
"All ME endpoints are quarantined for the DC group; waiting for quarantine expiry"
);
return vec![addr];
}
Vec::new()
@@ -165,9 +164,10 @@ impl MePool {
}
async fn endpoints_for_dc(&self, target_dc: i32) -> Vec<SocketAddr> {
let now_epoch_secs = Self::now_epoch_secs();
let mut endpoints = HashSet::<SocketAddr>::new();
if self.decision.ipv4_me {
if self.family_enabled_for_drain_coverage(IpFamily::V4, now_epoch_secs) {
let map = self.proxy_map_v4.read().await;
if let Some(addrs) = map.get(&target_dc) {
for (ip, port) in addrs {
@@ -176,7 +176,7 @@ impl MePool {
}
}
if self.decision.ipv6_me {
if self.family_enabled_for_drain_coverage(IpFamily::V6, now_epoch_secs) {
let map = self.proxy_map_v6.read().await;
if let Some(addrs) = map.get(&target_dc) {
for (ip, port) in addrs {

View File

@@ -11,8 +11,9 @@ use tracing::{debug, info, warn};
use std::collections::hash_map::DefaultHasher;
use crate::crypto::SecureRandom;
use crate::network::IpFamily;
use super::pool::{MePool, WriterContour};
use super::pool::{MeDrainGateReason, MePool, WriterContour};
const ME_HARDSWAP_PENDING_TTL_SECS: u64 = 1800;
@@ -120,9 +121,10 @@ impl MePool {
}
async fn desired_dc_endpoints(&self) -> HashMap<i32, HashSet<SocketAddr>> {
let now_epoch_secs = Self::now_epoch_secs();
let mut out: HashMap<i32, HashSet<SocketAddr>> = HashMap::new();
if self.decision.ipv4_me {
if self.family_enabled_for_drain_coverage(IpFamily::V4, now_epoch_secs) {
let map_v4 = self.proxy_map_v4.read().await.clone();
for (dc, addrs) in map_v4 {
let entry = out.entry(dc).or_default();
@@ -132,7 +134,7 @@ impl MePool {
}
}
if self.decision.ipv6_me {
if self.family_enabled_for_drain_coverage(IpFamily::V6, now_epoch_secs) {
let map_v6 = self.proxy_map_v6.read().await.clone();
for (dc, addrs) in map_v6 {
let entry = out.entry(dc).or_default();
@@ -313,13 +315,23 @@ impl MePool {
pub async fn zero_downtime_reinit_after_map_change(self: &Arc<Self>, rng: &SecureRandom) {
let desired_by_dc = self.desired_dc_endpoints().await;
let now_epoch_secs = Self::now_epoch_secs();
let v4_suppressed = self.is_family_temporarily_suppressed(IpFamily::V4, now_epoch_secs);
let v6_suppressed = self.is_family_temporarily_suppressed(IpFamily::V6, now_epoch_secs);
if desired_by_dc.is_empty() {
warn!("ME endpoint map is empty; skipping stale writer drain");
let reason = if (self.decision.ipv4_me && v4_suppressed)
|| (self.decision.ipv6_me && v6_suppressed)
{
MeDrainGateReason::SuppressionActive
} else {
MeDrainGateReason::CoverageQuorum
};
self.set_last_drain_gate(false, false, reason, now_epoch_secs);
return;
}
let desired_map_hash = Self::desired_map_hash(&desired_by_dc);
let now_epoch_secs = Self::now_epoch_secs();
let previous_generation = self.current_generation();
let hardswap = self.hardswap.load(Ordering::Relaxed);
let generation = if hardswap {
@@ -390,7 +402,17 @@ impl MePool {
.load(Ordering::Relaxed),
);
let (coverage_ratio, missing_dc) = Self::coverage_ratio(&desired_by_dc, &active_writer_addrs);
let mut route_quorum_ok = coverage_ratio >= min_ratio;
let mut redundancy_ok = missing_dc.is_empty();
let mut redundancy_missing_dc = missing_dc.clone();
let mut gate_coverage_ratio = coverage_ratio;
if !hardswap && coverage_ratio < min_ratio {
self.set_last_drain_gate(
false,
redundancy_ok,
MeDrainGateReason::CoverageQuorum,
now_epoch_secs,
);
warn!(
previous_generation,
generation,
@@ -411,7 +433,17 @@ impl MePool {
.collect();
let (fresh_coverage_ratio, fresh_missing_dc) =
Self::coverage_ratio(&desired_by_dc, &fresh_writer_addrs);
if !fresh_missing_dc.is_empty() {
route_quorum_ok = fresh_coverage_ratio >= min_ratio;
redundancy_ok = fresh_missing_dc.is_empty();
redundancy_missing_dc = fresh_missing_dc.clone();
gate_coverage_ratio = fresh_coverage_ratio;
if fresh_coverage_ratio < min_ratio {
self.set_last_drain_gate(
false,
redundancy_ok,
MeDrainGateReason::CoverageQuorum,
now_epoch_secs,
);
warn!(
previous_generation,
generation,
@@ -421,13 +453,16 @@ impl MePool {
);
return;
}
} else if !missing_dc.is_empty() {
}
self.set_last_drain_gate(route_quorum_ok, redundancy_ok, MeDrainGateReason::Open, now_epoch_secs);
if !redundancy_ok {
warn!(
missing_dc = ?missing_dc,
// Keep stale writers alive when fresh coverage is incomplete.
"ME reinit coverage incomplete; keeping stale writers"
missing_dc = ?redundancy_missing_dc,
coverage_ratio = format_args!("{gate_coverage_ratio:.3}"),
min_ratio = format_args!("{min_ratio:.3}"),
"ME reinit proceeds with weighted quorum while some DC groups remain uncovered"
);
return;
}
if hardswap {

View File

@@ -1,7 +1,7 @@
use std::collections::HashMap;
use std::time::Instant;
use super::pool::{MePool, RefillDcKey};
use super::pool::{MeDrainGateReason, MePool, RefillDcKey};
use crate::network::IpFamily;
#[derive(Clone, Debug)]
@@ -36,6 +36,24 @@ pub(crate) struct MeApiNatStunSnapshot {
pub stun_backoff_remaining_ms: Option<u64>,
}
#[derive(Clone, Debug)]
pub(crate) struct MeApiFamilyStateSnapshot {
pub family: &'static str,
pub state: &'static str,
pub state_since_epoch_secs: u64,
pub suppressed_until_epoch_secs: Option<u64>,
pub fail_streak: u32,
pub recover_success_streak: u32,
}
#[derive(Clone, Debug)]
pub(crate) struct MeApiDrainGateSnapshot {
pub route_quorum_ok: bool,
pub redundancy_ok: bool,
pub block_reason: &'static str,
pub updated_at_epoch_secs: u64,
}
impl MePool {
pub(crate) async fn api_refill_snapshot(&self) -> MeApiRefillSnapshot {
let inflight_endpoints_total = self.refill_inflight.lock().await.len();
@@ -125,4 +143,35 @@ impl MePool {
stun_backoff_remaining_ms,
}
}
pub(crate) fn api_family_state_snapshot(&self) -> Vec<MeApiFamilyStateSnapshot> {
[IpFamily::V4, IpFamily::V6]
.into_iter()
.map(|family| {
let state = self.family_runtime_state(family);
let suppressed_until = self.family_suppressed_until_epoch_secs(family);
MeApiFamilyStateSnapshot {
family: match family {
IpFamily::V4 => "v4",
IpFamily::V6 => "v6",
},
state: state.as_str(),
state_since_epoch_secs: self.family_runtime_state_since_epoch_secs(family),
suppressed_until_epoch_secs: (suppressed_until != 0).then_some(suppressed_until),
fail_streak: self.family_fail_streak(family),
recover_success_streak: self.family_recover_success_streak(family),
}
})
.collect()
}
pub(crate) fn api_drain_gate_snapshot(&self) -> MeApiDrainGateSnapshot {
let reason: MeDrainGateReason = self.last_drain_gate_block_reason();
MeApiDrainGateSnapshot {
route_quorum_ok: self.last_drain_gate_route_quorum_ok(),
redundancy_ok: self.last_drain_gate_redundancy_ok(),
block_reason: reason.as_str(),
updated_at_epoch_secs: self.last_drain_gate_updated_at_epoch_secs(),
}
}
}

View File

@@ -126,6 +126,7 @@ pub(crate) struct MeApiRuntimeSnapshot {
pub me_reconnect_backoff_cap_ms: u64,
pub me_reconnect_fast_retry_count: u32,
pub me_pool_drain_ttl_secs: u64,
pub me_instadrain: bool,
pub me_pool_drain_soft_evict_enabled: bool,
pub me_pool_drain_soft_evict_grace_secs: u64,
pub me_pool_drain_soft_evict_per_writer: u8,
@@ -583,6 +584,7 @@ impl MePool {
me_reconnect_backoff_cap_ms: self.me_reconnect_backoff_cap.as_millis() as u64,
me_reconnect_fast_retry_count: self.me_reconnect_fast_retry_count,
me_pool_drain_ttl_secs: self.me_pool_drain_ttl_secs.load(Ordering::Relaxed),
me_instadrain: self.me_instadrain.load(Ordering::Relaxed),
me_pool_drain_soft_evict_enabled: self
.me_pool_drain_soft_evict_enabled
.load(Ordering::Relaxed),

View File

@@ -16,11 +16,13 @@ use crate::config::MeBindStaleMode;
use crate::crypto::SecureRandom;
use crate::error::{ProxyError, Result};
use crate::protocol::constants::{RPC_CLOSE_EXT_U32, RPC_PING_U32};
use crate::stats::{
MeWriterCleanupSideEffectStep, MeWriterTeardownMode, MeWriterTeardownReason,
};
use super::codec::{RpcWriter, WriterCommand};
use super::pool::{MePool, MeWriter, WriterContour};
use super::reader::reader_loop;
use super::registry::BoundConn;
use super::wire::build_proxy_req_payload;
const ME_ACTIVE_PING_SECS: u64 = 25;
@@ -28,6 +30,12 @@ const ME_ACTIVE_PING_JITTER_SECS: i64 = 5;
const ME_IDLE_KEEPALIVE_MAX_SECS: u64 = 5;
const ME_RPC_PROXY_REQ_RESPONSE_WAIT_MS: u64 = 700;
#[derive(Clone, Copy)]
enum WriterRemoveGuardMode {
Any,
DrainingOnly,
}
fn is_me_peer_closed_error(error: &ProxyError) -> bool {
matches!(error, ProxyError::Io(ioe) if ioe.kind() == ErrorKind::UnexpectedEof)
}
@@ -44,9 +52,16 @@ impl MePool {
for writer_id in closed_writer_ids {
if self.registry.is_writer_empty(writer_id).await {
let _ = self.remove_writer_only(writer_id).await;
let _ = self
.remove_writer_only(writer_id, MeWriterTeardownReason::PruneClosedWriter)
.await;
} else {
let _ = self.remove_writer_and_close_clients(writer_id).await;
let _ = self
.remove_writer_and_close_clients(
writer_id,
MeWriterTeardownReason::PruneClosedWriter,
)
.await;
}
}
}
@@ -143,6 +158,9 @@ impl MePool {
crc_mode: hs.crc_mode,
};
let cancel_wr = cancel.clone();
let cleanup_done = Arc::new(AtomicBool::new(false));
let cleanup_for_writer = cleanup_done.clone();
let pool_writer_task = Arc::downgrade(self);
tokio::spawn(async move {
loop {
tokio::select! {
@@ -160,6 +178,20 @@ impl MePool {
_ = cancel_wr.cancelled() => break,
}
}
if cleanup_for_writer
.compare_exchange(false, true, Ordering::AcqRel, Ordering::Relaxed)
.is_ok()
{
if let Some(pool) = pool_writer_task.upgrade() {
pool.remove_writer_and_close_clients(
writer_id,
MeWriterTeardownReason::WriterTaskExit,
)
.await;
} else {
cancel_wr.cancel();
}
}
});
let writer = MeWriter {
id: writer_id,
@@ -196,7 +228,6 @@ impl MePool {
let cancel_ping = cancel.clone();
let tx_ping = tx.clone();
let ping_tracker_ping = ping_tracker.clone();
let cleanup_done = Arc::new(AtomicBool::new(false));
let cleanup_for_reader = cleanup_done.clone();
let cleanup_for_ping = cleanup_done.clone();
let keepalive_enabled = self.me_keepalive_enabled;
@@ -242,21 +273,29 @@ impl MePool {
stats_reader_close.increment_me_idle_close_by_peer_total();
info!(writer_id, "ME socket closed by peer on idle writer");
}
if let Some(pool) = pool.upgrade()
&& cleanup_for_reader
.compare_exchange(false, true, Ordering::AcqRel, Ordering::Relaxed)
.is_ok()
if cleanup_for_reader
.compare_exchange(false, true, Ordering::AcqRel, Ordering::Relaxed)
.is_ok()
{
pool.remove_writer_and_close_clients(writer_id).await;
if let Some(pool) = pool.upgrade() {
pool.remove_writer_and_close_clients(
writer_id,
MeWriterTeardownReason::ReaderExit,
)
.await;
} else {
// Fallback for shutdown races: make writer task exit quickly so stale
// channels are observable by periodic prune.
cancel_reader_token.cancel();
}
}
if let Err(e) = res {
if !idle_close_by_peer {
warn!(error = %e, "ME reader ended");
}
}
let mut ws = writers_arc.write().await;
ws.retain(|w| w.id != writer_id);
info!(remaining = ws.len(), "Dead ME writer removed from pool");
let remaining = writers_arc.read().await.len();
debug!(writer_id, remaining, "ME reader task finished");
});
let pool_ping = Arc::downgrade(self);
@@ -351,7 +390,11 @@ impl MePool {
.compare_exchange(false, true, Ordering::AcqRel, Ordering::Relaxed)
.is_ok()
{
pool.remove_writer_and_close_clients(writer_id).await;
pool.remove_writer_and_close_clients(
writer_id,
MeWriterTeardownReason::PingSendFail,
)
.await;
}
break;
}
@@ -444,7 +487,11 @@ impl MePool {
.compare_exchange(false, true, Ordering::AcqRel, Ordering::Relaxed)
.is_ok()
{
pool.remove_writer_and_close_clients(writer_id).await;
pool.remove_writer_and_close_clients(
writer_id,
MeWriterTeardownReason::SignalSendFail,
)
.await;
}
break;
}
@@ -478,7 +525,11 @@ impl MePool {
.compare_exchange(false, true, Ordering::AcqRel, Ordering::Relaxed)
.is_ok()
{
pool.remove_writer_and_close_clients(writer_id).await;
pool.remove_writer_and_close_clients(
writer_id,
MeWriterTeardownReason::SignalSendFail,
)
.await;
}
break;
}
@@ -491,21 +542,83 @@ impl MePool {
Ok(())
}
pub(crate) async fn remove_writer_and_close_clients(self: &Arc<Self>, writer_id: u64) {
pub(crate) async fn remove_writer_and_close_clients(
self: &Arc<Self>,
writer_id: u64,
reason: MeWriterTeardownReason,
) -> bool {
// Full client cleanup now happens inside `registry.writer_lost` to keep
// writer reap/remove paths strictly non-blocking per connection.
let _ = self.remove_writer_only(writer_id).await;
self.remove_writer_with_mode(
writer_id,
reason,
MeWriterTeardownMode::Normal,
WriterRemoveGuardMode::Any,
)
.await
}
async fn remove_writer_only(self: &Arc<Self>, writer_id: u64) -> Vec<BoundConn> {
pub(super) async fn remove_draining_writer_hard_detach(
self: &Arc<Self>,
writer_id: u64,
reason: MeWriterTeardownReason,
) -> bool {
self.remove_writer_with_mode(
writer_id,
reason,
MeWriterTeardownMode::HardDetach,
WriterRemoveGuardMode::DrainingOnly,
)
.await
}
async fn remove_writer_only(
self: &Arc<Self>,
writer_id: u64,
reason: MeWriterTeardownReason,
) -> bool {
self.remove_writer_with_mode(
writer_id,
reason,
MeWriterTeardownMode::Normal,
WriterRemoveGuardMode::Any,
)
.await
}
// Authoritative teardown primitive shared by normal cleanup and watchdog path.
// Lock-order invariant:
// 1) mutate `writers` under pool write lock,
// 2) release pool lock,
// 3) run registry/metrics/refill side effects.
// `registry.writer_lost` must never run while `writers` lock is held.
async fn remove_writer_with_mode(
self: &Arc<Self>,
writer_id: u64,
reason: MeWriterTeardownReason,
mode: MeWriterTeardownMode,
guard_mode: WriterRemoveGuardMode,
) -> bool {
let started_at = Instant::now();
self.stats
.increment_me_writer_teardown_attempt_total(reason, mode);
let mut close_tx: Option<mpsc::Sender<WriterCommand>> = None;
let mut removed_addr: Option<SocketAddr> = None;
let mut removed_dc: Option<i32> = None;
let mut removed_uptime: Option<Duration> = None;
let mut trigger_refill = false;
let mut removed = false;
{
let mut ws = self.writers.write().await;
if let Some(pos) = ws.iter().position(|w| w.id == writer_id) {
if matches!(guard_mode, WriterRemoveGuardMode::DrainingOnly)
&& !ws[pos].draining.load(Ordering::Relaxed)
{
self.stats.increment_me_writer_teardown_noop_total();
self.stats
.observe_me_writer_teardown_duration(mode, started_at.elapsed());
return false;
}
let w = ws.remove(pos);
let was_draining = w.draining.load(Ordering::Relaxed);
if was_draining {
@@ -522,6 +635,7 @@ impl MePool {
}
close_tx = Some(w.tx.clone());
self.conn_count.fetch_sub(1, Ordering::Relaxed);
removed = true;
}
}
// State invariant:
@@ -529,7 +643,7 @@ impl MePool {
// - writer is removed from registry routing/binding maps via `writer_lost`.
// The close command below is only a best-effort accelerator for task shutdown.
// Cleanup progress must never depend on command-channel availability.
let conns = self.registry.writer_lost(writer_id).await;
let _ = self.registry.writer_lost(writer_id).await;
{
let mut tracker = self.ping_tracker.lock().await;
tracker.retain(|_, (_, wid)| *wid != writer_id);
@@ -542,6 +656,9 @@ impl MePool {
self.stats.increment_me_writer_close_signal_drop_total();
self.stats
.increment_me_writer_close_signal_channel_full_total();
self.stats.increment_me_writer_cleanup_side_effect_failures_total(
MeWriterCleanupSideEffectStep::CloseSignalChannelFull,
);
debug!(
writer_id,
"Skipping close signal for removed writer: command channel is full"
@@ -549,6 +666,9 @@ impl MePool {
}
Err(TrySendError::Closed(_)) => {
self.stats.increment_me_writer_close_signal_drop_total();
self.stats.increment_me_writer_cleanup_side_effect_failures_total(
MeWriterCleanupSideEffectStep::CloseSignalChannelClosed,
);
debug!(
writer_id,
"Skipping close signal for removed writer: command channel is closed"
@@ -556,16 +676,24 @@ impl MePool {
}
}
}
if trigger_refill
&& let Some(addr) = removed_addr
&& let Some(writer_dc) = removed_dc
{
if let Some(addr) = removed_addr {
if let Some(uptime) = removed_uptime {
self.maybe_quarantine_flapping_endpoint(addr, uptime).await;
}
self.trigger_immediate_refill_for_dc(addr, writer_dc);
if trigger_refill
&& let Some(writer_dc) = removed_dc
{
self.trigger_immediate_refill_for_dc(addr, writer_dc);
}
}
conns
if removed {
self.stats.increment_me_writer_teardown_success_total(mode);
} else {
self.stats.increment_me_writer_teardown_noop_total();
}
self.stats
.observe_me_writer_teardown_duration(mode, started_at.elapsed());
removed
}
pub(crate) async fn mark_writer_draining_with_timeout(

View File

@@ -14,6 +14,7 @@ use crate::config::{MeRouteNoWriterMode, MeWriterPickMode};
use crate::error::{ProxyError, Result};
use crate::network::IpFamily;
use crate::protocol::constants::{RPC_CLOSE_CONN_U32, RPC_CLOSE_EXT_U32};
use crate::stats::MeWriterTeardownReason;
use super::MePool;
use super::codec::WriterCommand;
@@ -134,7 +135,11 @@ impl MePool {
Ok(()) => return Ok(()),
Err(TimedSendError::Closed(_)) => {
warn!(writer_id = current.writer_id, "ME writer channel closed");
self.remove_writer_and_close_clients(current.writer_id).await;
self.remove_writer_and_close_clients(
current.writer_id,
MeWriterTeardownReason::RouteChannelClosed,
)
.await;
continue;
}
Err(TimedSendError::Timeout(_)) => {
@@ -151,7 +156,11 @@ impl MePool {
}
Err(TrySendError::Closed(_)) => {
warn!(writer_id = current.writer_id, "ME writer channel closed");
self.remove_writer_and_close_clients(current.writer_id).await;
self.remove_writer_and_close_clients(
current.writer_id,
MeWriterTeardownReason::RouteChannelClosed,
)
.await;
continue;
}
}
@@ -458,7 +467,11 @@ impl MePool {
Err(TrySendError::Closed(_)) => {
self.stats.increment_me_writer_pick_closed_total(pick_mode);
warn!(writer_id = w.id, "ME writer channel closed");
self.remove_writer_and_close_clients(w.id).await;
self.remove_writer_and_close_clients(
w.id,
MeWriterTeardownReason::RouteChannelClosed,
)
.await;
continue;
}
}
@@ -503,7 +516,11 @@ impl MePool {
Err(TimedSendError::Closed(_)) => {
self.stats.increment_me_writer_pick_closed_total(pick_mode);
warn!(writer_id = w.id, "ME writer channel closed (blocking)");
self.remove_writer_and_close_clients(w.id).await;
self.remove_writer_and_close_clients(
w.id,
MeWriterTeardownReason::RouteChannelClosed,
)
.await;
}
Err(TimedSendError::Timeout(_)) => {
self.stats.increment_me_writer_pick_full_total(pick_mode);
@@ -654,7 +671,11 @@ impl MePool {
}
Err(TrySendError::Closed(_)) => {
debug!("ME close write failed");
self.remove_writer_and_close_clients(w.writer_id).await;
self.remove_writer_and_close_clients(
w.writer_id,
MeWriterTeardownReason::CloseRpcChannelClosed,
)
.await;
}
}
} else {

View File

@@ -2,6 +2,7 @@
pub mod pool;
pub mod proxy_protocol;
pub mod shadowsocks;
pub mod socket;
pub mod socks;
pub mod upstream;
@@ -14,5 +15,8 @@ pub use socket::*;
#[allow(unused_imports)]
pub use socks::*;
#[allow(unused_imports)]
pub use upstream::{DcPingResult, StartupPingResult, UpstreamEgressInfo, UpstreamManager, UpstreamRouteKind};
pub use upstream::{
DcPingResult, StartupPingResult, UpstreamEgressInfo, UpstreamManager, UpstreamRouteKind,
UpstreamStream,
};
pub mod middle_proxy;

View File

@@ -0,0 +1,60 @@
use std::net::{IpAddr, SocketAddr};
use std::time::Duration;
use shadowsocks::{
ProxyClientStream,
config::{ServerConfig, ServerType},
context::Context,
net::ConnectOpts,
};
use crate::error::{ProxyError, Result};
pub(crate) type ShadowsocksStream = ProxyClientStream<shadowsocks::net::TcpStream>;
fn parse_server_config(url: &str, connect_timeout: Duration) -> Result<ServerConfig> {
let mut config = ServerConfig::from_url(url)
.map_err(|error| ProxyError::Config(format!("invalid shadowsocks url: {error}")))?;
if config.plugin().is_some() {
return Err(ProxyError::Config(
"shadowsocks plugins are not supported".to_string(),
));
}
config.set_timeout(connect_timeout);
Ok(config)
}
pub(crate) fn sanitize_shadowsocks_url(url: &str) -> Result<String> {
Ok(parse_server_config(url, Duration::from_secs(1))?
.addr()
.to_string())
}
fn connect_opts_for_interface(interface: &Option<String>) -> ConnectOpts {
let mut opts = ConnectOpts::default();
if let Some(interface) = interface {
if let Ok(ip) = interface.parse::<IpAddr>() {
opts.bind_local_addr = Some(SocketAddr::new(ip, 0));
} else {
opts.bind_interface = Some(interface.clone());
}
}
opts
}
pub(crate) async fn connect_shadowsocks(
url: &str,
interface: &Option<String>,
target: SocketAddr,
connect_timeout: Duration,
) -> Result<ShadowsocksStream> {
let config = parse_server_config(url, connect_timeout)?;
let context = Context::new_shared(ServerType::Local);
let opts = connect_opts_for_interface(interface);
ProxyClientStream::connect_with_opts(context, &config, target, &opts)
.await
.map_err(ProxyError::Io)
}

View File

@@ -4,22 +4,28 @@
#![allow(deprecated)]
use rand::Rng;
use std::collections::{BTreeSet, HashMap};
use std::net::{SocketAddr, IpAddr};
use std::net::{IpAddr, SocketAddr};
use std::pin::Pin;
use std::sync::Arc;
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};
use std::task::{Context, Poll};
use std::time::Duration;
use tokio::io::{AsyncRead, AsyncWrite, ReadBuf};
use tokio::net::TcpStream;
use tokio::sync::RwLock;
use tokio::time::Instant;
use rand::Rng;
use tracing::{debug, warn, info, trace};
use tracing::{debug, info, trace, warn};
use crate::config::{UpstreamConfig, UpstreamType};
use crate::error::{Result, ProxyError};
use crate::error::{ProxyError, Result};
use crate::network::dns_overrides::{resolve_socket_addr, split_host_port};
use crate::protocol::constants::{TG_DATACENTERS_V4, TG_DATACENTERS_V6, TG_DATACENTER_PORT};
use crate::protocol::constants::{TG_DATACENTER_PORT, TG_DATACENTERS_V4, TG_DATACENTERS_V6};
use crate::stats::Stats;
use crate::transport::shadowsocks::{
ShadowsocksStream, connect_shadowsocks, sanitize_shadowsocks_url,
};
use crate::transport::socket::{create_outgoing_socket_bound, resolve_interface_ip};
use crate::transport::socks::{connect_socks4, connect_socks5};
@@ -47,7 +53,10 @@ struct LatencyEma {
impl LatencyEma {
const fn new(alpha: f64) -> Self {
Self { value_ms: None, alpha }
Self {
value_ms: None,
alpha,
}
}
fn update(&mut self, sample_ms: f64) {
@@ -131,11 +140,17 @@ impl UpstreamState {
return Some(ms);
}
let (sum, count) = self.dc_latency.iter()
let (sum, count) = self
.dc_latency
.iter()
.filter_map(|l| l.get())
.fold((0.0, 0u32), |(s, c), v| (s + v, c + 1));
if count > 0 { Some(sum / count as f64) } else { None }
if count > 0 {
Some(sum / count as f64)
} else {
None
}
}
}
@@ -158,11 +173,78 @@ pub struct StartupPingResult {
pub both_available: bool,
}
pub enum UpstreamStream {
Tcp(TcpStream),
Shadowsocks(Box<ShadowsocksStream>),
}
impl std::fmt::Debug for UpstreamStream {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Self::Tcp(_) => f.write_str("UpstreamStream::Tcp(..)"),
Self::Shadowsocks(_) => f.write_str("UpstreamStream::Shadowsocks(..)"),
}
}
}
impl UpstreamStream {
pub fn into_tcp(self) -> Result<TcpStream> {
match self {
Self::Tcp(stream) => Ok(stream),
Self::Shadowsocks(_) => Err(ProxyError::Config(
"shadowsocks upstreams are not supported when general.use_middle_proxy = true"
.to_string(),
)),
}
}
}
impl AsyncRead for UpstreamStream {
fn poll_read(
self: Pin<&mut Self>,
cx: &mut Context<'_>,
buf: &mut ReadBuf<'_>,
) -> Poll<std::io::Result<()>> {
match self.get_mut() {
Self::Tcp(stream) => Pin::new(stream).poll_read(cx, buf),
Self::Shadowsocks(stream) => Pin::new(stream.as_mut()).poll_read(cx, buf),
}
}
}
impl AsyncWrite for UpstreamStream {
fn poll_write(
self: Pin<&mut Self>,
cx: &mut Context<'_>,
buf: &[u8],
) -> Poll<std::io::Result<usize>> {
match self.get_mut() {
Self::Tcp(stream) => Pin::new(stream).poll_write(cx, buf),
Self::Shadowsocks(stream) => Pin::new(stream.as_mut()).poll_write(cx, buf),
}
}
fn poll_flush(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<std::io::Result<()>> {
match self.get_mut() {
Self::Tcp(stream) => Pin::new(stream).poll_flush(cx),
Self::Shadowsocks(stream) => Pin::new(stream.as_mut()).poll_flush(cx),
}
}
fn poll_shutdown(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<std::io::Result<()>> {
match self.get_mut() {
Self::Tcp(stream) => Pin::new(stream).poll_shutdown(cx),
Self::Shadowsocks(stream) => Pin::new(stream.as_mut()).poll_shutdown(cx),
}
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UpstreamRouteKind {
Direct,
Socks4,
Socks5,
Shadowsocks,
}
#[derive(Debug, Clone)]
@@ -194,6 +276,7 @@ pub struct UpstreamApiSummarySnapshot {
pub direct_total: usize,
pub socks4_total: usize,
pub socks5_total: usize,
pub shadowsocks_total: usize,
}
#[derive(Debug, Clone)]
@@ -253,7 +336,8 @@ impl UpstreamManager {
connect_failfast_hard_errors: bool,
stats: Arc<Stats>,
) -> Self {
let states = configs.into_iter()
let states = configs
.into_iter()
.filter(|c| c.enabled)
.map(UpstreamState::new)
.collect();
@@ -311,20 +395,13 @@ impl UpstreamManager {
summary.unhealthy_total += 1;
}
let (route_kind, address) = match &upstream.config.upstream_type {
UpstreamType::Direct { .. } => {
summary.direct_total += 1;
(UpstreamRouteKind::Direct, "direct".to_string())
}
UpstreamType::Socks4 { address, .. } => {
summary.socks4_total += 1;
(UpstreamRouteKind::Socks4, address.clone())
}
UpstreamType::Socks5 { address, .. } => {
summary.socks5_total += 1;
(UpstreamRouteKind::Socks5, address.clone())
}
};
let (route_kind, address) = Self::describe_upstream(&upstream.config.upstream_type);
match route_kind {
UpstreamRouteKind::Direct => summary.direct_total += 1,
UpstreamRouteKind::Socks4 => summary.socks4_total += 1,
UpstreamRouteKind::Socks5 => summary.socks5_total += 1,
UpstreamRouteKind::Shadowsocks => summary.shadowsocks_total += 1,
}
let mut dc = Vec::with_capacity(NUM_DCS);
for dc_idx in 0..NUM_DCS {
@@ -352,6 +429,18 @@ impl UpstreamManager {
Some(UpstreamApiSnapshot { summary, upstreams })
}
fn describe_upstream(upstream_type: &UpstreamType) -> (UpstreamRouteKind, String) {
match upstream_type {
UpstreamType::Direct { .. } => (UpstreamRouteKind::Direct, "direct".to_string()),
UpstreamType::Socks4 { address, .. } => (UpstreamRouteKind::Socks4, address.clone()),
UpstreamType::Socks5 { address, .. } => (UpstreamRouteKind::Socks5, address.clone()),
UpstreamType::Shadowsocks { url, .. } => (
UpstreamRouteKind::Shadowsocks,
sanitize_shadowsocks_url(url).unwrap_or_else(|_| "invalid".to_string()),
),
}
}
pub fn api_policy_snapshot(&self) -> UpstreamApiPolicySnapshot {
UpstreamApiPolicySnapshot {
connect_retry_attempts: self.connect_retry_attempts,
@@ -539,44 +628,44 @@ impl UpstreamManager {
// Scope filter:
// If scope is set: only scoped and matched items
// If scope is not set: only unscoped items
let filtered_upstreams : Vec<usize> = upstreams.iter()
let filtered_upstreams: Vec<usize> = upstreams
.iter()
.enumerate()
.filter(|(_, u)| {
scope.map_or(
u.config.scopes.is_empty(),
|req_scope| {
u.config.scopes
.split(',')
.map(str::trim)
.any(|s| s == req_scope)
}
)
scope.map_or(u.config.scopes.is_empty(), |req_scope| {
u.config
.scopes
.split(',')
.map(str::trim)
.any(|s| s == req_scope)
})
})
.map(|(i, _)| i)
.collect();
// Healthy filter
let healthy: Vec<usize> = filtered_upstreams.iter()
let healthy: Vec<usize> = filtered_upstreams
.iter()
.filter(|&&i| upstreams[i].healthy)
.copied()
.collect();
if filtered_upstreams.is_empty() {
if Self::should_emit_warn(
self.no_upstreams_warn_epoch_ms.as_ref(),
5_000,
) {
warn!(scope = scope, "No upstreams available! Using first (direct?)");
if Self::should_emit_warn(self.no_upstreams_warn_epoch_ms.as_ref(), 5_000) {
warn!(
scope = scope,
"No upstreams available! Using first (direct?)"
);
}
return None;
}
if healthy.is_empty() {
if Self::should_emit_warn(
self.no_healthy_warn_epoch_ms.as_ref(),
5_000,
) {
warn!(scope = scope, "No healthy upstreams available! Using random.");
if Self::should_emit_warn(self.no_healthy_warn_epoch_ms.as_ref(), 5_000) {
warn!(
scope = scope,
"No healthy upstreams available! Using random."
);
}
return Some(filtered_upstreams[rand::rng().gen_range(0..filtered_upstreams.len())]);
}
@@ -585,14 +674,18 @@ impl UpstreamManager {
return Some(healthy[0]);
}
let weights: Vec<(usize, f64)> = healthy.iter().map(|&i| {
let base = upstreams[i].config.weight as f64;
let latency_factor = upstreams[i].effective_latency(dc_idx)
.map(|ms| if ms > 1.0 { 1000.0 / ms } else { 1000.0 })
.unwrap_or(1.0);
let weights: Vec<(usize, f64)> = healthy
.iter()
.map(|&i| {
let base = upstreams[i].config.weight as f64;
let latency_factor = upstreams[i]
.effective_latency(dc_idx)
.map(|ms| if ms > 1.0 { 1000.0 / ms } else { 1000.0 })
.unwrap_or(1.0);
(i, base * latency_factor)
}).collect();
(i, base * latency_factor)
})
.collect();
let total: f64 = weights.iter().map(|(_, w)| w).sum();
@@ -620,8 +713,34 @@ impl UpstreamManager {
}
/// Connect to target through a selected upstream.
pub async fn connect(&self, target: SocketAddr, dc_idx: Option<i16>, scope: Option<&str>) -> Result<TcpStream> {
let (stream, _) = self.connect_with_details(target, dc_idx, scope).await?;
pub async fn connect(
&self,
target: SocketAddr,
dc_idx: Option<i16>,
scope: Option<&str>,
) -> Result<UpstreamStream> {
let idx = self
.select_upstream(dc_idx, scope)
.await
.ok_or_else(|| ProxyError::Config("No upstreams available".to_string()))?;
let mut upstream = {
let guard = self.upstreams.read().await;
guard[idx].config.clone()
};
if let Some(s) = scope {
upstream.selected_scope = s.to_string();
}
let bind_rr = {
let guard = self.upstreams.read().await;
guard.get(idx).map(|u| u.bind_rr.clone())
};
let (stream, _) = self
.connect_selected_upstream(idx, upstream, target, dc_idx, bind_rr)
.await?;
Ok(stream)
}
@@ -632,7 +751,9 @@ impl UpstreamManager {
dc_idx: Option<i16>,
scope: Option<&str>,
) -> Result<(TcpStream, UpstreamEgressInfo)> {
let idx = self.select_upstream(dc_idx, scope).await
let idx = self
.select_upstream(dc_idx, scope)
.await
.ok_or_else(|| ProxyError::Config("No upstreams available".to_string()))?;
let mut upstream = {
@@ -650,6 +771,20 @@ impl UpstreamManager {
guard.get(idx).map(|u| u.bind_rr.clone())
};
let (stream, egress) = self
.connect_selected_upstream(idx, upstream, target, dc_idx, bind_rr)
.await?;
Ok((stream.into_tcp()?, egress))
}
async fn connect_selected_upstream(
&self,
idx: usize,
upstream: UpstreamConfig,
target: SocketAddr,
dc_idx: Option<i16>,
bind_rr: Option<Arc<AtomicUsize>>,
) -> Result<(UpstreamStream, UpstreamEgressInfo)> {
let connect_started_at = Instant::now();
let mut last_error: Option<ProxyError> = None;
let mut attempts_used = 0u32;
@@ -662,8 +797,8 @@ impl UpstreamManager {
break;
}
let remaining_budget = self.connect_budget.saturating_sub(elapsed);
let attempt_timeout = Duration::from_secs(DIRECT_CONNECT_TIMEOUT_SECS)
.min(remaining_budget);
let attempt_timeout =
Duration::from_secs(DIRECT_CONNECT_TIMEOUT_SECS).min(remaining_budget);
if attempt_timeout.is_zero() {
last_error = Some(ProxyError::ConnectionTimeout {
addr: target.to_string(),
@@ -786,9 +921,12 @@ impl UpstreamManager {
target: SocketAddr,
bind_rr: Option<Arc<AtomicUsize>>,
connect_timeout: Duration,
) -> Result<(TcpStream, UpstreamEgressInfo)> {
) -> Result<(UpstreamStream, UpstreamEgressInfo)> {
match &config.upstream_type {
UpstreamType::Direct { interface, bind_addresses } => {
UpstreamType::Direct {
interface,
bind_addresses,
} => {
let bind_ip = Self::resolve_bind_address(
interface,
bind_addresses,
@@ -796,9 +934,7 @@ impl UpstreamManager {
bind_rr.as_deref(),
true,
);
if bind_ip.is_none()
&& bind_addresses.as_ref().is_some_and(|v| !v.is_empty())
{
if bind_ip.is_none() && bind_addresses.as_ref().is_some_and(|v| !v.is_empty()) {
return Err(ProxyError::Config(format!(
"No valid bind_addresses for target family {target}"
)));
@@ -813,8 +949,10 @@ impl UpstreamManager {
socket.set_nonblocking(true)?;
match socket.connect(&target.into()) {
Ok(()) => {},
Err(err) if err.raw_os_error() == Some(libc::EINPROGRESS) || err.kind() == std::io::ErrorKind::WouldBlock => {},
Ok(()) => {}
Err(err)
if err.raw_os_error() == Some(libc::EINPROGRESS)
|| err.kind() == std::io::ErrorKind::WouldBlock => {}
Err(err) => return Err(ProxyError::Io(err)),
}
@@ -836,7 +974,7 @@ impl UpstreamManager {
let local_addr = stream.local_addr().ok();
Ok((
stream,
UpstreamStream::Tcp(stream),
UpstreamEgressInfo {
upstream_id,
route_kind: UpstreamRouteKind::Direct,
@@ -846,8 +984,12 @@ impl UpstreamManager {
socks_proxy_addr: None,
},
))
},
UpstreamType::Socks4 { address, interface, user_id } => {
}
UpstreamType::Socks4 {
address,
interface,
user_id,
} => {
// Try to parse as SocketAddr first (IP:port), otherwise treat as hostname:port
let mut stream = if let Ok(proxy_addr) = address.parse::<SocketAddr>() {
// IP:port format - use socket with optional interface binding
@@ -863,8 +1005,10 @@ impl UpstreamManager {
socket.set_nonblocking(true)?;
match socket.connect(&proxy_addr.into()) {
Ok(()) => {},
Err(err) if err.raw_os_error() == Some(libc::EINPROGRESS) || err.kind() == std::io::ErrorKind::WouldBlock => {},
Ok(()) => {}
Err(err)
if err.raw_os_error() == Some(libc::EINPROGRESS)
|| err.kind() == std::io::ErrorKind::WouldBlock => {}
Err(err) => return Err(ProxyError::Io(err)),
}
@@ -888,14 +1032,16 @@ impl UpstreamManager {
// Hostname:port format - use tokio DNS resolution
// Note: interface binding is not supported for hostnames
if interface.is_some() {
warn!("SOCKS4 interface binding is not supported for hostname addresses, ignoring");
warn!(
"SOCKS4 interface binding is not supported for hostname addresses, ignoring"
);
}
Self::connect_hostname_with_dns_override(address, connect_timeout).await?
};
// replace socks user_id with config.selected_scope, if set
let scope: Option<&str> = Some(config.selected_scope.as_str())
.filter(|s| !s.is_empty());
let scope: Option<&str> =
Some(config.selected_scope.as_str()).filter(|s| !s.is_empty());
let _user_id: Option<&str> = scope.or(user_id.as_deref());
let bound = match tokio::time::timeout(
@@ -915,7 +1061,7 @@ impl UpstreamManager {
let local_addr = stream.local_addr().ok();
let socks_proxy_addr = stream.peer_addr().ok();
Ok((
stream,
UpstreamStream::Tcp(stream),
UpstreamEgressInfo {
upstream_id,
route_kind: UpstreamRouteKind::Socks4,
@@ -925,8 +1071,13 @@ impl UpstreamManager {
socks_proxy_addr,
},
))
},
UpstreamType::Socks5 { address, interface, username, password } => {
}
UpstreamType::Socks5 {
address,
interface,
username,
password,
} => {
// Try to parse as SocketAddr first (IP:port), otherwise treat as hostname:port
let mut stream = if let Ok(proxy_addr) = address.parse::<SocketAddr>() {
// IP:port format - use socket with optional interface binding
@@ -942,8 +1093,10 @@ impl UpstreamManager {
socket.set_nonblocking(true)?;
match socket.connect(&proxy_addr.into()) {
Ok(()) => {},
Err(err) if err.raw_os_error() == Some(libc::EINPROGRESS) || err.kind() == std::io::ErrorKind::WouldBlock => {},
Ok(()) => {}
Err(err)
if err.raw_os_error() == Some(libc::EINPROGRESS)
|| err.kind() == std::io::ErrorKind::WouldBlock => {}
Err(err) => return Err(ProxyError::Io(err)),
}
@@ -967,15 +1120,17 @@ impl UpstreamManager {
// Hostname:port format - use tokio DNS resolution
// Note: interface binding is not supported for hostnames
if interface.is_some() {
warn!("SOCKS5 interface binding is not supported for hostname addresses, ignoring");
warn!(
"SOCKS5 interface binding is not supported for hostname addresses, ignoring"
);
}
Self::connect_hostname_with_dns_override(address, connect_timeout).await?
};
debug!(config = ?config, "Socks5 connection");
// replace socks user:pass with config.selected_scope, if set
let scope: Option<&str> = Some(config.selected_scope.as_str())
.filter(|s| !s.is_empty());
let scope: Option<&str> =
Some(config.selected_scope.as_str()).filter(|s| !s.is_empty());
let _username: Option<&str> = scope.or(username.as_deref());
let _password: Option<&str> = scope.or(password.as_deref());
@@ -996,7 +1151,7 @@ impl UpstreamManager {
let local_addr = stream.local_addr().ok();
let socks_proxy_addr = stream.peer_addr().ok();
Ok((
stream,
UpstreamStream::Tcp(stream),
UpstreamEgressInfo {
upstream_id,
route_kind: UpstreamRouteKind::Socks5,
@@ -1006,7 +1161,22 @@ impl UpstreamManager {
socks_proxy_addr,
},
))
},
}
UpstreamType::Shadowsocks { url, interface } => {
let stream = connect_shadowsocks(url, interface, target, connect_timeout).await?;
let local_addr = stream.get_ref().local_addr().ok();
Ok((
UpstreamStream::Shadowsocks(Box::new(stream)),
UpstreamEgressInfo {
upstream_id,
route_kind: UpstreamRouteKind::Shadowsocks,
local_addr,
direct_bind_ip: None,
socks_bound_addr: None,
socks_proxy_addr: None,
},
))
}
}
}
@@ -1023,7 +1193,9 @@ impl UpstreamManager {
) -> Vec<StartupPingResult> {
let upstreams: Vec<(usize, UpstreamConfig, Arc<AtomicUsize>)> = {
let guard = self.upstreams.read().await;
guard.iter().enumerate()
guard
.iter()
.enumerate()
.map(|(i, u)| (i, u.config.clone(), u.bind_rr.clone()))
.collect()
};
@@ -1051,6 +1223,11 @@ impl UpstreamManager {
}
UpstreamType::Socks4 { address, .. } => format!("socks4://{}", address),
UpstreamType::Socks5 { address, .. } => format!("socks5://{}", address),
UpstreamType::Shadowsocks { url, .. } => {
let address =
sanitize_shadowsocks_url(url).unwrap_or_else(|_| "invalid".to_string());
format!("shadowsocks://{address}")
}
};
let mut v6_results = Vec::with_capacity(NUM_DCS);
@@ -1061,8 +1238,14 @@ impl UpstreamManager {
let result = tokio::time::timeout(
Duration::from_secs(DC_PING_TIMEOUT_SECS),
self.ping_single_dc(*upstream_idx, upstream_config, Some(bind_rr.clone()), addr_v6)
).await;
self.ping_single_dc(
*upstream_idx,
upstream_config,
Some(bind_rr.clone()),
addr_v6,
),
)
.await;
let ping_result = match result {
Ok(Ok(rtt_ms)) => {
@@ -1112,8 +1295,14 @@ impl UpstreamManager {
let result = tokio::time::timeout(
Duration::from_secs(DC_PING_TIMEOUT_SECS),
self.ping_single_dc(*upstream_idx, upstream_config, Some(bind_rr.clone()), addr_v4)
).await;
self.ping_single_dc(
*upstream_idx,
upstream_config,
Some(bind_rr.clone()),
addr_v4,
),
)
.await;
let ping_result = match result {
Ok(Ok(rtt_ms)) => {
@@ -1162,7 +1351,7 @@ impl UpstreamManager {
Err(_) => {
warn!(dc = %dc_key, "Invalid dc_overrides key, skipping");
continue;
},
}
_ => continue,
};
let dc_idx = dc_num as usize;
@@ -1175,8 +1364,14 @@ impl UpstreamManager {
}
let result = tokio::time::timeout(
Duration::from_secs(DC_PING_TIMEOUT_SECS),
self.ping_single_dc(*upstream_idx, upstream_config, Some(bind_rr.clone()), addr)
).await;
self.ping_single_dc(
*upstream_idx,
upstream_config,
Some(bind_rr.clone()),
addr,
),
)
.await;
let ping_result = match result {
Ok(Ok(rtt_ms)) => DcPingResult {
@@ -1205,7 +1400,9 @@ impl UpstreamManager {
v4_results.push(ping_result);
}
}
Err(_) => warn!(dc = %dc_idx, addr = %addr_str, "Invalid dc_overrides address, skipping"),
Err(_) => {
warn!(dc = %dc_idx, addr = %addr_str, "Invalid dc_overrides address, skipping")
}
}
}
}
@@ -1381,12 +1578,8 @@ impl UpstreamManager {
ipv6_enabled: bool,
dc_overrides: HashMap<String, Vec<String>>,
) {
let groups = Self::build_health_check_groups(
prefer_ipv6,
ipv4_enabled,
ipv6_enabled,
&dc_overrides,
);
let groups =
Self::build_health_check_groups(prefer_ipv6, ipv4_enabled, ipv6_enabled, &dc_overrides);
let required_healthy_groups = Self::required_healthy_group_count(groups.len());
let mut endpoint_rotation: HashMap<(usize, i16, bool), usize> = HashMap::new();
@@ -1416,13 +1609,16 @@ impl UpstreamManager {
let mut group_ok = false;
let mut group_rtt_ms = None;
for (is_primary, endpoints) in [(true, &group.primary), (false, &group.fallback)] {
for (is_primary, endpoints) in
[(true, &group.primary), (false, &group.fallback)]
{
if endpoints.is_empty() {
continue;
}
let rotation_key = (i, group.dc_idx, is_primary);
let start_idx = *endpoint_rotation.entry(rotation_key).or_insert(0) % endpoints.len();
let start_idx =
*endpoint_rotation.entry(rotation_key).or_insert(0) % endpoints.len();
let mut next_idx = (start_idx + 1) % endpoints.len();
for step in 0..endpoints.len() {
@@ -1544,8 +1740,7 @@ impl UpstreamManager {
return None;
}
UpstreamState::dc_array_idx(dc_idx)
.map(|idx| guard[0].dc_ip_pref[idx])
UpstreamState::dc_array_idx(dc_idx).map(|idx| guard[0].dc_ip_pref[idx])
}
/// Get preferred DC address based on config preference
@@ -1566,6 +1761,12 @@ impl UpstreamManager {
#[cfg(test)]
mod tests {
use super::*;
use std::sync::Arc;
use crate::stats::Stats;
const TEST_SHADOWSOCKS_URL: &str =
"ss://2022-blake3-aes-256-gcm:MDEyMzQ1Njc4OTAxMjM0NTY3ODkwMTIzNDU2Nzg5MDE=@127.0.0.1:8388";
#[test]
fn required_healthy_group_count_applies_three_group_threshold() {
@@ -1596,15 +1797,18 @@ mod tests {
assert!(dc2.primary.iter().all(|addr| addr.is_ipv6()));
assert!(dc2.fallback.iter().all(|addr| addr.is_ipv4()));
assert!(dc2
.primary
.contains(&"[2001:db8::10]:443".parse::<SocketAddr>().unwrap()));
assert!(dc2
.fallback
.contains(&"203.0.113.10:443".parse::<SocketAddr>().unwrap()));
assert!(dc2
.fallback
.contains(&"203.0.113.11:443".parse::<SocketAddr>().unwrap()));
assert!(
dc2.primary
.contains(&"[2001:db8::10]:443".parse::<SocketAddr>().unwrap())
);
assert!(
dc2.fallback
.contains(&"203.0.113.10:443".parse::<SocketAddr>().unwrap())
);
assert!(
dc2.fallback
.contains(&"203.0.113.11:443".parse::<SocketAddr>().unwrap())
);
}
#[test]
@@ -1626,12 +1830,14 @@ mod tests {
.expect("override-only dc group must be present");
assert_eq!(dc9.primary.len(), 2);
assert!(dc9
.primary
.contains(&"198.51.100.1:443".parse::<SocketAddr>().unwrap()));
assert!(dc9
.primary
.contains(&"198.51.100.2:443".parse::<SocketAddr>().unwrap()));
assert!(
dc9.primary
.contains(&"198.51.100.1:443".parse::<SocketAddr>().unwrap())
);
assert!(
dc9.primary
.contains(&"198.51.100.2:443".parse::<SocketAddr>().unwrap())
);
assert!(dc9.fallback.is_empty());
}
@@ -1678,4 +1884,36 @@ mod tests {
assert_eq!(bind, None);
}
#[test]
fn api_snapshot_reports_shadowsocks_as_sanitized_route() {
let manager = UpstreamManager::new(
vec![UpstreamConfig {
upstream_type: UpstreamType::Shadowsocks {
url: TEST_SHADOWSOCKS_URL.to_string(),
interface: None,
},
weight: 2,
enabled: true,
scopes: String::new(),
selected_scope: String::new(),
}],
1,
100,
1000,
1,
false,
Arc::new(Stats::new()),
);
let snapshot = manager.try_api_snapshot().expect("snapshot");
assert_eq!(snapshot.summary.configured_total, 1);
assert_eq!(snapshot.summary.shadowsocks_total, 1);
assert_eq!(snapshot.upstreams.len(), 1);
assert_eq!(
snapshot.upstreams[0].route_kind,
UpstreamRouteKind::Shadowsocks
);
assert_eq!(snapshot.upstreams[0].address, "127.0.0.1:8388");
}
}