Trixter – Chaos Monkey TCP Proxy
A high‑performance, runtime‑tunable TCP chaos proxy — a minimal, blazing‑fast alternative to Toxiproxy written in Rust with Tokio. It lets you inject latency, throttle bandwidth, slice writes (to simulate small MTUs/Nagle‑like behavior), corrupt bytes in flight by injecting random bytes, randomly terminate connections, and hard‑timeout sessions – all controllable per connection via a simple REST API.
Why Trixter?
- Zero-friction: one static binary, no external deps.
- Runtime knobs: flip chaos on/off without restarting.
- Per-conn control: target just the flows you want.
- Minimal overhead: adapters are lightweight and composable.
Features
- Fast path: tokio::io::copy_bidirectional on a multi-thread runtime.
- Runtime control (per active connection):
  - Latency: add/remove delay in ms.
  - Throttle: cap bytes/sec.
  - Slice: split writes into fixed-size chunks.
  - Corrupt: inject random bytes with a tunable probability.
  - Chaos termination: probability [0.0..=1.0] to abort on each read/write.
  - Hard timeout: stop a session after N milliseconds.
- REST API to list connections and change settings on the fly.
- Targeted kill: shut down a single connection with a reason.
- Deterministic chaos: seed the RNG for reproducible scenarios.
- RST on chaos: sends a best-effort TCP RST when a timeout/termination triggers.
Quick start
1. Run an upstream echo server (demo)
Use any TCP server. Examples:
nc -lk 127.0.0.1 8181
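If you'd rather have a true echo server (nc only prints what it receives), socat can stand in, assuming it is installed:
socat TCP-LISTEN:8181,fork,reuseaddr EXEC:/bin/cat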
2. Run the trixter chaos proxy
with Docker:
docker run --network host -it --rm ghcr.io/brk0v/trixter \
--listen 0.0.0.0:8080 \
--upstream 127.0.0.1:8181 \
--api 127.0.0.1:8888 \
--delay-ms 0 \
--throttle-rate-bytes 0 \
--slice-size-bytes 0 \
--corrupt-probability-rate 0.0 \
--terminate-probability-rate 0.0 \
--connection-duration-ms 0 \
--random-seed 42
or build from scratch:
cd trixter/trixter
cargo build --release
or install with cargo:
cargo install trixter
and run:
RUST_LOG=info \
./target/release/trixter \
--listen 0.0.0.0:8080 \
--upstream 127.0.0.1:8181 \
--api 127.0.0.1:8888 \
--delay-ms 0 \
--throttle-rate-bytes 0 \
--slice-size-bytes 0 \
--corrupt-probability-rate 0.0 \
--terminate-probability-rate 0.0 \
--connection-duration-ms 0 \
--random-seed 42
3. Test
Now connect your app/CLI to localhost:8080. The proxy forwards to 127.0.0.1:8181.
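For a quick smoke test with the nc upstream from step 1:
nc 127.0.0.1 8080
# type a line and press Enter; it should appear in the terminal running nc -lk on 8181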
REST API
Base URL is the --api address, e.g. http://127.0.0.1:8888.
Data model
{
"conn_info": {
"id": "pN7e3y...",
"downstream": "127.0.0.1:59024",
"upstream": "127.0.0.1:8181"
},
"delay": { "secs": 2, "nanos": 500000000 },
"throttle_rate": 10240,
"slice_size": 512,
"terminate_probability_rate": 0.05,
"corrupt_probability_rate": 0.02
}
Notes:
- delay serializes as a std::time::Duration object with secs/nanos fields (zeroed when the delay is disabled).
- id is unique per connection; use it to target a single connection.
- corrupt_probability_rate reports the current per-operation flip probability (0.0 when corruption is off).
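For scripts, the secs/nanos pair can be folded into milliseconds with jq (field names taken from the data model above):
curl -s http://127.0.0.1:8888/connections | \
  jq '.[] | {id: .conn_info.id, delay_ms: (.delay.secs * 1000 + .delay.nanos / 1000000)}'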
Health check
curl -s http://127.0.0.1:8888/health
List connections
curl -s http://127.0.0.1:8888/connections | jq
Kill a connection
ID=$(curl -s http://127.0.0.1:8888/connections | jq -r '.[0].conn_info.id')
curl -i -X POST \
http://127.0.0.1:8888/connections/$ID/shutdown \
-H 'Content-Type: application/json' \
-d '{"reason":"test teardown"}'
Kill all connections
curl -i -X POST \
http://127.0.0.1:8888/connections/_all/shutdown \
-H 'Content-Type: application/json' \
-d '{"reason":"test teardown"}'
Set latency (ms)
curl -i -X PATCH \
http://127.0.0.1:8888/connections/$ID/delay \
-H 'Content-Type: application/json' \
-d '{"delay_ms":250}'
# Remove latency
curl -i -X PATCH \
http://127.0.0.1:8888/connections/$ID/delay \
-H 'Content-Type: application/json' \
-d '{"delay_ms":0}'
Throttle bytes/sec
curl -i -X PATCH \
http://127.0.0.1:8888/connections/$ID/throttle \
-H 'Content-Type: application/json' \
-d '{"rate_bytes":10240}' # 10 KiB/s
Slice writes (bytes)
curl -i -X PATCH \
http://127.0.0.1:8888/connections/$ID/slice \
-H 'Content-Type: application/json' \
-d '{"size_bytes":512}'
Randomly terminate reads/writes
# Set 5% probability per read/write operation
curl -i -X PATCH \
http://127.0.0.1:8888/connections/$ID/termination \
-H 'Content-Type: application/json' \
-d '{"probability_rate":0.05}'
Inject random bytes
# Corrupt ~1% of operations
curl -i -X PATCH \
http://127.0.0.1:8888/connections/$ID/corruption \
-H 'Content-Type: application/json' \
-d '{"probability_rate":0.01}'
# Remove corruption
curl -i -X PATCH \
http://127.0.0.1:8888/connections/$ID/corruption \
-H 'Content-Type: application/json' \
-d '{"probability_rate":0.0}'
Error responses
- 404 Not Found: bad connection ID
- 400 Bad Request: invalid probability (outside 0.0..=1.0) for termination/corruption
- 500 Internal Server Error: internal channel/handler error
CLI flags
--listen <ip:port> # e.g. 0.0.0.0:8080
--upstream <ip:port> # e.g. 127.0.0.1:8181
--api <ip:port> # e.g. 127.0.0.1:8888
--delay-ms <ms> # 0 = off (default)
--throttle-rate-bytes <bytes/s> # 0 = unlimited (default)
--slice-size-bytes <bytes> # 0 = off (default)
--terminate-probability-rate <0..1> # 0.0 = off (default)
--corrupt-probability-rate <0..1> # 0.0 = off (default)
--connection-duration-ms <ms> # 0 = unlimited (default)
--random-seed <u64> # seed RNG for deterministic chaos (optional)
All of the above can be changed per connection at runtime via the REST API, except --connection-duration-ms, which is a process-wide default applied to new connections. Omit --random-seed to draw entropy for every run; set it when you want bit-for-bit reproducibility.
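Because the chaos knobs default to off, a plain pass-through proxy needs only the addresses; a minimal sketch (adjust the addresses to your environment):
trixter --listen 0.0.0.0:8080 --upstream 127.0.0.1:8181 --api 127.0.0.1:8888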
How it works (architecture)
Each accepted downstream connection spawns a task that:
- Connects to the upstream target.
- Wraps both sides with tunable adapters from tokio-netem:
  - DelayedWriter → optional latency
  - ThrottledWriter → bandwidth cap
  - SlicedWriter → fixed-size write chunks
  - Terminator → probabilistic aborts
  - Corrupter → probabilistic random byte injector
  - Shutdowner (downstream only) → out-of-band shutdown via oneshot channel
- Runs tokio::io::copy_bidirectional until EOF/error/timeout.
- Tracks the live connection in a DashMap so the API can query/mutate it.
Use cases
- Flaky networks: simulate 3G/EDGE/satellite latency and low bandwidth.
- MTU/segmentation bugs: force small write slices to uncover packetization assumptions.
- Resilience drills: randomly kill connections during critical paths.
- Data validation: corrupt bytes to exercise checksums and retry logic.
- Timeout tuning: enforce hard upper‑bounds to validate client retry/backoff logic.
- Canary/E2E tests: target only specific connections and tweak dynamically.
- Load/soak: run for hours with varying chaos settings from CI/scripts.
Recipes
Simulate a shaky mobile link
# Add ~250ms latency and 64 KiB/s cap to the first active connection
ID=$(curl -s localhost:8888/connections | jq -r '.[0].conn_info.id')
curl -s -X PATCH localhost:8888/connections/$ID/delay \
-H 'Content-Type: application/json' -d '{"delay_ms":250}'
curl -s -X PATCH localhost:8888/connections/$ID/throttle \
-H 'Content-Type: application/json' -d '{"rate_bytes":65536}'
Force tiny packets (find buffering bugs)
curl -s -X PATCH localhost:8888/connections/$ID/slice \
-H 'Content-Type: application/json' -d '{"size_bytes":256}'
Introduce flakiness (5% ops abort)
curl -s -X PATCH localhost:8888/connections/$ID/termination \
-H 'Content-Type: application/json' -d '{"probability_rate":0.05}'
Add data corruption
curl -s -X PATCH localhost:8888/connections/$ID/corruption \
-H 'Content-Type: application/json' -d '{"probability_rate":0.01}'
Timebox a connection to 5s at startup
./trixter \
--listen 0.0.0.0:8080 \
--upstream 127.0.0.1:8181 \
--api 127.0.0.1:8888 \
--connection-duration-ms 5000
Kill the slowpoke
curl -s -X POST localhost:8888/connections/$ID/shutdown \
-H 'Content-Type: application/json' -d '{"reason":"too slow"}'
Integration: CI & E2E tests
- Spin up the proxy as a sidecar/container.
- Discover the right connection (by downstream/upstream pair) via GET /connections.
- Apply chaos during specific test phases with PATCH calls.
- Always clean up with POST /connections/{id}/shutdown to free ports quickly (see the sketch below).
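A minimal sketch of that flow in a test script, using the quick-start addresses (the client address is a hypothetical value; list /connections to find yours):
API=http://127.0.0.1:8888

# pick the connection for a known client address
CLIENT=127.0.0.1:59024
ID=$(curl -s $API/connections | \
  jq -r --arg c "$CLIENT" '.[] | select(.conn_info.downstream == $c) | .conn_info.id')

# chaos phase: 250 ms latency plus 5% random termination
curl -s -X PATCH $API/connections/$ID/delay \
  -H 'Content-Type: application/json' -d '{"delay_ms":250}'
curl -s -X PATCH $API/connections/$ID/termination \
  -H 'Content-Type: application/json' -d '{"probability_rate":0.05}'

# ... run the test phase ...

# cleanup: drop the connection so ports are freed quickly
curl -s -X POST $API/connections/$ID/shutdown \
  -H 'Content-Type: application/json' -d '{"reason":"ci cleanup"}'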
Reproduce CI failures
Omit --random-seed in CI so each run draws fresh entropy. When a failure hits, check the proxy logs for the random seed: <value> line and replay the scenario locally with that seed:
trixter \
--listen 0.0.0.0:8080 \
--upstream 127.0.0.1:8181 \
--api 127.0.0.1:8888 \
--random-seed 123456789
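If the proxy's output was captured to a file (proxy.log here is a hypothetical name), the seed can be pulled out and reused in one step; this assumes the log line format quoted above:
SEED=$(grep -o 'random seed: [0-9]*' proxy.log | tail -n1 | awk '{print $NF}')
trixter --listen 0.0.0.0:8080 --upstream 127.0.0.1:8181 --api 127.0.0.1:8888 --random-seed "$SEED"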
Performance notes
- Built on the Tokio multi-thread runtime; avoid heavy CPU work on the I/O threads.
- Throttling and slicing affect throughput by design; set them to 0 to disable.
- Use loopback or a fast NIC for local tests; network stack/OS settings still apply.
- Logging: RUST_LOG=info (or debug) for visibility; turn it off for max throughput.
Security
- The API performs no auth; bind it to a trusted interface (e.g., 127.0.0.1).
- The proxy is transparent TCP; apply your own TLS/ACLs at the edges if needed.
Error handling
- Invalid probability returns 400 with { "error": "invalid probability; must be between 0.0 and 1.0" }.
- Unknown connection IDs return 404.
- Internal channel/handler errors return 500.
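For example, an out-of-range probability should come back as a 400 (response body shape as described above):
curl -i -X PATCH http://127.0.0.1:8888/connections/$ID/termination \
  -H 'Content-Type: application/json' \
  -d '{"probability_rate":1.5}'
# expect: HTTP/1.1 400 Bad Request
# expect: {"error":"invalid probability; must be between 0.0 and 1.0"}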
License
MIT