2 unstable releases
Uses new Rust 2024
| Version | Released |
|---|---|
| 0.2.0 | Nov 13, 2025 |
| 0.1.0 | Nov 12, 2025 |
PHASM - Phallible Async State Machines
A Rust framework for building deterministic, testable, and crash-recoverable state machines with async operations and fallible state access.
Build systems that are correct by construction - payments, bookings, workflows, and distributed systems with guarantees that go beyond traditional testing.
Why PHASM?
Traditional state machines break down in production - race conditions, crashes mid-operation, and bugs that only appear under load. PHASM solves this by making correctness verifiable:
- Deterministic execution - Same inputs always produce same outputs (reproducible bugs!)
- Crash recovery - Resume from any failure point automatically
- Simulation testing - Verify correctness across millions of random operations in seconds
- Race condition handling - Deterministic conflict resolution built-in
- Flexible state - In-memory, database transactions, or hybrid
Real results: The dentist booking example verifies 90,000+ operations (including race conditions, crashes, and payment failures) in ~4 seconds - finding bugs humans would never think to test.
Why PHASM Was Created
The Theory-Practice Gap
Traditional state machine theory is elegant and provably correct - but assumes synchronous, infallible operations and in-memory state. Real systems need:
- Async operations (network calls, database transactions)
- Fallible operations (APIs fail, databases time out)
- Persistence (crashes happen, state must survive)
- Scale (millions of operations, distributed systems)
Existing frameworks force a choice: theoretical correctness OR practical engineering.
PHASM bridges this gap.
PHASM extends the theoretical state machine model so that its correctness guarantees carry over to real-world engineering constraints:
- Async-first: State transitions can await - database transactions, validation checks
- Fallible state access: Database connections can fail, STF can return errors atomically
- Separation of state and effects: State mutations (including DB writes) remain deterministic; external effects are explicit
- Tracked actions: Theoretical model extended with action results feeding back as inputs
- Crash recovery: Restore function makes the model crash-safe without losing correctness
The result: Build high-performance, scalable systems with the same correctness guarantees as theoretical state machines, but with the flexibility to handle real-world requirements like databases, external APIs, and failures.
PHASM is not a compromise - it's an expansion. You get both theoretical soundness AND practical engineering.
How It Works
    ┌─────────────────────────────────────────────────────────────┐
    │                     PHASM Architecture                      │
    └─────────────────────────────────────────────────────────────┘

    Input (user request, time, external data)
        │
        ▼
    ┌─────────────────────────────────────────┐
    │  State Transition Function (STF)        │
    │  ──────────────────────────────         │
    │  • Validates inputs                     │
    │  • Mutates state (incl. database)       │
    │  • Emits action descriptions            │
    │  • Atomic: error = no changes           │
    └─────────────────────────────────────────┘
        │
        ▼
    ┌─────────────────────────────────────────┐
    │  Updated State + Actions                │
    │  ──────────────────────────             │
    │  State persisted (in-memory or DB)      │
    │  Actions executed externally            │
    └─────────────────────────────────────────┘
        │
        ├──► Tracked Actions (payment, API calls)
        │         │
        │         ▼
        │    External Systems
        │         │
        │         └──► Results feed back as Input ──┐
        │                                           │
        └──► Untracked Actions (notifications)      │
                                                    │
                                                    ▼
                                                 (loop)

    After Crash: restore(state) → re-emit pending actions
Quick Example
    use std::collections::HashMap;

    use phasm::{Input, StateMachine};

    struct PaymentSystem {
        balance: u64,
        pending: HashMap<u64, Payment>,
        next_id: u64,
    }

    impl StateMachine for PaymentSystem {
        // Associated type definitions omitted for brevity

        async fn stf(
            state: &mut Self::State,
            input: Input<Self::TrackedAction, Self::Input>,
            actions: &mut Self::Actions,
        ) -> Result<(), Self::TransitionError> {
            match input {
                Input::Normal(ProcessPayment { amount, user }) => {
                    // Validate before mutating
                    if state.balance < amount {
                        return Err(InsufficientFunds);
                    }
                    // Generate deterministic ID
                    let payment_id = state.next_id;
                    state.next_id += 1;
                    // Store in state before emitting action, so restore can find it
                    state
                        .pending
                        .insert(payment_id, Payment { amount, user, status: Pending });
                    // Emit tracked action (will be retried on crash)
                    actions.add(Action::Tracked(
                        TrackedAction::new(payment_id, ChargeCard { amount })
                    ))?;
                    Ok(())
                }
                Input::TrackedActionCompleted { id, res } => {
                    // Handle payment result; an unknown ID is an error, not a panic.
                    // UnknownPayment is an illustrative error variant.
                    let payment = state.pending.get_mut(&id).ok_or(UnknownPayment)?;
                    payment.status = if res.is_success() {
                        state.balance -= payment.amount;
                        Confirmed
                    } else {
                        Failed
                    };
                    Ok(())
                }
            }
        }

        async fn restore(
            state: &Self::State,
            actions: &mut Self::Actions,
        ) -> Result<(), Self::RestoreError> {
            // Recreate pending actions from state after crash
            for (id, payment) in &state.pending {
                if payment.status == Pending {
                    actions.add(Action::Tracked(
                        TrackedAction::new(*id, CheckPaymentStatus { id: *id })
                    ))?;
                }
            }
            Ok(())
        }
    }
Core Concepts
State Transition Function (STF)
Deterministic function: (State, Input) → (State', Actions)
- Validates inputs and mutates state (including database writes via `state`)
- Emits action descriptions (not executions)
- Must be atomic: error = no state changes
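The shape of an STF can be sketched without PHASM at all. The block below is a minimal, synchronous stand-in (PHASM's own `stf` is async), and every name in it (`Ledger`, `LedgerInput`, `LedgerAction`, `LedgerError`, `ledger_stf`) is hypothetical and chosen only for illustration. It shows validation happening before the first mutation, so that returning `Err` leaves the state untouched, and an action being described rather than executed.

```rust
use std::collections::HashMap;

/// Hypothetical input type for a tiny account ledger (not PHASM API).
#[derive(Debug, Clone, PartialEq)]
pub enum LedgerInput {
    Withdraw { account: u64, amount: u64 },
}

/// A description of work to perform later, not the work itself.
#[derive(Debug, Clone, PartialEq)]
pub enum LedgerAction {
    SendReceipt { account: u64, amount: u64 },
}

#[derive(Debug, Clone, PartialEq)]
pub enum LedgerError {
    UnknownAccount,
    InsufficientFunds,
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct Ledger {
    pub balances: HashMap<u64, u64>,
}

/// Synchronous stand-in for an STF: (State, Input) -> (State', Actions).
/// Every check runs before the first mutation, so an `Err` leaves
/// `state` exactly as it was.
pub fn ledger_stf(
    state: &mut Ledger,
    input: LedgerInput,
    actions: &mut Vec<LedgerAction>,
) -> Result<(), LedgerError> {
    match input {
        LedgerInput::Withdraw { account, amount } => {
            // Validate: the account must exist and hold enough funds.
            let balance = state
                .balances
                .get_mut(&account)
                .ok_or(LedgerError::UnknownAccount)?;
            if *balance < amount {
                return Err(LedgerError::InsufficientFunds);
            }
            // Mutate only after all checks have passed.
            *balance -= amount;
            // Describe the external effect; the caller executes it later.
            actions.push(LedgerAction::SendReceipt { account, amount });
            Ok(())
        }
    }
}
```

Because the function touches nothing outside its arguments, calling it twice with the same state and input gives the same result, which is the property the simulation tests rely on.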
Actions
Descriptions of external operations executed after STF succeeds:
- Tracked: Perfect for long-running background operations that produce results and can fail
  - Examples: Payment processing, external API calls, background jobs
  - Results feed back as `Input::TrackedActionCompleted`
  - Stored in state for crash recovery and retry
  - Use when operation outcome affects system correctness
- Untracked: Fire-and-forget operations whose execution doesn't affect correctness
  - Examples: Logs, metrics, analytics, notifications, UI updates
  - Not recovered after crashes
  - Use when you need to emit information but don't need confirmation
Restore
Recovers pending operations from state after crashes by reading state and re-emitting actions.
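The relationship between tracked actions, untracked actions, and restore can be sketched in plain Rust. This is an illustration under stated assumptions, not PHASM's API: `Emitted`, `Jobs`, `JobStatus`, `submit`, `complete`, and `restore` are hypothetical names, and a `BTreeMap` is used here so that restore re-emits actions in a stable order.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum JobStatus {
    Pending,
    Done,
}

/// Hypothetical action descriptions (not PHASM's `Action` type).
#[derive(Debug, Clone, PartialEq)]
pub enum Emitted {
    /// Outcome matters: recorded in state and re-emitted after a crash.
    Tracked { id: u64, description: String },
    /// Fire-and-forget: lost on crash, which is acceptable.
    Untracked { description: String },
}

#[derive(Debug, Default)]
pub struct Jobs {
    pub next_id: u64,
    pub jobs: BTreeMap<u64, JobStatus>,
}

fn job_description(id: u64) -> String {
    format!("run job {id}")
}

/// Records the job in state first, then emits the actions.
pub fn submit(state: &mut Jobs, actions: &mut Vec<Emitted>) -> u64 {
    let id = state.next_id;
    state.next_id += 1;
    state.jobs.insert(id, JobStatus::Pending);
    actions.push(Emitted::Tracked { id, description: job_description(id) });
    actions.push(Emitted::Untracked { description: format!("log: job {id} submitted") });
    id
}

/// Marks a pending job as done; returns false for unknown or finished jobs.
pub fn complete(state: &mut Jobs, id: u64) -> bool {
    if let Some(status) = state.jobs.get_mut(&id) {
        if *status == JobStatus::Pending {
            *status = JobStatus::Done;
            return true;
        }
    }
    false
}

/// After a crash, rebuilds the tracked actions purely from state.
/// Untracked actions are deliberately not recreated.
pub fn restore(state: &Jobs, actions: &mut Vec<Emitted>) {
    for (id, status) in &state.jobs {
        if *status == JobStatus::Pending {
            actions.push(Emitted::Tracked { id: *id, description: job_description(*id) });
        }
    }
}
```

Restore only reads state, so it can recreate exactly the tracked work that was recorded before the crash and nothing else.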
Key Requirements
What You Must Do
- Validate before mutating - Check all conditions before changing state
- Use deterministic IDs - Generate from state counters, not `SystemTime::now()` or random
- Store tracked actions in state - Before emitting, so restore can recreate them
- Return all external data via Input - Time, API responses, database reads (if not via `state`)
- State atomicity - If STF returns `Err`, state must be unchanged
What You Must Not Do
- No external side effects in STF - No HTTP calls, no opening new connections
- No randomness - No `rand::random()`, no unseeded RNGs
- No system time - No `SystemTime::now()`, pass time via Input
- No external reads - No database connections (unless via `state` parameter)
What's Allowed (Not Side Effects!)
- Writing to in-memory data structures
- Writing to database through `state` parameter (e.g., `state.txn.set()`)
- Modifying actions container before returning errors (caller clears it)
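Two of the rules above, deterministic IDs and time passed in via input, can be illustrated with a small sketch. The types and functions here (`Booking`, `BookingState`, `CreateBooking`, `create_booking`, `replay`) are hypothetical and independent of PHASM; the point is only that replaying the same input log always rebuilds the same state.

```rust
/// Hypothetical booking record; names are illustrative, not PHASM API.
#[derive(Debug, Clone, PartialEq)]
pub struct Booking {
    pub id: u64,
    pub created_at_ms: u64,
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct BookingState {
    pub next_id: u64,
    pub bookings: Vec<Booking>,
}

/// The caller reads the clock and passes the timestamp in as data,
/// so the transition itself never calls `SystemTime::now()`.
#[derive(Debug, Clone, Copy)]
pub struct CreateBooking {
    pub now_ms: u64,
}

/// IDs come from a counter stored in state, never from randomness.
pub fn create_booking(state: &mut BookingState, input: CreateBooking) -> u64 {
    let id = state.next_id;
    state.next_id += 1;
    state.bookings.push(Booking { id, created_at_ms: input.now_ms });
    id
}

/// Rebuilds state from scratch by applying a recorded input log.
pub fn replay(inputs: &[CreateBooking]) -> BookingState {
    let mut state = BookingState::default();
    for input in inputs {
        create_booking(&mut state, *input);
    }
    state
}
```

If `create_booking` read the clock or drew a random ID internally, two replays of the same log would diverge and a failing run could not be reproduced.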
State Can Be Anything
    // In-memory
    struct State {
        users: HashMap<u64, User>,
    }

    // Database transaction
    struct State<'txn> {
        txn: &'txn mut Transaction,
    }

    // Both follow the same rules!
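One way to picture "both follow the same rules" is to write the transition logic once against a small storage interface and run it on either backend. The sketch below is an assumption-laden illustration: `KvState`, `Txn`, and `increment` are hypothetical names, and `Txn` is a toy stand-in for a real database transaction that stages writes and applies them only on commit.

```rust
use std::collections::HashMap;

/// Hypothetical minimal storage interface; not part of PHASM.
pub trait KvState {
    fn read(&self, key: u64) -> Option<u64>;
    fn write(&mut self, key: u64, value: u64);
}

/// In-memory state: reads and writes go straight to the map.
impl KvState for HashMap<u64, u64> {
    fn read(&self, key: u64) -> Option<u64> {
        self.get(&key).copied()
    }
    fn write(&mut self, key: u64, value: u64) {
        self.insert(key, value);
    }
}

/// Toy transaction: writes are staged and reach `base` only on commit.
/// Dropping it without committing discards the staged writes.
pub struct Txn<'a> {
    base: &'a mut HashMap<u64, u64>,
    staged: HashMap<u64, u64>,
}

impl<'a> Txn<'a> {
    pub fn begin(base: &'a mut HashMap<u64, u64>) -> Self {
        Txn { base, staged: HashMap::new() }
    }

    pub fn commit(self) {
        let Txn { base, staged } = self;
        base.extend(staged);
    }
}

impl KvState for Txn<'_> {
    fn read(&self, key: u64) -> Option<u64> {
        // Staged writes shadow the committed value.
        self.staged.get(&key).or_else(|| self.base.get(&key)).copied()
    }
    fn write(&mut self, key: u64, value: u64) {
        self.staged.insert(key, value);
    }
}

/// Transition logic written once; validates before it writes.
pub fn increment<S: KvState>(state: &mut S, key: u64, limit: u64) -> Result<u64, String> {
    let next = state.read(key).unwrap_or(0) + 1;
    if next > limit {
        return Err(format!("key {key} would exceed limit {limit}"));
    }
    state.write(key, next);
    Ok(next)
}
```

With the transaction backend, atomicity comes from the commit step as well as from validate-first ordering: the caller commits after `Ok` and simply drops the transaction after `Err`.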
Testing
The killer feature - deterministic simulation:
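    use rand::SeedableRng; // brings seed_from_u64 into scope
    use rand_chacha::ChaCha8Rng;

    // Assumes tokio as the async test runtime; a plain #[test] cannot be async
    #[tokio::test]
    async fn test_correctness() {
        let mut rng = ChaCha8Rng::seed_from_u64(12345); // Deterministic!
        let mut state = MySystem::new();
        // Assumes the actions container implements Default
        let mut actions = Default::default();
        for i in 0..100_000 {
            let input = generate_random_input(&mut rng);
            // Rejected inputs are expected; the result is deliberately ignored
            MySystem::stf(&mut state, input, &mut actions).await.ok();
            // Check invariants after EVERY operation
            state.check_invariants()
                .expect(&format!("Invariant violated at {}", i));
        }
    }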
Same seed = same test execution = reproducible bugs.
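The reproducibility claim can be demonstrated with a dependency-free sketch. It swaps `rand_chacha` for a tiny hand-rolled xorshift64 generator purely so the block compiles on its own; `XorShift64`, `Inventory`, `Op`, `apply`, and `simulate` are hypothetical names, and the transition is synchronous for brevity.

```rust
/// Tiny xorshift64 generator, used only to keep this sketch dependency-free.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // The internal state must never be zero.
        XorShift64(seed.max(1))
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct Inventory {
    pub in_stock: u64,
    pub sold: u64,
    pub restocked: u64,
}

pub enum Op {
    Restock(u64),
    Sell(u64),
}

/// Validate-first transition: a rejected sale changes nothing.
pub fn apply(state: &mut Inventory, op: &Op) -> Result<(), &'static str> {
    match *op {
        Op::Restock(n) => {
            state.in_stock += n;
            state.restocked += n;
            Ok(())
        }
        Op::Sell(n) => {
            if n > state.in_stock {
                return Err("not enough stock");
            }
            state.in_stock -= n;
            state.sold += n;
            Ok(())
        }
    }
}

/// Drives `steps` pseudo-random operations from `seed`, checking the
/// invariant after every one, and returns the final state.
pub fn simulate(seed: u64, steps: usize) -> Inventory {
    let mut rng = XorShift64::new(seed);
    let mut state = Inventory::default();
    for step in 0..steps {
        let amount = rng.next_u64() % 10;
        let op = if rng.next_u64() % 2 == 0 {
            Op::Restock(amount)
        } else {
            Op::Sell(amount)
        };
        // Rejected inputs are part of the test, so the result is ignored.
        let _ = apply(&mut state, &op);
        assert_eq!(
            state.in_stock + state.sold,
            state.restocked,
            "invariant violated at step {step} (seed {seed})"
        );
    }
    state
}
```

Printing the seed in the failure message matters: rerunning `simulate` with that seed replays the exact sequence of operations that broke the invariant.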
When to Use PHASM
Great For
- Payment processing
- Reservation systems (hotels, appointments, flights)
- Workflow engines (approvals, multi-step processes)
- E-commerce (inventory, orders)
- Distributed systems requiring correctness
Overkill For
- Simple CRUD apps
- Stateless services
- Read-only systems
- Prototypes (unless correctness is critical)
Examples
- `examples/coffee_shop.rs` - Loyalty points redemption with tracked actions
- `examples/csm.rs` - Simple counter state machine
- `dentist_booking/` - Full appointment booking system with comprehensive tests
  - 5 integration tests + 8 simulation tests
  - 90,000+ operations tested in ~4 seconds
  - Verifies all bookings match user preferences
Run examples:
cargo run --example coffee_shop
cd dentist_booking && cargo test
Documentation
- Docstrings in `src/lib.rs` - Detailed API documentation
- Core Concepts - Architecture and examples
- Critical Invariants - Rules for correctness
- Performance Guide - Optimization strategies
- Testing Guide - Simulation testing patterns
- Database State - Using databases as state
Quick Start
Add to Cargo.toml:
[dependencies]
phasm = "0.2"
Where to Start
New to PHASM? Follow this path:
- Understand the basics (5 min)
  - Read the Quick Example above
  - Skim Core Concepts to understand STF, Actions, and Restore
- See it in action (10 min)
  - Run `cargo run --example coffee_shop`
  - Shows tracked actions for point redemption
  - Demonstrates error handling and state atomicity
  - Includes crash recovery simulation
- Learn the rules (15 min)
  - Read Key Requirements section above
  - These are the critical invariants you must follow
  - Understand what's allowed vs forbidden
- Study a complete example (30 min)
  - Run `cd dentist_booking && cargo test -- --nocapture`
  - Production-ready appointment booking system
  - See how preferences are validated
  - Observe 90,000+ operations tested in seconds
  - Read dentist_booking/README.md
- Deep dive (1-2 hours)
  - Critical Invariants - Detailed rules with examples
  - Testing Guide - Simulation testing patterns
  - Database State - If you need database-backed state
- Build your state machine
  - Copy the pattern from `dentist_booking/src/lib.rs`
  - Define your State, Input, Actions
  - Implement STF with validation-first approach
  - Write simulation tests to verify correctness
Quick Reference: The docstrings in src/lib.rs contain detailed API documentation with inline examples.
Performance
PHASM is not meant to limit the performance of the systems built on it. Use actions to offload heavy computation, or split work across multiple state transitions, so that each STF call stays short. Systems built with PHASM can be correct, testable, and performant.
License
MIT OR Apache-2.0
Contributing
See the examples and documentation. When adding features, include:
- Simulation tests demonstrating correctness
- Documentation explaining the "why" not just the "what"
- Examples showing both correct and incorrect usage