sq/todos/SQ-022-simulation-tests.md
2026-02-26 21:52:50 +01:00

# SQ-022: Multi-Node Simulation Tests
**Status:** `[ ] TODO`
**Blocked by:** SQ-021, SQ-019
**Priority:** High
## Description
A full simulation test suite inspired by TigerBeetle: spin up multiple nodes on virtual I/O, inject faults, and verify the invariants after every step.
## Files to Create/Modify
- `crates/sq-sim/src/runtime.rs` - test harness for multi-node simulation
- `crates/sq-sim/tests/invariants.rs` - invariant checker functions
- `crates/sq-sim/tests/scenarios/mod.rs`
- `crates/sq-sim/tests/scenarios/single_node.rs` - S01-S04
- `crates/sq-sim/tests/scenarios/multi_node.rs` - S05-S08
- `crates/sq-sim/tests/scenarios/failures.rs` - S09-S12
## Scenarios
- **S01:** Single node, single producer, single consumer - baseline
- **S02:** Single node, concurrent producers - offset ordering
- **S03:** Single node, disk full during write - graceful error
- **S04:** Single node, crash and restart - WAL recovery
- **S05:** Three nodes, normal operation - replication works
- **S06:** Three nodes, one crashes - remaining two continue
- **S07:** Three nodes, network partition (2+1) - majority continues
- **S08:** Three nodes, S3 outage - local WAL accumulates
- **S09:** Consumer group, offset preservation
- **S10:** High throughput burst - no message loss
- **S11:** Slow consumer with WAL trimming - falls back to S3
- **S12:** Node rejoins after long absence - catches up
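
The fault-driven scenarios (S06, S07) could be expressed as a list of faults applied to a virtual cluster. The sketch below is only an illustration under stated assumptions: `Fault`, `VirtualCluster`, and `writable_nodes` are hypothetical names rather than existing `sq-sim` APIs, and it assumes that "majority continues" means a strict majority of the configured cluster size.

```rust
use std::collections::BTreeSet;

/// Hypothetical node identifier; the real sq-sim type may differ.
pub type NodeId = u8;

/// Hypothetical fault vocabulary for the scenarios above.
#[derive(Debug, Clone)]
pub enum Fault {
    Crash(NodeId),
    Restart(NodeId),
    /// Isolate the listed nodes from every other node.
    Partition(BTreeSet<NodeId>),
    Heal,
}

/// Tracks only which nodes are up and which are cut off (no real I/O).
#[derive(Debug)]
pub struct VirtualCluster {
    size: usize,
    crashed: BTreeSet<NodeId>,
    isolated: BTreeSet<NodeId>,
}

impl VirtualCluster {
    pub fn new(size: usize) -> Self {
        VirtualCluster { size, crashed: BTreeSet::new(), isolated: BTreeSet::new() }
    }

    pub fn apply(&mut self, fault: Fault) {
        match fault {
            Fault::Crash(n) => { self.crashed.insert(n); }
            Fault::Restart(n) => { self.crashed.remove(&n); }
            Fault::Partition(side) => self.isolated = side,
            Fault::Heal => self.isolated.clear(),
        }
    }

    /// Live nodes on one side of the partition.
    fn live_side(&self, isolated_side: bool) -> BTreeSet<NodeId> {
        (0..self.size as NodeId)
            .filter(|n| !self.crashed.contains(n))
            .filter(|n| self.isolated.contains(n) == isolated_side)
            .collect()
    }

    /// Nodes that may keep accepting writes: the side holding a strict
    /// majority of the configured cluster size, or nothing if neither does.
    pub fn writable_nodes(&self) -> BTreeSet<NodeId> {
        let quorum = self.size / 2 + 1;
        let outside = self.live_side(false);
        let inside = self.live_side(true);
        if outside.len() >= quorum {
            outside
        } else if inside.len() >= quorum {
            inside
        } else {
            BTreeSet::new()
        }
    }
}
```

Under these assumptions, S07 would amount to applying `Fault::Partition` with node 2 isolated on a three-node cluster and checking that nodes 0 and 1 remain writable.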
## Invariants (checked after every step)
1. No acked message is ever lost
2. Offsets are strictly monotonic, with no gaps
3. CRC integrity on all reads
4. Consumer group offsets never regress
5. After network heal, replicas converge
6. WAL never trimmed before S3 confirmation
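
As a rough sketch of what the checker functions in `invariants.rs` might look like, the block below covers invariants 1, 2, and 4. The `Record` and `Violation` types and all function names are assumptions made for illustration, not the project's actual API. It also assumes "no gaps" means each offset is exactly one greater than its predecessor.

```rust
use std::collections::HashSet;

/// Hypothetical record shape; the real sq types may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Record {
    pub offset: u64,
    pub payload: Vec<u8>,
}

/// Hypothetical violation type carrying enough detail for diagnostics.
#[derive(Debug, PartialEq)]
pub enum Violation {
    AckedMessageLost { offset: u64 },
    OffsetGap { expected: u64, found: u64 },
    ConsumerOffsetRegressed { previous: u64, current: u64 },
}

/// Invariant 1: every acked offset is still present in the log.
pub fn check_no_acked_loss(acked: &[u64], log: &[Record]) -> Result<(), Violation> {
    let present: HashSet<u64> = log.iter().map(|r| r.offset).collect();
    for &offset in acked {
        if !present.contains(&offset) {
            return Err(Violation::AckedMessageLost { offset });
        }
    }
    Ok(())
}

/// Invariant 2: each offset is exactly its predecessor plus one.
pub fn check_offsets_contiguous(log: &[Record]) -> Result<(), Violation> {
    for pair in log.windows(2) {
        let expected = pair[0].offset + 1;
        if pair[1].offset != expected {
            return Err(Violation::OffsetGap { expected, found: pair[1].offset });
        }
    }
    Ok(())
}

/// Invariant 4: a consumer group's committed offset never moves backwards.
pub fn check_consumer_offset(previous: u64, current: u64) -> Result<(), Violation> {
    if current < previous {
        Err(Violation::ConsumerOffsetRegressed { previous, current })
    } else {
        Ok(())
    }
}
```

Returning a structured `Violation` instead of panicking lets the harness attach the seed and step number to the report, which is one way to approach the diagnostic-output requirement.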
## Acceptance Criteria
- [ ] All 12 scenarios pass
- [ ] Each scenario runs with at least 10 different random seeds
- [ ] Invariant violations produce clear diagnostic output
- [ ] The full suite completes in under 60 seconds
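
One possible way to meet the multi-seed criterion is a small seeded driver that reruns a scenario and reports the failing seed. This is a sketch under assumptions: `SimRng` and `run_with_seeds` are hypothetical names, and SplitMix64 is used here only as a dependency-free stand-in for whatever generator the project actually adopts.

```rust
use std::ops::Range;

/// Minimal deterministic PRNG (SplitMix64), so a failing seed replays exactly.
pub struct SimRng(u64);

impl SimRng {
    pub fn new(seed: u64) -> Self {
        SimRng(seed)
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Runs `scenario` once per seed and stops at the first failure,
/// prefixing the error with the seed so the run can be reproduced.
pub fn run_with_seeds<F>(seeds: Range<u64>, mut scenario: F) -> Result<(), String>
where
    F: FnMut(&mut SimRng) -> Result<(), String>,
{
    for seed in seeds {
        let mut rng = SimRng::new(seed);
        scenario(&mut rng).map_err(|e| format!("seed {seed}: {e}"))?;
    }
    Ok(())
}
```

A scenario test could then call `run_with_seeds(0..10, ...)` and assert on the result, with the seed in the error message pointing directly at the run to replay.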