49
todos/SQ-022-simulation-tests.md
Normal file
@@ -0,0 +1,49 @@
# SQ-022: Multi-Node Simulation Tests

**Status:** `[ ] TODO`
**Blocked by:** SQ-021, SQ-019
**Priority:** High

## Description

Full TigerBeetle-inspired simulation test suite. Spin up multiple nodes with virtual I/O, inject faults, verify invariants.
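
The core idea can be sketched as a seeded loop over a virtual disk. This is only an illustration under stated assumptions: `SimRng`, `Fault`, `VirtualDisk`, and `run_simulation` are hypothetical names, not existing sq-sim APIs, and the fault probabilities are arbitrary. The point is that the fault schedule depends only on the seed, so any failing run can be replayed.

```rust
/// Tiny deterministic PRNG (SplitMix64) so a run is reproducible from its seed.
/// Hypothetical helper, not part of any existing crate.
pub struct SimRng(u64);

impl SimRng {
    pub fn new(seed: u64) -> Self {
        SimRng(seed)
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Returns true roughly `percent` times out of 100.
    pub fn chance(&mut self, percent: u64) -> bool {
        self.next_u64() % 100 < percent
    }
}

/// Faults the simulation may inject at a step (assumed set, for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Fault {
    None,
    DiskFull,
    Crash,
}

/// In-memory stand-in for a node's disk: a write either succeeds or fails
/// according to the fault injected for that step.
#[derive(Default)]
pub struct VirtualDisk {
    pub records: Vec<Vec<u8>>,
}

impl VirtualDisk {
    pub fn write(&mut self, data: &[u8], fault: Fault) -> Result<u64, Fault> {
        match fault {
            Fault::None => {
                self.records.push(data.to_vec());
                Ok(self.records.len() as u64 - 1) // offset of the new record
            }
            other => Err(other),
        }
    }
}

/// Runs `steps` writes against one virtual disk. Returns the fault chosen at
/// each step and the number of records that were actually written.
pub fn run_simulation(seed: u64, steps: usize) -> (Vec<Fault>, usize) {
    let mut rng = SimRng::new(seed);
    let mut disk = VirtualDisk::default();
    let mut schedule = Vec::with_capacity(steps);
    for step in 0..steps {
        let fault = if rng.chance(10) {
            Fault::DiskFull
        } else if rng.chance(5) {
            Fault::Crash
        } else {
            Fault::None
        };
        // A failed write is expected under fault injection, so the error is ignored here.
        let _ = disk.write(format!("msg-{step}").as_bytes(), fault);
        schedule.push(fault);
    }
    (schedule, disk.records.len())
}
```

Calling `run_simulation(42, 200)` twice yields the same schedule both times, which is the property a real harness would rely on when reporting a failing seed.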

## Files to Create/Modify

- `crates/sq-sim/src/runtime.rs` - test harness for multi-node simulation
- `crates/sq-sim/tests/invariants.rs` - invariant checker functions
- `crates/sq-sim/tests/scenarios/mod.rs`
- `crates/sq-sim/tests/scenarios/single_node.rs` - S01-S04
- `crates/sq-sim/tests/scenarios/multi_node.rs` - S05-S08
- `crates/sq-sim/tests/scenarios/failures.rs` - S09-S12

## Scenarios

- **S01:** Single node, single producer, single consumer - baseline
- **S02:** Single node, concurrent producers - offset ordering
- **S03:** Single node, disk full during write - graceful error
- **S04:** Single node, crash and restart - WAL recovery
- **S05:** Three nodes, normal operation - replication works
- **S06:** Three nodes, one crashes - remaining two continue
- **S07:** Three nodes, network partition (2+1) - majority continues
- **S08:** Three nodes, S3 outage - local WAL accumulates
- **S09:** Consumer group, offset preservation
- **S10:** High throughput burst - no message loss
- **S11:** Slow consumer with WAL trimming - falls back to S3
- **S12:** Node rejoins after long absence - catches up
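
As an illustration of what S06 and S07 would assert, here is a minimal sketch of a partition model. It assumes a strict-majority rule for accepting writes, which this ticket does not specify, and `Cluster` with its methods is a hypothetical type rather than part of sq-sim.

```rust
use std::collections::BTreeSet;

/// Hypothetical model of cluster connectivity for partition scenarios.
pub struct Cluster {
    node_count: usize,
    /// Groups of node ids that can currently talk to each other.
    partitions: Vec<BTreeSet<usize>>,
}

impl Cluster {
    /// Starts fully connected: one group containing every node.
    pub fn new(node_count: usize) -> Self {
        let all: BTreeSet<usize> = (0..node_count).collect();
        Cluster { node_count, partitions: vec![all] }
    }

    /// Splits the network into the given groups (for example 2+1).
    pub fn partition(&mut self, groups: Vec<Vec<usize>>) {
        self.partitions = groups
            .into_iter()
            .map(|group| group.into_iter().collect())
            .collect();
    }

    /// Restores full connectivity.
    pub fn heal(&mut self) {
        self.partitions = vec![(0..self.node_count).collect()];
    }

    /// Assumed rule: a node accepts writes only if its side of the
    /// partition holds a strict majority of all nodes.
    pub fn can_accept_writes(&self, node: usize) -> bool {
        self.partitions
            .iter()
            .find(|group| group.contains(&node))
            .map(|group| group.len() * 2 > self.node_count)
            .unwrap_or(false)
    }
}
```

With three nodes split as `vec![vec![0, 1], vec![2]]`, nodes 0 and 1 keep accepting writes while node 2 does not; after `heal()` all three accept writes again.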

## Invariants (checked after every step)

1. No acked message is ever lost
2. Offsets strictly monotonic, no gaps
3. CRC integrity on all reads
4. Consumer group offsets never regress
5. After network heal, replicas converge
6. WAL never trimmed before S3 confirmation
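
Several of these invariants reduce to small pure functions over observed state. The sketch below covers invariants 1, 2, 4, and 6. The `Violation` enum, the function names, and the plain-slice representation of state are all assumptions made for illustration, not the planned contents of `invariants.rs`.

```rust
use std::collections::HashMap;

/// Describes which invariant failed and where, for diagnostic output.
#[derive(Debug, PartialEq, Eq)]
pub enum Violation {
    OffsetGap { expected: u64, found: u64 },
    AckedMessageLost { offset: u64 },
    ConsumerOffsetRegressed { group: String, from: u64, to: u64 },
    WalTrimmedBeforeUpload { trimmed_to: u64, uploaded_to: u64 },
}

/// Invariant 2: offsets start at `first` and increase by exactly one.
pub fn check_offsets_contiguous(first: u64, offsets: &[u64]) -> Result<(), Violation> {
    let mut expected = first;
    for &found in offsets {
        if found != expected {
            return Err(Violation::OffsetGap { expected, found });
        }
        expected += 1;
    }
    Ok(())
}

/// Invariant 1: every acked offset is still readable.
pub fn check_no_acked_loss(acked: &[u64], readable: &[u64]) -> Result<(), Violation> {
    for &offset in acked {
        if !readable.contains(&offset) {
            return Err(Violation::AckedMessageLost { offset });
        }
    }
    Ok(())
}

/// Invariant 4: committed offsets per consumer group never move backwards.
/// Groups missing from `current` are not checked in this sketch.
pub fn check_consumer_offsets(
    previous: &HashMap<String, u64>,
    current: &HashMap<String, u64>,
) -> Result<(), Violation> {
    for (group, &from) in previous {
        if let Some(&to) = current.get(group) {
            if to < from {
                return Err(Violation::ConsumerOffsetRegressed {
                    group: group.clone(),
                    from,
                    to,
                });
            }
        }
    }
    Ok(())
}

/// Invariant 6: the WAL may only be trimmed up to what S3 has confirmed.
pub fn check_wal_trim(trimmed_to: u64, uploaded_to: u64) -> Result<(), Violation> {
    if trimmed_to > uploaded_to {
        return Err(Violation::WalTrimmedBeforeUpload { trimmed_to, uploaded_to });
    }
    Ok(())
}
```

Returning a structured `Violation` rather than a bare `bool` keeps the expected and observed values available for the diagnostic output. A harness could call each checker after every simulation step and stop at the first `Err`.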

## Acceptance Criteria

- [ ] All 12 scenarios pass
- [ ] Each scenario runs with multiple random seeds (at least 10)
- [ ] Invariant violations produce clear diagnostic output
- [ ] Tests complete in < 60 seconds total
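
The multi-seed and diagnostic criteria could be met with a small runner along these lines. This is a sketch: `run_with_seeds` and `default_seeds` are hypothetical helpers, and deriving seeds from a fixed base is just one possible scheme.

```rust
/// Runs `scenario` once per seed and reports the first failure together with
/// the seed needed to replay it.
pub fn run_with_seeds<F>(name: &str, seeds: &[u64], scenario: F) -> Result<(), String>
where
    F: Fn(u64) -> Result<(), String>,
{
    for &seed in seeds {
        if let Err(reason) = scenario(seed) {
            return Err(format!("scenario {name} failed with seed {seed}: {reason}"));
        }
    }
    Ok(())
}

/// Ten seeds derived from a base value (assumed scheme, for illustration).
pub fn default_seeds(base: u64) -> Vec<u64> {
    (0..10).map(|i| base.wrapping_add(i)).collect()
}
```

Including the scenario name and seed in the error message means a failure can be reproduced by rerunning that single seed.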