Internal Testing Plan: Multi-Node Network Validation
Status: Ready to Execute
Phase: Pre-Pilot Internal Testing
Timeline: 1-2 weeks
Prerequisites: Phase 18 Complete ✅
Objectives
Validate the ICN system in realistic multi-node scenarios before pilot deployment:
- Functional Correctness: All components work together as designed
- Byzantine Detection: Misbehavior is detected and isolated correctly
- Performance: System handles realistic workloads efficiently
- Resilience: Recovers gracefully from failures and network partitions
- Monitoring: Metrics and alerts provide actionable operational visibility
- Stability: System runs continuously without crashes or memory leaks
Test Environment Architecture
Network Topology
                ┌─────────────┐
                │   Metrics   │
                │ (Prometheus │
                │  + Grafana) │
                └──────┬──────┘
                       │
      ┌────────────────┼────────────────┐
      │                │                │
 ┌────▼────┐      ┌────▼────┐      ┌────▼────┐
 │ Node 1  │◄────►│ Node 2  │◄────►│ Node 3  │
 │(Honest) │      │(Honest) │      │(Honest) │
 └────┬────┘      └────┬────┘      └────┬────┘
      │                │                │
      └────────────────┼────────────────┘
                       │
                 ┌─────▼─────┐
                 │  Node 4   │
                 │(Byzantine │
                 │ Attacker) │
                 └───────────┘
Node Configuration:
- Nodes 1-3: Honest nodes with full trust relationships
- Node 4: Byzantine node for attack simulation
- Metrics Server: Centralized Prometheus + Grafana on separate host
Infrastructure Setup
Option 1: Docker Compose (Recommended for Development)
Use the production-like configuration in docker-compose.test.yml:
# Build and start
docker build -t icn:latest -f Dockerfile icn/
docker compose -f docker-compose.test.yml up -d
# Check status
docker compose -f docker-compose.test.yml ps
Port mapping summary (see docker-compose.test.yml for complete config):
| Service | P2P | Metrics (host:container) | Gateway |
|---|---|---|---|
| node1 | 5001 | 9091:9100 | 8081 |
| node2 | 5002 | 9092:9100 | 8082 |
| node3 | 5003 | 9093:9100 | 8083 |
| node4 | 5004 | 9094:9100 | 8084 |
| prometheus | - | 9095:9090 | - |
| grafana | - | - | 3000 |
Note: ICN daemon runs metrics on port 9100 internally.
Option 2: Local Processes (Quick Start)
# Terminal 1: Node 1
cat > /tmp/icn-node1.toml <<EOF
data_dir = "/tmp/icn-node1"
[network]
listen_addr = "127.0.0.1:5001"
[observability]
metrics_port = 9101
health_port = 18081
log_level = "info"
EOF
cargo run --release --bin icnd -- --config /tmp/icn-node1.toml
# Terminal 2: Node 2
cat > /tmp/icn-node2.toml <<EOF
data_dir = "/tmp/icn-node2"
[network]
listen_addr = "127.0.0.1:5002"
[observability]
metrics_port = 9102
health_port = 18082
log_level = "info"
EOF
cargo run --release --bin icnd -- --config /tmp/icn-node2.toml
# Terminal 3: Node 3
cat > /tmp/icn-node3.toml <<EOF
data_dir = "/tmp/icn-node3"
[network]
listen_addr = "127.0.0.1:5003"
[observability]
metrics_port = 9103
health_port = 18083
log_level = "info"
EOF
cargo run --release --bin icnd -- --config /tmp/icn-node3.toml
# Terminal 4: Node 4 (Byzantine)
cat > /tmp/icn-node4.toml <<EOF
data_dir = "/tmp/icn-node4"
[network]
listen_addr = "127.0.0.1:5004"
[observability]
metrics_port = 9104
health_port = 18084
log_level = "info"
EOF
cargo run --release --bin icnd -- --config /tmp/icn-node4.toml
# Terminal 5: Prometheus
prometheus --config.file=monitoring/prometheus.yml
# Terminal 6: Grafana
grafana-server --config monitoring/grafana.ini
Option 3: Kubernetes (Production-like)
- Deploy to local k8s cluster (minikube/kind)
- 4 ICN pods with persistent volumes
- Prometheus operator for metrics
- Grafana for dashboards
Test Scenarios
1. Baseline Functionality Tests (2 days)
Goal: Verify all components work correctly in normal operation
1.1 Network Formation
- Setup: Start all 4 nodes sequentially
- Test:
- Nodes discover each other via mDNS
- QUIC/TLS connections established
- DID-TLS binding verified
- X25519 key exchange completed
- Success Criteria:
- All nodes see 3 peers in `icnctl network peers`
- No connection errors in logs
- `icn_network_connections_active` = 3 per node
1.2 Trust Graph Sync
- Setup: Node1 sets trust edges
- Test:
- Node1: Trust Node2=0.8, Node3=0.7, Node4=0.3
- Wait for gossip propagation (30s)
- Query trust from other nodes
- Success Criteria:
- All nodes have consistent trust graph
- Trust class calculations correct (Partner, Federated, Isolated)
- `icn_trust_edges_total` matches across nodes
1.3 Gossip Message Propagation
- Setup: Nodes 1-3 subscribed to topic "test:messages"
- Test:
- Node1 publishes 100 messages to "test:messages"
- Measure time to convergence
- Success Criteria:
- All nodes receive all 100 messages within 5 seconds
- Vector clocks show correct causal ordering
- No duplicate message processing
- `icn_gossip_announces_received_total` = 100 on nodes 2-3
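The causal-ordering criterion above can be spot-checked offline with a standard vector-clock comparison. A minimal sketch, assuming clocks are maps from node id to counter (not necessarily ICN's internal representation):

```python
def happens_before(a: dict, b: dict) -> bool:
    """True if clock `a` causally precedes clock `b` (a <= b pointwise, a != b)."""
    nodes = set(a) | set(b)
    le = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    return le and a != b

# Node1 publishes msg1 then msg2, so msg2's clock must dominate msg1's.
msg1 = {"node1": 1}
msg2 = {"node1": 2}
concurrent = {"node2": 1}

print(happens_before(msg1, msg2))        # True: msg1 precedes msg2
print(happens_before(msg2, msg1))        # False: not the other way around
print(happens_before(msg1, concurrent))  # False: concurrent, no ordering
```

Messages from the same publisher whose clocks are pairwise ordered this way satisfy the "correct causal ordering" criterion; unordered clocks are expected only between independent publishers.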
1.4 Ledger Transaction Sync
- Setup: Initialize ledgers on all nodes
- Test:
- Node1 → Node2: Transfer 50 credits
- Node2 → Node3: Transfer 30 credits
- Node3 → Node1: Transfer 20 credits
- Wait for gossip sync (60s)
- Success Criteria:
- All nodes have identical ledger state
- Balances correct: Node1=-30, Node2=+20, Node3=+10
- No quarantined entries
- `icn_ledger_entries_total` = 3 on all nodes
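The expected balances follow from replaying the three transfers. A quick sketch (node names as in the scenario; the ledger API itself is not modeled):

```python
# The three transfers from scenario 1.4.
transfers = [
    ("node1", "node2", 50),
    ("node2", "node3", 30),
    ("node3", "node1", 20),
]

balances = {"node1": 0, "node2": 0, "node3": 0}
for sender, receiver, amount in transfers:
    balances[sender] -= amount
    balances[receiver] += amount

print(balances)                 # {'node1': -30, 'node2': 20, 'node3': 10}
print(sum(balances.values()))   # 0: the double-entry invariant
```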
1.5 Compute Task Execution
- Setup: Node1 submits task, Node2 configured as executor
- Test:
- Submit CCL contract: `rule example() { return 42; }`
- Node2 claims and executes
- Result propagated via gossip
- Success Criteria:
- Task completes within 10 seconds
- Result verified: output = 42
- Payment settled: Node1 → Node2
- `icn_compute_tasks_completed_total` = 1
1.6 Governance Domain Creation & Sync
- Setup: All nodes running
- Test:
- Node1 creates governance domain "test-coop" with members: Node1, Node2, Node3
- Wait for gossip propagation (30s)
- Query domain from all nodes
- Success Criteria:
- All nodes see the same domain configuration
- Membership list correct (3 members)
- Governance profile = cooperative_default (1-member-1-vote)
- Domain created event in gossip logs
1.7 Proposal Lifecycle (Simple Majority)
- Setup: Governance domain "test-coop" with 3 members
- Test:
- Node1 creates text proposal: "Should we upgrade to Protocol v2?"
- Node1 opens proposal for voting
- Node1 votes: For
- Node2 votes: For
- Node3 votes: Against
- Node1 closes proposal
- Check outcome
- Success Criteria:
- Proposal created and synced to all nodes
- All votes recorded (3 total)
- Tally: 2 For, 1 Against, 0 Abstain
- Outcome: Accepted (66% approval > 50% threshold)
- Proposal state transitions: Draft → Open → Voting → Closed
- All events propagated via governance gossip
1.8 Proposal Lifecycle (Quorum Failure)
- Setup: Governance domain "test-coop" with 3 members, quorum = 100%
- Test:
- Node1 creates budget proposal
- Node1 opens proposal
- Only Node1 and Node2 vote (2/3 = 66% turnout)
- Node1 closes proposal
- Success Criteria:
- Tally recorded correctly
- Outcome: Rejected (failed quorum requirement)
- Rejection reason: "quorum not met (66% < 100%)"
- All nodes see consistent outcome
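The outcomes in 1.7 and 1.8 follow from simple tally arithmetic. A sketch of the assumed quorum/threshold semantics (inferred from the success criteria, not taken from the ICN governance implementation):

```python
def close_proposal(votes, members, quorum=0.5, threshold=0.5):
    """Return (outcome, reason) for a closed proposal.

    `votes` maps voter id -> "For"/"Against"/"Abstain"; quorum is required
    turnout, threshold is required share of "For" among cast votes.
    """
    turnout = len(votes) / len(members)
    if turnout < quorum:
        return "Rejected", f"quorum not met ({int(turnout * 100)}% < {int(quorum * 100)}%)"
    approvals = sum(1 for v in votes.values() if v == "For")
    if approvals / len(votes) > threshold:
        return "Accepted", None
    return "Rejected", "approval threshold not met"

members = ["node1", "node2", "node3"]

# 1.7: 2 For, 1 Against, simple majority.
print(close_proposal({"node1": "For", "node2": "For", "node3": "Against"}, members))
# ('Accepted', None)

# 1.8: only 2 of 3 members vote, quorum = 100%.
print(close_proposal({"node1": "For", "node2": "For"}, members, quorum=1.0))
# ('Rejected', 'quorum not met (66% < 100%)')
```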
1.9 Governance WebSocket Events
- Setup: Gateway running, WebSocket client connected
- Test:
- Create domain, proposal, cast votes
- Monitor WebSocket for events
- Success Criteria:
- Client receives: GovernanceDomainCreated event
- Client receives: GovernanceProposalCreated event
- Client receives: GovernanceProposalOpened event
- Client receives: GovernanceVoteCast events (3 total)
- Client receives: GovernanceProposalClosed event
- All events have correct timestamps and payload
1.10 Graceful Restart
- Setup: Nodes running with active workload
- Test:
- Send SIGTERM to Node2
- Wait for graceful shutdown
- Restart Node2
- Resume workload
- Success Criteria:
- State snapshot saved (vector clocks, subscriptions, X25519 keys)
- Node2 rejoins network within 30 seconds
- No message loss or duplicates
- `icn_snapshot_save_duration_seconds` < 0.1s
2. Byzantine Behavior Detection Tests (3 days)
Goal: Verify misbehavior is detected and isolated correctly
2.1 Invalid Signature Attack
- Setup: Node4 (Byzantine) attempts to forge signatures
- Test:
- Modify Node4 to send messages with invalid Ed25519 signatures
- Send 5 forged messages to Node1
- Expected Behavior:
- Node1 detects InvalidSignature violations (5 total)
- Node1's misbehavior detector records violations
- Node4's reputation drops (1.0 → 0.75 after 5 violations)
- Node4 NOT quarantined yet (threshold = 0.5)
- Success Criteria:
- `icn_misbehavior_violations_total{violation_type="InvalidSignature"}` = 5
- Node1 reputation for Node4 = 0.75 ± 0.01
- Grafana panel shows violations
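The numbers above imply a penalty of 0.05 reputation per InvalidSignature violation (1.0 → 0.75 over 5 violations). Both the per-violation penalty and the 0.5 quarantine threshold below are inferred from the stated criteria, not taken from the ICN source:

```python
PENALTY_INVALID_SIGNATURE = 0.05  # inferred: (1.0 - 0.75) / 5 violations
QUARANTINE_THRESHOLD = 0.5        # from the scenario text

reputation = 1.0
for _ in range(5):  # five forged messages, one violation each
    reputation -= PENALTY_INVALID_SIGNATURE

print(round(reputation, 2))               # 0.75
print(reputation < QUARANTINE_THRESHOLD)  # False: not yet quarantined
```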
2.2 Replay Attack Detection
- Setup: Node4 attempts to replay captured messages
- Test:
- Capture signed message from Node2
- Node4 replays same message 3 times to Node1
- Expected Behavior:
- First message accepted (valid)
- Subsequent replays detected by sequence number tracking
- ReplayAttack violation recorded (severity 10, auto-ban)
- Node4 immediately banned (reputation → 0.0)
- Success Criteria:
- `icn_misbehavior_violations_total{violation_type="ReplayAttack"}` ≥ 1
- `icn_misbehavior_banned_peers` = 1
- `icn_misbehavior_auto_bans_total` = 1
- Node4 isolated from network (no further messages accepted)
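The expected accept/reject pattern can be illustrated with per-peer sequence tracking; ICN's actual replay detection may differ in detail, this only shows why the first delivery passes and every replay fails:

```python
seen = {}  # peer id -> set of sequence numbers already delivered

def accept(peer: str, seq: int) -> bool:
    """Accept a message only if (peer, seq) has not been seen before."""
    if seq in seen.setdefault(peer, set()):
        return False  # replay: would raise a ReplayAttack violation
    seen[peer].add(seq)
    return True

# One original message captured from Node2, then 3 replays by Node4.
results = [accept("node2", 7) for _ in range(4)]
print(results)  # [True, False, False, False]
```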
2.3 Ledger Fork Attack
- Setup: Node4 attempts double-spending
- Test:
- Node4 creates two conflicting ledger entries with same parent
- Entry A: Transfer 100 credits to Node1
- Entry B: Transfer 100 credits to Node2 (conflicting)
- Gossip both entries to network
- Expected Behavior:
- First entry accepted by honest nodes
- Second entry detected as conflict
- ConflictingLedgerEntries violation (severity 10, auto-ban)
- Conflicting entry quarantined
- Node4 auto-banned
- Success Criteria:
- `icn_ledger_entries_quarantined` = 1
- `icn_misbehavior_violations_total{violation_type="ConflictingLedgerEntries"}` = 1
- Node4 banned on all honest nodes
- Ledger state consistent across Node1-3
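The fork check boils down to "two distinct entries claiming the same parent": the first one seen is accepted, any conflicting sibling is quarantined. A minimal sketch with illustrative entry fields (not ICN's actual ledger schema):

```python
accepted = {}     # parent hash -> first entry seen for that parent
quarantined = []  # conflicting entries set aside for review

def ingest(entry: dict) -> str:
    parent = entry["parent"]
    if parent in accepted and accepted[parent]["id"] != entry["id"]:
        quarantined.append(entry)  # ConflictingLedgerEntries violation
        return "quarantined"
    accepted[parent] = entry
    return "accepted"

entry_a = {"id": "a", "parent": "p0", "to": "node1", "amount": 100}
entry_b = {"id": "b", "parent": "p0", "to": "node2", "amount": 100}

print(ingest(entry_a))   # accepted
print(ingest(entry_b))   # quarantined: same parent, different entry
print(len(quarantined))  # 1
```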
2.4 Compute Result Forgery
- Setup: Node4 claims task but returns forged result
- Test:
- Node1 submits task with known result (e.g., hash computation)
- Node4 claims task, returns incorrect result with invalid signature
- Node1 verifies result
- Expected Behavior:
- Signature verification fails
- FailedComputeVerification violation recorded (severity 5)
- Node4's reputation decreases
- No payment issued (verification failed)
- Success Criteria:
- `icn_compute_signatures_invalid_total` = 1
- `icn_misbehavior_violations_total{violation_type="FailedComputeVerification"}` = 1
- Node4 reputation reduced
- Task remains in "failed" state
2.5 ACL Violation Spam
- Setup: Node4 attempts rapid unauthorized subscriptions
- Test:
- Node1 creates private topic (TrustClass::Partner, requires trust > 0.9)
- Node4 (trust 0.3) attempts 15 subscription requests in 10 seconds
- Expected Behavior:
- All subscription attempts rejected (ACL violation)
- 15 violations recorded
- After 10 violations in 1 hour: Node4 quarantined
- Node4's trust reduced via trust penalty callback
- Success Criteria:
- `icn_misbehavior_violations_total` = 15
- `icn_misbehavior_quarantined_peers` = 1
- Node4 quarantined (reputation < 0.5)
- Grafana shows rate-limit quarantine event
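The quarantine rule here (more than 10 violations within 1 hour) can be modeled as a sliding window. Window length and count come from the scenario text; the implementation is illustrative:

```python
from collections import deque

WINDOW_SECS = 3600   # 1-hour window from the scenario
MAX_VIOLATIONS = 10  # quarantine after more than 10 in the window

violations = deque()  # timestamps of recorded violations

def record_violation(now: float) -> bool:
    """Record a violation; return True once the peer should be quarantined."""
    violations.append(now)
    while violations and violations[0] <= now - WINDOW_SECS:
        violations.popleft()  # evict timestamps outside the window
    return len(violations) > MAX_VIOLATIONS

# 15 ACL violations in rapid succession, as in the scenario:
quarantined_at = [t for t in range(15) if record_violation(float(t))]
print(quarantined_at[0])  # 10: quarantine triggers on the 11th violation
```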
2.6 Multi-Node Byzantine Isolation
- Setup: Node4 sends conflicting statements to different nodes
- Test:
- Node4 → Node1: "Balance(Alice) = 100"
- Node4 → Node2: "Balance(Alice) = 200" (conflicting)
- Nodes gossip received statements
- Expected Behavior:
- Both Node1 and Node2 independently detect conflict
- ConflictingSignedStatements violation (severity 10, auto-ban)
- Node4 banned on both nodes
- Node3 learns of ban via reputation gossip (future Phase 19)
- Success Criteria:
- Node1 and Node2 both ban Node4 independently
- `icn_misbehavior_auto_bans_total` ≥ 2 (across network)
- Node4 isolated from all honest nodes
2.7 Governance Vote Manipulation
- Setup: Governance domain with Node1, Node2, Node3; Node4 NOT a member
- Test:
- Node1 creates proposal
- Node1 opens proposal
- Node4 attempts to vote (not a member)
- Expected Behavior:
- Node4's vote rejected (not in membership list)
- Vote not recorded in tally
- Potential violation recorded (attempted unauthorized action)
- Success Criteria:
- Vote count remains 0
- Node4's vote not in proposal.votes map
- Error logged: "unauthorized voter"
- Proposal outcome unaffected
2.8 Governance Double Voting Attack
- Setup: Governance domain with Node1, Node2, Node3
- Test:
- Node1 creates proposal
- Node1 opens proposal
- Node2 votes: For
- Node2 attempts to vote again: Against (double vote)
- Expected Behavior:
- First vote accepted and recorded
- Second vote rejected (already voted)
- Tally shows only 1 vote from Node2
- Warning logged: "double vote attempt"
- Success Criteria:
- Vote count = 1 for Node2 (not 2)
- Tally: 1 For, 0 Against (first vote wins)
- Double vote attempt logged
- No violation recorded (benign error, could be network duplicate)
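First-vote-wins falls out naturally when votes are keyed by voter, which is why a duplicate ballot is a benign no-op rather than a violation. A sketch with illustrative data shapes:

```python
votes = {}  # voter id -> choice; one slot per voter enforces one vote each

def cast_vote(voter: str, choice: str) -> bool:
    if voter in votes:
        return False  # logged as "double vote attempt"; tally unchanged
    votes[voter] = choice
    return True

print(cast_vote("node2", "For"))      # True: first vote recorded
print(cast_vote("node2", "Against"))  # False: double vote rejected
print(votes)                          # {'node2': 'For'}
```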
2.9 Governance Proposal Spam
- Setup: Node4 (low trust) creates governance domain
- Test:
- Node4 creates 50 proposals in 60 seconds
- All proposals in same domain
- Expected Behavior:
- Proposals accepted (governance has no built-in rate limit yet)
- Gossip propagates all proposals
- Note: This validates current behavior; Phase 19 may add rate limits
- Success Criteria:
- 50 proposals created successfully
- No crashes or out-of-memory errors
- Gossip convergence time measured (should be < 2 minutes)
- Resource usage within acceptable bounds
2.10 Governance Conflicting Outcomes
- Setup: Network partition scenario with governance
- Test:
- Create governance domain with Node1, Node2, Node3
- Create proposal
- Partition: [Node1, Node2] vs [Node3]
- Node1 and Node2 vote: For (2/3 = 66%)
- Node1 closes proposal (sees Accepted with 2/3 votes)
- Node3 (in partition) votes: Against
- Heal partition
- Both sides have different tallies
- Expected Behavior:
- Before healing: Both partitions see different state
- After healing: Gossip reconciles votes
- Final tally: 2 For, 1 Against
- Outcome recalculated if needed (may require manual review)
- Success Criteria:
- All votes eventually recorded (3 total after healing)
- Conflicting outcomes detected (if any)
- Operator alerted to review proposal
- Note: This is a known edge case; Phase 19 may add partition-aware voting
3. Performance & Load Tests (2 days)
Goal: Validate system performance under realistic and stress conditions
3.1 Gossip Throughput
- Setup: 3-node network
- Test:
- Node1 publishes 1000 messages/sec to 5 topics
- Measure propagation latency and throughput
- Run for 10 minutes
- Success Criteria:
- Median latency < 100ms
- P99 latency < 500ms
- No message loss
- CPU usage < 50% per node
- Memory growth < 100 MB over 10 minutes
3.2 Ledger Transaction Volume
- Setup: 3-node network
- Test:
- Simulate 100 concurrent users making transactions
- 50 transactions/sec sustained for 5 minutes
- Random transaction amounts and participants
- Success Criteria:
- All transactions processed without conflicts
- Ledger convergence within 60 seconds
- No quarantined entries (all valid)
- `icn_ledger_entries_total` = 15,000 (50 tx/s × 300s)
- Balances sum to zero (double-entry invariant)
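The sum-to-zero criterion is a property of double-entry itself, not of any particular workload: every transfer debits exactly what it credits. A seeded property check over a random workload of the stated size:

```python
import random
from collections import defaultdict

random.seed(42)  # reproducible "random transaction amounts and participants"
nodes = ["node1", "node2", "node3"]
balances = defaultdict(int)

for _ in range(15_000):  # 50 tx/s x 300 s
    sender, receiver = random.sample(nodes, 2)
    amount = random.randint(1, 100)
    balances[sender] -= amount
    balances[receiver] += amount

print(sum(balances.values()))  # 0, regardless of seed or amounts
```

A nonzero sum after convergence would indicate a lost or duplicated entry, which is exactly what this criterion is meant to catch.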
3.3 Compute Task Queue
- Setup: 1 submitter (Node1), 2 executors (Node2, Node3)
- Test:
- Submit 500 compute tasks with varying fuel limits
- Tasks include: math operations, string parsing, conditional logic
- Measure task completion rate and latency
- Success Criteria:
- All 500 tasks complete successfully
- Median completion time < 5 seconds
- Tasks distributed evenly (Node2 ≈ 250, Node3 ≈ 250)
- No task timeouts or executor crashes
- All payments settled correctly
3.4 Byzantine Detection Under Load
- Setup: 3 honest nodes + 1 Byzantine node
- Test:
- Normal workload: 100 tx/sec + 50 compute tasks/min
- Byzantine workload: 10 violations/sec (mixed types)
- Run for 30 minutes
- Success Criteria:
- Byzantine node quarantined within 1 minute
- Byzantine node auto-banned after critical violation
- Honest nodes maintain throughput (< 10% degradation)
- No false positives (honest nodes not flagged)
- `icn_misbehavior_violations_total` > 600 (10/s × 60s)
3.5 Governance Load Test
- Setup: 3-node network with 1 governance domain (3 members)
- Test:
- Create 100 proposals concurrently
- Each node opens 33-34 proposals
- All 3 nodes vote on all proposals (300 votes total)
- Close all proposals
- Measure convergence time
- Success Criteria:
- All 100 proposals created successfully
- All 300 votes recorded correctly
- All proposals reach consistent outcome across nodes
- Convergence time < 5 minutes
- No vote loss or duplication
- Gossip overhead acceptable (< 50% CPU)
3.6 Memory Leak Detection
- Setup: 4-node network with continuous workload
- Test:
- Run for 24 hours with:
  - 10 tx/sec ledger transactions
  - 5 compute tasks/min
  - 100 gossip messages/sec
- Monitor memory usage every hour
- Success Criteria:
- Memory growth < 500 MB over 24 hours
- No unbounded growth (linear or exponential)
- Resident set size (RSS) stable after initial ramp-up
- No out-of-memory crashes
4. Resilience & Fault Tolerance Tests (2 days)
Goal: Verify system recovers gracefully from failures
4.1 Node Crash Recovery
- Setup: 4-node network with active workload
- Test:
- Kill Node2 with SIGKILL (unclean shutdown)
- Wait 2 minutes
- Restart Node2
- Verify recovery
- Success Criteria:
- Node2 rejoins network within 60 seconds
- Gossip anti-entropy fetches missed messages
- Ledger state restored via sync
- No data loss or corruption
- Workload resumes normally
4.2 Network Partition
- Setup: 4-node network split into 2 partitions
- Test:
- Partition 1: Node1, Node2
- Partition 2: Node3, Node4
- Block traffic between partitions for 5 minutes
- Heal partition
- Measure convergence time
- Success Criteria:
- Nodes detect partition via heartbeat timeouts
- Both partitions continue operating independently
- After healing: gossip anti-entropy reconciles state
- Ledger conflicts detected and quarantined (if any)
- Full convergence within 2 minutes of healing
4.3 Byzantine Node Recovery
- Setup: Node4 quarantined due to violations
- Test:
- Stop Node4
- Upgrade Node4 to honest behavior
- Restart Node4
- Wait for reputation decay
- Expected Behavior:
- Node4 starts with quarantined reputation (loaded from snapshot - Phase 19)
- Reputation decays at 0.01 points/hour
- After ~50 hours: reputation > 0.5 (out of quarantine)
- Node4 regains network privileges gradually
- Success Criteria:
- Reputation decay works as expected
- Node4 can rejoin network after sufficient decay
- No manual intervention required (automatic recovery)
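The ~50-hour figure is straightforward arithmetic at the stated decay rate; the 0.0 post-ban starting reputation is an assumption consistent with the auto-ban behavior in section 2:

```python
DECAY_PER_HOUR = 0.01        # reputation recovered per hour (scenario text)
QUARANTINE_THRESHOLD = 0.5   # reputation needed to leave quarantine
BANNED_REPUTATION = 0.0      # assumed starting point after an auto-ban

hours = (QUARANTINE_THRESHOLD - BANNED_REPUTATION) / DECAY_PER_HOUR
print(round(hours))  # 50 hours, matching the ~50-hour estimate above
```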
4.4 Disk Full Scenario
- Setup: Node2 with limited disk quota (1 GB)
- Test:
- Fill disk with gossip entries, ledger data
- Monitor behavior as disk approaches full
- Expected Behavior:
- Node logs disk space warnings
- Gossip entries evicted (LRU) to free space
- Node does NOT crash
- Graceful degradation (may miss some gossip entries)
- Success Criteria:
- No crashes or panics
- Node continues operating with reduced capacity
- Alerts triggered in Grafana
- Operator notified to add capacity
4.5 Prometheus/Grafana Failure
- Setup: Running network with monitoring
- Test:
- Stop Prometheus server
- Continue workload for 10 minutes
- Restart Prometheus
- Success Criteria:
- ICN nodes continue operating normally (monitoring is non-critical)
- No crashes due to metrics export failures
- After Prometheus restart: metrics collection resumes
- No data loss (metrics buffered or dropped gracefully)
5. Operational Procedures Tests (1 day)
Goal: Validate operational workflows documented in deployment guide
5.1 Backup & Restore
- Test:
- Create backup using `icnctl backup create /tmp/backup.tar.gz.age`
- Verify backup contains: keystore, store, config, state.snapshot
- Corrupt Node2's data directory
- Restore using `icnctl backup restore /tmp/backup.tar.gz.age`
- Success Criteria:
- Backup completes without errors
- Restore recovers all data
- Node2 rejoins network successfully
- No data loss (ledger, trust graph, gossip subscriptions)
5.2 Version Upgrade
- Test:
- Build new version with protocol version bump
- Rolling upgrade: Node1 → Node2 → Node3 → Node4
- Verify version negotiation and compatibility
- Success Criteria:
- Nodes negotiate correct protocol version
- Backward compatibility maintained (if within compatibility window)
- No downtime for network (rolling upgrade successful)
- `icn_network_version_negotiation_success_total` increments
5.3 Security Incident Response
- Test:
- Detect Node4 compromised (simulated via intentional violations)
- Follow incident response playbook:
- Identify compromised node via Grafana alerts
- Investigate logs and violation records
- Confirm Byzantine behavior
- Verify automatic isolation (ban)
- Document incident
- Success Criteria:
- Incident detected within 1 minute (via alert)
- Compromised node automatically banned
- No manual intervention required for isolation
- Playbook steps executable and accurate
5.4 Capacity Planning
- Test:
- Monitor resource usage under load
- Calculate capacity limits:
- Max transactions/sec before latency degrades
- Max gossip topics before memory pressure
- Max concurrent compute tasks per executor
- Success Criteria:
- Baseline capacity metrics documented
- Recommendations for scaling (add nodes, increase resources)
- Grafana dashboards show capacity utilization
Monitoring Setup
Prometheus Configuration
# monitoring/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'icn-nodes'
    static_configs:
      - targets:
          - 'node1:9100'
          - 'node2:9100'
          - 'node3:9100'
          - 'node4:9100'
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        regex: '([^:]+):.*'
        replacement: '$1'
Targets use the daemon's internal metrics port 9100 (see the Docker Compose port table above).
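The relabel rule strips the port from each scrape target, so the `instance` label reads `node1` instead of `node1:9100`. The same regex can be demonstrated in Python:

```python
import re

# Same pattern as the relabel_configs regex above.
pattern = re.compile(r"([^:]+):.*")

for target in ["node1:9100", "node2:9100"]:
    print(pattern.sub(r"\1", target))
# node1
# node2
```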
Key Metrics to Monitor
Byzantine Detection:
- `icn_misbehavior_violations_total` (by violation_type, did)
- `icn_misbehavior_quarantined_peers`
- `icn_misbehavior_banned_peers`
- `icn_misbehavior_auto_bans_total`
Network Health:
- `icn_network_connections_active`
- `icn_network_messages_sent_total`
- `icn_network_messages_received_total`
- `icn_network_messages_rate_limited_total`
Gossip Performance:
- `icn_gossip_announces_sent_total`
- `icn_gossip_requests_sent_total`
- `icn_gossip_entries_total` (per topic)
- `icn_gossip_vector_clock_updates_total`
Ledger Consistency:
- `icn_ledger_entries_total`
- `icn_ledger_entries_quarantined`
- `icn_ledger_balances_total`
Compute Layer:
- `icn_compute_tasks_submitted_total`
- `icn_compute_tasks_completed_total`
- `icn_compute_task_duration_seconds`
- `icn_compute_signatures_invalid_total`
Governance:
- `icn_governance_domains_total`
- `icn_governance_proposals_total` (by state: draft, open, closed)
- `icn_governance_votes_total`
- `icn_governance_proposals_accepted_total`
- `icn_governance_proposals_rejected_total`
Alert Rules
# monitoring/alert_rules.yml
groups:
  - name: byzantine_detection
    interval: 10s
    rules:
      - alert: ByzantineNodeDetected
        expr: icn_misbehavior_quarantined_peers > 0
        for: 1m
        annotations:
          summary: "Byzantine node quarantined"
          description: "{{ $value }} nodes have been quarantined due to misbehavior"
      - alert: AutoBanTriggered
        expr: increase(icn_misbehavior_auto_bans_total[5m]) > 0
        annotations:
          summary: "Critical violation auto-ban"
          description: "A node has been auto-banned for critical violations"
      - alert: HighViolationRate
        expr: rate(icn_misbehavior_violations_total[5m]) > 1
        for: 5m
        annotations:
          summary: "High violation rate detected"
          description: "{{ $value }} violations/sec detected"
  - name: network_health
    interval: 10s
    rules:
      - alert: NetworkPartition
        expr: icn_network_connections_active < 2
        for: 2m
        annotations:
          summary: "Possible network partition"
          description: "Node has less than 2 active connections"
      - alert: HighMessageLoss
        expr: rate(icn_network_messages_failed_total[5m]) > 0.1
        for: 5m
        annotations:
          summary: "High message failure rate"
          description: "{{ $value }} messages/sec failing"
  - name: ledger_consistency
    interval: 30s
    rules:
      - alert: LedgerConflict
        expr: icn_ledger_entries_quarantined > 0
        annotations:
          summary: "Ledger entries quarantined"
          description: "{{ $value }} ledger entries in quarantine (possible fork attack)"
Test Execution Plan
Week 1: Infrastructure & Baseline Tests
Day 1-2: Environment Setup
- Create Docker Compose configuration
- Build ICN Docker image
- Set up Prometheus + Grafana
- Deploy 4-node test network
- Verify metrics collection
Day 3-4: Baseline Functionality Tests
- Network formation (1.1)
- Trust graph sync (1.2)
- Gossip propagation (1.3)
- Ledger sync (1.4)
- Compute execution (1.5)
- Governance scenarios (1.6-1.9)
- Graceful restart (1.10)
Day 5: Performance Baseline
- Gossip throughput (3.1)
- Ledger transaction volume (3.2)
- Establish performance baselines
Week 2: Byzantine Detection & Stress Tests
Day 6-8: Byzantine Behavior Tests
- Invalid signature attack (2.1)
- Replay attack detection (2.2)
- Ledger fork attack (2.3)
- Compute result forgery (2.4)
- ACL violation spam (2.5)
- Multi-node isolation (2.6)
- Governance attack scenarios (2.7-2.10)
Day 9: Performance Under Load
- Compute task queue (3.3)
- Byzantine detection under load (3.4)
- Governance load test (3.5)
Day 10: Resilience Tests
- Node crash recovery (4.1)
- Network partition (4.2)
- Byzantine node recovery (4.3)
Day 11: Operational Procedures
- Backup & restore (5.1)
- Security incident response (5.3)
- Capacity planning (5.4)
Day 12: Soak Test
- Memory leak detection (3.6)
- 24-hour stability test
- Final validation
Success Criteria
Mandatory Requirements (Blockers for Pilot)
- [ ] All 35 test scenarios pass without failures (10 baseline + 10 Byzantine + 6 performance + 5 resilience + 4 operational; governance scenarios are included in the baseline and Byzantine counts)
- [ ] No crashes or panics during any test
- [ ] Byzantine nodes detected and isolated within SLA (1 minute for critical violations)
- [ ] No false positives (honest nodes never quarantined/banned)
- [ ] Ledger consistency maintained across all nodes (no undetected forks)
- [ ] Governance voting works correctly (no vote loss, correct outcomes)
- [ ] Graceful restart preserves all critical state
- [ ] Network recovers from partitions within 2 minutes
- [ ] 24-hour soak test completes with stable memory usage
Performance Benchmarks (Targets)
- [ ] Gossip latency: median < 100ms, P99 < 500ms
- [ ] Ledger transactions: 50 tx/sec sustained
- [ ] Compute tasks: 10 tasks/min per executor
- [ ] Byzantine detection overhead: < 0.1% CPU
- [ ] Memory overhead: < 500 MB growth over 24 hours
- [ ] Network partition recovery: < 2 minutes to full convergence
Optional Goals (Nice-to-Have)
- [ ] 1000 tx/sec ledger throughput (stretch goal)
- [ ] 1-week soak test (extended stability validation)
- [ ] Chaos testing with random node failures
- [ ] Performance comparison vs. baseline (pre-Phase 18)
Test Artifacts
Required Deliverables
- Test Execution Log - Detailed results for each scenario
- Performance Report - Throughput, latency, resource usage metrics
- Bug Report - Any issues discovered with severity classification
- Grafana Screenshots - Key metrics during Byzantine attacks
- Incident Timeline - Step-by-step analysis of Byzantine detection events
- Capacity Recommendations - Resource requirements for pilot deployment
- Go/No-Go Decision - Final readiness assessment
Bug Tracking Template
## Bug Report: [Title]
**Severity**: Critical / Major / Minor
**Test Scenario**: [e.g., 2.2 Replay Attack Detection]
**Environment**: [Docker Compose / Local Processes]
**Steps to Reproduce**:
1. Start 4-node network
2. ...
**Expected Behavior**:
[What should happen]
**Actual Behavior**:
[What actually happened]
**Logs**:
[Relevant log excerpts]
**Metrics**:
[Screenshots or PromQL queries]
**Impact**:
[Pilot blocker? Workaround available?]
**Root Cause Analysis**:
[If known]
**Proposed Fix**:
[If known]
Risk Assessment
High-Risk Areas
Reputation Persistence (Known Limitation)
- Risk: Reputation reset on restart enables attackers to rejoin
- Mitigation: Phase 19 will add persistent storage; for testing, document workaround (manual ban via config)
Cross-Node Reputation Sync (Known Limitation)
- Risk: Byzantine node could exploit different reputations on different nodes
- Mitigation: Test multi-node isolation (scenario 2.6) validates independent detection
Network Partition Handling
- Risk: Ledger forks during partition may not be detected immediately
- Mitigation: Quarantine mechanism catches conflicts on partition healing
Compute Task Timeouts
- Risk: Long-running tasks may not be killed after timeout
- Mitigation: Verify timeout enforcement in scenario 3.3
Medium-Risk Areas
Gossip Convergence Time
- Risk: Large networks may have slow convergence
- Mitigation: Measure and document convergence time in scenario 1.3
Trust Graph Scalability
- Risk: Trust computation may be slow with many edges
- Mitigation: Performance test with realistic trust graph size
Metrics Export Overhead
- Risk: High-cardinality metrics may impact performance
- Mitigation: Monitor CPU usage during load tests
Timeline
Week 1: Infrastructure setup + baseline tests
Week 2: Byzantine detection + stress tests
Total Duration: 2 weeks (12 working days)
Go/No-Go Decision: End of Week 2
Next Steps
- Create Docker Compose setup - Start with Option 1 (recommended)
- Build test automation scripts - Bash scripts for each test scenario
- Set up CI/CD integration - Automated nightly test runs
- Establish baseline metrics - Run baseline tests first to set performance targets
- Execute test plan systematically - Follow day-by-day schedule
- Document all findings - Comprehensive test report
- Make Go/No-Go decision - Ready for pilot or additional hardening needed?
Status: Ready to Execute
Owner: [Assign owner]
Start Date: [TBD]
Target Completion: [Start Date + 2 weeks]