Internal Testing Plan: Multi-Node Network Validation

Status: Ready to Execute
Phase: Pre-Pilot Internal Testing
Timeline: 1-2 weeks
Prerequisites: Phase 18 Complete ✅


Objectives

Validate the ICN system in realistic multi-node scenarios before pilot deployment:

  1. Functional Correctness: All components work together as designed
  2. Byzantine Detection: Misbehavior is detected and isolated correctly
  3. Performance: System handles realistic workloads efficiently
  4. Resilience: Recovers gracefully from failures and network partitions
  5. Monitoring: Metrics and alerts provide actionable operational visibility
  6. Stability: System runs continuously without crashes or memory leaks

Test Environment Architecture

Network Topology

                  ┌─────────────┐
                  │   Metrics   │
                  │  (Prometheus│
                  │  + Grafana) │
                  └──────┬──────┘
                         │
        ┌────────────────┼────────────────┐
        │                │                │
   ┌────▼────┐      ┌────▼────┐     ┌────▼────┐
   │ Node 1  │◄────►│ Node 2  │◄───►│ Node 3  │
   │(Honest) │      │(Honest) │     │(Honest) │
   └────┬────┘      └────┬────┘     └────┬────┘
        │                │                │
        └────────────────┼────────────────┘
                         │
                    ┌────▼────┐
                    │ Node 4  │
                    │(Byzantine│
                    │ Attacker)│
                    └─────────┘

Node Configuration:

  • Nodes 1-3: Honest nodes with full trust relationships
  • Node 4: Byzantine node for attack simulation
  • Metrics Server: Centralized Prometheus + Grafana on separate host

Infrastructure Setup

Option 1: Docker Compose (Recommended for Development)

Use the production-like configuration in docker-compose.test.yml:

# Build and start
docker build -t icn:latest -f Dockerfile icn/
docker compose -f docker-compose.test.yml up -d

# Check status
docker compose -f docker-compose.test.yml ps

Port mapping summary (see docker-compose.test.yml for complete config):

Service      P2P    Metrics (host:container)   Gateway
node1        5001   9091:9100                  8081
node2        5002   9092:9100                  8082
node3        5003   9093:9100                  8083
node4        5004   9094:9100                  8084
prometheus   -      9095:9090                  -
grafana      -      -                          3000

Note: ICN daemon runs metrics on port 9100 internally.

Option 2: Local Processes (Quick Start)

# Terminal 1: Node 1
cat > /tmp/icn-node1.toml <<EOF
data_dir = "/tmp/icn-node1"
[network]
listen_addr = "127.0.0.1:5001"
[observability]
metrics_port = 9101
health_port = 18081
log_level = "info"
EOF
cargo run --release --bin icnd -- --config /tmp/icn-node1.toml

# Terminal 2: Node 2
cat > /tmp/icn-node2.toml <<EOF
data_dir = "/tmp/icn-node2"
[network]
listen_addr = "127.0.0.1:5002"
[observability]
metrics_port = 9102
health_port = 18082
log_level = "info"
EOF
cargo run --release --bin icnd -- --config /tmp/icn-node2.toml

# Terminal 3: Node 3
cat > /tmp/icn-node3.toml <<EOF
data_dir = "/tmp/icn-node3"
[network]
listen_addr = "127.0.0.1:5003"
[observability]
metrics_port = 9103
health_port = 18083
log_level = "info"
EOF
cargo run --release --bin icnd -- --config /tmp/icn-node3.toml

# Terminal 4: Node 4 (Byzantine)
cat > /tmp/icn-node4.toml <<EOF
data_dir = "/tmp/icn-node4"
[network]
listen_addr = "127.0.0.1:5004"
[observability]
metrics_port = 9104
health_port = 18084
log_level = "info"
EOF
cargo run --release --bin icnd -- --config /tmp/icn-node4.toml

# Terminal 5: Prometheus
prometheus --config.file=monitoring/prometheus.yml

# Terminal 6: Grafana
grafana-server --config monitoring/grafana.ini
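Once all six processes are up, a quick smoke test confirms the nodes are reachable. A minimal sketch, assuming each daemon serves an HTTP health endpoint at /health on its configured health_port (the path is an assumption; adjust to icnd's actual route):

```shell
# Probe each node's health port (18081-18084, matching the configs above).
# NOTE: the /health path is an assumption, not a confirmed icnd route.
for port in 18081 18082 18083 18084; do
  if curl -fsS --max-time 2 "http://127.0.0.1:${port}/health" >/dev/null 2>&1; then
    echo "node on port ${port}: healthy"
  else
    echo "node on port ${port}: unreachable"
  fi
done
```

The same loop works for the Docker Compose setup by substituting the mapped host ports.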

Option 3: Kubernetes (Production-like)

  • Deploy to local k8s cluster (minikube/kind)
  • 4 ICN pods with persistent volumes
  • Prometheus operator for metrics
  • Grafana for dashboards

Test Scenarios

1. Baseline Functionality Tests (2 days)

Goal: Verify all components work correctly in normal operation

1.1 Network Formation

  • Setup: Start all 4 nodes sequentially
  • Test:
    • Nodes discover each other via mDNS
    • QUIC/TLS connections established
    • DID-TLS binding verified
    • X25519 key exchange completed
  • Success Criteria:
    • All nodes see 3 peers in icnctl network peers
    • No connection errors in logs
    • icn_network_connections_active = 3 per node
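Metric-based criteria like this one can be checked straight from a node's Prometheus text endpoint. A small sketch (the printf sample stands in for a real `curl -s http://node1:9100/metrics`):

```shell
# Extract a gauge value from Prometheus text exposition format.
metric_value() { awk -v m="$1" '$1 == m { print $2 }'; }

# Demo with a sample exposition line; pipe curl output in instead for real checks.
printf 'icn_network_connections_active 3\n' | metric_value icn_network_connections_active
```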

1.2 Trust Graph Sync

  • Setup: Node1 sets trust edges
  • Test:
    • Node1: Trust Node2=0.8, Node3=0.7, Node4=0.3
    • Wait for gossip propagation (30s)
    • Query trust from other nodes
  • Success Criteria:
    • All nodes have consistent trust graph
    • Trust class calculations correct (Partner, Federated, Isolated)
    • icn_trust_edges_total matches across nodes

1.3 Gossip Message Propagation

  • Setup: Nodes 1-3 subscribed to topic "test:messages"
  • Test:
    • Node1 publishes 100 messages to "test:messages"
    • Measure time to convergence
  • Success Criteria:
    • All nodes receive all 100 messages within 5 seconds
    • Vector clocks show correct causal ordering
    • No duplicate message processing
    • icn_gossip_announces_received_total = 100 on nodes 2-3
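Convergence criteria like "all messages within 5 seconds" are easiest to script with a generic polling helper. A sketch (wait_for is a test-harness utility, not part of icnctl):

```shell
# Poll a command until its output equals the expected value, or the timeout
# (in seconds) elapses; prints "converged" or "timeout".
wait_for() {
  local expected=$1 timeout=$2; shift 2
  local start
  start=$(date +%s)
  while :; do
    [ "$("$@")" = "$expected" ] && { echo "converged"; return 0; }
    [ $(( $(date +%s) - start )) -ge "$timeout" ] && { echo "timeout"; return 1; }
    sleep 1
  done
}

# Demo: a command that already returns the target converges immediately.
# In the real test, replace `echo 100` with a query for the receive counter.
wait_for 100 5 echo 100
```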

1.4 Ledger Transaction Sync

  • Setup: Initialize ledgers on all nodes
  • Test:
    • Node1 → Node2: Transfer 50 credits
    • Node2 → Node3: Transfer 30 credits
    • Node3 → Node1: Transfer 20 credits
    • Wait for gossip sync (60s)
  • Success Criteria:
    • All nodes have identical ledger state
    • Balances correct: Node1=-30, Node2=+20, Node3=+10
    • No quarantined entries
    • icn_ledger_entries_total = 3 on all nodes

1.5 Compute Task Execution

  • Setup: Node1 submits task, Node2 configured as executor
  • Test:
    • Submit CCL contract: rule example() { return 42; }
    • Node2 claims and executes
    • Result propagated via gossip
  • Success Criteria:
    • Task completes within 10 seconds
    • Result verified: output = 42
    • Payment settled: Node1 → Node2
    • icn_compute_tasks_completed_total = 1

1.6 Governance Domain Creation & Sync

  • Setup: All nodes running
  • Test:
    • Node1 creates governance domain "test-coop" with members: Node1, Node2, Node3
    • Wait for gossip propagation (30s)
    • Query domain from all nodes
  • Success Criteria:
    • All nodes see the same domain configuration
    • Membership list correct (3 members)
    • Governance profile = cooperative_default (1-member-1-vote)
    • Domain created event in gossip logs

1.7 Proposal Lifecycle (Simple Majority)

  • Setup: Governance domain "test-coop" with 3 members
  • Test:
    • Node1 creates text proposal: "Should we upgrade to Protocol v2?"
    • Node1 opens proposal for voting
    • Node1 votes: For
    • Node2 votes: For
    • Node3 votes: Against
    • Node1 closes proposal
    • Check outcome
  • Success Criteria:
    • Proposal created and synced to all nodes
    • All votes recorded (3 total)
    • Tally: 2 For, 1 Against, 0 Abstain
    • Outcome: Accepted (66% approval > 50% threshold)
    • Proposal state transitions: Draft → Open → Voting → Closed
    • All events propagated via governance gossip
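The expected tally arithmetic can be sanity-checked ahead of time. A small sketch (truncating percentages, which reproduces the 66% figure above):

```shell
# approval = For / (For + Against); turnout = all votes / members.
# Percentages are truncated toward zero, so 2/3 reports as 66%.
tally() {
  awk -v f="$1" -v a="$2" -v ab="$3" -v m="$4" 'BEGIN {
    printf "approval=%d%% turnout=%d%%\n",
      int(100 * f / (f + a)), int(100 * (f + a + ab) / m)
  }'
}

tally 2 1 0 3   # scenario 1.7: 2 For, 1 Against, 0 Abstain, 3 members
```

The same helper reproduces the 66% turnout in scenario 1.8: `tally 1 1 0 3`.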

1.8 Proposal Lifecycle (Quorum Failure)

  • Setup: Governance domain "test-coop" with 3 members, quorum = 100%
  • Test:
    • Node1 creates budget proposal
    • Node1 opens proposal
    • Only Node1 and Node2 vote (2/3 = 66% turnout)
    • Node1 closes proposal
  • Success Criteria:
    • Tally recorded correctly
    • Outcome: Rejected (failed quorum requirement)
    • Rejection reason: "quorum not met (66% < 100%)"
    • All nodes see consistent outcome

1.9 Governance WebSocket Events

  • Setup: Gateway running, WebSocket client connected
  • Test:
    • Create domain, proposal, cast votes
    • Monitor WebSocket for events
  • Success Criteria:
    • Client receives: GovernanceDomainCreated event
    • Client receives: GovernanceProposalCreated event
    • Client receives: GovernanceProposalOpened event
    • Client receives: GovernanceVoteCast events (3 total)
    • Client receives: GovernanceProposalClosed event
    • All events have correct timestamps and payload

1.10 Graceful Restart

  • Setup: Nodes running with active workload
  • Test:
    • Send SIGTERM to Node2
    • Wait for graceful shutdown
    • Restart Node2
    • Resume workload
  • Success Criteria:
    • State snapshot saved (vector clocks, subscriptions, X25519 keys)
    • Node2 rejoins network within 30 seconds
    • No message loss or duplicates
    • icn_snapshot_save_duration_seconds < 0.1s

2. Byzantine Behavior Detection Tests (3 days)

Goal: Verify misbehavior is detected and isolated correctly

2.1 Invalid Signature Attack

  • Setup: Node4 (Byzantine) attempts to forge signatures
  • Test:
    • Modify Node4 to send messages with invalid Ed25519 signatures
    • Send 5 forged messages to Node1
  • Expected Behavior:
    • Node1 detects InvalidSignature violations (5 total)
    • Node1's misbehavior detector records violations
    • Node4's reputation drops (1.0 → 0.75 after 5 violations)
    • Node4 NOT quarantined yet (threshold = 0.5)
  • Success Criteria:
    • icn_misbehavior_violations_total{violation_type="InvalidSignature"} = 5
    • Node1 reputation for Node4 = 0.75 ± 0.01
    • Grafana panel shows violations
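The reputation figure implies a fixed 0.05 penalty per low-severity violation (1.0 → 0.75 after 5). Assuming that linear model holds, the expected reputation after n such violations:

```shell
# Expected reputation after n low-severity violations, assuming a linear
# 0.05 penalty each (implied by 1.0 -> 0.75 after 5), floored at 0.
rep_after() {
  awk -v n="$1" 'BEGIN { r = 1.0 - 0.05 * n; if (r < 0) r = 0; printf "%.2f\n", r }'
}

rep_after 5
```

Note that `rep_after 10` yields 0.50, the quarantine threshold, which is consistent with scenario 2.5's "quarantined after 10 violations".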

2.2 Replay Attack Detection

  • Setup: Node4 attempts to replay captured messages
  • Test:
    • Capture signed message from Node2
    • Node4 replays same message 3 times to Node1
  • Expected Behavior:
    • First message accepted (valid)
    • Subsequent replays detected by sequence number tracking
    • ReplayAttack violation recorded (severity 10, auto-ban)
    • Node4 immediately banned (reputation → 0.0)
  • Success Criteria:
    • icn_misbehavior_violations_total{violation_type="ReplayAttack"} ≥ 1
    • icn_misbehavior_banned_peers = 1
    • icn_misbehavior_auto_bans_total = 1
    • Node4 isolated from network (no further messages accepted)

2.3 Ledger Fork Attack

  • Setup: Node4 attempts double-spending
  • Test:
    • Node4 creates two conflicting ledger entries with same parent
    • Entry A: Transfer 100 credits to Node1
    • Entry B: Transfer 100 credits to Node2 (conflicting)
    • Gossip both entries to network
  • Expected Behavior:
    • First entry accepted by honest nodes
    • Second entry detected as conflict
    • ConflictingLedgerEntries violation (severity 10, auto-ban)
    • Conflicting entry quarantined
    • Node4 auto-banned
  • Success Criteria:
    • icn_ledger_entries_quarantined = 1
    • icn_misbehavior_violations_total{violation_type="ConflictingLedgerEntries"} = 1
    • Node4 banned on all honest nodes
    • Ledger state consistent across Node1-3

2.4 Compute Result Forgery

  • Setup: Node4 claims task but returns forged result
  • Test:
    • Node1 submits task with known result (e.g., hash computation)
    • Node4 claims task, returns incorrect result with invalid signature
    • Node1 verifies result
  • Expected Behavior:
    • Signature verification fails
    • FailedComputeVerification violation recorded (severity 5)
    • Node4's reputation decreases
    • No payment issued (verification failed)
  • Success Criteria:
    • icn_compute_signatures_invalid_total = 1
    • icn_misbehavior_violations_total{violation_type="FailedComputeVerification"} = 1
    • Node4 reputation reduced
    • Task remains in "failed" state

2.5 ACL Violation Spam

  • Setup: Node4 attempts rapid unauthorized subscriptions
  • Test:
    • Node1 creates private topic (TrustClass::Partner, requires trust > 0.9)
    • Node4 (trust 0.3) attempts 15 subscription requests in 10 seconds
  • Expected Behavior:
    • All subscription attempts rejected (ACL violation)
    • 15 violations recorded
    • After 10 violations in 1 hour: Node4 quarantined
    • Node4's trust reduced via trust penalty callback
  • Success Criteria:
    • icn_misbehavior_violations_total = 15
    • icn_misbehavior_quarantined_peers = 1
    • Node4 quarantined (reputation < 0.5)
    • Grafana shows rate-limit quarantine event

2.6 Multi-Node Byzantine Isolation

  • Setup: Node4 sends conflicting statements to different nodes
  • Test:
    • Node4 → Node1: "Balance(Alice) = 100"
    • Node4 → Node2: "Balance(Alice) = 200" (conflicting)
    • Nodes gossip received statements
  • Expected Behavior:
    • Both Node1 and Node2 independently detect conflict
    • ConflictingSignedStatements violation (severity 10, auto-ban)
    • Node4 banned on both nodes
    • Node3 learns of ban via reputation gossip (future Phase 19)
  • Success Criteria:
    • Node1 and Node2 both ban Node4 independently
    • icn_misbehavior_auto_bans_total ≥ 2 (across network)
    • Node4 isolated from all honest nodes

2.7 Governance Vote Manipulation

  • Setup: Governance domain with Node1, Node2, Node3; Node4 NOT a member
  • Test:
    • Node1 creates proposal
    • Node1 opens proposal
    • Node4 attempts to vote (not a member)
  • Expected Behavior:
    • Node4's vote rejected (not in membership list)
    • Vote not recorded in tally
    • Potential violation recorded (attempted unauthorized action)
  • Success Criteria:
    • Vote count remains 0
    • Node4's vote not in proposal.votes map
    • Error logged: "unauthorized voter"
    • Proposal outcome unaffected

2.8 Governance Double Voting Attack

  • Setup: Governance domain with Node1, Node2, Node3
  • Test:
    • Node1 creates proposal
    • Node1 opens proposal
    • Node2 votes: For
    • Node2 attempts to vote again: Against (double vote)
  • Expected Behavior:
    • First vote accepted and recorded
    • Second vote rejected (already voted)
    • Tally shows only 1 vote from Node2
    • Warning logged: "double vote attempt"
  • Success Criteria:
    • Vote count = 1 for Node2 (not 2)
    • Tally: 1 For, 0 Against (first vote wins)
    • Double vote attempt logged
    • No violation recorded (benign error, could be network duplicate)

2.9 Governance Proposal Spam

  • Setup: Node4 (low trust) creates governance domain
  • Test:
    • Node4 creates 50 proposals in 60 seconds
    • All proposals in same domain
  • Expected Behavior:
    • Proposals accepted (governance has no built-in rate limit yet)
    • Gossip propagates all proposals
    • Note: This validates current behavior; Phase 19 may add rate limits
  • Success Criteria:
    • 50 proposals created successfully
    • No crashes or out-of-memory errors
    • Gossip convergence time measured (should be < 2 minutes)
    • Resource usage within acceptable bounds

2.10 Governance Conflicting Outcomes

  • Setup: Network partition scenario with governance
  • Test:
    • Create governance domain with Node1, Node2, Node3
    • Create proposal
    • Partition: [Node1, Node2] vs [Node3]
    • Node1 and Node2 vote: For (2/3 = 66%)
    • Node1 closes proposal (sees Accepted with 2/3 votes)
    • Node3 (in partition) votes: Against
    • Heal partition
    • Both sides have different tallies
  • Expected Behavior:
    • Before healing: Both partitions see different state
    • After healing: Gossip reconciles votes
    • Final tally: 2 For, 1 Against
    • Outcome recalculated if needed (may require manual review)
  • Success Criteria:
    • All votes eventually recorded (3 total after healing)
    • Conflicting outcomes detected (if any)
    • Operator alerted to review proposal
    • Note: This is a known edge case; Phase 19 may add partition-aware voting

3. Performance & Load Tests (2 days)

Goal: Validate system performance under realistic and stress conditions

3.1 Gossip Throughput

  • Setup: 3-node network
  • Test:
    • Node1 publishes 1000 messages/sec to 5 topics
    • Measure propagation latency and throughput
    • Run for 10 minutes
  • Success Criteria:
    • Median latency < 100ms
    • P99 latency < 500ms
    • No message loss
    • CPU usage < 50% per node
    • Memory growth < 100 MB over 10 minutes

3.2 Ledger Transaction Volume

  • Setup: 3-node network
  • Test:
    • Simulate 100 concurrent users making transactions
    • 50 transactions/sec sustained for 5 minutes
    • Random transaction amounts and participants
  • Success Criteria:
    • All transactions processed without conflicts
    • Ledger convergence within 60 seconds
    • No quarantined entries (all valid)
    • icn_ledger_entries_total = 15,000 (50 tx/s × 300 s)
    • Balances sum to zero (double-entry invariant)
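The double-entry invariant is easy to assert mechanically. A sketch, assuming balances can be dumped as `<node> <balance>` pairs (the exact dump format is an assumption; adapt the awk field to your actual output):

```shell
# Sum the balance column; report FAIL and exit nonzero unless it is exactly zero.
check_zero_sum() {
  awk '{ sum += $2 } END { if (sum == 0) { print "OK: balances sum to zero" }
                           else { print "FAIL: sum=" sum; exit 1 } }'
}

# Demo with the expected balances from scenario 1.4:
printf 'node1 -30\nnode2 20\nnode3 10\n' | check_zero_sum
```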

3.3 Compute Task Queue

  • Setup: 1 submitter (Node1), 2 executors (Node2, Node3)
  • Test:
    • Submit 500 compute tasks with varying fuel limits
    • Tasks include: math operations, string parsing, conditional logic
    • Measure task completion rate and latency
  • Success Criteria:
    • All 500 tasks complete successfully
    • Median completion time < 5 seconds
    • Tasks distributed evenly (Node2 ≈ 250, Node3 ≈ 250)
    • No task timeouts or executor crashes
    • All payments settled correctly

3.4 Byzantine Detection Under Load

  • Setup: 3 honest nodes + 1 Byzantine node
  • Test:
    • Normal workload: 100 tx/sec + 50 compute tasks/min
    • Byzantine workload: 10 violations/sec (mixed types)
    • Run for 30 minutes
  • Success Criteria:
    • Byzantine node quarantined within 1 minute
    • Byzantine node auto-banned after critical violation
    • Honest nodes maintain throughput (< 10% degradation)
    • No false positives (honest nodes not flagged)
    • icn_misbehavior_violations_total > 600 (10 violations/s × 60 s, i.e. the first minute before Node4 is isolated)

3.5 Governance Load Test

  • Setup: 3-node network with 1 governance domain (3 members)
  • Test:
    • Create 100 proposals concurrently
    • Each node opens 33-34 proposals
    • All 3 nodes vote on all proposals (300 votes total)
    • Close all proposals
    • Measure convergence time
  • Success Criteria:
    • All 100 proposals created successfully
    • All 300 votes recorded correctly
    • All proposals reach consistent outcome across nodes
    • Convergence time < 5 minutes
    • No vote loss or duplication
    • Gossip overhead acceptable (< 50% CPU)

3.6 Memory Leak Detection

  • Setup: 4-node network with continuous workload
  • Test:
    • Run for 24 hours with:
      • 10 tx/sec ledger transactions
      • 5 compute tasks/min
      • 100 gossip messages/sec
    • Monitor memory usage every hour
  • Success Criteria:
    • Memory growth < 500 MB over 24 hours
    • No unbounded growth (linear or exponential)
    • Resident set size (RSS) stable after initial ramp-up
    • No out-of-memory crashes
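Hourly RSS sampling needs no special tooling; ps suffices. A minimal sketch (append each sample to a log file from cron or a sleep loop):

```shell
# Report resident set size (KB) for a PID; `ps -o rss=` works on Linux and macOS.
sample_rss() {
  ps -o rss= -p "$1" | tr -d ' '
}

# Demo: sample the current shell's own RSS (value varies by system).
sample_rss $$
```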

4. Resilience & Fault Tolerance Tests (2 days)

Goal: Verify system recovers gracefully from failures

4.1 Node Crash Recovery

  • Setup: 4-node network with active workload
  • Test:
    • Kill Node2 with SIGKILL (unclean shutdown)
    • Wait 2 minutes
    • Restart Node2
    • Verify recovery
  • Success Criteria:
    • Node2 rejoins network within 60 seconds
    • Gossip anti-entropy fetches missed messages
    • Ledger state restored via sync
    • No data loss or corruption
    • Workload resumes normally

4.2 Network Partition

  • Setup: 4-node network split into 2 partitions
  • Test:
    • Partition 1: Node1, Node2
    • Partition 2: Node3, Node4
    • Block traffic between partitions for 5 minutes
    • Heal partition
    • Measure convergence time
  • Success Criteria:
    • Nodes detect partition via heartbeat timeouts
    • Both partitions continue operating independently
    • After healing: gossip anti-entropy reconciles state
    • Ledger conflicts detected and quarantined (if any)
    • Full convergence within 2 minutes of healing
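Partitions can be induced with iptables on bare metal, or `docker network disconnect` under Compose. A hedged sketch with a dry-run mode for review; the node IPs are placeholders:

```shell
# Create/heal a partition by dropping traffic from a peer's IP.
# DRY_RUN=1 prints the commands instead of executing them (no sudo needed).
run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "$@"; else sudo "$@"; fi; }

partition_from() { run iptables -A INPUT -s "$1" -j DROP; }
heal_from()      { run iptables -D INPUT -s "$1" -j DROP; }

# Demo (dry run): isolate a placeholder Node3 address, then heal.
DRY_RUN=1 partition_from 10.0.0.3
DRY_RUN=1 heal_from 10.0.0.3
```

Under Docker Compose, `docker network disconnect <network> node3` followed by `docker network connect` achieves the same without touching iptables.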

4.3 Byzantine Node Recovery

  • Setup: Node4 quarantined due to violations
  • Test:
    • Stop Node4
    • Upgrade Node4 to honest behavior
    • Restart Node4
    • Wait for reputation decay
  • Expected Behavior:
    • Node4 starts with quarantined reputation (loaded from snapshot - Phase 19)
    • Reputation decays at 0.01 points/hour
    • After ~50 hours: reputation > 0.5 (out of quarantine)
    • Node4 regains network privileges gradually
  • Success Criteria:
    • Reputation decay works as expected
    • Node4 can rejoin network after sufficient decay
    • No manual intervention required (automatic recovery)
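The ~50-hour figure follows directly from the decay rate: linear recovery at 0.01 points/hour up to the 0.5 quarantine-exit threshold, as stated above. A sketch:

```shell
# Hours until reputation climbs from r back above the 0.5 quarantine threshold,
# at 0.01 points/hour (i.e. 100 hours per reputation point).
hours_to_exit_quarantine() {
  awk -v r="$1" 'BEGIN { printf "%.0f\n", (0.5 - r) * 100 }'
}

hours_to_exit_quarantine 0.0   # banned node starting from 0.0
```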

4.4 Disk Full Scenario

  • Setup: Node2 with limited disk quota (1 GB)
  • Test:
    • Fill disk with gossip entries, ledger data
    • Monitor behavior as disk approaches full
  • Expected Behavior:
    • Node logs disk space warnings
    • Gossip entries evicted (LRU) to free space
    • Node does NOT crash
    • Graceful degradation (may miss some gossip entries)
  • Success Criteria:
    • No crashes or panics
    • Node continues operating with reduced capacity
    • Alerts triggered in Grafana
    • Operator notified to add capacity

4.5 Prometheus/Grafana Failure

  • Setup: Running network with monitoring
  • Test:
    • Stop Prometheus server
    • Continue workload for 10 minutes
    • Restart Prometheus
  • Success Criteria:
    • ICN nodes continue operating normally (monitoring is non-critical)
    • No crashes due to metrics export failures
    • After Prometheus restart: metrics collection resumes
    • No data loss (metrics buffered or dropped gracefully)

5. Operational Procedures Tests (1 day)

Goal: Validate operational workflows documented in deployment guide

5.1 Backup & Restore

  • Test:
    • Create backup using icnctl backup create /tmp/backup.tar.gz.age
    • Verify backup contains: keystore, store, config, state.snapshot
    • Corrupt Node2's data directory
    • Restore using icnctl backup restore /tmp/backup.tar.gz.age
  • Success Criteria:
    • Backup completes without errors
    • Restore recovers all data
    • Node2 rejoins network successfully
    • No data loss (ledger, trust graph, gossip subscriptions)

5.2 Version Upgrade

  • Test:
    • Build new version with protocol version bump
    • Rolling upgrade: Node1 → Node2 → Node3 → Node4
    • Verify version negotiation and compatibility
  • Success Criteria:
    • Nodes negotiate correct protocol version
    • Backward compatibility maintained (if within compatibility window)
    • No downtime for network (rolling upgrade successful)
    • icn_network_version_negotiation_success_total increments

5.3 Security Incident Response

  • Test:
    • Detect Node4 compromised (simulated via intentional violations)
    • Follow incident response playbook:
      1. Identify compromised node via Grafana alerts
      2. Investigate logs and violation records
      3. Confirm Byzantine behavior
      4. Verify automatic isolation (ban)
      5. Document incident
  • Success Criteria:
    • Incident detected within 1 minute (via alert)
    • Compromised node automatically banned
    • No manual intervention required for isolation
    • Playbook steps executable and accurate

5.4 Capacity Planning

  • Test:
    • Monitor resource usage under load
    • Calculate capacity limits:
      • Max transactions/sec before latency degrades
      • Max gossip topics before memory pressure
      • Max concurrent compute tasks per executor
  • Success Criteria:
    • Baseline capacity metrics documented
    • Recommendations for scaling (add nodes, increase resources)
    • Grafana dashboards show capacity utilization

Monitoring Setup

Prometheus Configuration

# monitoring/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'icn-nodes'
    static_configs:
      - targets:
          - 'node1:9100'
          - 'node2:9100'
          - 'node3:9100'
          - 'node4:9100'
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        regex: '([^:]+):.*'
        replacement: '$1'
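The relabel rule keeps only the host portion of __address__ as the instance label. The same extraction can be sanity-checked in shell:

```shell
# '([^:]+):.*' captures everything before the first colon; replacement '$1'
# keeps just the host, so instance becomes e.g. "node1" instead of "node1:9100".
echo 'node1:9100' | sed -E 's/^([^:]+):.*$/\1/'
```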

Key Metrics to Monitor

Byzantine Detection:

  • icn_misbehavior_violations_total (by violation_type, did)
  • icn_misbehavior_quarantined_peers
  • icn_misbehavior_banned_peers
  • icn_misbehavior_auto_bans_total

Network Health:

  • icn_network_connections_active
  • icn_network_messages_sent_total
  • icn_network_messages_received_total
  • icn_network_messages_rate_limited_total

Gossip Performance:

  • icn_gossip_announces_sent_total
  • icn_gossip_requests_sent_total
  • icn_gossip_entries_total (per topic)
  • icn_gossip_vector_clock_updates_total

Ledger Consistency:

  • icn_ledger_entries_total
  • icn_ledger_entries_quarantined
  • icn_ledger_balances_total

Compute Layer:

  • icn_compute_tasks_submitted_total
  • icn_compute_tasks_completed_total
  • icn_compute_task_duration_seconds
  • icn_compute_signatures_invalid_total

Governance:

  • icn_governance_domains_total
  • icn_governance_proposals_total (by state: draft, open, closed)
  • icn_governance_votes_total
  • icn_governance_proposals_accepted_total
  • icn_governance_proposals_rejected_total

Alert Rules

# monitoring/alert_rules.yml
groups:
  - name: byzantine_detection
    interval: 10s
    rules:
      - alert: ByzantineNodeDetected
        expr: icn_misbehavior_quarantined_peers > 0
        for: 1m
        annotations:
          summary: "Byzantine node quarantined"
          description: "{{ $value }} nodes have been quarantined due to misbehavior"

      - alert: AutoBanTriggered
        expr: increase(icn_misbehavior_auto_bans_total[5m]) > 0
        annotations:
          summary: "Critical violation auto-ban"
          description: "A node has been auto-banned for critical violations"

      - alert: HighViolationRate
        expr: rate(icn_misbehavior_violations_total[5m]) > 1
        for: 5m
        annotations:
          summary: "High violation rate detected"
          description: "{{ $value }} violations/sec detected"

  - name: network_health
    interval: 10s
    rules:
      - alert: NetworkPartition
        expr: icn_network_connections_active < 2
        for: 2m
        annotations:
          summary: "Possible network partition"
          description: "Node has less than 2 active connections"

      - alert: HighMessageLoss
        expr: rate(icn_network_messages_failed_total[5m]) > 0.1
        for: 5m
        annotations:
          summary: "High message failure rate"
          description: "{{ $value }} messages/sec failing"

  - name: ledger_consistency
    interval: 30s
    rules:
      - alert: LedgerConflict
        expr: icn_ledger_entries_quarantined > 0
        annotations:
          summary: "Ledger entries quarantined"
          description: "{{ $value }} ledger entries in quarantine (possible fork attack)"

Test Execution Plan

Week 1: Infrastructure & Baseline Tests

Day 1-2: Environment Setup

  • Create Docker Compose configuration
  • Build ICN Docker image
  • Set up Prometheus + Grafana
  • Deploy 4-node test network
  • Verify metrics collection

Day 3-4: Baseline Functionality Tests

  • Network formation (1.1)
  • Trust graph sync (1.2)
  • Gossip propagation (1.3)
  • Ledger sync (1.4)
  • Compute execution (1.5)
  • Governance scenarios (1.6-1.9)
  • Graceful restart (1.10)

Day 5: Performance Baseline

  • Gossip throughput (3.1)
  • Ledger transaction volume (3.2)
  • Establish performance baselines

Week 2: Byzantine Detection & Stress Tests

Day 6-8: Byzantine Behavior Tests

  • Invalid signature attack (2.1)
  • Replay attack detection (2.2)
  • Ledger fork attack (2.3)
  • Compute result forgery (2.4)
  • ACL violation spam (2.5)
  • Multi-node isolation (2.6)
  • Governance attack scenarios (2.7-2.10)

Day 9: Performance Under Load

  • Compute task queue (3.3)
  • Byzantine detection under load (3.4)
  • Governance load test (3.5)

Day 10: Resilience Tests

  • Node crash recovery (4.1)
  • Network partition (4.2)
  • Byzantine node recovery (4.3)
  • Disk full scenario (4.4)
  • Prometheus/Grafana failure (4.5)

Day 11: Operational Procedures

  • Backup & restore (5.1)
  • Version upgrade (5.2)
  • Security incident response (5.3)
  • Capacity planning (5.4)

Day 12: Soak Test

  • Memory leak detection (3.6)
  • 24-hour stability test
  • Final validation

Success Criteria

Mandatory Requirements (Blockers for Pilot)

  • [ ] All 35 test scenarios pass without failures (10 baseline + 10 Byzantine + 6 performance + 5 resilience + 4 operational)
  • [ ] No crashes or panics during any test
  • [ ] Byzantine nodes detected and isolated within SLA (1 minute for critical violations)
  • [ ] No false positives (honest nodes never quarantined/banned)
  • [ ] Ledger consistency maintained across all nodes (no undetected forks)
  • [ ] Governance voting works correctly (no vote loss, correct outcomes)
  • [ ] Graceful restart preserves all critical state
  • [ ] Network recovers from partitions within 2 minutes
  • [ ] 24-hour soak test completes with stable memory usage

Performance Benchmarks (Targets)

  • [ ] Gossip latency: median < 100ms, P99 < 500ms
  • [ ] Ledger transactions: 50 tx/sec sustained
  • [ ] Compute tasks: 10 tasks/min per executor
  • [ ] Byzantine detection overhead: < 0.1% CPU
  • [ ] Memory overhead: < 500 MB growth over 24 hours
  • [ ] Network partition recovery: < 2 minutes to full convergence

Optional Goals (Nice-to-Have)

  • [ ] 1000 tx/sec ledger throughput (stretch goal)
  • [ ] 1-week soak test (extended stability validation)
  • [ ] Chaos testing with random node failures
  • [ ] Performance comparison vs. baseline (pre-Phase 18)

Test Artifacts

Required Deliverables

  1. Test Execution Log - Detailed results for each scenario
  2. Performance Report - Throughput, latency, resource usage metrics
  3. Bug Report - Any issues discovered with severity classification
  4. Grafana Screenshots - Key metrics during Byzantine attacks
  5. Incident Timeline - Step-by-step analysis of Byzantine detection events
  6. Capacity Recommendations - Resource requirements for pilot deployment
  7. Go/No-Go Decision - Final readiness assessment

Bug Tracking Template

## Bug Report: [Title]

**Severity**: Critical / Major / Minor
**Test Scenario**: [e.g., 2.2 Replay Attack Detection]
**Environment**: [Docker Compose / Local Processes]

**Steps to Reproduce**:
1. Start 4-node network
2. ...

**Expected Behavior**:
[What should happen]

**Actual Behavior**:
[What actually happened]

**Logs**:

[Relevant log excerpts]


**Metrics**:
[Screenshots or PromQL queries]

**Impact**:
[Pilot blocker? Workaround available?]

**Root Cause Analysis**:
[If known]

**Proposed Fix**:
[If known]

Risk Assessment

High-Risk Areas

  1. Reputation Persistence (Known Limitation)

    • Risk: Reputation reset on restart enables attackers to rejoin
    • Mitigation: Phase 19 will add persistent storage; for testing, document workaround (manual ban via config)
  2. Cross-Node Reputation Sync (Known Limitation)

    • Risk: Byzantine node could exploit different reputations on different nodes
    • Mitigation: Test multi-node isolation (scenario 2.6) validates independent detection
  3. Network Partition Handling

    • Risk: Ledger forks during partition may not be detected immediately
    • Mitigation: Quarantine mechanism catches conflicts on partition healing
  4. Compute Task Timeouts

    • Risk: Long-running tasks may not be killed after timeout
    • Mitigation: Verify timeout enforcement in scenario 3.3

Medium-Risk Areas

  1. Gossip Convergence Time

    • Risk: Large networks may have slow convergence
    • Mitigation: Measure and document convergence time in scenario 1.3
  2. Trust Graph Scalability

    • Risk: Trust computation may be slow with many edges
    • Mitigation: Performance test with realistic trust graph size
  3. Metrics Export Overhead

    • Risk: High-cardinality metrics may impact performance
    • Mitigation: Monitor CPU usage during load tests

Timeline

Week 1: Infrastructure setup + baseline tests
Week 2: Byzantine detection + stress tests

Total Duration: 2 weeks (12 working days)

Go/No-Go Decision: End of Week 2


Next Steps

  1. Create Docker Compose setup - Start with Option 1 (recommended)
  2. Build test automation scripts - Bash scripts for each test scenario
  3. Set up CI/CD integration - Automated nightly test runs
  4. Establish baseline metrics - Run baseline tests first to set performance targets
  5. Execute test plan systematically - Follow day-by-day schedule
  6. Document all findings - Comprehensive test report
  7. Make Go/No-Go decision - Ready for pilot or additional hardening needed?

Status: Ready to Execute
Owner: [Assign owner]
Start Date: [TBD]
Target Completion: [Start Date + 2 weeks]