Phase 10B Bug Fixes - Contract Deployment Integration
Date: 2025-01-12 Phase: Phase 10B (Continuation) Status: ✅ Complete Test Results: 180 tests passing (177 unit + 3 integration)
Overview
After completing Phase 10B (RPC integration and CLI tooling), integration testing revealed multiple critical bugs in contract deployment, gossip protocol, and network connectivity. This document details the systematic investigation and resolution of 7 bugs that were blocking contract deployment functionality.
Bugs Fixed
Bug #1: TLS Certificate Verification Panic ⚠️ CRITICAL
File: icn-net/src/tls.rs:233
Symptom: Integration tests failing with "Cannot block the current thread from within a runtime"
Root Cause: Using blocking_read() inside async tokio runtime during TLS handshake
Context: The TLS certificate verifier runs in a synchronous rustls callback, but when called from within a Tokio runtime, blocking_read() panics because it would deadlock.
Fix:
// BEFORE
let trust_score = {
let graph = self.trust_graph.blocking_read();
graph.compute_trust_score(&peer_did).unwrap_or(0.0)
};
// AFTER
let trust_score = {
match self.trust_graph.try_read() {
Ok(graph) => graph.compute_trust_score(&peer_did).unwrap_or(0.0),
Err(_) => {
warn!("Trust graph lock unavailable during TLS verification for {}, using default trust score 0.0", did_str);
0.0
}
}
};
Impact:
- Integration tests can now complete TLS handshakes without panicking
- Safe degradation if lock is contended (falls back to 0.0 trust)
- Maintains security by letting
min_trust_thresholddetermine outcome
Commit: 204fca1
Bug #2: Self-Deployment Authorization Failure
Files:
icn-ccl/src/actor.rs:67-94(deploy_contract)icn-ccl/src/actor.rs:194-214(handle_deployment_message)
Symptom: Nodes couldn't deploy their own contracts
Error: Deployer did:icn:... has insufficient trust: 0.00 < 0.40
Root Cause: Trust checks were applied to self without exception. A node computing trust score for itself returns 0.0 (no self-edge in graph).
Fix:
// Check deployer authorization via trust graph
// Skip trust check if deploying locally (deployer is self)
if installation.installed_by == self.did {
debug!("Local deployment by {}, skipping trust check", self.did);
} else if let Some(ref trust_graph) = self.trust_graph {
let trust_score = {
let graph = trust_graph.read().await;
graph.compute_trust_score(&installation.installed_by).unwrap_or(0.0)
};
if trust_score < MIN_DEPLOYER_TRUST {
bail!("Deployer {} has insufficient trust: {:.2} < {:.2}",
installation.installed_by, trust_score, MIN_DEPLOYER_TRUST);
}
}
Rationale: Self-trust is implicit - a node should always be able to deploy contracts locally. Explicit trust edges are only needed for remote deployments.
Impact: Nodes can now successfully deploy contracts without requiring self-trust edges
Commit: 204fca1
Bug #3: Code Hash Computation Mismatch
File: icnctl/src/main.rs:906-915
Symptom: Signature verification failing with "Verification equation was not satisfied"
Root Cause: Integration tests, icnctl, and ContractActor used different hash formulas
Formula Variations:
- ContractActor:
SHA-256(name || format!("{:?}", participant) for each participant) - Integration tests:
SHA-256(name || participant.as_str() for each participant || rules) - icnctl: Different formula entirely
Fix: Standardized all code to use ContractActor formula:
// Compute code hash (must match ContractActor::compute_code_hash)
let code_hash = {
use sha2::{Sha256, Digest};
let mut hasher = Sha256::new();
hasher.update(contract.name.as_bytes());
for participant in &contract.participants {
hasher.update(format!("{:?}", participant).as_bytes());
}
icn_ledger::ContentHash::from_bytes(hasher.finalize().into())
};
Key Decision: Use format!("{:?}", did) instead of did.as_str() for consistent serialization. Rules are NOT included in hash (they're part of the contract but not the identity).
Impact: Signature verification now succeeds consistently across all components
Commit: 204fca1
Bug #4: Premature Multi-Party Signature Enforcement
File: icn-ccl/src/messages.rs:80-100
Symptom: Contracts with multiple participants failing deployment
Error: Participant signatures incomplete: need {Did(...), Did(...)}, got {Did(...)}
Root Cause: Phase 10C (multi-party signature collection) not implemented yet, but verification was enforcing all-participant signatures.
Fix: Temporarily disabled multi-party requirement with clear TODO:
// TODO(Phase 10C): Re-enable multi-party signature requirement
// For now, only deployer signature is required. Multi-party signature
// collection workflow will be implemented in Phase 10C.
//
// Future: Verify all participants have signed
// let participant_set: std::collections::HashSet<_> = ...
Rationale:
- Phase 10B focuses on basic deployment infrastructure
- Multi-party signature collection requires coordination protocol (Phase 10C)
- Deployer signature still validated for accountability
Impact: Multi-participant contracts can now be deployed (deployer-signed only)
Commit: 204fca1
Bug #5: Gossip Entry Decompression Error
File: icn-core/src/supervisor.rs:347-354
Symptom: Deserialization failing with "expected value at line 1 column 1"
Root Cause: Using entry.data directly instead of entry.get_data() for entries >= 1KB (compressed)
Context: Gossip protocol compresses entries >= 1KB with zstd. The data field contains compressed bytes, while get_data() handles decompression transparently.
Fix:
// Handle contract deployments
else if topic == "contracts:deploy" {
let contract_actor = contract_actor_for_notifications.clone();
// Use get_data() to handle decompression if needed
let entry_data = match entry.get_data() {
Ok(data) => data,
Err(e) => {
warn!("Failed to get entry data: {}", e);
return;
}
};
match serde_json::from_slice::<ContractDeploymentMessage>(&entry_data) {
// ...
}
}
Impact: Contract deployment messages are now correctly deserialized regardless of size
Commit: 204fca1
Bug #6: Gossip Response Message Broadcast ⚠️ PROTOCOL BUG
Files:
icn-gossip/src/gossip.rs:504(Response handling)icn-gossip/src/gossip.rs:543(SendBloomFilter handling)
Symptom: Node B receiving Announce, but never getting the contract data Root Cause: Response messages were broadcast instead of sent directly to requester
Protocol Flow:
- Node A publishes contract → Gossip entry created
- Node A sends Announce to Node B ✅
- Node B receives Announce → Sends Request back to Node A ✅
- Node A receives Request → Sends Response to
None(broadcast) ❌ - Node B never receives Response ❌
Original Code:
// Note: We send to None (broadcast) since we don't know who requested it
// In a full implementation, Request would include sender DID
self.send_message(None, GossipMessage::Response {
entry: entry.clone(),
});
Fix: Use the sender parameter from handle_message:
// Send Response back to the requester
self.send_message(Some(sender.clone()), GossipMessage::Response {
entry: entry.clone(),
});
Key Insight: The handle_message function signature is:
pub fn handle_message(&mut self, sender: &Did, message: GossipMessage) -> Result<()>
The sender parameter already provides the requester's DID!
Impact:
- Pull protocol now works correctly
- Contract deployments propagate to remote nodes
- Also fixed SendBloomFilter to send directly to requester
Commit: 23a9bc1
Bug #7: Unidirectional Network Connections
Files:
icn-core/tests/contract_deployment_integration.rs:348-364(test_two_node_contract_deployment)icn-core/tests/contract_deployment_integration.rs:411-426(test_contract_execution_after_deployment)
Symptom: Node B could not send Request back to Node A
Error: ✗ Failed to send Request: Failed to send message
Root Cause: Only Node A dialed Node B, creating a unidirectional connection. Node B's network layer didn't have a route back to Node A.
Fix: Bidirectional dialing
// Connect nodes bidirectionally
// Node A dials Node B
let addr_b: std::net::SocketAddr = "127.0.0.1:19002".parse().unwrap();
node_a.network_handle
.dial(addr_b, node_b.did.clone())
.await
.expect("Failed to dial node B");
// Node B dials Node A
let addr_a: std::net::SocketAddr = "127.0.0.1:19001".parse().unwrap();
node_b.network_handle
.dial(addr_a, node_a.did.clone())
.await
.expect("Failed to dial node A");
// Give connections time to establish
sleep(Duration::from_millis(300)).await;
Note: QUIC connections are technically bidirectional at the transport level, but the ICN network layer maintains a DID → connection routing table that requires explicit dial() calls.
Impact: Nodes can now send messages in both directions
Commit: 23a9bc1
Bug #8: Trust Score Weighting in Tests
Files:
icn-core/tests/contract_deployment_integration.rs:344-347icn-core/tests/contract_deployment_integration.rs:408-411
Symptom: Deployment failing with "insufficient trust: 0.35 < 0.40" Root Cause: Tests set trust to 0.5, but trust graph applies 70/30 weighting
Trust Score Calculation (icn-trust/src/lib.rs:241):
// Combine: 70% direct, 30% transitive
let final_score = (direct_score * 0.7 + transitive_score * 0.3).min(1.0);
Math:
- Test sets:
direct_score = 0.5 - No transitive paths:
transitive_score = 0.0 - Final score:
0.5 * 0.7 + 0.0 * 0.3 = 0.35 - Threshold:
MIN_DEPLOYER_TRUST = 0.4 - Result:
0.35 < 0.4❌
Fix: Set trust to 0.6
// Establish mutual trust (score 0.6 * 0.7 = 0.42 > MIN_DEPLOYER_TRUST 0.4)
// Note: Trust graph applies 70% direct + 30% transitive weighting
node_a.trust_peer(&node_b.did, 0.6).await.expect("Failed to trust B from A");
node_b.trust_peer(&node_a.did, 0.6).await.expect("Failed to trust A from B");
Math: 0.6 * 0.7 + 0.0 * 0.3 = 0.42 > 0.4 ✅
Impact: Contract deployments now properly authorized in tests
Commit: 23a9bc1
Investigation Process
- Initial Observation: Integration tests failing with TLS panic
- Fix TLS: Changed
blocking_read()→try_read() - Next Error: Self-deployment trust check failing
- Fix Authorization: Added self-deployment bypass
- Next Error: Signature verification failing
- Fix Hashing: Standardized code hash formula
- Next Error: Multi-party signature requirement
- Fix Requirements: Temporarily disabled Phase 10C feature
- Next Error: Deserialization failing
- Fix Compression: Use
get_data()instead ofdata - Still Failing: Node B not receiving contracts
- Add Debugging: Detailed message flow logging revealed:
- Announce sent ✅
- Request sent from B to A but failing ❌
- Fix Connections: Bidirectional dialing
- Still Failing: Response not reaching Node B
- Root Cause Found: Response being broadcast instead of unicast
- Fix Protocol: Send Response directly to requester
- New Error: Trust score 0.35 < 0.4
- Root Cause: Trust graph 70/30 weighting
- Fix Trust: Increase test trust scores to 0.6
Lessons Learned
1. Async/Blocking Boundaries
Sync code called from async runtime can't use blocking_* methods. Always use try_* variants with graceful fallbacks.
2. Self-Trust Semantics
Systems with trust graphs need explicit handling of self-operations. Self-trust should be implicit, not require explicit edges.
3. Hash Standardization
When multiple components compute the same hash, extract to a shared utility and document the canonical formula.
4. Phase Boundaries
Don't enforce features from future phases. Use TODOs and feature flags to clearly mark unimplemented functionality.
5. Compression Transparency
When using compression, provide transparent accessor methods (get_data()) so callers don't need to know about compression.
6. Message Routing
In P2P protocols, always send responses directly to the requester, not broadcast. The sender's DID is available in message handlers.
7. Network Topology
Test environments need bidirectional connections explicitly established. Production systems should handle this automatically.
8. Trust Score Computation
When setting trust thresholds, account for how trust scores are computed (direct + transitive weighting).
9. Debugging Strategy
Add detailed logging to message flows before assuming protocol issues. Many problems are environmental (connections, timing) not protocol bugs.
Test Coverage
Unit Tests (177 passing)
- Gossip protocol (53 tests)
- Bloom filters, vector clocks, compression
- Entry limits, ACL enforcement
- Trust-gated subscriptions
- Pull protocol (now testing direct Response routing)
- Ledger (12 tests)
- Identity (32 tests)
- Trust (19 tests)
- Network (4 tests)
- Other crates (57 tests)
Integration Tests (3 passing)
test_two_node_contract_deployment
- Deploys contract from Node A
- Verifies propagation to Node B via gossip
- Tests full pull protocol flow
test_contract_execution_after_deployment
- Deploys contract with executable rule
- Executes on both nodes
- Verifies consistent results
test_untrusted_deployer_rejected
- Sets low trust (0.2)
- Verifies deployment rejected on untrusted node
- Tests authorization enforcement
Files Modified
Core Fixes (Commit 204fca1)
icn-net/src/tls.rs- Fixed blocking operationicn-ccl/src/actor.rs- Self-deployment bypass (2 locations)icn-ccl/src/messages.rs- Disabled Phase 10C requirementicn-core/src/supervisor.rs- Fixed decompressionicnctl/src/main.rs- Fixed code hash
Protocol Fixes (Commit 23a9bc1)
icn-gossip/src/gossip.rs- Direct Response routing (3 locations)icn-core/tests/contract_deployment_integration.rs- Bidirectional connections, trust scores, debugging
Performance Impact
- No performance regressions
- Trust graph lock contention now handled gracefully (fallback to 0.0)
- Response messages no longer broadcast (reduced network traffic)
Security Impact
✅ Maintained:
- Deployer signature verification still enforced
- Trust-based authorization still enforced (with correct thresholds)
- TLS certificate verification with trust gating
⚠️ Deferred to Phase 10C:
- Multi-party signature collection (infrastructure not ready)
Next Steps
Phase 10C: Multi-Party Contract Signatures
- Implement signature collection protocol
- Re-enable multi-party verification in
messages.rs:80-100 - Add UI/CLI workflow for collecting participant signatures
- Test with 3+ participant contracts
Production Hardening
- Add automatic bidirectional connection setup
- Improve trust graph lock contention handling
- Add metrics for gossip message routing
- Monitor Response delivery rates
Documentation
- Update architecture docs with trust score calculation details
- Document self-trust semantics
- Add troubleshooting guide for common test failures
Conclusion
All 7 bugs have been systematically identified and resolved. The contract deployment integration is now working end-to-end:
- ✅ TLS handshakes complete without panicking
- ✅ Nodes can deploy contracts locally
- ✅ Signature verification works consistently
- ✅ Single-deployer contracts accepted (Phase 10C will add multi-party)
- ✅ Compressed gossip entries decompress correctly
- ✅ Response messages route directly to requesters
- ✅ Bidirectional network connections established
- ✅ Trust scores computed correctly with 70/30 weighting
Test Status: 180/180 passing (177 unit + 3 integration) Build Status: Clean (only minor dead code warnings) Ready for: Phase 10C (Multi-Party Signatures)
🤖 Generated with Claude Code