Module Splitting Analysis

Date: 2026-01-23
Issue: Analyze large modules for potential splitting
Status: Analysis Complete - Ready for Implementation

Executive Summary

This document analyzes the 9 largest Rust modules in the ICN codebase (each >2000 lines) and recommends how to split them into more maintainable submodules. It focuses on the largest module, icn-ledger/src/ledger.rs (5,447 lines), for which it gives concrete splitting recommendations.

Methodology

Modules were evaluated against five criteria:

  1. Cohesion: Does the code naturally cluster into subdomains?
  2. Coupling: Can dependencies between parts be minimized?
  3. Testing: Will splitting improve testability?
  4. Stability: Is the module actively changing or stable?
  5. Risk: What's the blast radius of refactoring errors?

Large Modules Overview

File                                   Lines   Priority   Recommendation
icn-ledger/src/ledger.rs               5,447   HIGH       Split into 5 submodules
icn-obs/src/metrics_legacy.rs          4,920   LOW        Mark deprecated, migrate users
icn-gossip/src/gossip.rs               3,672   MEDIUM     Split by protocol phase
icn-governance/src/protocol_store.rs   3,452   MEDIUM     Extract state management
icn-ledger/src/treasury.rs             3,048   MEDIUM     Split by operation type
icn-ccl/src/disputes.rs                2,654   LOW        Well-structured, defer
icn-net/src/actor.rs                   2,506   MEDIUM     Split by message type
icn-governance/src/proposal.rs         2,310   LOW        Consider state machine split
icn-net/src/protocol.rs                2,309   LOW        Stable, defer

Detailed Analysis: icn-ledger/src/ledger.rs

Current Structure

The ledger.rs file contains 5,447 lines with:

  • 1 main struct: Ledger (lines 274-356)
  • 1 main impl block: impl Ledger (lines 358-3841)
  • 46 tests in the test module (lines 3841-5447)
  • 80+ public methods spanning multiple concerns

Natural Boundaries Identified

The code naturally clusters into 5 functional domains:

1. Query Operations (~600 lines)

Functions that read from the ledger without modifying state:

  • get_entry(), get_all_entries(), count_entries()
  • get_entries_paginated(), get_entries_paginated_asc()
  • get_entries_filtered_paginated()
  • get_archived_entries(), list_rollback_timestamps()

Cohesion: ✅ High - All functions query entry data
Coupling: ✅ Low - Only depend on Store trait and ContentHash
Testing: ✅ Independent test suite possible
Stability: ✅ Stable API, minimal changes expected
Risk: 🟡 Medium - Core functionality but well-isolated

Recommendation: Extract to ledger_impl/queries.rs

2. Balance Operations (~400 lines)

Functions managing account balances and cleared volume:

  • get_balance(), get_account_balances(), get_all_balances()
  • total_cleared_by(), recompute_balances(), recompute_balances_with_retry()
  • load_cached_balances(), save_cached_balances()
  • load_cleared_volume_index(), save_cleared_volume_index()

Cohesion: ✅ High - All functions manage balance state
Coupling: 🟡 Medium - Depends on entry validation but separable
Testing: ✅ Can be tested independently with mock entries
Stability: ✅ Mature, well-tested
Risk: 🟡 Medium - Critical for correctness but self-contained

Recommendation: Extract to ledger_impl/balances.rs

3. Fork Resolution (~500 lines)

Functions for detecting and resolving ledger forks:

  • detect_forks(), has_fork(), detect_and_resolve_forks()
  • quarantine_forked_entry(), get_fork_stats()
  • rebuild_fork_index(), ensure_timestamp_index()

Cohesion: ✅ High - All functions related to fork handling
Coupling: ✅ Low - Uses ForkDetector and ForkResolver (already separate types)
Testing: ✅ Fork scenarios can be tested in isolation
Stability: ✅ Implemented in Phase 18, stable since
Risk: 🟢 Low - Uses existing fork_resolution module types

Recommendation: Extract to ledger_impl/fork_ops.rs

4. Freeze Operations (~300 lines)

Functions for emergency member freezing:

  • freeze_member(), freeze_member_with_metadata()
  • unfreeze_member(), unfreeze_member_with_metadata()
  • is_member_frozen(), get_freeze_record()
  • list_frozen_members(), frozen_member_count(), cleanup_expired_freezes()

Cohesion: ✅ High - All functions manage freeze state
Coupling: ✅ Low - Uses FreezeManager (already separate type)
Testing: ✅ Can test freeze scenarios independently
Stability: ✅ Mature feature (Issue #25)
Risk: 🟢 Low - Delegates to FreezeManager

Recommendation: Extract to ledger_impl/freeze_ops.rs

5. Witness Operations (~400 lines)

Functions for multi-signature witness validation:

  • requires_witnesses(), effective_witness_policy()
  • validate_witness_signatures(), store_witness_signatures()
  • load_witness_signatures(), count_entry_signers()
  • calculate_entry_value()

Cohesion: ✅ High - All functions handle witness signatures
Coupling: 🟡 Medium - Tied to entry validation
Testing: ✅ Can mock witness scenarios
Stability: ✅ Stable since Issue #676
Risk: 🟡 Medium - Security-critical but well-encapsulated

Recommendation: Extract to ledger_impl/witness_ops.rs

Remaining Core (~3200 lines)

After extraction, the core ledger.rs would retain:

  • Struct definition and configuration setters (~200 lines)
  • Entry append/validation logic (~1500 lines)
  • Gossip synchronization (~200 lines)
  • FX (cross-currency) operations (~800 lines)
  • Tests (~500 lines, to be moved to integration tests; the remaining tests move with their submodules)

Once those ~500 lines of tests have moved out, the core module drops to ~2700 lines, which is still large but more manageable.
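The line budget above can be checked with a few lines of arithmetic. This is only a sketch: the figures are the approximate ones quoted in this document, and the constant and function names are illustrative, not part of the codebase.

```rust
/// Current size of ledger.rs, as reported in this analysis.
const TOTAL_LINES: u32 = 5_447;

/// Approximate sizes of the five proposed extractions (from this document).
const EXTRACTED: [(&str, u32); 5] = [
    ("queries", 600),
    ("balances", 400),
    ("fork_ops", 500),
    ("freeze_ops", 300),
    ("witness_ops", 400),
];

/// Approximate test lines planned to move to integration tests.
const TESTS_MOVED: u32 = 500;

/// Sum of the lines leaving ledger.rs for the submodules.
pub fn extracted_total() -> u32 {
    EXTRACTED.iter().map(|(_, lines)| lines).sum()
}

/// Lines left in ledger.rs right after the five extractions.
pub fn remaining_after_extraction() -> u32 {
    TOTAL_LINES - extracted_total()
}

/// Lines left once the integration tests have also moved out.
pub fn remaining_core() -> u32 {
    remaining_after_extraction() - TESTS_MOVED
}
```

With these inputs the extractions total 2,200 lines, leaving 3,247 (the "~3200" in the heading) and then 2,747 (the "~2700" target) after the tests move.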

Proposed File Structure

icn-ledger/src/
├── ledger.rs          (~2700 lines) - Core ledger logic
├── ledger_impl/
│   ├── mod.rs         - Re-export submodules
│   ├── queries.rs     (~600 lines) - Entry queries
│   ├── balances.rs    (~400 lines) - Balance operations
│   ├── fork_ops.rs    (~500 lines) - Fork detection/resolution
│   ├── freeze_ops.rs  (~300 lines) - Member freeze operations
│   └── witness_ops.rs (~400 lines) - Witness validation
├── lib.rs             - Public API re-exports (no changes)
└── ... (other existing modules)

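Note: The submodule directory is named ledger_impl/ to keep it visibly distinct from ledger.rs. (Since the 2018 edition, Rust would also accept a ledger/ directory next to ledger.rs, so this is a naming choice rather than a compiler requirement.)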
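A minimal sketch of how the split could look in code, assuming the extracted functions stay methods on Ledger. Inline modules stand in for the separate files, and the field layout and method bodies are hypothetical simplifications, not the real implementation. It illustrates that impl Ledger blocks placed in child modules can still read the struct's private fields and need no re-export to be callable.

```rust
// Stand-in for icn-ledger/src/ledger.rs. All bodies are illustrative.
pub mod ledger {
    use std::collections::BTreeMap;

    /// Hypothetical, heavily simplified ledger: entry id -> signed amount.
    pub struct Ledger {
        entries: BTreeMap<u64, i64>, // private to `ledger` and its children
    }

    impl Ledger {
        pub fn new() -> Self {
            Ledger { entries: BTreeMap::new() }
        }

        /// The append path stays in the core module.
        pub fn append(&mut self, id: u64, amount: i64) {
            self.entries.insert(id, amount);
        }
    }

    // Stand-in for ledger_impl/mod.rs and two of its files.
    mod ledger_impl {
        pub mod queries {
            use super::super::Ledger;

            impl Ledger {
                pub fn get_entry(&self, id: u64) -> Option<i64> {
                    self.entries.get(&id).copied()
                }

                pub fn count_entries(&self) -> usize {
                    self.entries.len()
                }
            }
        }

        pub mod balances {
            use super::super::Ledger;

            impl Ledger {
                /// Simplified: the real method presumably takes an account.
                pub fn get_balance(&self) -> i64 {
                    self.entries.values().sum()
                }
            }
        }
    }
}
```

Callers keep writing ledger::Ledger::new() followed by l.get_entry(1) or l.get_balance(); nothing in their code reveals which file a method lives in.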

Implementation Strategy

  1. Phase 1: Extract Queries (Lowest Risk)

    • Create ledger_impl/queries.rs with query functions
    • Move tests to integration test file
    • Maintain backward compatibility with pub re-exports in ledger.rs
    • Run full test suite to verify
  2. Phase 2: Extract Balances

    • Create ledger_impl/balances.rs with balance functions
    • Move related tests
    • Update re-exports
  3. Phase 3: Extract Fork Operations

    • Create ledger_impl/fork_ops.rs with fork functions
    • Move fork tests
  4. Phase 4: Extract Freeze Operations

    • Create ledger_impl/freeze_ops.rs with freeze functions
    • Move freeze tests
  5. Phase 5: Extract Witness Operations

    • Create ledger_impl/witness_ops.rs with witness functions
    • Move witness tests
  6. Phase 6: Final Validation

    • Run full test suite
    • Run clippy and fmt
    • Verify no API breakage
    • Performance regression check
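One way to make the "verify no API breakage" step mechanical is a compile-time surface check: coerce each public method to a function pointer with the signature callers rely on, so that a rename or signature change during the split fails the build. The sketch below uses a stand-in Ledger with hypothetical signatures; the real check would list the actual methods and types.

```rust
/// Stand-in type; the real `Ledger` and its signatures will differ.
pub struct Ledger {
    entries: Vec<u64>,
}

impl Ledger {
    pub fn new(entries: Vec<u64>) -> Self {
        Ledger { entries }
    }

    pub fn get_entry(&self, index: usize) -> Option<u64> {
        self.entries.get(index).copied()
    }

    pub fn count_entries(&self) -> usize {
        self.entries.len()
    }
}

/// Compiles only while every listed method exists with the listed signature.
/// The body does no work at runtime; the type annotations are the check.
pub fn api_surface_check() {
    let _get_entry: fn(&Ledger, usize) -> Option<u64> = Ledger::get_entry;
    let _count_entries: fn(&Ledger) -> usize = Ledger::count_entries;
}
```

Writing the list once before Phase 1 and leaving it untouched through Phase 5 gives each phase the same pass/fail signal as part of the normal test build.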

Backward Compatibility

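All public items remain reachable through ledger.rs:

// In ledger.rs
// ledger_impl/ sits beside ledger.rs, so the declaration points at it
// explicitly. A bare `mod ledger_impl;` here would resolve to
// src/ledger/ledger_impl.rs or src/ledger/ledger_impl/mod.rs instead.
#[path = "ledger_impl/mod.rs"]
mod ledger_impl;

// Methods defined in `impl Ledger` blocks inside the submodules belong to
// `Ledger` automatically. The glob re-exports below cover any free items
// (types, functions, constants) that move along with them.
pub use ledger_impl::queries::*;
pub use ledger_impl::balances::*;
pub use ledger_impl::fork_ops::*;
pub use ledger_impl::freeze_ops::*;
pub use ledger_impl::witness_ops::*;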

This ensures zero API breakage for external consumers.

Risks and Mitigations

Risk                        Likelihood   Impact   Mitigation
Test failures after split   Low          High     Incremental extraction with test runs after each phase
API breakage                Very Low     High     Re-export all public functions from original location
Performance regression      Very Low     Medium   Balance computation is cached; no hot path changes
Merge conflicts             Medium       Low      Coordinate with team, do in quiet period

Other Modules Analysis

icn-obs/src/metrics_legacy.rs (4,920 lines)

Status: Legacy code
Recommendation: Mark as deprecated, create migration guide to new metrics system
Priority: LOW - Not blocking features
Risk: 🟢 Low - Legacy system, new code shouldn't use it

icn-gossip/src/gossip.rs (3,672 lines)

Cohesion: 🟡 Medium - Mix of protocol phases (push/pull/anti-entropy)
Recommendation: Split by:

  • gossip/protocol.rs - Message handling
  • gossip/anti_entropy.rs - Bloom filter sync
  • gossip/subscriptions.rs - Topic management

Priority: MEDIUM
Risk: 🟡 Medium - Active protocol, needs careful testing

icn-governance/src/protocol_store.rs (3,452 lines)

Cohesion: 🟡 Medium - Complex state management
Recommendation: Split by:

  • protocol_store/state.rs - State machine
  • protocol_store/queries.rs - Read operations
  • protocol_store/persistence.rs - Storage operations

Priority: MEDIUM
Risk: 🟡 Medium - Governance is critical

icn-ledger/src/treasury.rs (3,048 lines)

Cohesion: ✅ High - But large
Recommendation: Split by operation type:

  • treasury/budgets.rs - Budget management
  • treasury/approvals.rs - Approval workflow
  • treasury/audit.rs - Audit trail

Priority: MEDIUM
Risk: 🟡 Medium - Financial operations

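Other Modules (2,300-2,700 lines)

Recommendation: Defer splitting for icn-ccl/src/disputes.rs, icn-governance/src/proposal.rs, and icn-net/src/protocol.rs
Rationale: Below critical size threshold, well-structured, stable
Priority: LOW

icn-net/src/actor.rs (2,506 lines) also falls in this size range, but the overview table rates it MEDIUM with a split by message type; it is not analyzed in detail here.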

Success Metrics

After splitting, we expect:

  • ✅ No file produced by the ledger.rs split exceeds 3000 lines (the other large modules are handled in separate PRs)
  • ✅ Related functions grouped together
  • ✅ Zero API breakage
  • ✅ All tests pass
  • ✅ Compile times no worse, and ideally better for incremental builds (to be confirmed in the Phase 6 performance check)
  • ✅ Easier code navigation and review
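
The size metric can be applied to the figures in the overview table. The sketch below assumes ledger.rs has already shrunk to its projected ~2700-line core and leaves every other module at its current size; the function names are illustrative. The modules it flags are the ones that remain for the later, per-module PRs.

```rust
/// Line-count ceiling from the success metrics.
pub const MAX_LINES: usize = 3_000;

/// Returns the paths whose line count is above `limit`, in input order.
pub fn over_limit<'a>(modules: &[(&'a str, usize)], limit: usize) -> Vec<&'a str> {
    modules
        .iter()
        .filter(|(_, lines)| *lines > limit)
        .map(|(path, _)| *path)
        .collect()
}

/// Sizes from the overview table, with ledger.rs replaced by its
/// projected ~2700-line core (an estimate, not a measurement).
pub fn projected_sizes() -> Vec<(&'static str, usize)> {
    vec![
        ("icn-ledger/src/ledger.rs", 2_700),
        ("icn-obs/src/metrics_legacy.rs", 4_920),
        ("icn-gossip/src/gossip.rs", 3_672),
        ("icn-governance/src/protocol_store.rs", 3_452),
        ("icn-ledger/src/treasury.rs", 3_048),
        ("icn-ccl/src/disputes.rs", 2_654),
        ("icn-net/src/actor.rs", 2_506),
        ("icn-governance/src/proposal.rs", 2_310),
        ("icn-net/src/protocol.rs", 2_309),
    ]
}
```

Calling over_limit(&projected_sizes(), MAX_LINES) returns metrics_legacy.rs, gossip.rs, protocol_store.rs, and treasury.rs, and no longer includes ledger.rs.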

Scope

  • ledger.rs splitting: 5 extraction phases (queries → balances → forks → freeze → witness), followed by a final validation phase
  • Testing and validation: Full test suite after each phase
  • Other modules (if approved): Separate PRs per module

Approval Required

This analysis recommends proceeding with Phases 1-6 of the ledger.rs split as described above. The other modules can be addressed in future iterations based on team priorities.


Next Steps:

  1. Review this analysis with the team
  2. Get approval to proceed with ledger.rs splitting
  3. Create separate issues for other modules
  4. Begin implementation in order: queries → balances → forks → freeze → witness