Event System Integration
Overview
The module event system addresses common integration pain points in distributed module architectures. This document covers the main integration scenarios, the delivery guarantees, and best practices.
Integration Pain Points Addressed
1. Event Delivery Reliability
Problem: Events can be lost if modules are slow or channels are full.
Solution:
- Channel Buffering: 100-event buffer per module (currently hardcoded, see Configuration)
- Non-Blocking Delivery: Uses try_send to avoid blocking the publisher
- Channel Full Handling: Events are dropped with warning (module is slow, not dead)
- Channel Closed Detection: Automatically removes dead modules from subscriptions
- Delivery Statistics: Track success/failure rates per module
Code:
// EventManager tracks delivery statistics per module
let stats = event_manager.get_delivery_stats("module_id").await;
// Returns: Option<(successful_deliveries, failed_deliveries, channel_full_count)>
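The delivery rules above can be sketched with the standard library alone. This is a minimal illustration, not the EventManager implementation: `std::sync::mpsc::sync_channel` stands in for whatever bounded channel the node actually uses, and `DeliveryStats`, `Delivery`, `deliver`, and `demo_delivery` are hypothetical names. It also assumes that a full channel increments only the channel-full counter and a closed channel increments only the failed counter.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Per-module counters (hypothetical layout mirroring the stats tuple).
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct DeliveryStats {
    pub successful: u64,
    pub failed: u64,
    pub channel_full: u64,
}

/// Outcome of a single delivery attempt.
#[derive(Debug, PartialEq, Eq)]
pub enum Delivery {
    Delivered,
    DroppedChannelFull, // subscriber is slow: keep the subscription
    SubscriberGone,     // channel closed: caller should unsubscribe the module
}

/// Hand `event` to one subscriber without ever blocking the publisher.
pub fn deliver<T>(tx: &SyncSender<T>, event: T, stats: &mut DeliveryStats) -> Delivery {
    match tx.try_send(event) {
        Ok(()) => {
            stats.successful += 1;
            Delivery::Delivered
        }
        Err(TrySendError::Full(_)) => {
            stats.channel_full += 1;
            Delivery::DroppedChannelFull
        }
        Err(TrySendError::Disconnected(_)) => {
            stats.failed += 1;
            Delivery::SubscriberGone
        }
    }
}

/// Demonstration with a 2-slot buffer standing in for the 100-event one.
pub fn demo_delivery() -> (Vec<Delivery>, DeliveryStats) {
    let (tx, rx): (SyncSender<u32>, Receiver<u32>) = sync_channel(2);
    let mut stats = DeliveryStats::default();
    let mut outcomes = Vec::new();
    // Two events fit in the buffer; the third finds it full and is dropped.
    for event in 0..3 {
        outcomes.push(deliver(&tx, event, &mut stats));
    }
    // The subscriber drains its buffer and then goes away.
    while rx.try_recv().is_ok() {}
    drop(rx);
    outcomes.push(deliver(&tx, 99, &mut stats));
    (outcomes, stats)
}
```

Returning an outcome instead of logging inside `deliver` keeps the decision about warnings and unsubscription with the caller, which is one possible design among several.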
2. Event Ordering and Timing
Problem: Events might arrive out of order or modules might miss events during startup.
Solution:
- ModuleLoaded Timing: Only published AFTER module subscribes (startup complete)
- Hotloaded Modules: Automatically receive all already-loaded modules when subscribing
- Consistent Ordering: Subscription → ModuleLoaded events (guaranteed order)
Flow:
- Module loads → Recorded in loaded_modules
- Module subscribes → Receives all already-loaded modules
- ModuleLoaded published → After subscription (startup complete)
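The three-step flow above can be modelled in a few lines. This is a sketch under stated assumptions: `Registry`, `record_loaded`, `subscribe`, `publish_loaded`, and `inbox` are hypothetical names, a plain `Vec` stands in for each module's event channel, and the sketch assumes a module is not sent a ModuleLoaded event about itself (the text does not say either way).

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    ModuleLoaded { module_id: String },
}

#[derive(Default)]
pub struct Registry {
    loaded_modules: BTreeSet<String>,      // every module that has loaded
    inboxes: BTreeMap<String, Vec<Event>>, // stand-in for per-module channels
}

impl Registry {
    /// Step 1: the module loads and is recorded in `loaded_modules`.
    pub fn record_loaded(&mut self, module_id: &str) {
        self.loaded_modules.insert(module_id.to_string());
    }

    /// Step 2: the module subscribes and is replayed a ModuleLoaded event
    /// for every other module that is already loaded.
    pub fn subscribe(&mut self, module_id: &str) {
        let replay: Vec<Event> = self
            .loaded_modules
            .iter()
            .filter(|id| id.as_str() != module_id)
            .map(|id| Event::ModuleLoaded { module_id: id.clone() })
            .collect();
        self.inboxes.insert(module_id.to_string(), replay);
    }

    /// Step 3: only now is ModuleLoaded published for the new module,
    /// reaching every other current subscriber.
    pub fn publish_loaded(&mut self, module_id: &str) {
        for (id, inbox) in self.inboxes.iter_mut() {
            if id.as_str() != module_id {
                inbox.push(Event::ModuleLoaded { module_id: module_id.to_string() });
            }
        }
    }

    pub fn inbox(&self, module_id: &str) -> &[Event] {
        self.inboxes.get(module_id).map(Vec::as_slice).unwrap_or(&[])
    }
}

/// Module A starts fully, then module B is hotloaded.
pub fn demo_hotload() -> (Vec<Event>, Vec<Event>) {
    let mut registry = Registry::default();
    for id in ["A", "B"] {
        registry.record_loaded(id);
        registry.subscribe(id);
        registry.publish_loaded(id);
    }
    (registry.inbox("A").to_vec(), registry.inbox("B").to_vec())
}
```

In `demo_hotload`, A ends up with a ModuleLoaded event for B, and B with a replayed ModuleLoaded event for A, so both modules hold the same view of what is loaded.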
3. Event Channel Backpressure
Problem: Fast publishers can overwhelm slow consumers.
Solution:
- Bounded Channels: 100-event buffer prevents unbounded memory growth
- Non-Blocking: Publisher never blocks; events are dropped if the channel is full
- Statistics Tracking: Monitor channel full events to identify slow modules
- Automatic Cleanup: Dead modules automatically removed
Monitoring:
let module_id = "module_id";
let stats = event_manager.get_delivery_stats(module_id).await;
if let Some((_, _, channel_full_count)) = stats {
    // Example threshold: more than 100 dropped events suggests the module cannot keep up
    if channel_full_count > 100 {
        warn!("Module {} is slow, dropping events", module_id);
    }
}
4. Missing Events During Startup
Problem: Modules that start later miss events from earlier modules.
Solution:
- Hotloaded Module Support: Newly subscribing modules receive all already-loaded modules
- Event Replay: ModuleLoaded events sent to newly subscribing modules
- Consistent State: All modules have consistent view of loaded modules
5. Event Type Coverage
Problem: Not all events have corresponding payloads or are published.
Solution:
- Complete Coverage: All EventType variants have corresponding EventPayload variants
- Governance Events: All governance events are published
- Network Events: All network events are published
- Lifecycle Events: All lifecycle events are published
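One way to keep type and payload coverage complete is to let the compiler enforce it with exhaustive matches. The sketch below uses a three-variant subset; the payload fields (`height`, `peer_addr`, `module_id`) and the functions `event_type` and `sample_payload` are assumptions made for illustration, not the actual definitions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventType {
    NewBlock,
    PeerConnected,
    ModuleLoaded,
}

// Payload fields are hypothetical; the real payloads may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum EventPayload {
    NewBlock { height: u64 },
    PeerConnected { peer_addr: String },
    ModuleLoaded { module_id: String },
}

impl EventPayload {
    /// Payload -> type. No wildcard arm, so adding a payload variant
    /// without a matching EventType fails to compile.
    pub fn event_type(&self) -> EventType {
        match self {
            EventPayload::NewBlock { .. } => EventType::NewBlock,
            EventPayload::PeerConnected { .. } => EventType::PeerConnected,
            EventPayload::ModuleLoaded { .. } => EventType::ModuleLoaded,
        }
    }
}

/// Type -> payload. Again no wildcard arm, so adding an EventType
/// variant without a payload fails to compile.
pub fn sample_payload(ty: EventType) -> EventPayload {
    match ty {
        EventType::NewBlock => EventPayload::NewBlock { height: 0 },
        EventType::PeerConnected => EventPayload::PeerConnected {
            peer_addr: "127.0.0.1:8333".to_string(),
        },
        EventType::ModuleLoaded => EventPayload::ModuleLoaded {
            module_id: "example".to_string(),
        },
    }
}
```

A round-trip test over every `EventType` then confirms that the two mappings agree.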
Event Categories
Core Blockchain Events
- NewBlock: Block connected to chain
- NewTransaction: Transaction in mempool
- BlockDisconnected: Block disconnected (reorg)
- ChainReorg: Chain reorganization
Governance Events
- GovernanceProposalCreated: Proposal created
- GovernanceProposalVoted: Vote cast
- GovernanceProposalMerged: Proposal merged
- GovernanceForkDetected: Fork detected
Network Events
- PeerConnected: Peer connected
- PeerDisconnected: Peer disconnected
- PeerBanned: Peer banned
- MessageReceived: Network message received
- BroadcastStarted: Broadcast started
- BroadcastCompleted: Broadcast completed
Module Lifecycle Events
- ModuleLoaded: Module loaded (after subscription)
- ModuleUnloaded: Module unloaded
- ModuleCrashed: Module crashed
- ModuleHealthChanged: Health status changed
Maintenance Events
- DataMaintenance: Unified cleanup/flush (replaces StorageFlush + DataCleanup)
- MaintenanceStarted: Maintenance started
- MaintenanceCompleted: Maintenance completed
- HealthCheck: Health check performed
Resource Management Events
- DiskSpaceLow: Disk space low
- ResourceLimitWarning: Resource limit warning
Event Delivery Guarantees
At-Most-Once Delivery
- Events are delivered at most once per subscriber
- If channel is full, event is dropped (not retried)
- If channel is closed, module is removed from subscriptions
Best-Effort Delivery
- Events are delivered on a best-effort basis
- No guaranteed delivery (modules can be slow/dead)
- Statistics track delivery success/failure rates
Ordering Guarantees
- Events are delivered in order per module (single channel)
- No cross-module ordering guarantees
- ModuleLoaded events are ordered: subscription → ModuleLoaded
Error Handling
Channel Full
- Event is dropped with warning
- Module subscription is NOT removed (module is slow, not dead)
- Statistics track channel full count
Channel Closed
- Module subscription is removed
- Statistics track failed delivery count
- Module is automatically cleaned up
Serialization Errors
- Event is dropped with warning
- Module subscription is NOT removed
- Error is logged for debugging
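The three error cases above amount to a small decision table, sketched here as a pure function. `DeliveryError`, `Counter`, `Policy`, and `policy_for` are hypothetical names. The text does not say whether serialization errors are counted in the statistics, so this sketch assumes they are only logged.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeliveryError {
    ChannelFull,
    ChannelClosed,
    Serialization,
}

/// Which statistic, if any, is incremented for an error.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Counter {
    ChannelFull,
    Failed,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Policy {
    pub remove_subscription: bool,
    pub counter: Option<Counter>,
}

/// Map each delivery error to the reaction described in the text.
/// In every case the event itself is not delivered and is not retried.
pub fn policy_for(err: DeliveryError) -> Policy {
    match err {
        // Slow module: keep it subscribed, count the drop.
        DeliveryError::ChannelFull => Policy {
            remove_subscription: false,
            counter: Some(Counter::ChannelFull),
        },
        // Dead module: unsubscribe it and count the failure.
        DeliveryError::ChannelClosed => Policy {
            remove_subscription: true,
            counter: Some(Counter::Failed),
        },
        // Bad payload: the module is not at fault, so keep the subscription.
        // Assumption: no counter is touched, the error is only logged.
        DeliveryError::Serialization => Policy {
            remove_subscription: false,
            counter: None,
        },
    }
}
```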
Monitoring and Debugging
Delivery Statistics
// Get statistics for a module
let stats = event_manager.get_delivery_stats("module_id").await;
// Returns: Option<(successful, failed, channel_full)>

// Get statistics for all modules
let all_stats = event_manager.get_all_delivery_stats().await;
// Returns: HashMap<module_id, (successful, failed, channel_full)>

// Reset statistics (for testing)
event_manager.reset_delivery_stats("module_id").await;
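The map returned for all modules lends itself to a simple slow-module report. The sketch below assumes the counters are `u64` and the keys are `String` module ids (the text gives only the tuple shape); `slow_modules` and `max_drop_ratio` are hypothetical. It flags modules by the share of attempts dropped because the channel was full, rather than by an absolute count.

```rust
use std::collections::HashMap;

/// (successful, failed, channel_full); counter width is an assumption.
pub type Stats = (u64, u64, u64);

/// Return the ids of modules whose channel-full drop ratio exceeds
/// `max_drop_ratio`, sorted for stable output.
pub fn slow_modules(all_stats: &HashMap<String, Stats>, max_drop_ratio: f64) -> Vec<String> {
    let mut slow = Vec::new();
    for (module_id, &(successful, _failed, channel_full)) in all_stats {
        // Only live-channel attempts count: delivered plus dropped-when-full.
        let attempts = successful + channel_full;
        if attempts == 0 {
            continue; // an idle module is not a slow module
        }
        if channel_full as f64 / attempts as f64 > max_drop_ratio {
            slow.push(module_id.clone());
        }
    }
    slow.sort();
    slow
}

/// Example data: "indexer" drops half of its events, "wallet" a tenth.
pub fn demo_slow_modules() -> Vec<String> {
    let mut all_stats = HashMap::new();
    all_stats.insert("wallet".to_string(), (90, 0, 10));
    all_stats.insert("indexer".to_string(), (50, 0, 50));
    all_stats.insert("idle".to_string(), (0, 0, 0));
    slow_modules(&all_stats, 0.25)
}
```

A ratio is less sensitive to uptime than a raw count, since counters only reset on node restart.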
Event Subscribers
// Get list of subscribers for an event type
let subscribers = event_manager.get_subscribers(EventType::NewBlock).await;
// Returns: Vec<module_id>
Best Practices
For Module Developers
- Subscribe Early: Subscribe to events as soon as possible after handshake
- Handle Events Quickly: Keep event handlers fast and non-blocking
- Monitor Statistics: Check delivery statistics to ensure events are received
- Handle ModuleLoaded: Always handle ModuleLoaded to know about other modules
- Graceful Shutdown: Handle NodeShutdown and DataMaintenance (urgency: “high”)
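To illustrate the advice about fast handlers: one common pattern is to drain the event channel immediately and pass slow work to a worker thread. The sketch below is an assumption-laden stand-in, not module SDK code: events are plain `u64` block heights, `run_handler` and `demo_handler` are hypothetical, and standard-library channels replace the IPC event stream.

```rust
use std::sync::mpsc::{channel, sync_channel, Receiver};
use std::thread;

/// Drain `events` as fast as possible and let a worker do the slow part.
/// Returns the worker's results once the event stream ends.
pub fn run_handler(events: Receiver<u64>) -> Vec<u64> {
    // Internal work queue, unbounded in this sketch. A real module would
    // probably bound it and decide what to shed when it falls behind.
    let (work_tx, work_rx) = channel::<u64>();

    let worker = thread::spawn(move || {
        let mut processed = Vec::new();
        for height in work_rx {
            // Stand-in for expensive processing (database writes, indexing).
            processed.push(height * 2);
        }
        processed
    });

    // The receive loop only forwards, so the node-side buffer drains quickly.
    for height in events {
        work_tx.send(height).expect("worker thread should be alive");
    }

    // Closing the work queue lets the worker finish its loop.
    drop(work_tx);
    worker.join().expect("worker thread panicked")
}

/// Feed three events through a 100-slot channel, then close it.
pub fn demo_handler() -> Vec<u64> {
    let (tx, rx) = sync_channel::<u64>(100);
    for height in 1..=3 {
        tx.send(height).expect("buffer has room");
    }
    drop(tx);
    run_handler(rx)
}
```

This moves queueing from the node's bounded buffer into the module, so the module rather than the publisher decides which events it can afford to skip.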
For Node Developers
- Publish Consistently: Publish events at consistent points in the code
- Use EventPublisher: Use EventPublisher for all event publishing
- Monitor Statistics: Monitor delivery statistics to identify slow modules
- Handle Errors: Log warnings for failed event deliveries
- Test Integration: Test event delivery in integration tests
Common Integration Scenarios
Scenario 1: Module Startup
- Module process spawned
- Module connects via IPC
- Module sends Handshake
- Module subscribes to events
- Module receives ModuleLoaded for all already-loaded modules
- ModuleLoaded published for this module (after subscription)
Scenario 2: Hotloaded Module
- Module B loads while Module A is already running
- Module B subscribes to events
- Module B receives ModuleLoaded for Module A
- ModuleLoaded published for Module B
- Module A receives ModuleLoaded for Module B
Scenario 3: Slow Module
- Module receives events slowly
- Event channel fills up (100 events)
- New events are dropped with warning
- Statistics track channel full count
- Module subscription is NOT removed (module is slow, not dead)
Scenario 4: Dead Module
- Module process crashes
- Event channel is closed
- Event delivery fails
- Module subscription is automatically removed
- Statistics track failed delivery count
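The automatic removal in this scenario can be sketched as a publish loop that prunes closed channels while it sends. As before, `std::sync::mpsc::sync_channel` is a stand-in for the real channel type, and `publish` and `demo_cleanup` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

/// Send `event` to every subscriber and drop subscriptions whose channel
/// is closed. Returns the ids that were removed, sorted.
pub fn publish(subscribers: &mut HashMap<String, SyncSender<String>>, event: &str) -> Vec<String> {
    let mut removed = Vec::new();
    subscribers.retain(|module_id, tx| match tx.try_send(event.to_string()) {
        // Receiver is gone: the module is dead, so unsubscribe it.
        Err(TrySendError::Disconnected(_)) => {
            removed.push(module_id.clone());
            false
        }
        // Delivered, or dropped because the buffer is full: keep the module.
        Ok(()) | Err(TrySendError::Full(_)) => true,
    });
    removed.sort();
    removed
}

/// One live subscriber and one whose receiving end has been dropped.
pub fn demo_cleanup() -> (Vec<String>, Vec<String>) {
    let mut subscribers = HashMap::new();
    let (alive_tx, _alive_rx) = sync_channel::<String>(100);
    let (dead_tx, dead_rx) = sync_channel::<String>(100);
    subscribers.insert("alive".to_string(), alive_tx);
    subscribers.insert("dead".to_string(), dead_tx);

    drop(dead_rx); // simulates the module process crashing

    let removed = publish(&mut subscribers, "NewBlock");
    let mut remaining: Vec<String> = subscribers.keys().cloned().collect();
    remaining.sort();
    (removed, remaining)
}
```

Pruning during delivery means no separate liveness check is needed in this sketch: a dead module is noticed the next time any event is published to it.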
Scenario 5: Governance Event Flow
- Network receives governance event
- Event published to governance module
- Governance module processes event
- Governance module may publish additional events
- All events are delivered through the same bounded, best-effort channel
Configuration
Channel Buffer Size
Currently hardcoded to 100 events per module. Can be made configurable in the future.
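If the buffer size does become configurable, the fallback logic could look like the sketch below. Everything here is hypothetical, including the names `DEFAULT_EVENT_BUFFER` and `effective_buffer_size` and the idea of an optional per-module setting. Only the default of 100 comes from the text.

```rust
/// Matches the currently hardcoded per-module buffer size.
pub const DEFAULT_EVENT_BUFFER: usize = 100;

/// Resolve the buffer size for a module from an optional configured value.
/// A missing or zero value falls back to the default, since a zero-slot
/// buffer leaves no room to queue events for non-blocking delivery.
pub fn effective_buffer_size(configured: Option<usize>) -> usize {
    match configured {
        Some(size) if size > 0 => size,
        _ => DEFAULT_EVENT_BUFFER,
    }
}
```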
Event Statistics
Statistics are kept in memory and reset on node restart. Can be persisted in the future.
Future Improvements
- Configurable Buffer Size: Make channel buffer size configurable per module
- Event Persistence: Persist events for replay after module restart
- Event Filtering: Allow modules to filter events by criteria
- Event Priority: Add priority queue for critical events
- Event Metrics: Add Prometheus metrics for event delivery
- Event Replay: Allow modules to replay missed events
See Also
- Module System - Module system architecture
- Event Consistency - Event timing and consistency guarantees
- Janitorial Events - Maintenance and lifecycle events
- Module IPC Protocol - IPC communication details