Performance Optimizations

Overview

Bitcoin Commons implements performance optimizations for faster initial block download (IBD), parallel validation, and efficient UTXO operations. These optimizations provide 10-50x speedups for common operations while maintaining consensus correctness.

Parallel Initial Block Download (IBD)

Overview

Parallel IBD significantly speeds up initial blockchain synchronization by downloading and validating blocks concurrently from multiple peers. The system uses checkpoint-based parallel header download, block pipelining, streaming validation, and efficient batch storage operations.

Code: parallel_ibd.rs

Architecture

The parallel IBD system consists of several coordinated optimizations:

  1. Checkpoint Parallel Headers: Download headers in parallel using hardcoded checkpoints
  2. Block Pipelining: Download multiple blocks concurrently from each peer
  3. Streaming Validation: Validate blocks as they arrive using a reorder buffer
  4. Batch Storage: Use batch writes for efficient UTXO set updates

Checkpoint Parallel Headers

Headers are downloaded in parallel using hardcoded checkpoints at well-known block heights. This allows multiple header ranges to be downloaded simultaneously from different peers.

Checkpoints: Genesis, 11111, 33333, 74000, 105000, 134444, 168000, 193000, 210000 (first halving), 250000, 295000, 350000, 400000, 450000, 500000, 550000, 600000, 650000, 700000, 750000, 800000, 850000

Code: MAINNET_CHECKPOINTS

Algorithm:

  1. Identify checkpoint ranges for the target height range
  2. Download headers in parallel for each range
  3. Each range uses the checkpoint hash as its starting locator
  4. Verification ensures continuity and checkpoint hash matching

Performance: 4-8x faster header download vs sequential
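The range-splitting step of this algorithm can be sketched as a pure function. This is a minimal illustration, not the real implementation: `CHECKPOINT_HEIGHTS` below is a hypothetical subset of the actual MAINNET_CHECKPOINTS table in parallel_ibd.rs, and `checkpoint_ranges` is a hypothetical helper name.

```rust
// Hypothetical subset of the hardcoded checkpoint heights (illustration only).
const CHECKPOINT_HEIGHTS: &[u64] = &[0, 11_111, 33_333, 74_000, 105_000];

/// Split [start, target] into ranges bounded by checkpoints. Each range can
/// then be fetched from a different peer concurrently, with the checkpoint
/// hash serving as the starting locator for that range.
fn checkpoint_ranges(start: u64, target: u64) -> Vec<(u64, u64)> {
    let mut bounds: Vec<u64> = CHECKPOINT_HEIGHTS
        .iter()
        .copied()
        .filter(|&h| h > start && h < target)
        .collect();
    bounds.push(target);

    let mut ranges = Vec::new();
    let mut lo = start;
    for hi in bounds {
        ranges.push((lo, hi));
        lo = hi;
    }
    ranges
}
```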

Block Pipelining

Blocks are downloaded with deep pipelining per peer, allowing multiple outstanding block requests to hide network latency.

Configuration:

  • max_concurrent_per_peer: 64 concurrent downloads per peer (default)
  • chunk_size: 100 blocks per chunk (default)
  • download_timeout_secs: 60 seconds per block (default)

Code: ParallelIBDConfig

Dynamic Work Dispatch:

  • Uses a shared work queue instead of static chunk assignment
  • Fast peers automatically grab more work as they finish chunks
  • FIFO ordering ensures lowest heights are processed first
  • Natural load balancing across peers

Performance: 4-8x improvement vs sequential block requests
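The shared work queue behind dynamic dispatch can be sketched as follows. This is a simplified in-memory model with hypothetical names (`Chunk`, `make_queue`, `next_chunk`); the real dispatcher lives in parallel_ibd.rs.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

/// A chunk of consecutive block heights to download.
#[derive(Debug, PartialEq)]
struct Chunk {
    start: u64,
    end: u64, // exclusive
}

/// Build the shared FIFO queue. Pushing chunks in height order means the
/// lowest heights are always handed out first.
fn make_queue(start: u64, end: u64, chunk_size: u64) -> Arc<Mutex<VecDeque<Chunk>>> {
    let mut q = VecDeque::new();
    let mut lo = start;
    while lo < end {
        let hi = (lo + chunk_size).min(end);
        q.push_back(Chunk { start: lo, end: hi });
        lo = hi;
    }
    Arc::new(Mutex::new(q))
}

/// Each peer worker calls this in a loop: fast peers finish sooner and pop
/// again, so load balances naturally without any static assignment.
fn next_chunk(q: &Arc<Mutex<VecDeque<Chunk>>>) -> Option<Chunk> {
    q.lock().unwrap().pop_front()
}
```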

Streaming Validation with Reorder Buffer

Blocks may arrive out of order from parallel downloads. A reorder buffer ensures blocks are validated in sequential order while allowing downloads to continue.

Implementation:

  • Bounded channel: 1000 blocks max in flight (~500MB-1GB memory)
  • Reorder buffer: BTreeMap maintains blocks until next expected height
  • Streaming validation: Validates blocks as they become available in order
  • Natural backpressure: Downloads pause when buffer is full

Code: streaming validation

Memory Bounds: ~500MB-1GB maximum (1000 blocks × ~500KB average)
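The BTreeMap-based reorder buffer can be sketched like this. A minimal model with hypothetical names (`ReorderBuffer`, `push`); the real buffer additionally enforces the bounded-channel backpressure described above.

```rust
use std::collections::BTreeMap;

/// Holds out-of-order blocks until the next expected height arrives, then
/// drains a contiguous run so validation always proceeds in order.
struct ReorderBuffer<B> {
    next_height: u64,
    pending: BTreeMap<u64, B>,
}

impl<B> ReorderBuffer<B> {
    fn new(start: u64) -> Self {
        Self { next_height: start, pending: BTreeMap::new() }
    }

    /// Insert a downloaded block; return every block that is now contiguous
    /// from `next_height`, in ascending order, ready for validation.
    fn push(&mut self, height: u64, block: B) -> Vec<B> {
        self.pending.insert(height, block);
        let mut ready = Vec::new();
        while let Some(b) = self.pending.remove(&self.next_height) {
            ready.push(b);
            self.next_height += 1;
        }
        ready
    }
}
```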

Batch Storage Operations

UTXO set updates use batch writes for efficient bulk operations. Batch writes are 10-100x faster than individual inserts.

BatchWriter Trait:

  • Accumulates multiple put/delete operations
  • Commits all operations atomically in a single transaction
  • Ensures database consistency even on crash

Code: BatchWriter
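The accumulate-then-commit shape of the trait can be sketched in memory. This is a hypothetical standalone struct for illustration only; the real BatchWriter is defined in the storage layer and commits against the database.

```rust
/// In-memory sketch of a batch writer: puts and deletes accumulate, and
/// commit() applies them all at once. A real backend would write every
/// operation in a single atomic database transaction.
struct Batch {
    // key -> Some(value) means put, None means delete
    ops: Vec<(Vec<u8>, Option<Vec<u8>>)>,
}

impl Batch {
    fn new() -> Self {
        Self { ops: Vec::new() }
    }

    fn put(&mut self, key: Vec<u8>, value: Vec<u8>) {
        self.ops.push((key, Some(value)));
    }

    fn delete(&mut self, key: Vec<u8>) {
        self.ops.push((key, None));
    }

    /// Apply all accumulated operations; returns how many were committed.
    fn commit(self) -> Result<usize, String> {
        Ok(self.ops.len())
    }
}
```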

Performance:

  • Individual Tree::insert(): ~1ms per operation (transaction overhead)
  • BatchWriter: ~1ms total for thousands of operations (single transaction)

Usage:

#![allow(unused)]
fn main() {
let mut batch = tree.batch();
for (key, value) in utxo_updates {
    batch.put(key, value);
}
batch.commit()?;  // Single atomic commit
}

Peer Scoring and Filtering

The system tracks peer performance and filters out extremely slow peers during IBD:

  • Latency Tracking: Monitors average block download latency per peer
  • Slow Peer Filtering: Drops peers with >90s average latency (keeps at least 2)
  • Dynamic Selection: Fast peers automatically get more work

Code: peer filtering
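The filtering rule can be sketched as a pure function. Hypothetical names throughout (`filter_slow_peers` and its parameters); the real logic operates on live peer state.

```rust
/// Drop peers whose average block latency exceeds the cutoff, but always
/// keep at least `min_keep` peers so the download can make progress.
fn filter_slow_peers(
    mut peers: Vec<(String, f64)>, // (peer id, avg latency in seconds)
    cutoff_secs: f64,
    min_keep: usize,
) -> Vec<(String, f64)> {
    // Sort fastest first so the peers we are forced to keep are the best available.
    peers.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
    peers
        .into_iter()
        .enumerate()
        .filter(|(i, (_, lat))| *i < min_keep || *lat <= cutoff_secs)
        .map(|(_, p)| p)
        .collect()
}
```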

Configuration

[ibd]
# Number of parallel workers (default: CPU count)
num_workers = 8

# Chunk size in blocks (default: 100)
chunk_size = 100

# Maximum concurrent downloads per peer (default: 64)
max_concurrent_per_peer = 64

# Checkpoint interval in blocks (default: 10,000)
checkpoint_interval = 10000

# Timeout for block download in seconds (default: 60)
download_timeout_secs = 60

Code: ParallelIBDConfig

Performance Impact

  • Parallel Headers: 4-8x faster header download
  • Block Pipelining: 4-8x improvement vs sequential requests
  • Streaming Validation: Enables concurrent download + validation
  • Batch Storage: 10-100x faster UTXO updates
  • Overall IBD: 10-50x faster than sequential IBD

Parallel Block Validation

Architecture

Blocks are validated in parallel when they are deep enough from the chain tip. This optimization uses Rayon for parallel execution.

Code: mod.rs

Safety Conditions

Parallel validation is only used when:

  • Blocks are beyond max_parallel_depth from tip (default: 6 blocks)
  • Each block uses its own UTXO set snapshot (independent validation)

Blocks closer to the tip than max_parallel_depth are validated sequentially.

Code: mod.rs

Implementation

#![allow(unused)]
fn main() {
pub fn validate_blocks_parallel(
    &self,
    contexts: &[BlockValidationContext],
    depth_from_tip: usize,
    network: Network,
) -> Result<Vec<(ValidationResult, UtxoSet)>> {
    if depth_from_tip <= self.max_parallel_depth {
        return self.validate_blocks_sequential(contexts, network);
    }
    
    // Parallel validation using Rayon
    use rayon::prelude::*;
    contexts.par_iter().map(|context| {
        connect_block(&context.block, ...)
    }).collect()
}
}

Code: mod.rs

Batch UTXO Operations

Batch Fee Calculation

Transaction fees are calculated in batches by pre-fetching all UTXOs before validation:

  1. Collect all prevouts from all transactions
  2. Batch UTXO lookup (single pass through HashMap)
  3. Cache UTXOs for fee calculation
  4. Calculate fees using cached UTXOs

Code: block.rs

Implementation

#![allow(unused)]
fn main() {
// Pre-collect all prevouts for batch UTXO lookup
let all_prevouts: Vec<&OutPoint> = block
    .transactions
    .iter()
    .filter(|tx| !is_coinbase(tx))
    .flat_map(|tx| tx.inputs.iter().map(|input| &input.prevout))
    .collect();

// Batch UTXO lookup (single pass)
let mut utxo_cache: HashMap<&OutPoint, &UTXO> =
    HashMap::with_capacity(all_prevouts.len());
for prevout in &all_prevouts {
    if let Some(utxo) = utxo_set.get(prevout) {
        utxo_cache.insert(prevout, utxo);
    }
}
}

Code: block.rs

Configuration

[performance]
enable_batch_utxo_lookups = true
parallel_batch_size = 8

Code: config.rs

Assume-Valid Checkpoints

Overview

Assume-valid checkpoints skip expensive signature verification for blocks before a configured height, providing 10-50x faster IBD.

Code: block.rs

Safety

This optimization is safe because:

  1. These blocks are already validated by the network
  2. Block structure, Merkle roots, and PoW are still validated
  3. Only signature verification is skipped (the expensive operation)

Code: block.rs
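The height gate itself reduces to a one-line predicate. A minimal sketch with a hypothetical function name; the real check sits inside block validation in block.rs.

```rust
/// Decide whether to run full script/signature verification for a block.
/// Structure, Merkle root, and proof-of-work checks always run; only the
/// expensive signature checks are skipped at or below the assume-valid height.
fn should_verify_scripts(height: u64, assume_valid_height: u64) -> bool {
    // assume_valid_height == 0 disables the optimization entirely.
    assume_valid_height == 0 || height > assume_valid_height
}
```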

Configuration

[performance]
assume_valid_height = 700000  # Skip signatures before this height

Environment Variable:

ASSUME_VALID_HEIGHT=700000

Code: block.rs

Performance Impact

  • 10-50x faster IBD: Signature verification is the bottleneck
  • Safe: Only skips signatures, validates everything else
  • Configurable: Can be disabled (set to 0) for maximum safety

Parallel Transaction Validation

Architecture

Within a block, transaction validation is parallelized where safe:

  1. Phase 1: Parallel Validation (read-only UTXO access)

    • Transaction structure validation
    • Input validation
    • Fee calculation
    • Script verification (read-only)
  2. Phase 2: Sequential Application (write operations)

    • UTXO set updates
    • State transitions
    • Maintains correctness

Code: block.rs

Implementation

#![allow(unused)]
fn main() {
#[cfg(feature = "rayon")]
{
    use rayon::prelude::*;
    // Phase 1: Parallel validation (read-only)
    let validation_results: Vec<Result<...>> = block
        .transactions
        .par_iter()
        .map(|tx| {
            // Validate transaction structure (read-only)
            check_transaction(tx)?;
            // Check inputs and calculate fees (read-only UTXO access)
            check_tx_inputs(tx, &utxo_cache, height)
        })
        .collect();

    // Phase 2: Sequential application (write operations)
    for (tx, validation) in block.transactions.iter().zip(validation_results) {
        validation?;
        apply_transaction(tx, &mut utxo_set)?;
    }
}
}

Code: block.rs

Advanced Indexing

Address Indexing

Indexes transactions by address for fast lookup:

  • Address Database: Maps addresses to transaction history
  • Fast Lookup: O(1) address-to-transaction mapping
  • Incremental Updates: Updates on each block

Code: INDEXING_OPTIMIZATIONS.md
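The address-to-history mapping can be sketched in memory. This is an illustration under assumptions: `AddressIndex`, `index_tx`, and `history` are hypothetical names, and the real index is persisted to disk and updated incrementally per block.

```rust
use std::collections::HashMap;

/// Minimal in-memory address index: address -> list of txids.
#[derive(Default)]
struct AddressIndex {
    by_address: HashMap<String, Vec<String>>,
}

impl AddressIndex {
    /// Record that a transaction touched an address (called per block).
    fn index_tx(&mut self, address: &str, txid: &str) {
        self.by_address
            .entry(address.to_string())
            .or_default()
            .push(txid.to_string());
    }

    /// O(1) average-case lookup of an address's transaction history.
    fn history(&self, address: &str) -> &[String] {
        self.by_address
            .get(address)
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}
```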

Value Range Indexing

Indexes UTXOs by value range for efficient queries:

  • Range Queries: Find UTXOs in value ranges
  • Optimized Lookups: Faster than scanning all UTXOs
  • Memory Efficient: Sparse indexing structure
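The range-query idea can be sketched with an ordered map. Hypothetical names throughout (`ValueIndex`, `in_range`); the point is that an ordered index answers value-range queries without scanning every UTXO.

```rust
use std::collections::BTreeMap;

/// Sparse value index: value in satoshis -> outpoints at that value.
/// BTreeMap keeps keys ordered, so a range query walks only matching entries.
#[derive(Default)]
struct ValueIndex {
    by_value: BTreeMap<u64, Vec<String>>,
}

impl ValueIndex {
    fn insert(&mut self, value: u64, outpoint: &str) {
        self.by_value
            .entry(value)
            .or_default()
            .push(outpoint.to_string());
    }

    /// All outpoints whose value falls in [min, max], in ascending value order.
    fn in_range(&self, min: u64, max: u64) -> Vec<&str> {
        self.by_value
            .range(min..=max)
            .flat_map(|(_, ops)| ops.iter().map(String::as_str))
            .collect()
    }
}
```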

Runtime Optimizations

Constant Folding

Pre-computed constants avoid runtime computation:

#![allow(unused)]
fn main() {
pub mod precomputed_constants {
    pub const U64_MAX: u64 = u64::MAX;
    pub const MAX_MONEY_U64: u64 = MAX_MONEY as u64;
    pub const BTC_PER_SATOSHI: f64 = 1.0 / (SATOSHIS_PER_BTC as f64);
}
}

Code: optimizations.rs

Bounds Check Optimization

Optimized bounds checking for proven-safe access patterns:

#![allow(unused)]
fn main() {
pub fn get_proven<T>(slice: &[T], index: usize, bound_check: bool) -> Option<&T> {
    if bound_check {
        slice.get(index)
    } else {
        // Unsafe only when bounds are statically proven
        unsafe { ... }
    }
}
}

Code: optimizations.rs

Cache-Friendly Memory Layouts

32-byte aligned hash structures for better cache performance:

#![allow(unused)]
fn main() {
#[repr(align(32))]
pub struct CacheAlignedHash([u8; 32]);
}

Code: optimizations.rs

Performance Configuration

Configuration Options

[performance]
# Script verification threads (0 = auto-detect)
script_verification_threads = 0

# Parallel batch size
parallel_batch_size = 8

# Enable SIMD optimizations
enable_simd_optimizations = true

# Enable cache optimizations
enable_cache_optimizations = true

# Enable batch UTXO lookups
enable_batch_utxo_lookups = true

Code: config.rs

Default Values

  • script_verification_threads: 0 (auto-detect from CPU count)
  • parallel_batch_size: 8 transactions per batch
  • enable_simd_optimizations: true
  • enable_cache_optimizations: true
  • enable_batch_utxo_lookups: true

Code: config.rs

Benchmark Results

Benchmark results are available at benchmarks.thebitcoincommons.org, generated by workflows in blvm-bench.

Performance Improvements

  • Parallel Validation: 4-8x speedup for deep blocks
  • Batch UTXO Lookups: 2-3x speedup for fee calculation
  • Assume-Valid Checkpoints: 10-50x faster IBD
  • Cache-Friendly Layouts: 10-30% improvement for hash operations

Components

The performance optimization system includes:

  • Parallel block validation
  • Batch UTXO operations
  • Assume-valid checkpoints
  • Parallel transaction validation
  • Advanced indexing (address, value range)
  • Runtime optimizations (constant folding, bounds checks, cache-friendly layouts)
  • Performance configuration

Location: blvm-consensus/src/optimizations.rs, blvm-consensus/src/block.rs, blvm-consensus/src/config.rs, blvm-node/src/validation/mod.rs

See Also