Performance and Benchmarks

The spectrograms library is designed for high performance, with a Rust core and zero-copy Python bindings.

Benchmark Results

Benchmarks comparing spectrograms against NumPy and SciPy implementations are available in the PYTHON_BENCHMARK.md file.

Summary

Average speedups across all parameter configurations and signal types:

| Operation | Rust (ms) | NumPy (ms) | SciPy (ms) | Avg Speedup (vs NumPy / SciPy) |
|-----------|-----------|------------|------------|--------------------------------|
| Power     | 0.126     | 0.205      | 0.327      | 1.6x / 2.6x                    |
| Magnitude | 0.140     | 0.198      | 0.319      | 1.4x / 2.3x                    |
| Decibels  | 0.257     | 0.350      | 0.451      | 1.4x / 1.8x                    |
| Mel       | 0.180     | 0.630      | 0.612      | 3.5x / 3.4x                    |
| LogHz     | 0.178     | 0.547      | 0.534      | 3.1x / 3.0x                    |
| ERB       | 0.601     | 3.713      | 3.714      | 6.2x / 6.2x                    |

Key Findings

  1. Filterbank operations (Mel, ERB, LogHz) show the largest speedups (3-6x) due to:

    • Pre-computed filterbanks cached in plans

    • Sparse matrix operations

    • Minimal memory allocation

  2. Basic operations (Power, Magnitude, dB) show 1.4-2.6x speedups from:

    • Rust’s compiled, low-overhead core

    • Zero-copy NumPy integration

    • GIL release during computation

  3. Consistency: low standard deviations across runs indicate reliable, predictable performance.

Why spectrograms is Faster

The library achieves superior performance through several optimizations that are applied automatically:

Pre-computed Filterbanks

When using the planner API, filterbanks (Mel, ERB, LogHz) are computed once and cached:

planner = sg.SpectrogramPlanner()
plan = planner.mel_db_plan(params, mel_params, db_params)  # filterbank computed once here

for signal in signals:
    spec = plan.compute(signal)  # reuses the cached filterbank

Typical NumPy/SciPy implementations rebuild the filterbank on every call, repeating the same work for every signal.

Sparse Matrix Operations

Filterbanks are stored as sparse matrices and applied using optimized sparse matrix-vector multiplication, avoiding unnecessary computations on zero elements.
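The effect is easy to see outside the library. The sketch below (plain NumPy/SciPy with a toy filterbank, not the library's Rust internals) applies the same triangular filterbank both densely and as a CSR sparse matrix; the results match, but the sparse form only visits the non-zero filter weights:

```python
import numpy as np
from scipy import sparse

n_filters, n_bins = 40, 257
rng = np.random.default_rng(0)

# Toy triangular-style filterbank: each filter covers ~12 adjacent
# FFT bins, so most of the matrix is zero.
dense_fb = np.zeros((n_filters, n_bins))
for i in range(n_filters):
    start = i * 6
    dense_fb[i, start:start + 12] = np.hanning(12)

power_frame = rng.random(n_bins)  # one frame of a power spectrogram

# Dense multiply touches every element, zeros included...
mel_dense = dense_fb @ power_frame

# ...while the CSR form stores and multiplies only non-zero weights.
sparse_fb = sparse.csr_matrix(dense_fb)
mel_sparse = sparse_fb @ power_frame

assert np.allclose(mel_dense, mel_sparse)
```

Both products are identical; the sparse version simply skips the zeros, which is where the savings come from as filterbanks grow.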

Memory Efficiency

The Rust implementation uses:

  • Pre-allocated workspace buffers

  • Minimal temporary allocations

  • Efficient memory layouts
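As a rough analogy in Python (plain NumPy, not the library's Rust code), reusing one pre-allocated output buffer avoids a fresh allocation on every frame:

```python
import numpy as np

rng = np.random.default_rng(0)
frames = rng.random((100, 1024))   # 100 frames of a signal
window = np.hanning(1024)
windowed = np.empty(1024)          # workspace buffer, allocated once

for frame in frames:
    # `out=` writes into the existing buffer instead of allocating
    # a new array each iteration.
    np.multiply(frame, window, out=windowed)
    spectrum = np.fft.rfft(windowed)
```

The Rust core applies the same idea throughout: scratch buffers live in the plan and are reused across calls.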

GIL Release

All computation functions release Python’s Global Interpreter Lock (GIL), enabling:

  • Parallel processing of multiple files across threads

  • Concurrent computation with other Python operations

Optimization Tips

1. Use the Planner API

Always use plans for batch processing:

# ❌ Slow: Creates new plan every iteration
for signal in signals:
    spec = sg.compute_mel_db_spectrogram(signal, params, mel_params, db_params)

# ✅ Fast: Reuses plan
planner = sg.SpectrogramPlanner()
plan = planner.mel_db_plan(params, mel_params, db_params)
for signal in signals:
    spec = plan.compute(signal)

Speedup: 1.5-3x depending on operation type.

2. Choose Power-of-2 FFT Sizes

FFT algorithms are typically fastest at power-of-2 sizes:

# ✅ Fast
stft = sg.StftParams(n_fft=512, ...)   # 2^9
stft = sg.StftParams(n_fft=1024, ...)  # 2^10
stft = sg.StftParams(n_fft=2048, ...)  # 2^11

# ❌ Slower
stft = sg.StftParams(n_fft=1000, ...)  # Not power-of-2
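If your desired length is not a power of two, rounding up is cheap. `next_pow2` below is a hypothetical helper, not part of the spectrograms API:

```python
def next_pow2(n: int) -> int:
    """Smallest power of two >= n."""
    return 1 << (n - 1).bit_length()

print(next_pow2(1000))  # -> 1024
print(next_pow2(512))   # -> 512
```

Rounding `n_fft` up to `next_pow2(n)` adds zero-padding but usually computes faster than an awkward composite size.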

3. Streaming for Real-Time Applications

For real-time processing, use frame-by-frame computation:

plan = planner.mel_db_plan(params, mel_params, db_params)

for frame_idx in range(n_frames):
    frame_data = plan.compute_frame(signal, frame_idx)
    # Process frame immediately

This minimizes latency and memory usage.

4. Batch Processing with Parallelism

Since computation releases the GIL, process multiple files in parallel:

from concurrent.futures import ThreadPoolExecutor

planner = sg.SpectrogramPlanner()
plan = planner.mel_db_plan(params, mel_params, db_params)

def process_file(signal):
    return plan.compute(signal)

with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(process_file, signals))

5. Choose the Right Backend

The library supports two FFT backends:

  • RealFFT (default): Pure Rust, no dependencies, good performance

  • FFTW: Requires system library, may be faster for specific sizes

If performance is critical, benchmark both backends for your specific use case.
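A minimal harness for such a comparison might look like this. `bench` is a hypothetical helper built on the standard-library `timeit` module, not part of the library, and how you select the FFT backend is library-specific and not shown here:

```python
import timeit

def bench(fn, *args, repeat=5, number=100):
    """Best-of-`repeat` average seconds per call of fn(*args)."""
    times = timeit.repeat(lambda: fn(*args), repeat=repeat, number=number)
    return min(times) / number
```

Point it at the same work under each backend, e.g. `bench(plan.compute, signal)` with a plan built per backend, and compare the returned per-call times.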

Backend Comparison

Performance depends on FFT size, system architecture, and available SIMD instructions. General guidelines:

| FFT Size            | RealFFT   | FFTW                       |
|---------------------|-----------|----------------------------|
| Small (≤ 512)       | Excellent | Excellent                  |
| Medium (1024-2048)  | Excellent | Excellent (slightly faster)|
| Large (≥ 4096)      | Good      | Better                     |

Both backends provide substantial speedups over NumPy/SciPy.

Measuring Your Performance

Use the included benchmark notebook to measure performance on your system:

# Install development dependencies
pip install jupyter matplotlib seaborn

# Run benchmark notebook
jupyter lab python/examples/notebook.ipynb

This provides detailed timings for your specific hardware and configurations.

See Also