# Performance and Benchmarks
The spectrograms library is designed for high performance, with a Rust core and zero-copy Python bindings.
## Benchmark Results
Benchmarks comparing `spectrograms` against NumPy and SciPy implementations are available in `PYTHON_BENCHMARK.md`.
### Summary
Average timings and speedups across all parameter configurations and signal types:
| Operation | Rust (ms) | NumPy (ms) | SciPy (ms) | Avg Speedup (vs NumPy / vs SciPy) |
|---|---|---|---|---|
| Power | 0.126 | 0.205 | 0.327 | 1.6x / 2.6x |
| Magnitude | 0.140 | 0.198 | 0.319 | 1.4x / 2.3x |
| Decibels | 0.257 | 0.350 | 0.451 | 1.4x / 1.8x |
| Mel | 0.180 | 0.630 | 0.612 | 3.5x / 3.4x |
| LogHz | 0.178 | 0.547 | 0.534 | 3.1x / 3.0x |
| ERB | 0.601 | 3.713 | 3.714 | 6.2x / 6.2x |
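The speedup columns are simply ratios of the mean timings, and can be reproduced directly from the table:

```python
# Reproduce the speedup columns from the mean timings above (times in ms).
times = {
    # operation: (rust, numpy, scipy)
    "Power":     (0.126, 0.205, 0.327),
    "Magnitude": (0.140, 0.198, 0.319),
    "Decibels":  (0.257, 0.350, 0.451),
    "Mel":       (0.180, 0.630, 0.612),
    "LogHz":     (0.178, 0.547, 0.534),
    "ERB":       (0.601, 3.713, 3.714),
}

for op, (rust, numpy, scipy) in times.items():
    print(f"{op}: {numpy / rust:.1f}x / {scipy / rust:.1f}x")
```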
### Key Findings
**Filterbank operations** (Mel, ERB, LogHz) show the largest speedups (3-6x) due to:

- Pre-computed filterbanks cached in plans
- Sparse matrix operations
- Minimal memory allocation
**Basic operations** (Power, Magnitude, dB) show 1.4-2.6x speedups from:

- Rust's performance
- Zero-copy NumPy integration
- GIL release during computation
**Consistency:** Low standard deviations show reliable, predictable performance.
## Why `spectrograms` is Faster
The library achieves superior performance through several optimizations that are applied automatically:
### Pre-computed Filterbanks
When using the planner API, filterbanks (Mel, ERB, LogHz) are computed once and cached:
```python
planner = sg.SpectrogramPlanner()
plan = planner.mel_db_plan(params, mel_params, db_params)  # filterbank computed once here

for signal in signals:
    spec = plan.compute(signal)  # reuses the cached filterbank
```
NumPy/SciPy recompute filterbanks on every call, wasting time on redundant calculations.
### Sparse Matrix Operations
Filterbanks are stored as sparse matrices and applied using optimized sparse matrix-vector multiplication, avoiding unnecessary computations on zero elements.
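The idea can be sketched with a minimal CSR-style sparse product in plain Python (illustrative only; the library's internal representation may differ):

```python
# Illustrative sparse matrix-vector product: each filterbank row stores only
# its nonzero weights and their column indices, so the dot product skips the
# zero elements entirely.
def sparse_matvec(rows, x):
    """rows: list of (indices, weights) pairs, one per filterbank band."""
    return [sum(w * x[i] for i, w in zip(idx, wts)) for idx, wts in rows]

# A toy 2-band filterbank over a 5-bin power spectrum:
filterbank = [
    ([0, 1, 2], [0.5, 1.0, 0.5]),  # band 0 covers bins 0-2
    ([2, 3, 4], [0.5, 1.0, 0.5]),  # band 1 covers bins 2-4
]
power = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sparse_matvec(filterbank, power))  # → [4.0, 8.0]
```

Because each mel/ERB band overlaps only a narrow range of FFT bins, most of the full matrix is zero, which is why the sparse form pays off.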
### Memory Efficiency
The Rust implementation uses:

- Pre-allocated workspace buffers
- Minimal temporary allocations
- Efficient memory layouts
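The workspace-buffer pattern has a NumPy analogue that may help picture it (a sketch, not the library's actual internals):

```python
import numpy as np

# Illustrative analogue of a pre-allocated workspace buffer: reuse one
# output array across iterations instead of allocating a new array per frame.
frames = np.random.default_rng(0).standard_normal((100, 257))
workspace = np.empty_like(frames[0])

for frame in frames:
    np.multiply(frame, frame, out=workspace)  # power spectrum, no new allocation
    # ... consume `workspace` before the next iteration ...
```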
### GIL Release
All computation functions release Python's Global Interpreter Lock (GIL), enabling:

- Parallel processing of multiple files across threads
- Concurrent computation with other Python operations
## Optimization Tips
### 1. Use the Planner API
Always use plans for batch processing:
```python
# ❌ Slow: creates a new plan every iteration
for signal in signals:
    spec = sg.compute_mel_db_spectrogram(signal, params, mel_params, db_params)

# ✅ Fast: reuses one plan
planner = sg.SpectrogramPlanner()
plan = planner.mel_db_plan(params, mel_params, db_params)
for signal in signals:
    spec = plan.compute(signal)
```
Speedup: 1.5-3x depending on operation type.
### 2. Choose Power-of-2 FFT Sizes
FFT algorithms are optimized for power-of-2 sizes:
```python
# ✅ Fast
stft = sg.StftParams(n_fft=512, ...)   # 2^9
stft = sg.StftParams(n_fft=1024, ...)  # 2^10
stft = sg.StftParams(n_fft=2048, ...)  # 2^11

# ❌ Slower
stft = sg.StftParams(n_fft=1000, ...)  # not a power of 2
```
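If your preferred frame length is not already a power of two, a small helper (not part of the library) can round it up:

```python
def next_pow2(n: int) -> int:
    """Smallest power of two >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()

print(next_pow2(1000))  # → 1024
print(next_pow2(512))   # → 512
```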
### 3. Streaming for Real-Time Applications
For real-time processing, use frame-by-frame computation:
```python
plan = planner.mel_db_plan(params, mel_params, db_params)

for frame_idx in range(n_frames):
    frame_data = plan.compute_frame(signal, frame_idx)
    # process the frame immediately
```
This minimizes latency and memory usage.
### 4. Batch Processing with Parallelism
Since computation releases the GIL, process multiple files in parallel:
```python
from concurrent.futures import ThreadPoolExecutor

planner = sg.SpectrogramPlanner()
plan = planner.mel_db_plan(params, mel_params, db_params)

def process_file(signal):
    return plan.compute(signal)

with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(process_file, signals))
```
### 5. Choose the Right Backend
The library supports two FFT backends:
- **RealFFT** (default): pure Rust, no external dependencies, good performance
- **FFTW**: requires a system library, may be faster for specific sizes
If performance is critical, benchmark both backends for your specific use case.
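A generic timing harness built on the standard library's `timeit` works for this; pass it any zero-argument callable that runs one full computation (e.g. `lambda: plan.compute(signal)` with a plan built for each backend):

```python
import timeit

def bench(fn, repeat=5, number=100):
    """Best-of-`repeat` average time per call, in milliseconds."""
    best = min(timeit.repeat(fn, repeat=repeat, number=number))
    return 1000.0 * best / number

# Example with a trivial stand-in workload:
print(f"{bench(lambda: sum(range(1000))):.3f} ms")
```

Taking the minimum over several repeats reduces noise from other processes, so the comparison between backends is more stable.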
## Backend Comparison
Performance depends on FFT size, system architecture, and available SIMD instructions. General guidelines:
| FFT Size | RealFFT | FFTW |
|---|---|---|
| Small (≤ 512) | Excellent | Excellent |
| Medium (1024-2048) | Excellent | Excellent (slightly faster) |
| Large (≥ 4096) | Good | Better |
Both backends provide substantial speedups over NumPy/SciPy.
## Measuring Your Performance
Use the included benchmark notebook to measure performance on your system:
```bash
# Install development dependencies
pip install jupyter matplotlib seaborn

# Run the benchmark notebook
jupyter lab python/examples/notebook.ipynb
```
This provides detailed timings for your specific hardware and configurations.
## See Also
- `PYTHON_BENCHMARK.md` - Full benchmark results
- Batch Processing - Efficient batch processing
- `python/examples/fft_performance_analysis.py` - Performance analysis example