Image Processing with 2D FFT

The spectrograms library provides comprehensive 2D FFT operations for image processing, including spatial filtering, convolution, edge detection, and more.

Overview

2D FFT operations allow you to:

Transform images to frequency domain for analysis
Apply efficient convolution (faster than spatial convolution for large kernels)
Perform spatial filtering (low-pass, high-pass, band-pass)
Detect edges and enhance features
Sharpen images

All operations work with 2D NumPy arrays and leverage the same high-performance Rust backend used for audio spectrograms.

Basic 2D FFT

Computing the 2D FFT of an image:

import numpy as np
import spectrograms as sg

# Create or load an image (256x256)
image = np.random.randn(256, 256)

# Compute 2D FFT
spectrum = sg.fft2d(image)
print(f"Spectrum shape: {spectrum.shape}")  # (256, 129)

The output shape is (nrows, ncols/2 + 1) due to Hermitian symmetry for real-valued input.

Inverse 2D FFT

Reconstruct the image from its frequency representation:

# Reconstruct original image
reconstructed = sg.ifft2d(spectrum, output_ncols=256)

# Verify reconstruction
assert np.allclose(image, reconstructed)

Power and Magnitude Spectra

Analyze frequency content:

# Compute power spectrum (squared magnitude)
power = sg.power_spectrum_2d(image)

# Compute magnitude spectrum
magnitude = sg.magnitude_spectrum_2d(image)

# Visualize centered spectrum
power_centered = sg.fftshift(power)

Convolution

FFT-based convolution is faster than spatial convolution for large kernels.

Gaussian Blur

import spectrograms as sg

# Create Gaussian kernel
kernel = sg.gaussian_kernel_2d(size=9, sigma=2.0)

# Apply blur via FFT convolution
blurred = sg.convolve_fft(image, kernel)

For small kernels (< 5x5), spatial convolution may be faster. FFT convolution excels with larger kernels.

Custom Kernels

Use any kernel for convolution:

# Create a simple averaging kernel
kernel = np.ones((5, 5)) / 25.0

# Apply convolution
smoothed = sg.convolve_fft(image, kernel)

Spatial Filtering

Spatial filters modify frequency content to enhance or suppress certain features.

Low-Pass Filter

Removes high-frequency components (smoothing):

# Keep only low frequencies (0-30% of max frequency)
smoothed = sg.lowpass_filter(image, cutoff=0.3)

High-Pass Filter

Removes low-frequency components (edge enhancement):

# Remove low frequencies, keep edges
edges = sg.highpass_filter(image, cutoff=0.1)

Band-Pass Filter

Keeps frequencies within a specific range:

# Keep mid-range frequencies
filtered = sg.bandpass_filter(image, low_cutoff=0.1, high_cutoff=0.5)

Cutoff values are normalized frequencies in the range [0, 1], where 1.0 represents the Nyquist frequency.

Edge Detection

FFT-based edge detection emphasizes high-frequency components:

# Detect edges in frequency domain
edges = sg.detect_edges_fft(image)

This is equivalent to a high-pass filter optimized for edge detection.

Image Sharpening

Enhance edges while preserving overall structure:

# Sharpen image (amount controls strength)
sharpened = sg.sharpen_fft(image, amount=1.5)

Higher amount values produce stronger sharpening effects.

Batch Processing with Fft2dPlanner

For processing multiple images efficiently, use the planner API to reuse FFT plans:

import spectrograms as sg
import numpy as np

# Create multiple images
images = [
    np.random.randn(256, 256),
    np.random.randn(256, 256),
    np.random.randn(256, 256),
]

# Create planner once
planner = sg.Fft2dPlanner()

# Process all images (reuses FFT plan)
spectra = []
for image in images:
    spectrum = planner.fft2d(image)
    spectra.append(spectrum)

The planner caches FFT plans for the given image dimensions, providing a speedup for batch processing.

Performance Considerations

When to use FFT-based operations:

Large kernels (> 7x7) - FFT convolution is much faster
Multiple operations on the same image - Combine in frequency domain
Batch processing - Use Fft2dPlanner for plan reuse

When spatial methods may be faster:

Very small kernels (< 5x5) - Spatial convolution has less overhead
Single operations - FFT setup cost may dominate

Memory usage:

FFT operations require additional memory for complex frequency arrays
Input shape (M, N) produces frequency array of shape (M, N//2+1) (complex64)

Tips and Best Practices

Normalize input: For best results with filtering, normalize image values to [0, 1] or [-1, 1]
Avoid edge artifacts: Consider padding images before FFT to reduce edge effects
Choose appropriate cutoffs: Start with conservative cutoff values (0.1-0.3) and adjust based on results
Combine operations: Apply multiple filters in frequency domain before inverse FFT to minimize overhead
Use planner for batches: Always use Fft2dPlanner when processing multiple images of the same size