Convenience Functions

High-level functions for one-shot computation. For batch processing, use the Planner API instead.

Audio Processing Functions

Linear Spectrograms

spectrograms.compute_linear_power_spectrogram()

Compute a linear power spectrogram.

Parameters

samples : numpy.typing.NDArray[numpy.float64]

Audio samples as a 1D NumPy array

params : SpectrogramParams

Spectrogram parameters

Returns

Spectrogram

Spectrogram with linear frequency scale and power amplitude scale

spectrograms.compute_linear_magnitude_spectrogram()

Compute a linear magnitude spectrogram.

Parameters

samples : numpy.typing.NDArray[numpy.float64]

Audio samples as a 1D NumPy array

params : SpectrogramParams

Spectrogram parameters

Returns

Spectrogram

Spectrogram with linear frequency scale and magnitude amplitude scale

spectrograms.compute_linear_db_spectrogram()

Compute a linear decibel spectrogram.

Parameters

samples : numpy.typing.NDArray[numpy.float64]

Audio samples as a 1D NumPy array

params : SpectrogramParams

Spectrogram parameters

Returns

Spectrogram

Spectrogram with linear frequency scale and decibel amplitude scale
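
The three linear variants differ only in amplitude scale. The following NumPy sketch shows how the scales relate for a single frame. It is an illustration, not the library's implementation: the Hann window, the frame length, and the 1e-10 floor are assumptions chosen for the example.

```python
import numpy as np

# Illustrative only: one windowed frame of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000.0
n_fft = 512
t = np.arange(n_fft) / sample_rate
frame = np.sin(2.0 * np.pi * 440.0 * t) * np.hanning(n_fft)

spectrum = np.fft.rfft(frame)                    # complex bins, length n_fft // 2 + 1
magnitude = np.abs(spectrum)                     # magnitude scale
power = magnitude ** 2                           # power scale
floor = 1e-10                                    # assumed floor to avoid log(0)
db = 10.0 * np.log10(np.maximum(power, floor))   # decibel scale
```

Because power is the squared magnitude, 10·log10(power) and 20·log10(magnitude) give the same decibel values.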

Mel Spectrograms

spectrograms.compute_mel_power_spectrogram()

Compute a mel power spectrogram.

Parameters

samples : numpy.typing.NDArray[numpy.float64]

Audio samples as a 1D array

params : SpectrogramParams

Spectrogram parameters

filter_params : PyMelParams

Filterbank parameters

db : typing.Optional[LogParams], optional

Optional decibel scaling parameters

Returns

Spectrogram

Spectrogram with mel frequency scale and power amplitude scale

spectrograms.compute_mel_magnitude_spectrogram()

Compute a mel magnitude spectrogram.

Parameters

samples : numpy.typing.NDArray[numpy.float64]

Audio samples as a 1D array

params : SpectrogramParams

Spectrogram parameters

filter_params : PyMelParams

Filterbank parameters

db : typing.Optional[LogParams], optional

Optional decibel scaling parameters

Returns

Spectrogram

Spectrogram with mel frequency scale and magnitude amplitude scale

spectrograms.compute_mel_db_spectrogram()

Compute a mel decibel spectrogram.

Parameters

samples : numpy.typing.NDArray[numpy.float64]

Audio samples as a 1D array

params : SpectrogramParams

Spectrogram parameters

filter_params : PyMelParams

Filterbank parameters

db : typing.Optional[LogParams], optional

Optional decibel scaling parameters

Returns

Spectrogram

Spectrogram with mel frequency scale and decibel amplitude scale
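
For intuition about the mel frequency axis, the sketch below places band centres at equal spacing on a mel scale. It assumes the HTK-style formula; the library may use a different mel variant, and the helper names and band count here are hypothetical.

```python
import numpy as np

def hz_to_mel(hz):
    """HTK-style mel mapping (an assumption; other variants exist)."""
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=np.float64) / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=np.float64) / 2595.0) - 1.0)

# Centre frequencies of 8 hypothetical mel bands between 0 Hz and 8 kHz:
# equally spaced in mel, so increasingly far apart in Hz.
n_mels = 8
mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(8000.0), n_mels + 2)
centres_hz = mel_to_hz(mel_points)[1:-1]
```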

ERB Spectrograms

spectrograms.compute_erb_power_spectrogram()

Compute an ERB/gammatone power spectrogram.

Parameters

samples : numpy.typing.NDArray[numpy.float64]

Audio samples as a 1D array

params : SpectrogramParams

Spectrogram parameters

filter_params : PyErbParams

Filterbank parameters

db : typing.Optional[LogParams], optional

Optional decibel scaling parameters

Returns

Spectrogram

Spectrogram with ERB/gammatone frequency scale and power amplitude scale

spectrograms.compute_erb_magnitude_spectrogram()

Compute an ERB/gammatone magnitude spectrogram.

Parameters

samples : numpy.typing.NDArray[numpy.float64]

Audio samples as a 1D array

params : SpectrogramParams

Spectrogram parameters

filter_params : PyErbParams

Filterbank parameters

db : typing.Optional[LogParams], optional

Optional decibel scaling parameters

Returns

Spectrogram

Spectrogram with ERB/gammatone frequency scale and magnitude amplitude scale

spectrograms.compute_erb_db_spectrogram()

Compute an ERB/gammatone decibel spectrogram.

Parameters

samples : numpy.typing.NDArray[numpy.float64]

Audio samples as a 1D array

params : SpectrogramParams

Spectrogram parameters

filter_params : PyErbParams

Filterbank parameters

db : typing.Optional[LogParams], optional

Optional decibel scaling parameters

Returns

Spectrogram

Spectrogram with ERB/gammatone frequency scale and decibel amplitude scale
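
As background on the ERB frequency axis, this sketch uses the widely cited Glasberg and Moore approximation for equivalent rectangular bandwidth. Whether the library uses exactly these constants is an assumption, and the function names, frequency range, and filter count are hypothetical.

```python
import numpy as np

def erb_bandwidth(hz):
    """Equivalent rectangular bandwidth in Hz at centre frequency hz."""
    return 24.7 * (4.37 * np.asarray(hz, dtype=np.float64) / 1000.0 + 1.0)

def hz_to_erb_rate(hz):
    """Position on the ERB-rate scale."""
    return 21.4 * np.log10(1.0 + 0.00437 * np.asarray(hz, dtype=np.float64))

def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (np.asarray(erb, dtype=np.float64) / 21.4) - 1.0) / 0.00437

# 16 hypothetical filter centres, equally spaced on the ERB-rate scale.
n_filters = 16
erb_points = np.linspace(hz_to_erb_rate(50.0), hz_to_erb_rate(8000.0), n_filters)
centres_hz = erb_rate_to_hz(erb_points)
bandwidths_hz = erb_bandwidth(centres_hz)
```

Filters become both wider and further apart as the centre frequency rises.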

LogHz Spectrograms

spectrograms.compute_loghz_power_spectrogram()

Compute a logarithmic Hz power spectrogram.

Parameters

samples : numpy.typing.NDArray[numpy.float64]

Audio samples as a 1D array

params : SpectrogramParams

Spectrogram parameters

filter_params : PyLogHzParams

Filterbank parameters

db : typing.Optional[LogParams], optional

Optional decibel scaling parameters

Returns

Spectrogram

Spectrogram with logarithmic Hz frequency scale and power amplitude scale

spectrograms.compute_loghz_magnitude_spectrogram()

Compute a logarithmic Hz magnitude spectrogram.

Parameters

samples : numpy.typing.NDArray[numpy.float64]

Audio samples as a 1D array

params : SpectrogramParams

Spectrogram parameters

filter_params : PyLogHzParams

Filterbank parameters

db : typing.Optional[LogParams], optional

Optional decibel scaling parameters

Returns

Spectrogram

Spectrogram with logarithmic Hz frequency scale and magnitude amplitude scale

spectrograms.compute_loghz_db_spectrogram()

Compute a logarithmic Hz decibel spectrogram.

Parameters

samples : numpy.typing.NDArray[numpy.float64]

Audio samples as a 1D array

params : SpectrogramParams

Spectrogram parameters

filter_params : PyLogHzParams

Filterbank parameters

db : typing.Optional[LogParams], optional

Optional decibel scaling parameters

Returns

Spectrogram

Spectrogram with logarithmic Hz frequency scale and decibel amplitude scale
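
A logarithmic Hz axis places band centres at a constant frequency ratio rather than a constant difference. A minimal NumPy sketch of such spacing follows; the range and bin count are hypothetical and do not reflect PyLogHzParams defaults.

```python
import numpy as np

# Hypothetical range spanning exactly 7 octaves (55 Hz to 7040 Hz).
f_min, f_max, n_bins = 55.0, 7040.0, 8
centres_hz = np.geomspace(f_min, f_max, n_bins)

# Neighbouring centres differ by a constant ratio (here one octave).
ratios = centres_hz[1:] / centres_hz[:-1]
```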

Audio Features

spectrograms.compute_cqt()

Compute a Constant-Q Transform power spectrogram.

Parameters

samples : numpy.typing.NDArray[numpy.float64]

Audio samples as a 1D NumPy array

params : SpectrogramParams

Spectrogram parameters

cqt : CqtParams

CQT parameters

db : typing.Optional[LogParams], optional

Optional decibel scaling parameters

Returns

Spectrogram

CQT spectrogram with power amplitude scale

spectrograms.compute_chromagram()

Compute a chromagram (pitch class profile).

Parameters

samples : numpy.typing.NDArray[numpy.float64]

Audio samples as a 1D NumPy array

stft_params : StftParams

STFT parameters

sample_rate : float

Sample rate in Hz

chroma_params : ChromaParams

Chromagram parameters

Returns

numpy.ndarray

Chromagram as a 2D NumPy array (12 x n_frames)
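
To show what the 12 rows represent, the sketch below folds a toy spectrum into pitch classes. It assumes equal temperament with A4 = 440 Hz and row 0 = C; the library's actual tuning, row order, and weighting are set by ChromaParams and may differ.

```python
import numpy as np

def pitch_class(freq_hz, tuning_hz=440.0):
    """Map frequencies to pitch classes 0..11 (0 = C), assuming A4 = tuning_hz."""
    midi = 69.0 + 12.0 * np.log2(np.asarray(freq_hz, dtype=np.float64) / tuning_hz)
    return np.round(midi).astype(int) % 12

# Fold a toy power spectrum into 12 chroma bins.
freqs = np.array([220.0, 261.63, 440.0, 523.25, 880.0])   # A3, C4, A4, C5, A5
power = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
chroma = np.zeros(12)
np.add.at(chroma, pitch_class(freqs), power)   # octaves of a note share one bin
```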

spectrograms.compute_mfcc()

Compute MFCCs (Mel-Frequency Cepstral Coefficients).

Parameters

samples : numpy.typing.NDArray[numpy.float64]

Audio samples as a 1D NumPy array

stft_params : StftParams

STFT parameters

sample_rate : float

Sample rate in Hz

n_mels : int

Number of mel bands

mfcc_params : MfccParams

MFCC parameters

Returns

numpy.ndarray

MFCCs as a 2D NumPy array (n_mfcc x n_frames)
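
MFCCs are conventionally obtained by applying a DCT-II to log mel energies along the frequency axis. The sketch below shows only that final step on stand-in data. The orthonormal scaling and the sizes used are assumptions, not confirmed library behaviour.

```python
import numpy as np

def dct_ii_basis(n_mfcc, n_mels):
    """Orthonormal DCT-II basis with shape (n_mfcc, n_mels)."""
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    basis *= np.sqrt(2.0 / n_mels)
    basis[0] /= np.sqrt(2.0)
    return basis

rng = np.random.default_rng(0)
n_mels, n_frames, n_mfcc = 40, 10, 13
# Stand-in for log mel energies with shape (n_mels, n_frames).
log_mel = np.log(rng.uniform(0.1, 1.0, size=(n_mels, n_frames)))
mfcc = dct_ii_basis(n_mfcc, n_mels) @ log_mel   # shape (n_mfcc, n_frames)
```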

Low-Level Audio Functions

spectrograms.compute_stft()

Compute the raw STFT (Short-Time Fourier Transform).

Returns the complex-valued STFT matrix before any frequency mapping or amplitude scaling.

Parameters

samples

Audio samples as a 1D NumPy array

params

Spectrogram parameters

Returns

Complex STFT as a 2D NumPy array of complex128 (n_fft/2+1 x n_frames)
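
The output layout (n_fft/2+1 x n_frames) can be illustrated with a naive NumPy STFT. This is a sketch under stated assumptions (Hann window, no padding or centring, hypothetical function name), not the library's implementation, so frame counts may differ from compute_stft().

```python
import numpy as np

def stft_sketch(samples, n_fft, hop):
    """Naive STFT: Hann window, no padding (both are assumptions)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(samples) - n_fft) // hop
    out = np.empty((n_fft // 2 + 1, n_frames), dtype=np.complex128)
    for i in range(n_frames):
        frame = samples[i * hop : i * hop + n_fft] * window
        out[:, i] = np.fft.rfft(frame)   # one column per frame
    return out

samples = np.random.default_rng(0).standard_normal(4096)
stft = stft_sketch(samples, n_fft=512, hop=128)
```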

Image Processing Functions

2D FFT Operations

spectrograms.fft2d(data)

Compute 2D FFT of a real-valued 2D array.

Accepts numpy arrays, Spectrogram objects, or any object implementing __array__().

Parameters

data : numpy.typing.NDArray[numpy.float64] or Spectrogram

Input 2D array (e.g., image) with shape (nrows, ncols)

Returns

numpy.typing.NDArray[numpy.complex64]

Complex 2D array with shape (nrows, ncols/2 + 1) due to Hermitian symmetry

Examples

>>> import spectrograms as sg
>>> import numpy as np
>>> image = np.random.randn(128, 128)
>>> spectrum = sg.fft2d(image)
>>> spectrum.shape
(128, 65)

spectrograms.ifft2d()

Compute inverse 2D FFT from frequency domain back to spatial domain.

Parameters

spectrum : numpy.typing.NDArray[numpy.complex64]

Complex frequency array with shape (nrows, ncols/2 + 1)

output_ncols : int

Number of columns in the output (must match original image width)

Returns

numpy.typing.NDArray[numpy.float64]

Real 2D array with shape (nrows, output_ncols)

Examples

>>> import spectrograms as sg
>>> import numpy as np
>>> image = np.random.randn(128, 128)
>>> spectrum = sg.fft2d(image)
>>> reconstructed = sg.ifft2d(spectrum, 128)
>>> np.allclose(image, reconstructed)
True
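
The reason output_ncols is required can be seen with NumPy's real-input FFT, which uses the same half-spectrum layout: an even and an odd image width can produce the same number of spectrum columns. The snippet uses numpy.fft as a stand-in and assumes the library rounds ncols/2 down in the same way.

```python
import numpy as np

# NumPy analogue of the round trip, shown for an odd width.
image = np.random.default_rng(0).standard_normal((4, 7))
spectrum = np.fft.rfft2(image)            # shape (4, 7 // 2 + 1) == (4, 4)

# A width of 6 would give the same spectrum shape,
# so the original width has to be passed explicitly.
restored = np.fft.irfft2(spectrum, s=(4, 7))
wrong_width = np.fft.irfft2(spectrum, s=(4, 6))
```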

spectrograms.power_spectrum_2d(data)

Compute 2D power spectrum (squared magnitude).

Accepts numpy arrays, Spectrogram objects, or any object implementing __array__().

Parameters

data : numpy.typing.NDArray[numpy.float64] or Spectrogram

Input 2D array with shape (nrows, ncols)

Returns

numpy.typing.NDArray[numpy.float64]

Power spectrum with shape (nrows, ncols/2 + 1)

Examples

>>> import spectrograms as sg
>>> import numpy as np
>>> image = np.ones((64, 64))
>>> power = sg.power_spectrum_2d(image)
>>> power[0, 0]  # DC component should have all energy
16777216.0

spectrograms.magnitude_spectrum_2d(data)

Compute 2D magnitude spectrum.

Accepts numpy arrays, Spectrogram objects, or any object implementing __array__().

Parameters

data : numpy.typing.NDArray[numpy.float64] or Spectrogram

Input 2D array with shape (nrows, ncols)

Returns

numpy.typing.NDArray[numpy.float64]

Magnitude spectrum with shape (nrows, ncols/2 + 1)

Frequency Shifting

spectrograms.fftshift(arr)

Shift zero-frequency component to center.

Accepts numpy arrays, Spectrogram objects, or any object implementing __array__().

Parameters

arr : numpy.typing.NDArray[numpy.float64] or Spectrogram

Input 2D array

Returns

numpy.typing.NDArray[numpy.float64]

Shifted array with DC component at center

spectrograms.ifftshift(arr)

Inverse of fftshift - shift center back to corners.

Accepts numpy arrays, Spectrogram objects, or any object implementing __array__().

Parameters

arr : numpy.typing.NDArray[numpy.float64] or Spectrogram

Input 2D array

Returns

numpy.typing.NDArray[numpy.float64]

Shifted array with DC component at corners
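
The two shift functions are easiest to understand side by side. The sketch below uses numpy.fft.fftshift and numpy.fft.ifftshift as stand-ins, on the assumption that the library follows the same convention.

```python
import numpy as np

arr = np.arange(16, dtype=np.float64).reshape(4, 4)
shifted = np.fft.fftshift(arr)        # element [0, 0] moves to the centre [2, 2]
restored = np.fft.ifftshift(shifted)  # undoes the shift

# For odd lengths, applying fftshift twice does not restore the input,
# which is why a separate inverse exists.
odd = np.arange(5.0)
twice = np.fft.fftshift(np.fft.fftshift(odd))
round_trip = np.fft.ifftshift(np.fft.fftshift(odd))
```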

Kernels

spectrograms.gaussian_kernel_2d()

Create 2D Gaussian kernel for blurring.

Parameters

size : int

Kernel size (must be odd, e.g., 3, 5, 7, 9)

sigma : float

Standard deviation of the Gaussian

Returns

numpy.typing.NDArray[numpy.float64]

Normalized Gaussian kernel with shape (size, size)

Examples

>>> import spectrograms as sg
>>> kernel = sg.gaussian_kernel_2d(5, 1.0)
>>> kernel.shape
(5, 5)
>>> kernel.sum()  # Should be ~1.0
1.0
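
For reference, a normalized Gaussian kernel of this kind can be built in plain NumPy as follows. This is a sketch of the usual construction with a hypothetical function name; the library's exact sampling of the Gaussian is not documented here and may differ.

```python
import numpy as np

def gaussian_kernel_sketch(size, sigma):
    """Normalized size x size Gaussian; size is assumed to be odd."""
    half = size // 2
    x = np.arange(-half, half + 1, dtype=np.float64)
    g = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)            # separable 2D Gaussian
    return kernel / kernel.sum()       # weights sum to 1

kernel = gaussian_kernel_sketch(5, 1.0)
```

Normalizing to a unit sum keeps the mean brightness of an image unchanged when it is blurred.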

Convolution

spectrograms.convolve_fft(image, kernel)

Convolve 2D image with kernel using FFT.

Parameters

image : numpy.typing.NDArray[numpy.float64]

Input image with shape (nrows, ncols)

kernel : numpy.typing.NDArray[numpy.float64]

Convolution kernel (must be smaller than image)

Returns

numpy.typing.NDArray[numpy.float64]

Convolved image (same size as input)

Examples

>>> import spectrograms as sg
>>> import numpy as np
>>> image = np.random.randn(256, 256)
>>> kernel = sg.gaussian_kernel_2d(9, 2.0)
>>> blurred = sg.convolve_fft(image, kernel)
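
The idea behind FFT convolution is that multiplying two spectra is equivalent to convolving in the spatial domain. The NumPy sketch below returns an output the same size as the input. The zero-padded borders and centre crop are assumptions; the library's border handling is not specified here.

```python
import numpy as np

def convolve_fft_sketch(image, kernel):
    """'Same'-size linear convolution via zero-padded FFTs (zero borders assumed)."""
    ir, ic = image.shape
    kr, kc = kernel.shape
    shape = (ir + kr - 1, ic + kc - 1)            # full linear-convolution size
    spec = np.fft.rfft2(image, s=shape) * np.fft.rfft2(kernel, s=shape)
    full = np.fft.irfft2(spec, s=shape)
    r0, c0 = (kr - 1) // 2, (kc - 1) // 2         # crop the centre region
    return full[r0:r0 + ir, c0:c0 + ic]

image = np.zeros((8, 8))
image[3, 4] = 1.0                                  # unit impulse
kernel = np.arange(9, dtype=np.float64).reshape(3, 3)
out = convolve_fft_sketch(image, kernel)           # kernel copy centred on the impulse
```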

Spatial Filtering

spectrograms.lowpass_filter(image, cutoff_fraction)

Apply low-pass filter to suppress high frequencies.

Parameters

image : numpy.typing.NDArray[numpy.float64] or Spectrogram

Input image

cutoff_fraction : float

Cutoff radius as fraction (0.0 to 1.0)

Returns

numpy.typing.NDArray[numpy.float64]

Filtered image

spectrograms.highpass_filter(image, cutoff_fraction)

Apply high-pass filter to suppress low frequencies.

Parameters

image : numpy.typing.NDArray[numpy.float64]

Input image

cutoff_fraction : float

Cutoff radius as fraction (0.0 to 1.0)

Returns

numpy.typing.NDArray[numpy.float64]

Filtered image with edges emphasized

spectrograms.bandpass_filter(image, low_cutoff, high_cutoff)

Apply band-pass filter to keep frequencies in a range.

Parameters

image : numpy.typing.NDArray[numpy.float64] or Spectrogram

Input image

low_cutoff : float

Lower cutoff as fraction (0.0 to 1.0)

high_cutoff : float

Upper cutoff as fraction (0.0 to 1.0)

Returns

numpy.typing.NDArray[numpy.float64]

Filtered image
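
All three filters can be viewed as multiplying the 2D spectrum by a radial mask. The sketch below uses a hard-edged mask and assumes the cutoff fraction is measured relative to the per-axis Nyquist frequency. Both choices, and the function name, are assumptions; the library may define the fraction or the mask shape differently.

```python
import numpy as np

def radial_mask_filter(image, low, high):
    """Keep spatial frequencies whose normalized radius lies in [low, high]."""
    nrows, ncols = image.shape
    fy = np.fft.fftfreq(nrows)[:, None]           # cycles/sample, in [-0.5, 0.5)
    fx = np.fft.rfftfreq(ncols)[None, :]          # cycles/sample, in [0, 0.5]
    radius = np.sqrt(fx ** 2 + fy ** 2) / 0.5     # 1.0 == Nyquist along an axis
    mask = (radius >= low) & (radius <= high)
    return np.fft.irfft2(np.fft.rfft2(image) * mask, s=image.shape)

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32)) + 5.0       # noise on a constant offset
low_passed = radial_mask_filter(image, 0.0, 0.25)      # keeps the offset
high_passed = radial_mask_filter(image, 0.25, np.inf)  # removes the offset
band_passed = radial_mask_filter(image, 0.1, 0.25)
```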

Feature Enhancement

spectrograms.detect_edges_fft(image)

Detect edges using high-pass filtering.

Parameters

image : numpy.typing.NDArray[numpy.float64] or Spectrogram

Input image

Returns

numpy.typing.NDArray[numpy.float64]

Edge-detected image

spectrograms.sharpen_fft(image, amount)

Sharpen image by enhancing high frequencies.

Parameters

image : numpy.typing.NDArray[numpy.float64]

Input image

amount : float

Sharpening strength (typical range: 0.5 to 2.0)

Returns

numpy.typing.NDArray[numpy.float64]

Sharpened image
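
One common way to sharpen in the frequency domain is to add a scaled copy of the high-frequency content back to the image. The sketch below illustrates that formulation. It is not confirmed to be what sharpen_fft does internally, and the cutoff argument and function name are hypothetical.

```python
import numpy as np

def sharpen_sketch(image, amount, cutoff=0.1):
    """image + amount * (high-frequency part); cutoff is a hypothetical parameter."""
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.rfftfreq(image.shape[1])[None, :]
    high = np.sqrt(fx ** 2 + fy ** 2) / 0.5 > cutoff
    gain = 1.0 + amount * high                    # boost only the high band
    return np.fft.irfft2(np.fft.rfft2(image) * gain, s=image.shape)

# A soft vertical edge: sharpening increases the contrast across it.
x = np.linspace(-1.0, 1.0, 64)
image = np.tile(np.tanh(4.0 * x), (64, 1))
sharpened = sharpen_sketch(image, amount=1.0)
```

With amount set to 0.0 the gain is 1 everywhere and the image is returned unchanged.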