Convenience Functions
High-level functions for one-shot computation. For batch processing, use the Planner API instead.
Audio Processing Functions
Linear Spectrograms
- spectrograms.compute_linear_power_spectrogram()
Compute a linear power spectrogram.
Parameters
- samples (numpy.typing.NDArray[numpy.float64])
Audio samples as a 1D NumPy array
- params (SpectrogramParams)
Spectrogram parameters
Returns
- Spectrogram
Spectrogram with linear frequency scale and power amplitude scale
- spectrograms.compute_linear_magnitude_spectrogram()
Compute a linear magnitude spectrogram.
Parameters
- samples (numpy.typing.NDArray[numpy.float64])
Audio samples as a 1D NumPy array
- params (SpectrogramParams)
Spectrogram parameters
Returns
- Spectrogram
Spectrogram with linear frequency scale and magnitude amplitude scale
- spectrograms.compute_linear_db_spectrogram()
Compute a linear decibel spectrogram.
Parameters
- samples (numpy.typing.NDArray[numpy.float64])
Audio samples as a 1D NumPy array
- params (SpectrogramParams)
Spectrogram parameters
Returns
- Spectrogram
Spectrogram with linear frequency scale and decibel amplitude scale
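The three linear variants differ only in amplitude scale. The NumPy-only sketch below shows how magnitude, power, and decibel values usually relate for a single windowed frame. It does not call the library; the reference level of 1.0 and the 1e-10 floor are assumptions for illustration, not documented defaults.

```python
import numpy as np

# One Hann-windowed frame of a 440 Hz sine at 16 kHz (illustrative values).
sample_rate = 16_000
n_fft = 512
t = np.arange(n_fft) / sample_rate
frame = np.sin(2 * np.pi * 440.0 * t) * np.hanning(n_fft)

spectrum = np.fft.rfft(frame)     # complex, n_fft // 2 + 1 bins
magnitude = np.abs(spectrum)      # magnitude amplitude scale
power = magnitude ** 2            # power amplitude scale

floor = 1e-10                     # assumed floor to avoid log10(0)
power_db = 10.0 * np.log10(np.maximum(power, floor))  # decibels, reference 1.0
```

Because the decibel mapping is monotonic, the strongest bin is the same on all three scales; only the dynamic range changes.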
Mel Spectrograms
- spectrograms.compute_mel_power_spectrogram()
Compute a mel power spectrogram.
Parameters
- samples (numpy.typing.NDArray[numpy.float64])
Audio samples as a 1D array
- params (SpectrogramParams)
Spectrogram parameters
- filter_params (PyMelParams)
Filterbank parameters
- db (typing.Optional[LogParams], optional)
Optional decibel scaling parameters
Returns
- Spectrogram
Spectrogram with mel frequency scale and power amplitude scale
- spectrograms.compute_mel_magnitude_spectrogram()
Compute a mel magnitude spectrogram.
Parameters
- samples (numpy.typing.NDArray[numpy.float64])
Audio samples as a 1D array
- params (SpectrogramParams)
Spectrogram parameters
- filter_params (PyMelParams)
Filterbank parameters
- db (typing.Optional[LogParams], optional)
Optional decibel scaling parameters
Returns
- Spectrogram
Spectrogram with mel frequency scale and magnitude amplitude scale
- spectrograms.compute_mel_db_spectrogram()
Compute a mel decibel spectrogram.
Parameters
- samples (numpy.typing.NDArray[numpy.float64])
Audio samples as a 1D array
- params (SpectrogramParams)
Spectrogram parameters
- filter_params (PyMelParams)
Filterbank parameters
- db (typing.Optional[LogParams], optional)
Optional decibel scaling parameters
Returns
- Spectrogram
Spectrogram with mel frequency scale and decibel amplitude scale
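For orientation, a mel filterbank places its bands evenly on the mel scale rather than in Hz. The sketch below uses the common HTK-style formula to compute band centre frequencies with NumPy. Which mel variant the library uses is not stated here, so the formula, the helper names, and the values 40 bands / 0 to 8000 Hz are all assumptions.

```python
import numpy as np

def hz_to_mel(freq_hz):
    """HTK-style Hz -> mel conversion (assumed variant)."""
    return 2595.0 * np.log10(1.0 + np.asarray(freq_hz, dtype=np.float64) / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=np.float64) / 2595.0) - 1.0)

def mel_band_centers(n_mels, f_min, f_max):
    """Centre frequencies (Hz) of n_mels bands spaced evenly in mel."""
    # n_mels + 2 points: the two extra points are the outer band edges.
    mels = np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_mels + 2)
    return mel_to_hz(mels)[1:-1]

centers = mel_band_centers(n_mels=40, f_min=0.0, f_max=8000.0)
```

The centres are densely packed at low frequencies and spread out towards `f_max`, which is why a mel spectrogram has finer resolution in the bass.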
ERB Spectrograms
- spectrograms.compute_erb_power_spectrogram()
Compute an ERB/gammatone power spectrogram.
Parameters
- samples (numpy.typing.NDArray[numpy.float64])
Audio samples as a 1D array
- params (SpectrogramParams)
Spectrogram parameters
- filter_params (PyErbParams)
Filterbank parameters
- db (typing.Optional[LogParams], optional)
Optional decibel scaling parameters
Returns
- Spectrogram
Spectrogram with ERB/gammatone frequency scale and power amplitude scale
- spectrograms.compute_erb_magnitude_spectrogram()
Compute an ERB/gammatone magnitude spectrogram.
Parameters
- samples (numpy.typing.NDArray[numpy.float64])
Audio samples as a 1D array
- params (SpectrogramParams)
Spectrogram parameters
- filter_params (PyErbParams)
Filterbank parameters
- db (typing.Optional[LogParams], optional)
Optional decibel scaling parameters
Returns
- Spectrogram
Spectrogram with ERB/gammatone frequency scale and magnitude amplitude scale
- spectrograms.compute_erb_db_spectrogram()
Compute an ERB/gammatone decibel spectrogram.
Parameters
- samples (numpy.typing.NDArray[numpy.float64])
Audio samples as a 1D array
- params (SpectrogramParams)
Spectrogram parameters
- filter_params (PyErbParams)
Filterbank parameters
- db (typing.Optional[LogParams], optional)
Optional decibel scaling parameters
Returns
- Spectrogram
Spectrogram with ERB/gammatone frequency scale and decibel amplitude scale
LogHz Spectrograms
- spectrograms.compute_loghz_power_spectrogram()
Compute a logarithmic Hz power spectrogram.
Parameters
- samples (numpy.typing.NDArray[numpy.float64])
Audio samples as a 1D array
- params (SpectrogramParams)
Spectrogram parameters
- filter_params (PyLogHzParams)
Filterbank parameters
- db (typing.Optional[LogParams], optional)
Optional decibel scaling parameters
Returns
- Spectrogram
Spectrogram with logarithmic Hz frequency scale and power amplitude scale
- spectrograms.compute_loghz_magnitude_spectrogram()
Compute a logarithmic Hz magnitude spectrogram.
Parameters
- samples (numpy.typing.NDArray[numpy.float64])
Audio samples as a 1D array
- params (SpectrogramParams)
Spectrogram parameters
- filter_params (PyLogHzParams)
Filterbank parameters
- db (typing.Optional[LogParams], optional)
Optional decibel scaling parameters
Returns
- Spectrogram
Spectrogram with logarithmic Hz frequency scale and magnitude amplitude scale
- spectrograms.compute_loghz_db_spectrogram()
Compute a logarithmic Hz decibel spectrogram.
Parameters
- samples (numpy.typing.NDArray[numpy.float64])
Audio samples as a 1D array
- params (SpectrogramParams)
Spectrogram parameters
- filter_params (PyLogHzParams)
Filterbank parameters
- db (typing.Optional[LogParams], optional)
Optional decibel scaling parameters
Returns
- Spectrogram
Spectrogram with logarithmic Hz frequency scale and decibel amplitude scale
Audio Features
- spectrograms.compute_cqt()
Compute a Constant-Q Transform power spectrogram.
Parameters
- samples (numpy.typing.NDArray[numpy.float64])
Audio samples as a 1D NumPy array
- params (SpectrogramParams)
Spectrogram parameters
- cqt (CqtParams)
CQT parameters
- db (typing.Optional[LogParams], optional)
Optional decibel scaling parameters
Returns
- Spectrogram
CQT spectrogram with power amplitude scale
- spectrograms.compute_chromagram()
Compute a chromagram (pitch class profile).
Parameters
- samples (numpy.typing.NDArray[numpy.float64])
Audio samples as a 1D NumPy array
- stft_params (StftParams)
STFT parameters
- sample_rate (float)
Sample rate in Hz
- chroma_params (ChromaParams)
Chromagram parameters
Returns
- numpy.ndarray
Chromagram as a 2D NumPy array (12 x n_frames)
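To make the 12 x n_frames output shape concrete, the sketch below folds a power spectrogram into 12 pitch classes with plain NumPy. It is one simple way to build a chromagram, not necessarily what compute_chromagram() does internally; the A4 = 440 Hz tuning, nearest-semitone rounding, and the helper names are assumptions.

```python
import numpy as np

def pitch_class(freq_hz, tuning_hz=440.0):
    """Map a frequency to a pitch class 0..11 (0 = C, 9 = A)."""
    midi = 69.0 + 12.0 * np.log2(freq_hz / tuning_hz)
    return int(np.round(midi)) % 12

def fold_to_chroma(power, freqs, tuning_hz=440.0):
    """Sum the rows of power (n_bins x n_frames) into 12 pitch-class rows."""
    chroma = np.zeros((12, power.shape[1]))
    for bin_index, freq in enumerate(freqs):
        if freq <= 0.0:
            continue  # the DC bin has no pitch
        chroma[pitch_class(freq, tuning_hz)] += power[bin_index]
    return chroma

# Toy input: all energy in the FFT bin closest to 440 Hz, three frames.
n_fft, sample_rate = 2048, 16_000
freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
power = np.zeros((freqs.size, 3))
power[np.argmin(np.abs(freqs - 440.0))] = 1.0

chroma = fold_to_chroma(power, freqs)
```

In this toy case all energy lands in row 9 (pitch class A) of `chroma`.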
- spectrograms.compute_mfcc()
Compute MFCCs (Mel-Frequency Cepstral Coefficients).
Parameters
- samples (numpy.typing.NDArray[numpy.float64])
Audio samples as a 1D NumPy array
- stft_params (StftParams)
STFT parameters
- sample_rate (float)
Sample rate in Hz
- n_mels (int)
Number of mel bands
- mfcc_params (MfccParams)
MFCC parameters
Returns
- numpy.ndarray
MFCCs as a 2D NumPy array (n_mfcc x n_frames)
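MFCCs are conventionally obtained by taking a DCT-II of the log mel energies in each frame and keeping the first few coefficients. The NumPy sketch below shows that final step on synthetic mel energies. The unnormalized DCT, the natural log, the 1e-10 floor, and the helper names are assumptions; the library's own scaling and liftering options are not documented in this section.

```python
import numpy as np

def dct2_matrix(n_out, n_in):
    """First n_out rows of an unnormalized DCT-II basis of length n_in."""
    n = np.arange(n_in)
    k = np.arange(n_out)[:, None]
    return np.cos(np.pi / n_in * (n + 0.5) * k)

def mfcc_sketch(mel_power, n_mfcc=13, floor=1e-10):
    """mel_power: (n_mels, n_frames) -> coefficients (n_mfcc, n_frames)."""
    log_mel = np.log(np.maximum(mel_power, floor))  # floor avoids log(0)
    return dct2_matrix(n_mfcc, mel_power.shape[0]) @ log_mel

rng = np.random.default_rng(0)
mel_power = rng.random((40, 25)) + 0.1   # synthetic: 40 mel bands, 25 frames
mfcc = mfcc_sketch(mel_power)
```

A spectrally flat frame produces energy only in coefficient 0, which is why the higher coefficients are read as describing spectral shape rather than level.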
Low-Level Audio Functions
- spectrograms.compute_stft()
Compute the raw STFT (Short-Time Fourier Transform).
Returns the complex-valued STFT matrix before any frequency mapping or amplitude scaling.
Parameters
- samples
Audio samples as a 1D NumPy array
- params
Spectrogram parameters
Returns
Complex STFT as a 2D NumPy array of complex128 (n_fft/2+1 x n_frames)
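The output shape (n_fft/2+1 x n_frames) follows from taking a real FFT of each windowed frame. A minimal NumPy sketch of that layout is shown below; it is not the library's implementation. The Hann window, the hop of 128, and the absence of centering or padding are assumptions, so the frame count may differ from what compute_stft() returns for the same input.

```python
import numpy as np

def stft_sketch(samples, n_fft=512, hop=128):
    """Return a complex128 array of shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    # Only frames that fit entirely inside the signal (no padding).
    n_frames = 1 + (len(samples) - n_fft) // hop
    out = np.empty((n_fft // 2 + 1, n_frames), dtype=np.complex128)
    for i in range(n_frames):
        frame = samples[i * hop : i * hop + n_fft] * window
        out[:, i] = np.fft.rfft(frame)  # one column per frame
    return out

rng = np.random.default_rng(0)
stft = stft_sketch(rng.standard_normal(4096))
```

Taking `np.abs(stft) ** 2` of such a matrix gives a linear power spectrogram.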
Image Processing Functions
2D FFT Operations
- spectrograms.fft2d(data)
Compute 2D FFT of a real-valued 2D array.
Accepts numpy arrays, Spectrogram objects, or any object implementing __array__().
Parameters
- data (numpy.typing.NDArray[numpy.float64] or Spectrogram)
Input 2D array (e.g., image) with shape (nrows, ncols)
Returns
- numpy.typing.NDArray[numpy.complex64]
Complex 2D array with shape (nrows, ncols/2 + 1) due to Hermitian symmetry
Examples
>>> import spectrograms as sg
>>> import numpy as np
>>> image = np.random.randn(128, 128)
>>> spectrum = sg.fft2d(image)
>>> spectrum.shape
(128, 65)
- spectrograms.ifft2d()
Compute inverse 2D FFT from frequency domain back to spatial domain.
Parameters
- spectrum (numpy.typing.NDArray[numpy.complex64])
Complex frequency array with shape (nrows, ncols/2 + 1)
- output_ncols (int)
Number of columns in the output (must match original image width)
Returns
- numpy.typing.NDArray[numpy.float64]
Real 2D array with shape (nrows, output_ncols)
Examples
>>> import spectrograms as sg
>>> import numpy as np
>>> image = np.random.randn(128, 128)
>>> spectrum = sg.fft2d(image)
>>> reconstructed = sg.ifft2d(spectrum, 128)
>>> np.allclose(image, reconstructed)
True
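The output_ncols argument is needed because the half-spectrum shape is ambiguous: an image with 2k columns and one with 2k + 1 columns both produce k + 1 frequency columns. The sketch below demonstrates this with NumPy's own rfft2/irfft2, which use the same Hermitian half-spectrum layout; it does not call the library.

```python
import numpy as np

rng = np.random.default_rng(0)
even = rng.standard_normal((4, 6))   # 6 columns
odd = rng.standard_normal((4, 7))    # 7 columns

spec_even = np.fft.rfft2(even)       # shape (4, 4)
spec_odd = np.fft.rfft2(odd)         # shape (4, 4) as well

# The original width must be supplied to invert correctly.
back_even = np.fft.irfft2(spec_even, s=(4, 6))
back_odd = np.fft.irfft2(spec_odd, s=(4, 7))
```

Passing the wrong width still returns an array, but it will not match the original image.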
- spectrograms.power_spectrum_2d(data)
Compute 2D power spectrum (squared magnitude).
Accepts numpy arrays, Spectrogram objects, or any object implementing __array__().
Parameters
- data (numpy.typing.NDArray[numpy.float64] or Spectrogram)
Input 2D array with shape (nrows, ncols)
Returns
- numpy.typing.NDArray[numpy.float64]
Power spectrum with shape (nrows, ncols/2 + 1)
Examples
>>> import spectrograms as sg
>>> import numpy as np
>>> image = np.ones((64, 64))
>>> power = sg.power_spectrum_2d(image)
>>> power[0, 0]  # DC component should have all energy
16777216.0
- spectrograms.magnitude_spectrum_2d(data)
Compute 2D magnitude spectrum.
Accepts numpy arrays, Spectrogram objects, or any object implementing __array__().
Parameters
- data (numpy.typing.NDArray[numpy.float64] or Spectrogram)
Input 2D array with shape (nrows, ncols)
Returns
- numpy.typing.NDArray[numpy.float64]
Magnitude spectrum with shape (nrows, ncols/2 + 1)
Frequency Shifting
- spectrograms.fftshift(arr)
Shift zero-frequency component to center.
Accepts numpy arrays, Spectrogram objects, or any object implementing __array__().
Parameters
- arr (numpy.typing.NDArray[numpy.float64] or Spectrogram)
Input 2D array
Returns
- numpy.typing.NDArray[numpy.float64]
Shifted array with DC component at center
- spectrograms.ifftshift(arr)
Inverse of fftshift - shift center back to corners.
Accepts numpy arrays, Spectrogram objects, or any object implementing __array__().
Parameters
- arr (numpy.typing.NDArray[numpy.float64] or Spectrogram)
Input 2D array
Returns
- numpy.typing.NDArray[numpy.float64]
Shifted array with DC component at corners
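The effect of the two shifts can be seen with NumPy's functions of the same name. Whether spectrograms.fftshift follows NumPy's convention in every detail (all axes, shift of n // 2) is an assumption; the sketch below only illustrates the general behaviour, including why the inverse is a separate function for odd-sized arrays.

```python
import numpy as np

# A spectrum whose only non-zero entry is the DC term at index (0, 0).
spectrum = np.zeros((4, 6))
spectrum[0, 0] = 1.0

shifted = np.fft.fftshift(spectrum)             # DC moves to the centre
dc_row, dc_col = np.argwhere(shifted == 1.0)[0]
restored = np.fft.ifftshift(shifted)            # DC back at the corner

# For odd sizes, applying fftshift twice is NOT the identity.
odd = np.arange(25.0).reshape(5, 5)
twice = np.fft.fftshift(np.fft.fftshift(odd))
round_trip = np.fft.ifftshift(np.fft.fftshift(odd))
```

For a 4 x 6 array the DC term ends up at row 2, column 3; for the 5 x 5 array only `round_trip` equals the original.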
Kernels
- spectrograms.gaussian_kernel_2d()
Create 2D Gaussian kernel for blurring.
Parameters
- size (int)
Kernel size (must be odd, e.g., 3, 5, 7, 9)
- sigma (float)
Standard deviation of the Gaussian
Returns
- numpy.typing.NDArray[numpy.float64]
Normalized Gaussian kernel with shape (size, size)
Examples
>>> import spectrograms as sg
>>> kernel = sg.gaussian_kernel_2d(5, 1.0)
>>> kernel.shape
(5, 5)
>>> kernel.sum()  # Should be ~1.0
1.0
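For reference, a normalized Gaussian kernel of this kind can be built in a few lines of NumPy. The sketch below is an illustration of the definition, not the library's code; the function name and the choice to raise ValueError for even sizes are assumptions.

```python
import numpy as np

def gaussian_kernel_sketch(size, sigma):
    """Return a (size, size) Gaussian kernel whose entries sum to 1."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a centre pixel")
    offsets = np.arange(size) - size // 2           # e.g. [-2, -1, 0, 1, 2]
    profile = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(profile, profile)             # separable 2D Gaussian
    return kernel / kernel.sum()                    # normalize to unit sum

kernel = gaussian_kernel_sketch(5, 1.0)
```

Normalizing to unit sum means blurring with the kernel preserves the mean brightness of the image.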
Convolution
- spectrograms.convolve_fft(image, kernel)
Convolve 2D image with kernel using FFT.
Parameters
- image (numpy.typing.NDArray[numpy.float64])
Input image with shape (nrows, ncols)
- kernel (numpy.typing.NDArray[numpy.float64])
Convolution kernel (must be smaller than image)
Returns
- numpy.typing.NDArray[numpy.float64]
Convolved image (same size as input)
Examples
>>> import spectrograms as sg
>>> import numpy as np
>>> image = np.random.randn(256, 256)
>>> kernel = sg.gaussian_kernel_2d(9, 2.0)
>>> blurred = sg.convolve_fft(image, kernel)
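FFT convolution multiplies the spectra of the image and the kernel instead of sliding the kernel over the image. A NumPy sketch of a same-size variant follows. Zero-padding to the full linear-convolution size and cropping the centre is an assumption; this section does not say how convolve_fft() treats image borders.

```python
import numpy as np

def convolve_fft_sketch(image, kernel):
    """Linear 2D convolution via FFT, cropped to the image's shape."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    full_shape = (ih + kh - 1, iw + kw - 1)   # avoids circular wrap-around
    product = np.fft.rfft2(image, s=full_shape) * np.fft.rfft2(kernel, s=full_shape)
    full = np.fft.irfft2(product, s=full_shape)
    top, left = (kh - 1) // 2, (kw - 1) // 2  # centre crop
    return full[top : top + ih, left : left + iw]

rng = np.random.default_rng(0)
image = rng.standard_normal((16, 20))

identity = np.zeros((3, 3))
identity[1, 1] = 1.0                          # delta kernel: output equals input
unchanged = convolve_fft_sketch(image, identity)

box = np.full((5, 5), 1.0 / 25.0)             # normalized box blur
blurred_const = convolve_fft_sketch(np.ones((16, 20)), box)
```

With zero padding, pixels near the border are darkened because part of the kernel falls outside the image; interior pixels are unaffected.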
Spatial Filtering
- spectrograms.lowpass_filter(image, cutoff_fraction)
Apply low-pass filter to suppress high frequencies.
Parameters
- image (numpy.typing.NDArray[numpy.float64] or Spectrogram)
Input image
- cutoff_fraction (float)
Cutoff radius as fraction (0.0 to 1.0)
Returns
- numpy.typing.NDArray[numpy.float64]
Filtered image
- spectrograms.highpass_filter(image, cutoff_fraction)
Apply high-pass filter to suppress low frequencies.
Parameters
- image (numpy.typing.NDArray[numpy.float64])
Input image
- cutoff_fraction (float)
Cutoff radius as fraction (0.0 to 1.0)
Returns
- numpy.typing.NDArray[numpy.float64]
Filtered image with edges emphasized
- spectrograms.bandpass_filter(image, low_cutoff, high_cutoff)
Apply band-pass filter to keep frequencies in a range.
Parameters
- image (numpy.typing.NDArray[numpy.float64] or Spectrogram)
Input image
- low_cutoff (float)
Lower cutoff as fraction (0.0 to 1.0)
- high_cutoff (float)
Upper cutoff as fraction (0.0 to 1.0)
Returns
- numpy.typing.NDArray[numpy.float64]
Filtered image
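The three filters can be pictured as radial masks applied in the frequency domain. The NumPy sketch below uses ideal (brick-wall) masks and treats a cutoff of 1.0 as the Nyquist frequency along one axis. Both choices, and the helper names, are assumptions; this section does not specify the mask shape or the exact normalization used by the library.

```python
import numpy as np

def _radial_frequency(shape):
    """Radial frequency per FFT bin, scaled so 1.0 = Nyquist along an axis."""
    nrows, ncols = shape
    fy = np.fft.fftfreq(nrows)[:, None]   # cycles/sample in [-0.5, 0.5)
    fx = np.fft.fftfreq(ncols)[None, :]
    return np.hypot(fx, fy) / 0.5

def _apply_mask(image, mask):
    # The masks are symmetric in frequency, so the result is real up to rounding.
    return np.real(np.fft.ifft2(np.fft.fft2(image) * mask))

def lowpass_sketch(image, cutoff_fraction):
    return _apply_mask(image, _radial_frequency(image.shape) <= cutoff_fraction)

def highpass_sketch(image, cutoff_fraction):
    return _apply_mask(image, _radial_frequency(image.shape) > cutoff_fraction)

def bandpass_sketch(image, low_cutoff, high_cutoff):
    radius = _radial_frequency(image.shape)
    return _apply_mask(image, (radius > low_cutoff) & (radius <= high_cutoff))

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 48))
low = lowpass_sketch(image, 0.25)
high = highpass_sketch(image, 0.25)
band = bandpass_sketch(image, 0.1, 0.5)
```

With complementary masks, `low + high` reconstructs the image, and a band-pass equals the difference of two low-pass results. Brick-wall masks cause ringing near sharp edges, so a production filter may use a smoother roll-off.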