Audio Preprocessing

`audioclass.preprocess`

Module for preprocessing audio data.

This module loads audio data and converts it into a standardized format for audio classification models.

It provides functions for loading audio, resampling it, and framing it into fixed-length buffers.

Functions

`load_clip(clip, samplerate, buffer_size, audio_dir=None)`

Load an audio clip from a soundevent Clip object.

This function loads the clip from its audio file, preprocesses it, and returns a NumPy array.

Parameters:

Name	Type	Description	Default
`clip`	`Clip`	The soundevent `Clip` object representing the audio segment.	required
`samplerate`	`int`	The desired sample rate to resample the audio to.	required
`buffer_size`	`int`	The length of each audio frame in samples.	required
`audio_dir`	`Optional[Path]`	The directory containing the audio files. If not provided, the clip's default audio directory is used.	`None`

Returns:

Type	Description
`ndarray`	A numpy array of shape (num_frames, buffer_size) containing the preprocessed audio data.

`load_recording(recording, samplerate, buffer_size, audio_dir=None)`

Load an audio recording from a soundevent Recording object.

This function loads the audio file, preprocesses it, and returns a NumPy array.

Parameters:

Name	Type	Description	Default
`recording`	`Recording`	The soundevent `Recording` object representing the audio file.	required
`samplerate`	`int`	The desired sample rate to resample the audio to.	required
`buffer_size`	`int`	The length of each audio frame in samples.	required
`audio_dir`	`Optional[Path]`	The directory containing the audio files. If not provided, the recording's default audio directory is used.	`None`

Returns:

Type	Description
`ndarray`	A numpy array of shape (num_frames, buffer_size) containing the preprocessed audio data.

`preprocess_audio(wave, samplerate, buffer_size)`

Preprocess a loaded audio waveform.

This function performs the following preprocessing steps:

1. Selects the first channel if multiple channels are present.
2. Resamples the audio to the specified sample rate.
3. Frames the audio into fixed-length buffers.

Parameters:

Name	Type	Description	Default
`wave`	`DataArray`	The loaded audio waveform.	required
`samplerate`	`int`	The desired sample rate to resample the audio to.	required
`buffer_size`	`int`	The length of each audio frame in samples.	required

Returns:

Type	Description
`ndarray`	A numpy array of shape (num_frames, buffer_size) containing the preprocessed audio data.
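
The resampling and framing steps together determine the output shape. The sketch below estimates that shape from the input length. It is not part of `audioclass`: the helper name `expected_shape` and the `source_rate` parameter are hypothetical, and it assumes that resampling preserves duration and that a trailing partial frame is zero-padded rather than dropped.

```python
import math


def expected_shape(num_samples: int, source_rate: int, samplerate: int, buffer_size: int):
    """Estimate the (num_frames, buffer_size) shape after resampling and framing.

    Hypothetical helper for illustration; not part of audioclass.
    """
    # Assumption: resampling preserves duration, so the sample count
    # scales with the ratio of target rate to source rate.
    resampled = round(num_samples * samplerate / source_rate)
    # Assumption: a trailing partial frame is zero-padded, not dropped,
    # so the frame count is rounded up.
    num_frames = math.ceil(resampled / buffer_size)
    return num_frames, buffer_size


# Three seconds recorded at 44.1 kHz, resampled to 16 kHz, one-second frames.
shape = expected_shape(3 * 44100, 44100, 16000, 16000)
```

Under these assumptions, a one-second recording at 44.1 kHz resampled to 22.05 kHz with `buffer_size=10000` would occupy three frames, the last of which is mostly padding.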

`resample_audio(wave, samplerate)`

Resample audio to a specific sample rate.

Parameters:

Name	Type	Description	Default
`wave`	`DataArray`	The audio waveform to resample.	required
`samplerate`	`int`	The target sample rate.	required

Returns:

Type	Description
`DataArray`	The resampled audio waveform.
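
This page does not state which resampling method the function uses. To illustrate what resampling to a target rate means, here is a minimal sketch that uses plain linear interpolation on a NumPy array instead of a `DataArray`. The name `resample_linear` and the `source_rate` parameter are hypothetical, and linear interpolation applies no anti-aliasing filter, so treat it as a conceptual stand-in rather than a replacement for `resample_audio`.

```python
import numpy as np


def resample_linear(samples: np.ndarray, source_rate: int, target_rate: int) -> np.ndarray:
    """Resample a 1D signal by linear interpolation (illustrative only)."""
    if source_rate == target_rate:
        return samples.copy()
    # The duration stays the same, so the sample count scales with the rate.
    duration = len(samples) / source_rate
    num_out = int(round(duration * target_rate))
    # Time stamps (in seconds) of the original and the resampled samples.
    t_in = np.arange(len(samples)) / source_rate
    t_out = np.arange(num_out) / target_rate
    return np.interp(t_out, t_in, samples)


# Halve the rate of a short ramp signal: every second sample is kept.
ramp = np.arange(8, dtype=float)
halved = resample_linear(ramp, source_rate=8, target_rate=4)
```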

`stack_array(arr, buffer_size)`

Stack a 1D array into a 2D array of fixed-length buffers.

If the number of elements is not divisible by the buffer size, the function pads the input array with zeros until it is.

Parameters:

Name	Type	Description	Default
`arr`	`ndarray`	The 1D array to stack.	required
`buffer_size`	`int`	The length of each buffer.	required

Returns:

Type	Description
`ndarray`	A 2D array of shape (num_buffers, buffer_size) containing the stacked buffers.
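
The pad-then-reshape behaviour described above can be sketched in a few lines of NumPy. This is an illustration written for this page, not the library's source: the name `stack_sketch` is hypothetical, and it assumes the zeros are appended at the end of the array.

```python
import numpy as np


def stack_sketch(arr: np.ndarray, buffer_size: int) -> np.ndarray:
    """Zero-pad a 1D array to a multiple of buffer_size, then reshape to 2D."""
    # Zeros needed to reach the next multiple of buffer_size
    # (0 when the length is already a multiple).
    pad = (-len(arr)) % buffer_size
    # Assumption: padding goes at the end of the array.
    padded = np.pad(arr, (0, pad))
    return padded.reshape(-1, buffer_size)


# Five samples with buffer_size=2 need one padding zero.
frames = stack_sketch(np.arange(1, 6), buffer_size=2)
# frames is [[1, 2], [3, 4], [5, 0]]
```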
