Postprocessing

audioclass.postprocess

Module for postprocessing audio classification model outputs.

This module provides functions to convert raw model outputs (class probabilities and features) into various formats suitable for analysis, visualization, or storage. The primary formats include:

  • xarray Datasets: Structured datasets containing features, probabilities, and metadata like time, labels, location, and recording time.
  • Lists of soundevent objects: Collections of PredictedTag and Feature objects, compatible with the soundevent library.

Together, these functions let classification results be passed to downstream analysis tools either as labeled xarray structures or as soundevent objects.

Functions

convert_to_dataset(features, class_probs, labels, hop_size, start_time=0, latitude=None, longitude=None, recorded_on=None, attrs=None)

Convert features and class probabilities to an xarray Dataset.

Parameters:

Name | Type | Description | Default
features | ndarray | A 2D array of features, where each row corresponds to a frame and each column to a feature. | required
class_probs | ndarray | A 2D array of class probabilities, where each row corresponds to a frame and each column to a class. | required
labels | List[str] | A list of labels for the classes. | required
hop_size | float | The time step between frames in seconds. | required
start_time | float | The start time of the first frame in seconds. Defaults to 0. | 0
latitude | Optional[float] | The latitude of the recording location. Defaults to None. | None
longitude | Optional[float] | The longitude of the recording location. Defaults to None. | None
recorded_on | Optional[datetime] | The date and time the recording was made. Defaults to None. | None
attrs | Optional[dict] | Additional attributes to add to the Dataset. Defaults to None. | None

Returns:

Type | Description
Dataset | An xarray Dataset containing the features and probabilities as DataArrays, along with coordinates and attributes.
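
The time coordinate of the result is determined by hop_size and start_time. The following is a minimal sketch of that relationship, assuming frame i starts at start_time + i * hop_size. The helper frame_times is hypothetical and not part of audioclass; it only illustrates the arithmetic.

```python
def frame_times(n_frames, hop_size, start_time=0.0):
    """Return the assumed start time, in seconds, of each frame.

    Assumption: frames are evenly spaced, so frame i starts at
    start_time + i * hop_size.
    """
    return [start_time + i * hop_size for i in range(n_frames)]


# Three frames, half a second apart, starting one second into the recording.
times = frame_times(3, hop_size=0.5, start_time=1.0)
```

Under this assumption, the number of rows in features and class_probs fixes the length of the time coordinate.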

convert_to_features_array(features, hop_size, start_time=0, latitude=None, longitude=None, recorded_on=None, attrs=None)

Convert features to an xarray DataArray.

Parameters:

Name | Type | Description | Default
features | ndarray | A 2D array of features, where each row corresponds to a frame and each column to a feature. | required
hop_size | float | The time step between frames in seconds. | required
start_time | float | The start time of the first frame in seconds. Defaults to 0. | 0
latitude | Optional[float] | The latitude of the recording location. Defaults to None. | None
longitude | Optional[float] | The longitude of the recording location. Defaults to None. | None
recorded_on | Optional[datetime] | The date and time the recording was made. Defaults to None. | None
attrs | Optional[dict] | Additional attributes to add to the DataArray. Defaults to None. | None

Returns:

Type | Description
DataArray | An xarray DataArray with dimensions time and feature, containing the features.

convert_to_features_list(features, prefix)

Convert a feature array to a list of soundevent Feature objects.

Parameters:

Name | Type | Description | Default
features | ndarray | A 2D array of features, where each row corresponds to a frame and each column to a feature. | required
prefix | str | A prefix to add to each feature name. | required

Returns:

Type | Description
List[List[Feature]] | A list of lists of Feature objects, where each inner list corresponds to a frame and contains the features for that frame.
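
The nested return shape can be pictured with a small pure-Python sketch. SketchFeature is a hypothetical stand-in for the soundevent Feature object, and the "<prefix>_<column index>" naming scheme is an assumption made for illustration; the names the library actually produces may differ.

```python
from typing import List, NamedTuple, Sequence


class SketchFeature(NamedTuple):
    """Hypothetical stand-in for a soundevent Feature (name plus value)."""

    name: str
    value: float


def features_to_lists(
    features: Sequence[Sequence[float]], prefix: str
) -> List[List[SketchFeature]]:
    # One inner list per frame (row), one entry per feature (column).
    # Assumption: names are built as "<prefix>_<column index>".
    return [
        [
            SketchFeature(name=f"{prefix}_{j}", value=float(v))
            for j, v in enumerate(row)
        ]
        for row in features
    ]


# Two frames with two features each.
frames = features_to_lists([[0.1, 0.2], [0.3, 0.4]], prefix="emb")
```

Here frames[0] holds the features of the first frame and frames[1] those of the second.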

convert_to_predicted_tags_list(class_probs, tags, confidence_threshold=DEFAULT_THRESHOLD)

Convert class probabilities to a list of predicted tags.

Parameters:

Name | Type | Description | Default
class_probs | ndarray | A 2D array of class probabilities, where each row corresponds to a frame and each column to a class. | required
tags | List[Tag] | A list of Tag objects representing the possible classes. | required
confidence_threshold | float | The minimum probability threshold for a tag to be considered a prediction. Defaults to DEFAULT_THRESHOLD. | DEFAULT_THRESHOLD

Returns:

Type | Description
List[List[PredictedTag]] | A list of lists of PredictedTag objects, where each inner list corresponds to a frame and contains the predicted tags for that frame.

Raises:

Type | Description
ValueError | If the number of output tags does not match the number of columns in class_probs.
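
The thresholding and the shape check described above can be sketched in plain Python. This is an illustration under assumptions, not the library's implementation: threshold_tags is a hypothetical helper, plain strings stand in for Tag objects, (tag, score) tuples stand in for PredictedTag objects, and whether the real comparison is inclusive (>=) or strict (>) is not stated in this reference.

```python
def threshold_tags(class_probs, tags, confidence_threshold):
    """Keep, for each frame, the tags whose probability meets the threshold.

    Assumption: a probability equal to the threshold counts as a prediction.
    """
    n_classes = len(tags)
    output = []
    for row in class_probs:
        # Mirror the documented ValueError: one tag per probability column.
        if len(row) != n_classes:
            raise ValueError(
                f"Expected {n_classes} columns in class_probs, got {len(row)}."
            )
        output.append(
            [(tag, p) for tag, p in zip(tags, row) if p >= confidence_threshold]
        )
    return output


# Two frames, two classes; 0.5 is an arbitrary threshold chosen for the example.
predictions = threshold_tags(
    [[0.9, 0.2], [0.1, 0.4]],
    ["bird", "frog"],
    confidence_threshold=0.5,
)
```

A frame in which no class reaches the threshold yields an empty inner list, so the outer list always has one entry per frame.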

convert_to_probabilities_array(class_probs, labels, hop_size, start_time=0, latitude=None, longitude=None, recorded_on=None, attrs=None)

Convert class probabilities to an xarray DataArray.

Parameters:

Name | Type | Description | Default
class_probs | ndarray | A 2D array of class probabilities, where each row corresponds to a frame and each column to a class. | required
labels | List[str] | A list of labels for the classes. | required
hop_size | float | The time step between frames in seconds. | required
start_time | float | The start time of the first frame in seconds. Defaults to 0. | 0
latitude | Optional[float] | The latitude of the recording location. Defaults to None. | None
longitude | Optional[float] | The longitude of the recording location. Defaults to None. | None
recorded_on | Optional[datetime] | The date and time the recording was made. Defaults to None. | None
attrs | Optional[dict] | Additional attributes to add to the DataArray. Defaults to None. | None

Returns:

Type | Description
DataArray | An xarray DataArray with dimensions time and label, containing the class probabilities.
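
With dimensions time and label, each probability is addressed by a frame and a class label. A rough pure-Python picture of that layout is sketched below; to_long_records is a hypothetical helper, and the frame index is used in place of the actual time coordinate.

```python
def to_long_records(class_probs, labels):
    """Flatten a frames-by-classes grid into (frame, label, probability) rows."""
    return [
        (i, label, prob)
        for i, row in enumerate(class_probs)
        for label, prob in zip(labels, row)
    ]


# Two frames and two classes give four (frame, label, probability) records.
records = to_long_records([[0.9, 0.1], [0.3, 0.7]], ["bird", "frog"])
```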