Evaluation Module#

Additional dependencies

To use the soundevent.evaluation module you need to install some additional dependencies. Make sure you have them installed by running the following command:

pip install "soundevent[evaluation]"

soundevent.evaluation #

Evaluation functions.

Modules:

Name | Description
affinity | Measures of affinity between sound event geometries.
clip_classification |
clip_multilabel_classification |
encoding | Tag Encoder Module.
match | Algorithms for matching geometries.
metrics |
sound_event_classification | Sound event classification evaluation.
sound_event_detection | Sound event detection evaluation.
tasks |

Functions:

Name | Description
classification_encoding | Encode a list of tags into an integer value.
compute_affinity | Compute the geometric affinity between two geometries.
create_tag_encoder | Create an encoder object from a list of tags.
match_geometries | Match geometries.
multilabel_encoding | Encode a list of tags into a binary multilabel array.
prediction_encoding | Encode a list of predicted tags into a floating-point array of scores.

Functions#

classification_encoding(tags, encoder) #

Encode a list of tags into an integer value.

This function is commonly used for mapping a list of tags to a compact integer representation, typically representing classes associated with objects like clips or sound events.

Parameters:

Name | Type | Description | Default
tags | Sequence[Tag] | A list of tags to be encoded. | required
encoder | Callable[[Tag], Optional[int]] | A callable object that takes a data.Tag object as input and returns an optional integer encoding. If the encoder returns None for a tag, it will be skipped. | required

Returns:

Name | Type | Description
encoded | Optional[int] | The encoded integer value representing the tags, or None if no encoding is available.

Examples:

Consider the following set of tags:

>>> dog = data.Tag(key="animal", value="dog")
>>> cat = data.Tag(key="animal", value="cat")
>>> brown = data.Tag(key="color", value="brown")
>>> blue = data.Tag(key="color", value="blue")

If we only want to encode the 'dog' and 'brown' classes, the following examples show how different tag lists are encoded:

>>> encoder = create_tag_encoder([dog, brown])
>>> classification_encoding([brown], encoder)
1
>>> classification_encoding([dog, blue], encoder)
0
>>> classification_encoding([dog, brown], encoder)
0
>>> classification_encoding([cat], encoder) is None
True
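
The behaviour shown in these examples can be reproduced with a small dependency-free sketch. It assumes that the first tag with a non-None encoding wins, which is consistent with the examples above but not stated explicitly. The `Tag` dataclass, the dictionary-based encoder and the function name `classification_encoding_sketch` are hypothetical stand-ins, not part of soundevent.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Tag:
    """Hypothetical stand-in for data.Tag."""

    key: str
    value: str


def classification_encoding_sketch(
    tags: Sequence[Tag],
    encoder: Callable[[Tag], Optional[int]],
) -> Optional[int]:
    """Return the encoding of the first tag the encoder recognises."""
    for tag in tags:
        encoded = encoder(tag)
        if encoded is not None:
            # Assumption: the first encodable tag decides the class.
            return encoded
    # No tag could be encoded.
    return None


dog = Tag("animal", "dog")
cat = Tag("animal", "cat")
brown = Tag("color", "brown")
blue = Tag("color", "blue")

# A dict lookup serves as a minimal encoder: dog -> 0, brown -> 1.
encoder = {dog: 0, brown: 1}.get

print(classification_encoding_sketch([brown], encoder))      # 1
print(classification_encoding_sketch([dog, blue], encoder))  # 0
print(classification_encoding_sketch([cat], encoder))        # None
```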

compute_affinity(geometry1, geometry2, time_buffer=0.01, freq_buffer=100) #

Compute the geometric affinity between two geometries.

This function calculates the geometric similarity between two input geometries in the context of time-frequency space. The geometric affinity metric indicates how similar the two geometries are, with a value ranging from 0 (no similarity) to 1 (perfect similarity).

Parameters:

Name | Type | Description | Default
geometry1 | Geometry | The first geometry to be compared. | required
geometry2 | Geometry | The second geometry to be compared. | required
time_buffer | float | Time buffer for geometric preparation. Default is 0.01. | 0.01
freq_buffer | float | Frequency buffer for geometric preparation. Default is 100. | 100

Returns:

Name | Type | Description
affinity | float | A metric indicating the geometric similarity between the input geometries.

  • 0: The geometries have no overlap.
  • 1: The geometries overlap perfectly.

The value is the ratio of the intersection area to the union area of the two geometries (intersection over union).

Notes
  • 0 or 1-dimensional geometries are buffered to 2-dimensional using the specified time and frequency buffers.
  • If either input geometry is of a time-based type, a specialized time-based affinity calculation is performed.
  • The function utilizes the Shapely library for geometric operations.

Examples:

>>> geometry1 = data.Geometry(...)  # Define the first geometry
>>> geometry2 = data.Geometry(...)  # Define the second geometry
>>> affinity = compute_affinity(
...     geometry1,
...     geometry2,
...     time_buffer=0.02,
...     freq_buffer=150,
... )
>>> affinity
0.75
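
To make the intersection-over-union ratio concrete, here is a minimal sketch for the special case of two axis-aligned boxes in time-frequency space. It is not the library's implementation, which relies on Shapely and handles other geometry types and buffering. The `Box` alias, its coordinate order and the `box_iou` function are assumptions made for illustration.

```python
from typing import Tuple

# Assumed coordinate order: (start_time, low_freq, end_time, high_freq).
Box = Tuple[float, float, float, float]


def box_iou(box1: Box, box2: Box) -> float:
    """Intersection area divided by union area for two boxes."""
    # Edges of the overlapping region, if there is one.
    start = max(box1[0], box2[0])
    low = max(box1[1], box2[1])
    end = min(box1[2], box2[2])
    high = min(box1[3], box2[3])

    # A negative width or height means the boxes do not overlap.
    intersection = max(0.0, end - start) * max(0.0, high - low)

    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - intersection

    # Degenerate (zero-area) boxes would need buffering first; return 0 here.
    return intersection / union if union > 0 else 0.0


# Two one-second boxes spanning 1-2 kHz, shifted by half a second.
a = (0.0, 1000.0, 1.0, 2000.0)
b = (0.5, 1000.0, 1.5, 2000.0)
print(round(box_iou(a, b), 3))  # 0.333
```

The two boxes share half of their area, so the intersection is 500 and the union is 1500, giving one third.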

create_tag_encoder(tags) #

Create an encoder object from a list of tags.

Parameters:

Name | Type | Description | Default
tags | Sequence[Tag] | A list of tags to be encoded. | required

Returns:

Type | Description
SimpleEncoder | An instance of SimpleEncoder initialized with the provided tags.
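
This page does not show the interface of SimpleEncoder, so the following is only a rough sketch of what a list-based encoder could look like. It assumes that each tag is mapped to its position in the input list, as the examples elsewhere on this page suggest. The `Tag` dataclass and the `ListEncoder` class are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class Tag:
    """Hypothetical stand-in for data.Tag."""

    key: str
    value: str


class ListEncoder:
    """Hypothetical encoder mapping each known tag to its list position."""

    def __init__(self, tags: Sequence[Tag]) -> None:
        self._index: Dict[Tag, int] = {}
        for tag in tags:
            # A tag listed twice keeps the index of its first occurrence.
            self._index.setdefault(tag, len(self._index))

    @property
    def num_classes(self) -> int:
        """Number of distinct classes known to the encoder."""
        return len(self._index)

    def __call__(self, tag: Tag) -> Optional[int]:
        """Return the class index of a tag, or None if it is unknown."""
        return self._index.get(tag)


dog = Tag("animal", "dog")
cat = Tag("animal", "cat")
brown = Tag("color", "brown")

encoder = ListEncoder([dog, brown])
print(encoder(dog), encoder(brown), encoder(cat))  # 0 1 None
print(encoder.num_classes)  # 2
```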

match_geometries(source, target, time_buffer=0.01, freq_buffer=100) #

Match geometries.

This function matches geometries from a source and target sequence. The geometries are matched based on their affinity, which is computed using the compute_affinity function. The final matches are then selected to maximize the total affinity between the source and target geometries.

Parameters:

Name | Type | Description | Default
source | Sequence[Geometry] | Source geometries. | required
target | Sequence[Geometry] | Target geometries. | required

Returns:

Type | Description
Sequence[Tuple[Optional[int], Optional[int], float]] | A sequence of matches. Each match is a tuple of the source index, target index and affinity. If a source geometry is not matched to any target geometry, the target index is None. If a target geometry is not matched to any source geometry, the source index is None. Every source and target geometry appears in exactly one match.
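
The idea of selecting matches that maximize total affinity can be sketched with a brute-force search over a precomputed affinity matrix. This is not the library's algorithm: it is a toy version that only works for a handful of geometries, and a practical implementation would more likely use an assignment solver such as scipy.optimize.linear_sum_assignment. The function name `match_by_affinity`, the treatment of zero-affinity pairs as unmatched and the affinity of 0.0 reported for unmatched geometries are all assumptions.

```python
from itertools import permutations
from typing import List, Optional, Sequence, Tuple

Match = Tuple[Optional[int], Optional[int], float]


def match_by_affinity(affinity: Sequence[Sequence[float]]) -> List[Match]:
    """Match sources to targets, maximizing the summed affinity.

    affinity[i][j] is the affinity between source i and target j.
    """
    n_source = len(affinity)
    n_target = len(affinity[0]) if affinity else 0
    # Pad to a square problem so that any geometry may stay unmatched.
    size = max(n_source, n_target)

    def score(i: int, j: int) -> float:
        # Pairs involving a padding row or column contribute nothing.
        return affinity[i][j] if i < n_source and j < n_target else 0.0

    # Try every one-to-one assignment and keep the best one.
    best = max(
        permutations(range(size)),
        key=lambda perm: sum(score(i, j) for i, j in enumerate(perm)),
    )

    matches: List[Match] = []
    matched_targets = set()
    for i in range(n_source):
        j = best[i]
        value = score(i, j)
        if value > 0:
            matches.append((i, j, value))
            matched_targets.add(j)
        else:
            # Assumption: zero affinity counts as "no match".
            matches.append((i, None, 0.0))
    for j in range(n_target):
        if j not in matched_targets:
            matches.append((None, j, 0.0))
    return matches


# Two sources, three targets. A greedy choice would pair source 0 with
# target 0 (0.9) and leave source 1 unmatched, for a total of only 0.9.
affinity = [
    [0.9, 0.8, 0.0],
    [0.85, 0.0, 0.0],
]
print(match_by_affinity(affinity))
# [(0, 1, 0.8), (1, 0, 0.85), (None, 2, 0.0)]
```

Each source and target index appears in exactly one tuple of the result, mirroring the return format described above.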

multilabel_encoding(tags, encoder) #

Encode a list of tags into a binary multilabel array.

Parameters:

Name | Type | Description | Default
tags | Sequence[Tag] | A list of tags to be encoded. | required
encoder | Encoder | A callable object that takes a data.Tag object as input and returns an optional integer encoding. If the encoder returns None for a tag, it will be skipped. | required

Returns:

Name | Type | Description
encoded | ndarray | A binary numpy array of shape (num_classes,) representing the multilabel encoding for the input tags. Each index with a corresponding tag is set to 1, and the rest are 0.

Examples:

Consider the following set of tags:

>>> dog = data.Tag(key="animal", value="dog")
>>> cat = data.Tag(key="animal", value="cat")
>>> brown = data.Tag(key="color", value="brown")
>>> blue = data.Tag(key="color", value="blue")

And we are only interested in encoding the following two classes:

>>> encoder = create_tag_encoder([dog, brown])

Then the following examples show how the multilabel encoding works:

>>> multilabel_encoding([brown], encoder)
array([0, 1])
>>> multilabel_encoding([dog, blue], encoder)
array([1, 0])
>>> multilabel_encoding([dog, brown], encoder)
array([1, 1])
>>> multilabel_encoding([cat], encoder)
array([0, 0])
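
The multilabel encoding can be sketched without NumPy as follows. To stay dependency-free, this version returns a plain list rather than an array and takes the number of classes as an explicit argument. The `Tag` dataclass, the dictionary-based encoder and the `multilabel_encoding_sketch` function are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class Tag:
    """Hypothetical stand-in for data.Tag."""

    key: str
    value: str


def multilabel_encoding_sketch(
    tags: Sequence[Tag],
    encoder: Callable[[Tag], Optional[int]],
    num_classes: int,
) -> List[int]:
    """Return a 0/1 list with a 1 at the index of every encodable tag."""
    encoded = [0] * num_classes
    for tag in tags:
        index = encoder(tag)
        if index is not None:
            encoded[index] = 1
    return encoded


dog = Tag("animal", "dog")
cat = Tag("animal", "cat")
brown = Tag("color", "brown")
blue = Tag("color", "blue")

# A dict lookup serves as a minimal encoder: dog -> 0, brown -> 1.
encoder = {dog: 0, brown: 1}.get

print(multilabel_encoding_sketch([brown], encoder, 2))       # [0, 1]
print(multilabel_encoding_sketch([dog, blue], encoder, 2))   # [1, 0]
print(multilabel_encoding_sketch([dog, brown], encoder, 2))  # [1, 1]
print(multilabel_encoding_sketch([cat], encoder, 2))         # [0, 0]
```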

prediction_encoding(tags, encoder) #

Encode a list of predicted tags into a floating-point array of scores.

Parameters:

Name | Type | Description | Default
tags | Sequence[PredictedTag] | A list of predicted tags to be encoded. | required
encoder | Encoder | A callable object that takes a data.Tag object as input and returns an optional integer encoding. If the encoder returns None for a tag, it will be skipped. | required

Returns:

Name | Type | Description
encoded | ndarray | A numpy array of floats of shape (num_classes,) holding the predicted score for each class. Classes without a matching predicted tag are left at 0.

Examples:

Consider the following set of tags:

>>> dog = data.Tag(key="animal", value="dog")
>>> cat = data.Tag(key="animal", value="cat")
>>> brown = data.Tag(key="color", value="brown")
>>> blue = data.Tag(key="color", value="blue")

And we are only interested in encoding the following two classes:

>>> encoder = create_tag_encoder([dog, brown])

Then the following examples show how the encoding works for predicted tags:

>>> prediction_encoding(
...     [data.PredictedTag(tag=brown, score=0.5)], encoder
... )
array([0, 0.5])
>>> prediction_encoding(
...     [
...         data.PredictedTag(tag=dog, score=0.2),
...         data.PredictedTag(tag=blue, score=0.9),
...     ],
...     encoder,
... )
array([0.2, 0])
>>> prediction_encoding(
...     [
...         data.PredictedTag(tag=dog, score=0.2),
...         data.PredictedTag(tag=brown, score=0.5),
...     ],
...     encoder,
... )
array([0.2, 0.5])
>>> prediction_encoding(
...     [
...         data.PredictedTag(tag=cat, score=0.7),
...     ],
...     encoder,
... )
array([0, 0])
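
A dependency-free sketch of the same idea is given below. It returns a plain list of floats rather than a NumPy array and assumes that, if two predicted tags map to the same class, the later score overwrites the earlier one. The `Tag` and `PredictedTag` dataclasses, the dictionary-based encoder and the `prediction_encoding_sketch` function are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class Tag:
    """Hypothetical stand-in for data.Tag."""

    key: str
    value: str


@dataclass(frozen=True)
class PredictedTag:
    """Hypothetical stand-in for data.PredictedTag."""

    tag: Tag
    score: float


def prediction_encoding_sketch(
    tags: Sequence[PredictedTag],
    encoder: Callable[[Tag], Optional[int]],
    num_classes: int,
) -> List[float]:
    """Return one score per class, 0.0 where no prediction applies."""
    encoded = [0.0] * num_classes
    for predicted in tags:
        index = encoder(predicted.tag)
        if index is not None:
            # Assumption: a later score for the same class overwrites.
            encoded[index] = predicted.score
    return encoded


dog = Tag("animal", "dog")
cat = Tag("animal", "cat")
brown = Tag("color", "brown")
blue = Tag("color", "blue")

# A dict lookup serves as a minimal encoder: dog -> 0, brown -> 1.
encoder = {dog: 0, brown: 1}.get

predictions = [PredictedTag(dog, 0.2), PredictedTag(blue, 0.9)]
print(prediction_encoding_sketch(predictions, encoder, 2))  # [0.2, 0.0]
```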