Evaluation Module
Additional dependencies
To use the soundevent.evaluation module you need to install some additional dependencies. Make sure you have them installed by running the following command:
soundevent.evaluation
Evaluation functions.
Modules:
Name | Description |
---|---|
affinity | Measures of affinity between sound event geometries. |
clip_classification | |
clip_multilabel_classification | |
encoding | Tag Encoder Module. |
match | Algorithms for matching geometries. |
metrics | |
sound_event_classification | Sound event classification evaluation. |
sound_event_detection | Sound event detection evaluation. |
tasks | |
Functions:
Name | Description |
---|---|
classification_encoding | Encode a list of tags into an integer value. |
compute_affinity | Compute the geometric affinity between two geometries. |
create_tag_encoder | Create an encoder object from a list of tags. |
match_geometries | Match geometries. |
multilabel_encoding | Encode a list of tags into a binary multilabel array. |
prediction_encoding | Encode a list of predicted tags into a floating-point array of scores. |
Functions#
classification_encoding(tags, encoder)
Encode a list of tags into an integer value.
This function is commonly used for mapping a list of tags to a compact integer representation, typically representing classes associated with objects like clips or sound events.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tags | Sequence[Tag] | A list of tags to be encoded. | required |
encoder | Callable[[Tag], Optional[int]] | A callable object that takes a data.Tag object as input and returns an optional integer encoding. If the encoder returns None for a tag, it will be skipped. | required |
Returns:
Name | Type | Description |
---|---|---|
encoded | Optional[int] | The encoded integer value representing the tags, or None if no encoding is available. |
Examples:
Consider the following set of tags:
>>> dog = data.Tag(key="animal", value="dog")
>>> cat = data.Tag(key="animal", value="cat")
>>> brown = data.Tag(key="color", value="brown")
>>> blue = data.Tag(key="color", value="blue")
If we are interested in encoding only the 'dog' and 'brown' classes, the following examples demonstrate how the encoding works for a list of tags:
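As a minimal, library-free sketch of this behaviour: the `Tag` stand-in, the `make_encoder` helper and the `classification_encoding_sketch` function below are hypothetical names introduced for illustration, not soundevent's own code. The sketch also assumes that the first tag with a non-None encoding determines the result.

```python
from typing import Callable, NamedTuple, Optional, Sequence


class Tag(NamedTuple):
    """Hypothetical stand-in for data.Tag (a hashable key/value pair)."""

    key: str
    value: str


def make_encoder(classes: Sequence[Tag]) -> Callable[[Tag], Optional[int]]:
    """Map each class tag to its position; unknown tags map to None."""
    index = {tag: i for i, tag in enumerate(classes)}
    return index.get


def classification_encoding_sketch(
    tags: Sequence[Tag],
    encoder: Callable[[Tag], Optional[int]],
) -> Optional[int]:
    """Return the first available encoding, skipping tags encoded as None.

    Assumption: the first encodable tag wins.
    """
    for tag in tags:
        encoded = encoder(tag)
        if encoded is not None:
            return encoded
    return None


dog = Tag("animal", "dog")
cat = Tag("animal", "cat")
brown = Tag("color", "brown")
blue = Tag("color", "blue")

encoder = make_encoder([dog, brown])  # dog -> 0, brown -> 1

print(classification_encoding_sketch([brown], encoder))      # 1
print(classification_encoding_sketch([dog, blue], encoder))  # 0 (blue is skipped)
print(classification_encoding_sketch([cat], encoder))        # None
```

Tags outside the chosen classes never contribute, so a list containing only such tags yields None.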
compute_affinity(geometry1, geometry2, time_buffer=0.01, freq_buffer=100)
Compute the geometric affinity between two geometries.
This function calculates the geometric similarity between two input geometries in the context of time-frequency space. The geometric affinity metric indicates how similar the two geometries are, with a value ranging from 0 (no similarity) to 1 (perfect similarity).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
geometry1 | Geometry | The first geometry to be compared. | required |
geometry2 | Geometry | The second geometry to be compared. | required |
time_buffer | float | Time buffer for geometric preparation. Default is 0.01. | 0.01 |
freq_buffer | float | Frequency buffer for geometric preparation. Default is 100. | 100 |
Returns:
Name | Type | Description |
---|---|---|
affinity | float | A metric indicating the geometric similarity between the input geometries. The value is the ratio of the intersection area to the union area of the two geometries. |
Notes
- 0 or 1-dimensional geometries are buffered to 2-dimensional using the specified time and frequency buffers.
- If either input geometry is of a time-based type, a specialized time-based affinity calculation is performed.
- The function utilizes the Shapely library for geometric operations.
Examples:
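The intersection-over-union ratio described above can be illustrated for the simplest case of two axis-aligned time-frequency boxes. This is a plain-Python sketch, not the library's Shapely-based implementation: the `Box` tuple layout and the `box_affinity` function are assumptions made for the example, and no buffering is applied.

```python
from typing import Tuple

# Hypothetical plain-tuple box: (start_time, low_freq, end_time, high_freq)
Box = Tuple[float, float, float, float]


def box_affinity(box1: Box, box2: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    # Corners of the overlapping region (empty if the boxes are disjoint).
    t0 = max(box1[0], box2[0])
    f0 = max(box1[1], box2[1])
    t1 = min(box1[2], box2[2])
    f1 = min(box1[3], box2[3])
    intersection = max(0.0, t1 - t0) * max(0.0, f1 - f0)

    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - intersection
    return intersection / union if union > 0 else 0.0


a = (0.0, 1000.0, 2.0, 3000.0)
b = (1.0, 1000.0, 3.0, 3000.0)
c = (5.0, 1000.0, 6.0, 3000.0)

print(box_affinity(a, a))  # 1.0, identical boxes
print(box_affinity(a, b))  # one third: 1 s overlap out of a 3 s union, same band
print(box_affinity(a, c))  # 0.0, no overlap in time
```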
create_tag_encoder(tags)
Create an encoder object from a list of tags.
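To make the idea of an encoder object concrete, here is a rough stand-in. `SimpleTagEncoder`, its `num_classes` attribute and the `Tag` tuple are hypothetical and only mirror the behaviour described on this page (a callable returning an optional integer per tag); the object actually returned by create_tag_encoder may differ.

```python
from typing import Dict, NamedTuple, Optional, Sequence


class Tag(NamedTuple):
    """Hypothetical stand-in for data.Tag."""

    key: str
    value: str


class SimpleTagEncoder:
    """Hypothetical encoder: maps each known tag to its list position."""

    def __init__(self, tags: Sequence[Tag]) -> None:
        self._index: Dict[Tag, int] = {}
        for tag in tags:
            # A repeated tag keeps the position of its first occurrence.
            self._index.setdefault(tag, len(self._index))
        self.num_classes = len(self._index)

    def __call__(self, tag: Tag) -> Optional[int]:
        # Tags that were not registered have no encoding.
        return self._index.get(tag)


dog = Tag("animal", "dog")
brown = Tag("color", "brown")
blue = Tag("color", "blue")

encoder = SimpleTagEncoder([dog, brown])
print(encoder(dog), encoder(brown), encoder(blue), encoder.num_classes)
# 0 1 None 2
```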
match_geometries(source, target, time_buffer=0.01, freq_buffer=100)
Match geometries.
This function matches geometries from a source and target sequence. The geometries are matched based on their affinity, which is computed using the compute_affinity function. The final matches are then selected to maximize the total affinity between the source and target geometries.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Sequence[Geometry] | Source geometries. | required |
target | Sequence[Geometry] | Target geometries. | required |
Returns:
Type | Description |
---|---|
Sequence[Tuple[Optional[int], Optional[int], float]] | A sequence of matches. Each match is a tuple of the source index, target index and affinity. If a source geometry is not matched to any target geometry, the target index is None. If a target geometry is not matched to any source geometry, the source index is None. Every source and target geometry appears in exactly one match. |
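The shape of the returned matches can be illustrated with a small exhaustive search over 1-D time intervals. This is only a sketch under stated assumptions: `Interval`, `interval_affinity` and `match_intervals` are hypothetical names, the brute-force search over permutations is a stand-in suitable for a handful of geometries, and it is not the optimisation routine the library uses.

```python
from itertools import permutations
from typing import List, Optional, Sequence, Tuple

Interval = Tuple[float, float]  # hypothetical 1-D geometry: (start, end)
Match = Tuple[Optional[int], Optional[int], float]


def interval_affinity(a: Interval, b: Interval) -> float:
    """Intersection length over union length of two time intervals."""
    intersection = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


def match_intervals(
    source: Sequence[Interval], target: Sequence[Interval]
) -> List[Match]:
    """Exhaustively pick the pairing with the largest total affinity."""
    size = max(len(source), len(target))

    def pair_affinity(i: int, j: int) -> float:
        # Indices beyond the real sequences are padding and never overlap.
        if i < len(source) and j < len(target):
            return interval_affinity(source[i], target[j])
        return 0.0

    best = max(
        permutations(range(size)),
        key=lambda perm: sum(pair_affinity(i, j) for i, j in enumerate(perm)),
    )

    matches: List[Match] = []
    for i, j in enumerate(best):
        affinity = pair_affinity(i, j)
        if affinity > 0:
            matches.append((i, j, affinity))
            continue
        # Zero affinity: report each real geometry as unmatched.
        if i < len(source):
            matches.append((i, None, 0.0))
        if j < len(target):
            matches.append((None, j, 0.0))
    return matches


source = [(0.0, 1.0), (2.0, 3.0)]
target = [(2.1, 3.1), (0.0, 0.5), (10.0, 11.0)]
for match in match_intervals(source, target):
    print(match)
# source 0 pairs with target 1, source 1 with target 0,
# and target 2 is reported with a source index of None.
```

Every source and target index shows up exactly once in the output, either paired or alongside None.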
multilabel_encoding(tags, encoder)
Encode a list of tags into a binary multilabel array.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tags | Sequence[Tag] | A list of tags to be encoded. | required |
encoder | Encoder | A callable object that takes a data.Tag object as input and returns an optional integer encoding. If the encoder returns None for a tag, it will be skipped. | required |
Returns:
Name | Type | Description |
---|---|---|
encoded | ndarray | A binary numpy array of shape (num_classes,) representing the multilabel encoding for the input tags. Each index with a corresponding tag is set to 1, and the rest are 0. |
Examples:
Consider the following set of tags:
>>> dog = data.Tag(key="animal", value="dog")
>>> cat = data.Tag(key="animal", value="cat")
>>> brown = data.Tag(key="color", value="brown")
>>> blue = data.Tag(key="color", value="blue")
And we are only interested in encoding the following two classes:
>>> encoder = create_tag_encoder([dog, brown])
Then the following examples show how the multilabel encoding works:
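A library-free sketch of the multilabel behaviour is given here under explicit assumptions: the `Tag` stand-in and `multilabel_encoding_sketch` are hypothetical, the number of classes is passed explicitly, and a plain Python list stands in for the numpy array returned by the real function.

```python
from typing import Callable, List, NamedTuple, Optional, Sequence


class Tag(NamedTuple):
    """Hypothetical stand-in for data.Tag."""

    key: str
    value: str


def multilabel_encoding_sketch(
    tags: Sequence[Tag],
    encoder: Callable[[Tag], Optional[int]],
    num_classes: int,
) -> List[int]:
    """Set position i to 1 for every tag the encoder maps to i."""
    encoded = [0] * num_classes
    for tag in tags:
        index = encoder(tag)
        if index is not None:  # tags without an encoding are skipped
            encoded[index] = 1
    return encoded


dog = Tag("animal", "dog")
cat = Tag("animal", "cat")
brown = Tag("color", "brown")
blue = Tag("color", "blue")

classes = [dog, brown]  # dog -> 0, brown -> 1
encoder = {tag: i for i, tag in enumerate(classes)}.get

print(multilabel_encoding_sketch([brown], encoder, len(classes)))       # [0, 1]
print(multilabel_encoding_sketch([dog, blue], encoder, len(classes)))   # [1, 0]
print(multilabel_encoding_sketch([dog, brown], encoder, len(classes)))  # [1, 1]
print(multilabel_encoding_sketch([cat], encoder, len(classes)))         # [0, 0]
```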
prediction_encoding(tags, encoder)
Encode a list of predicted tags into a floating-point array of scores.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tags | Sequence[PredictedTag] | A list of predicted tags to be encoded. | required |
encoder | Encoder | A callable object that takes a data.Tag object as input and returns an optional integer encoding. If the encoder returns None for a tag, it will be skipped. | required |
Returns:
Name | Type | Description |
---|---|---|
encoded | ndarray | A numpy array of floats of shape (num_classes,) representing the predicted scores for each class. The array contains the scores for each class corresponding to the input predicted tags. |
Examples:
Consider the following set of tags:
>>> dog = data.Tag(key="animal", value="dog")
>>> cat = data.Tag(key="animal", value="cat")
>>> brown = data.Tag(key="color", value="brown")
>>> blue = data.Tag(key="color", value="blue")
And we are only interested in encoding the following two classes:
>>> encoder = create_tag_encoder([dog, brown])
Then the following examples show how the encoding works for predicted tags:
>>> prediction_encoding(
...     [data.PredictedTag(tag=brown, score=0.5)], encoder
... )
array([0, 0.5])
>>> prediction_encoding(
...     [
...         data.PredictedTag(tag=dog, score=0.2),
...         data.PredictedTag(tag=blue, score=0.9),
...     ],
...     encoder,
... )
array([0.2, 0])
>>> prediction_encoding(
...     [
...         data.PredictedTag(tag=dog, score=0.2),
...         data.PredictedTag(tag=brown, score=0.5),
...     ],
...     encoder,
... )
array([0.2, 0.5])
>>> prediction_encoding(
...     [
...         data.PredictedTag(tag=cat, score=0.7),
...     ],
...     encoder,
... )
array([0, 0])
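The same behaviour can be reproduced without the library as a rough sketch. `Tag`, `PredictedTag` and `prediction_encoding_sketch` below are hypothetical stand-ins, a plain list replaces the numpy array, and the sketch assumes that a later score overwrites an earlier one if the same class appears twice.

```python
from typing import Callable, List, NamedTuple, Optional, Sequence


class Tag(NamedTuple):
    """Hypothetical stand-in for data.Tag."""

    key: str
    value: str


class PredictedTag(NamedTuple):
    """Hypothetical stand-in for data.PredictedTag."""

    tag: Tag
    score: float


def prediction_encoding_sketch(
    tags: Sequence[PredictedTag],
    encoder: Callable[[Tag], Optional[int]],
    num_classes: int,
) -> List[float]:
    """Place each predicted score at the index of its class, 0.0 elsewhere."""
    encoded = [0.0] * num_classes
    for predicted in tags:
        index = encoder(predicted.tag)
        if index is not None:  # predictions for unknown classes are skipped
            # Assumption: a repeated class keeps the last score seen.
            encoded[index] = predicted.score
    return encoded


dog = Tag("animal", "dog")
cat = Tag("animal", "cat")
brown = Tag("color", "brown")
blue = Tag("color", "blue")

classes = [dog, brown]  # dog -> 0, brown -> 1
encoder = {tag: i for i, tag in enumerate(classes)}.get

print(prediction_encoding_sketch([PredictedTag(brown, 0.5)], encoder, 2))
# [0.0, 0.5]
print(prediction_encoding_sketch(
    [PredictedTag(dog, 0.2), PredictedTag(blue, 0.9)], encoder, 2
))
# [0.2, 0.0]
print(prediction_encoding_sketch([PredictedTag(cat, 0.7)], encoder, 2))
# [0.0, 0.0]
```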