Sound Event Detection#

Sound Event Detection (SED) is the task of identifying the presence of sound events in an audio recording, estimating their temporal positions (start and end times), and classifying them into predefined categories.

`soundevent.evaluation.tasks.sound_event_detection` #

Sound event detection evaluation.

Functions:

| Name | Description |
| --- | --- |
| `evaluate_clip` | |
| `evaluate_sound_event_detection` | Evaluate sound event detections against ground truth annotations. |
| `sound_event_detection` | |

Attributes#

`EXAMPLE_METRICS = ()` `module-attribute` #

`RUN_METRICS = ((terms.mean_average_precision, metrics.mean_average_precision), (terms.balanced_accuracy, metrics.balanced_accuracy), (terms.accuracy, metrics.accuracy), (terms.top_3_accuracy, metrics.top_3_accuracy))` `module-attribute` #

`SOUNDEVENT_METRICS = ((terms.true_class_probability, metrics.true_class_probability),)` `module-attribute` #

Classes#

`ClipPrediction` #

Bases: Protocol, Generic[Detection]

Protocol defining the requirements for a clip prediction object.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `clip` | `Clip` | |
| `detections` | `Sequence[Detection]` | |

Attributes#

`clip` `instance-attribute` #

`detections` `instance-attribute` #
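
As a minimal sketch of what satisfying this protocol could look like, the block below defines a plain dataclass exposing the two required attributes. `HasClipAndDetections`, `MyDetection`, and `MyClipPrediction` are hypothetical names introduced only for illustration, and the string used for `clip` is a placeholder for a real `Clip` object.

```python
from dataclasses import dataclass
from typing import Any, Protocol, Sequence, runtime_checkable


@runtime_checkable
class HasClipAndDetections(Protocol):
    """Hypothetical local mirror of the two attributes the protocol requires."""

    clip: Any
    detections: Sequence[Any]


@dataclass
class MyDetection:
    """Hypothetical detection type: a time interval with a confidence score."""

    start_time: float
    end_time: float
    score: float


@dataclass
class MyClipPrediction:
    """Hypothetical prediction container exposing `clip` and `detections`."""

    clip: Any  # placeholder; a real pipeline would store a Clip object here
    detections: Sequence[MyDetection]


prediction = MyClipPrediction(
    clip="clip-placeholder",
    detections=[MyDetection(start_time=0.1, end_time=0.4, score=0.9)],
)

# Structural check: only the presence of both attributes is verified at runtime.
conforms = isinstance(prediction, HasClipAndDetections)
```

Because the protocol is structural, any object with `clip` and `detections` attributes should be acceptable; no inheritance from a library base class is assumed here.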

Functions#

`compute_overall_metrics(true_classes, predicted_classes_scores)` #

Compute evaluation metrics based on true classes and predicted scores.
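
To illustrate the kind of computation involved, here is a small pure-Python sketch of accuracy and top-3 accuracy, two of the metrics named in `RUN_METRICS`. It assumes that true classes are integer class indices and that each prediction is a row of per-class scores; the helper `top_k_accuracy` is hypothetical and is not the library's implementation.

```python
from typing import Sequence


def top_k_accuracy(
    true_classes: Sequence[int],
    predicted_classes_scores: Sequence[Sequence[float]],
    k: int = 1,
) -> float:
    """Fraction of examples whose true class is among the k highest scores."""
    if len(true_classes) != len(predicted_classes_scores):
        raise ValueError("Each true class needs exactly one row of scores.")
    if not true_classes:
        raise ValueError("At least one example is required.")

    hits = 0
    for true_class, row in zip(true_classes, predicted_classes_scores):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda index: row[index], reverse=True)
        if true_class in ranked[:k]:
            hits += 1
    return hits / len(true_classes)


true_classes = [0, 2, 1]
scores = [
    [0.7, 0.2, 0.1, 0.0],
    [0.4, 0.3, 0.2, 0.1],
    [0.1, 0.2, 0.3, 0.4],
]

accuracy = top_k_accuracy(true_classes, scores, k=1)
top_3 = top_k_accuracy(true_classes, scores, k=3)
```

With this toy input only the first example is classified correctly at rank 1, while all three true classes fall within the top three scores.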

`evaluate_clip(clip_annotations, clip_predictions, encoder)` #

`evaluate_sound_event(sound_event_prediction, sound_event_annotation, encoder)` #

`evaluate_sound_event_detection(clip_predictions, clip_annotations, affinity, score=None, affinity_threshold=0, strict_match=False)` #

Evaluate sound event detections against ground truth annotations.

This function matches predictions to annotations for each clip individually.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `clip_predictions` | `Sequence[ClipPrediction[Detection]]` | A sequence of prediction objects. Each object must contain a reference to the clip and a sequence of detections. | required |
| `clip_annotations` | `Sequence[ClipAnnotation]` | A sequence of ground truth annotations corresponding to the same clips. | required |
| `affinity` | `Callable[[Detection, SoundEventAnnotation], float]` | A function that computes the affinity score (e.g., IoU) between a detection and a ground truth annotation. | required |
| `score` | `Callable[[Detection], float] \| None` | A function to extract the confidence score from a detection. Used to sort detections greedily. If None, detections are processed in the order provided. | `None` |
| `affinity_threshold` | `float` | The minimum affinity score required for a valid match. Matches with scores less than or equal to this value are discarded. Defaults to 0.0. | `0` |
| `strict_match` | `bool` | If True, a detection is only matched if its highest affinity target is available. If False (default), it falls back to the next best available target. | `False` |
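
The parameter semantics above can be sketched with a small stand-alone greedy matcher. This is an illustration under stated assumptions, not the library's code: detections are plain `(start, end, confidence)` tuples, annotations are `(start, end)` tuples, and `temporal_iou` and `greedy_match` are hypothetical helpers. Reporting unmatched items as rows containing `None` is also an assumption of this sketch.

```python
from typing import Callable, List, Optional, Sequence, Set, Tuple

Interval = Tuple[float, float]  # (start_time, end_time)
ScoredInterval = Tuple[float, float, float]  # (start_time, end_time, confidence)
MatchRow = Tuple[Optional[int], Optional[int], float]  # (detection, annotation, affinity)


def temporal_iou(detection: ScoredInterval, annotation: Interval) -> float:
    """Intersection over union of two time intervals (hypothetical affinity)."""
    overlap = min(detection[1], annotation[1]) - max(detection[0], annotation[0])
    intersection = max(0.0, overlap)
    union = (detection[1] - detection[0]) + (annotation[1] - annotation[0]) - intersection
    return intersection / union if union > 0 else 0.0


def greedy_match(
    detections: Sequence[ScoredInterval],
    annotations: Sequence[Interval],
    affinity: Callable[[ScoredInterval, Interval], float],
    score: Optional[Callable[[ScoredInterval], float]] = None,
    affinity_threshold: float = 0.0,
    strict_match: bool = False,
) -> List[MatchRow]:
    """Greedily assign each detection to at most one annotation."""
    order = list(range(len(detections)))
    if score is not None:
        # Highest-confidence detections choose their target first.
        order.sort(key=lambda i: score(detections[i]), reverse=True)

    taken: Set[int] = set()
    rows: List[MatchRow] = []
    for i in order:
        ranked = sorted(
            ((affinity(detections[i], ann), j) for j, ann in enumerate(annotations)),
            key=lambda pair: pair[0],
            reverse=True,
        )
        # Strict mode only considers the single best target.
        candidates = ranked[:1] if strict_match else ranked
        matched: Optional[Tuple[int, float]] = None
        for value, j in candidates:
            if value <= affinity_threshold:
                break  # remaining candidates are no better
            if j not in taken:
                matched = (j, value)
                break
        if matched is None:
            rows.append((i, None, 0.0))
        else:
            taken.add(matched[0])
            rows.append((i, matched[0], matched[1]))

    # Annotations that no detection claimed.
    rows.extend((None, j, 0.0) for j in range(len(annotations)) if j not in taken)
    return rows


annotations = [(0.0, 1.0), (0.5, 1.5)]
detections = [(0.0, 1.0, 0.9), (0.2, 1.2, 0.8)]

loose = greedy_match(detections, annotations, temporal_iou, score=lambda d: d[2])
strict = greedy_match(
    detections, annotations, temporal_iou, score=lambda d: d[2], strict_match=True
)
```

In this toy input the second detection overlaps most with the first annotation, which the higher-scoring detection has already claimed. Without strict matching it falls back to the second annotation; with `strict_match=True` it stays unmatched and the second annotation is left over.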

Yields:

| Name | Type | Description |
| --- | --- | --- |
| `clip` | `Clip` | The clip associated with the match. |
| `match` | `Match[Detection, SoundEventAnnotation]` | A named tuple containing the matching results, see `Match`. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the number of predictions and annotations differs, or if the sets of clip UUIDs do not match exactly. |
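
The consistency check described above might look roughly like the following. `check_clips_align` is a hypothetical helper operating on bare UUIDs, written only to show the two failure conditions; how the library extracts the UUIDs from its inputs is not shown here.

```python
from typing import Sequence
from uuid import UUID, uuid4


def check_clips_align(
    prediction_uuids: Sequence[UUID],
    annotation_uuids: Sequence[UUID],
) -> None:
    """Raise ValueError unless both inputs cover exactly the same clips."""
    if len(prediction_uuids) != len(annotation_uuids):
        raise ValueError(
            f"Got {len(prediction_uuids)} predictions "
            f"but {len(annotation_uuids)} annotations."
        )
    if set(prediction_uuids) != set(annotation_uuids):
        raise ValueError("Predictions and annotations refer to different clips.")


shared = [uuid4(), uuid4()]

# Same clips in a different order: this sketch accepts it, since it compares sets.
check_clips_align(shared, list(reversed(shared)))

try:
    check_clips_align(shared, [shared[0], uuid4()])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```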

`sound_event_detection(clip_predictions, clip_annotations, tags)` #