Data Schemas#
Welcome to the data schemas tour with the soundevent
package! In this overview, we'll break down the various data schemas provided by the package into the following sections:
Describing the Data#
soundevent
provides tools to attach essential information to various objects in bioacoustic analysis:
- Users: Keeping reference of everyone's contribution.
- Terms: Standardized vocabularies ensure consistent language.
- Tags: Attaching semantic context to objects.
- Features: Numerical descriptors capturing continuously varying attributes.
- Notes: User-written free-text annotations.
Audio Content#
At the core of acoustic analysis, we have schemas for:
- Recordings: Complete audio files.
- Dataset: A collection of recordings from a common source.
Acoustic Objects#
Identifying distinctive sound elements within audio content:
- Geometric Objects: Defining Regions of Interest (RoI) in the temporal-frequency plane.
- Sound Events: Individual sonic occurrences.
- Sequences: Patterns of connected sound events.
- Clips: Fragments extracted from recordings.
Annotation#
soundevent
places emphasis on human annotation processes, covering:
- Sound Event Annotations: Expert-created markers for relevant sound events.
- Sequence Annotations: User provided annotations of sequences of sound events.
- Clip Annotations: Annotations and notes at the clip level.
- Annotation Task: Descriptions of tasks and the status of annotation.
- Annotation Project: The collective description of tasks and annotations.
Prediction#
Automated processing methods also play a role, generating:
- Sound Event Predictions: Predictions made during automated processing.
- Sequence Predictions: Predictions of sequences of sound events.
- Clip Predictions: Collections of predictions and additional information at the clip level.
- Model Runs: Sets of clip predictions generated in a single run by a specific model.
Evaluation#
Assessing the accuracy of predictions is crucial, and soundevent
provides schemas for:
- Matches: Predicted sound events overlapping with ground truth.
- Clip Evaluation: Information about matches and performance metrics at the clip level.
- Evaluation: Comprehensive details on model performance across the entire evaluation set.
- Evaluation Set: Human annotations serving as ground truth.
Want to know more? Dive in for a closer look at each of these schemas.
Unique Identifiers
In soundevent
, various objects feature a field called uuid
. This field
stores a Universal Unique Identifier (UUID), a 128-bit label generated
automatically. When created following standard methods, UUIDs are practically
unique. Information labeled with UUIDs by different parties can be combined
into a unified database without fear of duplication.