Integrating Crowsetta with Soundevent#

Crowsetta is a Python tool for working with annotations of animal vocalizations and other bioacoustics data, and it reads a range of annotation formats. Soundevent complements it with the soundevent.io.crowsetta module, which converts between Crowsetta and Soundevent objects.

In this guide, we'll walk through the process of using Crowsetta to load annotations and then converting them to Soundevent format using the soundevent.io.crowsetta module.

Usage details

To use the soundevent.io.crowsetta module you need to install some additional dependencies. You can do this by running the following command:

pip install "soundevent[crowsetta]"
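
If you are unsure whether the extra dependency is present, a quick check with the standard library can help. This is only a sketch: crowsetta_is_installed is a hypothetical helper name, and it tests only for the top-level crowsetta package.

```python
import importlib.util


def crowsetta_is_installed() -> bool:
    """Return True if the top-level ``crowsetta`` package can be found."""
    # find_spec returns None for a missing top-level package instead of raising.
    return importlib.util.find_spec("crowsetta") is not None


if crowsetta_is_installed():
    print("crowsetta is available")
else:
    print('crowsetta is missing; run: pip install "soundevent[crowsetta]"')
```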

Loading annotations with crowsetta#

We start by loading annotations with Crowsetta.

Crowsetta supported formats#

Crowsetta supports several annotation formats. You can list the available ones:

import crowsetta

print(crowsetta.data.available_formats())

Out:

['aud-txt', 'birdsong-recognition-dataset', 'generic-seq', 'notmat', 'raven', 'simple-seq', 'textgrid', 'timit']

Loading Example Raven Annotations#

Let's walk through the process of loading example Raven annotations using Crowsetta.

import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdirname:
    # Extract the example data files to a temporary directory
    data_dir = os.path.join(tmpdirname, "crowsetta_data")
    crowsetta.data.extract_data_files(user_data_dir=data_dir)

    # Select a Raven example file
    example_file = crowsetta.data.get("raven", user_data_dir=data_dir)

    # Create a Raven transcriber
    transcriber = crowsetta.Transcriber("raven")

    # Load the Raven annotations
    # For this example, we assume the annotations correspond to a test audio
    # file.
    raven_annotations = transcriber.from_file(
        example_file.annot_path,
        annot_col="Species",
        audio_path="sample_audio.wav",
    )

    print(raven_annotations)

    # Convert the annotations to the standard crowsetta format
    annotations = raven_annotations.to_annot()

print(f"Citation: {example_file.citation}")
print(f"Loaded {len(annotations.bboxes)} bounding box annotations")
print("Notated file: ", annotations.notated_path)

Out:

Raven(df=   Selection           View  Channel  ...  low_freq_hz  high_freq_hz  annotation
0          1  Spectrogram 1        1  ...       2878.2        4049.0        EATO
1          2  Spectrogram 1        1  ...       2731.9        3902.7        EATO
2          3  Spectrogram 1        1  ...       2878.2        3975.8        EATO
3          4  Spectrogram 1        1  ...       2756.2        3951.4        EATO
4          5  Spectrogram 1        1  ...       2707.5        3975.8        EATO
5          6  Spectrogram 1        1  ...       2951.4        3975.8        EATO

[6 rows x 8 columns], annot_path=PosixPath('/tmp/tmpklmbrcqh/crowsetta_data/raven/Recording_1_Segment_02.Table.1.selections.txt'), annot_col='Species', audio_path=PosixPath('sample_audio.wav'))
Citation: Chronister, L. M., Rhinehart, T. A., Place, A., & Kitzes, J. (2021). An annotated set of audio recordings of Eastern North American birds containing frequency, time, and species information.https://datadryad.org/stash/dataset/doi:10.5061/dryad.d2547d81z
Loaded 6 bounding box annotations
Notated file:  sample_audio.wav
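
To make the role of annot_col more concrete, here is a minimal sketch, independent of Crowsetta, that parses a Raven-style selection table with the standard library. It is not how Crowsetta is implemented. The column names (such as "Begin Time (s)") are assumed from Raven's usual export, read_selection_table is a hypothetical helper, and the values are those of the first two annotations in the example file as they appear in this guide.

```python
import csv
import io

# A two-row stand-in for a Raven selection table (tab-separated text).
# The column names are an assumption based on Raven's usual export format.
RAVEN_TEXT = (
    "Selection\tView\tChannel\tBegin Time (s)\tEnd Time (s)\t"
    "Low Freq (Hz)\tHigh Freq (Hz)\tSpecies\n"
    "1\tSpectrogram 1\t1\t154.387792767\t154.911598217\t2878.2\t4049.0\tEATO\n"
    "2\tSpectrogram 1\t1\t167.526598245\t168.17302044\t2731.9\t3902.7\tEATO\n"
)


def read_selection_table(text: str, annot_col: str = "Species") -> list[dict]:
    """Parse tab-separated selection rows into plain dictionaries."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        {
            "onset": float(row["Begin Time (s)"]),
            "offset": float(row["End Time (s)"]),
            "low_freq": float(row["Low Freq (Hz)"]),
            "high_freq": float(row["High Freq (Hz)"]),
            # annot_col names the column whose text becomes the label.
            "label": row[annot_col],
        }
        for row in reader
    ]


rows = read_selection_table(RAVEN_TEXT)
print(rows[0])
```

Choosing a different annot_col would take the label from another column of the table, which is why the argument has to be passed when loading Raven files.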

Converting to Soundevent format#

Having successfully loaded the annotations using Crowsetta, we're now ready to convert them to Soundevent format.

import soundevent.io.crowsetta as cr

# Convert Crowsetta Annotations to Soundevent ClipAnnotation
clip_annotation = cr.annotation_to_clip_annotation(annotations)

# Print JSON representation of the ClipAnnotation object
print(
    clip_annotation.model_dump_json(
        indent=2,
        # Avoid printing irrelevant information
        exclude_none=True,
        exclude_defaults=True,
        exclude_unset=True,
    )
)

Out:

{
  "clip": {
    "recording": {
      "path": "sample_audio.wav",
      "duration": 3.0,
      "channels": 1,
      "samplerate": 22050,
      "hash": "7df7fabc84c9fa3d235db620c38ef288"
    },
    "start_time": 0.0,
    "end_time": 3.0
  },
  "sound_events": [
    {
      "sound_event": {
        "geometry": {
          "coordinates": [
            154.387792767,
            2878.2,
            154.911598217,
            4049.0
          ]
        },
        "recording": {
          "path": "sample_audio.wav",
          "duration": 3.0,
          "channels": 1,
          "samplerate": 22050,
          "hash": "7df7fabc84c9fa3d235db620c38ef288"
        }
      },
      "tags": [
        {
          "key": "crowsetta",
          "value": "EATO"
        }
      ]
    },
    {
      "sound_event": {
        "geometry": {
          "coordinates": [
            167.526598245,
            2731.9,
            168.17302044,
            3902.7
          ]
        },
        "recording": {
          "path": "sample_audio.wav",
          "duration": 3.0,
          "channels": 1,
          "samplerate": 22050,
          "hash": "7df7fabc84c9fa3d235db620c38ef288"
        }
      },
      "tags": [
        {
          "key": "crowsetta",
          "value": "EATO"
        }
      ]
    },
    {
      "sound_event": {
        "geometry": {
          "coordinates": [
            183.609636834,
            2878.2,
            184.097751553,
            3975.8
          ]
        },
        "recording": {
          "path": "sample_audio.wav",
          "duration": 3.0,
          "channels": 1,
          "samplerate": 22050,
          "hash": "7df7fabc84c9fa3d235db620c38ef288"
        }
      },
      "tags": [
        {
          "key": "crowsetta",
          "value": "EATO"
        }
      ]
    },
    {
      "sound_event": {
        "geometry": {
          "coordinates": [
            250.527480604,
            2756.2,
            251.160710509,
            3951.4
          ]
        },
        "recording": {
          "path": "sample_audio.wav",
          "duration": 3.0,
          "channels": 1,
          "samplerate": 22050,
          "hash": "7df7fabc84c9fa3d235db620c38ef288"
        }
      },
      "tags": [
        {
          "key": "crowsetta",
          "value": "EATO"
        }
      ]
    },
    {
      "sound_event": {
        "geometry": {
          "coordinates": [
            277.88724277,
            2707.5,
            278.480895806,
            3975.8
          ]
        },
        "recording": {
          "path": "sample_audio.wav",
          "duration": 3.0,
          "channels": 1,
          "samplerate": 22050,
          "hash": "7df7fabc84c9fa3d235db620c38ef288"
        }
      },
      "tags": [
        {
          "key": "crowsetta",
          "value": "EATO"
        }
      ]
    },
    {
      "sound_event": {
        "geometry": {
          "coordinates": [
            295.52970757,
            2951.4,
            296.110168316,
            3975.8
          ]
        },
        "recording": {
          "path": "sample_audio.wav",
          "duration": 3.0,
          "channels": 1,
          "samplerate": 22050,
          "hash": "7df7fabc84c9fa3d235db620c38ef288"
        }
      },
      "tags": [
        {
          "key": "crowsetta",
          "value": "EATO"
        }
      ]
    }
  ]
}

And that's it! We have successfully loaded annotations using crowsetta and converted them to soundevent format.
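
One detail of the JSON above deserves attention: the clip spans 0.0 to 3.0 seconds, the duration of the stand-in sample_audio.wav, while the Raven selections begin at roughly 154 seconds. This is expected here because the annotations were paired with a test audio file rather than their original recording. With real data, a check along the following lines could flag such mismatches. It is a sketch using only the json module on a trimmed copy of the output; events_outside_clip is a hypothetical helper, not part of soundevent.

```python
import json

# A trimmed copy of the JSON printed above: the clip plus the first sound event.
CLIP_JSON = """
{
  "clip": {"start_time": 0.0, "end_time": 3.0},
  "sound_events": [
    {
      "sound_event": {
        "geometry": {
          "coordinates": [154.387792767, 2878.2, 154.911598217, 4049.0]
        }
      },
      "tags": [{"key": "crowsetta", "value": "EATO"}]
    }
  ]
}
"""


def events_outside_clip(data: dict) -> list[dict]:
    """Return the sound events whose time span is not fully inside the clip."""
    clip_start = data["clip"]["start_time"]
    clip_end = data["clip"]["end_time"]
    outside = []
    for annotation in data["sound_events"]:
        # Coordinates are read as [start_time, low_freq, end_time, high_freq].
        coords = annotation["sound_event"]["geometry"]["coordinates"]
        start, _low, end, _high = coords
        if start < clip_start or end > clip_end:
            outside.append(annotation)
    return outside


data = json.loads(CLIP_JSON)
print(len(events_outside_clip(data)))
```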

Converting back to crowsetta format#

Now, let's explore the process of converting Soundevent annotations back to Crowsetta objects.

from soundevent.data import (
    BoundingBox,
    Clip,
    ClipAnnotation,
    Recording,
    SoundEvent,
    SoundEventAnnotation,
    Tag,
)

# First, let's create some annotations for the example audio file
recording = Recording.from_file("sample_audio.wav")

clip_annotation = ClipAnnotation(
    clip=Clip(
        recording=recording,
        start_time=0,
        end_time=1,
    ),
    sound_events=[
        SoundEventAnnotation(
            tags=[
                Tag(key="species", value="bird"),
                Tag(key="color", value="red"),
            ],
            sound_event=SoundEvent(
                recording=recording,
                geometry=BoundingBox(coordinates=[0.1, 2000, 0.2, 3000]),
            ),
        ),
        SoundEventAnnotation(
            tags=[Tag(key="species", value="frog")],
            sound_event=SoundEvent(
                recording=recording,
                geometry=BoundingBox(coordinates=[0.3, 1000, 0.6, 1500]),
            ),
        ),
    ],
)

Now, let's convert the ClipAnnotation object to Crowsetta format:

annotations = cr.annotation_from_clip_annotation(
    clip_annotation,
    "random_file_path.txt",
    annotation_fmt="bbox",
)

print(annotations)

Out:

Annotation(annot_path=PosixPath('random_file_path.txt'), notated_path=PosixPath('sample_audio.wav'), bboxes=[BBox(onset=0.1, offset=0.2, low_freq=2000.0, high_freq=3000.0, label='species:bird,color:red'), BBox(onset=0.3, offset=0.6, low_freq=1000.0, high_freq=1500.0, label='species:frog')])

Note

While working with crowsetta, annotation objects are typically loaded from a file. In this demonstration, we pass a placeholder file name to create the annotations, even though no such file exists. Crowsetta requires a file path to create an annotation, even if nothing is ever written to that path, so passing a placeholder is safe.
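
Comparing the input and the printed result suggests how the geometry is mapped: the BoundingBox coordinates are ordered [start_time, low_freq, end_time, high_freq], and they land in the onset, low_freq, offset and high_freq fields of the crowsetta BBox. The sketch below reproduces that mapping in plain Python. BBoxSketch and bbox_from_coordinates are hypothetical stand-ins for illustration, not the classes or functions the libraries use.

```python
from dataclasses import dataclass


@dataclass
class BBoxSketch:
    """Hypothetical stand-in mirroring the fields printed for crowsetta's BBox."""

    onset: float
    offset: float
    low_freq: float
    high_freq: float
    label: str


def bbox_from_coordinates(coordinates: list[float], label: str) -> BBoxSketch:
    """Map [start_time, low_freq, end_time, high_freq] onto BBox-like fields."""
    start_time, low_freq, end_time, high_freq = coordinates
    return BBoxSketch(
        onset=float(start_time),
        offset=float(end_time),
        low_freq=float(low_freq),
        high_freq=float(high_freq),
        label=label,
    )


box = bbox_from_coordinates([0.1, 2000, 0.2, 3000], "species:bird,color:red")
print(box)
```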

Finer Control#

When converting between crowsetta and soundevent formats, you can control how information is mapped. Soundevent objects can hold more information than crowsetta objects, including multiple tags, notes, and several geometry types, so the conversion is not always one-to-one. In particular, when converting from soundevent to crowsetta format, you need to decide how to handle the additional information.

Tags and Labels#

One of the primary distinctions between crowsetta and soundevent lies in their handling of labels/tags. While crowsetta employs a single textual label for each annotation, soundevent utilizes a list of key-value tags. This difference complicates the conversion process.

By default, when converting from crowsetta to soundevent format, the label field of the crowsetta annotation is converted to a single tag with the key crowsetta and the label as its value. This behavior can be customized. Refer to the documentation for more information.

label = "bird"
tags = cr.label_to_tags(label)
print(tags)

Out:

[Tag(key='crowsetta', value='bird')]

In the reverse direction, the default behavior joins all tags into a single label made of comma-separated key:value pairs. For example:

tags = [
    Tag(key="species", value="bird"),
    Tag(key="color", value="red"),
]
label = cr.label_from_tags(tags)
print(label)

Out:

species:bird,color:red

Once again, you have the option to customize this behavior. Refer to the documentation for more information.
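
To illustrate the kind of decision involved, the following sketch reimplements the default joining shown above in plain Python and contrasts it with one possible alternative, keeping only the value of a chosen key. TagSketch, join_tags and value_for_key are hypothetical names for illustration. They are not the soundevent API, whose actual options are described in its documentation.

```python
from typing import NamedTuple, Optional


class TagSketch(NamedTuple):
    """Hypothetical stand-in for a soundevent Tag (a key/value pair)."""

    key: str
    value: str


def join_tags(tags: list[TagSketch], separator: str = ",") -> str:
    """Join tags as 'key:value' pairs, like the default output shown above."""
    return separator.join(f"{tag.key}:{tag.value}" for tag in tags)


def value_for_key(tags: list[TagSketch], key: str) -> Optional[str]:
    """One possible customization: keep only the value of a chosen key."""
    for tag in tags:
        if tag.key == key:
            return tag.value
    return None


tags = [TagSketch("species", "bird"), TagSketch("color", "red")]
print(join_tags(tags))
print(value_for_key(tags, "species"))
```

Joining keeps every tag but produces a label that other crowsetta users must parse, while selecting a single key gives a cleaner label at the cost of dropping the remaining tags.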
