
Data Description

Let's explore users, terms, tags, features, and notes – essential tools for enriching bioacoustic research. Controlled vocabularies (terms), categorical tags, numerical features, and free-form notes provide deeper context and insights into your research objects. User information ensures proper attribution for everyone involved.

Users

Bioacoustic analysis often involves collaboration between data collectors, annotators, reviewers, administrators, developers, and researchers. To acknowledge contributions, soundevent introduces a User data schema that stores basic information about each individual. A User object can optionally include a name, email, username, and institution. It's crucial to respect privacy and ensure individuals are comfortable sharing this information. If concerns remain, User objects can be omitted entirely.

erDiagram
    User {
        UUID uuid
        string name
        string email
        string username
        string institution
    }
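
To make the schema concrete, here is a minimal sketch of a record with the same fields as the diagram above. It uses a plain standard-library dataclass as an illustrative stand-in and is not soundevent's own User class; the example names and values are made up.

```python
from dataclasses import dataclass, field
from typing import Optional
from uuid import UUID, uuid4


@dataclass(frozen=True)
class User:
    """Illustrative stand-in mirroring the fields in the diagram above."""

    # Only the identifier is always present; every personal field is
    # optional, so people can share as much or as little as they like.
    uuid: UUID = field(default_factory=uuid4)
    name: Optional[str] = None
    email: Optional[str] = None
    username: Optional[str] = None
    institution: Optional[str] = None


# A contributor who is happy to share a name and an affiliation.
annotator = User(name="Ana Example", institution="Example University")

# A contributor who prefers to share nothing beyond an opaque identifier.
anonymous = User()
```

Leaving every personal field unset still yields a usable record, which is one way to credit a contribution without exposing personal details.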

Terms

Terms ensure everyone's on the same page. Inconsistent naming like "species" vs. "Species" wastes time. Terms provide a controlled vocabulary for common properties used in annotations and descriptions.

We've selected terms from established vocabularies like Darwin Core and Audiovisual Core, aligning your work with best practices. See the soundevent documentation for the full list of terms it defines.
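
The idea of a controlled vocabulary can be sketched in a few lines. The `Term` fields, the `VOCABULARY` mapping, and the `resolve` helper below are all hypothetical and chosen for illustration; they are not the terms or the API that soundevent ships.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """Hypothetical minimal term; the field names are assumptions."""

    name: str
    label: str


# A tiny hand-made vocabulary, keyed by a lowercase lookup name.
VOCABULARY = {
    "species": Term(name="species", label="Species"),
    "call_type": Term(name="call_type", label="Call type"),
}


def resolve(raw_name: str) -> Term:
    """Map a user-typed name onto the single agreed term, or fail loudly."""
    key = raw_name.strip().lower()
    if key not in VOCABULARY:
        raise KeyError(f"{raw_name!r} is not in the controlled vocabulary")
    return VOCABULARY[key]


# "species" and "Species" now resolve to one and the same term.
same_term = resolve("Species") is resolve("species")
```

Rejecting unknown names instead of silently accepting them is the design choice that keeps spelling variants from creeping into a dataset.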

Tags

Tags are informative labels within the soundevent package. They add meaning to recordings, clips, or sound events, helping organize and contextualize data.

A Tag has two parts: a term and a value. The term acts as a namespace, refining the Tag's meaning and context.

erDiagram
    Tag {
        string value
    }
    Term
    Tag ||--o| Term: term

You have the flexibility to use a term or not. We strongly recommend it, but it's not mandatory. This adaptability allows you to tailor Tags to your specific project needs.
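
As a rough sketch of the structure described above, the following stand-in classes model a Tag whose term is optional. They are plain dataclasses written for illustration, not soundevent's classes; the single `name` field on `Term` and the `values_for` helper are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Term:
    """Hypothetical minimal term; only a name is modelled here."""

    name: str


@dataclass(frozen=True)
class Tag:
    """Illustrative stand-in: a value plus an optional term."""

    value: str
    term: Optional[Term] = None


species = Term(name="species")
call_type = Term(name="call_type")

tags = [
    Tag(value="Myotis myotis", term=species),
    Tag(value="echolocation", term=call_type),
    Tag(value="noisy"),  # no term: allowed, but harder to interpret later
]


def values_for(tags: List[Tag], term: Term) -> List[str]:
    """Collect the values of every tag filed under the given term."""
    return [tag.value for tag in tags if tag.term == term]


species_values = values_for(tags, species)
```

Because the term acts as a namespace, filtering by term retrieves only the values that mean "species", even if another term happens to use the same value.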

What is a namespace?

Taken from the Wikipedia article on namespaces:

a namespace is a set of signs (names) that are used to identify and refer to objects of various kinds. A namespace ensures that all of a given set of objects have unique names so that they can be easily identified.

[...]

namespaces are typically employed for the purpose of grouping symbols and identifiers around a particular functionality and to avoid name collisions between multiple identifiers that share the same name

Features

Features are numerical descriptions. They can include measurements from environmental sensors, attributes of sound-producing individuals, or even abstract features extracted by deep learning models. Features enable comparison, visualization, outlier identification, understanding characteristic distributions, and statistical analysis.

A Feature consists of a Term and a numerical value.

erDiagram
    Feature {
        float value
    }
    Term
    Feature ||--o| Term: term
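
The sketch below shows how term and value pairs of this kind lend themselves to simple statistics, such as flagging an outlier. The classes are illustrative stand-ins rather than soundevent's own, the `duration` term and the measurements are invented, and the 1.5 standard deviation threshold is an arbitrary choice for a five-point sample.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass(frozen=True)
class Term:
    """Hypothetical minimal term; only a name is modelled here."""

    name: str


@dataclass(frozen=True)
class Feature:
    """Illustrative stand-in: a term paired with a numerical value."""

    term: Term
    value: float


duration = Term(name="duration")

# Made-up call durations in seconds; the last one is suspiciously long.
features = [
    Feature(term=duration, value=v) for v in (0.12, 0.15, 0.11, 0.14, 0.95)
]

values = [f.value for f in features if f.term == duration]
mu = mean(values)
sigma = stdev(values)  # sample standard deviation

# Flag values further than 1.5 sample standard deviations from the mean.
outliers = [v for v in values if abs(v - mu) > 1.5 * sigma]
```

With these numbers only the 0.95 s measurement is flagged. Real data would usually call for a more robust rule and a larger sample.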

Notes

Notes are free-form textual additions, facilitating communication and providing context. They can convey information, enable discussions, or flag data issues.

Notes can have any length and include the note's creator and time of creation. Notes can also be marked as issues to highlight points needing review.

erDiagram
    Note {
        UUID uuid
        string message
        datetime created_on
        bool is_issue
    }
    User
    Note ||--o| User: created_by
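
To illustrate the fields in the diagram above, here is a minimal sketch that creates two notes and picks out the one marked as an issue. The classes are plain dataclass stand-ins, not soundevent's own; the defaults (a UTC creation time, `is_issue=False`) and the example messages are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional
from uuid import UUID, uuid4


@dataclass(frozen=True)
class User:
    """Illustrative stand-in for a note's author."""

    uuid: UUID = field(default_factory=uuid4)
    name: Optional[str] = None


def _now_utc() -> datetime:
    """Timezone-aware timestamp used as the assumed creation default."""
    return datetime.now(timezone.utc)


@dataclass(frozen=True)
class Note:
    """Illustrative stand-in mirroring the fields in the diagram above."""

    message: str
    uuid: UUID = field(default_factory=uuid4)
    created_on: datetime = field(default_factory=_now_utc)
    is_issue: bool = False
    created_by: Optional[User] = None  # the author may be left out


reviewer = User(name="Ana Example")

notes = [
    Note(message="Clear recording, little background noise."),
    Note(
        message="Timestamp looks wrong; please check the recorder clock.",
        is_issue=True,
        created_by=reviewer,
    ),
]

# Notes flagged as issues are the ones that need a second look.
open_issues = [note for note in notes if note.is_issue]
```

Keeping `is_issue` as a simple boolean makes it cheap to build a review queue from any collection of notes.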