Data Description
Let's explore users, terms, tags, features, and notes – essential tools for enriching bioacoustic research. Controlled vocabularies (terms), categorical tags, numerical features, and free-form notes provide deeper context and insights into your research objects. User information ensures proper attribution for everyone involved.
Users
Bioacoustic analysis often involves collaboration between data collectors, annotators, reviewers, administrators, developers, and researchers. To acknowledge contributions, soundevent introduces a Users data schema, storing basic information about each individual. The User object can optionally include name, email, username and institution. It's crucial to respect privacy and ensure individuals are comfortable sharing this information. If concerns remain, User objects can be omitted entirely.
erDiagram
User {
UUID uuid
string name
string email
string username
string institution
}
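The optional nature of these fields can be sketched with a plain dataclass. This is a hypothetical stand-in that mirrors the diagram above, not the soundevent implementation; the class and the example usernames are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional
from uuid import UUID, uuid4


@dataclass
class User:
    """Hypothetical stand-in mirroring the fields in the diagram above."""

    uuid: UUID = field(default_factory=uuid4)
    name: Optional[str] = None
    email: Optional[str] = None
    username: Optional[str] = None
    institution: Optional[str] = None


# Share only what the person is comfortable with: here, just a username.
annotator = User(username="annotator_01")

# Every descriptive field may be left out; only the identifier remains.
anonymous = User()

print(annotator.username, annotator.email)  # annotator_01 None
```

Because every descriptive field defaults to `None`, a contributor can be credited under a pseudonymous username without exposing a name or email address.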
Terms
Terms ensure everyone's on the same page. Inconsistent naming like "species" vs. "Species" wastes time. Terms provide a controlled vocabulary for common properties used in annotations and descriptions.
We've selected terms from established vocabularies like Darwin Core and Audiovisual Core, aligning your work with best practices. The terms defined in soundevent are listed in the package documentation.
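To illustrate how a controlled vocabulary removes the "species" vs. "Species" problem, here is a minimal sketch. The `Term` class, the `VOCABULARY` mapping, and `resolve_term` are hypothetical names invented for this example, and the choice to map "species" onto the Darwin Core term `dwc:scientificName` is an assumption rather than a statement about soundevent's own term list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """Hypothetical term: a canonical name plus a human-readable label."""

    name: str
    label: str


# Illustrative vocabulary with a single entry, keyed by a lower-case alias.
VOCABULARY = {
    "species": Term(name="dwc:scientificName", label="Scientific Name"),
}


def resolve_term(raw: str) -> Term:
    """Map a user-supplied property name onto its controlled term."""
    key = raw.strip().lower()
    if key not in VOCABULARY:
        raise KeyError(f"{raw!r} is not in the controlled vocabulary")
    return VOCABULARY[key]


# Differently capitalised spellings resolve to the same term.
same = resolve_term("species") == resolve_term("Species")
```

Rejecting unknown names instead of silently accepting them is one possible design choice; it surfaces typos early at the cost of requiring the vocabulary to be extended explicitly.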
Tags
Tags are informative labels within the soundevent package. They add meaning to recordings, clips, or sound events, helping organize and contextualize data.
A Tag has two parts: a term and a value. The term acts as a namespace, refining the Tag's meaning and context.
erDiagram
Tag {
string value
}
Term
Tag ||--o| Term: term
Attaching a term to a Tag is optional. We strongly recommend it, but it is not mandatory, which lets you tailor Tags to your specific project needs.
What is a namespace?
Taken from the Wikipedia article on namespaces:
a namespace is a set of signs (names) that are used to identify and refer to objects of various kinds. A namespace ensures that all of a given set of objects have unique names so that they can be easily identified.
[...]
namespaces are typically employed for the purpose of grouping symbols and identifiers around a particular functionality and to avoid name collisions between multiple identifiers that share the same name
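The namespace role of a term can be sketched as follows. The `Term` and `Tag` dataclasses are hypothetical stand-ins that follow the diagram above (a value plus an optional term), not the soundevent classes, and the term names `call_type` and `behaviour` are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    """Hypothetical term, reduced to a name for this sketch."""

    name: str


@dataclass(frozen=True)
class Tag:
    """Hypothetical tag: a value and an optional term, as in the diagram."""

    value: str
    term: Optional[Term] = None


# Same value, different namespaces: the two tags stay distinguishable.
call_tag = Tag(value="social", term=Term(name="call_type"))
behaviour_tag = Tag(value="social", term=Term(name="behaviour"))

# A tag without a term is allowed, but its meaning is ambiguous.
bare_tag = Tag(value="social")

distinct_tags = {call_tag, behaviour_tag, bare_tag}
```

Without the term, the three tags above would collapse into a single "social" label, which is the kind of name collision a namespace is meant to prevent.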
Features
Features are numerical descriptions. They can include measurements from environmental sensors, attributes of sound-producing individuals, or even abstract features extracted by deep learning models. Features enable comparison, visualization, outlier identification, understanding characteristic distributions, and statistical analysis.
A Feature consists of a Term and a numerical value.
erDiagram
Feature {
float value
}
Term
Feature ||--o| Term: term
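As a rough illustration of the outlier identification mentioned above, the sketch below flags feature values that lie far from the mean. The `Term` and `Feature` dataclasses are hypothetical stand-ins following the diagram, the `duration` values are made up, and the 1.5 standard deviation threshold is an arbitrary assumption chosen for this small sample.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Optional


@dataclass(frozen=True)
class Term:
    """Hypothetical term, reduced to a name for this sketch."""

    name: str


@dataclass(frozen=True)
class Feature:
    """Hypothetical feature: a numerical value and an optional term."""

    value: float
    term: Optional[Term] = None


duration = Term(name="duration")

# Made-up durations (in seconds) for five sound events.
features = [
    Feature(term=duration, value=v) for v in (0.10, 0.12, 0.11, 0.09, 0.50)
]

values = [f.value for f in features]
mu, sigma = mean(values), stdev(values)

# Flag values more than 1.5 sample standard deviations from the mean.
outliers = [f for f in features if abs(f.value - mu) > 1.5 * sigma]
```

Because every value shares the same term, the comparison is meaningful; mixing features with different terms in one statistic would compare unrelated quantities.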
Notes
Notes are free-form textual additions, facilitating communication and providing context. They can convey information, enable discussions, or flag data issues.
Notes can have any length and include the note's creator and time of creation. Notes can also be marked as issues to highlight points needing review.
erDiagram
Note {
UUID uuid
string message
datetime created_on
bool is_issue
}
User
Note ||--o| User: created_by
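The sketch below shows how notes marked as issues could be picked out for review. The `User` and `Note` dataclasses are hypothetical stand-ins that follow the diagram above, not the soundevent classes, and the messages and username are invented for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional
from uuid import UUID, uuid4


@dataclass
class User:
    """Hypothetical user, reduced to a username for this sketch."""

    username: Optional[str] = None


@dataclass
class Note:
    """Hypothetical note mirroring the fields in the diagram above."""

    message: str
    uuid: UUID = field(default_factory=uuid4)
    created_on: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    is_issue: bool = False
    created_by: Optional[User] = None  # the author is optional


reviewer = User(username="reviewer_01")

notes = [
    Note(message="Clear recording, little wind noise."),
    Note(
        message="Sound event boundaries look too wide.",
        is_issue=True,
        created_by=reviewer,
    ),
]

# Collect only the notes that need review.
issues = [note for note in notes if note.is_issue]
```

Recording the creation time automatically keeps the discussion history ordered without relying on the author to supply a timestamp.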