Data Description
Let's delve into users, tags, features, and notes: the tools that add depth to our bioacoustic research. Categorical tags, numerical features, and freeform notes add an extra layer of understanding to our research objects, while user information ensures that the contributions of everyone involved are properly attributed.
Users
Collaboration is at the heart of most bioacoustic analyses, involving data collectors, annotators, reviewers, administrators, developers, and researchers. To ensure proper attribution of work, soundevent introduces a User data schema that holds minimal information about each individual involved. A User object can optionally include a name, email, username (a commonly known alias), and institution. Because this information is sensitive, it is important to confirm that individuals are comfortable sharing these details. If privacy concerns persist, User objects can be omitted altogether.
erDiagram
User {
UUID uuid
string name
string email
string username
string institution
}
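As a rough illustration of the schema above, the following sketch models a user with a plain Python dataclass. It is a stand-in written for this page, not soundevent's own class: the `display_name` helper and the example alias are hypothetical, and the only thing taken from the text is that every descriptive field is optional.

```python
from dataclasses import dataclass, field
from typing import Optional
from uuid import UUID, uuid4


@dataclass(frozen=True)
class User:
    """Illustrative stand-in mirroring the diagram; not soundevent's class."""

    uuid: UUID = field(default_factory=uuid4)
    name: Optional[str] = None
    email: Optional[str] = None
    username: Optional[str] = None
    institution: Optional[str] = None


def display_name(user: Optional[User]) -> str:
    """Return the most informative label shared, or 'anonymous' if none."""
    if user is None:
        # The User object was omitted altogether for privacy reasons.
        return "anonymous"
    return user.name or user.username or "anonymous"


# A contributor who chose to share only an alias (hypothetical value).
annotator = User(username="batlistener")
```

Here `display_name(annotator)` falls back to the alias, and passing `None` covers the case where no User object was recorded at all.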
Tags
Tags within the soundevent package are like categorical variables that add specific meaning to the objects they adorn, be it recordings, clips, or sound events. Serving as informative labels, Tags offer a way to organize and contextualize data.
A Tag comprises two essential components: a key and a value, both simple text. While in many computational contexts a tag might be considered just a piece of text, we find it exceptionally beneficial to introduce a "namespace" (the key) for each tag. This key refines the meaning of the Tag and establishes the context in which it is used.
erDiagram
Tag{
string key
string value
}
The beauty lies in the flexibility offered: there are no restrictions on what can be used as a key or value. This accommodates project-specific requirements, allowing researchers to tailor Tags to their unique needs and objectives.
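To make the key/value structure concrete, here is a minimal sketch using a plain dataclass as a stand-in for a Tag. The class, the `group_by_key` helper, and the example keys and values are assumptions made for illustration, not soundevent's API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """Illustrative stand-in for a key/value tag; not soundevent's class."""

    key: str
    value: str


def group_by_key(tags):
    """Collect tag values under their key, preserving input order."""
    grouped = defaultdict(list)
    for tag in tags:
        grouped[tag.key].append(tag.value)
    return dict(grouped)


# Hypothetical tags attached to a single clip.
tags = [
    Tag(key="species", value="Myotis myotis"),
    Tag(key="behaviour", value="feeding"),
    Tag(key="species", value="Pipistrellus pipistrellus"),
]
grouped = group_by_key(tags)
```

Because each tag carries its key, grouping or filtering by meaning (all species, all behaviours) needs no parsing of the label text.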
What is a namespace?
Taken from the Wikipedia article on namespaces:
a namespace is a set of signs (names) that are used to identify and refer to objects of various kinds. A namespace ensures that all of a given set of objects have unique names so that they can be easily identified.
[...]
namespaces are typically employed for the purpose of grouping symbols and identifiers around a particular functionality and to avoid name collisions between multiple identifiers that share the same name
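The name-collision point can be seen in a small sketch. The keys `recording_quality` and `call_pitch` and the helper `values_for` are hypothetical, and tags are represented here as simple `(key, value)` tuples rather than soundevent objects.

```python
# Bare labels: two different meanings of "high" collapse into one entry,
# so the distinction between them is lost.
bare_labels = {"high", "high"}

# (key, value) pairs: the key acts as the namespace, so both survive.
keyed_tags = {
    ("recording_quality", "high"),
    ("call_pitch", "high"),
}


def values_for(key, tags):
    """Return the sorted values stored under one key."""
    return sorted(value for k, value in tags if k == key)
```

With bare labels only one "high" remains, whereas the keyed version keeps both and lets each be retrieved by its context.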
Features
Features are numerical descriptions that add quantitative information to the objects they are attached to. They can vary widely in nature: from environmental sensor measurements, to attributes of the individual producing a sound, to abstract features extracted by general-purpose deep learning models. When multiple Features accompany sound events, clips, or recordings, they become tools for understanding similarities and differences, allowing comparison and visualization in feature space. Features also play a pivotal role in identifying outliers, understanding how characteristics are distributed, and enabling statistical analyses.
A Feature comprises a textual name and a floating-point value. In soundevent, lists of Features can be attached to various objects without restrictions on the name or value. This flexibility allows features to be tailored to specific project needs.
erDiagram
Feature {
string name
float value
}
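As one way to picture comparison in feature space, the sketch below turns lists of name/value features into vectors and measures the Euclidean distance between two sound events. The `Feature` dataclass is a stand-in rather than soundevent's class, and the feature names, values, and helper functions are hypothetical. In practice features on different scales would usually be normalised first.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Illustrative stand-in for a named numerical feature."""

    name: str
    value: float


def to_vector(features, names):
    """Order feature values by `names`; raises KeyError if one is missing."""
    lookup = {feature.name: feature.value for feature in features}
    return [lookup[name] for name in names]


def feature_distance(a, b, names):
    """Euclidean distance between two feature lists over shared names."""
    return math.dist(to_vector(a, names), to_vector(b, names))


# Two hypothetical sound events described by the same pair of features.
call_a = [Feature("duration_ms", 3.0), Feature("peak_frequency_khz", 45.0)]
call_b = [Feature("peak_frequency_khz", 49.0), Feature("duration_ms", 6.0)]

distance = feature_distance(
    call_a, call_b, names=["duration_ms", "peak_frequency_khz"]
)
```

Looking features up by name, instead of relying on list position, keeps the comparison valid even when the two lists store their features in a different order.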
Notes
Notes serve as textual companions, allowing communication among researchers and providing nuanced context to the objects they accompany. Whether conveying vital information, engaging in discussions about specific aspects of the attached objects, or flagging potential data issues, Notes play an indispensable role in promoting collaboration and enriching the overall understanding of audio data.
Alongside the message itself, which can be of any length, a Note records its creator and its time of creation, ensuring proper attribution. Beyond their informative role, Notes can be marked as issues to flag significant points that require external review.
erDiagram
Note {
UUID uuid
string message
datetime created_on
bool is_issue
}
User
Note ||--o| User: created_by
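The diagram above can be approximated with the following sketch, in which a note optionally points to its creator and carries an issue flag. Both dataclasses are simplified stand-ins, not soundevent's classes. The `open_issues` helper, the messages, and the username are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional
from uuid import UUID, uuid4


@dataclass(frozen=True)
class User:
    """Simplified stand-in for a user; only an alias is kept here."""

    username: str


@dataclass(frozen=True)
class Note:
    """Illustrative stand-in mirroring the Note diagram."""

    message: str
    created_by: Optional[User] = None  # zero or one creator per note
    is_issue: bool = False
    created_on: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    uuid: UUID = field(default_factory=uuid4)


def open_issues(notes):
    """Return only the notes flagged for external review."""
    return [note for note in notes if note.is_issue]


reviewer = User(username="reviewer_1")
notes = [
    Note(message="Clear social call at 2 s.", created_by=reviewer),
    Note(message="Possible clipping, please check.", is_issue=True),
]
flagged = open_issues(notes)
```

The second note has no creator, matching the optional side of the `created_by` relationship, and is the only one returned by `open_issues`.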