Transforms Module
soundevent.transforms
Data transformations for soundevent objects.
This module provides a framework for applying transformations to soundevent data objects. The core of the framework is the TransformBase class, which defines a visitor pattern for traversing the complex hierarchy of soundevent data models.
The module also includes concrete transform classes for common data manipulation tasks, such as modifying recording paths (PathTransform) or transforming tags (TagsTransform).
These tools are designed to help users clean, modify, and standardize their bioacoustic datasets in a structured and reliable way.
Modules:
Name | Description |
---|---|
base | Base classes for data transformations. |
path | Transformations for recording paths. |
tags | Transformations for tags. |
Classes:
Name | Description |
---|---|
PathTransform | A transform for modifying the path of recordings. |
TagsTransform | A transform for modifying sequences of tags. |
TransformBase | Base class for creating data transformations. |
Classes
PathTransform(transform)
Bases: TransformBase
A transform for modifying the path of recordings.
This class provides a convenient way to apply a path transformation to all Recording objects within a larger data structure (like a Dataset or AnnotationProject). It works by overriding the transform_path method of TransformBase.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
transform | Callable[[Path], Path] | A function that takes a Path and returns a Path. | required |
Examples:
>>> from pathlib import Path
>>> from soundevent import data
>>> from soundevent.transforms import PathTransform
>>>
>>> # Create a sample dataset to work with
>>> recording = data.Recording(
...     path=Path("../relative/path/rec.wav"),
...     duration=1,
...     channels=1,
...     samplerate=16000,
... )
>>> dataset = data.Dataset(name="test-dataset", recordings=[recording])
>>>
>>> # Define a function to make all paths absolute
>>> def make_absolute(path: Path) -> Path:
...     # This is a simplistic example, in reality you might need a base directory
...     return path.resolve()
>>>
>>> # Create and apply the transform
>>> path_transformer = PathTransform(transform=make_absolute)
>>> transformed_dataset = path_transformer.transform_dataset(dataset)
>>>
>>> # Check that the path in the transformed dataset is absolute
>>> transformed_dataset.recordings[0].path.is_absolute()
True
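The comment in the example above notes that a real workflow usually needs a base directory. The following is a minimal sketch of such a path function, using only pathlib. The names rebase_onto and to_audio_root, and the directory /data/audio, are hypothetical and not part of soundevent.

```python
from pathlib import Path
from typing import Callable


def rebase_onto(base_dir: Path) -> Callable[[Path], Path]:
    """Return a path function that anchors relative paths at base_dir."""

    def transform(path: Path) -> Path:
        # Absolute paths are returned unchanged; relative paths are
        # joined onto base_dir.
        if path.is_absolute():
            return path
        return base_dir / path

    return transform


# Hypothetical audio root, chosen only for illustration.
to_audio_root = rebase_onto(Path("/data/audio"))
rebased = to_audio_root(Path("site_a/rec.wav"))
```

Because to_audio_root has the Callable[[Path], Path] shape listed in the parameter table, it should be usable as PathTransform(transform=to_audio_root) in the same way as make_absolute.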
Methods:
Name | Description |
---|---|
transform_path | Apply the transformation to a path. |
Attributes:
Name | Type | Description |
---|---|---|
transform | | |
TagsTransform(transform)
Bases: TransformBase
A transform for modifying sequences of tags.
This class provides a way to apply a transformation to all Tag sequences within a soundevent data structure. It is useful for filtering, renaming, or otherwise modifying tags across an entire dataset.
It can be initialized directly with a function that transforms a whole sequence of tags, or it can be constructed from a function that transforms a single tag using the from_tag_transform class method.
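A whole-sequence function is the natural choice when the logic depends on more than one tag at a time, for example removing duplicates. The sketch below illustrates this with a hypothetical stand-in Tag class that only carries key and value; it is not the soundevent Tag model, and drop_duplicate_tags is an illustrative name.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Tag:
    """Hypothetical stand-in for a soundevent tag (key and value only)."""

    key: str
    value: str


def drop_duplicate_tags(tags: Sequence[Tag]) -> List[Tag]:
    """Keep the first occurrence of each (key, value) pair, preserving order."""
    seen = set()
    unique = []
    for tag in tags:
        identity = (tag.key, tag.value)
        if identity not in seen:
            seen.add(identity)
            unique.append(tag)
    return unique


tags = [
    Tag("species", "Myotis myotis"),
    Tag("quality", "good"),
    Tag("species", "Myotis myotis"),
]
deduplicated = drop_duplicate_tags(tags)
```

A function of this shape, written against the real Tag model, would be passed directly as TagsTransform(transform=...), since it cannot be expressed as a per-tag function.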
Parameters:
Name | Type | Description | Default |
---|---|---|---|
transform | Callable[[Sequence[Tag]], Sequence[Tag]] | A function that takes a sequence of Tag objects and returns a sequence of Tag objects. | required |
Examples:
>>> from pathlib import Path
>>> from soundevent import data
>>> from soundevent.transforms import TagsTransform
>>>
>>> # Create a sample recording with a misspelled species tag
>>> recording = data.Recording(
...     path=Path("rec.wav"),
...     duration=1,
...     channels=1,
...     samplerate=16000,
...     tags=[
...         data.Tag(key="species", value="Myotis mytis"),
...         data.Tag(key="quality", value="good"),
...     ],
... )
>>>
>>> # Create a transform to correct the spelling of "Myotis myotis"
>>> def correct_species_name(tag: data.Tag) -> data.Tag:
...     if tag.key == "species" and tag.value == "Myotis mytis":
...         return tag.model_copy(update={"value": "Myotis myotis"})
...     return tag
>>> corrector = TagsTransform.from_tag_transform(
...     transform=correct_species_name
... )
>>> transformed_recording = corrector.transform_recording(recording)
>>>
>>> # Verify that the tag value has been corrected
>>> species_tag = next(
...     t for t in transformed_recording.tags if t.key == "species"
... )
>>> species_tag.value
'Myotis myotis'
>>>
>>> # Verify that other tags are untouched
>>> quality_tag = next(
...     t for t in transformed_recording.tags if t.key == "quality"
... )
>>> quality_tag.value
'good'
Methods:
Name | Description |
---|---|
from_tag_transform | Create a TagsTransform from a function that transforms a single tag. |
transform_tags | Apply the transformation to a sequence of tags. |
Attributes:
Name | Type | Description |
---|---|---|
transform | | |
Attributes
transform = transform
instance-attribute
Functions
from_tag_transform(transform)
classmethod
Create a TagsTransform from a function that transforms a single tag.
This factory method is a convenient way to create a TagsTransform when your logic applies to each tag individually.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
transform | Callable[[Tag], Optional[Tag]] | A function that takes a single Tag and returns a Tag or None. | required |
Returns:
Type | Description |
---|---|
TagsTransform | A new TagsTransform instance. |
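To illustrate how a per-tag function can be lifted to a whole-sequence function, here is a minimal sketch. It is not the library's implementation. It assumes that a return value of None means the tag should be dropped, which the Optional[Tag] return type suggests but this page does not state. The names lift_to_sequence and strip_or_drop are hypothetical, and plain strings stand in for Tag objects.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def lift_to_sequence(
    tag_fn: Callable[[T], Optional[T]],
) -> Callable[[Sequence[T]], List[T]]:
    """Turn a per-item function into a function over a whole sequence."""

    def sequence_fn(tags: Sequence[T]) -> List[T]:
        result = []
        for tag in tags:
            new_tag = tag_fn(tag)
            # Assumption: None signals that the item should be dropped.
            if new_tag is not None:
                result.append(new_tag)
        return result

    return sequence_fn


def strip_or_drop(tag: str) -> Optional[str]:
    """Strip whitespace; return None for items that end up empty."""
    cleaned = tag.strip()
    return cleaned or None


clean_all = lift_to_sequence(strip_or_drop)
cleaned_tags = clean_all(["  bat ", "", "good"])
```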
TransformBase
Base class for creating data transformations.
This class implements the visitor pattern to traverse the complex hierarchy of soundevent data objects. It provides transform_* methods for each type of data object in the soundevent ecosystem.
The default implementation of each transform_* method returns the object unchanged or, for container-like objects, recursively calls the appropriate transform methods on their children and returns a new container with the transformed children.
To create a custom transformation, inherit from this class and override the transform_* method for the specific object or attribute you want to modify.
Examples:
>>> from soundevent import data
>>> from soundevent.transforms.base import TransformBase
>>>
>>> class UserAnonymizer(TransformBase):
...     def transform_user(self, user: data.User) -> data.User:
...         return user.model_copy(update={"name": "anonymous"})
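The default-recursion behavior described above can be sketched with plain dataclasses. This is a simplified illustration of the pattern, not the soundevent implementation; User, Note, Recording, MiniTransform, and Anonymizer are hypothetical stand-ins defined here.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class User:
    name: str


@dataclass(frozen=True)
class Note:
    message: str
    created_by: User


@dataclass(frozen=True)
class Recording:
    path: str
    notes: Tuple[Note, ...] = ()


class MiniTransform:
    """Leaf methods return the object unchanged; container methods recurse."""

    def transform_user(self, user: User) -> User:
        return user

    def transform_note(self, note: Note) -> Note:
        # Rebuild the note from its transformed child.
        return replace(note, created_by=self.transform_user(note.created_by))

    def transform_recording(self, recording: Recording) -> Recording:
        # Rebuild the recording from its transformed notes.
        new_notes = tuple(self.transform_note(n) for n in recording.notes)
        return replace(recording, notes=new_notes)


class Anonymizer(MiniTransform):
    """Overrides a single leaf method; the traversal is inherited."""

    def transform_user(self, user: User) -> User:
        return replace(user, name="anonymous")


original = Recording(path="rec.wav", notes=(Note("faint call", User("alice")),))
anonymized = Anonymizer().transform_recording(original)
```

Overriding only transform_user is enough to reach users nested inside notes, because the container methods call it during traversal. The original object is left untouched, since each level returns a new copy.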
Methods: