Transforms Module
soundevent.transforms
Data transformations for soundevent objects.
This module provides a framework for applying transformations to soundevent data
objects. The core of the framework is the TransformBase class, which defines
a visitor pattern for traversing the complex hierarchy of soundevent data
models.
The module also includes concrete transform classes for common data
manipulation tasks, such as modifying recording paths (PathTransform) or
transforming tags (TagsTransform).
These tools are designed to help users clean, modify, and standardize their bioacoustic datasets in a structured and reliable way.
Modules:
| Name | Description |
|---|---|
| base | Base classes for data transformations. |
| path | Transformations for recording paths. |
| tags | Transformations for tags. |
Classes:
| Name | Description |
|---|---|
| PathTransform | A transform for modifying the path of recordings. |
| TagsTransform | A transform for modifying sequences of tags. |
| TransformBase | Base class for creating data transformations. |
Classes
PathTransform(transform)
Bases: TransformBase
A transform for modifying the path of recordings.
This class provides a convenient way to apply a path transformation
to all Recording objects within a larger data structure (like a
Dataset or AnnotationProject). It works by overriding the
transform_path method of TransformBase.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| transform | Callable[[Path], Path] | A function that takes a Path and returns a Path. | required |
Examples:
>>> from pathlib import Path
>>> from soundevent import data
>>> from soundevent.transforms import PathTransform
>>>
>>> # Create a sample dataset to work with
>>> recording = data.Recording(
...     path=Path("../relative/path/rec.wav"),
...     duration=1,
...     channels=1,
...     samplerate=16000,
... )
>>> dataset = data.Dataset(name="test-dataset", recordings=[recording])
>>>
>>> # Define a function to make all paths absolute
>>> def make_absolute(path: Path) -> Path:
...     # Simplistic: resolve() anchors relative paths to the current working
...     # directory. In practice you may need an explicit base directory.
...     return path.resolve()
>>>
>>> # Create and apply the transform
>>> path_transformer = PathTransform(transform=make_absolute)
>>> transformed_dataset = path_transformer.transform_dataset(dataset)
>>>
>>> # Check that the path in the transformed dataset is absolute
>>> transformed_dataset.recordings[0].path.is_absolute()
True
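The comment in the example above notes that a real workflow may need a base directory, because `Path.resolve()` anchors relative paths to the current working directory. The following is a minimal sketch of one way to do that using only `pathlib`. The helper name `rebase_path` and the directory `/data/audio` are hypothetical and chosen for illustration; they are not part of soundevent.

```python
from functools import partial
from pathlib import Path


def rebase_path(path: Path, base_dir: Path) -> Path:
    """Anchor a relative path to base_dir; return absolute paths unchanged."""
    if path.is_absolute():
        return path
    return base_dir / path


# Hypothetical audio root directory, assumed here for illustration only.
make_absolute = partial(rebase_path, base_dir=Path("/data/audio"))

rebased = make_absolute(Path("site_a/rec.wav"))
```

Because `make_absolute` takes a single Path and returns a Path, it has the shape expected by the transform parameter and could be passed as `PathTransform(transform=make_absolute)`.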
Methods:
| Name | Description |
|---|---|
| transform_path | Apply the transformation to a path. |
Attributes:
| Name | Type | Description |
|---|---|---|
| transform | | |
TagsTransform(transform)
Bases: TransformBase
A transform for modifying sequences of tags.
This class provides a way to apply a transformation to all Tag
sequences within a soundevent data structure. It is useful for
filtering, renaming, or otherwise modifying tags across an entire
dataset.
It can be initialized directly with a function that transforms a whole
sequence of tags, or it can be constructed from a function that transforms
a single tag using the from_tag_transform class method.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| transform | Callable[[Sequence[Tag]], Sequence[Tag]] | A function that takes a sequence of Tag objects and returns a sequence of Tag objects. | required |
Examples:
>>> from pathlib import Path
>>> from soundevent import data
>>> from soundevent.transforms import TagsTransform
>>>
>>> # Create a sample recording with a misspelled species tag
>>> recording = data.Recording(
...     path=Path("rec.wav"),
...     duration=1,
...     channels=1,
...     samplerate=16000,
...     tags=[
...         data.Tag(key="species", value="Myotis mytis"),
...         data.Tag(key="quality", value="good"),
...     ],
... )
>>>
>>> # Create a transform to correct the spelling of "Myotis myotis"
>>> def correct_species_name(tag: data.Tag) -> data.Tag:
...     if tag.key == "species" and tag.value == "Myotis mytis":
...         return tag.model_copy(update={"value": "Myotis myotis"})
...     return tag
>>> corrector = TagsTransform.from_tag_transform(
...     transform=correct_species_name
... )
>>> transformed_recording = corrector.transform_recording(recording)
>>>
>>> # Verify that the tag value has been corrected
>>> species_tag = next(
...     t for t in transformed_recording.tags if t.key == "species"
... )
>>> species_tag.value
'Myotis myotis'
>>>
>>> # Verify that other tags are untouched
>>> quality_tag = next(
...     t for t in transformed_recording.tags if t.key == "quality"
... )
>>> quality_tag.value
'good'
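The example above builds the transform from a per-tag function. For the other route mentioned in the description, initializing directly with a function over a whole sequence, the sketch below shows what such a function might look like. To stay runnable without soundevent installed, it uses a hypothetical `Tag` stand-in that only mimics the key and value fields; the function name `drop_quality_and_dedupe` is likewise illustrative.

```python
from typing import List, NamedTuple, Sequence


class Tag(NamedTuple):
    """Hypothetical stand-in for soundevent's Tag (key and value only)."""

    key: str
    value: str


def drop_quality_and_dedupe(tags: Sequence[Tag]) -> List[Tag]:
    """Remove "quality" tags and repeated tags, keeping first-seen order."""
    seen = set()
    kept = []
    for tag in tags:
        identity = (tag.key, tag.value)
        if tag.key == "quality" or identity in seen:
            continue
        seen.add(identity)
        kept.append(tag)
    return kept


tags = [
    Tag("species", "Myotis myotis"),
    Tag("quality", "good"),
    Tag("species", "Myotis myotis"),
]
cleaned = drop_quality_and_dedupe(tags)
```

Sequence-level logic such as deduplication cannot be expressed one tag at a time, which is where direct initialization is useful. With real `data.Tag` objects, a function of this shape could be passed as `TagsTransform(transform=drop_quality_and_dedupe)`.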
Methods:
| Name | Description |
|---|---|
| from_tag_transform | Create a TagsTransform from a function that transforms a single tag. |
| transform_tags | Apply the transformation to a sequence of tags. |
Attributes:
| Name | Type | Description |
|---|---|---|
| transform | | |
Attributes
transform = transform
instance-attribute
Functions
from_tag_transform(transform)
classmethod
Create a TagsTransform from a function that transforms a single tag.
This factory method is a convenient way to create a TagsTransform
when your logic applies to each tag individually.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| transform | Callable[[Tag], Optional[Tag]] | A function that takes a single Tag and returns a Tag or None. | required |
Returns:
| Type | Description |
|---|---|
| TagsTransform | A new TagsTransform instance. |
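To illustrate the idea behind this factory, here is a rough sketch of how a per-tag function might be lifted into a sequence-level one. It is not the library's implementation. It assumes that a None result means the tag is dropped, which the signature `Callable[[Tag], Optional[Tag]]` suggests but the text above does not state. `Tag` is a hypothetical stand-in, and `lift_tag_transform` and `fix_or_drop` are illustrative names.

```python
from typing import Callable, List, NamedTuple, Optional, Sequence


class Tag(NamedTuple):
    """Hypothetical stand-in for soundevent's Tag (key and value only)."""

    key: str
    value: str


def lift_tag_transform(
    transform: Callable[[Tag], Optional[Tag]],
) -> Callable[[Sequence[Tag]], List[Tag]]:
    """Turn a per-tag function into a function over a sequence of tags."""

    def transform_sequence(tags: Sequence[Tag]) -> List[Tag]:
        result = []
        for tag in tags:
            new_tag = transform(tag)
            # Assumption: a None result means "drop this tag".
            if new_tag is not None:
                result.append(new_tag)
        return result

    return transform_sequence


def fix_or_drop(tag: Tag) -> Optional[Tag]:
    """Drop "quality" tags and correct one misspelled species name."""
    if tag.key == "quality":
        return None
    if tag.key == "species" and tag.value == "Myotis mytis":
        return tag._replace(value="Myotis myotis")
    return tag


lifted = lift_tag_transform(fix_or_drop)
result = lifted([Tag("species", "Myotis mytis"), Tag("quality", "good")])
```

The lifted function has the `Callable[[Sequence[Tag]], Sequence[Tag]]` shape that the TagsTransform constructor expects, which is why a per-tag factory can be a thin wrapper around the constructor.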
TransformBase
Base class for creating data transformations.
This class implements the visitor pattern to traverse the complex hierarchy
of soundevent data objects. It provides transform_* methods for each type
of data object in the soundevent ecosystem.
The default implementation of each transform_* method returns the object
unchanged or, for container-like objects, recursively calls the appropriate
transform methods on their children and returns a new container with the
transformed children.
To create a custom transformation, inherit from this class and override the
transform_* method for the specific object or attribute you want to
modify.
Examples:
>>> from soundevent import data
>>> from soundevent.transforms.base import TransformBase
>>>
>>> class UserAnonymizer(TransformBase):
...     def transform_user(self, user: data.User) -> data.User:
...         return user.model_copy(update={"name": "anonymous"})
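The traversal described above, where container methods recurse into their children and return new containers, can be pictured with a small stand-alone sketch. Everything in it is hypothetical: `Recording` and `Dataset` are reduced dataclass stand-ins, and `SketchTransform` is a miniature written for illustration, not the actual TransformBase.

```python
from dataclasses import dataclass, replace
from pathlib import Path
from typing import Tuple


@dataclass(frozen=True)
class Recording:
    """Hypothetical stand-in holding only a path."""

    path: Path


@dataclass(frozen=True)
class Dataset:
    """Hypothetical stand-in container of recordings."""

    name: str
    recordings: Tuple[Recording, ...]


class SketchTransform:
    """Miniature of the traversal idea; not the library's class."""

    def transform_path(self, path: Path) -> Path:
        # Leaf hook: returns the value unchanged by default.
        return path

    def transform_recording(self, recording: Recording) -> Recording:
        # Container hook: rebuild the object from its transformed children.
        return replace(recording, path=self.transform_path(recording.path))

    def transform_dataset(self, dataset: Dataset) -> Dataset:
        transformed = tuple(
            self.transform_recording(r) for r in dataset.recordings
        )
        return replace(dataset, recordings=transformed)


class WavToFlac(SketchTransform):
    """Overrides a single leaf hook; the traversal is inherited."""

    def transform_path(self, path: Path) -> Path:
        return path.with_suffix(".flac")


dataset = Dataset(
    name="demo",
    recordings=(Recording(Path("a.wav")), Recording(Path("b.wav"))),
)
converted = WavToFlac().transform_dataset(dataset)
```

Overriding only `transform_path` is enough to change every path reachable from the dataset, and the original `dataset` object is left unmodified because each level returns a new object.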
Methods: