
perch

Warning

Additional Dependencies: To use the Perch model, you will need to install additional dependencies. These dependencies are optional, as they can be heavy and may not be needed in all use cases. To install them, run:

pip install "audioclass[perch]"

audioclass.models.perch

Module for loading and using the Google Perch audio classification model.

This module provides a convenient interface for working with the Perch model, a TensorFlow Hub-based model designed for bird sound classification. It includes the Perch class, which is a subclass of TensorflowModel, and functions for loading the model and its associated labels.

Notes

The Perch model is hosted on Kaggle. Depending on your network configuration, you might need to set up Kaggle API credentials to access the model. Refer to Kaggle's documentation for instructions.

This package is not affiliated with Google Research, the original developers of the Perch model.
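As a rough aid for the Kaggle note above, the sketch below checks whether Kaggle API credentials appear to be configured. It relies on the two conventions described in Kaggle's own documentation: the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, and a `~/.kaggle/kaggle.json` file. The helper name `kaggle_credentials_configured` is hypothetical and not part of audioclass, and whether your setup needs credentials at all depends on your network configuration.

```python
import os
from pathlib import Path


def kaggle_credentials_configured(env=None, home=None):
    """Return True if Kaggle API credentials appear to be configured.

    Hypothetical helper for illustration; it is not part of audioclass.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Option 1: credentials supplied through environment variables.
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return True

    # Option 2: credentials stored in the conventional kaggle.json file.
    return (home / ".kaggle" / "kaggle.json").is_file()
```

Calling `kaggle_credentials_configured()` with no arguments inspects the current environment and home directory.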

Classes:

| Name | Description |
| --- | --- |
| Perch | Google Perch audio classification model. |

Functions:

| Name | Description |
| --- | --- |
| get_signature | Get the signature of a Perch model. |
| load_tags | Load Perch labels from a file. |

Attributes:

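| Name | Type | Description |
| --- | --- | --- |
| INPUT_SAMPLES | | Default number of samples expected in the input tensor. |
| MODEL_PATH | | Default URL of the Perch TensorFlow Hub model. |
| SAMPLERATE | | Default sample rate of the audio data expected by the model (in Hz). |
| TAGS_PATH | | Default path to the Perch labels file. |
| ebird2021 | | Term describing the eBird 2021 taxonomy species codes. |
| ebird2021_def | | Definition text used by the ebird2021 term. |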

Attributes

INPUT_SAMPLES = 160000 (module attribute)

Default number of samples expected in the input tensor.

This value corresponds to 5 seconds of audio data at a sample rate of 32,000 Hz.
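The relationship between the two constants can be checked with a few lines of arithmetic. The sketch below redefines the documented values locally, so it does not import audioclass. The helper names are illustrative only.

```python
# Values copied from the module attributes documented on this page.
SAMPLERATE = 32_000      # Hz
INPUT_SAMPLES = 160_000  # samples per input frame


def frame_duration_seconds(num_samples=INPUT_SAMPLES, samplerate=SAMPLERATE):
    """Duration in seconds of a frame with the given number of samples."""
    return num_samples / samplerate


def samples_for_duration(seconds, samplerate=SAMPLERATE):
    """Number of samples needed to cover the given duration."""
    return int(round(seconds * samplerate))


print(frame_duration_seconds())  # 160000 / 32000 = 5.0
print(samples_for_duration(5))   # 5 * 32000 = 160000
```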

MODEL_PATH = 'https://www.kaggle.com/models/google/bird-vocalization-classifier/TensorFlow2/bird-vocalization-classifier/4' (module attribute)

Default URL of the Perch TensorFlow Hub model.

SAMPLERATE = 32000 (module attribute)

Default sample rate of the audio data expected by the model (in Hz).

TAGS_PATH = DATA_DIR / 'perch' / 'label.csv' (module attribute)

Default path to the Perch labels file.

ebird2021 = data.Term(uri='https://www.birds.cornell.edu/clementschecklist/wp-content/uploads/2021/08/eBird_Taxonomy_v2021.csv', label='ebird2021', name='ebird:2021speciescodes', definition=ebird2021_def) (module attribute)

ebird2021_def = 'The eBird 2021 taxonomy is a global list of bird species used for reporting sightings in eBird.\nIt includes all species and subspecies, and is updated annually to reflect the latest ornithological knowledge.\nThis comprehensive list is used across various Cornell Lab projects and is vital for data analysis,\nbird identification, and citizen science initiatives.\n\nFor more information and to download the taxonomy, visit the eBird website.\n' (module attribute)

Classes

Perch(callable, signature, tags, confidence_threshold, samplerate, name, logits=True, batch_size=8)

Bases: TensorflowModel

Google Perch audio classification model.

This class is a wrapper around a TensorFlow Hub model for bird sound classification. It provides methods for loading the model, processing audio data, and returning predictions.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| callable | Callable | The TensorFlow callable representing the model. | required |
| signature | Signature | The input and output signature of the model. | required |
| tags | List[Tag] | The list of tags that the model can predict. | required |
| confidence_threshold | float | The minimum confidence threshold for assigning a tag to a clip. | required |
| samplerate | int | The sample rate of the audio data expected by the model (in Hz). | required |
| name | str | The name of the model. | required |
| logits | bool | Whether the model outputs logits (True) or probabilities (False). | True |
| batch_size | int | The maximum number of frames to process in each batch. | 8 |
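To illustrate how the `logits` and `confidence_threshold` parameters might interact, here is a minimal sketch in plain Python. It assumes that logits are mapped to confidences with a sigmoid and that a tag is kept when its confidence is greater than or equal to the threshold. Both are assumptions made for illustration, since this page does not state the exact conversion or comparison audioclass uses. The names `sigmoid` and `select_tags` are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def select_tags(scores, tags, confidence_threshold, logits=True):
    """Return (tag, confidence) pairs whose confidence meets the threshold.

    Assumption: logits are converted with a sigmoid. The real package may
    use a different conversion.
    """
    if len(scores) != len(tags):
        raise ValueError("scores and tags must have the same length")

    selected = []
    for tag, score in zip(tags, scores):
        # Convert to a confidence only when the model outputs raw logits.
        confidence = sigmoid(score) if logits else score
        if confidence >= confidence_threshold:
            selected.append((tag, confidence))
    return selected


# Placeholder species codes and scores, chosen only for the example.
print(select_tags([0.0, 4.0, -4.0], ["a", "b", "c"], confidence_threshold=0.5))
```

With `logits=False`, the scores are treated as probabilities and compared to the threshold directly.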

Methods:

| Name | Description |
| --- | --- |
| load | Load a Perch model from a URL. |

Functions

load(model_url=MODEL_PATH, tags_url=TAGS_PATH, confidence_threshold=DEFAULT_THRESHOLD, samplerate=SAMPLERATE, name='Perch', batch_size=8) (classmethod)

Load a Perch model from a URL.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model_url | Union[Path, str] | The URL of the TensorFlow Hub model. Defaults to the official Perch model URL. | MODEL_PATH |
| tags_url | Union[Path, str] | The URL or path to the file containing the labels. Defaults to the tags file included in the package. | TAGS_PATH |
| confidence_threshold | float | The minimum confidence threshold for making predictions. | DEFAULT_THRESHOLD |
| samplerate | int | The sample rate of the audio data expected by the model (in Hz). | SAMPLERATE |
| name | str | The name of the model. | 'Perch' |
| batch_size | int | The batch size used for processing audio data. | 8 |

Returns:

| Type | Description |
| --- | --- |
| Perch | An instance of the Perch class. |
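A minimal usage sketch for `load`, using only the parameters documented above. The wrapper function `load_perch` is hypothetical, and the threshold of 0.5 is an illustrative value, not the package's `DEFAULT_THRESHOLD`. The import is placed inside the function so the sketch can be defined without the optional dependencies. Calling it requires `pip install "audioclass[perch]"` and network access to fetch the model.

```python
def load_perch(confidence_threshold=0.5, batch_size=8):
    """Load the default Perch model with a custom threshold and batch size.

    Hypothetical convenience wrapper; 0.5 is an example value only.
    """
    # Imported lazily because the Perch dependencies are optional and heavy.
    from audioclass.models.perch import Perch

    # model_url, tags_url, samplerate, and name keep their documented defaults.
    return Perch.load(
        confidence_threshold=confidence_threshold,
        batch_size=batch_size,
    )
```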

Functions

get_signature(callable)

Get the signature of a Perch model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| callable | Callable | The TensorFlow callable representing the model. | required |

Returns:

| Type | Description |
| --- | --- |
| Signature | The signature of the Perch model. |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If the model does not have exactly one input tensor, if the input tensor does not have 2 dimensions, or if the model does not have exactly two output tensors. |
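The three conditions listed under Raises can be sketched as plain shape checks. This is not the implementation of `get_signature`. It is a hypothetical stand-in (`check_perch_shapes`) that operates on tuples of shapes instead of a TensorFlow callable, and the output sizes in the example are placeholders.

```python
def check_perch_shapes(input_shapes, output_shapes):
    """Validate shapes against the conditions documented for get_signature.

    Hypothetical helper: takes lists of shape tuples, not a TensorFlow model.
    Returns the single input shape when all checks pass.
    """
    if len(input_shapes) != 1:
        raise ValueError("The model must have exactly one input tensor.")

    (input_shape,) = input_shapes
    if len(input_shape) != 2:
        raise ValueError("The input tensor must have 2 dimensions.")

    if len(output_shapes) != 2:
        raise ValueError("The model must have exactly two output tensors.")

    return input_shape


# One (batch, samples) input and two outputs; output sizes are placeholders.
print(check_perch_shapes([(None, 160000)], [(None, 3), (None, 4)]))
```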

load_tags(path=TAGS_PATH)

Load Perch labels from a file.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| path | Union[Path, str] | Path or URL to the file containing the labels. Defaults to the tags file included in the package. | TAGS_PATH |

Returns:

| Type | Description |
| --- | --- |
| List[Tag] | List of soundevent Tag objects. |