
perch

Warning

Additional dependencies: The Perch model requires extra dependencies that are not installed by default, because they are heavy and not needed in every use case. To install them, run:

pip install "audioclass[perch]"

audioclass.models.perch

Module for loading and using the Google Perch audio classification model.

This module provides a convenient interface for working with the Perch model, a TensorFlow Hub-based model designed for bird sound classification. It includes the Perch class, which is a subclass of TensorflowModel, and functions for loading the model and its associated labels.

Notes

The Perch model is hosted on Kaggle. Depending on your network configuration, you might need to set up Kaggle API credentials to access the model. Refer to Kaggle's documentation for instructions.

This package is not affiliated with Google Research, the original developers of the Perch model.
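
Regarding the Kaggle credentials mentioned above, the following is a minimal sketch of a pre-flight check, assuming the conventional Kaggle API locations: the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, or a `~/.kaggle/kaggle.json` file. The helper `kaggle_credentials_present` is a hypothetical name used for illustration and is not part of audioclass; Kaggle's documentation remains the authoritative reference.

```python
import os
from pathlib import Path


def kaggle_credentials_present(env=None, home=None) -> bool:
    """Return True if Kaggle credentials appear to be configured.

    Hypothetical helper: checks only the conventional environment
    variables and the default kaggle.json location.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    # Both variables must be set and non-empty.
    has_env = bool(env.get("KAGGLE_USERNAME")) and bool(env.get("KAGGLE_KEY"))
    # Default location of the API token file.
    has_file = (home / ".kaggle" / "kaggle.json").is_file()
    return has_env or has_file
```

Running such a check before loading the model can turn an opaque download failure into a clearer message about missing credentials.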

Attributes

INPUT_SAMPLES = 160000 module-attribute

Default number of samples expected in the input tensor.

This value corresponds to 5 seconds of audio data at a sample rate of 32,000 Hz.
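
As a quick check of this arithmetic, the sketch below recomputes the sample count and pads or trims a mono waveform to that length. The helper `fit_to_input` is a hypothetical name used only for illustration and is not part of audioclass. The constants are redefined locally so the snippet runs without the package installed.

```python
import numpy as np

SAMPLERATE = 32_000  # Hz, mirrors the module attribute
CLIP_SECONDS = 5
INPUT_SAMPLES = SAMPLERATE * CLIP_SECONDS  # 160_000 samples


def fit_to_input(audio: np.ndarray, n_samples: int = INPUT_SAMPLES) -> np.ndarray:
    """Zero-pad or truncate a 1-D waveform to exactly n_samples.

    Hypothetical helper for illustration only.
    """
    if audio.shape[0] >= n_samples:
        return audio[:n_samples]
    # Pad with zeros at the end to reach the expected length.
    return np.pad(audio, (0, n_samples - audio.shape[0]))


short_clip = np.ones(3 * SAMPLERATE, dtype=np.float32)  # 3 s of audio
fitted = fit_to_input(short_clip)
```

Whether the library itself pads, trims, or windows longer recordings is not stated here, so treat this only as an illustration of the expected input size.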

MODEL_PATH = 'https://www.kaggle.com/models/google/bird-vocalization-classifier/TensorFlow2/bird-vocalization-classifier/4' module-attribute

Default URL of the Perch TensorFlow Hub model.

SAMPLERATE = 32000 module-attribute

Default sample rate of the audio data expected by the model (in Hz).

TAGS_PATH = DATA_DIR / 'perch' / 'label.csv' module-attribute

Default path to the Perch labels file.

Classes

Perch(callable, signature, tags, confidence_threshold, samplerate, name, logits=True)

Bases: TensorflowModel

Google Perch audio classification model.

This class is a wrapper around a TensorFlow Hub model for bird sound classification. It provides methods for loading the model, processing audio data, and returning predictions.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| callable | Callable | The TensorFlow callable representing the model. | required |
| signature | Signature | The input and output signature of the model. | required |
| tags | List[Tag] | The list of tags that the model can predict. | required |
| confidence_threshold | float | The minimum confidence threshold for assigning a tag to a clip. | required |
| samplerate | int | The sample rate of the audio data expected by the model (in Hz). | required |
| name | str | The name of the model. | required |
| logits | bool | Whether the model outputs logits (True) or probabilities (False). Defaults to True. | True |

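
To illustrate how the `logits` flag and `confidence_threshold` relate, here is a minimal sketch that converts raw logits to probabilities and keeps the labels at or above a threshold. It assumes an element-wise sigmoid and an inclusive comparison; the conversion and comparison the library actually applies are not documented here and may differ. The function names and labels are hypothetical.

```python
import numpy as np


def logits_to_probabilities(logits: np.ndarray) -> np.ndarray:
    """Apply an element-wise sigmoid (an assumed conversion)."""
    return 1.0 / (1.0 + np.exp(-logits))


def tags_above_threshold(logits, labels, confidence_threshold):
    """Return (label, probability) pairs meeting the threshold."""
    probs = logits_to_probabilities(np.asarray(logits, dtype=float))
    return [
        (label, float(p))
        for label, p in zip(labels, probs)
        if p >= confidence_threshold  # inclusive comparison is an assumption
    ]


selected = tags_above_threshold(
    [2.0, -1.0, 0.1],
    ["species_a", "species_b", "species_c"],  # placeholder labels
    confidence_threshold=0.5,
)
```

With these inputs, `species_a` and `species_c` pass the 0.5 threshold while `species_b` does not.
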
Functions
load(model_url=MODEL_PATH, tags_url=TAGS_PATH, confidence_threshold=DEFAULT_THRESHOLD, samplerate=SAMPLERATE, name='Perch') classmethod

Load a Perch model from a URL.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model_url | Union[Path, str] | The URL of the TensorFlow Hub model. Defaults to the official Perch model URL. | MODEL_PATH |
| tags_url | Union[Path, str] | The URL or path to the file containing the labels. Defaults to the tags file included in the package. | TAGS_PATH |
| confidence_threshold | float | The minimum confidence threshold for making predictions. Defaults to DEFAULT_THRESHOLD. | DEFAULT_THRESHOLD |
| samplerate | int | The sample rate of the audio data expected by the model (in Hz). Defaults to SAMPLERATE. | SAMPLERATE |
| name | str | The name of the model. Defaults to "Perch". | 'Perch' |

Returns:

| Type | Description |
| --- | --- |
| Perch | An instance of the Perch class. |
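
A minimal usage sketch of the `load` classmethod, relying only on the signature documented above. The wrapper `load_perch` is a hypothetical convenience function, and the threshold value of 0.5 is an illustrative choice rather than the library's `DEFAULT_THRESHOLD`. The import is deferred so that the optional Perch dependencies are needed only when the function is called.

```python
def load_perch(confidence_threshold: float = 0.5):
    """Load Perch with the default model URL and bundled labels.

    Hypothetical wrapper; 0.5 is an illustrative threshold.
    """
    # Imported lazily: requires `pip install "audioclass[perch]"`.
    from audioclass.models.perch import Perch

    return Perch.load(confidence_threshold=confidence_threshold)
```

Calling `model = load_perch()` needs network access to fetch the model from Kaggle, and possibly Kaggle credentials as described in the notes.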

Functions

get_signature(callable)

Get the signature of a Perch model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| callable | Callable | The TensorFlow callable representing the model. | required |

Returns:

| Type | Description |
| --- | --- |
| Signature | The signature of the Perch model. |

load_tags(path=TAGS_PATH)

Load Perch labels from a file.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| path | Union[Path, str] | Path or URL to the file containing the labels. Defaults to the tags file included in the package. | TAGS_PATH |

Returns:

| Type | Description |
| --- | --- |
| List[Tag] | List of soundevent Tag objects. |