I am trying to use pyannote models offline.
Until now I have been loading and applying the pipeline like this:
from pyannote.audio import Pipeline

access_token = 'xxxxxxxxxxx'
model = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token=access_token)
path_in = 'blabla/1-137-A-32.wav'
num_speakers = 1
model(path_in,
      num_speakers=num_speakers).labels()
That works fine.
I have now followed the instructions for offline use from this tutorial: https://github.com/pyannote/pyannote-audio/blob/develop/tutorials/applying_a_pipeline.ipynb
My directory structure is as follows:
src/
|- pyannote_offline_config.yaml
|- pyannote_pytorch_model.bin
--- YAML ---
version: 3.1.0
pipeline:
  name: pyannote.audio.pipelines.SpeakerDiarization
  params:
    clustering: AgglomerativeClustering
    embedding: pyannote/wespeaker-voxceleb-resnet34-LM
    embedding_batch_size: 32
    embedding_exclude_overlap: true
    segmentation: src/pyannote_pytorch_model.bin
    segmentation_batch_size: 32
params:
  clustering:
    method: centroid
    min_cluster_size: 12
    threshold: 0.7045654963945799
  segmentation:
    min_duration_off: 0.0
--- Loading Model ---
path_yaml = 'src/pyannote_offline_config.yaml'
model = Pipeline.from_pretrained(path_yaml)
path_in = 'blabla/1-137-A-32.wav'
num_speakers = 1
model(path_in,
      num_speakers=num_speakers).labels()
But applying the pipeline raises this error: “A pipeline must be instantiated with pipeline.instantiate(parameters) before it can be applied.”
So next I tried instantiating the pipeline explicitly, with the same parameters as in the YAML file:
--- Loading Model ---
path_yaml = 'src/pyannote_offline_config.yaml'
model = Pipeline.from_pretrained(path_yaml)
params = {
    'clustering': {
        'method': 'centroid',
        'min_cluster_size': 12,
        'threshold': 0.7045654963945799,
    },
    'segmentation': {'min_duration_off': 0.0},
}
pipeline = model.instantiate(params)
path_in = 'blabla/1-137-A-32.wav'
num_speakers = 1
pipeline(path_in,
         num_speakers=num_speakers).labels()
But that results in exactly the same error: “A pipeline must be instantiated with pipeline.instantiate(parameters) before it can be applied.”
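I don’t understand what is going wrong, since the online pipeline works with the same audio file and the same number of speakers. What am I missing?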
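One thing that can be ruled out independently of pyannote is a path problem. The segmentation entry in the YAML is a relative path, so it presumably has to resolve from the directory the script is started in. Below is a minimal sketch of that check. It assumes the file names from the directory listing, and it assumes that relative paths are interpreted against the current working directory (I have not confirmed this in the pyannote source). The helper name resolve_from_cwd is my own and is not part of pyannote.

```python
from pathlib import Path
from typing import Tuple


def resolve_from_cwd(relative_path: str) -> Tuple[Path, bool]:
    """Resolve a path against the current working directory.

    Returns the absolute path and whether a regular file exists there.
    """
    absolute = (Path.cwd() / relative_path).resolve()
    return absolute, absolute.is_file()


# File names taken from the directory listing in this post; adjust as needed.
for candidate in (
    "src/pyannote_offline_config.yaml",
    "src/pyannote_pytorch_model.bin",
):
    absolute, found = resolve_from_cwd(candidate)
    print(f"{candidate} -> {absolute} (file exists: {found})")
```

If either line reports that the file does not exist, then either the script is being started from a directory other than the parent of src, or the file itself is missing. In both cases the relative path in the YAML would not point at the model file.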