speaker-embedding-onnx

ONNX export of the ResNet34 backbone from pyannote/wespeaker-voxceleb-resnet34-LM.

Follows the approach of the official wespeaker/bin/export_onnx.py script: fbank features are computed outside the model, and only the backbone is included in the ONNX graph.

Inputs / Outputs

| Name | Shape | Description |
| --- | --- | --- |
| input_features | (batch, T, 80) | Kaldi fbank features (T is dynamic) |
| embedding | (batch, 256) | Speaker embedding vector |
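
As a minimal sketch of how these tensors might be used with ONNX Runtime: the helper below feeds a feature array to the `input_features` input and reads back the `embedding` output. The function names `load_session` and `run_backbone`, the float32 dtype, and the CPU execution provider are assumptions made for illustration; the source does not specify them, and the model file path is left for the caller to supply.

```python
import numpy as np


def load_session(model_path):
    """Open the exported backbone with ONNX Runtime (CPU provider assumed)."""
    # Imported lazily so the shape helper below works without onnxruntime.
    import onnxruntime as ort

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def run_backbone(session, feats):
    """Run the backbone on fbank features and return (batch, 256) embeddings.

    feats may be a single utterance of shape (T, 80) or a batch of shape
    (batch, T, 80). All utterances in one batch must share the same T.
    """
    # Assumption: the exported graph expects float32 inputs.
    feats = np.asarray(feats, dtype=np.float32)
    if feats.ndim == 2:
        feats = feats[None, ...]  # add the batch axis: (T, 80) -> (1, T, 80)
    if feats.ndim != 3 or feats.shape[-1] != 80:
        raise ValueError(f"expected (batch, T, 80) features, got {feats.shape}")
    # session.run returns a list with one array per requested output name.
    (embedding,) = session.run(["embedding"], {"input_features": feats})
    return embedding
```

Because T is dynamic, utterances of different lengths can be processed one at a time without padding; batching them together would require cropping or padding to a common T first.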

Fbank parameters (must match at inference)

kaldi.fbank(wav * 32768, num_mel_bins=80, frame_length=25, frame_shift=10, round_to_power_of_two=True, window_type="hamming", use_energy=False, snip_edges=True, dither=0.0, sample_frequency=16000)

Then subtract the per-bin mean, computed over the time axis: feats -= feats.mean(axis=0).
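
The two steps above could be combined roughly as follows. This is a sketch under the assumption that `kaldi` refers to `torchaudio.compliance.kaldi` (the source does not name the module) and that `wav` is mono 16 kHz audio as floats in [-1, 1]; `extract_features` and `subtract_mean` are hypothetical helper names.

```python
import numpy as np


def subtract_mean(feats):
    """Subtract the per-bin mean over the time axis from (T, 80) features."""
    feats = np.asarray(feats, dtype=np.float32)
    return feats - feats.mean(axis=0, keepdims=True)


def extract_features(wav):
    """Compute mean-normalized fbank features of shape (T, 80).

    wav: 1-D array of mono 16 kHz samples, floats in [-1, 1] (assumed).
    """
    # Imported lazily so subtract_mean can be used without torch installed.
    import torch
    import torchaudio.compliance.kaldi as kaldi  # assumed source of kaldi.fbank

    # kaldi.fbank expects a (channels, num_samples) tensor.
    waveform = torch.as_tensor(np.asarray(wav), dtype=torch.float32).unsqueeze(0)
    feats = kaldi.fbank(
        waveform * 32768,  # rescale to the 16-bit integer range, as in the source
        num_mel_bins=80,
        frame_length=25,
        frame_shift=10,
        round_to_power_of_two=True,
        window_type="hamming",
        use_energy=False,
        snip_edges=True,
        dither=0.0,
        sample_frequency=16000,
    )
    return subtract_mean(feats.numpy())
```

The result has shape (T, 80); adding a leading batch axis gives the (batch, T, 80) layout expected by `input_features`.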
