speaker-embedding-onnx
ONNX export of the ResNet34 backbone from pyannote/wespeaker-voxceleb-resnet34-LM.
Follows the official wespeaker/bin/export_onnx.py approach: fbank features are computed externally, and only the backbone is exported to ONNX.
Inputs / Outputs
| Name | Shape | Description |
|---|---|---|
| `input_features` | `(batch, T, 80)` | Kaldi fbank features (`T` is dynamic) |
| `embedding` | `(batch, 256)` | Speaker embedding vector |
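
As a rough illustration of how these tensors might be fed through ONNX Runtime, here is a minimal sketch. The tensor names `input_features` and `embedding` and their shapes come from the table above. The helper names (`prepare_batch`, `extract_embedding`, `load_session`), the `float32` input dtype, and the model path passed to `load_session` are assumptions made for this example and are not confirmed by this repository.

```python
import numpy as np

NUM_MEL_BINS = 80    # last dimension of input_features (from the table)
EMBEDDING_DIM = 256  # last dimension of embedding (from the table)


def prepare_batch(feats):
    """Return feats as a float32 array of shape (batch, T, 80).

    float32 is an assumption: it is the usual dtype for PyTorch exports.
    """
    feats = np.asarray(feats, dtype=np.float32)
    if feats.ndim == 2:
        # A single utterance of shape (T, 80): add the batch axis.
        feats = feats[np.newaxis, ...]
    if feats.ndim != 3 or feats.shape[-1] != NUM_MEL_BINS:
        raise ValueError(
            f"expected (batch, T, {NUM_MEL_BINS}), got {feats.shape}"
        )
    return feats


def extract_embedding(session, feats):
    """Run the backbone on fbank features and return (batch, 256) embeddings.

    `session` is expected to behave like onnxruntime.InferenceSession.
    """
    batch = prepare_batch(feats)
    (embedding,) = session.run(["embedding"], {"input_features": batch})
    return embedding


def load_session(model_path):
    """Create a CPU session; model_path is whatever file you downloaded."""
    import onnxruntime as ort  # imported lazily so the helpers above work without it

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
```

With a session from `load_session(...)` and mean-normalized fbank features for one utterance, `extract_embedding(session, feats)` should return an array of shape `(1, 256)` according to the table. Because `T` is dynamic, utterances of different lengths need either separate calls or padding to a common length before batching.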
Fbank parameters (must match at inference)
kaldi.fbank(wav * 32768, num_mel_bins=80, frame_length=25, frame_shift=10, round_to_power_of_two=True, window_type="hamming", use_energy=False, snip_edges=True, dither=0.0, sample_frequency=16000)
Then subtract the per-bin mean over the time axis: feats -= feats.mean(axis=0).
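
To make the dynamic dimension `T` concrete, the sketch below estimates how many frames a clip yields under the parameters above. It assumes Kaldi's standard framing rule for `snip_edges=True` (only frames that fit entirely inside the signal are kept) and the usual meaning of `round_to_power_of_two` (the window is zero-padded to the next power of two for the FFT). The helper names `window_sizes` and `num_frames` are hypothetical and exist only for this example.

```python
SAMPLE_RATE = 16000   # sample_frequency
FRAME_LENGTH_MS = 25  # frame_length, in milliseconds
FRAME_SHIFT_MS = 10   # frame_shift, in milliseconds


def window_sizes(sample_rate=SAMPLE_RATE):
    """Return (frame_length, frame_shift, fft_size) in samples."""
    frame_length = sample_rate * FRAME_LENGTH_MS // 1000
    frame_shift = sample_rate * FRAME_SHIFT_MS // 1000
    # round_to_power_of_two=True: pad the window up to the next power of two.
    fft_size = 1 << (frame_length - 1).bit_length()
    return frame_length, frame_shift, fft_size


def num_frames(num_samples, sample_rate=SAMPLE_RATE):
    """Number of fbank frames T for a clip, assuming snip_edges=True."""
    frame_length, frame_shift, _ = window_sizes(sample_rate)
    if num_samples < frame_length:
        # Too short to fit a single full window.
        return 0
    return 1 + (num_samples - frame_length) // frame_shift
```

At 16 kHz this gives a 400-sample window, a 160-sample shift, and a 512-point FFT. One second of audio (`num_frames(16000)`) yields 98 frames, so its feature matrix has shape `(98, 80)` before the batch axis is added.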