W2V-BERT 2.0 ASR Adapters

This repository contains per-language bottleneck adapters for automatic speech recognition (ASR) trained on top of facebook/w2v-bert-2.0.

Model Description

  • Base Model: facebook/w2v-bert-2.0 (600M parameters, frozen)
  • Adapter Architecture: Bottleneck adapters (Pfeiffer-style, dim=64)
  • Decoder: Lightweight transformer decoder (2 layers)
  • Training: CTC loss with an extended vocabulary that adds dedicated tokens for double vowels (illustrated in the sketch below)
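
The "extended vocabulary" means long (double) vowels get their own CTC tokens instead of relying on CTC's repeat-collapse rule. Below is a minimal illustration of the idea; the token set here is an assumption for demonstration, and the real per-language inventories live in each adapter's vocab.json:

```python
import torch
import torch.nn.functional as F

chars = list("abcdefghijklmnopqrstuvwxyz '")
double_vowels = ["aa", "ee", "ii", "oo", "uu"]        # extended tokens (assumed set)
vocab = {tok: i + 1 for i, tok in enumerate(chars + double_vowels)}
vocab["<blank>"] = 0                                  # CTC blank

def encode(text):
    """Greedily prefer double-vowel tokens over pairs of single characters."""
    ids, i = [], 0
    while i < len(text):
        if text[i:i + 2] in vocab:
            ids.append(vocab[text[i:i + 2]]); i += 2
        else:
            ids.append(vocab[text[i]]); i += 1
    return torch.tensor(ids)

targets = encode("kaa chini")                         # "aa" becomes one token, not two
log_probs = torch.randn(50, 1, len(vocab)).log_softmax(-1)  # (time, batch, vocab)
loss = F.ctc_loss(log_probs, targets.unsqueeze(0),
                  input_lengths=torch.tensor([50]),
                  target_lengths=torch.tensor([len(targets)]))
```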

Trained Adapters

All 14 adapters trained successfully (14/14). Word error rates (WER) per adapter, grouped by quality:

✅ Good (WER < 30%): 9
    swh_Latn_v2: 4.00%
    swh_Latn_salt: 13.25%
    kik_Latn: 16.49%
    luo_Latn: 16.50%
    swh_Latn_v1: 17.34%
    eng_Latn_tts: 21.85%
    eng_Latn_salt: 24.58%
    lug_Latn_salt: 28.02%
    ach_Latn: 28.62%

⚡ Medium (WER 30-60%): 3
    kam_Latn: 30.66%
    mer_Latn: 36.49%
    teo_Latn: 58.12%

⚠️ Poor (WER 60-90%): 1
    nyn_Latn: 64.39%

❌ Collapsed (WER >= 90%): 1
    ful_Latn: 99.98%

Architecture

The model uses:

  1. Frozen w2v-bert-2.0 encoder - Extracts audio representations
  2. Bottleneck adapters - Language-specific adaptation (trainable)
  3. Lightweight decoder - Transformer decoder blocks (trainable)
  4. LM head - Per-language vocabulary projection (trainable)
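
A minimal sketch of how these pieces compose, assuming w2v-bert-2.0's hidden size (1024) and the dimensions listed above. The class and module names are illustrative, not this repo's actual training code, and the single post-encoder adapter is a simplification: Pfeiffer-style setups usually insert one adapter per encoder layer.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class BottleneckAdapter(nn.Module):
    """Pfeiffer-style bottleneck: LayerNorm, down-project, nonlinearity, up-project, residual."""
    def __init__(self, hidden_size=1024, adapter_dim=64):
        super().__init__()
        self.norm = nn.LayerNorm(hidden_size)
        self.down = nn.Linear(hidden_size, adapter_dim)
        self.up = nn.Linear(adapter_dim, hidden_size)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(self.norm(x))))

class AdapterASR(nn.Module):
    def __init__(self, vocab_size, hidden_size=1024, adapter_dim=64, decoder_layers=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("facebook/w2v-bert-2.0")
        self.encoder.requires_grad_(False)                          # 1. frozen encoder
        self.adapter = BottleneckAdapter(hidden_size, adapter_dim)  # 2. trainable
        # 3. "decoder" blocks: plain self-attention layers suffice, since CTC needs no cross-attention
        block = nn.TransformerEncoderLayer(hidden_size, nhead=16, batch_first=True)
        self.decoder = nn.TransformerEncoder(block, num_layers=decoder_layers)
        self.final_norm = nn.LayerNorm(hidden_size)
        self.lm_head = nn.Linear(hidden_size, vocab_size)           # 4. per-language projection

    def forward(self, input_features):
        hidden = self.encoder(input_features).last_hidden_state
        hidden = self.decoder(self.adapter(hidden))
        return self.lm_head(self.final_norm(hidden)).log_softmax(-1)  # CTC log-probs
```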

Usage

Each adapter folder contains:

  • adapter_weights.pt - Bottleneck adapter weights
  • decoder_weights.pt - Decoder block weights
  • lm_head_weights.pt - Language model head weights
  • final_norm_weights.pt - Final layer norm weights
  • vocab.json - Language-specific vocabulary
  • adapter_config.json - Adapter configuration
  • metrics.json - Training metrics
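
A hedged loading-and-inference sketch, reusing the AdapterASR class from the Architecture section. The directory name, blank-token id, and state-dict layout are assumptions; check each adapter's adapter_config.json and vocab.json for the actual shapes and token ids.

```python
import json
import numpy as np
import torch
from transformers import AutoFeatureExtractor

adapter_dir = "swh_Latn_v2"  # any adapter folder from this repo (local path assumed)

with open(f"{adapter_dir}/vocab.json") as f:
    vocab = json.load(f)  # token -> id mapping

model = AdapterASR(vocab_size=len(vocab))  # sketch class from the Architecture section
model.adapter.load_state_dict(torch.load(f"{adapter_dir}/adapter_weights.pt"))
model.decoder.load_state_dict(torch.load(f"{adapter_dir}/decoder_weights.pt"))
model.lm_head.load_state_dict(torch.load(f"{adapter_dir}/lm_head_weights.pt"))
model.final_norm.load_state_dict(torch.load(f"{adapter_dir}/final_norm_weights.pt"))
model.eval()

# Feature extraction matches the base model; replace the placeholder with real 16 kHz audio.
waveform = np.zeros(16_000, dtype=np.float32)
fe = AutoFeatureExtractor.from_pretrained("facebook/w2v-bert-2.0")
inputs = fe(waveform, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    log_probs = model(inputs.input_features)

# Greedy CTC decode: collapse repeats, drop blanks (blank id assumed to be 0).
itos = {i: t for t, i in vocab.items()}
prev, pieces = None, []
for i in log_probs.argmax(-1)[0].tolist():
    if i != prev and i != 0:
        pieces.append(itos[i])
    prev = i
print("".join(pieces))
```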

Training Configuration

  • Epochs: 10
  • Base Learning Rate: 0.0005 (adaptive based on dataset size)
  • Batch Size: 48 x 1
  • Extended Vocabulary: True
  • Adapter Dimension: 64
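
Under these settings, the trainable/frozen split implied by the Architecture section could be wired up as below. This is a sketch only: how the base rate is scaled with dataset size is not specified here, so just the base value is shown.

```python
import torch

# Only the adapter, decoder, final norm, and LM head receive gradients;
# the w2v-bert-2.0 encoder stays frozen.
trainable_modules = (model.adapter, model.decoder, model.final_norm, model.lm_head)
params = [p for m in trainable_modules for p in m.parameters()]
optimizer = torch.optim.AdamW(params, lr=5e-4)  # base LR from the list above
```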

License

Apache 2.0
