Qwen3-ForcedAligner-0.6B - CoreML INT8
CoreML conversion of Qwen/Qwen3-ForcedAligner-0.6B with INT8 palettization, targeting the Apple Neural Engine.
Predicts word-level timestamps in a single forward pass.
Models
| Model | Description | Quantization |
|---|---|---|
| encoder.mlmodelc | Audio encoder (24 layers) | INT8 palettized |
| decoder.mlmodelc | Text decoder + classify head (28 layers) | INT8 palettized |
Usage
let aligner = try await CoreMLForcedAligner.fromPretrained(
    modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT8"
)
let aligned = aligner.align(audio: samples, text: "Hello world", sampleRate: 24000)
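The snippet above does not show what `align` returns. As a minimal sketch, assuming each result carries a word plus start and end times in seconds, the timestamps could be printed like this. `AlignedWord`, `timestamp`, `describe`, and the sample values are all hypothetical and for illustration only; check the library for the actual result type.

```swift
// Hypothetical result shape: the real return type of `align` is not shown
// in this card, so these field names and units are assumptions.
struct AlignedWord {
    let text: String
    let start: Double  // seconds (assumed)
    let end: Double    // seconds (assumed)
}

/// Left-pads a non-negative integer with zeros to the given width.
func pad(_ value: Int, _ width: Int) -> String {
    let digits = String(value)
    return String(repeating: "0", count: max(0, width - digits.count)) + digits
}

/// Formats a time in seconds as "MM:SS.mmm".
func timestamp(_ seconds: Double) -> String {
    let totalMillis = Int((seconds * 1000).rounded())
    let minutes = totalMillis / 60_000
    let secs = (totalMillis % 60_000) / 1000
    let millis = totalMillis % 1000
    return "\(pad(minutes, 2)):\(pad(secs, 2)).\(pad(millis, 3))"
}

/// Produces one "start - end  word" line per aligned word.
func describe(_ words: [AlignedWord]) -> [String] {
    words.map { "\(timestamp($0.start)) - \(timestamp($0.end))  \($0.text)" }
}

// Made-up values, not real model output.
let sample = [
    AlignedWord(text: "Hello", start: 0.12, end: 0.48),
    AlignedWord(text: "world", start: 0.55, end: 1.02),
]
describe(sample).forEach { print($0) }
```

If the library reports times in another unit (for example sample indices), convert to seconds first by dividing by the sample rate.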
Variants
| Variant | Backend | Size | Model ID |
|---|---|---|---|
| CoreML INT4 | Neural Engine | ~630 MB | aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT4 |
| CoreML INT8 | Neural Engine | ~1.0 GB | aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT8 |
| MLX 4-bit | GPU | ~979 MB | aufklarer/Qwen3-ForcedAligner-0.6B-4bit |
| MLX 8-bit | GPU | ~1.4 GB | aufklarer/Qwen3-ForcedAligner-0.6B-8bit |
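One way to choose among these variants is by backend and download budget. The sketch below encodes the table as data and picks the largest variant that fits. The `Variant` type and `pickVariant` helper are hypothetical and not part of the library; sizes are the approximate figures from the table, with 1.0 GB and 1.4 GB taken as 1000 MB and 1400 MB. Preferring the largest fitting variant is an assumption that higher bit widths are more desirable when space allows.

```swift
enum Backend { case neuralEngine, gpu }

// Hypothetical container for one row of the variants table.
struct Variant {
    let name: String
    let backend: Backend
    let approxSizeMB: Int  // approximate, taken from the table
    let modelId: String
}

let variants = [
    Variant(name: "CoreML INT4", backend: .neuralEngine, approxSizeMB: 630,
            modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT4"),
    Variant(name: "CoreML INT8", backend: .neuralEngine, approxSizeMB: 1000,
            modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT8"),
    Variant(name: "MLX 4-bit", backend: .gpu, approxSizeMB: 979,
            modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-4bit"),
    Variant(name: "MLX 8-bit", backend: .gpu, approxSizeMB: 1400,
            modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-8bit"),
]

/// Returns the largest variant for `backend` whose approximate size is
/// within `budgetMB`, or nil if none fits.
func pickVariant(backend: Backend, budgetMB: Int) -> Variant? {
    variants
        .filter { $0.backend == backend && $0.approxSizeMB <= budgetMB }
        .max { $0.approxSizeMB < $1.approxSizeMB }
}

if let choice = pickVariant(backend: .neuralEngine, budgetMB: 800) {
    print(choice.modelId)
}
```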
Links
- Swift library: soniqo/speech-swift
- Base model: Qwen/Qwen3-ForcedAligner-0.6B
- Blog: blog.ivan.digital
- Library Docs: soniqo.audio