Qwen3-ForcedAligner-0.6B - CoreML INT8

A CoreML conversion of Qwen/Qwen3-ForcedAligner-0.6B with INT8 palettization, targeting the Apple Neural Engine.

Predicts word-level timestamps in a single forward pass.
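
To illustrate what "a single forward pass" can mean for timestamp prediction, here is a minimal sketch of one way a classification head's output could be turned into times: take the argmax over time-bin classes for every timestamp slot at once, with no autoregressive loop. The function name `decodeTimestamps`, the logits layout, and the `binDuration` value are assumptions made for illustration; the model card does not document the actual decoding step.

```swift
// Hypothetical decoding step (not taken from the model card):
// each row holds the class scores for one timestamp slot, and each
// class index is assumed to stand for a fixed-duration time bin.
func decodeTimestamps(logits: [[Float]], binDuration: Double) -> [Double] {
    return logits.map { (row: [Float]) -> Double in
        // Argmax over the class dimension for this slot.
        var bestIndex = 0
        for (index, value) in row.enumerated() where value > row[bestIndex] {
            bestIndex = index
        }
        // Convert the winning bin index to seconds.
        return Double(bestIndex) * binDuration
    }
}

// Toy scores for two timestamp slots over four time bins.
let logits: [[Float]] = [
    [0.1, 2.0, 0.3, 0.0],
    [0.0, 0.2, 0.1, 3.5]
]
// binDuration of 0.5 s is an arbitrary value chosen for the example.
let times = decodeTimestamps(logits: logits, binDuration: 0.5)
print(times)
```

All slots are decoded independently from the same set of scores, which is what makes a one-pass design possible.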

Models

| Model | Description | Quantization |
|---|---|---|
| encoder.mlmodelc | Audio encoder (24 layers) | INT8 palettized |
| decoder.mlmodelc | Text decoder + classify head (28 layers) | INT8 palettized |
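
For readers unfamiliar with the term, palettization stores each weight as a small index into a lookup table of shared values rather than as a full-precision number; with 8-bit indices the table holds at most 256 entries. The sketch below shows only that lookup idea. The `PalettizedTensor` type and its toy values are hypothetical and do not reflect Core ML's internal storage format or the weights in these files.

```swift
// Illustrative only: a toy palettized tensor, not Core ML's on-disk layout.
struct PalettizedTensor {
    let lookupTable: [Float]  // shared values; at most 256 entries for 8-bit indices
    let indices: [UInt8]      // one table index per weight

    // Rebuild dense Float weights by looking up each index in the table.
    func dequantized() -> [Float] {
        return indices.map { lookupTable[Int($0)] }
    }
}

// A four-entry table is enough to show the idea.
let tensor = PalettizedTensor(
    lookupTable: [-0.5, 0.0, 0.25, 1.0],
    indices: [3, 0, 1, 1, 2]
)
let weights = tensor.dequantized()
print(weights)
```

The storage saving comes from keeping one byte per weight plus a small table, instead of two or four bytes per weight.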

Usage

let aligner = try await CoreMLForcedAligner.fromPretrained(
    modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT8"
)
let aligned = aligner.align(audio: samples, text: "Hello world", sampleRate: 24000)
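
The model card does not document the type returned by `align`. As a sketch of how word-level timestamps might be consumed, the example below assumes a hypothetical `WordTiming` value with a word and start/end times in seconds, and formats it for display. Both `WordTiming` and `describe` are invented for illustration and are not part of the library's API.

```swift
import Foundation

// Hypothetical result type: the real return type of `align` is not documented here.
struct WordTiming {
    let word: String
    let start: Double  // seconds from the start of the audio
    let end: Double    // seconds from the start of the audio
}

// Format one aligned word as "[start - end] word" with two decimal places.
func describe(_ timing: WordTiming) -> String {
    let start = String(format: "%.2f", timing.start)
    let end = String(format: "%.2f", timing.end)
    return "[\(start) - \(end)] \(timing.word)"
}

// Made-up timings, standing in for whatever `align` actually returns.
let sample = [
    WordTiming(word: "Hello", start: 0.0, end: 0.42),
    WordTiming(word: "world", start: 0.5, end: 1.0)
]
let lines = sample.map(describe)
lines.forEach { print($0) }
```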

Variants

| Variant | Backend | Size | Model ID |
|---|---|---|---|
| CoreML INT4 | Neural Engine | ~630 MB | aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT4 |
| CoreML INT8 | Neural Engine | ~1.0 GB | aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT8 |
| MLX 4-bit | GPU | ~979 MB | aufklarer/Qwen3-ForcedAligner-0.6B-4bit |
| MLX 8-bit | GPU | ~1.4 GB | aufklarer/Qwen3-ForcedAligner-0.6B-8bit |
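
One possible way to choose among these variants at runtime is to pick the largest one that fits a download or storage budget. The sketch below encodes the sizes from the table, treating 1 GB as roughly 1000 MB. The `Variant` type, the `largestVariant` helper, and the "largest that fits" policy are assumptions for illustration; size is not a documented proxy for accuracy.

```swift
// Hypothetical helper type; sizes are the approximate figures from the table above.
struct Variant {
    let name: String
    let modelId: String
    let approxSizeMB: Int
}

let variants = [
    Variant(name: "CoreML INT4", modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT4", approxSizeMB: 630),
    Variant(name: "CoreML INT8", modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT8", approxSizeMB: 1000),
    Variant(name: "MLX 4-bit", modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-4bit", approxSizeMB: 979),
    Variant(name: "MLX 8-bit", modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-8bit", approxSizeMB: 1400)
]

// Return the largest variant whose approximate size fits the budget, or nil if none fits.
func largestVariant(fitting budgetMB: Int, from candidates: [Variant]) -> Variant? {
    return candidates
        .filter { $0.approxSizeMB <= budgetMB }
        .max { $0.approxSizeMB < $1.approxSizeMB }
}

if let choice = largestVariant(fitting: 800, from: variants) {
    print("Selected \(choice.name): \(choice.modelId)")
}
```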

Links

