Reza2kn/canary-180m-persian-semiclean31-staged-smart-init

Automatic Speech Recognition


Reza2kn committed 15 days ago

Commit 2eeee5e · verified · 1 Parent(s): 74c46f8

Add staged smart-init model card

Files changed (1)

README.md +44 -0

README.md ADDED

	@@ -0,0 +1,44 @@

+---
+license: cc-by-4.0
+tags:
+- automatic-speech-recognition
+- persian
+- farsi
+- nemo
+- canary
+language:
+- fa
+- en
+base_model: nvidia/canary-180m-flash
+---
+# Persian-heavy Canary 180M staged smart-init ASR
+Experimental ASR-only adaptation of `nvidia/canary-180m-flash` for Persian-first bilingual ASR.
+This checkpoint uses the newer staged adaptation path:
+- Fresh Persian-heavy bilingual SentencePiece tokenizer.
+- Smart vocabulary/embedding initialization from the original Canary tokenizer where possible.
+- Stage 1: decoder/head adaptation with the encoder frozen.
+- Stage 2: full-parameter continuation with a short initial encoder freeze.
+
+Data mix:
+- Persian: selected and cleaned audio+text pairs only, from `Reza2kn/persian-asr-semi-clean-31h-awq-wer`.
+- English: small FLEURS retention slice.
+- Train split: 46,006 rows, about 31.742 hours.
+- Validation split: 938 rows, about 0.652 hours.
+
+Validation on the internal portable held-out split:
+- Rows: 938
+- WER: 0.341208 (34.12%)
+- CER: 0.195946 (19.59%)
+
+Artifact:
+- `canary_180m_persian_semiclean31_staged_smart_gpu1_bd700.nemo`
+- SHA256: `77fe2c46c30a507440b7129bb2efbb8e9b0e18622346509c7c46e99af16adb49`
+
+This is still a research checkpoint; it has not yet been exported to Android, CoreML, or ONNX. The earlier non-smart-init semiclean31 checkpoint was much worse, at around 102% WER; this staged smart-init checkpoint is the first run in which the adaptation is clearly learning.
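
The SHA256 listed under "Artifact" can be checked against a downloaded copy of the `.nemo` file. What follows is a minimal sketch using only the Python standard library. The local path is an assumption (the file sitting in the current directory under the name given in the card), and the helper names `sha256_of_file` and `matches_expected` are hypothetical, not part of the repository.

```python
import hashlib
from pathlib import Path

# Values copied from the model card.
ARTIFACT_NAME = "canary_180m_persian_semiclean31_staged_smart_gpu1_bd700.nemo"
EXPECTED_SHA256 = "77fe2c46c30a507440b7129bb2efbb8e9b0e18622346509c7c46e99af16adb49"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's digest equals the expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    local_copy = Path(ARTIFACT_NAME)  # assumed download location
    if local_copy.exists():
        print("checksum OK" if matches_expected(local_copy) else "checksum MISMATCH")
    else:
        print(f"{local_copy} not found; download it first")
```

Reading in chunks keeps memory use flat regardless of checkpoint size, which matters for model archives that may not fit comfortably in RAM.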
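
A WER above 100%, as reported for the earlier checkpoint, is possible because inserted words count as errors while the denominator is only the number of reference words. The sketch below illustrates this with a plain Levenshtein distance. It is not the evaluation script behind the card's numbers: the whitespace tokenization, the lack of text normalization, the corpus-level aggregation, and the toy strings are all assumptions made for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of tokens."""
    previous = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        current = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(
                previous[j] + 1,         # deletion: reference token missing from hypothesis
                current[j - 1] + 1,      # insertion: extra token in hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def error_rate(pairs, tokenize):
    """Corpus-level rate: total edits divided by total reference tokens."""
    edits = 0
    ref_tokens = 0
    for reference, hypothesis in pairs:
        ref = tokenize(reference)
        edits += edit_distance(ref, tokenize(hypothesis))
        ref_tokens += len(ref)
    if ref_tokens == 0:
        raise ValueError("references contain no tokens")
    return edits / ref_tokens


def wer(pairs):
    """Word error rate with whitespace tokenization (an assumption)."""
    return error_rate(pairs, str.split)


def cer(pairs):
    """Character error rate over raw characters, spaces included (an assumption)."""
    return error_rate(pairs, list)


if __name__ == "__main__":
    # One substitution in three reference words.
    print(wer([("a b c", "a x c")]))
    # Three inserted words against two reference words: the rate exceeds 1.0.
    print(wer([("a b", "a b c d e")]))
```

A hypothesis that hallucinates or repeats text therefore pushes WER past 100%, which is consistent with a model that has not yet learned a usable mapping for the new tokenizer.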