---
license: apache-2.0
tags:
- audio
- speech
- foundation-model
- next-token-prediction
- isoflop
- research
---

# Discrete Audio IsoFLOP Model (discrete-audio-isoflop-3e20-1.68B-d1920-L19-B128-a41e32)

This model belongs to a suite of discrete audio models trained for our IsoFLOP study as part of **SODA**, which performs unified next-token prediction on interleaved semantic, acoustic, and text tokens.

🥤 **Project Page:** [https://soda-audio.github.io](https://soda-audio.github.io/)

For full usage instructions (e.g., inference code) and more information, please refer to the **[SODA-4B-base](https://huggingface.co/soda-research/soda-4b-base)** model card.

The details for this particular model are as follows:

- `compute_budget`: 3e20 FLOPs
- `param_count` (non-embedding): 1.68B
- `hidden_dim`: 1920
- `num_layers`: 19
- `batch_size`: 128
- `training_step`: 56131
- `hash_key`: a41e32

📈 **WandB**: https://wandb.ai/potsawee/marin/groups/IsoFlop/workspace
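
As a rough sanity check on how the listed values relate, the sketch below estimates the training-token budget implied by this configuration. It assumes the common `C ≈ 6 · N · D` rule of thumb (compute ≈ 6 × parameters × tokens), that `batch_size` counts sequences per step, and that `training_step` is the total number of optimizer steps. None of these assumptions are confirmed by this card, and the study may account for FLOPs differently, so treat the results as order-of-magnitude estimates only.

```python
# Rough estimate of the token budget implied by this card's numbers.
# ASSUMPTION: compute ≈ 6 * params * tokens (a common rule of thumb);
# the IsoFLOP study itself may count FLOPs differently.

compute_budget = 3e20   # FLOPs, from `compute_budget`
param_count = 1.68e9    # non-embedding parameters, from `param_count`
batch_size = 128        # ASSUMPTION: sequences per optimizer step
training_step = 56131   # ASSUMPTION: total number of optimizer steps

# Tokens seen during training under the 6*N*D approximation.
approx_tokens = compute_budget / (6 * param_count)

# Tokens per step, and per sequence within a step.
approx_tokens_per_step = approx_tokens / training_step
approx_tokens_per_sequence = approx_tokens_per_step / batch_size

print(f"approx. training tokens:     {approx_tokens:.3e}")
print(f"approx. tokens per step:     {approx_tokens_per_step:,.0f}")
print(f"approx. tokens per sequence: {approx_tokens_per_sequence:,.0f}")
```

Under these assumptions the budget works out to roughly 3e10 training tokens. The per-sequence figure is only indicative, since the actual sequence length is not stated here.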