SSM-AST: State Space Models for Automatic Speech Transcription
This repository contains model artifacts for the master's thesis "State Space Models for Automatic Speech Transcription." It provides the trained acoustic encoder checkpoints, language-model checkpoints, n-gram training text, and selected training logs used to support evaluation of a pure State Space Model (SSM) automatic speech transcription pipeline on LibriSpeech.
The code for training and evaluation is maintained separately. This Hugging Face repository is intended as a checkpoint and artifact archive so that the thesis evaluation pipeline can be run without retraining the full models from scratch.
Repository contents
SSM-AST/
├── datasets/
│   └── librispeecm_lm_dataset_pre-processed_char_level_text.txt
├── encoder_checkpoints/
│   ├── enc_mamba3_460h_checkpoint_best_epoch=49_val_wer=0.255.ckpt
│   ├── enc_mamba3_960h_checkpoint_best_epoch=91_val_wer=0.186.ckpt
│   ├── enc_mamba_460h_checkpoint_best_epoch=49_val_wer=0.227.ckpt
│   ├── enc_mamba_960h_checkpoint_best_epoch=100_val_wer=0.155.ckpt
│   ├── enc_ssssm_460h_checkpoint_best_epoch=49_val_wer=0.197.ckpt
│   ├── enc_ssssm_960h_checkpoint_best_epoch=100_val_wer=0.111.ckpt
│   └── enc_ssssm_960h_checkpoint_best_epoch=98_val_wer=0.111.ckpt
├── lm_checkpoints/
│   ├── lm_mamba3_checkpoint_MaxChars-1000000000_ds-64_d320_L18.pt
│   ├── lm_mamba_checkpoint_MaxChars-1000000000_d320_L18.pt
│   └── lm_ngram_checkpoint_char_10gram.pkl
├── log files/
│   ├── 0utput_exp-mamba-1_960h_W-320_D-48_S-16_B-128_E-100.txt
│   ├── 0utput_exp-mamba3_460h_W-512_D-30_S-16_B-64_E-50.txt
│   ├── 0utput_exp-mamba3_enc_960h_W-320_D-48_S-16_b-64_E-100.txt
│   ├── 0utput_exp-mamba_dt_bias_hier_460h_W-512_D-30_S-16.txt
│   ├── 0utput_exp-v75_460h_hier_gating_256_42.txt
│   ├── 0utput_exp-v77_960h_hier_gating_320_48.txt
│   ├── mamba3_elm_training.log
│   └── mamba_elm_training.log
├── .gitattributes
└── README.md
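
The encoder checkpoint filenames listed above appear to encode the model family, the training subset size in hours, the best epoch, and the validation WER. The following is a minimal sketch for extracting those fields when selecting a checkpoint. It assumes the naming pattern visible in the listing holds for every encoder file; the names `EncoderCheckpoint` and `parse_encoder_checkpoint` are illustrative and are not part of the thesis code.

```python
import re
from typing import NamedTuple


class EncoderCheckpoint(NamedTuple):
    """Fields recovered from an encoder checkpoint filename (illustrative type)."""
    model: str      # e.g. "mamba", "mamba3", "ssssm"
    hours: int      # training subset size, e.g. 460 or 960
    epoch: int      # epoch at which the checkpoint was saved
    val_wer: float  # validation WER exactly as written in the filename


# Assumed pattern, inferred from the filenames in the repository listing.
_PATTERN = re.compile(
    r"^enc_(?P<model>[a-z0-9]+)_(?P<hours>\d+)h_checkpoint_best_"
    r"epoch=(?P<epoch>\d+)_val_wer=(?P<wer>\d+\.\d+)\.ckpt$"
)


def parse_encoder_checkpoint(filename: str) -> EncoderCheckpoint:
    """Parse one encoder checkpoint filename; raise ValueError if it does not match."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized encoder checkpoint name: {filename!r}")
    return EncoderCheckpoint(
        model=match["model"],
        hours=int(match["hours"]),
        epoch=int(match["epoch"]),
        val_wer=float(match["wer"]),
    )


if __name__ == "__main__":
    names = [
        "enc_mamba_960h_checkpoint_best_epoch=100_val_wer=0.155.ckpt",
        "enc_ssssm_960h_checkpoint_best_epoch=98_val_wer=0.111.ckpt",
    ]
    # Pick the checkpoint with the lowest validation WER among the given names.
    best = min((parse_encoder_checkpoint(n) for n in names), key=lambda c: c.val_wer)
    print(best)
```

Parsing the metadata from the filename avoids loading each multi-gigabyte checkpoint just to compare validation scores. Files that do not follow the assumed pattern raise a `ValueError` rather than being silently skipped.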