Hugging Face for Audio

Papers

Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation

hf-audio's collections (7)

Open ASR Leaderboard Eval Configurations

Open ASR Leaderboard configuration for Transformers 🤗 models

Display the server’s root folder as a web page
Open ASR Leaderboard configuration for NVIDIA NeMo ASR models

Normalize text to a consistent, clean format
Open ASR Leaderboard configuration for Boson's Higgs Audio v3

Normalize and clean text data for analysis
Open ASR Leaderboard configuration for API models

Run evaluations and benchmark your ML models

Automatic Speech Recognition 📝

A collection of ASR models supported in 🤗 Transformers

openai/whisper-large-v2

Automatic Speech Recognition • 2B • Updated Feb 29, 2024 • 61.7k • 1.8k
facebook/wav2vec2-base-960h

Automatic Speech Recognition • 94.4M • Updated Nov 14, 2022 • 1.09M • 398
facebook/wav2vec2-large-xlsr-53

Updated Mar 18, 2022 • 177k • 159
facebook/hubert-xlarge-ls960-ft

Automatic Speech Recognition • 1.0B • Updated Jun 27, 2023 • 14.5k • 16

Audio Classification 🔊

A collection of audio classification models supported in 🤗 Transformers

laion/clap-htsat-unfused

Feature Extraction • Updated Apr 24, 2023 • 494k • 74
MIT/ast-finetuned-audioset-10-10-0.4593

Audio Classification • 86.6M • Updated Sep 6, 2023 • 298k • 358
facebook/mms-lid-126

Audio Classification • 1.0B • Updated Jun 13, 2023 • 20.3k • 34

Audio Codecs Embeddings 🎙️

A collection of codec and embedding models supported in 🤗 Transformers.

laion/clap-htsat-unfused

Feature Extraction • Updated Apr 24, 2023 • 494k • 74
facebook/encodec_32khz

Feature Extraction • 59M • Updated Sep 4, 2023 • 45.3k • 18
descript/dac_44khz

Feature Extraction • 76.6M • Updated Oct 11, 2024 • 88.7k • 11
descript/dac_24khz

Feature Extraction • 74.7M • Updated Jul 10, 2025 • 3.04k • 3

Xcodec and Xcodec2

Transformers-supported versions of the X-Codec models: https://github.com/zhenye234/xcodec?tab=readme-ov-file#available-models

hf-audio/xcodec-hubert-general-balanced

Feature Extraction • 0.2B • Updated Oct 8, 2025 • 1.01k • 1
hf-audio/xcodec-wavlm-more-data

Feature Extraction • 0.2B • Updated Oct 8, 2025 • 1.4k • 1
hf-audio/xcodec-wavlm-mls

Feature Extraction • 0.2B • Updated Oct 8, 2025 • 954
hf-audio/xcodec-hubert-general

Feature Extraction • 0.2B • Updated Oct 8, 2025 • 4.09k

Text to Speech 🗣️

A collection of TTS models supported in 🤗 Transformers.

suno/bark

Text-to-Speech • Updated Oct 4, 2023 • 17.5k • 1.53k
kakao-enterprise/vits-vctk

Text-to-Speech • 39.7M • Updated Sep 11, 2023 • 891 • 15
facebook/mms-tts

Text-to-Speech • Updated Jul 25, 2023 • 191
microsoft/speecht5_tts

Text-to-Speech • Updated Nov 8, 2023 • 101k • 835

Text to Music 🎧

A collection of music generation models supported in 🤗 Transformers and 🧨 Diffusers

cvssp/audioldm2

Updated Apr 16, 2024 • 23.1k • 69
cvssp/audioldm-s-full-v2

Updated Apr 16, 2024 • 2.19k • 21
harmonai/jmann-large-580k

Updated Jun 16, 2024 • 6 • 1
ucsd-reach/musicldm

Updated Sep 22, 2023 • 221 • 7