Robust Quantizer from HuBERT Base (Layer 6)

This model checkpoint contains a Robust Quantizer trained on top of the 6th layer of the hubert-base-ls960 model. It was developed as part of a reproduction and evaluation study on creating robust discrete speech units, originally proposed in Augmentation Invariant Discrete Representation for Generative Spoken Language Modeling (Gat et al., 2023).

Model Details

This quantizer was trained to provide discrete pseudo-labels that are resilient to various acoustic perturbations. By applying data augmentations during the quantization process, the resulting discrete units, and by extension the downstream acoustic models trained on them, become more robust to noise and varying acoustic conditions.
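The core idea can be sketched with a toy example: quantize a clean feature sequence and a perturbed copy against the same codebook, then measure how often both views map to the same discrete unit. This is a minimal illustration of the invariance being targeted, not the model's actual training procedure; the codebook, features, and perturbation below are all made up for the sketch.

```python
import random
import math

random.seed(0)

def nearest_unit(frame, codebook):
    """Assign a feature frame to the index of its nearest codebook entry."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(range(len(codebook)), key=lambda i: dist(frame, codebook[i]))

# Toy 4-entry codebook over 2-D "features" (a stand-in for 500 units
# over HuBERT layer-6 states).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Clean features and a mildly perturbed copy (a stand-in for an augmented view).
clean = [[random.random(), random.random()] for _ in range(50)]
augmented = [[x + random.gauss(0, 0.05) for x in f] for f in clean]

units_clean = [nearest_unit(f, codebook) for f in clean]
units_aug = [nearest_unit(f, codebook) for f in augmented]

# Unit agreement: the fraction of frames mapped to the same discrete unit.
# A robust quantizer pushes this toward 1.0 under much harsher augmentations.
agreement = sum(c == a for c, a in zip(units_clean, units_aug)) / len(clean)
print(f"unit agreement: {agreement:.2f}")
```

A robust quantizer is trained so that this agreement stays high even under strong augmentations such as the ones listed below.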

Usage

Download the Model

from huggingface_hub import hf_hub_download

model_path = hf_hub_download(repo_id="iliasslasri/robust_speech_quantizer", 
                              filename="500_vocab_size/round_1/E1_best.pt",
                              force_download=True)
config_path = hf_hub_download(repo_id="iliasslasri/robust_speech_quantizer", 
                               filename="500_vocab_size/config.yaml",
                               force_download=True)

Augmentation Examples

The following data augmentations were applied to the audio during quantizer training (audio examples are available on the model page):
Clean
Time Stretch
Pitch Shift
Reverberation
Noise
Echo
Random Noise
Pink Noise
Lowpass Filter
Highpass Filter
Bandpass Filter
Smooth
Boost Audio
Duck Audio
Up-Down Resample

