Papers
arxiv:2512.00482

Beyond Performance: Probing Representation Dynamics In Speech Enhancement Models

Published on Nov 29, 2025
Abstract

Analysis of speech enhancement model representations across noise conditions reveals differential activation patterns and inter-layer dynamics that inform SNR-aware conditioning strategies.

AI-generated summary

We probe internal representations of a speech enhancement (SE) model across noise conditions. Using MUSE, a transformer-convolutional model trained on VoiceBank-DEMAND, we analyze activations in the encoder, latent, decoder, and refinement blocks while sweeping input signal-to-noise ratios (SNRs) from -10 to 30 dB. We use Centered Kernel Alignment (CKA) to measure point-wise representation similarity and diffusion distance to capture distributional shifts across SNRs. Results show that encoder CKA between noisy and clean inputs remains stable, while latent and decoder CKA drop sharply as SNR decreases. Linear fits of CKA versus SNR reveal a depth-dependent robustness-sensitivity trade-off. Diffusion distance varies incrementally with SNR within each layer but differs strongly across layers, especially at low SNRs. Together, these findings indicate that noise levels differentially activate model regions and induce distinct inter-layer dynamics, motivating SNR-aware conditioning and refinement strategies for SE.
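The summary's core measurement, CKA between layer activations under noisy and clean inputs, can be sketched with the standard linear-CKA formula. This is a generic implementation, not the paper's code; the activation matrices and the per-SNR linear fit shown here are illustrative assumptions.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two activation matrices.

    X, Y: (n_samples, n_features) activations of one layer for the same
    inputs under two conditions (e.g. noisy vs. clean). Returns a
    similarity in [0, 1]; 1 means identical up to rotation/scaling.
    """
    # Center each feature dimension before comparing.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

# Illustrative use: fit CKA vs. SNR for one layer, as in the summary's
# "linear fits of CKA versus SNR" (slope ~ noise sensitivity of that depth).
rng = np.random.default_rng(0)
snrs = np.array([-10, -5, 0, 5, 10, 20, 30], dtype=float)
clean = rng.standard_normal((128, 64))
ckas = np.array([
    linear_cka(clean + (1.0 / (1.0 + 10 ** (snr / 10))) * rng.standard_normal(clean.shape), clean)
    for snr in snrs
])
slope, intercept = np.polyfit(snrs, ckas, 1)  # slope > 0: similarity recovers as SNR rises
```

A layer whose fitted slope is near zero is robust to noise (the encoder, per the results above); a steep positive slope marks a noise-sensitive layer (latent and decoder blocks).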

Models citing this paper 1

Datasets citing this paper 1
