From Words to Waves: Analyzing Concept Formation in Speech and Text-Based Foundation Models

Accepted at Interspeech 2025 Main Track

This repository contains the official implementation and analysis code for our paper.

🛠️ Environment Setup

We provide three specialized conda environments for different analysis components:

Core Speech Analysis Environment

conda env create -f environment.yml
conda activate speechX

NeuroX Interpretability Environment

conda env create -f environment_neurox.yml
conda activate neurox_pip

Concept Clustering Environment

conda env create -f environment_conceptx.yml
conda activate conceptx
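
If all three environments are needed, they can be created in one pass. The following is a minimal sketch, assuming `conda` is on the PATH and the three YAML files sit in the current directory. The `create_envs` function and the `DRY_RUN` variable are illustrative names and are not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): create every conda
# environment listed above. Set DRY_RUN=1 to print the commands instead
# of running them.
create_envs() {
    for env_file in environment.yml environment_neurox.yml environment_conceptx.yml; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Dry run: only show what would be executed.
            echo "conda env create -f $env_file"
        elif [ -f "$env_file" ]; then
            conda env create -f "$env_file"
        else
            echo "skipping $env_file: file not found" >&2
        fi
    done
}

# Usage: DRY_RUN=1 create_envs
```

Running with `DRY_RUN=1` first is a cheap way to confirm the file names before conda starts resolving packages.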

📊 Running Experiments

Representation Extraction and Clustering

Speech Modality Analysis (LibriSpeech)

Extract and analyze speech representations using HuBERT:

sh scripts/librispeech/speech/hubert/extract_and_cluster_speech.sh

Text Modality Analysis (LibriSpeech)

Extract and analyze text representations using BERT:

sh scripts/librispeech/text/bert/extract_and_cluster_text.sh
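
To run both modalities back to back, the two scripts can be wrapped so that a failure in the first stops the second. This is a sketch only: `run_pipelines` is a hypothetical name, and it assumes the commands are launched from the repository root with a suitable environment already active.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the repository): run each pipeline
# script in order and stop at the first one that fails.
run_pipelines() {
    # With no arguments, fall back to the two scripts named in this README.
    if [ "$#" -eq 0 ]; then
        set -- scripts/librispeech/speech/hubert/extract_and_cluster_speech.sh \
               scripts/librispeech/text/bert/extract_and_cluster_text.sh
    fi
    for script in "$@"; do
        echo "running $script"
        sh "$script" || { echo "failed: $script" >&2; return 1; }
    done
}

# Usage: run_pipelines
```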

Fine-tuning Experiments

SST-2 Sentiment Analysis

Train models on the Stanford Sentiment Treebank:

# Fine-tune SpeechT5 model
sbatch sst2_ft/scripts/train_speecht5.sh

# Evaluate fine-tuned model (update model path in script first)
sbatch sst2_ft/scripts/infer_speecht5.sh
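
On a Slurm cluster, the two jobs can be chained so that evaluation starts only after training finishes successfully. The sketch below uses the standard `sbatch` options `--parsable` and `--dependency=afterok`. The function name `submit_train_then_eval` is hypothetical, and the sketch assumes the model path in the inference script has already been set to the location where training will write the model.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): submit training, then
# queue evaluation so it starts only if the training job exits successfully.
submit_train_then_eval() {
    # --parsable makes sbatch print just the job ID, optionally followed
    # by ";cluster_name".
    train_id=$(sbatch --parsable sst2_ft/scripts/train_speecht5.sh) || return 1
    # Drop the optional ";cluster_name" suffix.
    train_id=${train_id%%;*}
    sbatch --dependency=afterok:"$train_id" sst2_ft/scripts/infer_speecht5.sh
}

# Usage: submit_train_then_eval
```

If training fails, Slurm never starts the dependent evaluation job, and it may need to be cancelled by hand.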

📖 Citation

If you find this work useful for your research, please cite:
