Accepted at Interspeech 2025 Main Track
This repository contains the official implementation and analysis code for our paper.
We provide three specialized conda environments for different analysis components:
conda env create -f environment.yml
conda activate speechX
conda env create -f environment_neurox.yml
conda activate neurox_pip
conda env create -f environment_conceptx.yml
conda activate conceptx
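The three environments above can also be created in one pass. The following is a minimal sketch, not part of the repository: the `create_envs` function and the `DRY_RUN` variable are hypothetical names introduced here for illustration. By default it only prints the commands it would run, so the plan can be checked before anything is installed.

```shell
#!/bin/sh
# Sketch: create every conda environment listed in this README.
# DRY_RUN and create_envs are hypothetical helpers, not part of the repository.
# With DRY_RUN=1 (the default) the commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

create_envs() {
    for env_file in environment.yml environment_neurox.yml environment_conceptx.yml; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "conda env create -f $env_file"
        else
            # Stop at the first environment that fails to build.
            conda env create -f "$env_file" || return 1
        fi
    done
}

create_envs
```

Running the script with `DRY_RUN=0` performs the actual installs. Each environment still has to be activated separately (`speechX`, `neurox_pip`, or `conceptx`) before running the matching analysis step.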
Extract and cluster speech representations using HuBERT:
sh scripts/librispeech/speech/hubert/extract_and_cluster_speech.sh
Extract and cluster text representations using BERT:
sh scripts/librispeech/text/bert/extract_and_cluster_text.sh
Fine-tune and evaluate SpeechT5 on the Stanford Sentiment Treebank (SST-2):
# Fine-tune SpeechT5 model
sbatch sst2_ft/scripts/train_speecht5.sh
# Evaluate the fine-tuned model (update the model path in the script first)
sbatch sst2_ft/scripts/infer_speecht5.sh
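If the checkpoint location is known in advance, the two Slurm jobs can be chained so that evaluation starts only after fine-tuning finishes successfully. The sketch below uses the standard `sbatch` options `--parsable` and `--dependency=afterok`. The `submit_pipeline` function is a hypothetical helper, and it assumes the model path in `infer_speecht5.sh` has already been set to where training will write the model.

```shell
#!/bin/sh
# Sketch: queue evaluation to run only if fine-tuning completes successfully.
# submit_pipeline is a hypothetical helper, not part of the repository.
# Assumes the model path in infer_speecht5.sh was updated beforehand.

submit_pipeline() {
    # --parsable makes sbatch print only the job ID (optionally "jobid;cluster").
    train_id="$(sbatch --parsable sst2_ft/scripts/train_speecht5.sh)" || return 1
    # Drop the ";cluster" suffix if one is present.
    train_id="${train_id%%;*}"
    # afterok: start the evaluation job only if the training job exits with code 0.
    sbatch --dependency=afterok:"$train_id" sst2_ft/scripts/infer_speecht5.sh
}
```

Calling `submit_pipeline` from the repository root submits both jobs. If training fails, the evaluation job never starts and may stay pending with an unsatisfied dependency until it is cancelled, depending on the cluster configuration.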
If you find this work useful for your research, please cite: