Publications

(2024). BAT: Learning to Reason about Spatial Sounds with Large Language Models.

(2023). emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation.

(2023). Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition.

(2023). Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning. In ASRU 2023.

(2023). Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation. In INTERSPEECH 2023.

(2022). MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets. In INTERSPEECH 2023.

(2022). Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition. In ASRU 2023.
