PhD student at NUS
Xintong Wang 王心童
I am 28.77 years old, a third-year PhD student at the National University of Singapore, supervised by Prof. Wang Ye. My research focuses on real-time and data-efficient methods for audio and audio-language models.
I earned my B.S. from Beijing Forestry University in 2022. Email: xintongwang9709 at gmail dot com.
Interactive
Playaround
A small place for interactive demos around speech models.
Whisper-Pinyin Demo: Transcribe Mandarin speech to Pinyin text.
Selected papers
Publications
Conference Articles
- Xintong Wang, Mingqian Shi, and Ye Wang, "Pitch-Aware RNN-T for Mandarin Chinese Mispronunciation Detection and Diagnosis," Interspeech 2024 (Oral). PDF
- Junchuan Zhao, Xintong Wang, and Ye Wang, "Prosody-Adaptable Audio Codecs for Zero-Shot Voice Conversion via In-Context Learning," Interspeech 2025. arXiv
Journal Articles
- Xintong Wang and Chuangang Zhao, "A 2D Convolutional Gating Mechanism for Mandarin Streaming Speech Recognition," Information, 12.4 (2021): 165. PDF
Experience
Work Experience
Oct 2023 - Aug 2024
Research Assistant
Sound and Music Computing Lab, School of Computing, National University of Singapore, Singapore
Jul 2022 - Oct 2023
Machine Learning Engineer
X Studio, Xiaoice, Beijing
May 2021 - Jul 2022
Intern
AI Being BU, Xiaoice, Beijing
Interests
I often boulder at FitBloc in Singapore. Feel free to reach out if you would like to chat about research or climbing.