Anup Singh
Applied Scientist II at Amazon
I am currently an Applied Scientist at Amazon, where I work on Automatic Speech Recognition (ASR), leveraging large-scale models and speech LLMs to improve accuracy, robustness, and multilingual capabilities in real-world media applications. My broader interests include speech representation learning, audio understanding, and efficient large-scale systems.
I completed my Ph.D. in Computer Science at Ghent University, where I was part of the Speech and Audio Processing Group at IDLab. I was advised by Prof. Kris Demuynck and Prof. Vipul Arora. My doctoral research focused on self-supervised learning for speech and audio, with an emphasis on scalable audio indexing and retrieval. Building on this foundation, I explored speech tokenization techniques aimed at advancing textless NLP and speech-based language models.
I hold a BS–MS dual degree in Mathematics from the Indian Institute of Science Education and Research (IISER-Kolkata).
Outside of work, I enjoy learning about geopolitics, reading, and playing sports (mostly lawn tennis these days!).
news
| Date | News |
|---|---|
| Jan 22, 2026 | Our paper titled “Harmonic Summation-Based Robust Pitch Estimation in Noisy and Reverberant Environments” has been accepted at NCC 2026. Check out the paper. |
| Jan 17, 2026 | Our paper titled “BEST-STD2.0: Balanced and Efficient Speech Tokenizer for Spoken Term Detection” has been accepted at ICASSP 2026. Check out the paper. |
| Jan 01, 2026 | I have joined Amazon as an Applied Scientist II, working on Speech LLMs. |
latest posts
| Date | Post |
|---|---|
| Apr 12, 2026 | RLHF: Reinforcement Learning from Human Feedback |
| Feb 21, 2026 | Flow Matching |
| Dec 26, 2025 | What are Diffusion Models? |