Anup Singh

Applied Scientist II at Amazon

I am currently an Applied Scientist at Amazon, where I work on Automatic Speech Recognition (ASR), leveraging large-scale models and speech LLMs to improve accuracy, robustness, and multilingual capabilities in real-world media applications. My broader interests include speech representation learning, audio understanding, and efficient large-scale systems.

I completed my Ph.D. in Computer Science at Ghent University, where I was part of the Speech and Audio Processing Group at IDLab. I was advised by Prof. Kris Demuynck and Prof. Vipul Arora. My doctoral research focused on self-supervised learning for speech and audio, with an emphasis on scalable audio indexing and retrieval. Building on this foundation, I explored speech tokenization techniques aimed at advancing textless NLP and speech-based language models.

I hold a BS–MS dual degree in Mathematics from the Indian Institute of Science Education and Research (IISER-Kolkata).

Outside of work, I enjoy learning about geopolitics, reading, and playing sports (mostly lawn tennis these days!).

news

Jan 22, 2026 Our paper titled “Harmonic Summation-Based Robust Pitch Estimation in Noisy and Reverberant Environments” has been accepted at NCC 2026. Check out the paper.
Jan 17, 2026 Our paper titled “BEST-STD2.0: Balanced and Efficient Speech Tokenizer for Spoken Term Detection” has been accepted at ICASSP 2026. Check out the paper.
Jan 01, 2026 I have joined Amazon as an Applied Scientist II, working on Speech LLMs.

selected publications

  1. Interspeech
    Language-Agnostic Speech Tokenizer for Spoken Term Detection with Efficient Retrieval
    Anup Singh, Kris Demuynck, and Vipul Arora
    In Proc. Interspeech 2025, 2025
  2. ICASSP
    BEST-STD2.0: Balanced and Efficient Speech Tokenizer for Spoken Term Detection
    Anup Singh, Vipul Arora, and Kris Demuynck
    In ICASSP 2026 - 2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2026