A Short Bio

I am currently a Senior Research Engineer at OmniSpeech. Before that, I received my PhD and Master's degrees from the Electrical and Computer Engineering Department at the University of Maryland, College Park. I did my PhD under the supervision of Prof. Carol Espy-Wilson and Prof. Shihab Shamma, where I explored acoustic-to-articulatory speech inversion and the learning of interpretable articulatory representations of speech inspired by sensorimotor learning algorithms. During my PhD, I also worked extensively on combining articulatory representations of speech with facial action units extracted from video and text embeddings to develop multimodal systems for detecting mental health conditions, emotions, and child pronunciation disorders.

My primary research interests are in speech communication, audio and deep learning. I combine knowledge of digital signal processing, speech science, linguistics, acoustic phonetics and machine learning to conduct interdisciplinary research in speech production, speech synthesis, speech inversion, speech enhancement and audio classification. I have also worked on using speech as a behavioral signal for emotion recognition, and the detection and monitoring of mental health.

CV

News

  1. (May 2024) Two papers accepted at Interspeech 2024.
    • My internship work done at Dolby Laboratories on “Accent Conversion with Articulatory Representations”
    • Work I collaborated on with the Speech Communication Lab at UMD, “A Multimodal Framework for the Assessment of the Schizophrenia Spectrum”
  2. (May 2024) Paper accepted for publication at the 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2024).
  3. (May 2024) Paper accepted for publication at the 32nd European Signal Processing Conference (EUSIPCO) 2024.
  4. (Jan 2024) I joined OmniSpeech LLC as a Senior Research Engineer to work on developing AI-based speech enhancement and audio deepfake detection algorithms.
  5. (Oct 2023) I successfully defended my PhD dissertation in Electrical and Computer Engineering at the University of Maryland, College Park. I am currently open to research positions in industry.
  6. (June 2023) I joined the Multimodal science group at Dolby Laboratories as a Summer Research Intern. Excited to work on a research problem related to speech synthesis and voice conversion.
  7. (May 2023) Three first-author papers accepted for publication at Interspeech 2023:
    • Learning to Compute the Articulatory Representations of Speech with the MIRRORNET
    • Speaker-independent Speech Inversion for Estimation of Nasalance
    • Acoustic-to-Articulatory Speech Inversion Features for Mispronunciation Detection of /r/ in Child Speech Sound Disorders (equal contribution with Nina R. Benway)
  8. (May 2023) Our paper on “Audio data augmentation for acoustic-to-articulatory speech inversion” has been accepted for publication in the 31st European Signal Processing Conference (EUSIPCO) 2023.
  9. (April 2023) I have been selected for the IEEE ICASSP Rising Star Programme to present my thesis work at ICASSP 2023.
  10. (Feb 2023) Our paper “The Secret Source: Incorporating Source Features to Improve Acoustic-to-Articulatory Speech Inversion” has been accepted for publication in ICASSP 2023.
  11. (Sep 2022) Attended Interspeech 2022 in Incheon, South Korea, to present our paper on “Acoustic-to-articulatory Speech Inversion with Multi-task Learning”.
  12. (May 2022) Our paper “Acoustic-to-articulatory Speech Inversion with Multi-task Learning” has been accepted for publication in Interspeech 2022.
  13. (Jan 2022) Our paper “The MirrorNet: Learning Audio Synthesizer Controls Inspired by Sensorimotor Interactions” has been accepted for publication in ICASSP 2022.
  14. (Dec 2021) Attended the 181st Meeting of the Acoustical Society of America in Seattle, WA, to present our work on “Emotion Recognition with Articulatory Coordination Features”.
  15. (July 2021) Our paper “Multimodal Approach for Assessing Neuromotor Coordination in Schizophrenia using Convolutional Neural Networks” has been accepted for publication in ACM ICMI 2021.