Joint CTC-attention based end-to-end speech recognition using multi-task learning
S Kim, T Hori, S Watanabe
2017 IEEE International Conference on Acoustics, Speech and Signal …, 2017
Hybrid CTC/attention architecture for end-to-end speech recognition
S Watanabe, T Hori, S Kim, JR Hershey, T Hayashi
IEEE Journal of Selected Topics in Signal Processing 11 (8), 1240-1253, 2017
Multi-channel speech recognition: LSTMs all the way through
H Erdogan, T Hayashi, JR Hershey, T Hori, C Hori, WN Hsu, S Kim, ...
CHiME-4 workshop, 1-4, 2016
Towards language-universal end-to-end speech recognition
S Kim, ML Seltzer
2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018
Multimodal transfer deep learning with applications in audio-visual recognition
S Moon, S Kim, H Wang
arXiv preprint arXiv:1412.3121, 2014
Environmental noise embeddings for robust speech recognition
S Kim, B Raj, I Lane
arXiv preprint arXiv:1601.02553, 2016
Recurrent models for auditory attention in multi-microphone distance speech recognition
S Kim, I Lane
arXiv preprint arXiv:1511.06407, 2015
Improved training for online end-to-end speech recognition systems
S Kim, ML Seltzer, J Li, R Zhao
arXiv preprint arXiv:1711.02212, 2017
Impact of nano-scale through-silicon vias on the quality of today and future 3D IC designs
DH Kim, S Kim, SK Lim
International Workshop on System Level Interconnect Prediction, 1-8, 2011
Dialog-context aware end-to-end speech recognition
S Kim, F Metze
2018 IEEE Spoken Language Technology Workshop (SLT), 434-440, 2018
End-to-End Speech Recognition with Auditory Attention for Multi-Microphone Distance Speech Recognition
S Kim, I Lane
Interspeech, 3867-3871, 2017
Improving RNN transducer based ASR with auxiliary tasks
C Liu, F Zhang, D Le, S Kim, Y Saraf, G Zweig
2021 IEEE Spoken Language Technology Workshop (SLT), 172-179, 2021
Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion
S Kim, S Dalmia, F Metze
arXiv preprint arXiv:1906.11604, 2019
Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer
S Kim, Y Shangguan, J Mahadeokar, A Bruguier, C Fuegen, ML Seltzer, ...
arXiv preprint arXiv:2010.13878, 2020
Situation informed end-to-end ASR for CHiME-5 challenge
S Kim, S Dalmia, F Metze
CHiME5 workshop, 2018
Cross-attention end-to-end ASR for two-party conversations
S Kim, S Dalmia, F Metze
arXiv preprint arXiv:1907.10726, 2019
Acoustic-to-word models with conversational context information
S Kim, F Metze
arXiv preprint arXiv:1905.08796, 2019
Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion
D Le, M Jain, G Keren, S Kim, Y Shi, J Mahadeokar, J Chan, ...
arXiv preprint arXiv:2104.02194, 2021
Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding
S Kim, A Arora, D Le, CF Yeh, C Fuegen, O Kalinli, ML Seltzer
arXiv preprint arXiv:2104.02138, 2021
End-to-End Speech Recognition on Conversations
S Kim
Carnegie Mellon University, 2019
