Reducing Insertion Errors in Neural Transducer-Based Automatic Speech Recognition
Abstract:
Techniques for training a neural transducer-based automatic speech recognition model to be robust against additive background noise, thereby reducing insertion errors. In one aspect, a method of training an automatic speech recognition model includes: generating a modified training dataset from an initial training dataset by concatenating each one-word utterance with a preceding or a succeeding sentence in the initial training dataset, based on the duration of silence between the one-word utterance and the preceding or the succeeding sentence; and training the automatic speech recognition model using the modified training dataset.
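The data-preparation step described above can be illustrated with a minimal sketch. It is not the claimed method itself. The abstract does not say how the silence duration is used, so the sketch makes several assumptions: a hypothetical `Utterance` record holding start and end times and a transcript, a hypothetical `max_silence` threshold below which merging is allowed, and a rule that prefers whichever neighbor is separated by the shorter silence. Only segment metadata is merged here; a real pipeline would also concatenate the corresponding audio.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Utterance:
    """Hypothetical training segment: times in seconds plus its transcript."""
    start: float
    end: float
    text: str


def merge_one_word_utterances(
    utts: List[Utterance], max_silence: float = 0.5
) -> List[Utterance]:
    """Attach each one-word utterance to its nearest neighbor in time.

    Assumption: a merge happens only when the silence to that neighbor is
    at most `max_silence` seconds; otherwise the utterance is kept as-is.
    """
    items = sorted(utts, key=lambda u: u.start)  # new list; input is untouched
    out: List[Utterance] = []
    i = 0
    while i < len(items):
        u = items[i]
        if len(u.text.split()) == 1:
            # Silence before and after the one-word utterance.
            prev_gap = u.start - out[-1].end if out else float("inf")
            next_gap = (
                items[i + 1].start - u.end if i + 1 < len(items) else float("inf")
            )
            if min(prev_gap, next_gap) <= max_silence:
                if prev_gap <= next_gap:
                    # Append the word to the preceding sentence.
                    p = out[-1]
                    out[-1] = Utterance(p.start, u.end, f"{p.text} {u.text}")
                else:
                    # Prepend the word to the succeeding sentence.
                    n = items[i + 1]
                    items[i + 1] = Utterance(u.start, n.end, f"{u.text} {n.text}")
                i += 1
                continue
        out.append(u)
        i += 1
    return out


example = [
    Utterance(0.0, 2.0, "turn on the lights"),
    Utterance(2.2, 2.6, "please"),   # short silence before: merged backward
    Utterance(5.0, 7.0, "what time is it"),
    Utterance(10.0, 10.4, "okay"),   # long silence on both sides: kept alone
]
merged = merge_one_word_utterances(example)
```

With the example segments, "please" is joined to the sentence before it, while "okay" stays a separate utterance because its nearest neighbor lies beyond the assumed threshold. The resulting list would then stand in for the modified training dataset used to train the model.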