Methods and systems for cockpit speech recognition acoustic model training with multi-level corpus data augmentation

    Publication No.: US10997967B2

    Publication date: 2021-05-04

    Application No.: US16388647

    Filing date: 2019-04-18

    Abstract: A method for initializing a device for performing acoustic speech recognition (ASR) using an ASR model, by a computer system including at least one processor and a system memory element. The method includes obtaining a plurality of voice data articulations of predetermined phrases, by the at least one processor via a user interface. The plurality of voice data articulations includes a first quantity of audio samples of actual articulated voice data, and each of the plurality of voice data articulations includes one of the audio samples including acoustic frequency components. The method further includes performing a plurality of augmentations to the plurality of voice data articulations of predetermined phrases, to generate a corpus audio data set that includes the first quantity of audio samples and a second quantity of audio samples including augmented versions of the first quantity of audio samples.
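    The corpus-building step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not say which augmentations are applied, so additive noise and speed change are assumptions chosen for the example, and the names `add_noise`, `change_speed`, and `build_corpus` and all parameter values are hypothetical. The sketch assumes NumPy is available and that each audio sample is a one-dimensional float array.

```python
import numpy as np


def add_noise(sample, snr_db, rng):
    """Return the sample with white Gaussian noise added at the given SNR (dB).

    Assumption: additive noise is one plausible augmentation; the abstract
    does not name it.
    """
    signal_power = np.mean(sample ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=sample.shape)
    return sample + noise


def change_speed(sample, factor):
    """Resample by linear interpolation so playback is `factor` times faster.

    Assumption: speed change is a second illustrative augmentation.
    """
    n_out = int(round(len(sample) / factor))
    old_idx = np.arange(len(sample))
    new_idx = np.linspace(0, len(sample) - 1, n_out)
    return np.interp(new_idx, old_idx, sample)


def build_corpus(originals, snrs_db=(20.0, 10.0), speeds=(0.9, 1.1), seed=0):
    """Return the original samples followed by their augmented versions.

    The originals are the "first quantity" of audio samples; the augmented
    copies are the "second quantity".
    """
    rng = np.random.default_rng(seed)
    augmented = []
    for sample in originals:
        for snr_db in snrs_db:
            augmented.append(add_noise(sample, snr_db, rng))
        for factor in speeds:
            augmented.append(change_speed(sample, factor))
    return list(originals) + augmented


if __name__ == "__main__":
    # Two synthetic one-second "recordings" at 16 kHz stand in for real voice data.
    t = np.linspace(0.0, 1.0, 16000, endpoint=False)
    originals = [np.sin(2 * np.pi * 220 * t), np.sin(2 * np.pi * 440 * t)]
    corpus = build_corpus(originals)
    print(len(originals), "original samples,", len(corpus) - len(originals), "augmented")
```

    With the default settings each original sample yields four augmented versions, so the corpus holds the first quantity of samples unchanged plus a second quantity four times as large.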

    METHODS AND SYSTEMS FOR COCKPIT SPEECH RECOGNITION ACOUSTIC MODEL TRAINING WITH MULTI-LEVEL CORPUS DATA AUGMENTATION

    Publication No.: US20200335084A1

    Publication date: 2020-10-22

    Application No.: US16388647

    Filing date: 2019-04-18

    Abstract: A method for initializing a device for performing acoustic speech recognition (ASR) using an ASR model, by a computer system including at least one processor and a system memory element. The method includes obtaining a plurality of voice data articulations of predetermined phrases, by the at least one processor via a user interface. The plurality of voice data articulations includes a first quantity of audio samples of actual articulated voice data, and each of the plurality of voice data articulations includes one of the audio samples including acoustic frequency components. The method further includes performing a plurality of augmentations to the plurality of voice data articulations of predetermined phrases, to generate a corpus audio data set that includes the first quantity of audio samples and a second quantity of audio samples including augmented versions of the first quantity of audio samples.
