TRAINING END-TO-END SPOKEN LANGUAGE UNDERSTANDING SYSTEMS WITH UNORDERED ENTITIES

    Publication No.: US20230081306A1

    Publication date: 2023-03-16

    Application No.: US17458772

    Filing date: 2021-08-27

    IPC classification: G10L15/22 G10L15/16 G06N3/08

    Abstract: Training data can be received, which can include pairs of speech and a meaning representation associated with the speech as ground truth data. The meaning representation includes at least semantic entities associated with the speech, where the spoken order of the semantic entities is unknown. The semantic entities of the meaning representation in the training data can be reordered into the spoken order of the associated speech using an alignment technique. A spoken language understanding machine learning model can be trained using the pairs of speech and meaning representation having the reordered semantic entities. The meaning representation, e.g., the semantic entities, in the received training data can be perturbed to create random-order sequence variations of the semantic entities associated with the speech. The perturbed meaning representations, with their associated speech, can augment the training data.
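The two data-preparation steps in this abstract (reordering entities into spoken order, and perturbing entity order for augmentation) can be illustrated with a minimal Python sketch. It assumes an aligner has already produced a start time for each entity; the function names, the `(label, value)` tuple layout, and the example times are hypothetical and are not taken from the patent.

```python
import random


def reorder_by_alignment(entities, start_times):
    """Return the entities sorted by aligned start time.

    entities:    list of (label, value) tuples in arbitrary order.
    start_times: start time in seconds for each entity, same indexing
                 (assumed to come from some alignment technique).
    """
    order = sorted(range(len(entities)), key=lambda i: start_times[i])
    return [entities[i] for i in order]


def perturb_entity_order(entities, n_variants, seed=0):
    """Return n_variants randomly shuffled copies of the entity list."""
    rng = random.Random(seed)  # seeded so the augmentation is repeatable
    variants = []
    for _ in range(n_variants):
        shuffled = list(entities)
        rng.shuffle(shuffled)
        variants.append(shuffled)
    return variants


# Hypothetical utterance: "from Denver to Boston on Monday"
entities = [("toCity", "Boston"), ("fromCity", "Denver"), ("date", "Monday")]
start_times = [1.9, 0.8, 2.7]

spoken_order = reorder_by_alignment(entities, start_times)
augmented = perturb_entity_order(entities, n_variants=3)
```

Each shuffled variant would be paired with the same speech to enlarge the training set, while the reordered list serves as the ground truth in spoken order.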

    Transliteration based data augmentation for training multilingual ASR acoustic models in low resource settings

    Publication No.: US11568858B2

    Publication date: 2023-01-31

    Application No.: US17073337

    Filing date: 2020-10-17

    IPC classification: G10L15/06 G10L15/16

    Abstract: A computer-implemented method of building a multilingual acoustic model for automatic speech recognition in a low resource setting includes training a multilingual network on a set of training languages with original transcribed training data to create a baseline multilingual acoustic model. Transliteration of the transcribed training data is performed by processing a plurality of multilingual data types from the set of languages through the multilingual network and outputting a pool of transliterated data. A filtering metric is applied to the pool of transliterated data to select one or more portions of the transliterated data for retraining of the acoustic model. Data augmentation is performed by adding the one or more selected portions of the transliterated data back to the original transcribed training data to update the training data. A new multilingual acoustic model is then trained through the multilingual network using the updated training data.
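The filter-then-augment step described in this abstract can be sketched as follows. The abstract does not say which filtering metric is used, so the per-utterance `score` field and the threshold below are assumptions made for illustration, as are the function names and record layout.

```python
def select_transliterations(pool, min_score):
    """Keep transliterated utterances whose (hypothetical) score passes the threshold."""
    return [item for item in pool if item["score"] >= min_score]


def augment_training_data(original, pool, min_score):
    """Return the original data plus the selected transliterated utterances."""
    selected = select_transliterations(pool, min_score)
    added = [{"audio": s["audio"], "text": s["text"]} for s in selected]
    return original + added


# Hypothetical records; file names and transcripts are placeholders.
original = [{"audio": "utt1.wav", "text": "hello"}]
pool = [
    {"audio": "utt2.wav", "text": "halo", "score": 0.9},
    {"audio": "utt3.wav", "text": "xx", "score": 0.2},
]

updated = augment_training_data(original, pool, min_score=0.5)
```

The updated list is what would be fed back into retraining; the original list is left unmodified so the baseline data stays available.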

    MULTI-MODAL LUNG CAPACITY MEASUREMENT FOR RESPIRATORY ILLNESS PREDICTION

    Publication No.: US20220110542A1

    Publication date: 2022-04-14

    Application No.: US17065936

    Filing date: 2020-10-08

    IPC classification: A61B5/091 A61B5/00 G06N3/08

    Abstract: Determining the lung capacity of a user includes capturing an audio waveform of the user performing an utterance presented to the user. A video of the user performing the utterance can also be captured. The captured audio waveform and the video are analyzed for compliance. Based on the audio waveform, an indicator of respiratory function is determined. The indicator is compared with a reference indicator to determine the health of the user. A machine learning model, such as a neural network, can be trained to predict the indicator of respiratory function based on input features comprising audio spectral and temporal characteristics of utterances. Determining the indicator of respiratory function can include running the trained machine learning model.
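As a rough illustration of the "temporal characteristics" and the comparison against a reference indicator, the sketch below computes two simple frame-based features and applies a tolerance check. The choice of features (voiced duration and mean RMS energy), the energy threshold, and the tolerance are all assumptions; the abstract does not specify them, and the trained model itself is not shown.

```python
import math


def temporal_features(samples, sample_rate, frame_len=400, energy_threshold=0.01):
    """Return (voiced_seconds, mean_rms) for a mono waveform given as floats."""
    voiced_frames = 0
    rms_values = []
    # Non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        rms_values.append(rms)
        if rms >= energy_threshold:
            voiced_frames += 1
    voiced_seconds = voiced_frames * frame_len / sample_rate
    mean_rms = sum(rms_values) / len(rms_values) if rms_values else 0.0
    return voiced_seconds, mean_rms


def within_reference(indicator, reference, tolerance=0.2):
    """True if the indicator is no more than `tolerance` (a fraction) below the reference."""
    return indicator >= reference * (1.0 - tolerance)


# Synthetic waveform: 0.5 s of constant amplitude followed by 0.5 s of silence.
samples = [0.5] * 8000 + [0.0] * 8000
voiced_seconds, mean_rms = temporal_features(samples, sample_rate=16000)
```

In a fuller pipeline, features like these (together with spectral ones) would be inputs to the trained model, and the model's predicted indicator would be what gets compared with the reference.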

    DENOISING A SIGNAL
    Status: Invention application, under examination (published)

    Publication No.: US20190237090A1

    Publication date: 2019-08-01

    Application No.: US16379667

    Filing date: 2019-04-09

    IPC classification: G10L21/0208

    CPC classification: G10L21/0208

    Abstract: A computer-implemented method according to one embodiment includes: creating a clean dictionary utilizing a clean signal; creating a noisy dictionary utilizing a first noisy signal; determining a time-varying projection utilizing the clean dictionary and the noisy dictionary; denoising a second noisy signal utilizing the time-varying projection; and expanding the clean dictionary and the noisy dictionary by updating them to include new clean spectro-temporal building blocks and new noisy spectro-temporal building blocks created utilizing additional clean and noisy signals.
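The abstract does not define the time-varying projection, so the following Python sketch substitutes a deliberately simplified stand-in: a nearest-atom lookup between index-paired dictionaries, applied frame by frame. The pairing of clean and noisy atoms by index, the function names, and the toy two-bin frames are all assumptions made for illustration.

```python
def nearest_atom(frame, atoms):
    """Index of the atom with the smallest squared Euclidean distance to the frame."""
    best_index, best_dist = 0, float("inf")
    for i, atom in enumerate(atoms):
        dist = sum((a - b) ** 2 for a, b in zip(frame, atom))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def denoise(noisy_frames, clean_dict, noisy_dict):
    """Replace each noisy frame with the clean atom paired with its nearest noisy atom."""
    return [clean_dict[nearest_atom(f, noisy_dict)] for f in noisy_frames]


def expand(clean_dict, noisy_dict, new_clean, new_noisy):
    """Return both dictionaries extended with new index-paired building blocks."""
    if len(new_clean) != len(new_noisy):
        raise ValueError("new clean and noisy atoms must be paired")
    return clean_dict + new_clean, noisy_dict + new_noisy


# Toy spectral frames with two frequency bins each.
clean_dict = [[1.0, 0.0], [0.0, 1.0]]
noisy_dict = [[1.2, 0.3], [0.2, 1.1]]

denoised = denoise([[1.1, 0.2], [0.1, 0.9]], clean_dict, noisy_dict)
clean_dict2, noisy_dict2 = expand(clean_dict, noisy_dict, [[0.5, 0.5]], [[0.7, 0.6]])
```

Because the mapping is chosen independently for every frame, it varies over time, which is the only sense in which this sketch mirrors the time-varying projection named in the abstract.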

    SOFT LABEL GENERATION FOR KNOWLEDGE DISTILLATION

    Publication No.: US20190205748A1

    Publication date: 2019-07-04

    Application No.: US15860097

    Filing date: 2018-01-02

    IPC classification: G06N3/08

    CPC classification: G06N3/08

    Abstract: A technique for generating soft labels for training is disclosed. In the method, a teacher model having a teacher-side class set is prepared. A collection of class pairs for respective data units is obtained. Each class pair includes the classes labelled to a corresponding data unit, one from the teacher-side class set and one from a student-side class set that is different from the teacher-side class set. A training input is fed into the teacher model to obtain a set of outputs for the teacher-side class set. A set of soft labels for the student-side class set is calculated from the set of outputs by using, for each member of the student-side class set, at least an output obtained for a class within a subset of the teacher-side class set that has relevance to that member, based at least in part on observations in the collection of class pairs.
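One way to picture the soft-label calculation is the Python sketch below. It treats a teacher class as relevant to a student class whenever the two were observed together in a class pair, and it sums and renormalizes the teacher outputs over each relevant subset. Both the relevance rule and the sum-then-normalize combination are assumptions; the abstract only requires that at least one output from the relevant subset be used.

```python
from collections import defaultdict


def relevant_teacher_classes(class_pairs):
    """Map each student class to the teacher classes observed alongside it."""
    relevant = defaultdict(set)
    for teacher_cls, student_cls in class_pairs:
        relevant[student_cls].add(teacher_cls)
    return relevant


def soft_labels(teacher_outputs, class_pairs, student_classes):
    """Return a normalized soft label for every student class."""
    relevant = relevant_teacher_classes(class_pairs)
    raw = {
        s: sum(teacher_outputs.get(t, 0.0) for t in relevant.get(s, ()))
        for s in student_classes
    }
    total = sum(raw.values())
    if total == 0.0:
        # No relevant teacher mass at all: fall back to a uniform distribution.
        return {s: 1.0 / len(student_classes) for s in student_classes}
    return {s: value / total for s, value in raw.items()}


# Hypothetical class names: teacher classes a1, a2, b1; student classes A, B.
teacher_outputs = {"a1": 0.5, "a2": 0.25, "b1": 0.25}
class_pairs = [("a1", "A"), ("a2", "A"), ("b1", "B")]

labels = soft_labels(teacher_outputs, class_pairs, ["A", "B"])
```

With these inputs, student class A collects the teacher mass of a1 and a2, and B collects that of b1, so the student receives a distribution over its own class set even though the teacher never predicts those classes directly.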