Speech model retrieval in distributed speech recognition systems

    Publication number: US10152973B2

    Publication date: 2018-12-11

    Application number: US14942551

    Filing date: 2015-11-16

    Abstract: Features are disclosed for managing the use of speech recognition models and data in automated speech recognition systems. Models and data may be retrieved asynchronously and used as they are received or after an utterance is initially processed with more general or different models. Once received, the models and statistics can be cached. Statistics needed to update models and data may also be retrieved asynchronously so that they may be used to update the models and data as they become available. The updated models and data may be immediately used to re-process an utterance, or saved for use in processing subsequently received utterances. User interactions with the automated speech recognition system may be tracked in order to predict when a user is likely to utilize the system. Models and data may be pre-cached based on such predictions.
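
    The retrieve-asynchronously, decode-now, re-process-later flow described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: `fetch_user_model`, `recognize`, and `MODEL_CACHE` are hypothetical names, the "models" are plain strings, and the network call is simulated with a short sleep. None of this is taken from the patent itself.

```python
import asyncio

# Hypothetical in-memory cache keyed by user ID (an assumption, not from the patent).
MODEL_CACHE: dict[str, str] = {}


async def fetch_user_model(user_id: str) -> str:
    """Stand-in for a network call that retrieves a user-specific model."""
    await asyncio.sleep(0.01)  # simulated retrieval latency
    return f"model-for-{user_id}"


def recognize(utterance: str, model: str) -> str:
    """Placeholder recognizer: tags the text with the model that was used."""
    return f"{utterance} [{model}]"


async def process_utterance(user_id: str, utterance: str) -> list[str]:
    """Return the initial result and, if a specific model arrives, a re-processed one."""
    if user_id in MODEL_CACHE:
        # The model was cached by an earlier utterance, so use it right away.
        return [recognize(utterance, MODEL_CACHE[user_id])]

    # Start retrieval in the background, then decode with the general model.
    fetch_task = asyncio.create_task(fetch_user_model(user_id))
    results = [recognize(utterance, "general-model")]

    # When the specific model arrives, cache it and re-process the utterance.
    model = await fetch_task
    MODEL_CACHE[user_id] = model
    results.append(recognize(utterance, model))
    return results


first = asyncio.run(process_utterance("alice", "play music"))
second = asyncio.run(process_utterance("alice", "stop"))
```

    In this sketch the first utterance yields two results (general model, then user-specific model), while the second utterance is served directly from the cache.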

    Estimating false rejection rate in a detection system
    Type: granted invention patent (in force)

    Publication number: US09589560B1

    Publication date: 2017-03-07

    Application number: US14135309

    Filing date: 2013-12-19

    CPC classification number: G10L15/01 G06K9/6277

    Abstract: Features are disclosed for estimating a false rejection rate in a detection system. The false rejection rate can be estimated by fitting a model to a distribution of detection confidence scores. An estimated false rejection rate can then be computed for confidence scores that fall below a threshold. The false rejection rate and model can be verified once the detection system has been deployed by obtaining additional data with confidence scores falling below the threshold. Adjustments to the model or other operational parameters can be implemented based on the verified false rejection rate, model, or additional data.
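
    As a rough illustration of fitting a model to confidence scores and reading off the mass below a threshold, the sketch below fits a single Gaussian. The Gaussian is an assumption made here for simplicity; the abstract does not say which model family is used. The function name and the sample scores are also made up for the example, and the sample is assumed to contain scores for true target events only.

```python
from statistics import NormalDist


def estimate_false_rejection_rate(scores: list[float], threshold: float) -> float:
    """Fit a Gaussian to the scores and return the probability mass below threshold."""
    fitted = NormalDist.from_samples(scores)  # needs at least two scores
    return fitted.cdf(threshold)


# Made-up confidence scores for events that should have been detected.
scores = [0.62, 0.71, 0.75, 0.78, 0.80, 0.83, 0.85, 0.88, 0.91, 0.95]

# Estimated share of true events whose score would fall below each threshold.
frr_low = estimate_false_rejection_rate(scores, threshold=0.6)
frr_high = estimate_false_rejection_rate(scores, threshold=0.7)
```

    Raising the threshold increases the estimated false rejection rate, which is the trade-off such an estimate is meant to expose before verification data is collected.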

    Anchored speech detection and speech recognition

    Publication number: US11514901B2

    Publication date: 2022-11-29

    Application number: US16437763

    Filing date: 2019-06-11

    Abstract: A system configured to process speech commands may classify incoming audio as desired speech, undesired speech, or non-speech. Desired speech is speech that is from a same speaker as reference speech. The reference speech may be obtained from a configuration session or from a first portion of input speech that includes a wakeword. The reference speech may be encoded using a recurrent neural network (RNN) encoder to create a reference feature vector. The reference feature vector and incoming audio data may be processed by a trained neural network classifier to label the incoming audio data (for example, frame-by-frame) as to whether each frame is spoken by the same speaker as the reference speech. The labels may be passed to an automatic speech recognition (ASR) component which may allow the ASR component to focus its processing on the desired speech.
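
    The idea of comparing each incoming frame against a reference vector can be illustrated with a much simpler stand-in. The sketch below replaces the RNN encoder with mean pooling and the trained neural classifier with a cosine-similarity threshold, and it only separates "same speaker" from "other" rather than the three classes named in the abstract. All function names, the toy two-dimensional frames, and the 0.9 threshold are assumptions.

```python
import math


def mean_vector(frames: list[list[float]]) -> list[float]:
    """Average the reference frames into one vector (stand-in for the RNN encoder)."""
    count = len(frames)
    return [sum(column) / count for column in zip(*frames)]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_frames(reference_frames, incoming_frames, min_similarity=0.9):
    """Label each incoming frame True if it resembles the reference speaker."""
    reference = mean_vector(reference_frames)
    return [cosine(reference, frame) >= min_similarity for frame in incoming_frames]


# Toy feature frames: the reference could come from the wakeword portion.
reference_frames = [[1.0, 0.1], [0.9, 0.2]]
incoming_frames = [[1.0, 0.2], [0.1, 1.0], [0.8, 0.1]]
labels = label_frames(reference_frames, incoming_frames)
```

    The resulting per-frame labels are the kind of signal that could be passed on to an ASR component so that it concentrates on the desired speech.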

    Detecting system-directed speech
    Type: granted invention patent

    Publication number: US11361763B1

    Publication date: 2022-06-14

    Application number: US15694348

    Filing date: 2017-09-01

    Abstract: A speech-processing system capable of receiving and processing audio data to determine if the audio data includes speech that was intended for the system. Non-system directed speech may be filtered out while system-directed speech may be selected for further processing. A system-directed speech detector may use a trained machine learning model (such as a deep neural network or the like) to process a feature vector representing a variety of characteristics of the incoming audio data, including the results of automatic speech recognition and/or other data. Using the feature vector the model may output an indicator as to whether the speech is system-directed. The system may also incorporate other filters such as voice activity detection prior to speech recognition, or the like.
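
    A minimal sketch of the feature-vector-to-indicator step might look like the following. It uses a logistic model with hand-picked weights in place of the trained deep neural network mentioned in the abstract, and a boolean voice-activity flag as the earlier filter. The three features, the weights, the bias, and the function name are all hypothetical.

```python
import math

# Hypothetical weights for [asr_confidence, looks_like_command, background_speech_ratio].
# A real detector would learn these from data.
WEIGHTS = [4.0, 2.0, -3.0]
BIAS = -2.5


def is_system_directed(features: list[float], voice_active: bool,
                       threshold: float = 0.5) -> bool:
    """Return True if the audio is judged to be directed at the system."""
    if not voice_active:
        # Voice activity detection acts as an earlier, cheaper filter.
        return False
    score = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    probability = 1.0 / (1.0 + math.exp(-score))  # logistic output in (0, 1)
    return probability >= threshold


directed = is_system_directed([0.9, 1.0, 0.1], voice_active=True)
background = is_system_directed([0.3, 0.0, 0.8], voice_active=True)
silence = is_system_directed([0.9, 1.0, 0.1], voice_active=False)
```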

    Device-directed utterance detection

    Publication number: US20210295833A1

    Publication date: 2021-09-23

    Application number: US16822744

    Filing date: 2020-03-18

    Abstract: A speech interface device is configured to detect an interrupt event and process a voice command without detecting a wakeword. The device includes on-device interrupt architecture configured to detect when device-directed speech is present and send audio data to a remote system for speech processing. This architecture includes an interrupt detector that detects an interrupt event (e.g., device-directed speech) with low latency, enabling the device to quickly lower a volume of output audio and/or perform other actions in response to a potential voice command. In addition, the architecture includes a device directed classifier that processes an entire utterance and corresponding semantic information and detects device-directed speech with high accuracy. Using the device directed classifier, the device may reject the interrupt event and increase a volume of the output audio or may accept the interrupt event, causing the output audio to end and performing speech processing on the audio data.
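
    The two-stage behavior described here (a fast detector that lowers the volume, then a slower classifier that accepts or rejects the interrupt) can be sketched as a small control flow. The `Player` class, the score inputs, and both thresholds are assumptions made for illustration; the abstract does not specify how the detector or classifier produce their decisions.

```python
from dataclasses import dataclass


@dataclass
class Player:
    """Hypothetical audio output state."""
    volume: int = 10
    playing: bool = True


def handle_interrupt(player: Player, fast_score: float, full_score: float,
                     fast_threshold: float = 0.5,
                     accept_threshold: float = 0.8) -> str:
    """Apply the low-latency detector first, then the whole-utterance classifier."""
    if fast_score < fast_threshold:
        return "ignored"  # no interrupt event detected

    # Interrupt event: lower the volume right away.
    original_volume = player.volume
    player.volume = max(1, original_volume // 2)

    if full_score >= accept_threshold:
        # Classifier accepts: end the output audio and proceed to speech processing.
        player.playing = False
        return "accepted"

    # Classifier rejects: restore the volume and keep playing.
    player.volume = original_volume
    return "rejected"


accepted_player = Player()
accepted = handle_interrupt(accepted_player, fast_score=0.7, full_score=0.9)

rejected_player = Player()
rejected = handle_interrupt(rejected_player, fast_score=0.7, full_score=0.4)

ignored_player = Player()
ignored = handle_interrupt(ignored_player, fast_score=0.2, full_score=0.9)
```

    In a real device the two scores would arrive at different times, with the volume lowered before the full utterance has been classified; the sketch collapses that into one call for brevity.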
