LATENCY REDUCTION FOR MULTI-STAGE SPEECH RECOGNITION

    Publication No.: US20240274127A1

    Publication date: 2024-08-15

    Application No.: US18167763

    Filing date: 2023-02-10

    Abstract: Systems and techniques are provided for processing one or more audio samples. For example, a process can include receiving one or more audio samples in a first audio frame and determining, using a first keyword detection model, a first keyword detection score for the first audio frame. One or more audio samples can be received in additional audio frames. Based on the first keyword detection score exceeding a first threshold, the first keyword detection model can be used to determine a keyword detection score for each audio frame of the additional audio frames. The respective keyword detection score for each audio frame of the additional audio frames can be compared to a second threshold that is greater than the first threshold. Based on the respective keyword detection score exceeding the second threshold, using a second keyword detection model to process the first audio frame and the additional audio frames can be skipped.
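
The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names `detect_keyword`, `Decision`, `first_model`, `second_model`, and the three threshold parameters are hypothetical, and the models are passed in as plain scoring functions. The abstract does not say what happens when no additional frame exceeds the second threshold; the sketch assumes the second model then processes all buffered frames and its score is compared to a separate, assumed threshold.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Frame = Sequence[float]                      # one audio frame = a sequence of samples
FrameScorer = Callable[[Frame], float]       # first-stage model: scores a single frame
ClipScorer = Callable[[List[Frame]], float]  # second-stage model: scores all frames


@dataclass
class Decision:
    detected: bool            # whether the keyword was accepted
    used_second_model: bool   # whether the second stage actually ran


def detect_keyword(
    frames: Sequence[Frame],
    first_model: FrameScorer,
    second_model: ClipScorer,
    first_threshold: float,
    second_threshold: float,
    second_stage_threshold: float,  # assumption: not specified in the abstract
) -> Decision:
    # The abstract requires the second threshold to be greater than the first.
    if second_threshold <= first_threshold:
        raise ValueError("second_threshold must be greater than first_threshold")
    buffered = list(frames)
    if not buffered:
        return Decision(detected=False, used_second_model=False)

    first_frame, additional = buffered[0], buffered[1:]

    # Stage 1, first frame: a score at or below the first threshold ends processing.
    if first_model(first_frame) <= first_threshold:
        return Decision(detected=False, used_second_model=False)

    # Stage 1, additional frames: any score above the higher second threshold
    # accepts the keyword immediately, so the second model is skipped.
    for frame in additional:
        if first_model(frame) > second_threshold:
            return Decision(detected=True, used_second_model=False)

    # Assumed fallback: let the second model decide over every buffered frame.
    score = second_model(buffered)
    return Decision(detected=score > second_stage_threshold, used_second_model=True)
```

In this sketch the latency saving comes from the early return inside the loop: a confident first-stage score means the second model is never invoked for those frames.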
