-
Publication No.: US20230082955A1
Publication date: 2023-03-16
Application No.: US17447823
Filing date: 2021-09-16
Applicant: SoundHound, Inc.
Inventor: Timothy P. STONEHOCKER, Zizu GOWAYYED, Matthias EICHSTAEDT, Seyed Majid EMAMI, Evelyn JIANG, Ryan BERRYHILL, Mathieu RAMONA, Neil VEIRA
Abstract: A system for performing automated speech recognition (ASR) on audio data includes a queue manager to receive a request to perform ASR on audio data, add the request to a queue of incoming requests, and determine a queue depth representing a number of requests in the queue at a given time. The system also includes a load supervisor to receive the request and the queue depth from the queue manager and assign a service level for the request based on the queue depth. In addition, the system includes a speech-to-text converter to receive the assigned service level for the request from the load supervisor, select an ASR model for the request based on the received service level, receive the audio data associated with the request, and perform ASR on the audio data using the selected ASR model.
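The flow in this abstract (queue manager, load supervisor, speech-to-text converter) can be sketched roughly as follows. This is only an illustrative sketch, not the claimed implementation: the class and function names, the depth thresholds, the three service levels, and the model names are all hypothetical, and the choice to map deeper queues to smaller, faster models is an assumption the abstract does not state.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class AsrRequest:
    """Hypothetical request record: an ID plus the raw audio bytes."""
    request_id: str
    audio: bytes


class QueueManager:
    """Holds incoming requests and reports the current queue depth."""

    def __init__(self):
        self._queue = deque()

    def enqueue(self, request):
        """Add a request and return the queue depth after adding it."""
        self._queue.append(request)
        return len(self._queue)

    def dequeue(self):
        """Remove and return the oldest request."""
        return self._queue.popleft()


def assign_service_level(queue_depth, thresholds=(5, 20)):
    """Load-supervisor step: map queue depth to a service level.

    The thresholds are made-up values for illustration only.
    """
    low, high = thresholds
    if queue_depth <= low:
        return "high"
    if queue_depth <= high:
        return "medium"
    return "low"


# Assumed mapping from service level to ASR model name (hypothetical names).
MODEL_BY_LEVEL = {
    "high": "large-accurate-model",
    "medium": "medium-model",
    "low": "small-fast-model",
}


def select_model(service_level):
    """Speech-to-text step: pick the ASR model for the assigned level."""
    return MODEL_BY_LEVEL[service_level]
```

In this sketch, a request that arrives at an empty queue gets the "high" level and the larger model, while a request that arrives behind more than 20 others is routed to the smaller model.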
-
Publication No.: US20210335340A1
Publication date: 2021-10-28
Application No.: US17224967
Filing date: 2021-04-07
Applicant: SoundHound, Inc.
Inventor: Zizu GOWAYYED, Keyvan MOHAJER
Abstract: A method and system for acoustic model conditioning on non-phoneme information features for optimized automatic speech recognition is provided. The method includes using an encoder model to encode a sound embedding from a known key phrase of speech and conditioning an acoustic model with the sound embedding to optimize its performance in inferring the probabilities of phonemes in the speech. The sound embedding can comprise non-phoneme information related to the key phrase and the following utterance. Further, the encoder model and the acoustic model can be neural networks that are jointly trained with audio data.
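One simple way to picture the conditioning described in this abstract is to concatenate the key-phrase embedding onto each utterance frame before the acoustic model scores phonemes. The sketch below is a toy illustration under that assumption. The mean-pooling encoder, the single linear layer, and all function names are hypothetical stand-ins for the jointly trained neural networks the abstract mentions, and no training is shown.

```python
import math


def encode_key_phrase(frames):
    """Stand-in encoder: mean-pool key-phrase feature frames into one embedding."""
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / len(frames) for i in range(dim)]


def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def phoneme_probabilities(frame, embedding, weights, bias):
    """Toy acoustic model: one linear layer over [frame ; embedding], then softmax.

    Concatenation is an assumed conditioning mechanism, chosen for simplicity.
    `weights` has one row per phoneme; each row spans the concatenated input.
    """
    conditioned = frame + embedding  # list concatenation, not element-wise add
    logits = [
        sum(w * x for w, x in zip(row, conditioned)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)
```

Because the embedding is part of every frame's input, the same utterance frame can yield different phoneme probabilities depending on how the key phrase was spoken.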
-
Publication No.: US20230352000A1
Publication date: 2023-11-02
Application No.: US18348259
Filing date: 2023-07-06
Applicant: SoundHound, Inc.
Inventor: Zizu GOWAYYED, Keyvan MOHAJER
CPC classification number: G10L15/02, G10L15/04, G10L15/22, G10L2015/025
Abstract: A method and system for acoustic model conditioning on non-phoneme information features for optimized automatic speech recognition is provided. The method includes using an encoder model to encode a sound embedding from a known key phrase of speech and conditioning an acoustic model with the sound embedding to optimize its performance in inferring the probabilities of phonemes in the speech. The sound embedding can comprise non-phoneme information related to the key phrase and the following utterance. Further, the encoder model and the acoustic model can be neural networks that are jointly trained with audio data.
-