Patent search ap:("SoundHound Page Inc.") AND inv:"Sudharsan Krishnaswamy"

1.

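Invention application
Neural Speech-to-Meaning [Active]

Publication (announcement) number: US20210174806A1

Publication (announcement) date: 2021-06-10

Application number: US16703783

Filing date: 2019-12-04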

Applicant: SoundHound, Inc.

Inventor: Sudharsan Krishnaswamy , Maisy Wieman , Jonah Probell

IPC: G10L15/26 , G06F3/16 , G10L15/22 , G10L15/30 , G10L15/18 , G10L15/183

Abstract: A neural speech-to-meaning system is trained on speech audio expressing specific intents. The system receives speech audio and produces indications of when the speech in the audio matches the intent. Intents may include variables that can have a large range of values, such as the names of places. The neural speech-to-meaning system simultaneously recognizes enumerated values of variables and general intents. Recognized variable values can serve as arguments to API requests made in response to recognized intents. Accordingly, neural speech-to-meaning supports voice virtual assistants that serve users based on API hits.
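As an illustration of the last step in this abstract, recognized variable values serving as arguments to an API request, here is a minimal sketch. It assumes the recognizer outputs one probability per intent and one probability per enumerated variable value. The function name `build_api_request`, the dictionary layout, and the 0.5 threshold are hypothetical and are not taken from the patent.

```python
from typing import Dict, Optional


def build_api_request(
    intent_probs: Dict[str, float],
    variable_probs: Dict[str, Dict[str, float]],
    threshold: float = 0.5,  # assumed cutoff, not specified in the abstract
) -> Optional[dict]:
    """Turn recognizer outputs into an API request description.

    intent_probs maps intent names to probabilities.
    variable_probs maps each variable name to a dict of
    enumerated value -> probability.
    Returns None when no intent is confident enough.
    """
    # Pick the most probable intent.
    intent, p_intent = max(intent_probs.items(), key=lambda kv: kv[1])
    if p_intent < threshold:
        return None

    # For each variable, keep its most probable enumerated value
    # if that value is confident enough to use as an argument.
    args = {}
    for name, value_probs in variable_probs.items():
        value, p_value = max(value_probs.items(), key=lambda kv: kv[1])
        if p_value >= threshold:
            args[name] = value

    return {"intent": intent, "args": args}


request = build_api_request(
    {"weather": 0.9, "music": 0.1},
    {"place": {"Paris": 0.8, "London": 0.1}},
)
```

In this sketch the intent and the variable values are read from the same set of outputs, which mirrors the abstract's point that both are recognized simultaneously; how the actual system structures its outputs is not stated here.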

2.

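Granted invention patent
Synthesizing speech recognition training data [Active]

Publication (announcement) number: US11308938B2

Publication (announcement) date: 2022-04-19

Application number: US16704216

Filing date: 2019-12-05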

Applicant: SoundHound, Inc.

Inventor: Maisy Wieman , Jonah Probell , Sudharsan Krishnaswamy

IPC: G10L15/22 , G10L15/06 , G10L15/16 , G10L15/18 , G10L13/02 , G10L15/197 , G10L15/187

Abstract: To train a speech recognizer, such as for recognizing variables in a neural speech-to-meaning system, compute, within an embedding space, a range of vectors of features of natural speech. Generate parameter sets for speech synthesis and synthesize speech according to the parameters. Analyze the synthesized speech to compute vectors in the embedding space. Using a cost function that favors an even spread (minimal clustering), generate a multiplicity of speech synthesis parameter sets. Using the multiplicity of parameter sets, generate a multiplicity of speech of known words that can be used as training data for speech recognition.
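The idea of a cost function that favors an even spread can be sketched as follows. This is one possible reading, not the patent's method: the cost is assumed to be the sum of inverse pairwise distances between embedding vectors, and parameter sets are picked greedily to keep that cost low. The names `clustering_cost`, `select_spread`, and `embed` are hypothetical, and `embed` stands in for synthesizing speech from a parameter set and analyzing it into the embedding space.

```python
import math
from typing import Callable, List, Sequence, Tuple

Params = Tuple[float, ...]  # a hypothetical synthesis parameter set


def clustering_cost(vectors: List[Sequence[float]]) -> float:
    """Sum of inverse pairwise distances: grows when points crowd together."""
    cost = 0.0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            # Small epsilon avoids division by zero for identical vectors.
            cost += 1.0 / (math.dist(vectors[i], vectors[j]) + 1e-9)
    return cost


def select_spread(
    candidates: List[Params],
    embed: Callable[[Params], Sequence[float]],
    k: int,
) -> List[Params]:
    """Greedily choose up to k parameter sets with evenly spread embeddings."""
    chosen = [candidates[0]]
    remaining = list(candidates[1:])
    while len(chosen) < k and remaining:
        # Add the candidate that increases the clustering cost the least.
        best = min(
            remaining,
            key=lambda c: clustering_cost([embed(p) for p in chosen + [c]]),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy usage: the embedding is the identity, so spread is measured
# directly in parameter space.
candidates = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (0.5, 0.0)]
picked = select_spread(candidates, embed=lambda p: p, k=3)
```

With these toy candidates the near-duplicate point (0.1, 0.0) is the one left out, which is the behavior an even-spread cost is meant to encourage.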

3.

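Granted invention patent
Meaning inference from speech audio [Active]

Publication (announcement) number: US11769488B2

Publication (announcement) date: 2023-09-26

Application number: US17653365

Filing date: 2022-03-03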

Applicant: SoundHound, Inc.

Inventor: Sudharsan Krishnaswamy , Maisy Wieman , Jonah Probell

IPC: G10L15/06 , G10L15/16 , G10L15/18 , G10L13/02 , G10L15/197 , G10L15/22 , G10L15/187

CPC classification number: G10L15/063 , G10L13/02 , G10L15/16 , G10L15/187 , G10L15/1815 , G10L15/197 , G10L15/22

Abstract: A system and method invoke a virtual assistant action, which may comprise an argument. From audio, a probability of an intent is inferred. A probability of a domain and a plurality of variable values may also be inferred. Invoking the action is in response to the intent probability exceeding a threshold. Invoking the action may also be in response to the domain probability exceeding a threshold, a variable value probability exceeding a threshold, detecting an end of utterance, and a specific amount of time having elapsed. The intent probability may increase when the audio includes speech of words with the same meaning in multiple natural languages. Invoking the action may also be conditional on the variable value probability exceeding its threshold within a certain period of time of the intent probability exceeding its threshold.
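The threshold and timing conditions in this abstract can be sketched as a small gating function. This is an illustration under stated assumptions only: probabilities are assumed to arrive as a sequence of timestamped frames, and the `Frame` class, the `invocation_time` function, the threshold values, and the 2-second window are all hypothetical rather than taken from the patent. Domain probability and end-of-utterance detection are left out for brevity.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Frame:
    time: float           # seconds since the start of the audio
    intent_prob: float    # inferred probability of the intent
    variable_prob: float  # inferred probability of a variable value


def invocation_time(
    frames: Iterable[Frame],
    intent_threshold: float = 0.8,    # assumed value
    variable_threshold: float = 0.7,  # assumed value
    window: float = 2.0,              # assumed time window in seconds
) -> Optional[float]:
    """Return the time at which the action would be invoked, or None.

    The action fires once both probabilities have exceeded their
    thresholds and the two crossings lie within `window` seconds.
    """
    intent_time = None
    variable_time = None
    for f in frames:
        # Remember the most recent time each threshold was exceeded.
        if f.intent_prob > intent_threshold:
            intent_time = f.time
        if f.variable_prob > variable_threshold:
            variable_time = f.time
        if intent_time is not None and variable_time is not None:
            if abs(variable_time - intent_time) <= window:
                return f.time
    return None


frames = [
    Frame(0.0, 0.2, 0.1),
    Frame(0.5, 0.9, 0.3),   # intent crosses its threshold
    Frame(1.0, 0.6, 0.75),  # variable value crosses within the window
]
fired_at = invocation_time(frames)
```

Tracking the most recent crossing rather than the first is a design choice of this sketch; the abstract does not say which one the claimed system uses.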

4.

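Granted invention patent
Neural speech-to-meaning [Active]

Publication (announcement) number: US11749281B2

Publication (announcement) date: 2023-09-05

Application number: US16703783

Filing date: 2019-12-04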

Applicant: SoundHound, Inc.

Inventor: Sudharsan Krishnaswamy , Maisy Wieman , Jonah Probell

IPC: G10L15/26 , G06F3/16 , G10L15/18 , G10L15/183 , G10L15/22 , G10L15/30

CPC classification number: G10L15/26 , G06F3/167 , G10L15/183 , G10L15/1815 , G10L15/22 , G10L15/30 , G10L2015/223

Abstract: A neural speech-to-meaning system is trained on speech audio expressing specific intents. The system receives speech audio and produces indications of when the speech in the audio matches the intent. Intents may include variables that can have a large range of values, such as the names of places. The neural speech-to-meaning system simultaneously recognizes enumerated values of variables and general intents. Recognized variable values can serve as arguments to API requests made in response to recognized intents. Accordingly, neural speech-to-meaning supports voice virtual assistants that serve users based on API hits.

5.

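Invention application
Synthesizing Speech Recognition Training Data [Active]

Publication (announcement) number: US20210174783A1

Publication (announcement) date: 2021-06-10

Application number: US16704216

Filing date: 2019-12-05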

Applicant: SoundHound, Inc.

Inventor: Maisy Wieman , Jonah Probell , Sudharsan Krishnaswamy

IPC: G10L15/06 , G10L15/16 , G10L15/18 , G10L15/187 , G10L15/197 , G10L15/22 , G10L13/02

Abstract: To train a speech recognizer, such as for recognizing variables in a neural speech-to-meaning system, compute, within an embedding space, a range of vectors of features of natural speech. Generate parameter sets for speech synthesis and synthesize speech according to the parameters. Analyze the synthesized speech to compute vectors in the embedding space. Using a cost function that favors an even spread (minimal clustering), generate a multiplicity of speech synthesis parameter sets. Using the multiplicity of parameter sets, generate a multiplicity of speech of known words that can be used as training data for speech recognition.
