DISTRIBUTED TRAINING OF NEURAL NETWORK MODELS

    Publication Number: US20210182660A1

    Publication Date: 2021-06-17

    Application Number: US16716461

    Application Date: 2019-12-16

    Abstract: Systems and methods for distributed training of a neural network model are described. Various embodiments include a master device and a slave device. The master device has a first version of the neural network model. The slave device is communicatively coupled to a first data source and the master device, and the first data source is inaccessible by the master device, in accordance with one embodiment. The slave device is remote from the master device. The master device is configured to output first configuration data for the neural network model based on the first version of the neural network model. The slave device is configured to use the first configuration data to instantiate a second version of the neural network model. The slave device is configured to train the second version of the neural network model using data from the first data source and to output second configuration data for the neural network model. The master device is configured to use the second configuration data to update parameters for the first version of the neural network model.
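
The master/slave exchange described in this abstract can be pictured with a small sketch. The following Python code is illustrative only: it assumes in-memory message passing and a flat NumPy parameter vector, and the class and method names are hypothetical, not taken from the patent.

```python
import numpy as np

class MasterDevice:
    """Holds the first version of the model and merges updates from a slave."""
    def __init__(self, num_params):
        self.params = np.zeros(num_params)       # first version of the model

    def export_configuration(self):
        return self.params.copy()                # first configuration data

    def apply_update(self, slave_params, weight=0.5):
        # Update master parameters using the slave's second configuration data.
        self.params = (1 - weight) * self.params + weight * slave_params

class SlaveDevice:
    """Trains a local copy of the model on data the master cannot access."""
    def __init__(self, local_data):
        self.local_data = local_data             # first data source, private to the slave
        self.params = None

    def instantiate(self, configuration):
        self.params = configuration.copy()       # second version of the model

    def train_locally(self, lr=0.1, epochs=5):
        # Toy gradient step: pull parameters toward the mean of the local data.
        for _ in range(epochs):
            grad = self.params - self.local_data.mean(axis=0)
            self.params -= lr * grad
        return self.params                       # second configuration data

master = MasterDevice(num_params=4)
slave = SlaveDevice(local_data=np.random.randn(100, 4) + 1.0)
slave.instantiate(master.export_configuration())
master.apply_update(slave.train_locally())
print(master.params)
```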

    Custom acoustic models
    Invention Grant

    Publication Number: US11011162B2

    Publication Date: 2021-05-18

    Application Number: US15996393

    Application Date: 2018-06-01

    Abstract: The technology disclosed relates to performing speech recognition for a plurality of different devices or devices in a plurality of conditions. This includes storing a plurality of acoustic models associated with different devices or device conditions, receiving speech audio including natural language utterances, receiving metadata indicative of a device type or device condition, selecting an acoustic model from the plurality in dependence upon the received metadata, and employing the selected acoustic model to recognize speech from the natural language utterances included in the received speech audio. Both speech recognition and the storage of acoustic models can be performed locally by devices or on a network-connected server. Also provided are a platform and interface used by device developers to select, configure, and/or train acoustic models for particular devices and/or conditions.
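
As one way to picture the model-selection step, the sketch below maps device metadata to an acoustic-model identifier. The registry contents, model identifiers, and function names are assumptions made for illustration, not details from the patent.

```python
# Hypothetical registry keyed by (device_type, condition).
ACOUSTIC_MODELS = {
    ("smart_speaker", "far_field"): "am_speaker_far_v2",
    ("headset", "quiet"): "am_headset_quiet_v1",
    ("car", "noisy"): "am_car_noise_v3",
}

def select_acoustic_model(metadata, default="am_generic_v1"):
    """Pick an acoustic model identifier based on the received device metadata."""
    key = (metadata.get("device_type"), metadata.get("condition"))
    return ACOUSTIC_MODELS.get(key, default)

def recognize(audio, metadata):
    model_id = select_acoustic_model(metadata)
    # A real system would load the selected model and decode the audio here.
    return f"decoded {len(audio)} samples with {model_id}"

print(recognize([0.0] * 16000, {"device_type": "car", "condition": "noisy"}))
```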

    System and Method for Voice Morphing

    Publication Number: US20210089626A1

    Publication Date: 2021-03-25

    Application Number: US16578386

    Application Date: 2019-09-22

    Inventor: Dylan H. Ross

    Abstract: A system and method for masking an identity of a speaker of natural language speech, such as speech clips to be labeled by humans in a system generating voice transcriptions for training an automatic speech recognition model. The natural language speech is morphed prior to being presented to the human for labeling. In one embodiment, morphing comprises pitch shifting the speech randomly either up or down, then frequency shifting the speech, then pitch shifting the speech in a direction opposite the first pitch shift.
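
The three-stage morphing named in this embodiment (random pitch shift, frequency shift, then a pitch shift in the opposite direction) could be sketched as follows, assuming librosa for pitch shifting and a single-sideband trick for the frequency shift. The shift amounts and function name are illustrative assumptions, not values from the patent.

```python
import numpy as np
import librosa
from scipy.signal import hilbert

def morph_voice(signal, sample_rate, semitones=3.0, freq_shift_hz=150.0, rng=None):
    """Mask speaker identity: pitch shift, frequency shift, then opposite pitch shift."""
    rng = rng or np.random.default_rng()
    direction = rng.choice([-1.0, 1.0])          # randomly shift up or down

    # 1) First pitch shift, in a random direction.
    shifted = librosa.effects.pitch_shift(signal, sr=sample_rate,
                                          n_steps=direction * semitones)

    # 2) Frequency shift via single-sideband modulation of the analytic signal.
    t = np.arange(len(shifted)) / sample_rate
    analytic = hilbert(shifted)
    shifted = np.real(analytic * np.exp(2j * np.pi * freq_shift_hz * t))

    # 3) Second pitch shift, opposite in direction to the first.
    return librosa.effects.pitch_shift(shifted, sr=sample_rate,
                                       n_steps=-direction * semitones)
```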

    DYNAMIC INTERPOLATION FOR HYBRID LANGUAGE MODELS

    Publication Number: US20210035569A1

    Publication Date: 2021-02-04

    Application Number: US16529730

    Application Date: 2019-08-01

    Abstract: In order to improve the accuracy of ASR, an utterance is transcribed using a plurality of language models, such as for example, an N-gram language model and a neural language model. The language models are trained separately. They each output a probability score or other figure of merit for a partial transcription hypothesis. Model scores are interpolated to determine a hybrid score. While recognizing an utterance, interpolation weights are chosen or updated dynamically, in the specific context of processing. The weights are based on dynamic variables associated with the utterance, the partial transcription hypothesis, or other aspects of context.
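
A minimal sketch of interpolating an N-gram score and a neural-model score with a dynamically chosen weight is shown below. The weighting rule tied to hypothesis length is invented purely for illustration; the patent describes weights derived from dynamic variables of the utterance and partial hypothesis, not this specific rule.

```python
import math

def interpolate_scores(ngram_logprob, neural_logprob, weight):
    """Hybrid score: log of the weighted mixture of the two model probabilities."""
    p = weight * math.exp(ngram_logprob) + (1.0 - weight) * math.exp(neural_logprob)
    return math.log(p)

def dynamic_weight(hypothesis_tokens, base=0.5):
    """Toy weighting rule: lean on the N-gram model early in the hypothesis,
    shifting toward the neural model as more context becomes available."""
    return max(0.1, base - 0.05 * len(hypothesis_tokens))

hypothesis = ["the", "cat", "sat"]
w = dynamic_weight(hypothesis)
print(interpolate_scores(math.log(0.02), math.log(0.05), w))
```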

    Unified embeddings for translation
    Invention Grant

    Publication Number: US10796107B2

    Publication Date: 2020-10-06

    Application Number: US16232984

    Application Date: 2018-12-26

    Inventor: Terry Kong

    Abstract: A method of training word embeddings is provided. The method includes determining anchors, each comprising a first word in a first domain and a second word in a second domain, training word embeddings for the first and second domains, and training a transform for transforming word embedding vectors in the first domain to word embedding vectors in the second domain, wherein the training minimizes a loss function that includes an anchor loss for each anchor, such that for each anchor, the anchor loss is based on a distance between the anchor's second word's embedding vector and the transform of the anchor's first word's embedding vector, and for each anchor, the anchor loss for the respective anchor is zero when the distance between the respective anchor's second word's embedding vector and the transform of the respective anchor's first word's embedding vector is less than a specific tolerance.
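
An anchor loss that is zero when the transformed source embedding lies within a tolerance of the target embedding could take a hinge-like form, as in the sketch below. The hinge formulation, the toy embeddings, and all names are assumptions made for illustration.

```python
import numpy as np

def anchor_loss(src_vec, tgt_vec, transform, tolerance=0.1):
    """Hinge-style anchor loss: zero when the transformed source embedding is
    within `tolerance` of the target embedding, otherwise the excess distance."""
    distance = np.linalg.norm(transform @ src_vec - tgt_vec)
    return max(0.0, distance - tolerance)

def total_anchor_loss(anchors, src_emb, tgt_emb, transform, tolerance=0.1):
    """Sum the anchor loss over all (source_word, target_word) anchor pairs."""
    return sum(anchor_loss(src_emb[s], tgt_emb[t], transform, tolerance)
               for s, t in anchors)

rng = np.random.default_rng(0)
src_emb = {"hund": rng.normal(size=8)}   # hypothetical source-domain embedding
tgt_emb = {"dog": rng.normal(size=8)}    # hypothetical target-domain embedding
W = np.eye(8)                            # transform between domains, to be learned
print(total_anchor_loss([("hund", "dog")], src_emb, tgt_emb, W))
```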

    System and methods for a virtual assistant to manage and use context in a natural language dialog

    Publication Number: US10418032B1

    Publication Date: 2019-09-17

    Application Number: US15163485

    Application Date: 2016-05-24

    Abstract: A dialog with a conversational virtual assistant includes a sequence of user queries and systems responses. Queries are received and interpreted by a natural language understanding system. Dialog context information gathered from user queries and system responses is stored in a layered context data structure. Incomplete queries, which do not have sufficient information to result in an actionable interpretation, become actionable with use of context data. The system recognizes the need to access context data, and retrieves from context layers information required to transform the query into an executable one. The system may then act on the query and provide an appropriate response to the user. Context data buffers forget information, perhaps selectively, with the passage of time, and after a sufficient number and type of intervening queries.
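
One way to picture a layered context store that forgets entries with the passage of time and with intervening turns is sketched below. The class name, slot scheme, and expiry rules are hypothetical, chosen only to make the idea concrete.

```python
import time

class LayeredContext:
    """Layered dialog-context store whose layers fade with time and turn count."""
    def __init__(self, max_age_seconds=300.0, max_turns=5):
        self.layers = []                     # newest layer first
        self.max_age = max_age_seconds
        self.max_turns = max_turns

    def push(self, slots):
        """Store context slots (e.g. {'city': 'Paris'}) from a query or response."""
        self.layers.insert(0, {"slots": slots, "time": time.time(), "turns": 0})

    def tick(self):
        """Call once per dialog turn; forget layers that are too old or too stale."""
        now = time.time()
        for layer in self.layers:
            layer["turns"] += 1
        self.layers = [l for l in self.layers
                       if now - l["time"] < self.max_age and l["turns"] <= self.max_turns]

    def resolve(self, slot):
        """Fill a missing slot in an incomplete query from the most recent layer."""
        for layer in self.layers:
            if slot in layer["slots"]:
                return layer["slots"][slot]
        return None

ctx = LayeredContext()
ctx.push({"city": "Paris"})                  # "What's the weather in Paris?"
ctx.tick()
print(ctx.resolve("city"))                   # "And tomorrow?" -> reuses Paris
```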

    USER-PROVIDED TRANSCRIPTION FEEDBACK AND CORRECTION

    Publication Number: US20190035385A1

    Publication Date: 2019-01-31

    Application Number: US16147889

    Application Date: 2018-10-01

    Abstract: A system, method, and non-transitory computer readable medium provide for a visual display of a user interface for a voice-based virtual assistant system. After displaying a transcription of user speech and performing requested actions, the system allows the user to provide, by speech or manual input, an indication of satisfaction or dissatisfaction. For transcription errors, the user is presented an opportunity to correct the transcription text. The system can present several transcription hypotheses to the user, and allow the user to choose among them, or to edit one of them, as the intended transcription. A back-end server system uses the corrected transcription to train a machine learning model to perform more accurate speech recognition or provide more useful actions for future users. A system can save one or more speech recognition transcription hypotheses and check corrected results against the other transcriptions to further improve models.
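
The choose-or-edit correction flow might be sketched as below. The function and variable names are hypothetical; in a full system the (corrected, original) pair would be sent to the back-end server as a supervised example for model training.

```python
def choose_correction(hypotheses, user_choice=None, user_edit=None):
    """Return the transcription the user confirms: the top hypothesis by default,
    an alternative hypothesis selected by index, or a free-form edited string."""
    if user_edit:                            # user typed a correction
        return user_edit
    if user_choice is not None:              # user picked an alternative hypothesis
        return hypotheses[user_choice]
    return hypotheses[0]                     # user accepted the top result

hypotheses = ["play some jazz", "play sun jazz", "play some jams"]
corrected = choose_correction(hypotheses, user_edit="play smooth jazz")
# The (corrected, hypotheses[0]) pair would be logged for retraining the model.
print(corrected)
```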
