-
Publication No.: US12154546B2
Publication date: 2024-11-26
Application No.: US18348259
Filing date: 2023-07-06
Applicant: SoundHound, Inc.
Inventor: Zizu Gowayyed, Keyvan Mohajer
Abstract: A method and system for acoustic model conditioning on non-phoneme information features for optimized automatic speech recognition is provided. The method includes using an encoder model to encode a sound embedding from a known key phrase of speech and conditioning an acoustic model with the sound embedding to optimize its performance in inferring the probabilities of phonemes in the speech. The sound embedding can comprise non-phoneme information related to the key phrase and the following utterance. Further, the encoder model and the acoustic model can be neural networks that are jointly trained with audio data.
-
Publication No.: US11776533B2
Publication date: 2023-10-03
Application No.: US17225997
Filing date: 2021-04-08
Applicant: SoundHound, Inc.
Inventor: Bernard Mont-Reynaud, Seyed M. Emami, Chris Wilson, Keyvan Mohajer
CPC classification number: G10L15/18, G06F8/31, G06F40/205, G10L15/06, G10L15/22, H04M3/4938
Abstract: A method of building a natural language understanding application is provided. The method includes receiving at least one electronic record containing programming code and creating executable code from the programming code. The executable code, when executed by a processor, causes the processor to create a parse and an interpretation of a sequence of input tokens. The programming code includes an interpret-block, and the interpret-block includes an interpret-statement. The interpret-statement, in turn, includes a pattern expression and an action statement.
-
Publication No.: US11367448B2
Publication date: 2022-06-21
Application No.: US17237003
Filing date: 2021-04-21
Applicant: SOUNDHOUND, INC.
Inventor: Keyvan Mohajer, Mehul Patel
Abstract: A method of providing a platform for configuring device-specific speech recognition is provided. The method includes providing a user interface for developers to select a set of at least two acoustic models appropriate for a specific type of a device, receiving, from a developer, a selection of the set of the at least two acoustic models, and configuring a speech recognition system to perform device-specific speech recognition by using one acoustic model selected from the at least two acoustic models of the set.
-
Publication No.: US20210350087A1
Publication date: 2021-11-11
Application No.: US17383097
Filing date: 2021-07-22
Applicant: SoundHound, Inc.
Inventor: Kamyar Mohajer, Keyvan Mohajer, Bernard Mont-Reynaud, Pranav Singh
Abstract: Aspects include methods, systems, and computer-program products providing virtual assistant domain functionality. A natural language query including one or more words is received. A collection of natural language modules is accessed. The natural language modules in the collection are configured to process sets of natural language queries. A natural language module, from the collection of natural language modules, is identified to interpret the natural language query. An interpretation of the natural language query is computed using the identified natural language module. A response to the natural language query is returned using the computed interpretation.
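The receive, identify, interpret, respond flow in this abstract can be illustrated with a minimal sketch. Everything below is hypothetical: the abstract does not say how a module is identified, so keyword overlap is assumed purely for illustration, and the names `identify`, `respond`, and the module table are invented for the example.

```python
from typing import Callable, Dict, Optional, Set, Tuple

# Hypothetical module shape: (trigger keywords, interpreter function).
Module = Tuple[Set[str], Callable[[str], dict]]


def identify(query: str, modules: Dict[str, Module]) -> Optional[str]:
    """Pick the module whose keywords overlap the query most (assumed heuristic)."""
    words = set(query.lower().split())
    best_name, best_overlap = None, 0
    for name, (keywords, _) in modules.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    return best_name


def respond(query: str, modules: Dict[str, Module]) -> str:
    """Compute an interpretation with the identified module and build a response."""
    name = identify(query, modules)
    if name is None:
        return "no module available"
    _, interpret = modules[name]
    interpretation = interpret(query)
    return f"{name}: {interpretation['intent']}"


# Illustrative collection of natural language modules.
MODULES: Dict[str, Module] = {
    "weather": ({"weather", "forecast", "rain"}, lambda q: {"intent": "get_forecast"}),
    "music": ({"play", "song"}, lambda q: {"intent": "play_music"}),
}
```

Calling `respond("will it rain tomorrow", MODULES)` routes the query to the hypothetical weather module; a query with no keyword overlap falls through to the fallback string.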
-
Publication No.: US11023509B1
Publication date: 2021-06-01
Application No.: US16226372
Filing date: 2018-12-19
Applicant: SoundHound, Inc.
Inventor: Jason Weinstein, Keyvan Mohajer
IPC: G06F16/33, G06F16/338, G06N5/04
Abstract: A method for processing a natural language query is provided. The method includes receiving a text query that refers to a plurality of objects, attributes, qualifiers and other arguments, and parsing the query to produce an argument tree representing the substance and structure of the query. The method also includes the capability to define qualifiers as being possibly projectable onto other arguments and to indicate their direction of projectability, and the capability to denote nodes of the argument tree as foldable, as splittable, or as containing sequences of qualifier arguments. The method additionally includes defining validity rules for a domain of knowledge, used to determine whether a list of arguments forms a valid granular query component, and processing the argument tree in view of the above to derive a corresponding plurality of granular query components that collectively request the plurality of pieces of information representing the intent of the query.
-
Publication No.: US10896671B1
Publication date: 2021-01-19
Application No.: US16206963
Filing date: 2018-11-30
Applicant: SoundHound, Inc.
Inventor: Keyvan Mohajer, Christopher S. Wilson, Bernard Mont-Reynaud, Robert MacRae
Abstract: A command-processing server provides natural language services to applications. More specifically, the command-processing server receives natural language inputs from users for use in applications such as virtual assistants. Some user inputs create user-defined rules that consist of trigger conditions and of corresponding actions that are executed when the triggers fire. The command-processing server stores the rules received from a user in association with the specific user. The command-processing server also identifies rules that can be generalized across users and promoted into generic rules applicable to many or all users. The generic rules may or may not have an associated context constraining their application.
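The rule handling in this abstract (per-user trigger/action rules plus promotion to generic rules) can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the abstract does not say how generalizable rules are identified, so a simple "same rule defined by N users" threshold is assumed, and `Rule`, `RuleStore`, and `promote_threshold` are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Rule:
    trigger: str  # condition that fires the rule, e.g. an event name
    action: str   # action executed when the trigger fires


class RuleStore:
    def __init__(self, promote_threshold: int = 2):
        # Rules are stored in association with the user who defined them.
        self.user_rules: Dict[str, Set[Rule]] = defaultdict(set)
        self.generic_rules: Set[Rule] = set()
        # Assumption: a rule shared by this many users becomes generic.
        self.promote_threshold = promote_threshold

    def add_rule(self, user_id: str, rule: Rule) -> None:
        self.user_rules[user_id].add(rule)
        owners = sum(1 for rules in self.user_rules.values() if rule in rules)
        if owners >= self.promote_threshold:
            self.generic_rules.add(rule)

    def fire(self, user_id: str, event: str) -> List[str]:
        """Return the actions of the user's own and generic rules triggered by event."""
        candidates = self.user_rules.get(user_id, set()) | self.generic_rules
        return sorted(r.action for r in candidates if r.trigger == event)
```

Once two users have defined the same rule, a third user who never defined it still gets its action from `fire`, which mirrors the promotion to "generic rules applicable to many or all users". The optional context constraint mentioned in the abstract is left out of this sketch.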
-
Publication No.: US10832287B2
Publication date: 2020-11-10
Application No.: US16134890
Filing date: 2018-09-18
Applicant: SoundHound, Inc.
Inventor: Aaron Master, Keyvan Mohajer
Abstract: An audio recognition system provides for delivery of promotional content to its user. A user interface device, locally or with the assistance of a network-connected server, performs recognition of audio in response to queries. Recognition can be through a method such as processing features extracted from the audio. Audio can comprise recorded music, singing or humming, instrumental music, vocal music, spoken voice, or other recognizable types of audio. Campaign managers provide promotional content for delivery in response to audio recognized in queries.
-
Publication No.: US20190012311A1
Publication date: 2019-01-10
Application No.: US16128227
Filing date: 2018-09-11
Applicant: SoundHound, Inc.
Inventor: Pranav Singh, Keyvan Mohajer, Kamyar Mohajer, Bernard Mont-Reynaud
Abstract: A platform enables developers of applications with natural language interfaces, such as devices, to configure the availability of vertical domain modules in their applications. Modules can include grammars for parsing natural language expressions and interfaces to data sources. Third party developers can create modules with pricing models for their usage or access to their data. Device developers can browse or search available modules and test their performance for specific queries. The platform enables device users to access the chosen modules as configured by device developers, and provides for charging and payment between users, application developers, and module developers.
-
Publication No.: US20170256264A1
Publication date: 2017-09-07
Application No.: US15603257
Filing date: 2017-05-23
Applicant: SoundHound, Inc.
Inventor: Timothy Stonehocker, Keyvan Mohajer, Bernard Mont-Reynaud
CPC classification number: G10L15/30, G10L15/04, G10L15/063, G10L15/08, G10L15/265, G10L15/34, G10L17/06, G10L2015/0635, G10L2015/081
Abstract: A system and method is presented for performing dual mode speech recognition, employing a local recognition module on a mobile device and a remote recognition engine on a server device. The system accepts a spoken query from a user, and both the local recognition module and the remote recognition engine perform speech recognition operations on the query, returning a transcription and confidence score, subject to a latency cutoff time. If both sources successfully transcribe the query, then the system accepts the result having the higher confidence score. If only one source succeeds, then that result is accepted. In either case, if the remote recognition engine does succeed in transcribing the query, then a client vocabulary is updated if the remote system result includes information not present in the client vocabulary.
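The arbitration logic described in this abstract can be sketched in a few lines. This is a minimal illustration, not the patented implementation: `Result`, `succeeded`, and `arbitrate` are hypothetical names, the client vocabulary is assumed to be a plain set of lowercase words, and ties in confidence are resolved in favor of the remote result, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Result:
    """One recognizer's output; field names are hypothetical."""
    text: str
    confidence: float
    latency_s: float


def succeeded(result: Optional[Result], cutoff_s: float) -> bool:
    # A source succeeds only if it produced a result within the latency cutoff.
    return result is not None and result.latency_s <= cutoff_s


def arbitrate(local: Optional[Result], remote: Optional[Result],
              cutoff_s: float, client_vocab: Set[str]) -> Optional[Result]:
    local_ok = succeeded(local, cutoff_s)
    remote_ok = succeeded(remote, cutoff_s)

    # In either case, a successful remote result updates the client vocabulary
    # with any words it does not already contain.
    if remote_ok:
        client_vocab.update(remote.text.lower().split())

    if local_ok and remote_ok:
        # Accept the higher confidence score (assumption: ties go to remote).
        return local if local.confidence > remote.confidence else remote
    if local_ok:
        return local
    if remote_ok:
        return remote
    return None  # neither source transcribed the query in time
```

For example, with a local result of confidence 0.6 and a remote result of confidence 0.9 that both arrive before the cutoff, `arbitrate` returns the remote result and adds its words to `client_vocab`; if the remote result misses the cutoff, the local result is returned and the vocabulary is left unchanged.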
-
Publication No.: US09619560B1
Publication date: 2017-04-11
Application No.: US14884650
Filing date: 2015-10-15
Applicant: Soundhound, Inc.
Inventor: Aaron Master, Bernard Mont-Reynaud, Keyvan Mohajer
IPC: G06F17/30
CPC classification number: G06F17/30743, G10L15/08, G10L25/54
Abstract: In one implementation, a method is described of retrying matching of an audio query against audio references. The method includes receiving a follow-up query that requests a retry at matching a previously submitted audio query. In some implementations, this follow-up query is received without any recognition hint that suggests how to retry matching. The follow-up query includes the audio query or a reference to the audio query to be used in the retry. The method further includes retrying matching the audio query using retry matching resources that include an expanded group of audio references, identifying at least one match and transmitting a report of the match. Optionally, the method includes storing data that correlates the follow-up query, the audio query or the reference to the audio query, and the match after retrying.
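The retry flow in this abstract can be sketched as below. All names are hypothetical and the matching itself is a stand-in: the abstract does not describe how audio is compared, so fingerprints are assumed to be sets of integer hashes scored by Jaccard similarity, purely for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical reference store: reference id -> fingerprint (a set of hashes).
References = Dict[str, Set[int]]


def best_match(query: Set[int], refs: References,
               threshold: float = 0.5) -> Optional[str]:
    """Return the reference with the highest Jaccard score at or above threshold."""
    best_id, best_score = None, threshold
    for ref_id, fingerprint in refs.items():
        union = query | fingerprint
        score = len(query & fingerprint) / len(union) if union else 0.0
        if score >= best_score:
            best_id, best_score = ref_id, score
    return best_id


class Matcher:
    def __init__(self, primary: References, expanded: References):
        self.primary = primary      # references used for the first attempt
        self.expanded = expanded    # extra references used only on retry
        self.queries: Dict[str, Set[int]] = {}
        self.retry_log: List[Tuple[str, Optional[str]]] = []

    def submit(self, query_id: str, fingerprint: Set[int]) -> Optional[str]:
        # Keep the audio query so a follow-up can refer to it by id.
        self.queries[query_id] = fingerprint
        return best_match(fingerprint, self.primary)

    def retry(self, query_id: str) -> Optional[str]:
        # The follow-up carries only a reference to the query, no recognition hint.
        fingerprint = self.queries[query_id]
        match = best_match(fingerprint, {**self.primary, **self.expanded})
        # Store data correlating the follow-up, the query reference, and the match.
        self.retry_log.append((query_id, match))
        return match
```

A query that finds nothing in `primary` can be retried by id alone; the retry searches the expanded group of references and records the outcome in `retry_log`.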