-
Publication No.: US20160292266A1
Publication Date: 2016-10-06
Application No.: US15182300
Filing Date: 2016-06-14
Applicant: SoundHound, Inc.
Inventor: Bernard Mont-Reynaud, Aaron Master, Timothy P. Stonehocker, Keyvan Mohajer
IPC: G06F17/30
CPC classification number: G06F17/30743, G06F17/30026, G06F17/30749, G06F17/30772
Abstract: The present invention relates to the continuous monitoring of an audio signal and identification of audio items within an audio signal. The technology disclosed utilizes predictive caching of fingerprints to improve efficiency. Fingerprints are cached for tracking an audio signal with known alignment and for watching an audio signal without known alignment, based on already identified fingerprints extracted from the audio signal. Software running on a smart phone or other battery-powered device cooperates with software running on an audio identification server.
-
Publication No.: US12125484B2
Publication Date: 2024-10-22
Application No.: US17562891
Filing Date: 2021-12-27
Applicant: SoundHound, Inc.
Inventor: Scott Halstvedt, Keyvan Mohajer, Bernard Mont-Reynaud
IPC: G10L15/22, G06F3/16, G06F21/32, G06V40/16, G10L15/08, G10L17/00, G10L17/04, G10L17/06, G10L17/22
CPC classification number: G10L15/22, G06F3/167, G06F21/32, G10L15/08, G10L17/04, G10L17/06, G10L17/22, G06V40/16, G06V40/166, G10L2015/088, G10L2015/223, G10L17/00
Abstract: A method of controlling an engagement state of an agent during a human-machine dialog is provided. The method can include receiving a spoken request that is a conditional locking request, wherein the conditional locking request uses a natural language expression to explicitly specify a locking condition, which is a predicate, storing the predicate in a format that can be evaluated when needed by the agent, entering a conditionally locked state in response to the conditional locking request, in the conditionally locked state, receiving a multiplicity of requests without a need for a wakeup indicator, and for a request from the multiplicity of requests evaluating the predicate upon receiving the request, and processing the request if the predicate is true.
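The conditionally locked state described in this abstract can be illustrated with a minimal Python sketch. Everything named here (`Agent`, `lock`, `receive`, the dictionary-shaped request) is hypothetical and not taken from the patent. The sketch assumes the natural language locking condition has already been converted into a callable predicate, and it assumes that an unlocked agent requires a wakeup indicator.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical request record, e.g. {"speaker": "alice", "text": "play music"}.
Request = Dict[str, str]
Predicate = Callable[[Request], bool]


class Agent:
    """Toy agent with an unlocked state and a conditionally locked state."""

    def __init__(self) -> None:
        # The stored predicate; None means the agent is not locked.
        self.predicate: Optional[Predicate] = None
        self.handled: List[str] = []

    def lock(self, predicate: Predicate) -> None:
        # Enter the conditionally locked state, keeping the predicate
        # in a form that can be evaluated on every later request.
        self.predicate = predicate

    def unlock(self) -> None:
        self.predicate = None

    def receive(self, request: Request, has_wakeup: bool = False) -> bool:
        """Process the request if allowed; return True when it was processed."""
        if self.predicate is None:
            # Assumption: outside the locked state a wakeup indicator is needed.
            if not has_wakeup:
                return False
        elif not self.predicate(request):
            # Locked: no wakeup needed, but the predicate must hold.
            return False
        self.handled.append(request["text"])
        return True


agent = Agent()
agent.lock(lambda r: r["speaker"] == "alice")  # e.g. "listen only to Alice"
agent.receive({"speaker": "alice", "text": "play music"})
agent.receive({"speaker": "bob", "text": "stop"})
```

After these calls only Alice's request has been processed, because the predicate is evaluated on each incoming request.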
-
Publication No.: US12067006B2
Publication Date: 2024-08-20
Application No.: US17350294
Filing Date: 2021-06-17
Applicant: SoundHound, Inc.
Inventor: Pranav Singh, Yilun Zhang, Keyvan Mohajer, Mohammadreza Fazeli
IPC: G06F16/242, G06N3/045, G06N3/088
CPC classification number: G06F16/2425, G06N3/045, G06N3/088
Abstract: A machine learning system for a digital assistant is described, together with a method of training such a system. The machine learning system is based on an encoder-decoder sequence-to-sequence neural network architecture trained to map input sequence data to output sequence data, where the input sequence data relates to an initial query and the output sequence data represents a canonical data representation for the query. The method of training involves generating a training dataset for the machine learning system. The method involves clustering vector representations of the query data samples to generate canonical-query original-query pairs for training the machine learning system.
-
Publication No.: US20220129639A1
Publication Date: 2022-04-28
Application No.: US17569433
Filing Date: 2022-01-05
Applicant: SoundHound, Inc.
Inventor: Kheng Khov, Keyvan Mohajer, Ian Graves, Christopher S. Wilson
Abstract: A user request is received (e.g., in natural language form) by a client device. In order to facilitate richer natural language understanding, a response-processing server handles interpretation of the request, rather than requiring the client device to interpret it. The response-processing server determines the various possible responses that client devices could make in response to the request based on (for example) the state of the application data, and/or the capabilities of the client devices. The response-processing server accordingly generates a response package that describes a number of different conditional responses that client devices could have to the request. The client device selects a response from the response package, executes the command (if possible), and provides the user with some representation of the response.
-
Publication No.: US11295730B1
Publication Date: 2022-04-05
Application No.: US16529689
Filing Date: 2019-08-01
Applicant: SoundHound, Inc.
Inventor: Keyvan Mohajer, Christopher Wilson, Bernard Mont-Reynaud
Abstract: A method is described that includes processing text and speech from an input utterance using local overrides of default dictionary pronunciations. Applying this method, a word-level grammar used to process the tokens specifies at least one local word phonetic variant that applies within a specific production rule and, within a local context of the specific production rule, the local word phonetic variant overrides one or more default dictionary phonetic versions of the word. This method can be applied to parsing utterances where the pronunciation of some words depends on their syntactic or semantic context.
-
Publication No.: US11003426B1
Publication Date: 2021-05-11
Application No.: US16786991
Filing Date: 2020-02-10
Applicant: SoundHound, Inc.
Inventor: Christopher S. Wilson, Keyvan Mohajer
IPC: G06F8/41, G06F40/211
Abstract: A command-processing server provides natural language processing services to applications. The command-processing server stores a set of code blocks, each code block being able to interpret a set of corresponding natural language expressions. The command-processing server accepts natural language expressions and identifies the code blocks that are capable of interpreting those expressions by attempting to parse the natural language expressions using the code blocks. The command-processing server then provides a list of the identified code blocks to the developers, who can then incorporate the code blocks into their applications.
-
Publication No.: US20210019787A1
Publication Date: 2021-01-21
Application No.: US17039593
Filing Date: 2020-09-30
Applicant: SoundHound, Inc.
Inventor: Aaron Master, Keyvan Mohajer
Abstract: An audio recognition system provides for delivery of promotional content to its user. A user interface device, locally or with the assistance of a network-connected server, performs recognition of audio in response to queries. Recognition can be through a method such as processing features extracted from the audio. Audio can comprise recorded music, singing or humming, instrumental music, vocal music, spoken voice, or other recognizable types of audio. Campaign managers provide promotional content for delivery in response to audio recognized in queries.
-
Publication No.: US10832005B1
Publication Date: 2020-11-10
Application No.: US16243920
Filing Date: 2019-01-09
Applicant: SoundHound, Inc.
Inventor: Keyvan Mohajer, Bernard Mont-Reynaud
IPC: G06F40/205, G06F40/30
Abstract: The technology disclosed relates to computer-implemented conversational agents and particularly to detecting a point in the dialog (end of turn, or end of utterance) at which the agent can start responding to the user. The technology disclosed provides a method of incrementally parsing an input utterance with multiple parses operating in parallel. The technology disclosed includes detecting an interjection point in the input utterance when a pause exceeds a high threshold, or detecting an interjection point in the input utterance when a pause exceeds a low threshold and at least one of the parallel parses is determined to be interruptible by matching a complete sentence according to the grammar. The conversational agents start responding to the user at a detected interjection point.
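The two-threshold rule in this abstract can be written as a short decision function. This is a minimal sketch rather than the patented implementation: the threshold values and the names `HIGH_PAUSE_S`, `LOW_PAUSE_S`, and `is_interjection_point` are assumptions, and each parallel parse is reduced to a single boolean that says whether it currently matches a complete sentence.

```python
from typing import Sequence

# Hypothetical thresholds in seconds; the abstract gives no concrete values.
HIGH_PAUSE_S = 1.5
LOW_PAUSE_S = 0.4


def is_interjection_point(pause_s: float, parse_complete: Sequence[bool]) -> bool:
    """Decide whether the agent may start responding.

    parse_complete[i] is True when parallel parse i currently matches a
    complete sentence according to the grammar, i.e. it is interruptible.
    """
    # A long pause is sufficient on its own.
    if pause_s > HIGH_PAUSE_S:
        return True
    # A short pause counts only if some parse is interruptible.
    return pause_s > LOW_PAUSE_S and any(parse_complete)
```

With these assumed values, a 0.5 second pause triggers a response only when at least one parse is complete, while a 2 second pause always does.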
-
Publication No.: US20200013094A1
Publication Date: 2020-01-09
Application No.: US16572179
Filing Date: 2019-09-16
Applicant: SoundHound, Inc.
Inventor: Keyvan Mohajer, Scott Halstvedt
Abstract: Original concepts obtained from a query may be augmented with additional concepts connected to the original concepts in a concept graph in response to determining that the original concepts did not match a sufficient number of bid functions. The augmented set of concepts may then be evaluated with respect to the bid functions to identify matching ad functions. This process may be repeated until a sufficient number of matching ad functions are found. A bid amount of the matching bid functions may be calculated, such as based on semantic information obtained as a result of the query. The bid amounts may further be based on environmental information. A bid function is selected based on the bid amounts and the content associated with the bid function is provided to the source of the query. The content may be selected based on the semantic information.
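The augmentation loop in this abstract can be sketched as follows. The data model is an assumption made for illustration: the concept graph is a dictionary of neighbor sets, and each bid function is reduced to the set of concepts it requires (real bid functions would also compute a bid amount). The names `match_bids`, `augment_until_matched`, and `max_rounds` are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def match_bids(concepts: Set[str], bid_functions: Dict[str, Set[str]]) -> List[str]:
    """Return names of bid functions whose required concepts are all present."""
    return [name for name, required in bid_functions.items() if required <= concepts]


def augment_until_matched(
    concepts: Set[str],
    graph: Dict[str, Set[str]],
    bid_functions: Dict[str, Set[str]],
    min_matches: int,
    max_rounds: int = 3,
) -> Tuple[Set[str], List[str]]:
    """Add graph neighbors to the concept set until enough bid functions match."""
    current = set(concepts)
    for _ in range(max_rounds):
        matches = match_bids(current, bid_functions)
        if len(matches) >= min_matches:
            return current, matches
        # Collect concepts connected to the current ones in the graph.
        neighbors: Set[str] = set()
        for concept in current:
            neighbors |= graph.get(concept, set())
        if neighbors <= current:
            break  # nothing new to add, so further rounds cannot help
        current |= neighbors
    return current, match_bids(current, bid_functions)


graph = {"jazz": {"music"}, "music": {"concert"}}
bids = {"tickets": {"concert"}, "streaming": {"music"}}
concepts, matches = augment_until_matched({"jazz"}, graph, bids, min_matches=2)
```

In this example the query concept "jazz" matches nothing at first, so the set grows through "music" to "concert" until both bid functions match. Selecting a winner by bid amount would be a separate step.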
-
Publication No.: US20190303438A1
Publication Date: 2019-10-03
Application No.: US15942875
Filing Date: 2018-04-02
Applicant: SoundHound, Inc.
Inventor: Christopher S. Wilson, Keyvan Mohajer, Bernard Mont-Reynaud
Abstract: The present invention extends to methods, systems, and computer program products for interpreting expressions having potentially ambiguous meanings in different domains. Multi-domain natural language understanding systems can support a variety of different types of clients. Expressions can be interpreted across multiple domains. Weights can be assigned to domains. Weights can be client specific or expression specific so that a chosen interpretation is more likely correct for the type of client or for its context. Stored weight sets can be chosen according to identifying information carried as metadata with expressions or weight sets carried directly as metadata. Domains can additionally or alternatively be ranked in ordered lists or comparative domain pairs to favor some domains over others as appropriate for client type or client context.
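One way to picture client-specific domain weights is the small Python sketch below. It is only an illustration under stated assumptions: the abstract does not say how weights are combined with interpretation scores, so multiplying a raw per-domain score by the weight is a choice made here, and `choose_domain`, the domain names, and the numbers are all hypothetical.

```python
from typing import Dict


def choose_domain(scores: Dict[str, float], weights: Dict[str, float]) -> str:
    """Pick the domain with the highest weighted interpretation score.

    scores:  domain -> raw score of the best interpretation in that domain
    weights: domain -> client-specific weight (missing domains default to 1.0)
    """
    return max(scores, key=lambda domain: scores[domain] * weights.get(domain, 1.0))


# The same ambiguous expression scored in two domains.
scores = {"music": 0.6, "navigation": 0.5}

car_weights = {"navigation": 2.0}   # hypothetical weight set for an in-car client
phone_weights = {"music": 1.5}      # hypothetical weight set for a phone client

car_choice = choose_domain(scores, car_weights)
phone_choice = choose_domain(scores, phone_weights)
```

Here the car client resolves the expression to the navigation domain and the phone client to the music domain, although the raw scores are identical.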