-
Publication No.: US11589184B1
Publication date: 2023-02-21
Application No.: US17655650
Filing date: 2022-03-21
Applicant: SoundHound, Inc.
Inventor: Bernard Mont-Reynaud
IPC: H04S7/00
Abstract: Methods and systems for intuitive spatial audio rendering with improved intelligibility are disclosed. By establishing a virtual association between an audio source and a location in the listener's virtual audio space, a spatial audio rendering system can generate spatial audio signals that create a natural and immersive audio field for a listener. The system can receive the virtual location of the source as a parameter and map the source audio signal to a source-specific multi-channel audio signal. In addition, the spatial audio rendering system can be interactive and dynamically modify the rendering of the spatial audio in response to a user's active control or tracked movement.
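As a rough illustration of mapping a source signal to a source-specific multi-channel signal from a virtual location, the sketch below uses plain constant-power stereo panning. This is a swapped-in textbook technique, not the method claimed in the patent, and the names `pan_gains` and `render_stereo` as well as the azimuth convention (-90 degrees is hard left, +90 degrees is hard right) are assumptions made for the example.

```python
import math
from typing import List, Tuple


def pan_gains(azimuth_deg: float) -> Tuple[float, float]:
    """Return (left, right) gains for a virtual azimuth.

    Assumed convention: -90 is hard left, 0 is center, +90 is hard right.
    Constant-power panning keeps left**2 + right**2 equal to 1.
    """
    clamped = max(-90.0, min(90.0, azimuth_deg))
    # Map [-90, 90] degrees onto [0, pi/2] radians.
    theta = (clamped + 90.0) / 180.0 * (math.pi / 2.0)
    return math.cos(theta), math.sin(theta)


def render_stereo(mono: List[float], azimuth_deg: float) -> List[Tuple[float, float]]:
    """Map a mono source signal to a two-channel signal for one virtual location."""
    left, right = pan_gains(azimuth_deg)
    return [(sample * left, sample * right) for sample in mono]
```

Calling `render_stereo` again with a new azimuth whenever the listener moves or adjusts a control would give a crude version of the interactive behavior the abstract mentions.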
-
Publication No.: US20210350087A1
Publication date: 2021-11-11
Application No.: US17383097
Filing date: 2021-07-22
Applicant: SoundHound, Inc.
Inventor: Kamyar Mohajer, Keyvan Mohajer, Bernard Mont-Reynaud, Pranav Singh
Abstract: Aspects include methods, systems, and computer-program products providing virtual assistant domain functionality. A natural language query including one or more words is received. A collection of natural language modules is accessed. The natural language modules in the collection are configured to process sets of natural language queries. A natural language module, from the collection of natural language modules, is identified to interpret the natural language query. An interpretation of the natural language query is computed using the identified natural language module. A response to the natural language query is returned using the computed interpretation.
-
Publication No.: US11151329B2
Publication date: 2021-10-19
Application No.: US16563783
Filing date: 2019-09-06
Applicant: SoundHound, Inc.
Inventor: Bernard Mont-Reynaud, Seth Taron
IPC: G06F40/30, G10L15/197, G06F8/41, G10L15/18, G06F40/205, G06F40/253
Abstract: A natural language understanding server includes grammars specified in a modified extended Backus-Naur form (MEBNF) that includes an agglutination metasymbol not supported by conventional EBNF grammar parsers, as well as an agglutination preprocessor. The agglutination preprocessor applies one or more sets of agglutination rewrite rules to the MEBNF grammars, transforming them to EBNF grammars that can be processed by conventional EBNF grammar parsers. Permitting grammars to be specified in MEBNF form greatly simplifies the authoring and maintenance of grammars supporting inflected forms of words in the languages described by the grammars.
-
Publication No.: US20210073333A1
Publication date: 2021-03-11
Application No.: US16563783
Filing date: 2019-09-06
Applicant: SoundHound, Inc.
Inventor: Bernard Mont-Reynaud, Seth Taron
IPC: G06F17/27, G10L15/197, G10L15/18, G06F8/41
Abstract: A natural language understanding server includes grammars specified in a modified extended Backus-Naur form (MEBNF) that includes an agglutination metasymbol not supported by conventional EBNF grammar parsers, as well as an agglutination preprocessor. The agglutination preprocessor applies one or more sets of agglutination rewrite rules to the MEBNF grammars, transforming them to EBNF grammars that can be processed by conventional EBNF grammar parsers. Permitting grammars to be specified in MEBNF form greatly simplifies the authoring and maintenance of grammars supporting inflected forms of words in the languages described by the grammars.
-
Publication No.: US10896671B1
Publication date: 2021-01-19
Application No.: US16206963
Filing date: 2018-11-30
Applicant: SoundHound, Inc.
Inventor: Keyvan Mohajer, Christopher S. Wilson, Bernard Mont-Reynaud, Robert MacRae
Abstract: A command-processing server provides natural language services to applications. More specifically, the command-processing server receives natural language inputs from users for use in applications such as virtual assistants. Some user inputs create user-defined rules that consist of trigger conditions and of corresponding actions that are executed when the triggers fire. The command-processing server stores the rules received from a user in association with the specific user. The command-processing server also identifies rules that can be generalized across users and promoted into generic rules applicable to many or all users. The generic rules may or may not have an associated context constraining their application.
-
Publication No.: US20200183815A1
Publication date: 2020-06-11
Application No.: US16213020
Filing date: 2018-12-07
Applicant: SoundHound, Inc.
Inventor: Bernard Mont-Reynaud, Jonah Probell
Abstract: A virtual assistant platform provides a user interface for app developers to configure the enablement of domains for virtual assistants. Sets of test queries can be uploaded and statistical analyses displayed for the numbers of test queries served by each selected domain and costs for usage of each domain. Costs can vary according to complex pricing models. The user interface provides display views of tables, cost stack charts, and histograms to inform decisions that trade off costs against benefits to the virtual assistant user experience. The platform interface shows, for individual queries, responses possible from different domains. Platform providers promote certain chosen domains.
-
Publication No.: US10586079B2
Publication date: 2020-03-10
Application No.: US15406213
Filing date: 2017-01-13
Applicant: SoundHound, Inc.
Inventor: Monika Almudafar-Depeyrot, Bernard Mont-Reynaud
IPC: G10L13/033, G10L13/10, G06F40/30, G10L13/00
Abstract: Software-based systems perform parametric speech synthesis. TTS voice parameters determine the generated speech audio. Voice parameters include gender, age, dialect, donor, arousal, authoritativeness, pitch, range, speech rate, volume, flutter, roughness, breath, frequencies, bandwidths, and relative amplitudes of formants and nasal sounds. The system chooses TTS parameters based on one or more of: user profile attributes including gender, age, and dialect; situational attributes such as location, noise level, and mood; natural language semantic attributes such as domain of conversation, expression type, dimensions of affect, word emphasis and sentence structure; and analysis of target speaker voices. The system chooses TTS parameters to improve listener satisfaction or other desired listener behavior. Choices may be made by specified algorithms defined by code developers, or by machine learning algorithms trained on labeled samples of system performance.
-
Publication No.: US10311858B1
Publication date: 2019-06-04
Application No.: US15385493
Filing date: 2016-12-20
Applicant: SoundHound, Inc.
Inventor: Bernard Mont-Reynaud, Jun Huang, Kiran Garaga Lokeswarappa, Joel Gedalius
IPC: G10L15/00, G10L15/02, G10L15/18, G06F17/27, G10L15/06, G10L25/90, H04L29/08, G06Q30/02, G06N20/00
Abstract: A system and method are provided for adding user characterization information to a user profile by analyzing a user's speech. User properties such as age, gender, accent, and English proficiency may be inferred by extracting and deriving features from user speech, without the user having to configure such information manually. A feature extraction module that receives audio signals as input extracts acoustic, phonetic, textual, linguistic, and semantic features. The module may be a system component independent of any particular vertical application or may be embedded in an application that accepts voice input and performs natural language understanding. A profile generation module receives the features extracted by the feature extraction module, uses classifiers to determine user property values based on the extracted and derived features, and stores these values in a user profile. The resulting profile variables may be globally available to other applications.
-
Publication No.: US20190012311A1
Publication date: 2019-01-10
Application No.: US16128227
Filing date: 2018-09-11
Applicant: SoundHound, Inc.
Inventor: Pranav Singh, Keyvan Mohajer, Kamyar Mohajer, Bernard Mont-Reynaud
Abstract: A platform enables developers of applications with natural language interfaces, such as devices, to configure the availability of vertical domain modules in their applications. Modules can include grammars for parsing natural language expressions and interfaces to data sources. Third-party developers can create modules with pricing models for their usage or for access to their data. Device developers can browse or search available modules and test their performance for specific queries. The platform provides for device users to access the chosen modules as configured by device developers, and for charging and payment between users, application developers, and module developers.
-
Publication No.: US20180358019A1
Publication date: 2018-12-13
Application No.: US15619304
Filing date: 2017-06-09
Applicant: SoundHound, Inc.
Inventor: Bernard Mont-Reynaud
CPC classification number: G10L15/32, G10L15/02, G10L15/063, G10L15/1822, G10L15/30, G10L2015/0635
Abstract: A dual mode speech recognition system sends speech to two or more speech recognizers. If a first recognition result is received whose recognition score exceeds a high threshold, the first result is selected without waiting for another result. If the score is below a low threshold, the first result is ignored. At intermediate values of recognition scores, a timeout duration is dynamically determined as a function of the recognition score. The timeout duration determines how long the system will wait for another result. Many functions of the recognition score are possible, but timeout durations generally decrease as scores increase. When a second recognition result is received before the timeout occurs, a comparison based on recognition scores determines whether the first result or the second result is the basis for creating a response.
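The threshold-and-timeout logic described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the threshold values, the maximum timeout, the linear timeout function, and the names `decide_after_first` and `pick` are all assumptions. The abstract only requires that the timeout generally decrease as the score increases.

```python
from typing import Optional, Tuple

# Assumed values for illustration only; the abstract does not specify them.
HIGH_THRESHOLD = 0.9
LOW_THRESHOLD = 0.3
MAX_TIMEOUT_S = 2.0


def decide_after_first(score: float) -> Tuple[str, Optional[float]]:
    """Decide what to do when the first recognizer result arrives.

    Returns (action, timeout_seconds):
      "accept": use the first result immediately (timeout 0.0)
      "ignore": discard the first result (no timeout applies)
      "wait":   wait up to timeout_seconds for a second result
    """
    if score > HIGH_THRESHOLD:
        return "accept", 0.0
    if score < LOW_THRESHOLD:
        return "ignore", None
    # Assumed linear function: the higher the score, the shorter the wait.
    span = HIGH_THRESHOLD - LOW_THRESHOLD
    return "wait", MAX_TIMEOUT_S * (HIGH_THRESHOLD - score) / span


def pick(first_score: float, second_score: Optional[float]) -> str:
    """Choose which result drives the response.

    second_score is None when no second result arrived before the timeout.
    Comparing raw scores directly is an assumption; the abstract only says
    the comparison is based on recognition scores.
    """
    if second_score is None or first_score >= second_score:
        return "first"
    return "second"
```

A caller would run `decide_after_first` on the first score, wait for at most the returned timeout when the action is "wait", and then pass both scores to `pick`.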