Differential spatial rendering of audio sources

    Publication No.: US11589184B1

    Publication date: 2023-02-21

    Application No.: US17655650

    Application date: 2022-03-21

    Abstract: Methods and systems for intuitive spatial audio rendering with improved intelligibility are disclosed. By establishing a virtual association between an audio source and a location in the listener's virtual audio space, a spatial audio rendering system can generate spatial audio signals that create a natural and immersive audio field for a listener. The system can receive the virtual location of the source as a parameter and map the source audio signal to a source-specific multi-channel audio signal. In addition, the spatial audio rendering system can be interactive and dynamically modify the rendering of the spatial audio in response to a user's active control or tracked movement.
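The mapping from a source signal and a virtual location to a source-specific multi-channel signal can be illustrated with a minimal sketch. The abstract does not say how the mapping is computed, so this example swaps in ordinary constant-power stereo panning as a stand-in. The function names `render_stereo` and `mix`, and the azimuth convention (-90 degrees is full left, +90 degrees is full right), are assumptions made for illustration only.

```python
import math
from typing import List, Tuple

Frame = Tuple[float, float]  # (left sample, right sample)


def render_stereo(samples: List[float], azimuth_deg: float) -> List[Frame]:
    """Map a mono source to a two-channel signal using constant-power panning.

    azimuth_deg is the assumed virtual location of the source:
    -90 is full left, 0 is center, +90 is full right.
    """
    az = max(-90.0, min(90.0, azimuth_deg))        # clamp to the supported range
    theta = (az + 90.0) / 180.0 * (math.pi / 2.0)  # map to 0..pi/2
    left_gain, right_gain = math.cos(theta), math.sin(theta)
    return [(s * left_gain, s * right_gain) for s in samples]


def mix(rendered: List[List[Frame]]) -> List[Frame]:
    """Sum several source-specific stereo signals into one output signal."""
    return [
        (sum(f[0] for f in frames), sum(f[1] for f in frames))
        for frames in zip(*rendered)
    ]


# Two hypothetical talkers placed left and right of the listener.
left_talker = render_stereo([0.5, 0.5], azimuth_deg=-45.0)
right_talker = render_stereo([0.2, -0.2], azimuth_deg=45.0)
output = mix([left_talker, right_talker])
```

Re-rendering with a new azimuth each time the listener moves a control or a head tracker reports a new pose would give the dynamic behavior the abstract mentions, again under the assumptions above.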

    Virtual Assistant Domain Functionality

    Publication No.: US20210350087A1

    Publication date: 2021-11-11

    Application No.: US17383097

    Application date: 2021-07-22

    Abstract: Aspects include methods, systems, and computer-program products providing virtual assistant domain functionality. A natural language query including one or more words is received. A collection of natural language modules is accessed. The natural language modules in the collection are configured to process sets of natural language queries. A natural language module, from the collection of natural language modules, is identified to interpret the natural language query. An interpretation of the natural language query is computed using the identified natural language module. A response to the natural language query is returned using the computed interpretation.
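The receive, identify, interpret, respond flow in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how a module is identified, so the example assumes each module returns a confidence score and the highest score wins. The modules `weather_module` and `music_module`, the scores, and the response strings are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Interpretation = Tuple[float, str]  # (assumed confidence score, interpretation)
Module = Callable[[str], Optional[Interpretation]]


def weather_module(query: str) -> Optional[Interpretation]:
    """Hypothetical module: handles queries that mention the weather."""
    if "weather" in query.lower().split():
        return (0.9, "weather_forecast")
    return None


def music_module(query: str) -> Optional[Interpretation]:
    """Hypothetical module: handles queries of the form 'play <something>'."""
    words = query.lower().split()
    if words and words[0] == "play":
        return (0.8, "play_music:" + " ".join(words[1:]))
    return None


def respond(query: str, modules: List[Module]) -> str:
    """Pick the module with the best score and answer from its interpretation."""
    candidates = [r for r in (m(query) for m in modules) if r is not None]
    if not candidates:
        return "Sorry, I cannot help with that."
    _score, interpretation = max(candidates, key=lambda c: c[0])
    return f"Handled as {interpretation}"


MODULES: List[Module] = [weather_module, music_module]
print(respond("what is the weather today", MODULES))
```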

    Support for grammar inflections within a software development framework

    Publication No.: US11151329B2

    Publication date: 2021-10-19

    Application No.: US16563783

    Application date: 2019-09-06

    Abstract: A natural language understanding server includes grammars specified in a modified extended Backus-Naur form (MEBNF) that includes an agglutination metasymbol not supported by conventional EBNF grammar parsers, as well as an agglutination preprocessor. The agglutination preprocessor applies one or more sets of agglutination rewrite rules to the MEBNF grammars, transforming them to EBNF grammars that can be processed by conventional EBNF grammar parsers. Permitting grammars to be specified in MEBNF form greatly simplifies the authoring and maintenance of grammars supporting inflected forms of words in the languages described by the grammars.

    SUPPORT FOR GRAMMAR INFLECTIONS WITHIN A SOFTWARE DEVELOPMENT FRAMEWORK

    Publication No.: US20210073333A1

    Publication date: 2021-03-11

    Application No.: US16563783

    Application date: 2019-09-06

    Abstract: A natural language understanding server includes grammars specified in a modified extended Backus-Naur form (MEBNF) that includes an agglutination metasymbol not supported by conventional EBNF grammar parsers, as well as an agglutination preprocessor. The agglutination preprocessor applies one or more sets of agglutination rewrite rules to the MEBNF grammars, transforming them to EBNF grammars that can be processed by conventional EBNF grammar parsers. Permitting grammars to be specified in MEBNF form greatly simplifies the authoring and maintenance of grammars supporting inflected forms of words in the languages described by the grammars.

    Virtual Assistant Domain Selection Analysis
    Patent application

    Publication No.: US20200183815A1

    Publication date: 2020-06-11

    Application No.: US16213020

    Application date: 2018-12-07

    Abstract: A virtual assistant platform provides a user interface for app developers to configure the enablement of domains for virtual assistants. Sets of test queries can be uploaded and statistical analyses displayed for the numbers of test queries served by each selected domain and costs for usage of each domain. Costs can vary according to complex pricing models. The user interface provides display views of tables, cost stack charts, and histograms to inform decisions that trade off costs against benefits to the virtual assistant user experience. The platform interface shows, for individual queries, responses possible from different domains. Platform providers promote certain chosen domains.
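The per-domain statistics described here (number of test queries served and usage cost) can be sketched with a small tally. The abstract notes that real pricing models can be complex; this example assumes the simplest case, a flat price per query, and the function name `domain_usage`, the domain names, and the prices are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def domain_usage(
    served_by: List[str], price_per_query: Dict[str, float]
) -> Dict[str, Tuple[int, float]]:
    """Return {domain: (queries served, total cost)} for a set of test queries.

    served_by lists, for each test query, the domain that answered it.
    A flat per-query price is assumed; unpriced domains cost 0.0.
    """
    counts = Counter(served_by)
    return {
        domain: (n, n * price_per_query.get(domain, 0.0))
        for domain, n in counts.items()
    }


# Hypothetical test run: three uploaded queries and two enabled domains.
report = domain_usage(
    ["weather", "weather", "music"],
    {"weather": 0.01, "music": 0.02},
)
for domain, (count, cost) in sorted(report.items()):
    print(f"{domain}: {count} queries, cost {cost:.2f}")
```

A table or histogram view like the ones mentioned in the abstract could be built directly from the returned dictionary.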

    Parametric adaptation of voice synthesis

    Publication No.: US10586079B2

    Publication date: 2020-03-10

    Application No.: US15406213

    Application date: 2017-01-13

    Abstract: Software-based systems perform parametric speech synthesis. TTS voice parameters determine the generated speech audio. Voice parameters include gender, age, dialect, donor, arousal, authoritativeness, pitch, range, speech rate, volume, flutter, roughness, breath, frequencies, bandwidths, and relative amplitudes of formants and nasal sounds. The system chooses TTS parameters based on one or more of: user profile attributes including gender, age, and dialect; situational attributes such as location, noise level, and mood; natural language semantic attributes such as domain of conversation, expression type, dimensions of affect, word emphasis and sentence structure; and analysis of target speaker voices. The system chooses TTS parameters to improve listener satisfaction or other desired listener behavior. Choices may be made by specified algorithms defined by code developers, or by machine learning algorithms trained on labeled samples of system performance.

    Method and system for building an integrated user profile

    Publication No.: US10311858B1

    Publication date: 2019-06-04

    Application No.: US15385493

    Application date: 2016-12-20

    Abstract: A system and method are provided for adding user characterization information to a user profile by analyzing the user's speech. User properties such as age, gender, accent, and English proficiency may be inferred by extracting and deriving features from user speech, without the user having to configure such information manually. A feature extraction module that receives audio signals as input extracts acoustic, phonetic, textual, linguistic, and semantic features. The module may be a system component independent of any particular vertical application or may be embedded in an application that accepts voice input and performs natural language understanding. A profile generation module receives the features extracted by the feature extraction module, uses classifiers to determine user property values based on the extracted and derived features, and stores these values in a user profile. The resulting profile variables may be globally available to other applications.

    Modular Virtual Assistant Platform
    Patent application

    Publication No.: US20190012311A1

    Publication date: 2019-01-10

    Application No.: US16128227

    Application date: 2018-09-11

    Abstract: A platform enables developers of applications with natural language interfaces, such as devices, to configure the availability of vertical domain modules in applications. Modules can include grammars for parsing natural language expressions and interfaces to data sources. Third party developers can create modules with pricing models for their usage or access to their data. Device developers can browse or search available modules and test their performance for specific queries. The platform provides for device users to access the chosen modules as configured by device developers and for charging and payment between users, application developers, and module developers.

    DUAL MODE SPEECH RECOGNITION
    Patent application

    Publication No.: US20180358019A1

    Publication date: 2018-12-13

    Application No.: US15619304

    Application date: 2017-06-09

    Abstract: A dual mode speech recognition system sends speech to two or more speech recognizers. If a first recognition result is received, whose recognition score exceeds a high threshold, the first result is selected without waiting for another result. If the score is below a low threshold, the first result is ignored. At intermediate values of recognition scores, a timeout duration is dynamically determined as a function of the recognition score. The timeout duration determines how long the system will wait for another result. Many functions of the recognition score are possible, but timeout durations generally decrease as scores increase. When receiving a second recognition score before the timeout occurs, a comparison based on recognition scores determines whether the first result or the second result is the basis for creating a response.
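The selection logic in this abstract can be sketched in a few lines. The abstract says only that many timeout functions are possible and that timeouts generally decrease as scores increase, so the linear mapping below is one assumed choice, not the patented function. The threshold values, the maximum wait, and the names `timeout_for_score` and `select` are hypothetical.

```python
from typing import Optional, Tuple

Result = Tuple[str, float]  # (transcript, recognition score)


def timeout_for_score(
    score: float, low: float = 0.3, high: float = 0.9, max_wait: float = 2.0
) -> float:
    """Assumed linear mapping: wait max_wait seconds at `low`, 0 seconds at `high`."""
    if not low <= score <= high:
        raise ValueError("score must lie between the low and high thresholds")
    return max_wait * (high - score) / (high - low)


def select(
    first: Result,
    second: Optional[Result],
    second_delay: Optional[float],
    low: float = 0.3,
    high: float = 0.9,
    max_wait: float = 2.0,
) -> Optional[Result]:
    """Choose a result given the first result and an optional second one.

    second_delay is how many seconds after the first result the second arrived.
    """
    _text, score = first
    if score > high:
        return first   # confident enough: do not wait for another result
    if score < low:
        return second  # first result is ignored; may be None if nothing else arrives
    wait = timeout_for_score(score, low, high, max_wait)
    if second is not None and second_delay is not None and second_delay < wait:
        # Second result arrived before the timeout: compare recognition scores.
        return max(first, second, key=lambda r: r[1])
    return first       # timeout expired: fall back to the first result


print(select(("turn on the light", 0.6), ("turn on the lights", 0.8), 0.5))
```

With these assumed defaults, a first score of 0.6 gives a timeout of about one second, so a second result arriving after 0.5 seconds is compared against the first, while one arriving after 1.5 seconds is too late.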
