Using phonetic variants in a local context to improve natural language understanding

    Publication No.: US11295730B1

    Publication date: 2022-04-05

    Application No.: US16529689

    Filing date: 2019-08-01

    Abstract: A method is described that includes processing text and speech from an input utterance using local overrides of default dictionary pronunciations. Applying this method, a word-level grammar used to process the tokens specifies at least one local word phonetic variant that applies within a specific production rule and, within a local context of the specific production rule, the local word phonetic variant overrides one or more default dictionary phonetic versions of the word. This method can be applied to parsing utterances where the pronunciation of some words depends on their syntactic or semantic context.
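
The local-override idea in this abstract can be illustrated with a small sketch. It is not the patented implementation: the `ProductionRule` class, the `DEFAULT_PRONUNCIATIONS` table, and the ARPAbet-like phoneme strings are all hypothetical names and data chosen for illustration. Each rule may carry its own variants for a word, and those variants replace the dictionary defaults only while that rule is being matched.

```python
# Hypothetical default dictionary: word -> accepted pronunciations.
# The phoneme strings are illustrative, not taken from the patent.
DEFAULT_PRONUNCIATIONS = {
    "have": ["HH AE V"],
    "will": ["W IH L"],
    "read": ["R IY D", "R EH D"],
}


class ProductionRule:
    """A word-level rule that may override pronunciations locally."""

    def __init__(self, name, words, local_variants=None):
        self.name = name
        self.words = words
        # word -> pronunciations valid only inside this rule
        self.local_variants = local_variants or {}

    def pronunciations(self, word):
        # A local variant, when present, replaces the defaults entirely.
        if word in self.local_variants:
            return self.local_variants[word]
        return DEFAULT_PRONUNCIATIONS.get(word, [])

    def matches(self, phonetic_tokens):
        if len(phonetic_tokens) != len(self.words):
            return False
        return all(
            token in self.pronunciations(word)
            for word, token in zip(self.words, phonetic_tokens)
        )


# After "have", only the past-participle pronunciation of "read" is accepted.
have_read = ProductionRule("have_read", ["have", "read"], {"read": ["R EH D"]})
# This rule has no override, so both default pronunciations remain valid.
will_read = ProductionRule("will_read", ["will", "read"])
```

In this sketch, `have_read.matches(["HH AE V", "R IY D"])` is rejected, while `will_read` still accepts either default pronunciation of "read".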

RECEIVING A NATURAL LANGUAGE REQUEST AND RETRIEVING A PERSONAL VOICE MEMO

    Publication No.: US20220076678A1

    Publication date: 2022-03-10

    Application No.: US17531371

    Filing date: 2021-11-19

    Abstract: A computer-implemented method is provided. The method includes: receiving commands to store memos; identifying subjects related to the memos; storing, in a database, the memos, their related subjects, and associated time information; receiving a natural language request to retrieve a memo, the request having query information; identifying a subject related to the request; responsive to the request, querying the database for memos related to the subject; identifying multiple memos in response to the database query; identifying a memo, from the multiple identified memos, that has the most recent associated time information; and providing a response in dependence on the identified memo.
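
The store-then-retrieve flow in this abstract can be sketched with an in-memory SQLite table. Everything here is an assumption made for illustration: the table layout, the function names, and especially `identify_subject`, which uses naive keyword matching as a stand-in for whatever subject identification the patent actually uses.

```python
import sqlite3

# Hypothetical subject vocabulary; a real system would use NLU here.
KNOWN_SUBJECTS = ("keys", "passport", "car")


def identify_subject(text):
    """Return the first known subject mentioned in the text, or None."""
    lowered = text.lower()
    for subject in KNOWN_SUBJECTS:
        if subject in lowered:
            return subject
    return None


def open_store():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE memos (subject TEXT, body TEXT, stored_at REAL)")
    return conn


def store_memo(conn, body, stored_at):
    # Store the memo together with its subject and time information.
    conn.execute(
        "INSERT INTO memos VALUES (?, ?, ?)",
        (identify_subject(body), body, stored_at),
    )


def retrieve_latest(conn, request):
    """Return the most recently stored memo about the request's subject."""
    subject = identify_subject(request)
    rows = conn.execute(
        "SELECT body, stored_at FROM memos WHERE subject = ?", (subject,)
    ).fetchall()
    if not rows:
        return None
    # Several memos may match; keep the one with the latest timestamp.
    body, _ = max(rows, key=lambda row: row[1])
    return body


conn = open_store()
store_memo(conn, "My keys are in the kitchen drawer", 100.0)
store_memo(conn, "I parked the car on level 3", 200.0)
store_memo(conn, "My keys are in my coat pocket", 300.0)
```

With these three memos stored, a request such as "Where are my keys?" matches two rows and the sketch answers from the later one.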

    System and method for detection and correction of a query

    Publication No.: US11263198B2

    Publication date: 2022-03-01

    Application No.: US16561020

    Filing date: 2019-09-05

    Abstract: Systems and methods are provided for systematically finding and fixing automatic speech recognition (ASR) mistranscriptions and natural language understanding (NLU) misinterpretations and labeling data for machine learning. High similarity of non-identical consecutive queries indicates ASR mistranscriptions. Consecutive queries with close vectors in a semantic embedding space indicate NLU misinterpretations. Key phrases and barge-in also indicate errors. Only queries issued within a short time of each other are considered.
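
The first signal described in this abstract, high similarity between non-identical consecutive queries, can be sketched as follows. This covers only the ASR case. The similarity measure (`difflib.SequenceMatcher` over characters) and both thresholds are assumptions chosen for illustration; the abstract does not specify them.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds, not taken from the patent.
MAX_GAP_SECONDS = 10.0
MIN_SIMILARITY = 0.8


def likely_asr_corrections(queries):
    """Flag consecutive query pairs that look like a retry after an ASR error.

    `queries` is a list of (timestamp_seconds, transcript) ordered by time.
    """
    flagged = []
    for (t1, q1), (t2, q2) in zip(queries, queries[1:]):
        if q1 == q2:
            continue  # identical queries carry no correction signal
        if t2 - t1 > MAX_GAP_SECONDS:
            continue  # only queries close together in time are considered
        if SequenceMatcher(None, q1, q2).ratio() >= MIN_SIMILARITY:
            flagged.append((q1, q2))
    return flagged


session = [
    (0.0, "play the beetles"),
    (4.0, "play the beatles"),
    (120.0, "what is the weather"),
]
```

Running `likely_asr_corrections(session)` flags the first pair, which differs by one character and arrives four seconds apart, and ignores the unrelated later query.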

CONFIGURABLE NEURAL SPEECH SYNTHESIS

    Publication No.: US20210390944A1

    Publication date: 2021-12-16

    Application No.: US17341082

    Filing date: 2021-06-07

    Inventor: Andrew RICHARDS

    Abstract: A discriminator trained on labeled samples of speech can compute probabilities of voice properties. A speech synthesis generative neural network that takes in text and continuous scale values of voice properties is trained to synthesize speech audio that the discriminator will infer as matching the values of the input voice properties. Voice parameters can include speaker voice parameters, accents, and attitudes, among others. Training can be done by transfer learning from an existing neural speech synthesis model or such a model can be trained with a loss function that considers speech and parameter values. A graphical user interface can allow voice designers for products to synthesize speech with a desired voice or generate a speech synthesis engine with frozen voice parameters. A vector of parameters can be used for comparison to previously registered voices in databases such as ones for trademark registration.

MULTI-MODAL AUDIO PROCESSING FOR VOICE-CONTROLLED DEVICES

    Publication No.: US20210312920A1

    Publication date: 2021-10-07

    Application No.: US17301291

    Filing date: 2021-03-30

    Inventor: Karl Stahl

    Abstract: A voice-controlled device includes a microphone to receive a set of sound waves that includes speech uttered by a user and other sound, and to output a first audio signal that includes a contribution from the speech uttered by the user and a contribution from the other sound. The device also includes a receiver to receive an electromagnetic signal and to output a second audio signal obtained from the electromagnetic signal. An audio pre-processor of the device processes the first audio signal using the second audio signal to reduce the contribution from the other sound in a processed audio signal. The voice-controlled device then provides the processed audio signal to a speech recognition module to determine a voice command issued by the user.
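
The pre-processing step, using a second audio signal to reduce the other sound in the microphone signal, can be illustrated with a deliberately simple stand-in. The abstract does not say how the reduction is performed. The sketch below assumes the two signals are time-aligned, sampled at the same rate, and related by a single unknown gain, then estimates that gain by least squares and subtracts the scaled reference. The function name is hypothetical, and a practical device would more likely need an adaptive filter.

```python
def reduce_known_sound(mic_signal, reference_signal):
    """Subtract the best-fitting scaled copy of the reference from the mic signal.

    Assumes both signals are equal-length, time-aligned sample lists and that
    the other sound reaches the microphone as gain * reference.
    """
    energy = sum(r * r for r in reference_signal)
    if energy == 0:
        return list(mic_signal)  # nothing to subtract
    # Least-squares estimate of the gain from reference to microphone.
    gain = sum(m * r for m, r in zip(mic_signal, reference_signal)) / energy
    return [m - gain * r for m, r in zip(mic_signal, reference_signal)]


# Toy data: "speech" is uncorrelated with the other sound by construction.
speech = [1.0, -1.0, 1.0, -1.0]
other_sound = [1.0, 1.0, -1.0, -1.0]
mic = [s + 0.5 * o for s, o in zip(speech, other_sound)]
processed = reduce_known_sound(mic, other_sound)
```

On this toy input the estimated gain matches the mixing gain, so `processed` recovers the speech samples; real audio would leave a residual.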

    Multi Device Proxy
    Invention application

    Publication No.: US20210210099A1

    Publication date: 2021-07-08

    Application No.: US16735677

    Filing date: 2020-01-06

    Abstract: A method and system for responding to multiple voice requests sent from a group of devices in substantive response to a single spoken utterance of a user. In one embodiment, if the devices have the same group ID, a server determines whether any of the received voice requests are duplicates. In one embodiment, voice requests received within a predetermined time window are examined to determine whether they are duplicates. If so, the server deems one of the received voice requests non-duplicate and the others duplicates, and sends a substantive response for the non-duplicate voice request. In some embodiments, a no-op is sent to the devices that do not receive the substantive response.
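
The server-side duplicate handling can be sketched as below. The abstract does not say how the non-duplicate request is chosen, so this sketch assumes the earliest arrival within a group wins. The `VoiceRequest` type, the one-second window, and the response labels are likewise hypothetical.

```python
from dataclasses import dataclass

# Hypothetical window; the abstract only says "predetermined".
WINDOW_SECONDS = 1.0


@dataclass
class VoiceRequest:
    device_id: str
    group_id: str
    received_at: float


def plan_responses(requests):
    """Map each device to "substantive" or "no-op".

    Assumption: within a group, the earliest request in a window is deemed
    non-duplicate and later ones in that window are duplicates.
    """
    plan = {}
    winner_time_by_group = {}
    for req in sorted(requests, key=lambda r: r.received_at):
        winner_time = winner_time_by_group.get(req.group_id)
        if winner_time is not None and req.received_at - winner_time <= WINDOW_SECONDS:
            plan[req.device_id] = "no-op"
        else:
            plan[req.device_id] = "substantive"
            winner_time_by_group[req.group_id] = req.received_at
    return plan


requests = [
    VoiceRequest("living-room", "home", 0.3),
    VoiceRequest("kitchen", "home", 0.0),
    VoiceRequest("office", "work", 0.4),
]
plan = plan_responses(requests)
```

Here the two "home" devices heard the same utterance, so only the kitchen device gets the substantive response, while the "work" device is handled independently.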

    Neural Speech-to-Meaning
    Invention application

    Publication No.: US20210174806A1

    Publication date: 2021-06-10

    Application No.: US16703783

    Filing date: 2019-12-04

    Abstract: A neural speech-to-meaning system is trained on speech audio expressing specific intents. The system receives speech audio and produces indications of when the speech in the audio matches the intent. Intents may include variables that can have a large range of values, such as the names of places. The neural speech-to-meaning system simultaneously recognizes enumerated values of variables and general intents. Recognized variable values can serve as arguments to API requests made in response to recognized intents. Accordingly, neural speech-to-meaning supports voice virtual assistants that serve users based on API hits.

    Identification of code for parsing given expressions

    Publication No.: US11003426B1

    Publication date: 2021-05-11

    Application No.: US16786991

    Filing date: 2020-02-10

    Abstract: A command-processing server provides natural language processing services to applications. The command-processing server stores a set of code blocks, each code block being able to interpret a set of corresponding natural language expressions. The command-processing server accepts natural language expressions and identifies the code blocks that are capable of interpreting those expressions by attempting to parse the natural language expressions using the code blocks. The command-processing server then provides a list of the identified code blocks to the developers, who can then incorporate the code blocks into their applications.
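
The identification step, finding which stored code blocks can interpret a set of expressions by trying to parse with each one, can be sketched like this. The two regex-based parsers and the `CODE_BLOCKS` registry are hypothetical stand-ins; the patent's code blocks are presumably far richer than a single pattern.

```python
import re


def parse_weather(text):
    """Hypothetical code block: interprets weather questions."""
    match = re.fullmatch(r"what is the weather in (?P<city>[a-z ]+)", text)
    return match.groupdict() if match else None


def parse_timer(text):
    """Hypothetical code block: interprets timer commands."""
    match = re.fullmatch(r"set a timer for (?P<minutes>\d+) minutes", text)
    return {"minutes": int(match.group("minutes"))} if match else None


# Registry of stored code blocks, keyed by name.
CODE_BLOCKS = {"weather": parse_weather, "timer": parse_timer}


def identify_code_blocks(expressions):
    """Return names of code blocks that parse at least one expression."""
    normalized = [e.lower().strip() for e in expressions]
    return [
        name
        for name, parser in CODE_BLOCKS.items()
        if any(parser(text) is not None for text in normalized)
    ]
```

A developer submitting `["Set a timer for 5 minutes"]` would get back only the `timer` block in this sketch, and an expression no block can parse yields an empty list.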
