Content filtering in media playing devices

    Publication number: US11736769B2

    Publication date: 2023-08-22

    Application number: US17228438

    Application date: 2021-04-12

    Abstract: Various approaches relate to user defined content filtering in media playing devices of undesirable content represented in stored and real-time content from content providers. For example, video, image, and/or audio data can be analyzed to identify and classify content included in the data using various classification models and object and text recognition approaches. Thereafter, the identification and classification can be used to control presentation and/or access to the content and/or portions of the content. For example, based on the classification, portions of the content can be modified (e.g., replaced, removed, degraded, etc.) using one or more techniques (e.g., media replacement, media removal, media degradation, etc.) and then presented.

    MULTI-MODAL AUDIO PROCESSING FOR VOICE-CONTROLLED DEVICES

    Publication number: US20230254631A1

    Publication date: 2023-08-10

    Application number: US18194885

    Application date: 2023-04-03

    Inventor: Karl Stahl

    Abstract: A voice-controlled device includes a microphone to receive a set of sound waves that includes speech uttered by a user and other sound, and to output a first audio signal that includes a contribution from the speech uttered by the user and a contribution from the other sound. The device also includes a receiver to receive an electromagnetic signal and to output a second audio signal obtained from the electromagnetic signal. An audio pre-processor of the device processes the first audio signal using the second audio signal to reduce the contribution from the other sound in a processed audio signal. The voice-controlled device then provides the processed audio signal to a speech recognition module to determine a voice command issued by the user.
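    The pre-processing step in this abstract can be illustrated with a deliberately simplified sketch. It assumes the second (reference) audio signal is already time-aligned with the microphone signal and reaches the microphone scaled by a single unknown gain; a practical device would more likely use an adaptive filter. The function name `remove_reference` is hypothetical and does not come from the patent.

```python
def remove_reference(mic, ref):
    """Subtract the best-fitting scaled copy of `ref` from `mic`.

    Assumption: the "other sound" in the microphone signal is the reference
    signal multiplied by one unknown gain, with no delay or reverberation.
    """
    energy = sum(r * r for r in ref)
    if energy == 0:
        return list(mic)  # nothing to subtract
    # Least-squares estimate of the gain that maps ref onto mic.
    gain = sum(m * r for m, r in zip(mic, ref)) / energy
    return [m - gain * r for m, r in zip(mic, ref)]


# Toy signals: the speech is uncorrelated with the reference over this window.
reference = [1.0, -1.0, 1.0, -1.0]
speech = [0.5, 0.5, -0.5, -0.5]
microphone = [s + 0.8 * r for s, r in zip(speech, reference)]
processed = remove_reference(microphone, reference)
```

    In this toy case `processed` is approximately equal to `speech`, which is the signal that would then be handed to the speech recognition module.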

    TOKEN CONFIDENCE SCORES FOR AUTOMATIC SPEECH RECOGNITION

    Publication number: US20230245649A1

    Publication date: 2023-08-03

    Application number: US17649810

    Application date: 2022-02-03

    CPC classification number: G10L15/1815 G10L15/02 G10L15/26 G10L2015/025

    Abstract: Methods and systems for correction of a likely erroneous word in a speech transcription are disclosed. By evaluating token confidence scores of individual words or phrases, the automatic speech recognition system can replace a low-confidence score word with a substitute word or phrase. Among various approaches, neural network models can be used to generate individual confidence scores. Such word substitution can enable the speech recognition system to automatically detect and correct likely errors in transcription. Furthermore, the system can indicate the token confidence scores on a graphic user interface for labeling and dictionary enhancement.
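    As a rough illustration of the substitution idea in this abstract, the sketch below replaces any token whose confidence score falls under a threshold when a substitute is available. The `Token` class, the `substitutes` lookup table, and the 0.5 threshold are all assumptions made for the example; the abstract does not specify how substitutes are chosen or what threshold is used.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    confidence: float  # assumed to lie in [0, 1], e.g. from a neural model


def correct_transcript(tokens, substitutes, threshold=0.5):
    """Replace low-confidence tokens that have a known substitute."""
    corrected = []
    for tok in tokens:
        if tok.confidence < threshold and tok.text in substitutes:
            corrected.append(substitutes[tok.text])  # likely error: swap it
        else:
            corrected.append(tok.text)  # keep confident or unknown tokens
    return " ".join(corrected)


transcript = [Token("recognize", 0.95), Token("peach", 0.20)]
result = correct_transcript(transcript, {"peach": "speech"})
```

    Here `result` is "recognize speech". The same per-token scores could also drive a display that highlights low-confidence words for labeling.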

    MULTIPLE SERVICE LEVELS FOR AUTOMATIC SPEECH RECOGNITION

    Publication number: US20230082955A1

    Publication date: 2023-03-16

    Application number: US17447823

    Application date: 2021-09-16

    Abstract: A system for performing automated speech recognition (ASR) on audio data includes a queue manager to receive a request to perform ASR on audio data, add the request to a queue of incoming requests, and determine a queue depth representing a number of requests in the queue at a given time. The system also includes a load supervisor to receive the request and the queue depth from the queue manager and assign a service level for the request based on the queue depth. In addition, the system includes a speech-to-text converter to receive the assigned service level for the request from the load supervisor, select an ASR model for the request based on the received service level, receive the audio data associated with the request, and perform ASR on the audio data using the selected ASR model.
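    The three components named in this abstract can be sketched as follows. The level names, the queue-depth thresholds, the model names, and the policy that a deeper queue maps to a cheaper model are assumptions for illustration only; the abstract states that the service level depends on queue depth but not how.

```python
from collections import deque


class QueueManager:
    """Holds incoming ASR requests and reports the current queue depth."""

    def __init__(self):
        self._queue = deque()

    def add(self, request_id):
        self._queue.append(request_id)
        return len(self._queue)  # queue depth after adding this request


def assign_service_level(queue_depth, low_water=3, high_water=10):
    """Load supervisor: pick a service level from the queue depth (assumed policy)."""
    if queue_depth >= high_water:
        return "economy"
    if queue_depth >= low_water:
        return "standard"
    return "premium"


# Hypothetical model names, one per service level.
MODELS = {
    "premium": "large-asr-model",
    "standard": "medium-asr-model",
    "economy": "small-asr-model",
}


def select_model(service_level):
    """Speech-to-text converter: choose the ASR model for a service level."""
    return MODELS[service_level]


manager = QueueManager()
for request_id in ("req-1", "req-2", "req-3"):
    depth = manager.add(request_id)
chosen = select_model(assign_service_level(depth))
```

    With three queued requests and the thresholds above, `chosen` is "medium-asr-model".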

    Differential spatial rendering of audio sources

    Publication number: US11589184B1

    Publication date: 2023-02-21

    Application number: US17655650

    Application date: 2022-03-21

    Abstract: Methods and systems for intuitive spatial audio rendering with improved intelligibility are disclosed. By establishing a virtual association between an audio source and a location in the listener's virtual audio space, a spatial audio rendering system can generate spatial audio signals that create a natural and immersive audio field for a listener. The system can receive the virtual location of the source as a parameter and map the source audio signal to a source-specific multi-channel audio signal. In addition, the spatial audio rendering system can be interactive and dynamically modify the rendering of the spatial audio in response to a user's active control or tracked movement.
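    Mapping a source signal to a multi-channel signal from a virtual location can be illustrated with constant-power stereo panning. This is a stand-in technique chosen for brevity, not the rendering method of the patent; the function name `render_stereo` and the azimuth convention (-90 degrees is full left, +90 degrees is full right) are assumptions.

```python
import math


def render_stereo(samples, azimuth_deg):
    """Pan a mono source to (left, right) pairs with constant total power."""
    azimuth = max(-90.0, min(90.0, azimuth_deg))  # clamp to the frontal arc
    theta = (azimuth + 90.0) / 180.0 * (math.pi / 2)  # 0 .. pi/2
    left_gain, right_gain = math.cos(theta), math.sin(theta)
    return [(s * left_gain, s * right_gain) for s in samples]


mono = [1.0, 0.5, -0.5]
centered = render_stereo(mono, 0.0)    # equal gain in both channels
hard_left = render_stereo(mono, -90.0) # all energy in the left channel
```

    An interactive renderer would call such a function again whenever the listener's control input or tracked movement changes the source's relative azimuth.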

    ENABLING NATURAL LANGUAGE INTERACTIONS WITH USER INTERFACES FOR USERS OF A SOFTWARE APPLICATION

    Publication number: US20220383869A1

    Publication date: 2022-12-01

    Application number: US17332927

    Application date: 2021-05-27

    Abstract: A user specifies a natural language command to a device. Software on the device generates contextual metadata about the user interface of the device, such as data about all visible elements of the user interface, and sends the contextual metadata along with the natural language command to a natural language understanding engine. The natural language understanding engine parses the natural language query using a stored grammar (e.g., a grammar provided by a maker of the device) and as a result of the parsing identifies information about the command (e.g., the user interface elements referenced by the command) and provides that information to the device. The device uses that provided information to respond to the command.

    SYSTEM AND METHOD FOR CORRECTION OF A QUERY USING A REPLACEMENT PHRASE

    Publication number: US20220147510A1

    Publication date: 2022-05-12

    Application number: US17581846

    Application date: 2022-01-21

    Abstract: Systems and methods are provided for natural language processing using neural network models and natural language virtual assistants. The system and method include receiving a natural language phrase including a word sequence, computing corresponding error probabilities that the words are errors, and, for a word with a corresponding error probability above a threshold, computing a replacement phrase with a low error probability so that the virtual assistant can provide a response depending on the replacement phrase.

    Wake suppression for audio playing and listening devices

    Publication number: US11328721B2

    Publication date: 2022-05-10

    Application number: US16781214

    Application date: 2020-02-04

    Abstract: A system and method are disclosed for ignoring a wakeword received at a speech-enabled listening device when it is determined the wakeword is reproduced audio from an audio-playing device. Determination can be by detecting audio distortions, by an ignore flag sent locally between an audio-playing device and speech-enabled device, by an ignore flag sent from a server, by comparison of received audio played audio to a wakeword within an audio-playing device or a speech-enabled device, and other means.
