Patent search ipc:"G10L25/54" Page 1

1.

Granted invention patent
Voice content selection for video content (in force)

Publication No.: US12101516B1

Publication date: 2024-09-24

Application No.: US17364448

Filing date: 2021-06-30

Applicant: Amazon Technologies, Inc.

Inventors: Saravanan Santhamoorthy Theckyam, Anil Kumar Nelakanti

IPC classification: H04N21/233, G06F40/279, G06F40/58, G06V40/10, G10L15/00, G10L25/54, G10L25/57, H04N21/234, H04N21/239, H04N21/25

CPC classification: H04N21/233, G06F40/279, G06F40/58, G06V40/10, G10L15/005, G10L25/54, G10L25/57, H04N21/23418, H04N21/2393, H04N21/251

Abstract: Techniques and apparatus for selecting audio content for a content entity in audio-visual content are described. An example technique involves identifying at least one content entity associated with a content item that is accessible to one or more users in a first language over a communication network. One or more attributes of the at least one content entity are determined. A plurality of audio content samples in a second language are obtained. Each audio content sample includes a different audio sample of a portion of speech of the content entity in the second language. A first audio content sample that satisfies a predetermined condition is determined, based on the plurality of audio content samples and the one or more attributes of the at least one content entity. An indication of the first audio content sample is provided.

2.

Published invention application
SYSTEM AND METHOD FOR CONTINUOUS MEDIA SEGMENT IDENTIFICATION (under examination, published)

Publication No.: US20240259613A1

Publication date: 2024-08-01

Application No.: US18511050

Filing date: 2023-11-16

Applicant: INSCAPE DATA, INC.

Inventor: W. Leo Hoarty

IPC classification: H04N21/233, G06F16/683, G06F16/70, G06F16/732, G06F16/783, G06F16/951, G10L25/06, G10L25/12, G10L25/54, H04N21/235, H04N21/236, H04N21/278, H04N21/8352

CPC classification: H04N21/233, G06F16/683, G06F16/70, G06F16/7328, G06F16/783, G06F16/7834, G06F16/951, G10L25/54, H04N21/235, H04N21/236, H04N21/278, H04N21/8352, G10L25/06, G10L25/12

Abstract: This invention provides a means to identify unknown media programming using the audio component of said programming. The invention extracts audio information from the media received by consumer electronic devices such as smart TVs and TV set-top boxes, then conveys said information to a remote server means, which in turn identifies said audio information of unknown identity by testing it against a database of known audio segment information. The system identifies unknown media programming in real time so that time-sensitive services may be offered, such as interactive television applications providing contextually related information or television advertisement substitution. Other uses include tracking media consumption, among many other services.
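
The database test described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that audio information has already been reduced to short hash strings and that a program is identified by simple vote counting; the abstract does not state how the comparison is done, and the names `build_index`, `identify`, and `min_votes` are hypothetical.

```python
from collections import Counter


def build_index(known):
    """Invert {program_id: [hash, ...]} into {hash: {program_id, ...}}."""
    index = {}
    for program_id, hashes in known.items():
        for h in hashes:
            index.setdefault(h, set()).add(program_id)
    return index


def identify(index, unknown_hashes, min_votes=2):
    """Return the known program sharing the most hashes with the query.

    Returns None when no program reaches min_votes matching hashes
    (min_votes is an assumed tuning parameter, not from the abstract).
    """
    votes = Counter()
    for h in unknown_hashes:
        for program_id in index.get(h, ()):
            votes[program_id] += 1
    if not votes:
        return None
    program_id, count = votes.most_common(1)[0]
    return program_id if count >= min_votes else None


# Toy data: two known programs that share one hash ("c").
known = {"news": ["a", "b", "c"], "ad": ["c", "d", "e"]}
index = build_index(known)
print(identify(index, ["c", "d", "e", "x"]))
```

An inverted index keeps each lookup independent of the number of known programs, which is one plausible way to meet the real-time requirement the abstract mentions.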

3.

Published invention application
APPARATUS, METHOD AND COMPUTER PROGRAM CODE FOR PROCESSING AUDIO STREAM (under examination, published)

Publication No.: US20240221777A1

Publication date: 2024-07-04

Application No.: US18686266

Filing date: 2022-07-12

Applicant: Utopia Music AG

Inventors: Linus Wahlgren, Max Flach

IPC classification: G10L25/54, G10L25/18

CPC classification: G10L25/54, G10L25/18

Abstract: Apparatus, method, and computer program code for processing an audio stream. The method includes: obtaining first peaks of an audio stream, wherein each first peak comprises a first peak amplitude at a first frequency and at a first time offset from a beginning of the audio stream; for each first peak, detecting a second peak in a window with a predetermined offset from the first peak, wherein the second peak comprises a second peak amplitude at a second frequency and at a second time offset from the beginning of the audio stream; and for each first peak, generating a fingerprint hash based on the first frequency, a time difference between the first time offset and the second time offset, a frequency difference between the first frequency and the second frequency, and an amplitude difference between the first peak amplitude and the second peak amplitude.
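
The hashing step in this abstract can be sketched as follows. The abstract names only the four quantities that feed the hash; the window size, the choice of the strongest peak in the window, the rounding, and the use of SHA-1 are all assumptions made for illustration, and `Peak`, `find_second_peak`, `fingerprint_hash`, and `fingerprints` are hypothetical names.

```python
import hashlib
from typing import NamedTuple


class Peak(NamedTuple):
    time: float  # seconds from the beginning of the stream
    freq: float  # frequency in Hz
    amp: float   # peak amplitude (arbitrary units)


def find_second_peak(first, peaks, offset=0.1, width=1.0):
    """Pick the strongest peak in a window starting `offset` seconds after `first`.

    `offset` and `width` are assumed values; the abstract only says the
    window has a predetermined offset from the first peak.
    """
    lo = first.time + offset
    hi = lo + width
    candidates = [p for p in peaks if lo <= p.time <= hi]
    return max(candidates, key=lambda p: p.amp) if candidates else None


def fingerprint_hash(first, second):
    """Hash the first frequency plus the time, frequency and amplitude differences."""
    fields = (
        round(first.freq),
        round((second.time - first.time) * 1000),  # time difference in ms
        round(second.freq - first.freq),
        round(second.amp - first.amp),
    )
    return hashlib.sha1(repr(fields).encode()).hexdigest()[:16]


def fingerprints(peaks):
    """Return (first-peak time, hash) pairs for every peak that has a partner."""
    out = []
    for first in peaks:
        second = find_second_peak(first, peaks)
        if second is not None:
            out.append((first.time, fingerprint_hash(first, second)))
    return out


peaks = [Peak(0.0, 440.0, 10.0), Peak(0.5, 880.0, 7.0), Peak(3.0, 300.0, 5.0)]
print(fingerprints(peaks))
```

Because the hash uses only differences in time, and not the absolute time offsets, the same pair of peaks yields the same hash wherever it occurs in the stream.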

4.

Published invention application
MEDIA SEGMENT PREDICTION FOR MEDIA GENERATION (under examination, published)

Publication No.: US20240127838A1

Publication date: 2024-04-18

Application No.: US18047572

Filing date: 2022-10-18

Applicant: QUALCOMM Incorporated

Inventors: Stephane VILLETTE, Sen LI, Pravin Kumar RAMADAS, Daniel Jared SINDER

IPC classification: G10L21/01, G10L17/02, G10L25/54

CPC classification: G10L21/01, G10L17/02, G10L25/54

Abstract: A device includes one or more processors configured to input one or more segments of an input media stream into a feature extractor. The one or more processors are further configured to pass an output of the feature extractor into an utterance classifier to produce at least one representation of at least one utterance class of a plurality of utterance classes. The one or more processors are further configured to pass the output of the feature extractor and the at least one representation into a segment matcher to produce a media output segment identifier.

5.

Granted invention patent
Audio associating of computing devices (in force)

Publication No.: US11934740B2

Publication date: 2024-03-19

Application No.: US16536659

Filing date: 2019-08-09

Applicant: Amazon Technologies, Inc.

Inventors: Justin-Josef Angel, Eric Alan Breitbard, Sean Robert Ryan, Robert Steven Murdock, Michael Douglas McQueen, Ryan Charles Chase, Colin Neil Swann

IPC classification: G06F3/16, G06F3/14, G10L15/22, G10L25/54

CPC classification: G06F3/167, G06F3/1407, G06F3/1454, G10L15/22, G10L25/54, G10L2015/223

Abstract: Methods, systems and apparatus for associating electronic devices together based on received audio commands are described. Methods for associating an audio-controlled device with a physically separate display screen device such that information responses can then be provided in both audio and graphic formats using the two devices in conjunction with each other are described. The audio-controlled device can receive audio commands that can be analyzed to determine the author, which can then be used to further streamline the association operation.

6.

Granted invention patent
Social network based voice enhancement system (in force)

Publication No.: US11871198B1

Publication date: 2024-01-09

Application No.: US16508648

Filing date: 2019-07-11

Applicant: Meta Platforms Technologies, LLC

Inventors: Philip Robinson, Vladimir Tourbabin, Jacob Ryan Donley, Andrew Lovitt

IPC classification: G06V40/16, G10L17/00, H04R5/04, H04R5/033, H04R3/04, G06F3/01, G10L25/54, G10L21/0208

CPC classification: H04R5/04, G06F3/013, G06V40/172, G10L17/00, G10L21/0208, G10L25/54, H04R3/04, H04R5/033

Abstract: An audio system presents enhanced audio content to a user of a headset. The audio system detects sounds from the local area, at least a portion of which originate from a human sound source. The audio system obtains a voice profile of an identified human sound source that generates at least the portion of the detected sounds. Based in part on the voice profile, the audio system enhances the portion of the detected sounds that are generated by the human sound source to obtain enhanced audio. The audio system presents the enhanced audio to the user.

7.

Granted invention patent
Interacting with a virtual assistant to receive updates (in force)

Publication No.: US11837215B1

Publication date: 2023-12-05

Application No.: US17982304

Filing date: 2022-11-07

Applicant: Amazon Technologies, Inc.

Inventors: Sunitha Kalkunte Srivatsa, Maayan Aharon, Aakarsh Nair, Nithya Venkataraman, Lohit Bijani

IPC classification: G06Q10/109, G10L15/22, G10L25/54, G06Q10/1093, G10L15/26

CPC classification: G10L15/22, G06Q10/1095, G06Q10/1097, G10L25/54, G06Q10/109, G06Q10/1093, G10L15/26, G10L2015/223, G10L2015/225

Abstract: Technologies are disclosed for interacting with a virtual assistant to request updates associated with one or more events and/or perform actions. According to some examples, a user may use their voice to interact with a virtual assistant to receive updates relating to events occurring during a certain period of time. For example, a user may request an update associated with one or more events occurring that day. The system may access data sources (e.g., calendar services, email services, etc.) to obtain data associated with the events, tag the events according to one or more conditions indicated by the data, and/or rank the events according to the tags. In addition, to resolve conditions associated with the events, the virtual assistant may also include options in the update to perform certain actions and/or to provide response data. The virtual assistant may generate the update and audibly provide the update to the user.
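
The tag-then-rank step in this abstract might look like the sketch below. The two conditions used here (overlapping times and a missing response) and the priority order are invented for illustration; the abstract does not list specific conditions, and `Event`, `TAG_PRIORITY`, `tag_events`, and `rank_events` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    title: str
    start_hour: int
    end_hour: int
    responded: bool = True
    tags: list = field(default_factory=list)


# Assumed priority order: a lower number is surfaced earlier in the update.
TAG_PRIORITY = {"conflict": 0, "needs_response": 1}


def tag_events(events):
    """Attach tags for the conditions indicated by each event's data."""
    for e in events:
        e.tags = []
        overlaps = any(
            o is not e and e.start_hour < o.end_hour and o.start_hour < e.end_hour
            for o in events
        )
        if overlaps:
            e.tags.append("conflict")
        if not e.responded:
            e.tags.append("needs_response")
    return events


def rank_events(events):
    """Order events by their most urgent tag, then by start time."""
    def key(e):
        best = min((TAG_PRIORITY[t] for t in e.tags), default=len(TAG_PRIORITY))
        return (best, e.start_hour)
    return sorted(events, key=key)


events = [
    Event("Lunch", 12, 13),
    Event("Standup", 9, 10, responded=False),
    Event("Review", 14, 16),
    Event("1:1", 15, 16),
]
for e in rank_events(tag_events(events)):
    print(e.title, e.tags)
```

Untagged events sort last, so the spoken update would lead with the events that have an unresolved condition.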

8.

Granted invention patent
Managing communications-related data based on interactions between and aggregated data involving data-center communications server and client-specific circuitry (in force)

Publication No.: US11792300B1

Publication date: 2023-10-17

Application No.: US17845619

Filing date: 2022-06-21

Applicant: 8x8, Inc.

Inventors: Ali Arsanjani, Bryan R. Martin, Manu Mukerji, Venkat Nagaswamy, Marshall Lincoln

IPC classification: H04L67/63, G10L25/54, H04L41/22, G10L15/22

CPC classification: H04L67/63, G10L15/22, G10L25/54, H04L41/22

Abstract: Certain aspects of the disclosure are directed to context aggregation in a data communications network. According to a specific example, user-data communications between a client station and another station participating in data communications via the data communications services can be processed, where the client station is associated with one of a plurality of client entities configured and arranged to interface with a data communications server providing data communications services. Context information can be aggregated for each respective user-data communication between the client station and the participating station, where the context information corresponds to at least one communications-specific characteristic associated with the user-data communications. In response to receipt of a subsequent user-data communication from the participating station and intended for the client station, a dynamic series of inquiries can be presented to the participating station to address the subsequent user-data communication, based on the aggregated context information.

9.

Granted invention patent
Tool for assisting people with speech disorder (in force)

Publication No.: US11763821B1

Publication date: 2023-09-19

Application No.: US16455196

Filing date: 2019-06-27

Applicant: CERNER INNOVATION, INC.

Inventor: Douglas S. McNair

IPC classification: G10L15/30, G10L25/66, G10L15/22, G10L25/54, G10L25/63, A61B5/00, G10L15/20, G10L15/26

CPC classification: G10L15/30, G10L15/22, G10L25/54, G10L25/66, A61B5/4803, G10L15/20, G10L15/26, G10L25/63

Abstract: Various tools are disclosed for providing assistive or augmentative means to enhance the fluency and accuracy of persons having speech disabilities. These technologies may automatically ascertain and dynamically improve the accuracy with which automatic speech recognition (ASR) systems recognize utterances of persons having impaired speech conditions. In an embodiment, digitized audio information about a speaker’s utterance is processed to determine a set of candidate words matching the utterance. From these candidate words, a set of concepts is determined using a finite state machine model. A pictogram representing each concept is identified and presented to the speaker so that the speaker may select the pictogram corresponding to the best match of his or her intended meaning associated with the utterance. An action corresponding to the speaker’s selection may then be performed, for example displaying or synthesizing speech from textual information describing the selected concept.

10.

Published invention application
MACHINE LEARNING MODELS FOR AUTOMATED PROCESSING OF AUDIO WAVEFORM DATABASE ENTRIES (under examination, published)

Publication No.: US20230238019A1

Publication date: 2023-07-27

Application No.: US17580748

Filing date: 2022-01-21

Applicant: Evernorth Strategic Development, Inc.

Inventors: Jonathan J. Lisic, John Ciliberti

IPC classification: G10L25/66, G10L25/30, G10L25/54

CPC classification: G10L25/66, G10L25/30, G10L25/54

Abstract: A computer system includes memory hardware and processor hardware configured to execute stored instructions. The instructions include training a machine learning model with historical feature vector inputs including multiple audio data entries and multiple claims data entries, to generate a condition likelihood output indicative of a specified condition associated with one of multiple historical database entities. The instructions include, for each of a set of multiple database entities, generating a feature vector input according to the audio data and the claims data associated with the entity, processing the feature vector input with the machine learning model to generate the condition likelihood output, and assigning the database entity to an identified condition subset in response to determining that the condition likelihood output is greater than a specified likelihood threshold. The instructions include transforming a user interface to display the condition likelihood output associated with the database entity.
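
The scoring and threshold step in this abstract can be sketched as below. The hand-set logistic weights stand in for a trained model, and the feature names `audio_pause_ratio` and `claims_count` are hypothetical; the abstract does not say which model type or features are used.

```python
import math

# Hypothetical hand-set weights standing in for a trained model.
WEIGHTS = {"audio_pause_ratio": 2.0, "claims_count": 0.5}
BIAS = -3.0


def condition_likelihood(features):
    """Map a feature vector to a likelihood in (0, 1) with a logistic function."""
    z = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def assign_condition_subset(entities, threshold=0.5):
    """Keep entities whose likelihood is greater than the specified threshold."""
    subset = {}
    for entity_id, features in entities.items():
        likelihood = condition_likelihood(features)
        if likelihood > threshold:
            subset[entity_id] = likelihood
    return subset


entities = {
    "A": {"audio_pause_ratio": 0.9, "claims_count": 4},
    "B": {"audio_pause_ratio": 0.1, "claims_count": 1},
}
print(assign_condition_subset(entities))
```

The strict `>` comparison mirrors the abstract's wording, "greater than a specified likelihood threshold".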
