Patent search: cpc:"G10L25/57", page 1

1.

Invention application publication
SEGMENT IDENTIFICATION FROM LONG VIDEOS (status: pending, published)

Publication No.: US20240355119A1

Publication date: 2024-10-24

Application No.: US18305587

Filing date: 2023-04-24

Applicant: ADOBE INC.

Inventors: Ioana Croitoru, Trung Huu Bui, Zhaowen Wang, Seunghyun Yoon, Franck Dernoncourt, Hailin Jin

IPC classification: G06V20/40, G06V10/774, G06V20/70, G10L15/04, G10L15/18, G10L15/22, G10L25/57

CPC classification: G06V20/41, G06V10/774, G06V20/49, G06V20/70, G10L15/04, G10L15/1815, G10L15/22, G10L25/57, G10L15/16

Abstract: One or more aspects of the method, apparatus, and non-transitory computer readable medium include receiving a query relating to a long video. One or more aspects of the method, apparatus, and non-transitory computer readable medium further include generating a segment of the long video corresponding to the query using a machine learning model trained to identify relevant segments from long videos. One or more aspects of the method, apparatus, and non-transitory computer readable medium further include responding to the query based on the generated segment.

2.

Granted invention patent
Video segment selection and editing using transcript interactions (status: in force)

Publication No.: US12119028B2

Publication date: 2024-10-15

Application No.: US17967364

Filing date: 2022-10-17

Applicant: Adobe Inc.

Inventors: Xue Bai, Justin Jonathan Salamon, Aseem Omprakash Agarwala, Hijung Shin, Haoran Cai, Joel Richard Brandt, Lubomira Assenova Dontcheva, Cristin Ailidh Fraser

IPC classification: G11B27/036, G06F40/166, G10L15/26, G10L25/57, G11B27/34, G06F3/0482, G06F3/04845, G06F3/0485

CPC classification: G11B27/036, G06F40/166, G10L15/26, G10L25/57, G11B27/34, G06F3/0482, G06F3/04845, G06F3/0485

Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for identifying candidate boundaries for video segments, video segment selection using those boundaries, and text-based video editing of video segments selected via transcript interactions. In an example implementation, boundaries of detected sentences and words are extracted from a transcript, the boundaries are retimed into an adjacent speech gap to a location where voice or audio activity is a minimum, and the resulting boundaries are stored as candidate boundaries for video segments. As such, a transcript interface presents the transcript, interprets input selecting transcript text as an instruction to select a video segment with corresponding boundaries selected from the candidate boundaries, and interprets commands that are traditionally thought of as text-based operations (e.g., cut, copy, paste) as an instruction to perform a corresponding video editing operation using the selected video segment.
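
The boundary retiming step described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the use of frame indices, the per-frame `activity` list, and the choice to search the gap inclusively from the end of one word to the start of the next are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def retime_boundary(gap_first: int, gap_last: int,
                    activity: Sequence[float]) -> int:
    """Return the frame in [gap_first, gap_last] with the lowest activity.

    `activity` is a hypothetical per-frame voice/audio activity score.
    Ties resolve to the earliest frame.
    """
    if not 0 <= gap_first <= gap_last < len(activity):
        raise ValueError("gap must lie inside the activity track")
    return min(range(gap_first, gap_last + 1), key=activity.__getitem__)


def candidate_boundaries(words: Sequence[Tuple[int, int]],
                         activity: Sequence[float]) -> List[int]:
    """Produce one candidate boundary per pair of consecutive words.

    `words` holds (start_frame, end_frame) spans in transcript order.
    Each boundary is moved into the gap between two words, to the
    frame where activity is lowest (assumed reading of the abstract).
    """
    boundaries = []
    for (_, prev_end), (next_start, _) in zip(words, words[1:]):
        if next_start > prev_end:
            boundaries.append(retime_boundary(prev_end, next_start, activity))
        else:
            # No gap between the words: keep the original boundary.
            boundaries.append(prev_end)
    return boundaries
```

Storing the returned frame indices as the only selectable cut points is what would let a text selection in a transcript map onto a video segment with clean edges.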

3.

Granted invention patent
Personal content managed during extended display screen recording (status: in force)

Publication No.: US12114097B2

Publication date: 2024-10-08

Application No.: US18190227

Filing date: 2023-03-27

Applicant: Motorola Mobility LLC

Inventors: Amit Kumar Agrawal, Gautham Prabhakar Natakala, Shuang Wu

IPC classification: H04N5/92, G06T5/20, G10L25/57

CPC classification: H04N5/92, G06T5/20, G10L25/57

Abstract: In aspects of personal content managed during extended display screen recording, a screen recording system includes a wireless device that provides digital image content for display on an extended display device, and a screen recording session on the wireless device captures the digital image content and audio data. The wireless device implements a content control module that can determine the screen recording session would capture personal content associated with a user of the wireless device. The content control module can initiate a private screen review mode in which the personal content is displayable on a display screen of the wireless device and is prevented from visual display on the extended display device. The content control module can also generate a shareable screen recording that includes the audio data and the digital image content displayed on the extended display device, without including the personal content.

4.

Granted invention patent
Audio processing method and electronic device (status: in force)

Publication No.: US12106777B2

Publication date: 2024-10-01

Application No.: US17940057

Filing date: 2022-09-08

Applicant: VIVO MOBILE COMMUNICATION CO., LTD.

Inventors: Jixiang Hu

IPC classification: G11B27/02, G06F3/16, G06F40/166, G10L25/57

CPC classification: G11B27/02, G06F3/165, G06F40/166, G10L25/57

Abstract: Embodiments of the present disclosure provide an audio processing method and an electronic device. The method includes: first obtaining text information corresponding to a to-be-processed audio, where the text information includes a to-be-processed text and a playback period corresponding to each field in the to-be-processed text; then receiving a first input on the to-be-processed text; in response to the first input, determining, as a to-be-processed field, a field indicated by the first input in the to-be-processed text; then receiving a second input on the to-be-processed field; obtaining a target audio segment in response to the second input; and finally modifying an audio segment at a playback period corresponding to the to-be-processed field according to the target audio segment, to obtain a target audio.

5.

Granted invention patent
Voice content selection for video content (status: in force)

Publication No.: US12101516B1

Publication date: 2024-09-24

Application No.: US17364448

Filing date: 2021-06-30

Applicant: Amazon Technologies, Inc.

Inventors: Saravanan Santhamoorthy Theckyam, Anil Kumar Nelakanti

IPC classification: H04N21/233, G06F40/279, G06F40/58, G06V40/10, G10L15/00, G10L25/54, G10L25/57, H04N21/234, H04N21/239, H04N21/25

CPC classification: H04N21/233, G06F40/279, G06F40/58, G06V40/10, G10L15/005, G10L25/54, G10L25/57, H04N21/23418, H04N21/2393, H04N21/251

Abstract: Techniques and apparatus for selecting audio content for a content entity in audio-visual content are described. An example technique involves identifying at least one content entity associated with a content item that is accessible to one or more users in a first language over a communication network. One or more attributes of the at least one content entity are determined. A plurality of audio content samples in a second language are obtained. Each audio content sample includes a different audio sample of a portion of speech of the content entity in the second language. A first audio content sample that satisfies a predetermined condition is determined, based on the plurality of audio content samples and the one or more attributes of the at least one content entity. An indication of the first audio content sample is provided.

6.

Invention application publication
COMPUTER SYSTEM, METHOD, AND PROGRAM FOR IMPROVING RELATIONS WITH INDIVIDUAL PARTIES IN TWO-PARTY COMMUNICATION (status: pending, published)

Publication No.: US20240297954A1

Publication date: 2024-09-05

Application No.: US18572683

Filing date: 2022-06-24

Applicant: KAKEAI, Inc.

Inventors: Hidetaka HONDA

IPC classification: H04N5/272, G06V20/40, G06V40/20, G10L15/22, G10L25/57

CPC classification: H04N5/272, G06V20/40, G06V40/28, G10L15/22, G10L25/57

Abstract: This computer system for assisting communication between two parties is configured to: receive information indicating a response that a first party of the two parties expects a second party of the two parties to make in the communication; receive speech and/or video during the communication; derive advice relating to the communication on the basis of the information and the speech and/or the video; and provide the derived advice during the communication.

7.

Invention application publication
METHOD AND SYSTEM FOR AUTOMATIC SPEAKER FRAMING IN VIDEO APPLICATIONS (status: pending, published)

Publication No.: US20240296694A1

Publication date: 2024-09-05

Application No.: US18661087

Filing date: 2024-05-10

Applicant: GN Audio A/S

Inventors: Morten Smidt Proschowsky, Sui Kun Guan, Nihit Rajendra Save, Aurangzeb Khan

IPC classification: G06V40/10, G06T7/20, G06T7/70, G06V10/25, G06V20/40, G10L17/00, G10L25/57, H04N5/262, H04R1/40, H04R3/00

CPC classification: G06V40/10, G06T7/20, G06T7/70, G06V10/25, G06V20/49, G10L17/00, G10L25/57, H04N5/2628, G06T2207/10016, G06T2207/20084, G06T2207/20132, G06T2207/30196, H04R1/406, H04R3/005

Abstract: For video applications, a method for dynamically switching from a current ROI to a target ROI is disclosed, wherein the target ROI only includes active speakers. Advantageously, if the target ROI crops a non-speaker, then the target ROI is expanded to include said non-speaker. Transitioning from the current ROI to the target ROI may be achieved based on a cutover transition technique, or a smooth transition technique. The cutover transition technique achieves the change from the current ROI to the target ROI in a single interval, whereas the smooth transition technique achieves the change over a number of intervals, wherein a percentage of the total change required is allocated to each interval. A system for implementing the above method is also disclosed.
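
The two transition techniques named in this abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the `ROI` class, the function names, the linear interpolation of position and size, and the example percentages are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ROI:
    """Hypothetical region of interest: top-left corner plus size, in pixels."""
    x: float
    y: float
    w: float
    h: float


def _blend(current: ROI, target: ROI, fraction: float) -> ROI:
    """Return the ROI that lies `fraction` (0..1) of the way to `target`."""
    def lerp(a: float, b: float) -> float:
        return a + (b - a) * fraction
    return ROI(lerp(current.x, target.x), lerp(current.y, target.y),
               lerp(current.w, target.w), lerp(current.h, target.h))


def cutover_transition(current: ROI, target: ROI) -> List[ROI]:
    """Cutover: the whole change happens in a single interval."""
    return [target]


def smooth_transition(current: ROI, target: ROI,
                      percentages: Sequence[float]) -> List[ROI]:
    """Smooth: each interval is allocated a percentage of the total change."""
    if abs(sum(percentages) - 100.0) > 1e-9:
        raise ValueError("percentages must sum to 100")
    steps: List[ROI] = []
    done = 0.0
    for share in percentages:
        done += share  # cumulative share of the change applied so far
        steps.append(_blend(current, target, done / 100.0))
    return steps
```

For example, `smooth_transition(current, target, [25, 25, 50])` yields three intermediate ROIs, the last of which equals the target, while `cutover_transition` jumps there in one step.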

8.

Invention application publication
SUGGESTED QUERIES FOR TRANSCRIPT SEARCH (status: pending, published)

Publication No.: US20240273139A1

Publication date: 2024-08-15

Application No.: US18408792

Filing date: 2024-01-10

Applicant: Microsoft Technology Licensing, LLC

Inventors: Adi Miller, Haim Somech, Michael Sterenberg

IPC classification: G06F16/683, G06F3/0481, G06F3/0484, G06F16/332, G06F40/295, G06F40/30, G10L17/22, G10L25/57, H04N7/15

CPC classification: G06F16/685, G06F3/0481, G06F3/0484, G06F16/3329, G06F40/295, G06F40/30, G10L17/22, G10L25/57, H04N7/15

Abstract: Systems and methods for surfacing natural language queries from one or more transcripts. An example method may include converting received audio to text, through automated speech recognition, to form a transcript of the audio, wherein the transcript includes text of the audio and identifications of speakers associated with portions of the text corresponding to utterances from the respective speakers; generating input signals based on at least the transcript; executing at least one of one or more heuristics or a trained machine-learning (ML) model, using the generated input signals as an input, to generate at least one of a suggested natural language query for searching the transcript or a key moment within the received audio; and causing at least one of the suggested natural language query or the key moment to be surfaced on one or more remote devices.

9.

Invention application publication
Enhancing Audio Content of a Captured Scene (status: pending, published)

Publication No.: US20240249743A1

Publication date: 2024-07-25

Application No.: US18562663

Filing date: 2021-05-25

Applicant: Google LLC

Inventors: Snehitha Singaraju

IPC classification: G10L21/0364, G06V10/70, G06V20/40, G06V20/50, G10L21/034, G10L25/57

CPC classification: G10L21/0364, G06V10/768, G06V20/40, G06V20/50, G10L21/034, G10L25/57

Abstract: This document describes systems and methods for dynamically enhancing audio content of a captured scene (104). As part of the described systems and methods, an electronic device (102) may include a content-enhancement manager module (216) that directs the electronic device (102) to perform operations to enhance the audio content. Operations may include determining a context (504) surrounding the capture of the scene, determining an audio focus point (604) within the scene, or determining an intent of a user directing the electronic device (102) to capture the scene (104). Based on one or more of these determinations, the electronic device (102) may use a variety of techniques to enhance the audio content associated with the captured scene so as to present the captured scene (104) with relevant audio content.

10.

Invention application publication
GLOBALIZATION OF VIDEOS USING AUTOMATED VOICE DUBBING (status: pending, published)

Publication No.: US20240211704A1

Publication date: 2024-06-27

Application No.: US18069438

Filing date: 2022-12-21

Applicant: Meta Platforms, Inc.

Inventors: Charles Patrick Mason Griffin, Prakash Chandra, Carlos Lourenco, Amit Agarwal

IPC classification: G06F40/58, G10L17/20, G10L19/16

CPC classification: G06F40/58, G10L17/20, G10L19/167, G10L25/57

Abstract: An audio processing system includes: a receiver configured to receive the original audio data; a processor configured to execute the instructions stored in the memory to cause the audio processing system to: separate a background noise audio data, a first speaker audio data, and a second speaker audio data; recognize first speaker speech, convert the first speaker speech to first speaker text, translate the first speaker text to a second language text, and convert the second language text to a second speech; recognize second speaker speech, convert the second speaker speech to second speaker text, translate the second speaker text to the second language text, and convert the second language text of the second speaker to a second speech for the second speaker; and generate encoded audio data; and a transmitter configured to transmit the encoded audio data to a content user device.
