Patent search ap:("NETFLIX Page INC.") AND inv:"Yadong Wang"

1.

Granted invention patent
Systems and methods for active speaker detection (in force)

Publication (announcement) number: US11983923B1

Publication (announcement) date: 2024-05-14

Application number: US18063107

Application date: 2022-12-08

Applicant: NETFLIX, INC.

Inventor: Yadong Wang, Kyle Tacke, Shilpa Jois Rao

IPC: G06V20/40, G10L25/57, G10L25/60, G11B27/031

CPC classification number: G06V20/41, G06V20/48, G10L25/57, G10L25/60, G11B27/031, G06V2201/10

Abstract: The disclosed computer-implemented method may include receiving, as input, an audio/video data object; isolating a video stream of a visible potential speaker over a plurality of frames of the audio/video data object; isolating an audio stream over the plurality of frames; providing the isolated video stream and the isolated audio stream to a machine learning model trained with contrastive learning, the contrastive learning using (i) a corpus of video segments of visible speakers with corresponding original audio for positive samples; and (ii) a corpus of video segments of visible speakers with corresponding dubbed audio for negative samples; and evaluating a match between the isolated audio stream and the isolated video stream based at least in part on an output of the machine learning model. Various other methods, systems, and computer-readable media are also disclosed.
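
The abstract names contrastive learning with original-audio clips as positives and dubbed-audio clips as negatives, but does not state a loss function. As a minimal sketch, assuming a standard margin-based pairwise contrastive loss over precomputed embeddings, the idea could look like the following. The function name, the `margin` value, and the embedding inputs are hypothetical and not taken from the patent.

```python
import math

def contrastive_loss(video_emb, audio_emb, is_original_audio, margin=1.0):
    """Pairwise contrastive loss on one (video, audio) embedding pair.

    Positive pair (original audio): penalize any distance between embeddings.
    Negative pair (dubbed audio): penalize only when closer than `margin`.
    """
    distance = math.dist(video_emb, audio_emb)  # Euclidean distance
    if is_original_audio:
        return distance ** 2
    return max(0.0, margin - distance) ** 2
```

Under this assumed loss, a trained model would place a speaker's lip-movement embedding near the original audio embedding and at least `margin` away from the dubbed one, so a small distance at inference time suggests a match.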

2.

Granted invention patent
Automated workflows from media asset differentials (in force)

Publication (announcement) number: US11659214B2

Publication (announcement) date: 2023-05-23

Application number: US17245252

Application date: 2021-04-30

Applicant: Netflix, Inc.

Inventor: Yadong Wang, Chih-Wei Wu, Kyle Tacke, Shilpa Jois Rao, Boney Sekh, Andrew Swan, Raja Ranjan Senapati

IPC: H04N21/2343, G11B27/10, G11B27/031, H04N21/234, G06Q10/0631

CPC classification number: H04N21/2343, G06Q10/06312, G11B27/031, G11B27/10, H04N21/23412, H04N21/23418

Abstract: The disclosed computer-implemented method may include (1) accessing a first media data object and a different, second media data object that, when played back, each render temporally sequenced content, (2) comparing first temporally sequenced content represented by the first media data object with second temporally sequenced content represented by the second media data object to identify a set of common temporal subsequences between the first media data object and the second media data object, (3) identifying a set of edits relative to the set of common temporal subsequences that describe a difference between the temporally sequenced content of the first media data object and the temporally sequenced content of the second media data object, and (4) executing a workflow relating to the first media data object and/or the second media data object based on the set of edits. Various other methods, systems, and computer-readable media are also disclosed.
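
As a rough illustration of steps (2) and (3), the sketch below compares two sequences of per-frame fingerprints with `difflib.SequenceMatcher` from the Python standard library. The fingerprint strings and the `diff_media` helper are hypothetical, and the abstract does not say that this particular matching algorithm is used.

```python
from difflib import SequenceMatcher

def diff_media(first, second):
    """Return (common, edits) for two sequences of frame fingerprints.

    common: (start_in_first, start_in_second, length) for each shared run
    edits:  difflib opcodes (tag, i1, i2, j1, j2) other than 'equal'
    """
    matcher = SequenceMatcher(a=first, b=second, autojunk=False)
    common = [(m.a, m.b, m.size) for m in matcher.get_matching_blocks() if m.size]
    edits = [op for op in matcher.get_opcodes() if op[0] != "equal"]
    return common, edits

# Hypothetical fingerprints: the second cut drops shot "c" and adds shot "x".
cut_v1 = ["a", "b", "c", "d", "e"]
cut_v2 = ["a", "b", "d", "x", "e"]
common, edits = diff_media(cut_v1, cut_v2)
```

A downstream workflow (step 4) could then iterate over `edits` and, for example, flag only the inserted span for re-subtitling rather than reprocessing the whole asset.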

3.

Granted invention patent
Systems and methods for mixing synthetic voice with original audio tracks (in force)

Publication (announcement) number: US11430485B2

Publication (announcement) date: 2022-08-30

Application number: US16747314

Application date: 2020-01-20

Applicant: Netflix, Inc.

Inventor: Yadong Wang, Murthy Parthasarathi, Andrew Swan, Raja Ranjan Senapati, Shilpa Jois Rao, Anjali Chablani, Kyle Tacke

IPC: G11B27/00, H04N5/93, G11B27/036, G11B27/034, H04N21/84, G10L13/08, H04N21/485, H04N21/81, G10L13/00, H04N9/80

Abstract: The disclosed computer-implemented method may include accessing an audio track that is associated with a video recording, identifying a section of the accessed audio track having a specific audio characteristic, reducing a volume level of the audio track in the identified section, accessing an audio segment that includes a synthesized voice and inserting the accessed audio segment into the identified section of the audio track, where the inserted segment has a higher volume level than the reduced volume level of the audio track in the identified section. The synthesized voice description can be used to provide additional information to a visually impaired viewer without interrupting the audio track that is associated with the video recording, typically by inserting the synthesized voice description into a segment of the audio track in which there is no dialog. Various other methods, systems, and computer-readable media are also disclosed.
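
The volume reduction and insertion steps can be sketched as follows, assuming the section has already been identified and that audio is held as plain lists of float samples. The `duck_and_insert` name and the `duck_gain` default are hypothetical choices for illustration only.

```python
def duck_and_insert(track, voice, start, duck_gain=0.25):
    """Lower `track` under `voice` starting at sample index `start`, then mix the voice in.

    The original track is attenuated by `duck_gain` only where the voice plays,
    so the inserted voice ends up louder than the reduced background.
    """
    if start < 0 or start + len(voice) > len(track):
        raise ValueError("voice segment must fit inside the track")
    mixed = list(track)  # leave the caller's track untouched
    for offset, voice_sample in enumerate(voice):
        i = start + offset
        mixed[i] = track[i] * duck_gain + voice_sample
    return mixed
```

For example, `duck_and_insert([0.5] * 6, [0.5, 0.5], start=2)` leaves samples 0, 1, 4, and 5 unchanged and replaces samples 2 and 3 with the attenuated background plus the voice.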

4.

Invention application
SYSTEMS AND METHODS FOR CORRELATING SPEECH AND LIP MOVEMENT (in force)

Publication (announcement) number: US20210407510A1

Publication (announcement) date: 2021-12-30

Application number: US16911247

Application date: 2020-06-24

Applicant: Netflix, Inc.

Inventor: Yadong Wang, Shilpa Jois Rao

IPC: G10L15/25, G10L25/78, G06K9/00

Abstract: The disclosed computer-implemented method includes analyzing, by a speech detection system, a media file to detect lip movement of a speaker who is visually rendered in media content of the media file. The method additionally includes identifying, by the speech detection system, audio content within the media file, and improving accuracy of a temporal correlation of the speech detection system. The method may involve correlating the lip movement of the speaker with the audio content, and determining, based on the correlation between the lip movement of the speaker and the audio content, that the audio content comprises speech from the speaker. The method may further involve recording, based on the determination that the audio content comprises speech from the speaker, the temporal correlation between the speech and the lip movement of the speaker as metadata of the media file. Various other methods, systems, and computer-readable media are disclosed.

5.

Granted invention patent
Techniques for generating subtitles for trailers (under examination, published)

Publication (announcement) number: US10674222B2

Publication (announcement) date: 2020-06-02

Application number: US15875989

Application date: 2018-01-19

Applicant: NETFLIX, INC.

Inventor: Murthy Parthasarathi, Yadong Wang, Boney Sekh

IPC: H04N5/93, G11B27/00, H04N21/485, G06F16/432, G11B27/031, G11B27/34, H04N21/233, H04N21/488, H04N21/8547, H04N21/8549, H04N9/80

Abstract: In various embodiments, a subtitle application generates a subtitle list for a trailer. In operation, the subtitle application performs matching operation(s) between trailer audio associated with a trailer and source audio associated with an audiovisual program. The subtitle application then maps a subtitle associated with the source audio from a source timeline associated with the source audio to a trailer timeline associated with the trailer audio to generate a mapped subtitle. Subsequently, the subtitle application generates a trailer subtitle list based on the mapped subtitle and at least one additional mapped subtitle. Because the subtitle application generates the trailer subtitle list based on audio comparisons, the subtitle application ensures that the proper subtitles are included in the trailer subtitle list without requiring a subtitler to view the trailer.
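
The timeline mapping step can be sketched as a simple offset shift, assuming the audio matching step has already produced an aligned span. The tuple layouts and the `map_subtitle` helper are hypothetical, and the abstract does not describe the data structures actually used.

```python
def map_subtitle(subtitle, match):
    """Shift a (start, end, text) subtitle from source time to trailer time.

    `match` is (source_start, trailer_start, duration) in seconds, describing a
    span of source audio that was found in the trailer audio.
    Returns None when the subtitle does not lie fully inside the matched span.
    """
    start, end, text = subtitle
    source_start, trailer_start, duration = match
    if start < source_start or end > source_start + duration:
        return None
    offset = trailer_start - source_start
    return (start + offset, end + offset, text)
```

A trailer subtitle list would then be the non-`None` results of applying this mapping to every source subtitle against every matched span.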

6.

Published invention application
SYSTEMS AND METHODS FOR CLASSIFYING MUSIC FROM HETEROGENOUS AUDIO SOURCES (under examination, published)

Publication (announcement) number: US20230409897A1

Publication (announcement) date: 2023-12-21

Application number: US17841322

Application date: 2022-06-15

Applicant: Netflix, Inc.

Inventor: Yadong Wang, Jeff Kitchener, Shilpa Jois Rao

IPC: G06N3/08, G10H1/00

CPC classification number: G06N3/08, G10H1/0008, G10H2250/311, G10H2210/036, G10H2210/041

Abstract: The disclosed computer-implemented method may include accessing an audio stream with heterogenous audio content; dividing the audio stream into a plurality of frames; generating a plurality of spectrogram patches, each spectrogram patch within the plurality of spectrogram patches being derived from a frame within the plurality of frames; and providing each spectrogram patch within the plurality of spectrogram patches as input to a convolutional neural network classifier and receiving, as output, a classification of music within a corresponding frame from within the plurality of frames. Various other methods, systems, and computer-readable media are also disclosed.
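
The framing and spectrogram-patch steps can be sketched with the standard library alone. This is an illustration under stated assumptions: non-overlapping frames, a naive DFT with no window function, and magnitude-only bins. The helper names are hypothetical, and the convolutional classifier that would consume each patch is omitted.

```python
import cmath

def split_frames(samples, frame_len):
    """Split samples into non-overlapping frames, dropping any short tail."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def magnitude_spectrum(window):
    """Magnitudes of the first N/2 + 1 DFT bins (naive O(N^2) DFT)."""
    n = len(window)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(window)))
            for k in range(n // 2 + 1)]

def spectrogram_patch(frame, window_len):
    """One patch per frame: rows are time windows, columns are frequency bins."""
    return [magnitude_spectrum(w) for w in split_frames(frame, window_len)]
```

Each patch is a small two-dimensional array, which is the kind of input a convolutional classifier expects. A practical pipeline would more likely use an FFT library and overlapping, windowed frames.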

7.

Granted invention patent
System and methods for automatically mixing audio for acoustic scenes (in force)

Publication (announcement) number: US11238888B2

Publication (announcement) date: 2022-02-01

Application number: US16732142

Application date: 2019-12-31

Applicant: Netflix, Inc.

Inventor: Yadong Wang, Shilpa Jois Rao, Murthy Parthasarathi, Kyle Tacke

IPC: G10L25/51, G10L15/00, G10L15/22, G10L25/81, G10L25/84

Abstract: The disclosed computer-implemented method may include obtaining an audio sample from a content source, inputting the obtained audio sample into a trained machine learning model, obtaining the output of the trained machine learning model, wherein the output is a profile of an environment in which the input audio sample was recorded, obtaining an acoustic impulse response corresponding to the profile of the environment in which the input audio sample was recorded, obtaining a second audio sample, processing the obtained acoustic impulse response with the second audio sample, and inserting a result of processing the obtained acoustic impulse response and the second audio sample into an audio track. Various other methods, systems, and computer-readable media are also disclosed.
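
One common way to "process" an impulse response with an audio sample is discrete convolution. The sketch below assumes that interpretation, which the abstract does not confirm. The environment profile is taken as given rather than predicted by a model, and the `IMPULSE_RESPONSES` table and its values are hypothetical.

```python
# Hypothetical lookup from an environment profile to an impulse response.
IMPULSE_RESPONSES = {
    "small_room": [1.0, 0.5],
    "hall": [1.0, 0.0, 0.6, 0.0, 0.3],
}

def convolve(signal, impulse_response):
    """Full discrete convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def match_acoustics(dry_sample, environment_profile):
    """Apply the profile's impulse response to a second, dry audio sample."""
    return convolve(dry_sample, IMPULSE_RESPONSES[environment_profile])
```

The result carries the reverberation of the profiled environment and could then be inserted into the audio track alongside the original recording.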

8.

Published invention application
SYSTEMS AND METHODS FOR AUTOMATICALLY GENERATING SOUND EVENT SUBTITLES (under examination, published)

Publication (announcement) number: US20230412760A1

Publication (announcement) date: 2023-12-21

Application number: US17841564

Application date: 2022-06-15

Applicant: Netflix, Inc.

Inventor: Yadong Wang, Shilpa Jois Rao

IPC: H04N5/93, G10L15/00, G10L15/04, G10L15/26, G10L25/57, G10L25/81, G10L15/22, H04N5/278

CPC classification number: H04N5/9305, G10L15/005, G10L15/04, G10L15/26, G10L25/57, G10L25/81, G10L15/22, H04N5/278

Abstract: The disclosed computer-implemented method may include systems and methods for automatically generating sound event subtitles for digital videos. For example, the systems and methods described herein can automatically generate subtitles for sound events within a digital video soundtrack that includes sounds other than speech. Additionally, the systems and methods described herein can automatically generate sound event subtitles as part of an automatic and comprehensive approach that generates subtitles for all sounds within a soundtrack of a digital video—thereby avoiding the need for any manual inputs as part of the subtitling process.

9.

Published invention application
AUTOMATED WORKFLOWS FROM MEDIA ASSET DIFFERENTIALS (under examination, published)

Publication (announcement) number: US20230232055A1

Publication (announcement) date: 2023-07-20

Application number: US18186366

Application date: 2023-03-20

Applicant: Netflix, Inc.

Inventor: Yadong Wang, Chih-Wei Wu, Kyle Tacke, Shilpa Jois Rao, Boney Sekh, Andrew Swan, Raja Ranjan Senapati

IPC: H04N21/2343, H04N21/234, G06Q10/0631, G11B27/10, G11B27/031

CPC classification number: H04N21/2343, H04N21/23412, G06Q10/06312, H04N21/23418, G11B27/10, G11B27/031

Abstract: The disclosed computer-implemented method may include (1) accessing a first media data object and a different, second media data object that, when played back, each render temporally sequenced content, (2) comparing first temporally sequenced content represented by the first media data object with second temporally sequenced content represented by the second media data object to identify a set of common temporal subsequences between the first media data object and the second media data object, (3) identifying a set of edits relative to the set of common temporal subsequences that describe a difference between the temporally sequenced content of the first media data object and the temporally sequenced content of the second media data object, and (4) executing a workflow relating to the first media data object and/or the second media data object based on the set of edits. Various other methods, systems, and computer-readable media are also disclosed.

10.

Invention application
SYSTEM AND METHODS FOR AUTOMATICALLY MIXING AUDIO FOR ACOUSTIC SCENES (in force)

Publication (announcement) number: US20210201931A1

Publication (announcement) date: 2021-07-01

Application number: US16732142

Application date: 2019-12-31

Applicant: Netflix, Inc.

Inventor: Yadong Wang, Shilpa Jois Rao, Murthy Parthasarathi, Kyle Tacke

IPC: G10L25/51, G10L15/00, G10L15/22, G10L25/84, G10L25/81

Abstract: The disclosed computer-implemented method may include obtaining an audio sample from a content source, inputting the obtained audio sample into a trained machine learning model, obtaining the output of the trained machine learning model, wherein the output is a profile of an environment in which the input audio sample was recorded, obtaining an acoustic impulse response corresponding to the profile of the environment in which the input audio sample was recorded, obtaining a second audio sample, processing the obtained acoustic impulse response with the second audio sample, and inserting a result of processing the obtained acoustic impulse response and the second audio sample into an audio track. Various other methods, systems, and computer-readable media are also disclosed.
