Patent search ap:("Google Inc.") AND inv:"Sourish Chaudhuri"

1.

Granted invention patent
Automatic smoothed captioning of non-speech sounds from audio (in force)

Publication (announcement) No.: US10037313B2

Publication (announcement) date: 2018-07-31

Application No.: US15245152

Application date: 2016-08-23

Applicant: Google Inc.

Inventor: Fangzhou Wang , Sourish Chaudhuri , Daniel Ellis , Nathan Reale

IPC: G10L15/00 , G06F17/24 , G10L25/84 , G10L15/20 , G10L25/78 , G10L21/06

CPC classification number: G06F17/241 , G10L15/20 , G10L25/78 , G10L25/84 , G10L2021/065 , G10L2025/783

Abstract: A content server accesses an audio stream and inputs portions of the audio stream into one or more non-speech classifiers for classification. The non-speech classifiers generate, for portions of the audio stream, a set of raw scores representing likelihoods that the respective portion of the audio stream includes an occurrence of a particular class of non-speech sounds associated with each of the non-speech classifiers. The content server generates binary scores for the sets of raw scores, the binary scores generated based on a smoothing of a respective set of raw scores. The content server applies a set of non-speech captions to portions of the audio stream in time, each of the sets of non-speech captions based on a different one of the sets of binary scores of the corresponding portion of the audio stream.
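
Illustrative note: the abstract describes a pipeline of raw classifier scores, smoothing, binary scores, and timed captions, but does not say which smoothing is used. The sketch below assumes a centered moving average and a fixed threshold purely for illustration; the function names, the window size, the 0.5 threshold, and the one-second portion length are all hypothetical and are not taken from the patent.

```python
from typing import List, Tuple


def smooth(raw: List[float], window: int = 3) -> List[float]:
    """Centered moving average (an assumed smoothing choice); edges use available neighbors."""
    half = window // 2
    out = []
    for i in range(len(raw)):
        lo, hi = max(0, i - half), min(len(raw), i + half + 1)
        out.append(sum(raw[lo:hi]) / (hi - lo))
    return out


def binarize(raw: List[float], window: int = 3, threshold: float = 0.5) -> List[int]:
    """Turn raw likelihoods into binary scores by thresholding the smoothed values."""
    return [1 if s >= threshold else 0 for s in smooth(raw, window)]


def caption_spans(binary: List[int], label: str,
                  portion_sec: float = 1.0) -> List[Tuple[float, float, str]]:
    """Merge consecutive positive portions into (start_sec, end_sec, label) captions."""
    spans = []
    start = None
    for i, b in enumerate(binary + [0]):  # trailing 0 closes a span that runs to the end
        if b and start is None:
            start = i
        elif not b and start is not None:
            spans.append((start * portion_sec, i * portion_sec, label))
            start = None
    return spans


# Hypothetical raw scores from one non-speech classifier, one score per portion.
raw_scores = [0.0, 0.9, 0.1, 0.8, 0.9, 0.7, 0.1, 0.0]
binary = binarize(raw_scores)
captions = caption_spans(binary, "[APPLAUSE]")
```

In this example the isolated spike at the second portion is suppressed by the smoothing, so a single caption spanning seconds 2.0 to 6.0 is produced instead of two fragmented ones.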

2.

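Invention patent application
ASSOCIATING FACES WITH VOICES FOR SPEAKER DIARIZATION WITHIN VIDEOS (under examination, published)

Publication (announcement) No.: US20180174600A1

Publication (announcement) date: 2018-06-21

Application No.: US15497497

Application date: 2017-04-26

Applicant: Google Inc.

Inventor: Sourish Chaudhuri , Kenneth Hoover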

IPC: G10L25/57 , G11B27/10 , G10L17/00 , G06K9/00 , G10L25/21 , G10L15/26 , G10L25/93 , G06K9/66 , G10L15/06 , H04N21/488

CPC classification number: G10L25/57 , G06K9/00288 , G06K9/00744 , G06K9/00765 , G06K9/66 , G10L15/063 , G10L15/265 , G10L17/005 , G10L17/04 , G10L17/10 , G10L21/0272 , G10L25/21 , G10L25/30 , G10L25/78 , G10L25/93 , G11B27/031 , G11B27/10 , G11B27/28 , H04N21/233 , H04N21/23418 , H04N21/4394 , H04N21/44008 , H04N21/4666 , H04N21/4884 , H04N21/8549

Abstract: A computer-implemented method for speech diarization is described. The method comprises determining temporal positions of separate faces in a video using face detection and clustering. Voice features are detected in the speech sections of the video. The method further includes generating a correlation between the determined separate faces and separate voices based at least on the temporal positions of the separate faces and the separate voices in the video. This correlation is stored in a content store with the video.
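
Illustrative note: the abstract says the face/voice correlation is based at least on the temporal positions of faces and voices, without giving a formula. The sketch below assumes the simplest possible reading, scoring each face/voice pair by total overlapping time and assigning each voice to its best-overlapping face. The identifiers and the overlap criterion are hypothetical, not the patented method.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start_sec, end_sec)


def overlap(a: Interval, b: Interval) -> float:
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def correlate(faces: Dict[str, List[Interval]],
              voices: Dict[str, List[Interval]]) -> Dict[str, str]:
    """Map each voice cluster to the face cluster it co-occurs with the longest."""
    result = {}
    for voice_id, voice_intervals in voices.items():
        totals = {
            face_id: sum(overlap(f, v) for f in face_intervals for v in voice_intervals)
            for face_id, face_intervals in faces.items()
        }
        if not totals:
            continue  # no faces detected at all
        best = max(totals, key=totals.get)
        if totals[best] > 0:  # leave voices with no on-screen face unassigned
            result[voice_id] = best
    return result


# Hypothetical output of face clustering and voice clustering for one video.
faces = {"face_A": [(0, 10), (20, 30)], "face_B": [(10, 20)]}
voices = {"voice_1": [(1, 8), (22, 27)], "voice_2": [(11, 19)]}
mapping = correlate(faces, voices)
```

Here `voice_1` overlaps `face_A` for 12 seconds and `face_B` for none, so it is mapped to `face_A`; `voice_2` is mapped to `face_B`.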

3.

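Invention patent application
FILTERING WIND NOISES IN VIDEO CONTENT (under examination, published)

Publication (announcement) No.: US20180084301A1

Publication (announcement) date: 2018-03-22

Application No.: US15826622

Application date: 2017-11-29

Applicant: Google Inc.

Inventor: Elad Eban , Aren Jansen , Sourish Chaudhuri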

IPC: H04N21/439 , H04N21/44 , H04N21/233 , G10L21/0208 , H04N5/60 , H04H60/58 , G10L25/57 , H04N5/911

CPC classification number: H04N21/4398 , G10L21/0208 , G10L25/57 , H04H60/12 , H04H60/58 , H04H60/65 , H04N5/602 , H04N5/911 , H04N9/802 , H04N21/233 , H04N21/4394 , H04N21/44016

Abstract: Implementations disclose filtering wind noises in video content. A method includes receiving video content comprising an audio component and a video component, detecting, by a processing device, occurrence of a wind noise artifact in a segment of the audio component, identifying an intensity of the wind noise artifact, wherein the intensity is based on a signal-to-noise ratio of the wind noise artifact, selecting, by the processing device, a wind noise replacement operation based on the identified intensity of the wind noise artifact, and applying, by the processing device, the selected wind noise replacement operation to the segment of the audio component to remove the wind noise artifact from the segment.
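
Illustrative note: the abstract states that a replacement operation is selected from an intensity based on a signal-to-noise ratio, but it names neither the operations nor any thresholds. The sketch below uses the standard decibel definition of SNR as a power ratio; the operation names (`attenuate`, `filter`, `replace_segment`) and the 10 dB and 0 dB cut-offs are hypothetical placeholders chosen only to show the selection step.

```python
import math


def snr_db(signal_power: float, noise_power: float) -> float:
    """Signal-to-noise ratio in decibels from two positive power values."""
    return 10.0 * math.log10(signal_power / noise_power)


def select_operation(snr: float, mild_db: float = 10.0, severe_db: float = 0.0) -> str:
    """Pick a (hypothetical) wind noise replacement operation from the SNR."""
    if snr >= mild_db:
        return "attenuate"        # wind is weak relative to the wanted audio
    if snr >= severe_db:
        return "filter"           # wind is comparable to the wanted audio
    return "replace_segment"      # wind dominates the segment


# Hypothetical power estimates for one audio segment.
operation = select_operation(snr_db(signal_power=4.0, noise_power=8.0))
```

With the example powers the SNR is about -3 dB, so the sketch falls through to the most aggressive placeholder operation.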

4.

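Invention patent application
FILTERING WIND NOISES IN VIDEO CONTENT (in force)

Publication (announcement) No.: US20170324990A1

Publication (announcement) date: 2017-11-09

Application No.: US15147040

Application date: 2016-05-05

Applicant: Google Inc.

Inventor: Elad Eban , Aren Jansen , Sourish Chaudhuri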

IPC: H04N21/439 , H04N5/60

CPC classification number: H04N21/4398 , G10L21/0208 , G10L25/57 , H04H60/12 , H04H60/58 , H04H60/65 , H04N5/602 , H04N5/911 , H04N9/802 , H04N21/233 , H04N21/4394 , H04N21/44016

Abstract: Implementations disclose filtering wind noises in video content. A method includes receiving video content comprising an audio component and a video component, detecting, by a processing device, occurrence of a wind noise artifact in a segment of the audio component, identifying duration of the wind noise artifact and intensity of the wind noise artifact, selecting, by the processing device, a wind noise replacement operation based on the identified duration and intensity of the wind noise artifact, and applying, by the processing device, the selected wind noise replacement operation to the segment of the audio component to remove the wind noise artifact from the segment.

5.

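Invention patent application
AUTOMATIC SMOOTHED CAPTIONING OF NON-SPEECH SOUNDS FROM AUDIO (under examination, published)

Publication (announcement) No.: US20170278525A1

Publication (announcement) date: 2017-09-28

Application No.: US15245152

Application date: 2016-08-23

Applicant: Google Inc.

Inventor: Fangzhou Wang , Sourish Chaudhuri , Daniel Ellis , Nathan Reale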

IPC: G10L21/10 , G10L15/20 , G06F17/24 , G10L25/84

CPC classification number: G06F17/241 , G10L15/20 , G10L25/78 , G10L25/84 , G10L2021/065 , G10L2025/783

Abstract: A content server accesses an audio stream and inputs portions of the audio stream into one or more non-speech classifiers for classification. The non-speech classifiers generate, for portions of the audio stream, a set of raw scores representing likelihoods that the respective portion of the audio stream includes an occurrence of a particular class of non-speech sounds associated with each of the non-speech classifiers. The content server generates binary scores for the sets of raw scores, the binary scores generated based on a smoothing of a respective set of raw scores. The content server applies a set of non-speech captions to portions of the audio stream in time, each of the sets of non-speech captions based on a different one of the sets of binary scores of the corresponding portion of the audio stream.

6.

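Granted invention patent
Filtering wind noises in video content (in force)

Publication (announcement) No.: US10356469B2

Publication (announcement) date: 2019-07-16

Application No.: US15826622

Application date: 2017-11-29

Applicant: Google Inc.

Inventor: Elad Eban , Aren Jansen , Sourish Chaudhuri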

IPC: H04N5/60 , G10L25/57 , H04H60/12 , H04H60/58 , H04H60/65 , H04N21/44 , H04N5/911 , H04N9/802 , H04N21/233 , H04N21/439 , G10L21/0208

Abstract: Implementations disclose filtering wind noises in video content. A method includes receiving video content comprising an audio component and a video component, detecting, by a processing device, occurrence of a wind noise artifact in a segment of the audio component, identifying an intensity of the wind noise artifact, wherein the intensity is based on a signal-to-noise ratio of the wind noise artifact, selecting, by the processing device, a wind noise replacement operation based on the identified intensity of the wind noise artifact, and applying, by the processing device, the selected wind noise replacement operation to the segment of the audio component to remove the wind noise artifact from the segment.

7.

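Granted invention patent
Filtering wind noises in video content (in force)

Publication (announcement) No.: US09838737B2

Publication (announcement) date: 2017-12-05

Application No.: US15147040

Application date: 2016-05-05

Applicant: Google Inc.

Inventor: Elad Eban , Aren Jansen , Sourish Chaudhuri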

IPC: H04N5/60 , H04N21/439

CPC classification number: H04N21/4398 , G10L21/0208 , G10L25/57 , H04H60/12 , H04H60/58 , H04H60/65 , H04N5/602 , H04N5/911 , H04N9/802 , H04N21/233 , H04N21/4394 , H04N21/44016

Abstract: Implementations disclose filtering wind noises in video content. A method includes receiving video content comprising an audio component and a video component, detecting, by a processing device, occurrence of a wind noise artifact in a segment of the audio component, identifying duration of the wind noise artifact and intensity of the wind noise artifact, selecting, by the processing device, a wind noise replacement operation based on the identified duration and intensity of the wind noise artifact, and applying, by the processing device, the selected wind noise replacement operation to the segment of the audio component to remove the wind noise artifact from the segment.

8.

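Invention patent application
AUTOMATIC DETERMINATION OF TIMING WINDOWS FOR SPEECH CAPTIONS IN AN AUDIO STREAM (under examination, published)

Publication (announcement) No.: US20170316792A1

Publication (announcement) date: 2017-11-02

Application No.: US15225513

Application date: 2016-08-01

Applicant: Google Inc.

Inventor: Sourish Chaudhuri , Nebojsa Ciric , Khiem Pham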

IPC: G10L25/93 , G10L19/022 , G10L21/055 , G10L25/27

CPC classification number: G10L25/27 , G10L25/48 , G10L25/87 , G11B27/031

Abstract: A content system accesses an audio stream. The content system inputs segments of the audio stream into a speech classifier for classification, the speech classifier generating, for the segments of the audio stream, raw scores representing likelihoods that the respective segment of the audio stream includes an occurrence of a speech sound. The content system generates binary scores for the audio stream based on the set of raw scores, each binary score generated based on an aggregation of raw scores from a consecutive series of the segments of the audio stream. The content system generates one or more timing windows for the speech sounds in the audio stream based on the binary scores, each timing window indicating an estimate of the beginning and ending timestamps of one or more speech sounds in the audio stream.
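
Illustrative note: the abstract describes aggregating raw speech scores over consecutive segments into binary scores and then deriving timing windows, without specifying the aggregation. The sketch below assumes a mean over non-overlapping groups of segments and a fixed threshold; the group size, the threshold, the half-second segment length, and the function names are hypothetical.

```python
from typing import List, Tuple


def binary_scores(raw: List[float], group: int = 3, threshold: float = 0.5) -> List[int]:
    """One binary score per group of consecutive segments, from the mean raw score."""
    out = []
    for i in range(0, len(raw), group):
        chunk = raw[i:i + group]
        out.append(1 if sum(chunk) / len(chunk) >= threshold else 0)
    return out


def timing_windows(binary: List[int], group_sec: float) -> List[Tuple[float, float]]:
    """Merge adjacent positive groups into (start_sec, end_sec) timing windows."""
    windows = []
    start = None
    for i, b in enumerate(binary + [0]):  # trailing 0 closes a window at the end
        if b and start is None:
            start = i
        elif not b and start is not None:
            windows.append((start * group_sec, i * group_sec))
            start = None
    return windows


# Hypothetical raw speech likelihoods, one per 0.5-second segment.
raw = [0.1, 0.2, 0.1, 0.9, 0.8, 0.7, 0.6, 0.9, 0.3, 0.1, 0.0, 0.2]
scores = binary_scores(raw, group=3)
windows = timing_windows(scores, group_sec=3 * 0.5)
```

For these values the two middle groups are classified as speech, giving one estimated window from 1.5 to 4.5 seconds.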
