Patent search ap:("Google Inc.") AND inv:"Nathan Reale" Page 1

1.

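Granted invention patent
Automatic smoothed captioning of non-speech sounds from audio (status: in force)

Publication (announcement) number: US10037313B2

Publication (announcement) date: 2018-07-31

Application number: US15245152

Filing date: 2016-08-23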

Applicant: Google Inc.

Inventor: Fangzhou Wang, Sourish Chaudhuri, Daniel Ellis, Nathan Reale

IPC: G10L15/00, G06F17/24, G10L25/84, G10L15/20, G10L25/78, G10L21/06

CPC classification number: G06F17/241, G10L15/20, G10L25/78, G10L25/84, G10L2021/065, G10L2025/783

Abstract: A content server accesses an audio stream and inputs portions of the audio stream into one or more non-speech classifiers for classification. For portions of the audio stream, the non-speech classifiers generate a set of raw scores representing likelihoods that the respective portion includes an occurrence of the particular class of non-speech sounds associated with each classifier. The content server generates binary scores for the sets of raw scores, each set of binary scores based on a smoothing of the respective set of raw scores. The content server applies sets of non-speech captions to portions of the audio stream in time, each set of non-speech captions based on a different one of the sets of binary scores for the corresponding portion of the audio stream.
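
The pipeline in the abstract (raw classifier scores, smoothing, binary scores, timed captions) can be illustrated with a minimal sketch. The abstract does not say which smoothing method is used, so the centered moving average, the fixed threshold, the one-second hop between scores, and every function name below are assumptions made for illustration only, not the patented method.

```python
from typing import Dict, List, Tuple


def moving_average(scores: List[float], window: int) -> List[float]:
    """Smooth raw scores with a centered moving average (assumed smoother)."""
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


def binarize(scores: List[float], threshold: float) -> List[int]:
    """Turn smoothed scores into binary scores (assumed fixed threshold)."""
    return [1 if s >= threshold else 0 for s in scores]


def caption_spans(
    binary: List[int], label: str, hop_seconds: float
) -> List[Tuple[float, float, str]]:
    """Convert runs of 1s into (start_time, end_time, caption) tuples."""
    spans = []
    start = None
    for i, b in enumerate(binary + [0]):  # trailing 0 closes an open run
        if b and start is None:
            start = i
        elif not b and start is not None:
            spans.append((start * hop_seconds, i * hop_seconds, label))
            start = None
    return spans


def smoothed_captions(
    raw_scores_by_class: Dict[str, List[float]],
    window: int = 3,
    threshold: float = 0.5,
    hop_seconds: float = 1.0,
) -> List[Tuple[float, float, str]]:
    """Smooth, binarize, and caption each sound class independently."""
    captions = []
    for label, raw in raw_scores_by_class.items():
        binary = binarize(moving_average(raw, window), threshold)
        captions.extend(caption_spans(binary, label, hop_seconds))
    return sorted(captions)


# Hypothetical raw scores for one class: a lone spike at index 1 and a
# one-frame dropout at index 2 just before a sustained event.
raw = [0.0, 0.9, 0.0, 0.9, 0.9, 0.9, 0.0, 0.0]
print(smoothed_captions({"[Applause]": raw}))
```

With these made-up scores, smoothing suppresses the isolated spike and bridges the one-frame dropout, so the class yields a single caption span instead of several flickering ones.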

2.

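Invention patent application
AUTOMATIC SMOOTHED CAPTIONING OF NON-SPEECH SOUNDS FROM AUDIO (status: under examination, published)

Publication (announcement) number: US20170278525A1

Publication (announcement) date: 2017-09-28

Application number: US15245152

Filing date: 2016-08-23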

Applicant: Google Inc.

Inventor: Fangzhou Wang, Sourish Chaudhuri, Daniel Ellis, Nathan Reale

IPC: G10L21/10, G10L15/20, G06F17/24, G10L25/84

CPC classification number: G06F17/241, G10L15/20, G10L25/78, G10L25/84, G10L2021/065, G10L2025/783

Abstract: A content server accesses an audio stream and inputs portions of the audio stream into one or more non-speech classifiers for classification. For portions of the audio stream, the non-speech classifiers generate a set of raw scores representing likelihoods that the respective portion includes an occurrence of the particular class of non-speech sounds associated with each classifier. The content server generates binary scores for the sets of raw scores, each set of binary scores based on a smoothing of the respective set of raw scores. The content server applies sets of non-speech captions to portions of the audio stream in time, each set of non-speech captions based on a different one of the sets of binary scores for the corresponding portion of the audio stream.
