AUTOMATIC SMOOTHED CAPTIONING OF NON-SPEECH SOUNDS FROM AUDIO

Patent Application

US20170278525A1 AUTOMATIC SMOOTHED CAPTIONING OF NON-SPEECH SOUNDS FROM AUDIO Pending (published)

Title: AUTOMATIC SMOOTHED CAPTIONING OF NON-SPEECH SOUNDS FROM AUDIO
Application No.: US15245152

Filing date: 2016-08-23
Publication No.: US20170278525A1

Publication date: 2017-09-28
Inventors: Fangzhou Wang, Sourish Chaudhuri, Daniel Ellis, Nathan Reale
Applicant: Google Inc.
Main classification: G10L21/10
IPC classifications: G10L21/10; G10L15/20; G06F17/24; G10L25/84

Abstract:

A content server accesses an audio stream and inputs portions of the audio stream into one or more non-speech classifiers for classification. Each non-speech classifier is associated with a particular class of non-speech sounds and generates, for portions of the audio stream, a set of raw scores representing likelihoods that the respective portion of the audio stream includes an occurrence of that class. The content server generates a set of binary scores for each set of raw scores, based on a smoothing of the respective set of raw scores. The content server applies sets of non-speech captions to portions of the audio stream in time, each set of non-speech captions based on a different one of the sets of binary scores for the corresponding portion of the audio stream.
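
The pipeline the abstract describes (raw scores per class, smoothing, binary scores, timed captions) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the abstract does not say which smoothing method is used, so a centered moving average followed by a fixed threshold stands in for it, and the function names, the 0.5 threshold, the window of 3, and the one-second portion length (`hop_s`) are all hypothetical.

```python
from typing import Dict, List, Tuple

Span = Tuple[float, float, str]  # (start seconds, end seconds, caption text)


def smooth(raw: List[float], window: int = 3) -> List[float]:
    """Centered moving average (an assumed stand-in for the smoothing step).

    Near the edges, only the samples that exist are averaged.
    """
    half = window // 2
    out = []
    for i in range(len(raw)):
        lo, hi = max(0, i - half), min(len(raw), i + half + 1)
        out.append(sum(raw[lo:hi]) / (hi - lo))
    return out


def binarize(raw: List[float], window: int = 3, threshold: float = 0.5) -> List[int]:
    """Turn raw likelihood scores into binary scores via smoothing + threshold."""
    return [1 if s >= threshold else 0 for s in smooth(raw, window)]


def caption_spans(binary: List[int], label: str, hop_s: float = 1.0) -> List[Span]:
    """Merge consecutive 1s into (start, end, caption) spans."""
    spans: List[Span] = []
    start = None
    for i, b in enumerate(binary + [0]):  # trailing 0 closes any open run
        if b and start is None:
            start = i
        elif not b and start is not None:
            spans.append((start * hop_s, i * hop_s, f"[{label}]"))
            start = None
    return spans


def caption_stream(raw_by_class: Dict[str, List[float]],
                   window: int = 3, threshold: float = 0.5,
                   hop_s: float = 1.0) -> List[Span]:
    """One set of raw scores per sound class in, time-ordered captions out."""
    spans: List[Span] = []
    for label, raw in raw_by_class.items():
        spans.extend(caption_spans(binarize(raw, window, threshold), label, hop_s))
    return sorted(spans)


# Hypothetical scores from a "laughter" classifier, one per 1-second portion.
laughter = [0.0, 0.9, 0.0, 0.0, 0.8, 0.9, 0.7, 0.0, 0.0]
print(caption_stream({"laughter": laughter}))
```

In this sketch the isolated spike in the second portion is averaged away, while the sustained run later in the stream survives as a single caption span. That is the practical point of smoothing before binarizing: captions do not flicker on and off with individual noisy scores.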

Published/Granted Documents

US10037313B2 Automatic smoothed captioning of non-speech sounds from audio Publication/grant date: 2018-07-31

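IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L21/00	Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility (G10L19/00 takes precedence)
G10L21/06	.Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids (G10L15/26 takes precedence)
G10L21/10	..Transforming into visible information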