Audio event detection with window-based prediction

Granted invention patent

US11948599B2 Audio event detection with window-based prediction (in force)

Patent title: Audio event detection with window-based prediction
Application number: US17647318
Filing date: 2022-01-06
Publication (announcement) number: US11948599B2
Publication (announcement) date: 2024-04-02
Inventors: Lihi Ahuva Shiloh Perl, Ben Fishman, Gilad Pundak, Yonit Hoffman
Applicant: Microsoft Technology Licensing, LLC
Applicant address: Redmond, WA, US
Patentee: MICROSOFT TECHNOLOGY LICENSING, LLC
Current patentee: MICROSOFT TECHNOLOGY LICENSING, LLC
Current patentee address: Redmond, WA, US
Agency: Newport IP, LLC
Agent: Jacob P. Rohwer
Main classification: G10L25/93
IPC classification: G10L25/93; G06N3/048; G06N3/08; G10L25/45

Abstract:

A computing system for a plurality of classes of audio events is provided, including one or more processors configured to divide a run-time audio signal into a plurality of segments and process each segment of the run-time audio signal in a time domain to generate a normalized time domain representation of each segment. The processor is further configured to feed the normalized time domain representation of each segment to an input layer of a trained neural network. The processor is further configured to generate, by the neural network, a plurality of predicted classification scores and associated probabilities for each class of audio event contained in each segment of the run-time input audio signal. In post-processing, the processor is further configured to generate smoothed predicted classification scores, associated smoothed probabilities, and class window confidence values for each class for each of a plurality of candidate window sizes.
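The pre- and post-processing steps named in the abstract can be illustrated with a minimal sketch. The abstract does not say how segments are normalized, how scores are smoothed, or how a class window confidence value is computed, so everything below is an assumption made for illustration: peak normalization, a trailing moving average, and "confidence = highest smoothed probability". All function names are hypothetical, and the trained neural network is left out; the per-segment probabilities are simply taken as input.

```python
from typing import Dict, List, Sequence


def split_into_segments(signal: Sequence[float], segment_len: int) -> List[List[float]]:
    """Divide a signal into non-overlapping, fixed-length segments.

    Any trailing samples that do not fill a whole segment are dropped
    (an assumption; the abstract does not describe edge handling).
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    n_full = len(signal) // segment_len
    return [list(signal[i * segment_len:(i + 1) * segment_len]) for i in range(n_full)]


def normalize_segment(segment: Sequence[float]) -> List[float]:
    """Time-domain peak normalization (assumed scheme): scale so max |x| is 1."""
    peak = max((abs(x) for x in segment), default=0.0)
    if peak == 0.0:
        return [0.0 for _ in segment]  # silent segment stays silent
    return [x / peak for x in segment]


def smooth(probs: Sequence[float], window: int) -> List[float]:
    """Trailing moving average over the last `window` per-segment probabilities.

    Early positions average over however many values are available.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    for i in range(len(probs)):
        lo = max(0, i - window + 1)
        chunk = probs[lo:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def window_confidences(probs: Sequence[float], candidate_windows: Sequence[int]) -> Dict[int, float]:
    """For one class, map each candidate window size to a confidence value.

    Confidence here is the maximum smoothed probability (a hypothetical
    definition, chosen only to make the idea concrete).
    """
    return {w: max(smooth(probs, w), default=0.0) for w in candidate_windows}
```

Under these assumptions, calling `window_confidences` once per class with that class's per-segment probabilities gives one confidence value per candidate window size, which a downstream step could compare to choose a window.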

Publication/grant documents

US20230215460A1 AUDIO EVENT DETECTION WITH WINDOW-BASED PREDICTION Publication/grant date: 2023-07-06

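IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L25/00	Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00-G10L 21/00 (muting semiconductor-based amplifiers when some special characteristics of a signal are sensed by a speech detector, e.g. sensing when no signal is present, H03G3/34)
G10L25/93	. Discriminating between voiced and unvoiced parts of speech signals (G10L25/90 takes precedence)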