Audio scene recognition using time series analysis

Invention Grant

US11355138B2 Audio scene recognition using time series analysis (Active)

Patent Title: Audio scene recognition using time series analysis
Application No.: US16997249

Application Date: 2020-08-19
Publication No.: US11355138B2

Publication Date: 2022-06-07
Inventor: Cristian Lumezanu, Yuncong Chen, Dongjin Song, Takehiko Mizuguchi, Haifeng Chen, Bo Dong
Applicant: NEC Laboratories America, Inc.
Applicant Address: US NJ Princeton
Assignee: NEC Laboratories America, Inc.
Current Assignee: NEC Laboratories America, Inc.
Current Assignee Address: US NJ Princeton
Agent: Joseph Kolodka
Main IPC: G10L25/51
IPC: G10L25/51; G10L25/24; G10L25/18; G10L25/21

Abstract:

A method is provided. Intermediate audio features are generated from respective segments of an input acoustic time series for a same scene. Using a nearest neighbor search, respective segments of the input acoustic time series are classified based on the intermediate audio features to generate a final intermediate feature as a classification for the input acoustic time series. Each respective segment corresponds to a respective different acoustic window. The generating step includes learning the intermediate audio features from Multi-Frequency Cepstral Component (MFCC) features extracted from the input acoustic time series, dividing the same scene into the different windows having varying MFCC features, and feeding the MFCC features of each window into respective LSTM units such that a hidden state of each respective LSTM unit is passed through an attention layer to identify feature correlations between hidden states at different time steps corresponding to different ones of the different windows.
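
The last two stages described in the abstract (attention over per-window hidden states, then a nearest neighbor search) can be illustrated with a minimal sketch. This is not the patented implementation: it assumes the per-window LSTM hidden states have already been computed and are supplied as plain lists, it swaps in a simple scaled dot-product attention with mean pooling, and the function names, scene labels, and reference vectors are all hypothetical.

```python
import math


def attention_pool(hidden_states):
    """Combine per-window hidden states into one feature vector.

    Assumption: scaled dot-product self-attention followed by mean pooling,
    used here only as a stand-in for the attention layer in the abstract.
    """
    n = len(hidden_states)
    dim = len(hidden_states[0])
    pooled = [0.0] * dim
    for query in hidden_states:
        # Score this window's state against every window's state.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in hidden_states
        ]
        # Numerically stable softmax over the scores.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Accumulate the softmax-weighted context vector, averaged over queries.
        for w, state in zip(weights, hidden_states):
            for i in range(dim):
                pooled[i] += (w / total) * state[i] / n
    return pooled


def nearest_neighbor(feature, labeled_features):
    """Return the label of the stored feature closest in Euclidean distance."""
    best_label, best_dist = None, float("inf")
    for label, reference in labeled_features:
        dist = math.dist(feature, reference)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Hypothetical hidden states for three windows of one scene.
window_states = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.2]]
# Hypothetical labeled reference features.
references = [("street", [1.0, 0.1]), ("office", [0.0, 1.0])]

scene_feature = attention_pool(window_states)
predicted_scene = nearest_neighbor(scene_feature, references)
```

In this sketch the pooled feature is a convex combination of the window states, so it stays in the same region of feature space as the windows it summarizes; the nearest neighbor step then only needs a distance function and a set of labeled reference features.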

Published/Granted Literature

US20210065734A1 AUDIO SCENE RECOGNITION USING TIME SERIES ANALYSIS Publication/Grant date: 2021-03-04

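IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L25/00	Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00-G10L 21/00 (muting semiconductor-based amplifiers when some special characteristics of a signal are sensed by a speech detector, e.g. sensing when no signal is present, H03G3/34)
G10L25/48	.specially adapted for particular use
G10L25/51	..for comparison or discrimination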