-
Publication No.: US11587556B2
Publication date: 2023-02-21
Application No.: US16594624
Filing date: 2019-10-07
Applicant: Audio Analytic Ltd
Inventors: Christopher James Mitchell, Sacha Krstulovic, Cagdas Bilen, Juan Azcarreta Ortiz, Giacomo Ferroni, Arnoldas Jasonas, Francesco Tuveri
Abstract: A method for recognising at least one of a non-verbal sound event and a scene in an audio signal comprising a sequence of frames of audio data, the method comprising: for each frame of the sequence: processing the frame of audio data to extract multiple acoustic features for the frame of audio data; and classifying the acoustic features to classify the frame by determining, for each of a set of sound classes, a score that the frame represents the sound class; processing the sound class scores for multiple frames of the sequence of frames to generate, for each frame, a sound class decision for each frame; and processing the sound class decisions for the sequence of frames to recognise the at least one of a non-verbal sound event and a scene.
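The three stages named in the abstract above (per-frame scoring, score processing across multiple frames, decision processing across the sequence) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the moving-average smoothing, the arg-max decision rule, the minimum-duration grouping, the background class index 0, and the function names `smooth_scores`, `frame_decisions`, `decisions_to_events` and `recognise`. The feature extractor and classifier are passed in as hypothetical callables.

```python
from typing import Callable, List, Sequence, Tuple

Scores = List[float]  # one score per sound class for a single frame


def smooth_scores(scores: List[Scores], half_window: int = 1) -> List[Scores]:
    """Average each class score over neighbouring frames (assumed smoothing)."""
    n = len(scores)
    smoothed = []
    for t in range(n):
        window = scores[max(0, t - half_window):min(n, t + half_window + 1)]
        smoothed.append([sum(col) / len(window) for col in zip(*window)])
    return smoothed


def frame_decisions(scores: List[Scores]) -> List[int]:
    """Pick the highest-scoring class per frame (assumed decision rule)."""
    return [max(range(len(s)), key=s.__getitem__) for s in scores]


def decisions_to_events(decisions: List[int], min_frames: int = 2,
                        background: int = 0) -> List[Tuple[int, int, int]]:
    """Group runs of identical decisions into (class, start, end) events.

    Runs of the background class, or runs shorter than min_frames, are dropped.
    The end index is exclusive.
    """
    events = []
    start = 0
    for t in range(1, len(decisions) + 1):
        if t == len(decisions) or decisions[t] != decisions[start]:
            label = decisions[start]
            if label != background and t - start >= min_frames:
                events.append((label, start, t))
            start = t
    return events


def recognise(frames: Sequence[Sequence[float]],
              extract_features: Callable[[Sequence[float]], List[float]],
              classify: Callable[[List[float]], Scores]) -> List[Tuple[int, int, int]]:
    """Run the full frame -> scores -> decisions -> events chain."""
    scores = [classify(extract_features(f)) for f in frames]
    return decisions_to_events(frame_decisions(smooth_scores(scores)))
```

With a toy energy feature and a two-class scorer, a burst of three loud frames surrounded by silence would come out as a single event spanning frames 1 to 3. A real system would replace the smoothing and grouping rules with whatever the claims and description actually specify.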
-
Publication No.: US10783434B1
Publication date: 2020-09-22
Application No.: US16594605
Filing date: 2019-10-07
Applicant: Audio Analytic Ltd
Inventors: Christopher James Mitchell, Sacha Krstulovic, Cagdas Bilen, Juan Azcarreta Ortiz, Giacomo Ferroni, Arnoldas Jasonas, Francesco Tuveri
Abstract: A method of training a non-verbal sound class detection machine learning system, the non-verbal sound class detection machine learning system comprising a machine learning model configured to: receive data for each frame of a sequence of frames of audio data obtained from an audio signal; for each frame of the sequence of frames: process the data for multiple frames; and output data for at least one sound class score representative of a degree of affiliation of the frame with at least one sound class of a plurality of sound classes, wherein the plurality of sound classes comprises: one or more target sound classes; and a non-target sound class representative of an absence of each of the one or more target sound classes; wherein the method comprises: training the machine learning model using a loss function.
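The abstract above does not say which loss function is used. Purely as an illustration of what a per-frame loss over target classes plus a non-target class could look like, the sketch below uses ordinary softmax cross-entropy averaged over frames. This choice of loss, the class index `NON_TARGET = 0`, and the function names are all assumptions and should not be read as the patented method.

```python
import math
from typing import List, Sequence

NON_TARGET = 0  # assumed index of the "none of the target sounds" class


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw model outputs into class scores that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def frame_cross_entropy(logits: Sequence[float], label: int) -> float:
    """Cross-entropy for one frame (assumed loss; not specified in the abstract)."""
    return -math.log(softmax(logits)[label])


def sequence_loss(frame_logits: Sequence[Sequence[float]],
                  frame_labels: Sequence[int]) -> float:
    """Mean per-frame loss over a labelled sequence of frames."""
    losses = [frame_cross_entropy(l, y) for l, y in zip(frame_logits, frame_labels)]
    return sum(losses) / len(losses)
```

Under this sketch, frames containing none of the target sounds are labelled `NON_TARGET`, so the model is penalised both for missing a target sound and for reporting one that is absent.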
-
Publication No.: US20210104230A1
Publication date: 2021-04-08
Application No.: US16594624
Filing date: 2019-10-07
Applicant: Audio Analytic Ltd
Inventors: Christopher James Mitchell, Sacha Krstulovic, Cagdas Bilen, Juan Azcarreta Ortiz, Giacomo Ferroni, Arnoldas Jasonas, Francesco Tuveri
Abstract: A method for recognising at least one of a non-verbal sound event and a scene in an audio signal comprising a sequence of frames of audio data, the method comprising: for each frame of the sequence: processing the frame of audio data to extract multiple acoustic features for the frame of audio data; and classifying the acoustic features to classify the frame by determining, for each of a set of sound classes, a score that the frame represents the sound class; processing the sound class scores for multiple frames of the sequence of frames to generate, for each frame, a sound class decision for each frame; and processing the sound class decisions for the sequence of frames to recognise the at least one of a non-verbal sound event and a scene.
-
-