Audio event detection
    1.
    发明授权

    公开(公告)号:US10803885B1

    公开(公告)日:2020-10-13

    申请号:US16023923

    申请日:2018-06-29

    Abstract: An audio event detection system that processes audio data into audio feature data and processes the audio feature data using pre-configured candidate interval lengths to identify top candidate regions of the feature data that may include an audio event. The feature data from the top candidate regions are then scored by a classifier, where the score indicates a likelihood that the candidate region corresponds to a desired audio event. The scores are compared to a threshold, and if the threshold is satisfied, the top scoring candidate region is determined to include an audio event.

    Self-supervised federated learning

    公开(公告)号:US12039998B1

    公开(公告)日:2024-07-16

    申请号:US17665129

    申请日:2022-02-04

    CPC classification number: G10L25/78 G06N3/045 G06N3/08 G10L25/21

    Abstract: An acoustic event detection system may employ self-supervised federated learning to update encoder and/or classifier machine learning models. In an example operation, an encoder may be pre-trained to extract audio feature data from an audio signal. A decoder may be pre-trained to predict a subsequent portion of audio data (e.g., a subsequent frame of audio data represented by log filterbank energies). The encoder and decoder may be trained using self-supervised learning to improve the decoder's predictions and, by extension, the quality of the audio feature data generated by the encoder. The system may apply federated learning to share encoder updates across user devices. The system may fine-tune the classifier to improve inferences based on the improved audio feature data. The system may distribute classifier updates to the user device(s) to update the on-device classifier.

    Media presence detection
    5.
    发明授权

    公开(公告)号:US11069352B1

    公开(公告)日:2021-07-20

    申请号:US16278440

    申请日:2019-02-18

    Abstract: Described herein is a system for media presence detection in audio. The system analyzes audio data to recognize whether a given audio segment contains sounds from a media source as a way of differentiating recorded media source sounds from other live sounds. In exemplary embodiments, the system includes a hierarchical model architecture for processing audio data segments, where individual audio data segments are processed by a trained machine learning model operating locally, and another trained machine learning model provides historical and contextual information to determine a score indicating the likelihood that the audio data segment contains sounds from a media source.

    Audio event detection
    8.
    发明授权

    公开(公告)号:US10418957B1

    公开(公告)日:2019-09-17

    申请号:US16023990

    申请日:2018-06-29

    Abstract: An audio event detection system that subsamples input audio data using a series of recurrent neural networks to create data of a coarser time scale than the audio data. Data frames corresponding to the coarser time scale may then be upsampled to data frames that match the finer time scale of the original audio data frames. The resulting data frames are then scored with a classifier to determine a likelihood that the individual frames correspond to an audio event. Each frame is then weighted by its score and a composite weighted frame is created by summing the weighted frames and dividing by the cumulative score. The composite weighted frame is then scored by the classifier. The resulting score is taken as an overall score indicating a likelihood that the input audio data includes an audio event.

Patent Agency Ranking