WAKEWORD AND ACOUSTIC EVENT DETECTION

    公开(公告)号:US20210358497A1

    公开(公告)日:2021-11-18

    申请号:US17321999

    申请日:2021-05-17

    Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may be also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.

    Media presence detection
    24.
    发明授权

    公开(公告)号:US11069352B1

    公开(公告)日:2021-07-20

    申请号:US16278440

    申请日:2019-02-18

    Abstract: Described herein is a system for media presence detection in audio. The system analyzes audio data to recognize whether a given audio segment contains sounds from a media source as a way of differentiating recorded media source sounds from other live sounds. In exemplary embodiments, the system includes a hierarchical model architecture for processing audio data segments, where individual audio data segments are processed by a trained machine learning model operating locally, and another trained machine learning model provides historical and contextual information to determine a score indicating the likelihood that the audio data segment contains sounds from a media source.

    Binary target acoustic trigger detecton

    公开(公告)号:US10460729B1

    公开(公告)日:2019-10-29

    申请号:US15639254

    申请日:2017-06-30

    Abstract: A method for selective transmission of audio data to a speech processing server uses detection of an acoustic trigger in the audio data in determining the data to transmit. Detection of the acoustic trigger makes use of an efficient computation approach that reduces the amount of run-time computation required, or equivalently improves accuracy for a given amount of computation, by using a neural network to determine an indicator of presence of the acoustic trigger. In some example, the neural network combines a “time delay” structure in which intermediate results of computations are reused at various time delays, thereby avoiding computation of computing new results, and decomposition of certain transformations to require fewer arithmetic operations without sacrificing significant performance.

Patent Agency Ranking