Sensitivity mode for an audio spotting system

    Publication No.: US11823707B2

    Publication date: 2023-11-21

    Application No.: US17572002

    Filing date: 2022-01-10

    Abstract: An audio spotting system configured for various operating modes, including a regular mode and a sensitivity mode, is described. An example cascade audio spotting system may include a high-power subsystem including a high-power trigger and a transfer module. The high-power trigger includes one or more detection models used to detect whether a target sound activity is included in one or more audio streams. The one or more detection models are associated with a first set of hyperparameters when the cascade audio spotting system is in the regular mode, and with a second set of hyperparameters when it is in the sensitivity mode. The transfer module provides at least one of one or more processed audio streams for further processing in response to the high-power trigger detecting the target sound activity in the one or more audio streams.
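
    The mode-dependent behaviour described in this abstract can be illustrated with a minimal sketch. The abstract does not say which hyperparameters differ between modes, so the score threshold, the consecutive-frame count, and every name below (Mode, Hyperparameters, high_power_trigger, transfer) are assumptions made for illustration only, not the patented design.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    REGULAR = "regular"
    SENSITIVITY = "sensitivity"


@dataclass(frozen=True)
class Hyperparameters:
    # Both fields are hypothetical; the abstract only says the sets differ.
    threshold: float   # minimum per-frame detection score
    min_frames: int    # consecutive frames that must reach the threshold


# One hyperparameter set per operating mode (values are illustrative).
HYPERPARAMETERS = {
    Mode.REGULAR: Hyperparameters(threshold=0.8, min_frames=3),
    Mode.SENSITIVITY: Hyperparameters(threshold=0.5, min_frames=2),
}


def high_power_trigger(scores, mode):
    """Return True if the target sound activity is detected in `scores`."""
    hp = HYPERPARAMETERS[mode]
    run = 0
    for score in scores:
        run = run + 1 if score >= hp.threshold else 0
        if run >= hp.min_frames:
            return True
    return False


def transfer(processed_streams, scores, mode):
    """Forward the processed streams only when the trigger fires."""
    if high_power_trigger(scores, mode):
        return list(processed_streams)
    return []
```

    With per-frame scores [0.6, 0.7, 0.6, 0.2], this sketch forwards nothing in regular mode but forwards the streams in sensitivity mode, because the lower threshold lets two consecutive frames qualify.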

    Connectionist temporal classification using segmented labeled sequence data

    Publication No.: US10762427B2

    Publication date: 2020-09-01

    Application No.: US15909930

    Filing date: 2018-03-01

    Abstract: Classification training systems and methods include a neural network for classification of input data, a training dataset providing segmented labeled training data, and a classification training module operable to train the neural network using the training data. A forward pass processing module is operable to generate neural network outputs for the training data using weights and biases for the neural network, and a backward pass processing module is operable to update the weights and biases in a backward pass, including obtaining Region of Target (ROT) information from the training data, generating a forward-backward masking based on the ROT information, the forward-backward masking placing at least one restriction on a neural network output path, computing modified forward and backward variables based on the neural network outputs and the forward-backward masking, and updating the weights and biases.
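
    One way to picture a forward-backward masking is as a frame-by-state boolean grid applied to the standard CTC forward recursion. The sketch below is an assumption-laden illustration, not the patent's algorithm: it assumes the restriction is "a non-blank label may only be emitted inside its ROT", and the function names and the (start, end) ROT encoding are hypothetical. Only the forward variables are shown; the backward variables would be masked the same way.

```python
def rot_mask(num_frames, ext_labels, rots, blank=0):
    """mask[t][s] is True when extended-label state s may be occupied at frame t.

    `rots` maps the index of each non-blank state to (start, end) frames,
    with `end` exclusive. Blank states are allowed at every frame.
    """
    mask = [[True] * len(ext_labels) for _ in range(num_frames)]
    for s, label in enumerate(ext_labels):
        if label == blank:
            continue
        start, end = rots[s]
        for t in range(num_frames):
            mask[t][s] = start <= t < end
    return mask


def masked_forward(probs, labels, rots_per_label, blank=0):
    """Total probability of `labels` over CTC paths that respect the mask.

    `probs[t][k]` is the network output for class k at frame t.
    """
    # Extended sequence: blank, l1, blank, l2, ..., blank.
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    rots = {2 * i + 1: rot for i, rot in enumerate(rots_per_label)}

    num_frames, num_states = len(probs), len(ext)
    mask = rot_mask(num_frames, ext, rots, blank)

    alpha = [[0.0] * num_states for _ in range(num_frames)]
    if mask[0][0]:
        alpha[0][0] = probs[0][ext[0]]
    if num_states > 1 and mask[0][1]:
        alpha[0][1] = probs[0][ext[1]]

    for t in range(1, num_frames):
        for s in range(num_states):
            if not mask[t][s]:
                continue  # the masking forbids this state at this frame
            total = alpha[t - 1][s]
            if s >= 1:
                total += alpha[t - 1][s - 1]
            # Skipping a blank is allowed between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[t - 1][s - 2]
            alpha[t][s] = total * probs[t][ext[s]]

    result = alpha[-1][-1]
    if num_states > 1:
        result += alpha[-1][-2]
    return result
```

    For three frames with uniform outputs of 0.5 for blank and for a single label, an ROT covering all frames keeps all six valid CTC paths, while an ROT of only the middle frame keeps a single path, so the masked total drops accordingly.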

    MANY OR ONE DETECTION CLASSIFICATION SYSTEMS AND METHODS

    Publication No.: US20210248470A1

    Publication date: 2021-08-12

    Application No.: US17243519

    Filing date: 2021-04-28

    Abstract: A classification training system comprises a neural network configured to perform classification of input data, a training dataset including pre-segmented, labeled training samples, and a classification training module configured to train the neural network using the training dataset. The classification training module includes a forward pass processing module and a backward pass processing module. The backward pass processing module is configured to determine whether a current frame is in a region of target (ROT), determine ROT information such as the beginning and length of the ROT, and update weights and biases using a cross-entropy cost function and a tunable many-or-one detection (MOOD) cost function that comprises a tunable hyperparameter for tuning the classifier for a particular task. The backward pass processing module further computes a soft target value using the ROT information and computes a signal output error using the soft target value and the network output value.

    360-DEGREE MULTI-SOURCE LOCATION DETECTION, TRACKING AND ENHANCEMENT

    Publication No.: US20190355373A1

    Publication date: 2019-11-21

    Application No.: US16414677

    Filing date: 2019-05-16

    Abstract: Audio processing systems and methods comprise an audio sensor array configured to receive a multichannel audio input and generate a corresponding multichannel audio signal, and a target activity detector configured to identify audio target sources in the multichannel audio signal. The target activity detector includes a voice activity detector (VAD), an instantaneous locations component configured to detect a location of a plurality of audio sources, a dominant locations component configured to selectively buffer a subset of the plurality of audio sources comprising dominant audio sources, a source tracker configured to track locations of the dominant audio sources over time, and a dominance selection component configured to select the dominant target sources for further audio processing. The instantaneous locations component computes a discrete spatial map comprising the location of the plurality of audio sources, and the dominant locations component selects N of the dominant sources from the discrete spatial map for source tracking.
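
    Selecting N dominant sources from a discrete spatial map can be sketched as peak picking over angular bins. The abstract does not specify how the map is represented or how dominance is ranked, so the sketch below assumes a 1-D list of per-bin energies covering 360 degrees and ranks circular local maxima by energy; the function names are hypothetical.

```python
def dominant_locations(spatial_map, n):
    """Return up to n (bin_index, energy) peaks, strongest first.

    `spatial_map` is assumed to hold one energy value per angular bin, with
    the last bin adjacent to the first (the map wraps around 360 degrees).
    """
    num_bins = len(spatial_map)
    peaks = []
    for i, energy in enumerate(spatial_map):
        left = spatial_map[(i - 1) % num_bins]
        right = spatial_map[(i + 1) % num_bins]
        # Strict on one side so a flat two-bin plateau yields a single peak.
        if energy > left and energy >= right:
            peaks.append((i, energy))
    peaks.sort(key=lambda peak: peak[1], reverse=True)
    return peaks[:n]


def bin_to_degrees(bin_index, num_bins):
    """Convert a bin index to the angle at the start of that bin."""
    return bin_index * 360.0 / num_bins
```

    For an eight-bin map such as [0.1, 0.9, 0.2, 0.0, 0.5, 0.1, 0.0, 0.3], asking for two sources returns the peaks in bins 1 and 4, which correspond to 45 and 180 degrees. A source tracker would then follow these locations across successive maps.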

    VOICE ACTIVITY DETECTION SYSTEMS AND METHODS

    Publication No.: US20190172480A1

    Publication date: 2019-06-06

    Application No.: US15832709

    Filing date: 2017-12-05

    Abstract: An audio processing device or method includes an audio transducer operable to receive audio input and generate an audio signal based on the audio input. The audio processing device or method also includes an audio signal processor operable to extract local features from the audio signal, such as Power-Normalized Cepstral Coefficients (PNCC) of the audio signal. The audio signal processor is also operable to extract global features from the audio signal, such as chroma features and harmonicity features. A neural network is provided to determine a probability that a target audio is present in the audio signal based on the local and global features. In particular, the neural network is trained to output a value indicating whether the target audio is present and locally dominant in the audio signal.

    EFFICIENT CONNECTIONIST TEMPORAL CLASSIFICATION FOR BINARY CLASSIFICATION

    Publication No.: US20180232632A1

    Publication date: 2018-08-16

    Application No.: US15894872

    Filing date: 2018-02-12

    Abstract: A classification system and method for training a neural network includes receiving a stream of segmented, labeled training data having a sequence of frames, computing a stream of input features data for the sequence of frames, and generating neural network outputs for the sequence of frames in a forward pass through the training data and in accordance with weights and biases. The weights and biases are updated in a backward pass through the training data, including determining Region of Target (ROT) information from the segmented, labeled training data, computing modified forward and backward variables based on the neural network outputs and the ROT information, deriving a signal error for each frame within the sequence of frames based on the modified forward and backward variables, and updating the weights and biases based on the derived signal error. An adaptive learning module is provided to improve a convergence rate of the neural network.

    Dynamic range compression combined with active noise cancellation to remove artifacts caused by transient noises

    Publication No.: US12254860B2

    Publication date: 2025-03-18

    Application No.: US18052374

    Filing date: 2022-11-03

    Abstract: This disclosure provides methods, devices, and systems for active noise cancellation (ANC). The present implementations more specifically relate to the use of dynamic range compression (DRC) for ANC. In some aspects, an ANC system receives an input audio signal of a transient noise as measured by a microphone, performs DRC on the input audio signal to generate a compressed dynamic range audio signal, and performs ANC on the compressed dynamic range audio signal to generate a cancellation signal associated with the input audio signal. The cancellation signal is based on an adjusted gain of the input audio signal to prevent saturation or large spikes of the cancellation signal, which can cause undesirable audio during playback.
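
    The order of operations in this abstract (compress first, then derive the cancellation signal) can be shown with a deliberately simplified sketch. It assumes a static, sample-wise compressor with a hypothetical threshold and ratio, and it stands in for ANC with plain phase inversion; a real ANC system would use an adaptive filter, and none of the names or values below come from the patent.

```python
def compress(samples, threshold=0.5, ratio=4.0):
    """Static dynamic range compression applied sample by sample.

    Magnitudes above `threshold` are reduced so that every `ratio` units of
    input above the threshold produce one unit of output above it.
    """
    compressed = []
    for x in samples:
        magnitude = abs(x)
        if magnitude > threshold:
            magnitude = threshold + (magnitude - threshold) / ratio
        compressed.append(magnitude if x >= 0 else -magnitude)
    return compressed


def cancellation_signal(samples, anc_gain=1.0, threshold=0.5, ratio=4.0):
    """Phase-inverted, compressed copy of the input (a stand-in for ANC)."""
    compressed = compress(samples, threshold=threshold, ratio=ratio)
    return [-anc_gain * y for y in compressed]
```

    With the defaults, a transient spike of 2.5 in the input is compressed to 1.0 before inversion, so the cancellation signal peaks at -1.0 instead of -2.5, while quieter samples pass through unchanged apart from the sign flip.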
