Abstract:
Copies of original sound recordings are identified by extracting features from the copy, creating a vector of those features, and comparing that vector against a database of vectors. Identification can be performed for copies of sound recordings that have been subjected to compression and other manipulation such that they are not exact replicas of the original. Computational efficiency permits many hundreds of queries to be serviced at the same time. The vectors may be less than 100 bytes, so that many millions of vectors can be stored on a portable device.
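The matching step described above can be illustrated with a minimal sketch. It assumes the features are already extracted as floats in a known range, stores each one as a single byte so a vector stays well under 100 bytes, and uses a plain Euclidean nearest-neighbor search with a distance threshold. The feature values, the `quantize` and `identify` helpers, and the threshold of 40 are hypothetical choices for illustration; the abstract does not specify which features or which comparison method are used.

```python
import math

def quantize(features, lo=0.0, hi=1.0):
    """Map each float feature in [lo, hi] to one byte so the vector stays small."""
    scale = 255.0 / (hi - lo)
    return bytes(min(255, max(0, round((f - lo) * scale))) for f in features)

def distance(a, b):
    """Euclidean distance between two equal-length byte vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(query, database, threshold=40.0):
    """Return the id of the closest stored vector, or None if none is near enough."""
    best_id, best_dist = None, float("inf")
    for rec_id, vec in database.items():
        d = distance(query, vec)
        if d < best_dist:
            best_id, best_dist = rec_id, d
    return best_id if best_dist <= threshold else None

# Hypothetical reference vectors for two original recordings.
database = {
    "recording_a": quantize([0.10, 0.80, 0.35, 0.60]),
    "recording_b": quantize([0.90, 0.20, 0.55, 0.05]),
}

# A compressed copy yields slightly shifted features but still matches.
degraded_copy = quantize([0.12, 0.77, 0.37, 0.58])
# A recording that is not in the database falls outside the threshold.
unknown = quantize([0.50, 0.50, 0.95, 0.95])

match = identify(degraded_copy, database)
no_match = identify(unknown, database)
```

The threshold is what lets an inexact copy match its original while unrelated recordings are rejected. A linear scan is used here only for clarity; serving hundreds of simultaneous queries against millions of vectors would call for an indexed search.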
Abstract:
The present invention provides systems and methods for signal detection and enhancement. The systems and methods utilize one or more discriminative classifiers (e.g., a logistic regression model and a convolutional neural network) to estimate a posterior probability that indicates whether a desired signal is present in a received signal. The discriminative classifiers generate the estimated probability based on one or more signal-to-noise ratios (SNRs) (e.g., a normalized logarithmic posterior SNR (nlpSNR) and a mel-transformed nlpSNR (mel-nlpSNR)) and an estimated noise model. Depending on the resolution desired, the estimated SNR can be generated at a frame level or at an atom level, wherein the atom-level estimates are utilized to generate the frame-level estimate. The novel systems and methods can be utilized to facilitate speech detection, speech recognition, speech coding, noise adaptation, speech enhancement, microphone arrays, and echo cancellation.
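As a rough illustration of the idea, the sketch below feeds a logarithmic posterior SNR into a logistic regression model to obtain a probability that the desired signal is present. It makes several assumptions that are not in the abstract: an "atom" is treated as one time-frequency power value, the frame-level feature is the plain average of the atom-level log SNRs, the normalization that defines nlpSNR is omitted, and the `weight` and `bias` values are hand-picked rather than trained.

```python
import math

def log_posterior_snr(power, noise_power, floor=1e-12):
    """Log ratio of observed power to estimated noise power for one atom."""
    return math.log(max(power, floor) / max(noise_power, floor))

def frame_feature(atom_powers, noise_powers):
    """Frame-level feature: average of the atom-level log SNRs (an assumption)."""
    snrs = [log_posterior_snr(p, n) for p, n in zip(atom_powers, noise_powers)]
    return sum(snrs) / len(snrs)

def speech_probability(feature, weight=2.0, bias=-1.0):
    """Logistic regression: estimated probability that the signal is present."""
    return 1.0 / (1.0 + math.exp(-(weight * feature + bias)))

# Hypothetical noise model: estimated noise power per frequency bin.
noise_model = [1.0, 1.0, 1.0, 1.0]

# One frame close to the noise floor and one well above it.
noisy_frame = [1.1, 0.9, 1.0, 1.2]
speech_frame = [8.0, 12.0, 5.0, 9.0]

p_noise = speech_probability(frame_feature(noisy_frame, noise_model))
p_speech = speech_probability(frame_feature(speech_frame, noise_model))
```

With these illustrative parameters the frame near the noise floor scores below 0.5 and the high-energy frame scores above it. In practice the weights would be learned from labeled data.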