Abstract:
The present invention provides systems and methods for signal detection and enhancement. The systems and methods utilize one or more discriminative classifiers (e.g., a logistic regression model and a convolutional neural network) to estimate a posterior probability that indicates whether a desired signal is present in a received signal. The discriminative classifiers generate the estimated probability based on one or more signal-to-noise ratios (SNRs) (e.g., a normalized logarithmic posterior SNR (nlpSNR) and a mel-transformed nlpSNR (mel-nlpSNR)) and an estimated noise model. Depending on the resolution desired, the estimated SNR can be generated at a frame level or at an atom level, wherein the atom-level estimates are utilized to generate the frame-level estimate. The novel systems and methods can be utilized to facilitate speech detection, speech recognition, speech coding, noise adaptation, speech enhancement, microphone arrays and echo cancellation.
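As a rough illustration of the frame-level estimate, the sketch below maps a vector of per-band SNR features to a posterior probability with a logistic regression model. The abstract does not say how the nlpSNR is normalized, so the `log_posterior_snr` helper is only a hypothetical stand-in (the log of received power over estimated noise power), and the weights, bias, and power values are made-up numbers rather than trained parameters.

```python
import math

def log_posterior_snr(power, noise_power, floor=1e-12):
    """Per-band log ratio of received power to estimated noise power.

    Hypothetical stand-in for the nlpSNR feature; the normalization used
    in the described system is not given in the abstract.
    """
    return [math.log(max(p, floor) / max(n, floor))
            for p, n in zip(power, noise_power)]

def speech_presence_probability(features, weights, bias):
    """Logistic regression: sigmoid of a weighted sum of the SNR features."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative values only (assumed, not taken from the source).
noise_model = [1.0, 1.0, 1.0]      # estimated noise power per band
quiet_frame = [1.1, 0.9, 1.0]      # received power close to the noise level
loud_frame = [20.0, 35.0, 15.0]    # received power well above the noise level
weights = [1.0, 1.0, 1.0]
bias = -3.0

p_quiet = speech_presence_probability(
    log_posterior_snr(quiet_frame, noise_model), weights, bias)
p_loud = speech_presence_probability(
    log_posterior_snr(loud_frame, noise_model), weights, bias)
```

With these assumed parameters the frame near the noise floor receives a low probability and the high-SNR frame a probability close to one.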
Abstract:
Signal detectors are described herein. By way of example, a system for detecting signals can include a microphone signal detector, a loudspeaker signal detector, a signal discriminator and a decision component. When the microphone signal detector detects the presence of a microphone signal, the loudspeaker signal detector detects the presence of a loudspeaker signal, and the signal discriminator determines that near-end speech dominates loudspeaker echo, the decision component can confirm the presence of doubletalk. When the microphone signal detector detects the presence of a microphone signal and the signal discriminator determines that near-end speech dominates loudspeaker echo, the decision component confirms the presence of a near-end signal.
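The decision rules in this abstract can be sketched as a small function over three boolean detector outputs. The function name, the label strings, and the catch-all `"other"` outcome are assumptions for illustration. Because the near-end rule as stated is also satisfied in the doubletalk case, the sketch checks doubletalk first.

```python
def classify_frame(mic_active, speaker_active, near_end_dominates):
    """Combine the three detector outputs into a single decision.

    Labels are hypothetical; the abstract only names the doubletalk
    and near-end outcomes.
    """
    if mic_active and speaker_active and near_end_dominates:
        return "doubletalk"
    if mic_active and near_end_dominates:
        return "near-end"
    return "other"  # assumed fallback, not described in the abstract

decisions = [
    classify_frame(True, True, True),
    classify_frame(True, False, True),
    classify_frame(True, True, False),
]
```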
Abstract:
A regression-based residual echo suppression (RES) system and process are described for suppressing the portion of the microphone signal that corresponds to a playback of a speaker audio signal and was not suppressed by an acoustic echo canceller (AEC). In general, a prescribed regression technique is applied between a prescribed spectral attribute of multiple past and present fixed-length periods (e.g., frames) of the speaker signal and the same spectral attribute of a current period (e.g., frame) of the echo residual in the output of the AEC. This automatically takes into consideration the correlation between the time periods of the speaker signal. The parameters of the regression can be easily tracked using adaptive methods. Multiple applications of RES can be used to produce better results, and this system and process can be applied to stereo RES as well.
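A minimal sketch of the regression idea for a single frequency bin follows. It assumes the spectral attribute is magnitude, that the regression is linear over the current and a few past speaker frames, and that the parameters are tracked with normalized LMS (NLMS). The abstract names none of these choices, so each is an assumption, as are the function name and the synthetic data.

```python
import random

def nlms_residual_echo_suppress(speaker_mag, residual_mag, taps=3, mu=0.5, eps=1e-8):
    """Predict the residual-echo magnitude in one frequency bin from the
    current and (taps - 1) past speaker magnitudes, subtract the prediction,
    and track the regression weights with NLMS."""
    weights = [0.0] * taps
    history = [0.0] * taps            # most recent speaker frame first
    suppressed = []
    for x, r in zip(speaker_mag, residual_mag):
        history = [x] + history[:-1]
        predicted = sum(w * h for w, h in zip(weights, history))
        error = r - predicted
        suppressed.append(max(error, 0.0))   # a magnitude cannot be negative
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
    return suppressed, weights

# Synthetic data (assumed): the residual echo is a fixed mix of the current
# and previous speaker frames, with no near-end speech or noise.
rng = random.Random(0)
speaker = [rng.random() for _ in range(1000)]
residual = [0.5 * x + 0.2 * (speaker[t - 1] if t > 0 else 0.0)
            for t, x in enumerate(speaker)]

suppressed, weights = nlms_residual_echo_suppress(speaker, residual)
```

In this noise-free setting the tracked weights approach the mixing coefficients used to build the synthetic residual, and the suppressed output decays toward zero.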
Abstract:
A system that facilitates organization of emails comprises a clustering component that clusters a plurality of emails and creates topics for emails by assigning key phrases extracted from emails within one or more clusters. An organization component then utilizes the key phrases to organize documents. Furthermore, the organization component can comprise a probability component that determines a probability that a document belongs to a certain topic.
Abstract:
A system that facilitates detecting a targeted topic in a document is described herein. The system includes a receiver component that receives a document. The system additionally includes a topic model component trained using a plurality of training documents including the topic and a plurality of training documents that do not include the topic. The topic model component analyzes the document and automatically determines which portions of the document include the topic and which portions of the document do not include the topic.
Abstract:
A general probabilistic formulation referred to as ‘Conditional Harmonic Mixing’ is provided, in which links between classification nodes are directed, a conditional probability matrix is associated with each link, and the number of classes can vary from node to node. A posterior class probability at each node is updated by minimizing a divergence between its distribution and that predicted by its neighbors. For arbitrary graphs, as long as each unlabeled point is reachable from at least one training point, a solution exists, is unique, and can be found by solving a sparse linear system iteratively. In one aspect, an automated data classification system is provided. The system includes a data set having at least one labeled category node in the data set. A semi-supervised learning component employs directed arcs to determine the label of at least one other unlabeled category node in the data set.
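The iterative update can be illustrated on a toy graph. The sketch assumes that minimizing the divergence reduces to setting each unlabeled node's posterior to the average of the predictions arriving over its incoming links, where each prediction is the link's conditional probability matrix applied to the source node's posterior. The abstract does not state this reduction, and the function name, graph, and matrices below are made up for illustration.

```python
def chm_iterate(posteriors, labeled, links, iterations=200):
    """Repeatedly replace each unlabeled node's posterior with the mean of
    its neighbors' predictions.

    links maps node -> list of (source_node, matrix), where
    matrix[a][b] = P(class a at node | class b at source).
    """
    post = {node: list(p) for node, p in posteriors.items()}
    for _ in range(iterations):
        for node, incoming in links.items():
            if node in labeled or not incoming:
                continue
            n_classes = len(post[node])
            total = [0.0] * n_classes
            for source, matrix in incoming:
                src = post[source]
                for a in range(n_classes):
                    total[a] += sum(matrix[a][b] * src[b] for b in range(len(src)))
            post[node] = [v / len(incoming) for v in total]
    return post

# Toy chain 0 - 1 - 2 - 3 with labeled end points and identity matrices (assumed).
identity = [[1.0, 0.0], [0.0, 1.0]]
initial = {0: [1.0, 0.0], 1: [0.5, 0.5], 2: [0.5, 0.5], 3: [0.0, 1.0]}
links = {
    1: [(0, identity), (2, identity)],
    2: [(1, identity), (3, identity)],
}
result = chm_iterate(initial, labeled={0, 3}, links=links)
```

With identity matrices this reduces to ordinary harmonic label propagation: the unlabeled nodes settle at posteriors interpolated between the two labeled end points.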
Abstract:
Prior to searching a multidimensional feature space populated with data objects, each dimension in the feature space is divided into a number of intervals. When a query is received, a single interval that is overlapped by the query is selected from each dimension. A reduced set of data objects is then selected that includes only those data objects that overlap the selected intervals. This reduced set of data objects, rather than the entire set of data objects in the feature space, is then used to determine matches for the query.
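The pruning scheme can be sketched as follows, assuming data objects are axis-aligned boxes, the query is a single point, and every dimension is split into equal-width intervals over a common range. These choices, along with all names and sample data, are assumptions for illustration only.

```python
def interval_of(x, lo, width, n_intervals):
    """Index of the equal-width interval containing x, clamped to the valid range."""
    return min(max(int((x - lo) // width), 0), n_intervals - 1)

def build_index(objects, n_dims, lo, hi, n_intervals):
    """Before any query: for each dimension and interval, record which objects overlap it."""
    width = (hi - lo) / n_intervals
    index = [dict() for _ in range(n_dims)]
    for obj_id, box in objects.items():
        for d, (a, b) in enumerate(box):
            first = interval_of(a, lo, width, n_intervals)
            last = interval_of(b, lo, width, n_intervals)
            for k in range(first, last + 1):
                index[d].setdefault(k, set()).add(obj_id)
    return index, width

def query_point(point, objects, index, lo, width, n_intervals):
    """Select one interval per dimension, intersect the object sets stored for
    those intervals, then test only the reduced set against the query exactly."""
    candidates = None
    for d, x in enumerate(point):
        bucket = index[d].get(interval_of(x, lo, width, n_intervals), set())
        candidates = set(bucket) if candidates is None else candidates & bucket
    matches = sorted(
        obj_id for obj_id in candidates
        if all(a <= x <= b for x, (a, b) in zip(point, objects[obj_id]))
    )
    return matches, candidates

# Sample 2-D boxes as (low, high) per dimension; values are illustrative.
objects = {
    "A": [(0.0, 2.0), (0.0, 2.0)],
    "B": [(5.0, 9.0), (5.0, 9.0)],
    "C": [(1.0, 6.0), (7.0, 9.0)],
}
index, width = build_index(objects, n_dims=2, lo=0.0, hi=10.0, n_intervals=5)
matches_1, reduced_1 = query_point((1.5, 1.0), objects, index, 0.0, width, 5)
matches_2, reduced_2 = query_point((5.5, 8.0), objects, index, 0.0, width, 5)
```

The exact containment test runs only on the reduced candidate set, which is the point of the scheme: the cost of matching scales with the number of objects sharing the selected intervals rather than with the whole feature space.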
Abstract:
The present invention relates to a system and methodology to facilitate automatic generation of mnemonic audio portions or segments referred to as audio thumbnails. A system is provided for summarizing audio information. The system includes an analysis component to determine common features in an audio file and a mnemonic detector to extract fingerprint portions of the audio file based in part on the common features in order to generate a thumbnail of the audio file. The generated thumbnails can then be employed to facilitate browsing or searching audio files in order to mitigate listening to longer portions or segments of such files.
Abstract:
Systems and methods are disclosed that facilitate producing probabilistic outputs, also referred to as posterior probabilities. The probabilistic outputs include an estimate of classification strength. The present invention intercepts non-probabilistic classifier output and applies a set of kernel models based on a softmax function to derive the desired probabilistic outputs. Such probabilistic outputs can be employed in handwriting recognition, where the probability of a handwriting sample classification is combined with language models to make better classification decisions.
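For orientation, the sketch below shows only the plain softmax step that turns raw, non-probabilistic classifier scores into values that sum to one. The kernel models mentioned in the abstract are not reproduced; the `scale` parameter and the example scores are assumptions.

```python
import math

def softmax(scores, scale=1.0):
    """Map raw classifier scores to probabilities that sum to one.

    The maximum is subtracted before exponentiating for numerical stability.
    """
    scaled = [scale * s for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative raw scores for three candidate classes.
probs = softmax([2.0, 1.0, -1.0])
```

The ordering of the scores is preserved, so the top-scoring class remains the most probable, while the resulting values can be multiplied with language-model probabilities as the abstract describes.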