Abstract:
Methods and arrangements for facilitating speaker identification. At least one N-best list is generated based on input speech, a system output is posited based on the input speech, and a determination is made, via at least one property of the N-best list, as to whether the posited system output is inconclusive.
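One way to picture the inconclusiveness check described above is a margin test on the N-best list. The sketch below is only an illustration under stated assumptions: the abstract does not say which property of the N-best list is used, so the score gap between the two best candidates, the function name `is_inconclusive`, and the `min_margin` threshold are all hypothetical choices.

```python
from typing import List, Tuple


def is_inconclusive(n_best: List[Tuple[str, float]], min_margin: float = 0.5) -> bool:
    """Return True when the best candidate barely beats the runner-up.

    n_best holds (speaker_id, score) pairs, higher score meaning a better match.
    The margin test is an assumed property; the abstract names none.
    """
    if len(n_best) < 2:
        raise ValueError("need at least two candidates to measure a margin")
    # Sort by score so the posited output is first and its closest rival second.
    ranked = sorted(n_best, key=lambda item: item[1], reverse=True)
    margin = ranked[0][1] - ranked[1][1]
    return margin < min_margin


# Hypothetical log-likelihood scores for three enrolled speakers.
n_best = [("alice", -41.2), ("bob", -41.4), ("carol", -47.9)]
posited_output = max(n_best, key=lambda item: item[1])[0]
print(posited_output, is_inconclusive(n_best))
```

A system built this way could respond to an inconclusive result by asking for more speech rather than committing to the posited speaker.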
Abstract:
There is provided an apparatus for providing a Text Independent (TI) speaker recognition mode in a Text Dependent (TD) Hidden Markov Model (HMM) speaker recognition system and/or a Text Constrained (TC) HMM speaker recognition system. The apparatus includes a Gaussian Mixture Model (GMM) generator and a Gaussian weight normalizer. The GMM generator is for creating a GMM by pooling Gaussians from a plurality of HMM states. The Gaussian weight normalizer is for normalizing Gaussian weights with respect to the plurality of HMM states.
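The pooling and normalization steps can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the `Gaussian` container, the diagonal-covariance representation, and the choice to normalize by dividing each weight by the total pooled weight are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Gaussian:
    weight: float       # mixture weight inside its original HMM state
    mean: List[float]
    var: List[float]    # diagonal covariance (assumed for simplicity)


def pool_states_to_gmm(states: Sequence[Sequence[Gaussian]]) -> List[Gaussian]:
    """Pool the Gaussians of several HMM states into one GMM.

    Weights are rescaled so the pooled mixture sums to one. If every state's
    weights already sum to one, this equals dividing by the number of states.
    """
    pooled = [g for state in states for g in state]
    total = sum(g.weight for g in pooled)
    if total <= 0:
        raise ValueError("pooled Gaussian weights must have a positive sum")
    return [Gaussian(g.weight / total, list(g.mean), list(g.var)) for g in pooled]


# Two hypothetical HMM states with one-dimensional Gaussians.
states = [
    [Gaussian(0.6, [0.0], [1.0]), Gaussian(0.4, [1.0], [1.0])],
    [Gaussian(1.0, [2.0], [0.5])],
]
gmm = pool_states_to_gmm(states)
print([round(g.weight, 3) for g in gmm])
```

Because the pooled model discards the state sequence, it can score speech regardless of which words were spoken, which is what makes a text-independent mode possible on top of text-dependent models.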
Abstract:
A method, system and program storage device are provided for machine diagnostics, detection and profiling using pressure waves, the method including profiling known sources, acquiring pressure wave data, analyzing the acquired pressure wave data, and detecting if the analyzed pressure wave data matches a profiled known source; the system including a processor, a pressure wave transducer in signal communication with the processor, a pressure wave analysis unit in signal communication with the processor, and a source or threat detection unit in signal communication with the processor; and the program storage device including program steps for profiling known sources, acquiring pressure wave data, analyzing the acquired pressure wave data, and detecting if the analyzed pressure wave data matches a profiled known source.
Abstract:
A method and system for speaker recognition and identification includes transforming features of a speaker utterance in a first condition state to match a second condition state and provide a transformed utterance. A discriminative criterion is used to generate a transform that maps an utterance to obtain a computed result. The discriminative criterion is maximized over a plurality of speakers to obtain a best transform for recognizing speech and/or identifying a speaker under the second condition state. Speech recognition and speaker identity may be determined by employing the best transform for decoding speech to reduce channel mismatch.
Abstract:
In detection systems, such as speaker verification systems, for a given operating point range with an associated detection “cost”, the detection cost is preferably reduced by essentially trading off the system error in the area of interest against areas essentially “outside” that interest. Among the advantages achieved thereby are higher optimization gain and better generalization. From a measurable Detection Error Tradeoff (DET) curve of the given detection system, a criterion is preferably derived such that its minimization provably leads to detection cost reduction in the area of interest. The criterion allows for selective access to the slope and offset of the DET curve (a line in the case of normally distributed detection scores, a curve approximated by a mixture of Gaussians in the case of other distributions). By modifying the slope of the DET curve, the behavior of the detection system is changed favorably with respect to the given area of interest.
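The remark that the DET curve is a line for normally distributed scores can be checked numerically. The sketch below assumes target scores distributed as N(mu_t, sigma_t) and impostor scores as N(mu_i, sigma_i), with an acceptance rule of "score above threshold"; the function names are hypothetical. On probit axes, probit(P_miss) = offset + slope * probit(P_fa), with slope = -sigma_i / sigma_t and offset = (mu_i - mu_t) / sigma_t.

```python
from statistics import NormalDist
from typing import Tuple


def det_line(mu_t: float, sigma_t: float, mu_i: float, sigma_i: float) -> Tuple[float, float]:
    """Slope and offset of the DET line on probit axes for Gaussian scores."""
    slope = -sigma_i / sigma_t
    offset = (mu_i - mu_t) / sigma_t
    return slope, offset


def miss_rate_at(p_fa: float, mu_t: float, sigma_t: float, mu_i: float, sigma_i: float) -> float:
    """Miss probability implied by the DET line at a given false-alarm rate."""
    std = NormalDist()  # standard normal, used for probit and its inverse
    slope, offset = det_line(mu_t, sigma_t, mu_i, sigma_i)
    return std.cdf(offset + slope * std.inv_cdf(p_fa))


# Hypothetical score distributions: targets N(3, 2), impostors N(0, 1).
slope, offset = det_line(3.0, 2.0, 0.0, 1.0)
print(slope, offset)
print(miss_rate_at(0.01, 3.0, 2.0, 0.0, 1.0))
```

In this picture, changing the ratio of the two score spreads rotates the line, which is the sense in which modifying the slope shifts errors between the operating region of interest and the regions outside it.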
Abstract:
The present invention provides a system and method for treating distortion propagated through a detection system. The system includes a compensation module that compensates for untreated distortions propagating through the detection compensation system, a user model pool that comprises a plurality of model sets, and a model selector that selects at least one model set from the plurality of model sets in the user model pool. The compensation is accomplished by continually producing scores distributed according to a prescribed distribution for the at least one model set and mitigating the adverse effects of the scores being distorted and lying off a preset operating point. The method for treating distortion propagated through a detection system includes receiving a signal from a remote device and compensating the signal for untreated distortions. The compensation includes selecting at least one relevant model set from a plurality of model sets, producing scores distributed according to a prescribed distribution for the at least one model set, and mitigating the adverse effect of the scores being distorted by rejecting a signal if it lies off a preset operating point.
Abstract:
In large-scale deployments of speaker recognition systems, the potential for legacy problems increases, because the evolving technology may require configuration changes in the system that invalidate existing user voice accounts. Unless the entire database of original speech waveforms is stored, users need to re-enroll to keep their accounts functional, which may be expensive and commercially unacceptable. Model migration is defined as a conversion of obsolete models to new-configuration models without additional data and waveform requirements. The present disclosure investigates ways to achieve such a migration with minimum loss of system accuracy.
Abstract:
A system and method for determining and authenticating a person's identity by generating a behavioral profile for that person, in which the person is presented with various stimuli and the person's response characteristics are measured in an enrollment stage. The person's response profile, once generated, is stored. When the user subsequently needs to access a secure resource, the user to be authorized is presented with the stimuli that were presented when the behavioral profile was generated, and the user's responses are detected and compared to his/her behavioral profile. If a match is detected, the user is identified. The user's behavioral response may be in the form of signals detected by sensor means that detect visual or audible emotional cues, or signals resulting from the person's behavior as detected by polygraph or EEG devices.