Abstract:
Method for text-dependent Speaker Recognition using a speaker adapted Universal Background Model; wherein the speaker adapted Universal Background Model is a speaker adapted Hidden Markov Model comprising channel correction.
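As a rough illustration of the adaptation step this abstract names, the sketch below applies cepstral mean subtraction (one common form of channel correction) and MAP-adapts background-model means toward a speaker's features. The distance-based responsibilities, the relevance factor, and all function names are illustrative assumptions, not the patent's actual procedure, which uses a speaker-adapted Hidden Markov Model.

```python
import numpy as np

def cepstral_mean_subtraction(features):
    # Channel correction: subtracting the per-utterance cepstral mean
    # removes convolutional (channel) effects from the features.
    return features - features.mean(axis=0, keepdims=True)

def map_adapt_means(ubm_means, ubm_weights, features, relevance=16.0):
    # Illustrative MAP mean adaptation: soft-assign frames to model
    # components, then interpolate the background means toward the
    # speaker's data statistics. Squared distance stands in for true
    # component likelihoods (an assumption for brevity).
    d2 = ((features[:, None, :] - ubm_means[None, :, :]) ** 2).sum(-1)
    resp = np.exp(-0.5 * d2) * ubm_weights
    resp /= resp.sum(axis=1, keepdims=True)
    n = resp.sum(axis=0)                                   # soft counts
    ex = resp.T @ features / np.maximum(n, 1e-9)[:, None]  # data means
    alpha = (n / (n + relevance))[:, None]                 # adaptation weight
    return alpha * ex + (1.0 - alpha) * ubm_means
```

With little adaptation data (or a large relevance factor) the adapted means stay close to the background model, which is the usual behavior of MAP adaptation.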
Abstract:
A system and method are provided that authenticate a user using hybrid biometrics information, such as a user's image information, a user's voice information, etc. The user authentication method includes: acquiring a number of items of biometrics information; generating a number of items of authentication information corresponding to the acquired biometrics information; and performing an integral user authentication based on the generated authentication information.
Abstract:
The invention relates to a method and to a radiotelephony system for identifying and verifying radiotelephony messages (M1...M3) and for associating radiotelephony messages (M1...M3) with vehicles (F1...F3). An announcer specifies the identification (K) at a predetermined point of each radiotelephony message (M1...M3). According to the invention, a) a number of delivered radiotelephony messages (M1...M3) are recorded. Each identification (K) contained in a radiotelephony message (M1...M3) is transformed by voice recognition (0) into a digital identification (Kd). From the radiotelephony messages (M1...M3) associated with the same digital identification, a biometric data set (B1...B3) is extracted, and said biometric data set (B1...B3) is associated with the respective digital identification (Kd). b) An additional radiotelephony message (M4) is then recorded, and from it an additional biometric data set (B4) is extracted. Among the recorded biometric data sets (B1...B3), the biometric data set (B1) which best matches the additional biometric data set (B4) and the radiotelephony message (M4) of the vehicle (F1) is searched for, and the associated identification (Kd) is assigned to said biometric data set (B1).
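The search step in part b) amounts to finding the stored biometric data set closest to the additional one. A minimal sketch, assuming vector-valued biometric data sets and squared Euclidean distance as a stand-in for the patent's unspecified matching measure:

```python
def best_match(recorded, query_vec):
    # Search the stored biometric data sets (B1...B3) for the one that
    # best matches the additional set (B4). Each recorded entry carries
    # its digital identification Kd; both the dict layout and the
    # distance measure are illustrative assumptions.
    def dist(item):
        return sum((a - b) ** 2 for a, b in zip(item["vec"], query_vec))
    return min(recorded, key=dist)
```

The identification (Kd) attached to the returned entry can then be assigned to the new message.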
Abstract:
A communication interface apparatus for a system and a plurality of users is provided. The communication interface apparatus for the system and the plurality of users includes a first process unit configured to receive voice information and face information from at least one user, and determine whether the received voice information is voice information of at least one registered user based on user models corresponding to the respective received voice information and face information; a second process unit configured to receive the face information, and determine whether the at least one user's attention is on the system based on the received face information; and a third process unit configured to receive the voice information, analyze the received voice information, and determine whether the received voice information is substantially meaningful to the system based on a dialog model that represents conversation flow on a situation basis.
Abstract:
Method for automatic speaker classification by a digital system, wherein at least two different speaker classification methods are applied to digital speech data and their results are combined with one another.
Abstract:
The present invention relates to a pattern recognition system (Fig. 1) which uses data fusion to combine data from a plurality of extracted features (60, 61, 62) and a plurality of classifiers (70, 71, 72). Speaker patterns can be accurately verified with the combination of discriminant based and distortion based classifiers. A novel approach using a training set of a "leave one out" data can be used for training the system with a reduced data set (Figs. 7A, 7B, 7C). Extracted features can be improved with a pole filtered method for reducing channel effects (Fig. 11B) and an affine transformation for improving the correlation between training and testing data (Fig. 14).
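The combination of discriminant-based and distortion-based classifiers described above is typically realized as score-level fusion. A minimal sketch, assuming z-normalized scores, a linear fusion weight, and a fixed decision threshold (all illustrative choices, not the patent's exact rule):

```python
def zscore(x, mean, std):
    # Normalize a raw classifier score using statistics estimated on
    # held-out data, e.g. via the "leave one out" scheme described above.
    return (x - mean) / std

def verify(disc_score, dist_score, stats, w=0.6, threshold=0.0):
    # Linear score-level fusion. The discriminant score is higher for a
    # better match; the distortion score is a distance (lower is better),
    # so its polarity is flipped before combining. The weight w and the
    # threshold are hypothetical tuning parameters.
    s1 = zscore(disc_score, *stats["disc"])
    s2 = -zscore(dist_score, *stats["dist"])
    fused = w * s1 + (1 - w) * s2
    return fused >= threshold, fused
```

In practice the weight and threshold would be tuned on development data to trade off false accepts against false rejects.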
Abstract:
Implementations relate to automatic generation of speaker features for each of one or more particular text-dependent speaker verifications (TD-SVs) for a user. Implementations can generate speaker features for a particular TD-SV using instances of audio data that each capture a corresponding spoken utterance of the user during normal non-enrollment interactions with an automated assistant via one or more respective assistant devices. For example, a portion of an instance of audio data can be used in response to: (a) determining that recognized term(s) for the spoken utterance captured by that portion correspond to the particular TD-SV; and (b) determining that an authentication measure, for the user and for the spoken utterance, satisfies a threshold. Implementations additionally or alternatively relate to utilization of speaker features, for each of one or more particular TD-SVs for a user, in determining whether to authenticate a spoken utterance for the user.
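The selection logic in conditions (a) and (b) can be sketched as a filter over candidate utterances followed by aggregation into a speaker feature. The dict layout, the `embed` callable, and mean-pooling of embeddings are illustrative assumptions, not the implementations' actual representation:

```python
def collect_tdsv_features(instances, target_phrase, auth_threshold, embed):
    # Keep embeddings only for utterances whose recognized terms match
    # the particular TD-SV phrase (condition a) AND whose authentication
    # measure satisfies the threshold (condition b), then average them
    # into a single speaker feature vector.
    selected = [embed(inst["audio"])
                for inst in instances
                if inst["terms"] == target_phrase
                and inst["auth"] >= auth_threshold]
    if not selected:
        return None  # no qualifying non-enrollment utterances yet
    return [sum(vals) / len(selected) for vals in zip(*selected)]
```

Because the qualifying utterances come from normal assistant interactions, the speaker feature can be built up over time without an explicit enrollment session.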