-
Publication No.: US20230326465A1
Publication date: 2023-10-12
Application No.: US18023556
Filing date: 2020-08-31
Applicant: NEC Corporation
Inventor: Hitoshi Yamamoto
Abstract: The present disclosure implements speaker verification with high accuracy regardless of the input device. An integration unit (110) integrates voice data input using an input device with the frequency characteristic of that input device, and a feature extraction unit (120) extracts, from the integrated feature obtained by integrating the voice data and the frequency characteristic, a speaker feature for verifying the speaker of the voice.
-
Publication No.: US20230298594A1
Publication date: 2023-09-21
Application No.: US18323496
Filing date: 2023-05-25
Inventor: Yi GAO
IPC classification: G10L17/20, G10L17/02, G10L17/06, G10L17/22, G10L21/0232, G10L25/18, H04R1/40, H04R3/00, G10L15/22
CPC classification: G10L17/20, G10L17/02, G10L17/06, G10L17/22, G10L21/0232, G10L25/18, H04R1/406, H04R3/005, G10L15/22, G10L2021/02082
Abstract: An audio data processing method is provided. The method includes: obtaining multi-path audio data in an environmental space, obtaining a speech data set based on the multi-path audio data, and separately generating, in a plurality of enhancement directions, enhanced speech information corresponding to the speech data set; matching a speech hidden feature in the enhanced speech information with a target matching word, and determining, as a target audio direction, the enhancement direction corresponding to the enhanced speech information having the highest degree of matching with the target matching word; obtaining speech spectrum features in the enhanced speech information, and obtaining, from the speech spectrum features, a speech spectrum feature in the target audio direction; and performing speech authentication on the speech hidden feature and the speech spectrum feature that are in the target audio direction based on the target matching word, to obtain a target authentication result.
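Illustrative note: the direction-selection step described in this abstract can be sketched as a simple arg-max. The sketch assumes each enhancement direction has already been given a numeric degree of matching against the target matching word; the function name, the angle keys, and the score values are hypothetical and are not taken from the patent.

```python
def pick_target_direction(match_by_direction):
    """Return the enhancement direction that best matches the target word.

    match_by_direction maps a direction label (here an angle in degrees,
    which is an assumption) to a matching degree produced by some keyword
    matcher that is outside the scope of this sketch.
    """
    if not match_by_direction:
        raise ValueError("no enhancement directions were scored")
    # Highest matching degree wins; ties go to the first direction seen.
    return max(match_by_direction, key=match_by_direction.get)


# Hypothetical matching degrees for four beamforming directions.
scores = {0: 0.21, 90: 0.87, 180: 0.35, 270: 0.10}
print(pick_target_direction(scores))
```

In the abstract, the selected direction then determines which speech hidden feature and speech spectrum feature are passed to the authentication step; that second stage is not modelled here.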
-
Publication No.: US20230282217A1
Publication date: 2023-09-07
Application No.: US18016571
Filing date: 2020-07-27
Applicant: NEC Corporation
Inventors: Koji OKABE, Takafumi Koshinaka
IPC classification: G10L17/04, G10L17/02, G10L17/20, G10L21/0208
CPC classification: G10L17/04, G10L17/02, G10L17/20, G10L21/0208
Abstract: The voice registration device 1X mainly includes a noise reproduction means 220X, a voice data acquisition means 200X, and a voice registration means 210X. The noise reproduction means 220X is configured to reproduce noise data during a time period in which voice input from a user is performed. The voice data acquisition means 200X is configured to acquire the voice data based on the voice input. The voice registration means 210X is configured to register the voice data, or data generated based on the voice data, as data to be used for verification relating to a voice of the user.
-
Publication No.: US20230197085A1
Publication date: 2023-06-22
Application No.: US17997243
Filing date: 2020-06-22
Inventors: Xiaoxia DONG, Jun WEI, Qimeng PAN
IPC classification: G10L17/20, G10L25/51, G10L21/0216, G10L17/22
CPC classification: G10L17/20, G10L25/51, G10L21/0216, G10L17/22
Abstract: Embodiments include methods for voice/speech recognition in noisy environments executed by a processor of a computing device. In various embodiments, voice or speech recognition may be executed by a processor of a computing device, which may include determining a voice recognition model to use for voice and/or speech recognition based on a location where an audio input is received and performing voice and/or speech recognition on the audio input using the determined voice recognition model. Some embodiments may receive, from a computing device, an audio input and location information associated with a location where the audio input was recorded. The received audio input may be used to generate a voice recognition model associated with the location where the audio input was recorded for use in voice and/or speech recognition. The generated voice recognition model associated with the location may be provided to the computing device.
-
Publication No.: US20180293988A1
Publication date: 2018-10-11
Application No.: US15483246
Filing date: 2017-04-10
Applicant: Intel Corporation
Abstract: Techniques related to speaker recognition are discussed. Such techniques include determining context-aware confidence values formed of false accept and false reject rates determined by using adaptively updated acoustic environment score distributions matched to current score distributions.
-
Publication No.: US09972323B2
Publication date: 2018-05-15
Application No.: US15599578
Filing date: 2017-05-19
Applicant: Google LLC
CPC classification: G10L17/20, G06F3/167, G10L17/005, G10L17/02, G10L17/04, G10L17/06, G10L17/08, G10L17/12, G10L17/22, G10L17/24, G10L25/84, H04M3/385
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a dynamic threshold for speaker verification are disclosed. In one aspect, a method includes the actions of receiving, for each of multiple utterances of a hotword, a data set including at least a speaker verification confidence score and environmental context data. The actions further include selecting, from among the data sets, a subset of the data sets that are associated with a particular environmental context. The actions further include selecting a particular data set from among the subset of data sets based on one or more selection criteria. The actions further include selecting, as a speaker verification threshold for the particular environmental context, the speaker verification confidence score of the particular data set. The actions further include providing the speaker verification threshold for use in performing speaker verification of utterances that are associated with the particular environmental context.
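Illustrative note: the threshold-selection actions in this abstract can be sketched as follows. The abstract does not state the selection criteria, so this sketch assumes one hypothetical criterion: take the score below which a chosen fraction of the context's utterances fall. The class name, the field names, and the `reject_fraction` parameter are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HotwordDataSet:
    confidence: float  # speaker verification confidence score
    context: str       # environmental context label, e.g. "quiet" or "car"


def select_threshold(data_sets, context, reject_fraction=0.1):
    """Choose a verification threshold for one environmental context.

    Assumed criterion (not from the patent): the chosen data set is the one
    at the reject_fraction position of the context's sorted scores.
    """
    # Step 1: keep only the data sets recorded in the requested context.
    scores = sorted(d.confidence for d in data_sets if d.context == context)
    if not scores:
        raise ValueError(f"no data sets for context {context!r}")
    # Step 2: pick one data set by the assumed criterion.
    index = min(int(reject_fraction * len(scores)), len(scores) - 1)
    # Step 3: its confidence score becomes the threshold for this context.
    return scores[index]


history = [
    HotwordDataSet(0.91, "quiet"), HotwordDataSet(0.75, "quiet"),
    HotwordDataSet(0.88, "quiet"), HotwordDataSet(0.95, "quiet"),
    HotwordDataSet(0.60, "quiet"), HotwordDataSet(0.55, "car"),
    HotwordDataSet(0.40, "car"), HotwordDataSet(0.62, "car"),
]
print(select_threshold(history, "quiet", reject_fraction=0.5))
```

Keeping one threshold per context lets a noisy context such as "car" use a lower threshold than "quiet" without loosening verification everywhere.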
-
Publication No.: US09966062B2
Publication date: 2018-05-08
Application No.: US15466448
Filing date: 2017-03-22
IPC classification: G10L15/22, G10L15/20, G10L15/06, G06F3/0484
CPC classification: G10L15/063, G06F3/04842, G10L15/20, G10L17/04, G10L17/20, G10L25/84, G10L2015/0638, H04W88/02
Abstract: A method on a mobile device for voice recognition training is described. A voice training mode is entered. A voice training sample for a user of the mobile device is recorded. The voice training mode is interrupted to enter a noise indicator mode based on a sample background noise level for the voice training sample and a sample background noise type for the voice training sample. The voice training mode is returned to from the noise indicator mode when the user provides a continuation input that indicates a current background noise level meets an indicator threshold value.
-
Publication No.: US20180075851A1
Publication date: 2018-03-15
Application No.: US15804220
Filing date: 2017-11-06
Inventor: Horst J. Schroeter
CPC classification: G10L17/24, G10L17/005, G10L17/04, G10L17/08, G10L17/20
Abstract: Disclosed herein are systems, methods, and tangible computer-readable media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrate sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold, or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
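Illustrative note: the variance check in this abstract can be sketched by comparing every pair of samples and denying verification when they are too similar. The sketch assumes each speech sample has already been reduced to a fixed-length feature vector; the Euclidean distance measure, the function names, and the `min_variation` value are hypothetical choices, not the patent's.

```python
import math
from itertools import combinations


def mean_pairwise_distance(samples):
    """Average Euclidean distance over every pair of feature vectors."""
    pairs = list(combinations(samples, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def verify_by_variance(samples, min_variation=0.05):
    """Return True only if repeated utterances differ enough to look live.

    Identical or near-identical samples suggest replayed or synthesized
    speech, so they are denied. min_variation is an assumed threshold.
    """
    if len(samples) < 2:
        raise ValueError("need at least two speech samples to compare")
    return mean_pairwise_distance(samples) >= min_variation


replayed = [[0.20, 0.50, 0.10]] * 3            # the same vector three times
live = [[0.20, 0.50, 0.10], [0.25, 0.42, 0.18], [0.15, 0.55, 0.05]]
print(verify_by_variance(replayed), verify_by_variance(live))
```

Raising `min_variation` corresponds to the embodiment in which the variance threshold is adjusted to the required authentication certainty.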
-
Publication No.: US20180053512A1
Publication date: 2018-02-22
Application No.: US15242882
Filing date: 2016-08-22
Applicant: INTEL CORPORATION
Inventors: Gokcen Cilingir, Narayan Biswal
IPC classification: G10L17/04, G10L17/12, G10L17/20, G10L21/0208
CPC classification: G10L17/04, G10L17/06, G10L17/12, G10L17/20, G10L21/0208, G10L2021/02082
Abstract: Techniques are provided for reverberation compensation for far-field speaker recognition. A methodology implementing the techniques according to an embodiment includes receiving an authentication audio signal associated with speech of a user and extracting features from the authentication audio signal. The method also includes scoring results of application of one or more speaker models to the extracted features. Each of the speaker models is trained based on a training audio signal processed by a reverberation simulator to simulate selected far-field environmental effects to be associated with that speaker model. The method further includes selecting one of the speaker models, based on the score, and mapping the selected speaker model to a known speaker identification or label that is associated with the user.
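Illustrative note: the score-and-select step in this abstract can be sketched with several models per speaker, one for each simulated acoustic environment. The sketch assumes each model is summarized by a single centroid vector and scored by mean squared distance, which is a simplified stand-in for a real speaker-model score; the reverberation simulator and the training step are not shown, and all names and values are hypothetical.

```python
def mean_squared_distance(centroid, frames):
    """Average squared distance between feature frames and a model centroid."""
    total = 0.0
    for frame in frames:
        total += sum((x - c) ** 2 for x, c in zip(frame, centroid))
    return total / len(frames)


def identify_speaker(models, frames):
    """Score every speaker model and map the best one to its speaker label.

    Each model is a dict with "speaker", "environment", and "centroid" keys
    (an assumed layout). A lower distance counts as a better score.
    """
    if not models or not frames:
        raise ValueError("need at least one model and one feature frame")
    best = min(models, key=lambda m: mean_squared_distance(m["centroid"], frames))
    return best["speaker"], best["environment"]


models = [
    {"speaker": "alice", "environment": "near_field", "centroid": [0.0, 0.0]},
    {"speaker": "alice", "environment": "far_field", "centroid": [1.0, 1.0]},
    {"speaker": "bob", "environment": "far_field", "centroid": [5.0, 5.0]},
]
frames = [[0.9, 1.1], [1.1, 0.9]]  # features from the authentication audio
print(identify_speaker(models, frames))
```

Because the same speaker appears under more than one environment, a far-field recording can still match the correct speaker through that speaker's reverberation-conditioned model.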
-
Publication No.: US20170345430A1
Publication date: 2017-11-30
Application No.: US15599578
Filing date: 2017-05-19
Applicant: Google Inc.
CPC classification: G10L17/20, G06F3/167, G10L17/005, G10L17/02, G10L17/04, G10L17/06, G10L17/08, G10L17/12, G10L17/22, G10L17/24, G10L25/84, H04M3/385
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a dynamic threshold for speaker verification are disclosed. In one aspect, a method includes the actions of receiving, for each of multiple utterances of a hotword, a data set including at least a speaker verification confidence score and environmental context data. The actions further include selecting, from among the data sets, a subset of the data sets that are associated with a particular environmental context. The actions further include selecting a particular data set from among the subset of data sets based on one or more selection criteria. The actions further include selecting, as a speaker verification threshold for the particular environmental context, the speaker verification confidence score of the particular data set. The actions further include providing the speaker verification threshold for use in performing speaker verification of utterances that are associated with the particular environmental context.
-