摘要:
A device and method for pass-phrase modeling for speaker verification and a speaker verification system are provided. The device comprises a front end which receives enrollment speech from a target speaker, and a template generation unit which generates a pass-phrase template with a general speaker model based on the enrollment speech. With the device, method and system of the present disclosure, by taking the rich variations contained in a general speaker model into account, the robust pass-phrase modeling is ensured even the enrollment data is insufficient, even just one pass-phrase is available from a target speaker.
摘要:
The present invention provides a device that performs online self-adaption of anchor models for an acoustic space, and a method thereof, the anchor models being used for categorization of an AV stream which is performed based on an audio stream in the AV stream. The device divides an input audio stream into audio segments, each being estimated to have a single acoustic feature, and estimates a single probability model for each audio segment. Then, the device performs clustering on the estimated probability models and probability models stored therein, thereby generating a new anchor model.
摘要:
A modeling device comprises a front end which receives enrollment speech data from each target speaker, a reference anchor set generation unit which generates a reference anchor set using the enrollment speech data based on an anchor space, and a voice print generation unit which generates voice prints based on the reference anchor set and the enrollment speech data. By taking the enrollment speech and speaker adaptation technique into account, anchor models with a smaller size can be generated, so reliable and robust speaker recognition with a smaller size reference anchor set is possible.
摘要:
A device and method for pass-phrase modeling for speaker verification and a speaker verification system are provided. The device comprises a front end which receives enrollment speech from a target speaker, and a template generation unit which generates a pass-phrase template with a general speaker model based on the enrollment speech. With the device, method and system of the present disclosure, by taking the rich variations contained in a general speaker model into account, the robust pass-phrase modeling is ensured even the enrollment data is insufficient, even just one pass-phrase is available from a target speaker.
摘要:
A modeling device and method for speaker recognition and a speaker recognition system are provided. The modeling device comprises a front end which receives enrollment speech data from each target speaker, a reference anchor set generation unit which generates a reference anchor set using the enrollment speech data based on an anchor space, and a voice print generation unit which generates voice prints based on the reference anchor set and the enrollment speech data. With the present disclosure, by taking the enrollment speech and speaker adaptation technique into account, anchor models with smaller size can be generated, so reliable and robust speaker recognition with smaller size reference anchor set is possible. It brings great advantages for computation speed improvement and great memory reduction.