Abstract:
An apparatus for extracting features for speech recognition in accordance with the present invention includes: a frame forming portion configured to separate input speech signals in frame units having a prescribed size; a static feature extracting portion configured to extract a static feature vector for each frame of the speech signals; a dynamic feature extracting portion configured to extract a dynamic feature vector representing a temporal variance of the extracted static feature vector by use of a basis function or a basis vector; and a feature vector combining portion configured to combine the extracted static feature vector with the extracted dynamic feature vector to configure a feature vector stream.
Abstract:
Disclosed herein is a voice-based CAPTCHA method and apparatus which can perform a CAPTCHA procedure using the voice of a human being. In the voice-based CAPTCHA) method, a plurality of uttered sounds of a user are collected. A start point and an end point of a voice from each of the collected uttered sounds are detected and then speech sections are detected. Uttered sounds of the respective detected speech sections are compared with reference uttered sounds, and then it is determined whether the uttered sounds are correctly uttered sounds. It is determined whether the uttered sounds have been made by an identical speaker if it is determined that the uttered sounds are correctly uttered sounds. Accordingly, a CAPTCHA procedure is performed using the voice of a human being, and thus it can be easily checked whether a human being has personally made a response using a voice online