Abstract:
An auto-recording method is disclosed that records automatically in response to a user request. The method generates image and voice data of the user, extracts feature points from the image data according to a predefined user recognition scheme, and then treats the user as the object to be followed based on the extracted feature points. It determines whether the image and voice data satisfy a recording reference required to perform recording. If they do, the method edits the image and voice data in a preset edit form and generates and stores at least one of recorded image data and recorded voice data.
Abstract:
A face recognition system based on adaptive learning includes a specific person detection and tracking unit for detecting and tracking a specific person from a moving image. A facial feature extraction unit extracts a plurality of facial feature vectors from the detected and tracked specific person. A face recognition unit searches for a given registration model by comparing the extracted facial feature vectors with facial feature vectors of the registration models previously stored in a user registration model database. A learning target selection unit selects a facial feature vector to be added to a record of the given registration model from among the extracted facial feature vectors. A registration model learning unit adds and updates the selected facial feature vector to the record of the given registration model.
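The recognition and update loop described above can be sketched roughly as follows. This is only an illustration: the cosine-similarity measure, the `low`/`high` thresholds, and the function names `recognize`, `select_learning_targets`, and `update_model` are assumptions made for the sketch and are not specified in the abstract.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def recognize(extracted, database):
    """Return the name of the registration model that best matches.

    `database` maps a model name to a non-empty list of stored vectors.
    """
    best_name, best_score = None, -1.0
    for name, stored in database.items():
        score = max(cosine(e, s) for e in extracted for s in stored)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


def select_learning_targets(extracted, stored, low=0.80, high=0.98):
    """Pick vectors close enough to be the same person but not
    near-duplicates of what the record already holds (hypothetical rule)."""
    targets = []
    for e in extracted:
        best = max(cosine(e, s) for s in stored)
        if low <= best < high:
            targets.append(e)
    return targets


def update_model(database, name, targets):
    """Add the selected vectors to the record of the matched model."""
    database[name].extend(targets)


database = {"alice": [[1.0, 0.0]], "bob": [[0.0, 1.0]]}
extracted = [[1.0, 0.0], [0.9, 0.3]]  # vectors from the tracked person

name = recognize(extracted, database)
targets = select_learning_targets(extracted, database[name])
update_model(database, name, targets)
```

In this toy run the exact duplicate `[1.0, 0.0]` is skipped and only the slightly different view `[0.9, 0.3]` is added, which is one plausible way to let a model adapt without filling up with redundant samples.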
Abstract:
A method and apparatus control the output level of a voice signal for video telephony by considering the distance between a user and a terminal and the surrounding noise. An input image signal and an input voice signal of the user, to be used for the video telephony, are received at the user's terminal. A received image signal and a received voice signal from the other party's terminal, to which the video telephony call is connected, are output on the user's terminal. The user's face region is extracted from the input image signal, and size information of the extracted face region is obtained. Distance information about the distance to the user is determined from the size information, and the output level of the received voice signal is controlled based on the distance information.
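One way to picture the face-size-to-volume chain is the sketch below. It assumes a simple pinhole-camera relation between face width in pixels and distance, and a linear mapping from distance to output level. The focal length, the assumed real face width, the level range, and both function names are hypothetical. The abstract does not state how distance is derived from size, and the surrounding-noise input is left out here.

```python
def estimate_distance_cm(face_width_px, focal_length_px=600.0,
                         real_face_width_cm=15.0):
    """Pinhole-camera estimate: a smaller face in the image means a
    more distant user. All default constants are assumptions."""
    if face_width_px <= 0:
        raise ValueError("face width must be positive")
    return focal_length_px * real_face_width_cm / face_width_px


def output_level(distance_cm, min_level=1, max_level=10,
                 near_cm=30.0, far_cm=150.0):
    """Map distance linearly onto a volume level, clamped to the range."""
    ratio = (distance_cm - near_cm) / (far_cm - near_cm)
    ratio = min(1.0, max(0.0, ratio))
    return round(min_level + ratio * (max_level - min_level))


near = output_level(estimate_distance_cm(300))  # large face, user is close
far = output_level(estimate_distance_cm(60))    # small face, user is far
```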
Abstract:
An apparatus and method can effectively detect both of a user's hands and the hand shape from images input through cameras. The detection uses a skin image, in which skin regions are detected from one of the input images, together with a stereoscopic distance image. For hand detection, background and noise are eliminated from a combined image of the skin image and the distance image, and the regions corresponding to the two actual hands are detected from effective images having a high probability of containing hands. For hand shape detection, non-skin regions are eliminated from the skin image based on the stereoscopic distance information, hand shape candidate regions are detected from the region that remains, and finally a hand shape is determined.
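The hand detection step can be illustrated with a small sketch. It assumes the skin image is a binary mask and the distance image is a per-pixel depth map that has already been computed from the stereo pair. The depth cutoff `max_depth`, the `min_area` noise filter, and the rule of keeping the two largest regions are assumptions for the example, not details taken from the abstract.

```python
from collections import deque


def combine(skin, depth, max_depth):
    """A pixel is a hand candidate if it is skin-coloured and near."""
    return [[bool(s) and d <= max_depth for s, d in zip(srow, drow)]
            for srow, drow in zip(skin, depth)]


def connected_regions(mask):
    """Group candidate pixels into 4-connected regions with a BFS."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            region, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def detect_hands(skin, depth, max_depth, min_area=2):
    """Drop far-away skin (background) and tiny blobs (noise), then keep
    the two largest remaining regions as the hands."""
    regions = [r for r in connected_regions(combine(skin, depth, max_depth))
               if len(r) >= min_area]
    regions.sort(key=len, reverse=True)
    return regions[:2]


skin = [[1, 1, 0, 0, 1, 1],
        [1, 1, 0, 0, 1, 1],
        [0, 0, 0, 1, 0, 0],
        [1, 1, 0, 0, 0, 0]]
depth = [[50, 50, 90, 90, 60, 60],
         [50, 50, 90, 90, 60, 60],
         [90, 90, 90, 55, 90, 90],
         [200, 200, 90, 90, 90, 90]]

hands = detect_hands(skin, depth, max_depth=100)
```

Here the skin pixels in the bottom row are rejected because they are too far away, and the isolated pixel in the third row is rejected as noise, leaving two four-pixel regions.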
Abstract:
Provided are a system and method for controlling voice detection of a network terminal. The system includes a network terminal and a server. When detection of a voice signal is requested, the network terminal receives and sets a voice detection setting value corresponding to a predetermined service, generates a trigger signal for the voice detection according to that setting value, and detects voice. The server determines the service of the network terminal and transmits the voice detection setting value corresponding to the service to the network terminal. Because the start of voice detection is controlled per service, voice detection optimized for the relevant service can begin.
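The division of work between server and terminal might look like the sketch below. Everything concrete in it is hypothetical: the service names, the idea that a setting value consists of an energy threshold plus a minimum run of voiced frames, and the class and method names. The abstract only says that a service-specific setting value is sent to the terminal and used to generate the trigger signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceDetectionSetting:
    energy_threshold: float   # frame energy needed to count as voiced
    min_voiced_frames: int    # consecutive voiced frames before triggering


class Server:
    """Determines the service and returns its setting value."""
    SETTINGS = {
        "voice_dialing": VoiceDetectionSetting(0.2, 2),
        "dictation": VoiceDetectionSetting(0.05, 3),
    }

    def setting_for(self, service):
        return self.SETTINGS[service]


class Terminal:
    """Applies the received setting and decides when to trigger."""

    def __init__(self):
        self.setting = None

    def configure(self, setting):
        self.setting = setting

    def trigger_index(self, frame_energies):
        """Return the frame index at which the trigger fires, or None."""
        run = 0
        for i, energy in enumerate(frame_energies):
            run = run + 1 if energy >= self.setting.energy_threshold else 0
            if run >= self.setting.min_voiced_frames:
                return i
        return None


server, terminal = Server(), Terminal()
energies = [0.01, 0.1, 0.1, 0.3, 0.3, 0.1]

terminal.configure(server.setting_for("voice_dialing"))
dialing_trigger = terminal.trigger_index(energies)

terminal.configure(server.setting_for("dictation"))
dictation_trigger = terminal.trigger_index(energies)
```

The same audio triggers at different frames depending on the service, which is the effect the abstract attributes to service-specific setting values.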
Abstract:
Disclosed is a speaker recognition method performed by a speaker recognition apparatus. The method includes detecting effective speech data from input speech; extracting an acoustic feature from the speech data; generating acoustic feature transformation matrices from the speech data by Principal Component Analysis (PCA) and by Linear Discriminant Analysis (LDA), mixing the two matrices to construct a hybrid acoustic feature transformation matrix, and multiplying the matrix representing the acoustic feature by the hybrid acoustic feature transformation matrix to generate a final feature vector; and generating a speaker model from the final feature vector, comparing the generated speaker model with a pre-stored universal speaker model to identify the speaker, and verifying the identified speaker.
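The hybrid transformation step can be sketched with NumPy as follows. The abstract does not say how the PCA and LDA matrices are mixed, so this example assumes the simplest option: stacking the top PCA directions on top of the top LDA directions. The function names, the row-stacking rule, and the synthetic two-speaker data are all assumptions.

```python
import numpy as np


def pca_matrix(X, k):
    """Top-k principal directions of X (n_samples x n_dims), one per row."""
    cov = np.cov(X - X.mean(axis=0), rowvar=False)
    vals, vecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order].T


def lda_matrix(X, y, k):
    """Top-k discriminant directions from within/between class scatter."""
    d = X.shape[1]
    mean = X.mean(axis=0)
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    vals, vecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(vals.real)[::-1][:k]
    return vecs[:, order].real.T


def hybrid_matrix(X, y, k_pca, k_lda):
    """Assumed mixing rule: stack PCA rows and LDA rows into one matrix."""
    return np.vstack([pca_matrix(X, k_pca), lda_matrix(X, y, k_lda)])


rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))      # 40 frames of 6-dim acoustic features
y = np.repeat([0, 1], 20)         # two speakers, 20 frames each
X[y == 1] += 2.0                  # shift speaker 1 so the classes differ

W = hybrid_matrix(X, y, k_pca=3, k_lda=1)
final_features = X @ W.T          # one 4-dim final feature vector per frame
```

With two speakers, LDA yields a single meaningful discriminant direction, which is why `k_lda=1` is used in the example.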