Abstract:
Avatar animation systems disclosed herein provide high-quality, real-time avatar animation that is based on the varying countenance of a human face. In some example embodiments, the real-time provision of high-quality avatar animation is enabled, at least in part, by a multi-frame regressor that is configured to map information descriptive of facial expressions depicted in two or more images to information descriptive of a single avatar blend shape. The two or more images may be temporally sequential images. The multi-frame regressor implements a machine learning component that generates the high-quality avatar animation from information descriptive of a subject's face and/or information descriptive of avatar animation frames previously generated by the multi-frame regressor. The machine learning component may be trained using a set of training images that depict human facial expressions and avatar animation authored by professional animators to reflect the facial expressions depicted in the set of training images.
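The following is a minimal sketch of the multi-frame regressor concept described above: a learned mapping from expression features of several sequential frames, plus previously generated blend-shape output, to a single blend-shape frame. All names and dimensions (FRAMES, N_FEATURES, N_BLENDSHAPES, MultiFrameRegressor) are illustrative assumptions, not from the source; a linear model with random weights stands in for the trained machine learning component.

```python
import numpy as np

FRAMES = 3          # temporally sequential input frames per prediction (assumed)
N_FEATURES = 64     # facial-expression features extracted per frame (assumed)
N_BLENDSHAPES = 51  # avatar blend-shape weights to predict (assumed)

class MultiFrameRegressor:
    def __init__(self, seed=0):
        # In the disclosed system this mapping is learned from training
        # images and animator-authored blend shapes; random weights
        # stand in for the trained parameters here.
        rng = np.random.default_rng(seed)
        in_dim = FRAMES * N_FEATURES + N_BLENDSHAPES
        self.W = rng.normal(0.0, 0.01, (N_BLENDSHAPES, in_dim))
        self.b = np.zeros(N_BLENDSHAPES)
        self.prev = np.zeros(N_BLENDSHAPES)  # previously generated animation frame

    def predict(self, frame_features):
        # frame_features: (FRAMES, N_FEATURES) array of expression
        # features from two or more temporally sequential images.
        x = np.concatenate([frame_features.ravel(), self.prev])
        weights = np.clip(self.W @ x + self.b, 0.0, 1.0)
        self.prev = weights  # fed back into the next prediction
        return weights

regressor = MultiFrameRegressor()
window = np.random.rand(FRAMES, N_FEATURES)  # stand-in per-frame features
blend = regressor.predict(window)            # one avatar blend-shape frame
```

Feeding the previously generated blend-shape frame back into the input is one plausible way to realize the abstract's dependence on "avatar animation frames previously generated by the multi-frame regressor" and tends to temporally smooth the output.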
Abstract:
Techniques related to automatic target object selection from multiple tracked objects for imaging devices are discussed. Such techniques may include generating one or more object selection metrics, such as accumulated distances from frame center, accumulated velocities, and comparisons of predicted to actual trajectories for the tracked objects, and selecting the target object based on the object selection metric or metrics.
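A minimal sketch of the three metrics named above and a combined selection, assuming per-frame object centers are available for each track. The function name, the constant-velocity trajectory predictor, the metric weights, and the lower-is-better scoring convention are all illustrative assumptions rather than the patented method.

```python
import numpy as np

def select_target(tracks, frame_center):
    # tracks: dict of track id -> (T, 2) array of per-frame object centers.
    best_id, best_score = None, np.inf
    for track_id, pts in tracks.items():
        # Metric 1: accumulated distance from frame center over the window.
        center_dist = np.linalg.norm(pts - frame_center, axis=1).sum()
        # Metric 2: accumulated frame-to-frame velocity magnitudes.
        velocity = np.linalg.norm(np.diff(pts, axis=0), axis=1).sum()
        # Metric 3: predicted-vs-actual trajectory comparison, using a
        # constant-velocity prediction from the first two observations.
        t = np.arange(len(pts))[:, None]
        predicted = pts[0] + (pts[1] - pts[0]) * t
        traj_err = np.linalg.norm(predicted - pts, axis=1).sum()
        # Combine; a centered, steady, predictable object scores lowest.
        score = center_dist + 0.5 * velocity + 0.5 * traj_err
        if score < best_score:
            best_id, best_score = track_id, score
    return best_id

tracks = {
    "wandering": np.cumsum(np.random.randn(30, 2), axis=0) + 100.0,
    "centered":  np.tile([320.0, 240.0], (30, 1)),
}
target = select_target(tracks, frame_center=np.array([320.0, 240.0]))
# target == "centered": it stays at frame center with zero velocity
# and a perfectly predictable trajectory.
```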
Abstract:
Techniques related to object detection using binary coded images are discussed. Such techniques may include performing object detection based on multiple spatial correlation mappings between a generated binary coded image and a binary-coded-image-based object detection model, and nesting lookup tables such that binary coded representations are grouped and the groups are associated with confidence values for performing object detection.
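The sketch below illustrates two pieces of the above in simplified form: generating a binary coded image (here an LBP-style 8-bit code per pixel, an assumption) and a nested lookup table in which codes are grouped (here by their top four bits, also an assumption) and mapped to confidence values. It does not reproduce the multiple spatial correlation mappings; all function names are hypothetical.

```python
import numpy as np

def binary_code_3x3(patch):
    # LBP-style 8-bit code: each of the 8 ring neighbors compared to
    # the center pixel, packed clockwise from the top-left corner.
    ring = patch.ravel()[[0, 1, 2, 5, 8, 7, 6, 3]]
    bits = (ring >= patch[1, 1]).astype(np.uint8)
    return int(np.packbits(bits)[0])

def code_image(img):
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint16)
    for y in range(h - 2):
        for x in range(w - 2):
            out[y, x] = binary_code_3x3(img[y:y + 3, x:x + 3])
    return out

def build_nested_lut(model_codes):
    # Outer table keyed by a coarse code group (top 4 bits); each group
    # holds an inner table mapping full codes to confidence values
    # derived here from their frequency in the model image.
    lut = {}
    codes, counts = np.unique(model_codes, return_counts=True)
    for code, count in zip(codes, counts):
        code = int(code)
        lut.setdefault(code >> 4, {})[code] = float(count) / counts.sum()
    return lut

def detection_confidence(codes, lut):
    # Average per-code confidence; codes whose group is absent from the
    # model contribute zero, so the outer table prunes lookups early.
    total = 0.0
    for code in codes.ravel():
        total += lut.get(int(code) >> 4, {}).get(int(code), 0.0)
    return total / codes.size

rng = np.random.default_rng(1)
model_img = rng.integers(0, 256, (32, 32), dtype=np.uint8)
lut = build_nested_lut(code_image(model_img))
score = detection_confidence(code_image(model_img), lut)
```

Grouping codes in an outer table lets a detector reject whole families of non-matching codes with a single lookup before consulting the finer-grained inner table, which is one plausible motivation for the nesting described in the abstract.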
Abstract:
System, apparatus, method, and computer readable media for on-the-fly captured image data object tracking. An image or video stream is processed to detect and track an object concurrently with generation of the stream by a camera module. In one exemplary embodiment, HD image frames are processed at a rate of 30 fps or more to track one or more target objects. In embodiments, object detection is validated prior to employing detected object descriptor(s) as learning data to generate or update an object model. A device platform including a camera module and comporting with the exemplary architecture may provide 3A (autofocus, auto-exposure, auto-white-balance) functions based on objects robustly tracked in accordance with embodiments.
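A minimal sketch of the validation gate described above, assuming detected objects are summarized as feature descriptors and that a tracker supplies an overlap estimate for the detection. Validating by descriptor similarity to the existing model and by tracker overlap, and the threshold values, are illustrative assumptions; the function and parameter names are hypothetical.

```python
import numpy as np

def validate_and_learn(detection_desc, model_descs, track_overlap,
                       sim_thresh=0.7, overlap_thresh=0.5):
    # Gate learning: only detections that both overlap the tracker's
    # estimate and resemble the current object model are employed as
    # learning data; otherwise the model is returned unchanged.
    if track_overlap < overlap_thresh:
        return model_descs  # reject: detection disagrees with the tracker
    # Cosine similarity between the detection and each stored descriptor.
    sims = model_descs @ detection_desc / (
        np.linalg.norm(model_descs, axis=1)
        * np.linalg.norm(detection_desc) + 1e-9)
    if sims.max() < sim_thresh:
        return model_descs  # reject: too unlike the learned object
    # Accept: append the validated descriptor to the online object model.
    return np.vstack([model_descs, detection_desc])

model = np.random.rand(10, 128)               # stand-in object model
det = model[0] + 0.01 * np.random.rand(128)   # detection near a known view
model = validate_and_learn(det, model, track_overlap=0.8)  # accepted
```

Gating learning data this way keeps spurious detections from corrupting the object model, which supports the robust tracking on which the platform's 3A functions depend.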