Abstract:
Methods and apparatus to detect collision of a virtual camera with objects in a three-dimensional volumetric model are disclosed herein. An example virtual camera system disclosed herein includes cameras to obtain images of a scene in an environment. The example virtual camera system also includes a virtual camera generator to create a 3D volumetric model of the scene based on the images, identify a 3D location of a virtual camera to be disposed in the 3D volumetric model, and detect whether a collision occurs between the virtual camera and one or more objects in the 3D volumetric model.
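The collision check described above can be illustrated with a minimal sketch. It assumes the 3D volumetric model is stored as a set of occupied voxel indices and that the virtual camera is approximated by a sphere; the abstract specifies neither, and the names `camera_collides`, `occupied`, and `voxel_size` are hypothetical.

```python
from typing import Iterable, Tuple

Voxel = Tuple[int, int, int]


def camera_collides(occupied: Iterable[Voxel],
                    camera_pos: Tuple[float, float, float],
                    radius: float,
                    voxel_size: float = 1.0) -> bool:
    """Return True if any occupied voxel centre lies within `radius` of the camera.

    Assumption: the model is a voxel set and the camera is a bounding sphere.
    """
    cx, cy, cz = camera_pos
    r2 = radius * radius
    for i, j, k in occupied:
        # Centre of voxel (i, j, k) in world coordinates.
        vx = (i + 0.5) * voxel_size
        vy = (j + 0.5) * voxel_size
        vz = (k + 0.5) * voxel_size
        if (vx - cx) ** 2 + (vy - cy) ** 2 + (vz - cz) ** 2 <= r2:
            return True
    return False
```

A linear scan keeps the sketch short; a real system would more likely query only the voxels near the camera.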
Abstract:
Systems and techniques for sensor-derived swing hit and direction detection are described herein. A set of sensor values may be compressed into a first lower dimension (2105). Features may be extracted from the compressed set of sensor values (2110). The features may be clustered into a set of clusters (2115). A swing action may be detected based on a distance between members of the set of clusters (2120).
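The four steps (2105 to 2120) might be sketched as follows. Every concrete choice here is an assumption, since the abstract does not name one: compression takes the magnitude of each 3-axis sample, features are window means, clustering is a one-dimensional 2-means, and the "distance between members of the set of clusters" is taken to be the gap between the two centroids.

```python
import math


def compress(samples):
    """Step 2105 (assumed): reduce each 3-axis sample to its magnitude."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]


def extract_features(values, window=2):
    """Step 2110 (assumed): mean of each non-overlapping window."""
    return [sum(values[i:i + window]) / window
            for i in range(0, len(values) - window + 1, window)]


def two_means(features, iterations=10):
    """Step 2115 (assumed): 1-D k-means with k=2, seeded at min and max."""
    lo, hi = min(features), max(features)
    for _ in range(iterations):
        a = [f for f in features if abs(f - lo) <= abs(f - hi)]
        b = [f for f in features if abs(f - lo) > abs(f - hi)]
        if not b:  # all features fell into one cluster
            break
        lo, hi = sum(a) / len(a), sum(b) / len(b)
    return lo, hi


def swing_detected(samples, threshold=5.0):
    """Step 2120 (assumed): a swing is a large gap between cluster centroids."""
    lo, hi = two_means(extract_features(compress(samples)))
    return (hi - lo) > threshold
```

The `threshold` value is illustrative only and would need tuning against real sensor data.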
Abstract:
Examples of systems and methods for augmented facial animation are generally described herein. A method for mapping facial expressions to an alternative avatar expression may include capturing a series of images of a face, and detecting a sequence of facial expressions of the face from the series of images. The method may include determining an alternative avatar expression mapped to the sequence of facial expressions, and animating an avatar using the alternative avatar expression.
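The mapping step could look like the lookup below. The sketch assumes facial expressions have already been detected and labelled as strings; the trigger sequences and avatar expression names are invented for illustration.

```python
# Hypothetical table: a trigger sequence of detected expressions maps to
# an alternative avatar expression.
ALTERNATIVE_EXPRESSIONS = {
    ("blink", "blink", "smile"): "heart_eyes",
    ("frown", "frown"): "storm_cloud",
}


def alternative_expression(detected):
    """Return the alternative expression whose trigger ends `detected`, else None."""
    for trigger, alternative in ALTERNATIVE_EXPRESSIONS.items():
        # Compare the most recent len(trigger) expressions with the trigger.
        if tuple(detected[-len(trigger):]) == trigger:
            return alternative
    return None
```

When the function returns `None`, an animator would presumably fall back to mirroring the detected expression directly.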
Abstract:
A method for providing hand segmentation for gesture analysis may include determining a target region based at least in part on depth range data corresponding to an intensity image. The intensity image may include data descriptive of a hand. The method may further include determining a point of interest of a hand portion of the target region, determining a shape corresponding to a palm region of the hand, and removing a selected portion of the target region to identify a portion of the target region corresponding to the hand. An apparatus and computer program product corresponding to the method are also provided.
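A rough sketch of these steps follows, under strong simplifying assumptions that are not in the abstract: the depth data is a 2-D list, the point of interest is the centroid of the target region, the palm shape is a circle of known radius, and the forearm extends toward the bottom of the image so that everything below the palm circle is removed.

```python
def target_region(depth, near, far):
    """Pixels (row, col) whose depth falls inside the range [near, far]."""
    return {(r, c)
            for r, row in enumerate(depth)
            for c, d in enumerate(row)
            if near <= d <= far}


def point_of_interest(region):
    """Assumed point of interest: centroid (row, col) of the target region."""
    n = len(region)
    return (sum(r for r, _ in region) / n, sum(c for _, c in region) / n)


def remove_forearm(region, centre, palm_radius):
    """Drop pixels below the assumed palm circle, keeping the hand portion."""
    limit = centre[0] + palm_radius
    return {(r, c) for (r, c) in region if r <= limit}
```

For example, with a depth range of 1 to 3 and a vertical strip of in-range pixels, `remove_forearm` trims the rows that lie more than `palm_radius` below the centroid.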
Abstract:
Disclosed, inter alia, is the classification of samples of a set of samples of an image as to whether or not a respective sample comprises optical information on a specific object type, based on classification information. The classification information comprises a plurality of classifiers, each classifier being associated exclusively with one colour channel of a plurality of colour channels.
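To make the per-channel idea concrete, here is a minimal sketch in which each classifier is a threshold on exactly one colour channel and the outputs are combined by majority vote. The threshold form, the vote, and the numeric values are all assumptions; the abstract states only that each classifier is tied to a single channel.

```python
# Hypothetical classifiers as (channel index, threshold) pairs.
# Each one reads exactly one colour channel of a sample.
CLASSIFIERS = [(0, 120), (1, 80), (2, 60)]


def classify_sample(sample, classifiers=CLASSIFIERS):
    """Return True if most single-channel classifiers fire for this sample."""
    votes = sum(1 for channel, threshold in classifiers
                if sample[channel] > threshold)
    return votes * 2 > len(classifiers)
```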
Abstract:
The present invention relates to a method for invoking an operation of a communication terminal in response to registering and interpreting a predetermined motion or pattern of an object. It further relates to a computer-readable medium in which the present invention is implemented.
Abstract:
An example apparatus includes processor circuitry to extract features from image data obtained from a plurality of cameras, the extraction of features performed using a plurality of sequential neural network layers; in response to each of the plurality of sequential neural network layers extracting the features, identify the extracted features in a torso region of the image data via a plurality of attention modules; estimate body landmarks from image data to localize an area; generate an upper heatmap mask based on a geometric center of the image data; calculate a loss function for the image data based on a cross-entropy loss, a pixel-wise loss, and a triplet loss determined from the extracted features and the generated heatmap mask; select lowest correlated classes based on calculated correlations between pairs of a plurality of classes; and calculate voting scores for groups associated with the lowest correlated classes.
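The selection of lowest correlated classes can be illustrated in isolation. The sketch below assumes each class has a list of per-sample scores, measures correlation with the Pearson coefficient, and ranks pairs by absolute value; none of these choices is stated in the abstract, and the function names are hypothetical. The voting step is not covered.

```python
from itertools import combinations
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists with non-zero variance."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def lowest_correlated_pairs(scores, k=1):
    """Return the k class pairs with the smallest absolute correlation."""
    pairs = combinations(sorted(scores), 2)
    ranked = sorted(pairs,
                    key=lambda p: abs(pearson(scores[p[0]], scores[p[1]])))
    return ranked[:k]
```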
Abstract:
Techniques related to performing object or person association or correspondence in multi-view video are discussed. Such techniques include determining correspondences at a particular time instance based on separately optimizing correspondence sub-matrices for distance sub-matrices based on two-way minimum distance pairs between frame pairs, generating and fusing tracklets across time instances, and adjusting correspondence, after such tracklet processing, via elimination of outlier object positions and rearrangement of object correspondence.
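One reading of "two-way minimum distance pairs" is mutual nearest neighbours: a pair is kept only when each object is the other's closest match. The sketch below shows that reading for two frames of 2-D object positions. It is an interpretation rather than the disclosed method, and it leaves out the sub-matrix optimization and tracklet stages.

```python
import math


def two_way_min_pairs(points_a, points_b):
    """Return (i, j) pairs where a[i] and b[j] are each other's nearest point."""

    def nearest(p, candidates):
        # Index of the candidate closest to p.
        return min(range(len(candidates)),
                   key=lambda j: math.dist(p, candidates[j]))

    pairs = []
    for i, p in enumerate(points_a):
        j = nearest(p, points_b)
        # Keep the pair only if the match also holds in the reverse direction.
        if nearest(points_b[j], points_a) == i:
            pairs.append((i, j))
    return pairs
```

Objects without a mutual match stay unpaired in this sketch.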