Abstract:
A processor-implemented method with object tracking includes: determining an initial template image based on an input bounding box and an input image; generating an initial feature map by extracting features from the initial template image; generating a transformed feature map by performing feature transformation adapted to objectness on the initial feature map; generating an objectness probability map and a bounding box map indicating bounding box information corresponding to each coordinate of the objectness probability map by performing objectness-based bounding box regression analysis on the transformed feature map; and determining a refined bounding box based on the objectness probability map and the bounding box map.
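The final step above, selecting a refined bounding box from the objectness probability map and the bounding box map, can be sketched minimally as an argmax-based reading (the names and the selection rule here are illustrative assumptions, not the patented method's exact procedure):

```python
import numpy as np

def refine_bounding_box(objectness_map, bbox_map):
    """Select the box predicted at the most object-like coordinate.

    objectness_map: (H, W) objectness probabilities.
    bbox_map: (H, W, 4) box parameters per coordinate (here x1, y1, x2, y2).
    """
    y, x = np.unravel_index(np.argmax(objectness_map), objectness_map.shape)
    return bbox_map[y, x]

# Toy maps: the objectness peak at (2, 3) selects the box stored there.
obj = np.zeros((4, 4))
obj[2, 3] = 0.9
boxes = np.zeros((4, 4, 4))
boxes[2, 3] = [10, 20, 50, 60]
refined = refine_bounding_box(obj, boxes)
```

In practice the regression head would also aggregate nearby high-objectness coordinates rather than read a single peak, but the per-coordinate pairing of probability and box is the core idea.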
Abstract:
An interactive method includes displaying image content received through a television (TV) network, identifying an object of interest of a user among a plurality of regions or a plurality of objects included in the image content, and providing additional information corresponding to the object of interest.
Abstract:
A method of controlling a viewpoint of a user or a virtual object on a two-dimensional (2D) interactive display is provided. The method may convert a user input into structured data having at least six degrees of freedom (DOF), according to the number of touch points and their movement and rotation directions. Either the virtual object or the viewpoint of the user may be determined as the manipulation target based on a location of the touch point.
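A gesture-to-6-DOF conversion of the kind described can be sketched as a dispatch on the number of touch points; the specific mapping below (one-finger drag translates in x/y, two-finger pinch/twist handles z and rotation, three fingers tilt) is a hypothetical assignment for illustration:

```python
def touch_to_6dof(num_points, dx=0.0, dy=0.0, twist=0.0, pinch=0.0):
    """Map a touch gesture to a 6-DOF delta (tx, ty, tz, rx, ry, rz).

    Assumed mapping: one-finger drag -> translate in x/y;
    two-finger pinch -> translate along z, twist -> rotate about z;
    three-finger drag -> tilt about x/y.
    """
    if num_points == 1:
        return (dx, dy, 0.0, 0.0, 0.0, 0.0)
    if num_points == 2:
        return (0.0, 0.0, pinch, 0.0, 0.0, twist)
    return (0.0, 0.0, 0.0, dy, dx, 0.0)
```

Whether the delta is applied to the virtual object or to the user's viewpoint would then depend on whether the touch lands on the object or on empty space, per the abstract.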
Abstract:
A method of generating three-dimensional (3D) volumetric data may be performed by generating a multilayer image, generating volume information and a type of a visible part of an object based on the generated multilayer image, and generating volume information and a type of an invisible part of the object based on the generated multilayer image. The volume information and the type of each of the visible part and the invisible part may be generated based on the generated multilayer image, which may include at least one of a ray-casting-based multilayer image, a chroma key screen-based multilayer image, and a primitive template-based multilayer image.
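The visible/invisible distinction in a ray-casting-based multilayer image can be illustrated by walking one ray through an occupancy volume: the first solid run the ray enters is the visible surface, and later runs are occluded parts that layered casting still records. This is a simplified sketch, not the patented pipeline:

```python
import numpy as np

def multilayer_depths(volume, x, y):
    """Entry depths of solid runs along the ray through (x, y), front to back.

    The first entry is the visible layer; subsequent entries belong to
    invisible (occluded) parts of the object.
    """
    column = volume[:, y, x]  # occupancy along the viewing axis
    layers, inside = [], False
    for z, occupied in enumerate(column):
        if occupied and not inside:
            layers.append(z)
        inside = bool(occupied)
    return layers

# A column with two solid runs: one visible layer, one hidden behind it.
vol = np.zeros((6, 1, 1), dtype=bool)
vol[1:3, 0, 0] = True
vol[4:6, 0, 0] = True
```

Running this over every (x, y) yields a per-pixel stack of layer depths, i.e. a multilayer image from which both visible and invisible volume information can be derived.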
Abstract:
A method and apparatus for detecting a liveness based on a phase difference are provided. The method includes generating a first phase image based on first visual information of a first phase, generating a second phase image based on second visual information of a second phase, generating a minimum map based on a disparity between the first phase image and the second phase image, and detecting a liveness based on the minimum map.
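The minimum map described above can be sketched as the per-pixel minimum matching cost over candidate disparities between the two phase images. The cost function (absolute difference) and any downstream decision rule are assumptions for illustration:

```python
import numpy as np

def minimum_map(phase_a, phase_b, max_disp=4):
    """Per-pixel minimum absolute difference over candidate disparities.

    A flat spoof (printed photo or screen) shows near-zero phase
    disparity, so its minimum is reached at d = 0 almost everywhere,
    while a real 3D face requires varying shifts; a threshold or
    classifier on this map can then decide liveness.
    """
    best = np.abs(phase_a - phase_b)  # disparity d = 0
    for d in range(1, max_disp + 1):
        shifted = np.roll(phase_b, d, axis=1)  # candidate disparity d
        best = np.minimum(best, np.abs(phase_a - shifted))
    return best
```

With identical phase images (zero disparity everywhere, as from a perfectly flat target), the minimum map is exactly zero.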
Abstract:
A depth estimation method and apparatus are provided. The depth estimation method includes obtaining an image from an image sensor comprising upper pixels, each comprising N sub-pixels, obtaining N sub-images respectively corresponding to the N sub-pixels from the image, obtaining a viewpoint difference between the N sub-images using a first neural network, and obtaining a depth map of the image based on the viewpoint difference using a second neural network.
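The first stage, obtaining N sub-images from the sensor image, can be sketched as de-interleaving the sub-pixels; the layout assumed here (sub-pixels interleaved along columns, as in a dual-pixel sensor) is an assumption, and the two neural networks that follow are not modeled:

```python
import numpy as np

def split_sub_images(raw, n=2):
    """Split a raw sensor image into n sub-images, one per sub-pixel.

    Assumes the n sub-pixels of each upper pixel are interleaved along
    columns; each sub-image then observes the scene from a slightly
    shifted viewpoint, which is what the first network exploits.
    """
    return [raw[:, i::n] for i in range(n)]

raw = np.arange(32.0).reshape(4, 8)
subs = split_sub_images(raw, n=2)
```

The first network would estimate the viewpoint difference between `subs`, and the second would turn that difference into a dense depth map.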
Abstract:
A processor-implemented recognition method includes: receiving query input data; determining a domain to which the query input data belongs using a neural network-based classifier; and in response to the query input data belonging to a first domain, generating second query data of a second domain based on the query input data.
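The routing logic of the method can be sketched as a classify-then-translate step. The domain pair (an infrared-style query converted to an RGB-style one) and both model stubs below are hypothetical placeholders for the neural classifier and the cross-domain generator:

```python
def route_query(query, classify, translate):
    """Classify the query's domain; if it belongs to the first domain,
    generate second-domain query data from it, else pass it through.
    """
    return translate(query) if classify(query) == "first" else query

# Stubs standing in for trained models (illustrative only).
classify = lambda q: "first" if q.startswith("ir:") else "second"
translate = lambda q: "rgb:" + q[len("ir:"):]
```

Downstream recognition then always operates on second-domain data, regardless of which domain the query arrived in.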
Abstract:
A method and apparatus with emotion recognition acquires a plurality of pieces of data corresponding to a plurality of inputs for each modality and corresponding to a plurality of modalities; determines a dynamics representation vector corresponding to each of the plurality of modalities based on a plurality of features for each modality extracted from the plurality of pieces of data; determines a fused representation vector based on the plurality of dynamics representation vectors corresponding to the plurality of modalities; and recognizes an emotion of a user based on the fused representation vector.
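The per-modality dynamics and the fusion step can be sketched as follows. Both choices here, mean frame-to-frame feature change as the dynamics representation and concatenation as the fusion, are simple stand-ins for whatever learned operations the method actually uses:

```python
import numpy as np

def dynamics_vector(features):
    """Summarize a modality's temporal dynamics as the mean
    frame-to-frame feature change (one simple choice)."""
    features = np.asarray(features)
    return (features[1:] - features[:-1]).mean(axis=0)

def fuse(dynamics_vectors):
    """Fuse per-modality dynamics vectors by concatenation; a recognizer
    would then map the fused vector to an emotion label."""
    return np.concatenate(dynamics_vectors)

video = dynamics_vector([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]])  # video modality
audio = dynamics_vector([[1.0], [1.5], [2.0]])                 # audio modality
fused = fuse([video, audio])
```

The fused vector keeps each modality's dynamics intact, so a classifier can weigh, say, facial motion against vocal change when recognizing the emotion.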
Abstract:
A method and apparatus for estimating a pose of a user using a depth image are provided, the method including recognizing a pose of the user from the depth image and tracking the pose of the user using a user model, the recognizing and the tracking being performed exclusively of one another to enhance the precision of the pose estimation.
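The mutually exclusive use of recognition and tracking can be sketched as a per-frame switch: track while the previous estimate is reliable, otherwise fall back to recognition. The confidence-based switching rule and both model stubs are assumptions for illustration:

```python
def estimate_pose(depth_frame, prev_state, recognize, track, conf_threshold=0.5):
    """Per frame, run exactly one of tracking (when the previous state is
    reliable) or recognition, exclusively of one another."""
    if prev_state is not None and prev_state["confidence"] >= conf_threshold:
        return track(depth_frame, prev_state)
    return recognize(depth_frame)

# Stubs standing in for the recognizer and the model-based tracker.
recognize = lambda frame: {"pose": "standing", "confidence": 0.9}
track = lambda frame, state: {"pose": state["pose"], "confidence": 0.8}
```

On the first frame (no prior state) recognition runs; on subsequent frames the cheaper model-based tracker takes over until confidence drops below the threshold.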