Abstract:
The present disclosure relates to a technology for skeleton-based action recognition using a graph convolutional network, in which an action processing device receives a frame including a skeleton representing the actions of an object, extracts spatiotemporal features of the skeleton using a rank adjacency matrix that accounts for both the distance between nodes and their adjacency ranking, merges the object and vertices in the input frame based on the extracted spatiotemporal features, and performs a classification task.
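As a rough illustration of the rank adjacency idea, the sketch below builds an adjacency matrix for one skeleton frame by ranking each node's neighbours by Euclidean distance and weighting them by that rank. The 1/rank weighting, the neighbour count `k`, and the toy joint coordinates are illustrative assumptions, not the patented formulation.

```python
import numpy as np

def rank_adjacency(joints, k=3):
    """Build a rank adjacency matrix for one skeleton frame.

    For each node, the other nodes are ranked by Euclidean distance,
    and the k nearest neighbours receive a weight that decays with
    their adjacency rank (1/rank). Weighting scheme is illustrative.
    """
    n = len(joints)
    # Pairwise Euclidean distances between all joints
    dist = np.linalg.norm(joints[:, None, :] - joints[None, :, :], axis=-1)
    adj = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(dist[i])  # order[0] is the node itself
        for rank, j in enumerate(order[1:k + 1], start=1):
            adj[i, j] = 1.0 / rank
    return adj

# Toy 4-joint skeleton in 2D
joints = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
A = rank_adjacency(joints, k=2)
```

A graph convolution would then aggregate joint features through `A` instead of a plain 0/1 skeleton adjacency, so nearer joints contribute more.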
Abstract:
Embodiments relate to a user authentication device configured to detect a face region in a target object image including at least part of a face of a target object, recognize whether the face region is masked or unmasked, extract target object characteristics data from the face region of the target object image, call reference data, and authenticate whether the target object is a registered device user based on the called reference data and the target object characteristics data. The reference data is generated from an unmasked image of the registered device user.
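The final matching step can be sketched as comparing an extracted characteristics vector against the registered reference data. The cosine-similarity measure, the threshold value, and the toy vectors below are assumptions for illustration; the actual feature extraction and decision rule are defined by the embodiments.

```python
import numpy as np

def authenticate(target_feat, reference_feat, threshold=0.8):
    """Decide whether the target matches the registered user by cosine
    similarity of characteristic feature vectors (illustrative rule)."""
    cos = float(np.dot(target_feat, reference_feat) /
                (np.linalg.norm(target_feat) * np.linalg.norm(reference_feat)))
    return cos >= threshold

# Hypothetical feature vectors
reference = np.array([0.2, 0.9, 0.4])   # from the registered user's unmasked image
probe = np.array([0.25, 0.85, 0.45])    # extracted from the (possibly masked) face region
accepted = authenticate(probe, reference)
```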
Abstract:
A method for automatic facial impression transformation includes extracting landmark points for elements of a target face whose facial impression is to be transformed, as well as distance vectors respectively representing distances between the landmark points; comparing the distance vectors to select a learning data set similar to the target face from a database; extracting landmark points and distance vectors from the learning data set; transforming a local feature of the target face based on the landmark points of the learning data set and score data for a facial impression; and transforming a global feature of the target face based on the distance vectors of the learning data set and the score data for the facial impression. Accordingly, a facial impression may be transformed in various ways while preserving the identity of the corresponding person.
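The selection step can be sketched as comparing distance vectors built from landmark points. Below, a distance vector is formed from all pairwise landmark distances, and the database entry with the smallest vector difference is chosen; the vector layout and toy coordinates are illustrative assumptions.

```python
import numpy as np

def landmark_distance_vector(landmarks):
    """Flatten all pairwise distances between facial landmark points
    into one vector (upper triangle of the distance matrix)."""
    d = np.linalg.norm(landmarks[:, None, :] - landmarks[None, :, :], axis=-1)
    iu = np.triu_indices(len(landmarks), k=1)
    return d[iu]

def select_similar(target_vec, database_vecs):
    """Pick the database entry whose distance vector is closest to the target."""
    errs = [np.linalg.norm(target_vec - v) for v in database_vecs]
    return int(np.argmin(errs))

# Toy 3-landmark faces: the second database face is nearly the target
target = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
db = [
    landmark_distance_vector(np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])),
    landmark_distance_vector(np.array([[0.0, 0.0], [1.1, 0.0], [0.0, 1.0]])),
]
best = select_similar(landmark_distance_vector(target), db)
```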
Abstract:
Disclosed are a device and method for inferring a correlation between objects through image recognition. The device for inferring a correlation between objects through image recognition according to an embodiment comprises a communicator and an interaction inferencer configured to select a main object interacting with a target object within a predetermined distance in an input image, or to generate a social graph including the main object and the target object.
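A minimal sketch of the social-graph step, assuming detected objects come with 2D positions: objects within a distance threshold of the target are linked to it. The adjacency-list representation, the threshold, and the detection names are hypothetical.

```python
import math

def build_social_graph(objects, target, max_dist=2.0):
    """Link the target object to main objects detected within max_dist
    of it, returning an adjacency-list social graph (illustrative)."""
    graph = {target: []}
    tx, ty = objects[target]
    for name, (x, y) in objects.items():
        if name == target:
            continue
        if math.hypot(x - tx, y - ty) <= max_dist:
            graph[target].append(name)
            graph.setdefault(name, []).append(target)
    return graph

# Hypothetical detections with image-plane coordinates
detections = {"person": (0.0, 0.0), "dog": (1.0, 1.0), "car": (9.0, 9.0)}
graph = build_social_graph(detections, "person")
```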
Abstract:
A kiosk for providing a recommendation service according to an embodiment displays an orderer's past ordered product as a recommended product on the kiosk screen, the past ordered product being retrieved based on the result of a similarity calculation between a current input attribute, representing a contextual feature of the current order status, and a past input attribute stored in memory.
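The similarity calculation can be sketched as matching contextual attributes of the current order against stored past orders. The attribute names (time slot, weather, party size), the match-ratio similarity, and the threshold are illustrative assumptions.

```python
def order_similarity(current, past):
    """Fraction of contextual attributes that match between the current
    order context and a stored past context (illustrative measure)."""
    keys = set(current) | set(past)
    matches = sum(1 for k in keys if current.get(k) == past.get(k))
    return matches / len(keys)

def recommend(current, history, threshold=0.5):
    """Return past ordered products whose stored context is similar
    enough to the current order context."""
    return [rec["product"] for rec in history
            if order_similarity(current, rec["context"]) >= threshold]

# Hypothetical order history
now = {"time": "morning", "weather": "cold", "party": 1}
history = [
    {"product": "hot latte", "context": {"time": "morning", "weather": "cold", "party": 1}},
    {"product": "iced tea", "context": {"time": "afternoon", "weather": "hot", "party": 3}},
]
picks = recommend(now, history)
```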
Abstract:
Embodiments relate to a dynamic image capturing method and apparatus using arbitrary-viewpoint image generation technology, in which an image of background content, displayed on a background content display unit or implemented in a virtual space through a chroma key screen, is generated with a view matching the camera's view of the subject at the camera's viewpoint, and a final image including the background content image and a subject area is obtained.
Abstract:
Disclosed are an X-ray image reading support method and an X-ray image reading support system performing the method. The method includes the steps of: acquiring a target X-ray image captured by transmitting or reflecting X-rays in a reading space in which an object to be read is disposed; applying the target X-ray image to a reading model that extracts features from an input image; and identifying the object to be read as an object corresponding to a classified class when the object to be read is classified into a set class based on a first feature set extracted from the target X-ray image.
Abstract:
A method of multi-view deblurring for 3-dimensional (3D) shape reconstruction includes: receiving images captured by multiple synchronized cameras at multiple viewpoints; iteratively estimating a depth map, latent image, and 3D motion at each viewpoint for the received images; determining whether image deblurring at each viewpoint is complete; and performing 3D reconstruction based on the final depth maps and latent images at each viewpoint. Accordingly, accurate deblurring and 3D reconstruction can be achieved even from arbitrarily motion-blurred images.
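The control flow of the steps above can be sketched as an alternating estimation loop per viewpoint followed by reconstruction. The estimator and convergence callbacks below are placeholders for the actual optimizers, which the abstract does not specify.

```python
def multiview_deblur(blurred, estimate_depth, estimate_latent, estimate_motion,
                     converged, reconstruct_3d, max_iter=20):
    """Alternately refine depth map, latent image, and 3D motion at each
    viewpoint until deblurring converges, then reconstruct 3D shape.
    Callbacks are placeholders; only the control flow is illustrated."""
    state = {v: {"depth": None, "latent": img, "motion": None}
             for v, img in blurred.items()}
    for _ in range(max_iter):
        for v, s in state.items():
            s["depth"] = estimate_depth(blurred[v], s)
            s["latent"] = estimate_latent(blurred[v], s)
            s["motion"] = estimate_motion(blurred[v], s)
        if all(converged(v, s) for v, s in state.items()):
            break
    return reconstruct_3d({v: (s["depth"], s["latent"]) for v, s in state.items()})

# Dummy callbacks just to exercise the loop structure
result = multiview_deblur(
    {"cam0": 1.0},
    estimate_depth=lambda img, s: 0.5,
    estimate_latent=lambda img, s: img,
    estimate_motion=lambda img, s: 0.0,
    converged=lambda v, s: True,
    reconstruct_3d=lambda maps: maps,
)
```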
Abstract:
A video deblurring method based on a layered blur model includes: when receiving a blurred video frame, estimating a latent image, an object motion, and a mask for each layer in each frame, using images composed of a combination of layers during the camera's exposure time; applying the estimated latent image, object motion, and mask for each layer in each frame to the layered blur model to generate a blurry frame; comparing the generated blurry frame with the received blurred video frame; and outputting a final latent image based on the estimated object motion and mask for each layer in each frame when the generated blurry frame and the received blurred video frame match. Accordingly, by modeling a blurred image as an overlap of images composed of foreground and background layers during exposure, more accurate deblurring results can be obtained at object boundaries.
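The forward direction of such a layered blur model can be sketched as follows: during the exposure, a masked foreground layer moves over the background, and the blurry frame is the average of the composites. The 1-D signals, integer per-step motion, and single foreground layer are simplifying assumptions for illustration.

```python
import numpy as np

def synthesize_blur(background, foreground, mask, motion, steps):
    """Forward layered blur model: at each exposure step the foreground
    layer (with its mask) shifts by `motion` samples and is composited
    over the background; the blurry frame is the average composite."""
    acc = np.zeros_like(background, dtype=float)
    for t in range(steps):
        shift = t * motion
        fg = np.roll(foreground, shift)
        m = np.roll(mask, shift)
        acc += m * fg + (1 - m) * background  # composite for this step
    return acc / steps

# Toy 1-D scene: a one-sample foreground moving right over a dark background
bg = np.zeros(6)
fg = np.ones(6)
mask = np.array([1, 0, 0, 0, 0, 0], dtype=float)
B = synthesize_blur(bg, fg, mask, motion=1, steps=3)
```

Deblurring then amounts to searching for latent layers, motions, and masks whose synthesized blurry frame matches the observed one, as the abstract describes.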