Abstract:
A server for providing a city street search service includes a street information database configured to store city street images; a feature selection unit configured to select at least one feature according to a predetermined criterion when a city street image for searching and two or more features for the image are received from a user terminal; a candidate extraction unit configured to extract a candidate list of city street images for registration; a feature matching unit configured to match the at least one selected feature against the city street images for registration included in the extracted candidate list; and a search result provision unit configured to provide the user terminal with a result of the matching as result information regarding the city street image for searching.
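The search flow described above can be sketched in miniature. The magnitude-based selection criterion, the distance threshold for candidate extraction, and the toy feature vectors are all illustrative assumptions, not the patented method:

```python
from math import dist

def select_features(features, k=2):
    """Select the k features with the largest magnitude (a stand-in
    for the abstract's 'predetermined criterion')."""
    return sorted(features, key=lambda f: sum(x * x for x in f), reverse=True)[:k]

def extract_candidates(database, query_features, radius=5.0):
    """Keep registered images having at least one feature near a query feature."""
    return [name for name, feats in database.items()
            if any(dist(q, f) < radius for q in query_features for f in feats)]

def match(database, candidates, query_features):
    """Score each candidate by its closest-feature distances (lower is better)."""
    scores = {name: sum(min(dist(q, f) for f in database[name])
                        for q in query_features)
              for name in candidates}
    return min(scores, key=scores.get) if scores else None

# Toy database of registered street images, each with 2-D feature points.
db = {
    "street_a": [(0.0, 0.0), (1.0, 1.0)],
    "street_b": [(10.0, 10.0), (11.0, 11.0)],
}
query = select_features([(0.2, 0.1), (9.8, 9.9), (0.0, 0.1)])
result = match(db, extract_candidates(db, query), query)
```

With these toy values, the two largest-magnitude features are kept and the candidate whose features lie closest to them wins.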
Abstract:
Disclosed are an apparatus and method for managing representative video images, which select representative images based on human visual aesthetic criteria and create an album by arranging the selected representative images in an album template with various layouts based on the region of interest (ROI).
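The selection-and-layout idea might be sketched as follows; the aesthetic scores, slot names, and ROI tuples here are hard-coded assumptions for illustration only:

```python
def select_representatives(frames, n):
    """Pick the n highest-scoring frames; each frame is (name, score, roi)."""
    return sorted(frames, key=lambda f: f[1], reverse=True)[:n]

def layout_album(template_slots, representatives):
    """Pair each template slot with a representative image and its ROI."""
    return [{"slot": slot, "image": name, "roi": roi}
            for slot, (name, _score, roi) in zip(template_slots, representatives)]

# Toy frames: (name, aesthetic score, ROI as (x, y, w, h)).
frames = [("f1", 0.42, (0, 0, 64, 64)),
          ("f2", 0.91, (8, 8, 56, 56)),
          ("f3", 0.67, (4, 4, 60, 60))]
album = layout_album(["cover", "page1"], select_representatives(frames, 2))
```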
Abstract:
Provided is an advertisement service system for providing additional information regarding an item that is an advertisement target. The system for providing additional information regarding an advertisement includes an original advertisement server configured to provide original advertisement content and identification information regarding the original advertisement content, and a processing server configured to generate processed advertisement content by inserting the identification information into the original advertisement content, extract feature information with respect to the original advertisement content, and store at least one of the identification information and the feature information, together with additional information with respect to the original advertisement content.
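A minimal sketch of that data flow, assuming string content, a hash-based "feature", and an in-memory store; all three are illustrative stand-ins for the system's components:

```python
import hashlib

def process_advertisement(content, ad_id, additional_info, store):
    """Insert the identifier into the content, extract a feature for the
    original content, and store both against the additional information."""
    processed = f"{content}[id:{ad_id}]"                     # identification info embedded
    feature = hashlib.sha256(content.encode()).hexdigest()   # feature of the original
    store[ad_id] = additional_info
    store[feature] = additional_info
    return processed

def lookup(key, store):
    """Retrieve additional information by identifier or by feature."""
    return store.get(key)

store = {}
processed = process_advertisement("spring sale banner", "ad-001",
                                  {"landing": "https://example.com"}, store)
```

Either key, the embedded identifier or the content-derived feature, then resolves to the stored additional information.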
Abstract:
An apparatus for generating text from an image may comprise: a memory configured to store at least one instruction; and a processor configured to execute the at least one instruction, wherein the processor is further configured to generate encoding information for an image based on the image and extract text information related to content of the image based on a degree of association with the encoding information.
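The extract-by-association idea can be shown with a toy example: encode the image into a vector, then score candidate text snippets by cosine similarity with that encoding and keep the most associated one. The hard-coded encoding and candidate embeddings are assumptions for illustration:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def extract_text(encoding, candidates):
    """Return the candidate text whose embedding is most associated
    with the image encoding."""
    return max(candidates, key=lambda c: cosine(encoding, candidates[c]))

image_encoding = (0.9, 0.1, 0.0)
candidates = {
    "a dog on a beach": (0.88, 0.15, 0.05),
    "a city at night": (0.05, 0.2, 0.95),
}
caption = extract_text(image_encoding, candidates)
```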
Abstract:
A method of generating a moving viewpoint motion picture, which is performed by a processor that executes at least one instruction stored in a memory, may comprise: obtaining an input image; generating a trimap from the input image; generating a depth map using the input image; generating a foreground mesh/texture map model based on a foreground alpha map obtained based on the trimap and foreground depth information obtained based on the trimap and the depth map; and generating a moving viewpoint motion picture based on the foreground mesh/texture map model.
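A schematic sketch of that pipeline with toy stand-ins for each stage: a threshold-based trimap, a brightness-derived depth map, and "frames" that are just viewpoint offsets applied to the foreground model. A real implementation would use matting and depth-estimation networks; everything below is an illustrative assumption:

```python
def make_trimap(image, lo=0.3, hi=0.7):
    """Label each pixel foreground (1.0), background (0.0), or unknown (0.5)."""
    return [[1.0 if p > hi else 0.0 if p < lo else 0.5 for p in row] for row in image]

def make_depth_map(image):
    """Placeholder depth estimate: brighter pixels are closer (smaller depth)."""
    return [[1.0 - p for p in row] for row in image]

def make_foreground_model(trimap, depth):
    """Collect foreground pixels (alpha > 0.5) with their depth values,
    a stand-in for the foreground mesh/texture map model."""
    return {(r, c): depth[r][c]
            for r, row in enumerate(trimap)
            for c, a in enumerate(row) if a > 0.5}

def render_motion_picture(model, viewpoints):
    """One 'frame' per viewpoint: shift foreground vertices by the offset."""
    return [{(r + dr, c + dc): d for (r, c), d in model.items()}
            for dr, dc in viewpoints]

image = [[0.9, 0.2], [0.5, 0.8]]
trimap = make_trimap(image)
model = make_foreground_model(trimap, make_depth_map(image))
frames = render_motion_picture(model, [(0, 0), (0, 1), (1, 0)])
```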
Abstract:
A data learning device, in a deep learning network characterized by a high image resolution and a thin channel at the input and output stages and a low image resolution and a thick channel in an intermediate deep layer, includes: a feature information extraction unit configured to extract global feature information, considering associations between all elements of the data, when generating an initial estimate in the deep layer; a direct channel-to-image conversion unit configured to generate expanded data having the same resolution as the final output from the generated initial estimate of the global feature information or from intermediate outputs sequentially generated in subsequent layers; and a comparison and learning unit configured to calculate the difference between the expanded data generated by the direct channel-to-image conversion unit and a prepared ground-truth value and to update network parameters such that the difference decreases.
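A toy one-dimensional numeric sketch of the three units: a "deep layer" estimate is expanded to output resolution by a direct channel-to-image conversion (nearest-neighbour repetition here), and the comparison unit nudges a single scale parameter to reduce the squared difference from the ground truth. The linear model and gradient-step update rule are illustrative assumptions:

```python
def channel_to_image(channels, out_len):
    """Expand a low-resolution channel vector to output resolution by
    nearest-neighbour repetition."""
    reps = out_len // len(channels)
    return [c for c in channels for _ in range(reps)]

def loss(expanded, truth):
    """Squared difference between expanded data and the ground truth."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(expanded, truth))

def train_scale(estimate, truth, steps=200, lr=0.01):
    """Learn one scale parameter w so that w * estimate, expanded to the
    output resolution, matches the ground truth."""
    w = 0.0
    base = channel_to_image(estimate, len(truth))  # expansion is linear in w
    for _ in range(steps):
        grad = sum((w * c - t) * c for c, t in zip(base, truth))
        w -= lr * grad                             # update so the difference decreases
    return w, loss([w * c for c in base], truth)

w, final_loss = train_scale([1.0, 2.0], [2.0, 2.0, 4.0, 4.0])
```

With this data the optimum is w = 2, and gradient descent converges to it geometrically.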
Abstract:
Disclosed are a method and apparatus for generating the learning data needed to learn animation characters on the basis of deep learning. The learning data generation method may include collecting various images from external sources using wired/wireless communication, acquiring character images from the collected images using a character detection module, clustering the acquired character images, selecting learning data from among the clustered images, and inputting the selected learning data to an artificial neural network for character recognition.
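Those generation steps can be sketched end to end. Here "collected" images are toy records, character detection is a filter, clustering groups by a quantised appearance feature, and selection takes a fixed number per cluster; every stage is a placeholder for the detection module and neural network in the abstract:

```python
def detect_characters(images):
    """Keep only images flagged as containing a character."""
    return [img for img in images if img["has_character"]]

def cluster(images, bins=10):
    """Group character images by a quantised appearance feature."""
    clusters = {}
    for img in images:
        clusters.setdefault(int(img["feature"] * bins), []).append(img)
    return clusters

def select_learning_data(clusters, per_cluster=1):
    """Take up to per_cluster samples from each cluster as training data,
    so near-duplicates within a cluster are not over-represented."""
    return [img for group in clusters.values() for img in group[:per_cluster]]

collected = [
    {"name": "a", "has_character": True,  "feature": 0.11},
    {"name": "b", "has_character": True,  "feature": 0.12},  # near-duplicate of "a"
    {"name": "c", "has_character": False, "feature": 0.50},  # no character: dropped
    {"name": "d", "has_character": True,  "feature": 0.83},
]
training_set = select_learning_data(cluster(detect_characters(collected)))
```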
Abstract:
Disclosed herein is a method of detecting a moving object including: predicting an optical flow in an input image clip using a first deep neural network which is trained to predict an optical flow in an image clip including a plurality of frames; obtaining an optical flow image which reflects a result of the optical flow prediction; and detecting a moving object in the image clip on the basis of the optical flow image using a second deep neural network trained using the first deep neural network.
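A toy stand-in for the two-stage method: the "first network" is replaced by frame differencing to produce an optical-flow-like image, and the "second network" by a threshold that marks moving pixels. Both substitutions are assumptions for illustration only:

```python
def predict_flow(clip):
    """Per-pixel absolute difference between consecutive frames,
    a crude proxy for an optical-flow prediction."""
    return [[[abs(a - b) for a, b in zip(r1, r2)]
             for r1, r2 in zip(prev, cur)]
            for prev, cur in zip(clip, clip[1:])]

def detect_moving(flow_images, threshold=0.5):
    """Mark pixels whose flow magnitude exceeds the threshold in any frame pair."""
    h, w = len(flow_images[0]), len(flow_images[0][0])
    return [[any(f[r][c] > threshold for f in flow_images) for c in range(w)]
            for r in range(h)]

clip = [
    [[0.0, 0.0], [0.0, 1.0]],   # frame 1: bright object at bottom-right
    [[0.0, 1.0], [0.0, 0.0]],   # frame 2: object moved up
]
mask = detect_moving(predict_flow(clip))
```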
Abstract:
A method for determining a video-related emotion and a method of generating data for learning video-related emotions include: separating an input video into a video stream and an audio stream; analyzing the audio stream to detect a music section; extracting at least one video clip matching the music section; extracting emotion information from the music section; tagging the video clip with the extracted emotion information and outputting the tagged video clip; learning video-related emotions by using the at least one video clip tagged with the emotion information to generate a video-related emotion classification model; and determining an emotion related to an input query video by using the video-related emotion classification model and providing the determined emotion.
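An end-to-end toy sketch of those steps: audio "energy" marks the music section, the matching video clip is tagged with an emotion derived from a fake tempo feature, and the "model" is a majority vote over tagged clips. All thresholds, features, and labels are illustrative assumptions:

```python
def detect_music_section(audio, energy_threshold=0.5):
    """Return (start, end) frame indices of the high-energy region."""
    idx = [i for i, e in enumerate(audio) if e > energy_threshold]
    return (idx[0], idx[-1] + 1) if idx else None

def tag_clip(video, section, tempo):
    """Cut the clip matching the music section and tag it with an emotion
    extracted from the music (here: a tempo threshold)."""
    start, end = section
    emotion = "excited" if tempo > 120 else "calm"
    return {"frames": video[start:end], "emotion": emotion}

def train_model(tagged_clips):
    """Trivial classifier: predict the most frequent tagged emotion."""
    labels = [c["emotion"] for c in tagged_clips]
    return max(set(labels), key=labels.count)

audio = [0.1, 0.9, 0.8, 0.2]            # music in the middle two frames
video = ["f0", "f1", "f2", "f3"]
clip = tag_clip(video, detect_music_section(audio), tempo=140)
model_prediction = train_model([clip])
```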