Abstract:
Systems and methods are disclosed for selecting target objects within digital images utilizing a multi-modal object selection neural network trained to accommodate multiple input modalities. In particular, in one or more embodiments, the disclosed systems and methods generate a trained neural network based on training digital images and training indicators corresponding to various input modalities. Moreover, one or more embodiments of the disclosed systems and methods utilize a trained neural network and iterative user inputs corresponding to different input modalities to select target objects in digital images. Specifically, the disclosed systems and methods can transform user inputs into distance maps that can be utilized in conjunction with color channels and a trained neural network to identify pixels that reflect the target object.
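The transformation of user inputs into distance maps stacked with color channels can be sketched as follows. This is a minimal illustration, not the patented implementation: the truncation value, click format, and channel ordering are assumptions.

```python
import numpy as np

def clicks_to_distance_map(clicks, height, width, truncate=255.0):
    """Build a truncated Euclidean distance map from user click points.

    Each pixel stores its distance to the nearest click, capped at
    `truncate`, so the map can be stacked with color channels as an
    extra input channel for a selection network.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    dist = np.full((height, width), truncate, dtype=np.float32)
    for (cy, cx) in clicks:
        d = np.sqrt((ys - cy) ** 2 + (xs - cx) ** 2)
        dist = np.minimum(dist, d)
    return dist

# Stack positive- and negative-click maps with a color image to form
# a 5-channel network input of the kind the abstract describes.
h, w = 64, 64
image = np.zeros((h, w, 3), dtype=np.float32)
pos_map = clicks_to_distance_map([(32, 32)], h, w)
neg_map = clicks_to_distance_map([(0, 0), (63, 63)], h, w)
net_input = np.dstack([image, pos_map, neg_map])  # shape (64, 64, 5)
```

On iterative use, each new click updates its distance map and the network is re-run on the refreshed input.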
Abstract:
Various embodiments of the present invention relate generally to systems and methods for analyzing and manipulating images and video. According to particular embodiments, the spatial relationship between multiple images and video is analyzed together with location information data, for purposes of creating a representation referred to herein as a surround view for presentation on a device. A visual guide can be provided for capturing the multiple images used in the surround view. The visual guide can be a synthetic object that is rendered in real-time into the images output to a display of an image capture device. The visual guide can help a user keep the image capture device moving along a desired trajectory.
Abstract:
Methods and systems are provided for generating mattes for input images. A neural network system can be trained where the training includes training a first neural network that generates mattes for input images where the input images are synthetic composite images. Such a neural network system can further be trained where the training includes training a second neural network that generates refined mattes from the mattes produced by the first neural network. Such a trained neural network system can be used to input an image and trimap pair for which the trained system will output a matte. Such a matte can be used to extract an object from the input image. Upon extracting the object, a user can manipulate the object, for example, to composite the object onto a new background.
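The final compositing step the abstract mentions follows the standard alpha-compositing equation C = αF + (1 − α)B. A minimal sketch, assuming the matte is a per-pixel alpha array in [0, 1]:

```python
import numpy as np

def composite_with_matte(foreground, background, matte):
    """Alpha-composite an extracted object onto a new background.

    `matte` holds per-pixel alpha in [0, 1]; the compositing equation
    C = alpha * F + (1 - alpha) * B blends the two images.
    """
    alpha = matte[..., np.newaxis]  # broadcast over color channels
    return alpha * foreground + (1.0 - alpha) * background

fg = np.ones((4, 4, 3))      # white object
bg = np.zeros((4, 4, 3))     # black background
matte = np.zeros((4, 4))
matte[1:3, 1:3] = 1.0        # object occupies the center
out = composite_with_matte(fg, bg, matte)
```

Fractional matte values at object boundaries produce the soft edges that make matting preferable to hard binary masks.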
Abstract:
Systems and methods for navigating a vehicle within an environment are provided. In one aspect, a method comprises: (a) selecting, with aid of a processor, a subset of a plurality of sensors to be used for navigating the vehicle within the environment based on one or more predetermined criteria, wherein the plurality of sensors are arranged on the vehicle such that each sensor of the plurality of sensors is configured to obtain sensor data from a different field of view; (b) processing, with aid of the processor, the sensor data from the selected sensor(s) so as to generate navigation information for navigating the vehicle within the environment; and (c) outputting, with aid of the processor, signals for controlling the vehicle based on the navigation information.
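One simple reading of the "predetermined criteria" in step (a) is to select sensors whose fields of view face the direction of travel. The sketch below is a hypothetical illustration of that idea; sensor representation and the angular threshold are assumptions, not the disclosed criteria.

```python
import math

def select_sensors(sensors, heading, max_angle=math.pi / 2):
    """Return the names of sensors whose field-of-view center lies
    within `max_angle` radians of the current heading."""
    chosen = []
    for name, fov_center in sensors:
        # Wrap the angular difference into [-pi, pi] before comparing.
        diff = abs((fov_center - heading + math.pi) % (2 * math.pi) - math.pi)
        if diff <= max_angle:
            chosen.append(name)
    return chosen

sensors = [("front", 0.0), ("left", math.pi / 2),
           ("rear", math.pi), ("right", -math.pi / 2)]
print(select_sensors(sensors, heading=0.0))  # front plus both side sensors
```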
Abstract:
A terminal device includes an acquisition unit that acquires image information of a read-out original, a display unit that displays the image information acquired by the acquisition unit, a detection unit that detects pitch information by performing frequency analysis of the image information displayed by the display unit, and a control unit that controls the image information so as to be enlarged or reduced in accordance with a size of the original and controls to display the pitch information so as to be superimposed on the image information which is enlarged or reduced.
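The frequency analysis used by the detection unit can be illustrated with a 1-D FFT over an intensity profile of the scanned original; the dominant spectral peak gives the pitch. A minimal sketch, assuming a single strong periodic component:

```python
import numpy as np

def detect_pitch(row_profile, sample_pitch=1.0):
    """Estimate the dominant spatial pitch (period in pixels) of a
    1-D intensity profile from its FFT magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(row_profile - np.mean(row_profile)))
    freqs = np.fft.rfftfreq(len(row_profile), d=sample_pitch)
    peak = np.argmax(spectrum[1:]) + 1  # skip the DC bin
    return 1.0 / freqs[peak]

# A synthetic profile with a 16-pixel repeating pattern.
x = np.arange(256)
profile = np.sin(2 * np.pi * x / 16)
print(detect_pitch(profile))  # ≈ 16.0
```

Scaling the pitch value by the same factor used to enlarge or reduce the image keeps the superimposed pitch display consistent with the original's size.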
Abstract:
Disclosed is a computer implemented method for identifying an object in a plurality of images. The method may include a step of receiving, through an input device, a delineation of the object in at least one image of the plurality of images. Further, the method may include a step of identifying, using the processor, an image region corresponding to the object in the at least one image based on the delineation. Furthermore, the method may include a step of tracking, using the processor, the image region across the plurality of images.
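The tracking step can be sketched with exhaustive sum-of-squared-differences template matching, a deliberately minimal stand-in for whatever tracker the method actually employs:

```python
import numpy as np

def track_region(frames, region, start):
    """Track a delineated image region across frames by exhaustive
    sum-of-squared-differences (SSD) template matching."""
    template = region.astype(np.float32)
    th, tw = template.shape
    positions = [start]
    for frame in frames:
        f = frame.astype(np.float32)
        best, best_pos = None, start
        for y in range(f.shape[0] - th + 1):
            for x in range(f.shape[1] - tw + 1):
                ssd = np.sum((f[y:y+th, x:x+tw] - template) ** 2)
                if best is None or ssd < best:
                    best, best_pos = ssd, (y, x)
        positions.append(best_pos)
    return positions

# A bright 2x2 patch at (3, 4) is recovered in a single frame.
frame = np.zeros((8, 8))
frame[3:5, 4:6] = 1.0
positions = track_region([frame], frame[3:5, 4:6].copy(), (0, 0))
```

Production trackers replace the brute-force search with correlation filters or learned features, but the interface (region in, per-frame positions out) is the same.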
Abstract:
A computer implemented method and apparatus for identifying a desired object of an image by using a suggestive marking. The method comprises receiving a first marking to an image, the first marking suggesting a desired object of the image and the desired object being defined by a boundary; generating, based on the first marking, a plurality of output images, wherein each image of the plurality of output images indicates a computer-identified object; and displaying the plurality of output images.
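Generating several candidate objects from one suggestive marking can be illustrated by thresholding color similarity to the marked pixel at a few tolerances, each threshold yielding one candidate mask. This is a hypothetical simplification; the claimed method does not specify how the candidates are computed.

```python
import numpy as np

def candidate_objects(image, seed, tolerances=(0.1, 0.2, 0.3)):
    """From a suggestive marking (seed pixel), produce candidate
    object masks by thresholding each pixel's intensity difference
    from the seed at several tolerances; each mask corresponds to
    one output image the user can choose from."""
    return [np.abs(image - image[seed]) <= tol for tol in tolerances]

img = np.linspace(0.0, 1.0, 25).reshape(5, 5)
masks = candidate_objects(img, (0, 0))
```

Looser tolerances grow the candidate region, so the user picks whichever output image best matches the intended object boundary.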
Abstract:
Various embodiments of methods and apparatus for feature point localization are disclosed. A profile model and a shape model may be applied to an object in an image to determine locations of feature points for each object component. Input may be received to move one of the feature points to a fixed location. Other ones of the feature points may be automatically adjusted to different locations based on the moved feature point.
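The automatic adjustment of the remaining feature points after one is pinned can be sketched as distance-weighted propagation of the user's displacement. This is a simple stand-in for re-fitting the shape model under the constraint; the Gaussian weighting and `sigma` value are assumptions.

```python
import numpy as np

def adjust_points(points, moved_idx, new_loc, sigma=10.0):
    """After a user pins one feature point to a fixed location,
    propagate a distance-weighted share of that displacement to
    the remaining points."""
    pts = np.asarray(points, dtype=np.float32)
    delta = np.asarray(new_loc, dtype=np.float32) - pts[moved_idx]
    d = np.linalg.norm(pts - pts[moved_idx], axis=1)
    w = np.exp(-(d ** 2) / (2 * sigma ** 2))  # nearby points move more
    adjusted = pts + w[:, np.newaxis] * delta
    adjusted[moved_idx] = new_loc  # the pinned point stays fixed
    return adjusted
```

In an actual shape-model fit, the moved point would act as a hard constraint while the model's learned deformation modes determine how the others shift; the Gaussian falloff merely mimics that locality.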