Abstract:
Embodiments of the present disclosure relate to image processing. In at least one embodiment, a video processing device comprises a processor and one or more computer-readable media having computer-executable instructions embodied thereon. When executed by the processor, the computer-executable instructions cause the processor to instantiate a plurality of components that comprise a segmenter and an object analyzer. The segmenter is configured to (1) receive video information, the video information comprising a plurality of video frames, and (2) segment at least one video frame to generate segmentation information. The object analyzer is configured to (1) receive, from the segmenter, the segmentation information, and (2) identify, based on the segmentation information, the presence of at least one object in the video information, wherein the object analyzer identifies the presence of the at least one object by performing a max-flow/min-cut clustering technique.
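The following is a minimal sketch of how a segmenter and an object analyzer of this kind might fit together, using a max-flow/min-cut partition over coarse segments. The class names, the thresholding-based segmentation, and the terminal-link seeding are illustrative assumptions, not the device described in the abstract; only standard NumPy, OpenCV, and networkx calls are used.

```python
import cv2
import numpy as np
import networkx as nx

class Segmenter:
    def segment(self, frame):
        # Coarse segmentation: threshold the grayscale frame and label
        # connected components as candidate segments (illustrative choice).
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        num_labels, labels = cv2.connectedComponents(binary)
        return num_labels, labels

class ObjectAnalyzer:
    def identify_objects(self, frame, num_labels, labels):
        # Build a graph whose nodes are segments; edge capacities reflect
        # appearance similarity. A max-flow/min-cut then clusters the
        # segments into an "object" side and a "background" side.
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        means = [float(gray[labels == i].mean()) for i in range(num_labels)]
        g = nx.DiGraph()
        src, sink = "obj", "bg"
        brightest = int(np.argmax(means))
        darkest = int(np.argmin(means))
        for i in range(num_labels):
            # Terminal links: seed the brightest segment toward "object"
            # and the darkest toward "background" (hypothetical seeds).
            g.add_edge(src, i, capacity=100.0 if i == brightest else 1.0)
            g.add_edge(i, sink, capacity=100.0 if i == darkest else 1.0)
            for j in range(i + 1, num_labels):
                w = 1.0 / (1.0 + abs(means[i] - means[j]))  # similarity weight
                g.add_edge(i, j, capacity=w)
                g.add_edge(j, i, capacity=w)
        _, (object_side, _) = nx.minimum_cut(g, src, sink)
        return object_side - {src}  # segment labels clustered with "object"
```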
Abstract:
Techniques are provided for fusion of image frames to generate panoramic background images using color and depth data provided from a 3D camera. An example system may include a partitioning circuit configured to partition an image frame into segments and objects, each segment comprising a group of pixels sharing common features associated with the color and depth data, and each object comprising one or more related segments. The system may also include an object consistency circuit configured to assign either 2D or 3D transformation types to each of the segments and objects to transform them to a coordinate system of a reference image frame. The system may further include a segment recombination circuit to combine the transformed objects and segments into a transformed image frame, and an integration circuit to integrate the transformed image frame with the reference image frame to generate the panoramic image.
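A hedged sketch of the partition / transform-type assignment / integration flow appears below. It assumes RGB-D input as NumPy arrays (an 8-bit color image plus a float depth map); the k-means partitioning, the depth-variance rule for choosing a 2D versus 3D transform, and the alpha-blend integration are stand-ins chosen for brevity, not the circuits claimed in the abstract, and the warping step itself is omitted.

```python
import cv2
import numpy as np

def partition(color, depth, n_segments=64):
    # Cluster pixels on joint (color, depth) features; each cluster is one segment.
    h, w = depth.shape
    feats = np.concatenate(
        [color.reshape(-1, 3).astype(np.float32),
         depth.reshape(-1, 1).astype(np.float32)], axis=1)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _, labels, _ = cv2.kmeans(feats, n_segments, None, criteria,
                              1, cv2.KMEANS_RANDOM_CENTERS)
    return labels.reshape(h, w)

def choose_transform(depth, segment_mask, depth_var_threshold=0.01):
    # Nearly planar segments (low depth variance) can be aligned with a 2D
    # homography; segments with strong depth relief need a 3D transform.
    return "2D" if np.var(depth[segment_mask]) < depth_var_threshold else "3D"

def integrate(panorama, warped_frame, weight=0.5):
    # Simple alpha blend of the transformed frame into the reference panorama.
    return cv2.addWeighted(panorama, 1.0 - weight, warped_frame, weight, 0)
```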
Abstract:
Generally described, aspects of the present disclosure relate to generation of an image representing a panned shot of an object by an image capture device. In one embodiment, a panned shot may be performed on a series of images of a scene. The series of images may include at least one subject object moving within the scene. Motion data of the subject object may be captured by comparing the subject object in a second image of the series of images to the subject object in a first image of the series of images. A background image is generated by implementing a blur process using the first image and the second image based on the motion data. A final image is generated by including the image of the subject object in the background image.
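As an illustration only, the sketch below strings together the three steps named above: estimate the subject's motion between two frames, blur the background along that motion, and composite the sharp subject back in. It assumes the subject bounding box and a binary subject mask are already available; the template-matching motion estimate and the one-directional streak kernel are simplifications, not the claimed method.

```python
import cv2
import numpy as np

def estimate_motion(first, second, subject_box):
    # Track the subject between frames with template matching; the offset of
    # the best match approximates the subject's motion vector (dx, dy).
    x, y, w, h = subject_box
    template = first[y:y + h, x:x + w]
    result = cv2.matchTemplate(second, template, cv2.TM_CCOEFF_NORMED)
    _, _, _, (bx, by) = cv2.minMaxLoc(result)
    return bx - x, by - y

def panned_shot(first, second, subject_box, subject_mask):
    dx, dy = estimate_motion(first, second, subject_box)
    # Directional blur along the motion magnitude simulates camera panning;
    # a horizontal streak is used here for brevity (rotate the kernel for dy).
    length = max(int(np.hypot(dx, dy)), 3)
    kernel = np.zeros((length, length), np.float32)
    kernel[length // 2, :] = 1.0 / length
    background = cv2.filter2D(second, -1, kernel)
    # Composite the sharp subject from the second image over the blurred background.
    return np.where(subject_mask[..., None] > 0, second, background)
```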
Abstract:
The present disclosure relates to moving object detection in videos. In one embodiment, a plurality of frames in a video are transformed to a high-dimensional image space in a non-linear way. Then the background of the plurality of frames can be modeled in the high-dimensional image space. The foreground, or moving object, can be detected in the plurality of frames based on the modeling of the background in the high-dimensional image space. By use of the non-linear model, which is more powerful for describing complex factors such as a changing background, illumination variation, camera motion, noise, and the like, embodiments of the present invention are more robust and accurate in detecting moving objects under these complex situations.
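One way to picture a non-linear, high-dimensional background model is sketched below: each pixel intensity is lifted through a random non-linear feature map (in the spirit of random Fourier features for an RBF kernel), the background is modeled as the per-pixel mean in that lifted space, and pixels whose lifted features deviate from the model are flagged as foreground. The feature map, parameter values, and class name are assumptions for illustration and do not reproduce the specific model of the disclosure.

```python
import numpy as np

class NonlinearBackgroundModel:
    def __init__(self, n_features=32, gamma=0.05, seed=0):
        rng = np.random.default_rng(seed)
        # Random projection defining the non-linear lift phi(x) = cos(w * x + b).
        self.w = rng.normal(scale=np.sqrt(2 * gamma), size=n_features)
        self.b = rng.uniform(0, 2 * np.pi, size=n_features)
        self.mean = None

    def _lift(self, frame_gray):
        # frame_gray: (H, W) float array -> (H, W, n_features) feature map.
        return np.cos(frame_gray[..., None] * self.w + self.b)

    def fit(self, frames):
        # Model the background as the mean feature vector over training frames.
        feats = np.stack([self._lift(f) for f in frames])
        self.mean = feats.mean(axis=0)

    def detect(self, frame_gray, threshold=0.5):
        # Foreground = pixels whose lifted features deviate from the model.
        dist = np.linalg.norm(self._lift(frame_gray) - self.mean, axis=-1)
        return dist > threshold
```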
Abstract:
A method of generating a temporal saliency map is disclosed. In a particular embodiment, the method includes receiving an object bounding box from an object tracker. The method includes cropping a video frame based at least in part on the object bounding box to generate a cropped image. The method further includes performing spatial dual segmentation on the cropped image to generate an initial mask and performing temporal mask refinement on the initial mask to generate a refined mask. The method also includes generating a temporal saliency map based at least in part on the refined mask.
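The steps named in this abstract (crop from a tracker box, spatial dual segmentation, temporal mask refinement, saliency-map generation) could be sketched as below. The two segmentation cues (Otsu thresholding and mask-initialized GrabCut), the running-average refinement, and the Gaussian smoothing are illustrative substitutes chosen because they are standard OpenCV operations; they are not the particular techniques claimed here. An 8-bit BGR crop is assumed.

```python
import cv2
import numpy as np

def crop_frame(frame, box):
    x, y, w, h = box                      # bounding box from an object tracker
    return frame[y:y + h, x:x + w]

def spatial_dual_segmentation(cropped):
    # First cue: Otsu threshold on intensity.
    gray = cv2.cvtColor(cropped, cv2.COLOR_BGR2GRAY)
    _, otsu = cv2.threshold(gray, 0, 1, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Second cue: GrabCut, seeded with the crop border as definite background.
    mask = np.full(gray.shape, cv2.GC_PR_FGD, np.uint8)
    mask[0, :] = mask[-1, :] = mask[:, 0] = mask[:, -1] = cv2.GC_BGD
    bgd, fgd = np.zeros((1, 65), np.float64), np.zeros((1, 65), np.float64)
    cv2.grabCut(cropped, mask, None, bgd, fgd, 3, cv2.GC_INIT_WITH_MASK)
    grabcut = np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD)).astype(np.uint8)
    return otsu & grabcut                  # initial mask: agreement of both cues

def temporal_refinement(initial_mask, previous_mask, alpha=0.7):
    # Blend with the previous frame's mask to suppress frame-to-frame flicker.
    blended = alpha * initial_mask + (1 - alpha) * previous_mask
    return (blended > 0.5).astype(np.uint8)

def temporal_saliency_map(refined_mask):
    # Smooth the binary mask into a soft per-pixel saliency map in [0, 1].
    return cv2.GaussianBlur(refined_mask.astype(np.float32), (15, 15), 0)
```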