Abstract:
Systems and techniques are provided for generating three-dimensional (3D) models of objects. For example, a process can include generating a 3D model of a first portion of an object based on one or more frames depicting the object. The process can also include generating a mask for the one or more frames, the mask including an indication of one or more regions of the object. The process can further include generating a 3D base model based on the 3D model of the first portion of the object and the mask, the 3D base model representing the first portion of the object and a second portion of the object. The process can include generating, based on the mask and the 3D base model, a 3D model of the second portion of the object.
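A minimal sketch of one way the completion step could look, assuming a point-cloud representation and a symmetry-based completion strategy (both assumptions, not the claimed method); the names `complete_object_model`, `partial_points`, and `region_mask` are hypothetical:

```python
# Illustrative only: completes a partial 3D point cloud by mirroring it across an
# assumed symmetry plane, guided by a per-point region mask.
import numpy as np

def complete_object_model(partial_points: np.ndarray,
                          region_mask: np.ndarray,
                          symmetry_axis: int = 0) -> np.ndarray:
    """partial_points: (N, 3) points for the observed first portion.
    region_mask: (N,) boolean flags marking points in already-observed regions.
    Returns a combined (first + second portion) point cloud."""
    # Reflect the observed portion across the object's estimated symmetry plane
    # to hypothesize the unobserved second portion.
    centroid = partial_points.mean(axis=0)
    mirrored = partial_points.copy()
    mirrored[:, symmetry_axis] = 2 * centroid[symmetry_axis] - mirrored[:, symmetry_axis]
    # Drop mirrored points that fall inside the masked (already observed) regions.
    mirrored = mirrored[~region_mask]
    return np.vstack([partial_points, mirrored])

if __name__ == "__main__":
    pts = np.random.rand(100, 3)          # stand-in for points recovered from frames
    mask = np.zeros(100, dtype=bool)      # stand-in for the region mask
    print(complete_object_model(pts, mask).shape)
```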
Abstract:
Systems and techniques are provided for facial image augmentation. An example method can include obtaining a first image capturing a face. The method can determine, using a prediction model and the first image, a UV face position map including a two-dimensional (2D) representation of a three-dimensional (3D) structure of the face. The method can generate, based on the UV face position map, a 3D model of the face. The method can generate an extended 3D model of the face by extending the 3D model to include one or more regions beyond a boundary of the 3D model. The one or more regions can include a forehead region, a region surrounding at least a portion of the face, and/or another region. The method can generate, based on the extended 3D model, a second image depicting the face in a rotated position relative to a position of the face in the first image.
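The following sketch illustrates the general flow under simplifying assumptions: the UV face position map is treated as an (H, W, 3) array of 3D vertex coordinates, the forehead extension is a simple linear extrapolation of the top rows, and the rotated view is an orthographic projection. All function and parameter names are hypothetical:

```python
# Not the disclosed system: extend a UV position map upward to cover an assumed
# forehead region, then project a yaw-rotated view of the extended model.
import numpy as np

def extend_and_rotate(uv_position_map: np.ndarray, yaw_deg: float = 20.0,
                      extra_rows: int = 16) -> np.ndarray:
    h, w, _ = uv_position_map.shape
    # Extrapolate the top rows of the map to approximate the forehead region
    # beyond the original model boundary (simple linear extension; an assumption).
    top_delta = uv_position_map[0] - uv_position_map[1]
    extension = np.stack([uv_position_map[0] + (i + 1) * top_delta
                          for i in range(extra_rows)][::-1])
    extended = np.concatenate([extension, uv_position_map], axis=0)

    # Rotate all vertices about the vertical axis and project orthographically.
    theta = np.deg2rad(yaw_deg)
    rot = np.array([[np.cos(theta), 0, np.sin(theta)],
                    [0, 1, 0],
                    [-np.sin(theta), 0, np.cos(theta)]])
    vertices = extended.reshape(-1, 3) @ rot.T
    return vertices[:, :2]  # 2D positions for rendering the rotated face

if __name__ == "__main__":
    pos_map = np.random.rand(256, 256, 3)  # stand-in for a predicted UV position map
    print(extend_and_rotate(pos_map).shape)
```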
Abstract:
Techniques and systems are provided for tracking objects in one or more video frames. For example, a first set of one or more bounding regions is determined for a video frame based on a trained classification network applied to the video frame. The first set of one or more bounding regions is associated with one or more objects in the video frame. One or more blobs can be detected for the video frame. A blob includes pixels of at least a portion of an object in the video frame. A second set of one or more bounding regions, associated with the one or more blobs, is determined for the video frame. A final set of one or more bounding regions is determined for the video frame using the first set of one or more bounding regions and the second set of one or more bounding regions. Object tracking can then be performed for the video frame using the final set of one or more bounding regions.
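As a hedged illustration of the merging step, the sketch below combines detector bounding boxes with blob bounding boxes using IoU matching; the 0.5 threshold and the keep-detector-boxes-first policy are assumptions, not the claimed procedure:

```python
# Merge detector boxes and blob boxes into a final set for tracking (illustrative).
def iou(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

def merge_boxes(detector_boxes, blob_boxes, iou_thresh=0.5):
    """Keep detector boxes, and add any blob box not explained by a detector box."""
    final = list(detector_boxes)
    for blob_box in blob_boxes:
        if all(iou(blob_box, det) < iou_thresh for det in detector_boxes):
            final.append(blob_box)
    return final

if __name__ == "__main__":
    det = [(10, 10, 50, 50)]
    blobs = [(12, 11, 48, 52), (100, 100, 140, 160)]
    print(merge_boxes(det, blobs))  # detector box plus the unmatched blob box
```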
Abstract:
Techniques and systems are provided for processing video data. For example, techniques and systems are provided for determining blob size thresholds. Blob sizes of blobs generated for a video frame can be determined. A lower boundary of a category of blob sizes, corresponding to a minimum blob size for the video frame, can then be determined. The lower boundary is determined from a plurality of possible blob sizes, including the blob sizes of the blobs and one or more other possible blob sizes. One of the possible blob sizes is determined to be the lower boundary when one or more lower boundary conditions are met by characteristics of that possible blob size. A blob size threshold for the video frame is then assigned as the minimum blob size corresponding to the lower boundary.
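One way such a lower boundary could be selected is sketched below; the specific condition used here (the largest candidate size that still covers most of the blobs) is illustrative only, and the names and the 0.8 support value are assumptions rather than the claimed conditions:

```python
# Illustrative selection of a minimum-blob-size threshold from candidate sizes.
def blob_size_threshold(blob_sizes, candidates=None, min_support=0.8):
    candidates = sorted(candidates if candidates is not None else blob_sizes)
    total = len(blob_sizes)
    threshold = candidates[0]
    for candidate in candidates:
        covered = sum(1 for s in blob_sizes if s >= candidate)
        # Accept the candidate as the lower boundary while it still covers
        # most of the observed blob sizes.
        if covered / total >= min_support:
            threshold = candidate
        else:
            break
    return threshold

if __name__ == "__main__":
    sizes = [12, 15, 18, 240, 260, 300, 310, 315, 320, 330]
    print(blob_size_threshold(sizes))
```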
Abstract:
Techniques and systems are provided for processing video data. For example, techniques and systems are provided for performing content-adaptive morphology operations. A first erosion function can be performed on a foreground mask of a video frame, including setting one or more foreground pixels of the frame to one or more background pixels. A temporary foreground mask can be generated based on the first erosion function being performed on the foreground mask. One or more connected components can be generated for the frame by performing connected component analysis to connect one or more neighboring foreground pixels. A complexity of the frame (or of the foreground mask of the frame) can be determined by comparing a number of the one or more connected components to a threshold number. A second erosion function can be performed on the temporary foreground mask when the number of the one or more connected components is higher than the threshold number. The one or more connected components can be output for blob processing when the number of the one or more connected components is lower than the threshold number.
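A minimal sketch of this flow, assuming OpenCV, a binary 0/255 foreground mask, a 3x3 kernel, and an illustrative component-count threshold (all assumptions):

```python
# Content-adaptive erosion: erode once, count connected components, and erode
# again only if the frame is "complex" (many components).
import cv2
import numpy as np

def content_adaptive_erosion(foreground_mask: np.ndarray,
                             component_threshold: int = 50) -> np.ndarray:
    kernel = np.ones((3, 3), np.uint8)
    # First erosion: turns isolated foreground pixels into background pixels.
    temp_mask = cv2.erode(foreground_mask, kernel)
    # Connect neighboring foreground pixels into components.
    num_labels, _ = cv2.connectedComponents(temp_mask)
    num_components = num_labels - 1  # label 0 is the background
    if num_components > component_threshold:
        # Complex frame: apply a second erosion before blob processing.
        return cv2.erode(temp_mask, kernel)
    # Simple frame: output the connected components for blob processing as-is.
    return temp_mask

if __name__ == "__main__":
    mask = (np.random.rand(240, 320) > 0.7).astype(np.uint8) * 255
    print(content_adaptive_erosion(mask).shape)
```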
Abstract:
Provided are systems, methods, and computer-readable media for including, with a 360-degree video, parameters that describe the fisheye images in the 360-degree video. The 360-degree video can then be stored and/or transmitted as captured by an omnidirectional camera, without transforming the fisheye images into some other format. The parameters can later be used to map the fisheye images to an intermediate format, such as an equirectangular format. The intermediate format can be used to store, transmit, and/or display the 360-degree video. The parameters can alternatively or additionally be used to map the fisheye images directly to a format that can be displayed in a 360-degree video presentation, such as a spherical format.
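For illustration, the sketch below maps a single equirectangular output pixel back into a fisheye image using the kind of parameters described (image center, circle radius, field of view), assuming an equidistant fisheye projection; the real parameter set and projection model may differ:

```python
# Map an equirectangular pixel (u, v) to fisheye image coordinates (illustrative).
import math

def equirect_to_fisheye(u, v, out_w, out_h, cx, cy, radius, fov_deg=180.0):
    # Equirectangular pixel -> spherical direction.
    lon = (u / out_w) * 2 * math.pi - math.pi        # longitude in [-pi, pi]
    lat = math.pi / 2 - (v / out_h) * math.pi        # latitude in [pi/2, -pi/2]
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    # Spherical direction -> fisheye coordinates (equidistant model assumed).
    theta = math.acos(max(-1.0, min(1.0, z)))        # angle from the optical axis
    if theta > math.radians(fov_deg) / 2:
        return None                                  # outside the fisheye circle
    phi = math.atan2(y, x)
    r = radius * theta / (math.radians(fov_deg) / 2)
    return cx + r * math.cos(phi), cy + r * math.sin(phi)

if __name__ == "__main__":
    print(equirect_to_fisheye(960, 480, 1920, 960, cx=640, cy=640, radius=600))
```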
Abstract:
Methods and apparatus for capturing an image using automatic focus are disclosed herein. In one aspect, a method is disclosed that includes communicating, using a camera, with a wireless device via a wireless communication network. The method further includes determining a distance between the camera and the wireless device using the wireless communication network and adjusting a focus of the camera based upon the determined distance. Finally, the method includes capturing an image using the adjusted focus of the camera. In some aspects, the method may be performed by a smartphone or digital camera that includes Wi-Fi capabilities.
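As one hedged realization (the abstract only states that the wireless network is used for ranging), the sketch below estimates distance from Wi-Fi RSSI with a log-distance path-loss model and derives a thin-lens focus position; all constants and function names are illustrative:

```python
# Distance-based autofocus sketch: RSSI -> distance -> lens position.
import math

def distance_from_rssi(rssi_dbm: float, tx_power_dbm: float = -40.0,
                       path_loss_exponent: float = 2.0) -> float:
    """Distance in meters from a measured RSSI (log-distance path-loss model)."""
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * path_loss_exponent))

def focus_position(distance_m: float, focal_length_mm: float = 4.0) -> float:
    """Thin-lens estimate of the lens-to-sensor distance (mm) for the subject distance."""
    d_mm = distance_m * 1000.0
    return (focal_length_mm * d_mm) / (d_mm - focal_length_mm)

if __name__ == "__main__":
    d = distance_from_rssi(-60.0)
    print(f"estimated distance: {d:.2f} m, focus position: {focus_position(d):.4f} mm")
```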
Abstract:
The disclosed technology relates to image-capturing methods. In one aspect, a method includes receiving an image frame comprising a plurality of pixels and subtracting foreground pixels from the image frame to obtain background pixels. The method additionally includes determining an exposure condition for a next image frame based on at least a subset of the background pixels. The method further includes adjusting the foreground pixels such that a difference between a background luma value and a foreground luma value of the next image frame is within a predetermined range. Aspects are also directed to apparatuses configured to perform the methods.
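A minimal sketch of the described flow, assuming a grayscale luma frame, a boolean foreground mask from a separate detector, and illustrative target and clamp values (all assumptions):

```python
# Background-driven exposure plus foreground luma adjustment (illustrative).
import numpy as np

def exposure_and_foreground_adjustment(luma: np.ndarray, fg_mask: np.ndarray,
                                       target_luma: float = 110.0,
                                       max_diff: float = 30.0):
    # Subtract foreground pixels: exposure for the next frame uses background only.
    bg_luma = float(luma[~fg_mask].mean())
    exposure_gain = target_luma / max(bg_luma, 1.0)

    # Adjust foreground pixels so the foreground/background luma gap stays in range.
    fg_luma = float(luma[fg_mask].mean()) if fg_mask.any() else bg_luma
    diff = fg_luma - bg_luma
    clamped_diff = np.clip(diff, -max_diff, max_diff)
    adjusted = luma.astype(np.float32).copy()
    adjusted[fg_mask] += (clamped_diff - diff)
    return exposure_gain, np.clip(adjusted, 0, 255).astype(np.uint8)

if __name__ == "__main__":
    frame = np.full((120, 160), 90, np.uint8)
    mask = np.zeros_like(frame, bool)
    mask[40:80, 60:100] = True
    frame[mask] = 200
    gain, out = exposure_and_foreground_adjustment(frame, mask)
    print(round(gain, 2), int(out[mask].mean()))
```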
Abstract:
Methods, systems, computer-readable media, and apparatuses for image-based status determination are presented. In some embodiments, a method includes capturing at least one image of a moving path. At least one feature within the at least one image is analyzed, and based on the analysis of the at least one feature, a direction of movement of the moving path is determined. In some embodiments, a method includes capturing an image of an inclined path. At least one feature within the image is analyzed, and based on the analysis of the at least one feature, a determination is made whether the image was captured from a top position relative to the inclined path or a bottom position relative to the inclined path.
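One plausible feature analysis (an assumption, since the abstract only requires analyzing at least one feature) is the inter-frame motion of the path, sketched below with OpenCV dense optical flow over two captures; the function name and ROI handling are hypothetical:

```python
# Infer a moving path's travel direction from dense optical flow (illustrative).
import cv2
import numpy as np

def path_direction(frame_a: np.ndarray, frame_b: np.ndarray, roi=None) -> str:
    """Returns 'up' or 'down' based on average vertical motion inside the path ROI."""
    gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(gray_a, gray_b, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    if roi is not None:
        x, y, w, h = roi
        flow = flow[y:y + h, x:x + w]
    mean_dy = float(flow[..., 1].mean())  # image y grows downward
    return "down" if mean_dy > 0 else "up"
```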
Abstract:
Embodiments include methods and systems that determine pixel displacement between frames based on a respective weighting value for each pixel or group of pixels. The weighting values provide an indication of which pixels are more pertinent to optical flow computations. Computational resources and effort can be focused on pixels with higher weights, which are generally more pertinent to optical flow determinations.
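The sketch below illustrates the idea under assumptions: the per-pixel weighting value is taken to be image gradient magnitude, and sparse pyramidal Lucas-Kanade flow is computed only for the top-weighted pixels; the specific weighting and flow method are not the claimed ones:

```python
# Weighted optical flow sketch: spend flow computation only on high-weight pixels.
import cv2
import numpy as np

def weighted_optical_flow(prev_gray: np.ndarray, next_gray: np.ndarray,
                          top_fraction: float = 0.01):
    # Per-pixel weighting value: image gradient magnitude (an assumption).
    gx = cv2.Sobel(prev_gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(prev_gray, cv2.CV_32F, 0, 1)
    weights = cv2.magnitude(gx, gy)

    # Keep only the top-weighted pixels as points to track.
    count = max(1, int(weights.size * top_fraction))
    flat_idx = np.argsort(weights.ravel())[-count:]
    ys, xs = np.unravel_index(flat_idx, weights.shape)
    points = np.stack([xs, ys], axis=1).astype(np.float32).reshape(-1, 1, 2)

    # Pyramidal Lucas-Kanade flow for the selected high-weight pixels only.
    new_points, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, points, None)
    displacement = (new_points - points).reshape(-1, 2)
    return points.reshape(-1, 2), displacement, status.ravel().astype(bool)
```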