Abstract:
A method of stabilizing a video in real time using a single pass includes: receiving consecutive video frames, where the consecutive video frames include a current video frame and previous video frames; storing the consecutive video frames in a buffer; estimating a global motion for the current video frame by describing a camera's relative motion between the current video frame and one of the previous video frames adjacent to the current video frame; estimating a long-term camera motion for the current video frame by determining a geometric mean of an accumulation of the estimate of the global motion for the current video frame and an estimate of global motion for each of the previous video frames; and displaying the current video frame, stabilized based on the estimate of the long-term camera motion, on a display of an electronic device.
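The geometric-mean smoothing step can be illustrated with a toy sketch. This is not the patented implementation: it reduces each frame's global motion to a single multiplicative factor (a stand-in for a full 2-D camera transform), accumulates the per-frame estimates over the buffer, and takes their geometric mean as the long-term motion. The function names and the scalar motion model are assumptions for illustration.

```python
def long_term_motion(frame_motions):
    """Geometric mean of the accumulated per-frame global motions.

    frame_motions: per-frame multiplicative motion estimates (toy 1-D
    factors standing in for full 2-D camera transforms).
    """
    accumulated = 1.0
    for m in frame_motions:
        accumulated *= m                  # accumulate global motion over the buffer
    n = len(frame_motions)
    return accumulated ** (1.0 / n)       # geometric mean = smoothed long-term motion

def stabilize(current_motion, frame_motions):
    """Compensate the current frame's motion by the smoothed long-term motion."""
    return current_motion / long_term_motion(frame_motions)
```

Because the geometric mean of the accumulated motions changes slowly as new frames arrive, dividing it out removes high-frequency jitter while preserving intentional camera movement, which is what allows the method to run in a single pass.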
Abstract:
Methods and devices permit a user to insert multiple virtual objects into a real world video scene. Some inserted objects may be statically tied to the scene, while other objects are designated as moving with certain moving objects in the scene. Markers are not used to insert the virtual objects. Users of separate mobile devices can share their inserted virtual objects to create a multi-user, multi-object augmented reality (AR) experience.
Abstract:
Provided are an apparatus and a method of multi-stage image recognition. For the multi-stage image recognition, categorized object data is received from a first deep neural network. When a second deep neural network produces invalid subcategorized object data from the categorized object data, the second deep neural network is trained on subcategory customization data that relates to a non-ideal environment, and an image recognition result is generated using the second deep neural network as trained.
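The control flow of the second stage can be sketched as follows. This is a hedged toy model, not the patented networks: the "second deep neural network" is stood in for by a lookup table, "invalid" output is modeled as low confidence, and "training on subcategory customization data" is modeled as a table update. All names and the confidence threshold are illustrative assumptions.

```python
def second_stage(category, features, model):
    """Toy stand-in for the second deep neural network: maps (category,
    features) to a subcategory and a confidence score."""
    return model.get((category, features), (None, 0.0))

def multi_stage_recognize(category, features, model, customization_data,
                          threshold=0.5):
    """If the second stage's subcategorized output is invalid (missing or
    low-confidence), 'train' on the subcategory customization data (here:
    a table update standing in for fine-tuning), then recognize again."""
    sub, conf = second_stage(category, features, model)
    if sub is None or conf < threshold:       # invalid subcategorized object data
        model.update(customization_data)      # stand-in for training the network
        sub, conf = second_stage(category, features, model)
    return sub
```

The key design point the abstract describes is that retraining is triggered only on failure of the second stage, so the first-stage categorization remains fixed while the subcategory stage adapts to the non-ideal environment.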
Abstract:
Methods and apparatus are described that enable augmented or virtual reality based on a light field. A geometric proxy of a mobile device such as a smart phone is used during the process of inserting a virtual object from the light field into the real world images being acquired. For example, a mobile device includes a processor and a camera coupled to the processor. The processor is configured to define a view-dependent geometric proxy, record images with the camera to produce recorded frames and, based on the view-dependent geometric proxy, render the recorded frames with an inserted light field virtual object.
Abstract:
An embodiment method for computationally adjusting images from a multi-camera system includes receiving calibrated image sequences, each calibrated image sequence corresponding to a camera in a camera array and having one or more image frames. A target camera model is computed for each camera in the camera array according to target camera poses or target camera intrinsic matrices for the respective camera. The computing generates a transformation matrix for each camera, and the transformation matrix for each camera is applied to the calibrated image sequence corresponding to that camera. The transformation matrix warps each image frame of the calibrated image sequence, generating target image sequences.
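One common way such a per-camera transformation matrix can be formed, sketched below as an assumption rather than the patented computation, is a homography built from the source and target intrinsic matrices and rotations; this form is exact for rotation-only pose changes (or distant scenes). The function names are illustrative.

```python
import numpy as np

def target_transform(K_src, R_src, K_tgt, R_tgt):
    """Homography rewarping a calibrated camera (K_src, R_src) to a target
    camera model (K_tgt, R_tgt). Valid for rotation-only pose changes."""
    return K_tgt @ R_tgt @ R_src.T @ np.linalg.inv(K_src)

def warp_point(H, x, y):
    """Apply the 3x3 transformation matrix to one pixel coordinate."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]   # dehomogenize
```

Applying `warp_point` over every pixel of every frame corresponds to the abstract's step of warping each image frame of a calibrated sequence into the target image sequence.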
Abstract:
A robotic device is disclosed as having deep reinforcement learning capability. The device includes non-transitory memory comprising instructions and one or more processors in communication with the memory. The instructions cause the one or more processors to receive, from a sensor, a sensing frame comprising an image. The processors then determine a movement transition based on the sensing frame and the deep reinforcement learning, wherein the deep reinforcement learning uses at least one of a map coverage reward, a map quality reward, or a traversability reward to determine the movement transition. The processors then update an area map based on the sensing frame and the deep reinforcement learning, using a visual simultaneous localization and mapping (SLAM) process to determine the map updates.
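The three reward terms named in the abstract can be combined as a weighted sum, a standard shaping choice in deep reinforcement learning. The sketch below is an assumption for illustration; the weights and function name are not from the source.

```python
def composite_reward(coverage_gain, quality_gain, traversability,
                     w_cov=1.0, w_qual=0.5, w_trav=0.25):
    """Weighted sum of the map coverage, map quality, and traversability
    rewards; the weights here are illustrative, not from the source."""
    return (w_cov * coverage_gain
            + w_qual * quality_gain
            + w_trav * traversability)
```

A reward shaped this way lets the learned policy trade off exploring new area (coverage), refining the SLAM map (quality), and staying on passable terrain (traversability) when choosing each movement transition.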
Abstract:
A system and method of tracking an object and navigating an object tracking robot includes: receiving tracking sensor input representing the object and an environment at multiple times; responsive to the tracking sensor input, calculating positions of the robot and the object at the multiple times; and using a computer-implemented deep reinforcement learning (DRL) network, trained as a function of tracking quality rewards and robot navigation path quality rewards and responsive to the calculated positions of the robot and the object at the multiple times, to determine possible actions specifying movement of the object tracking robot from a current position of the robot and target, determine quality values (Q-values) for the possible actions, and select an action as a function of the Q-values. A method of training the DRL network is also included.
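The final step, selecting an action as a function of the Q-values, is commonly implemented as greedy (or epsilon-greedy) selection. The sketch below assumes that standard scheme; it is illustrative, not the patented selection rule.

```python
import random

def select_action(q_values, epsilon=0.0, rng=None):
    """Pick an action index from a list of Q-values: with probability
    epsilon choose randomly (explore), otherwise take the argmax (exploit)."""
    rng = rng or random.Random(0)
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                        # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])    # exploit
```

With `epsilon=0.0` this reduces to pure greedy selection, which matches a deployed tracking robot acting on a trained network; a nonzero epsilon would typically be used only during training.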

Abstract:
An apparatus is configured to perform a method of parallax tolerant video stitching. The method includes determining a plurality of video sequences to be stitched together; performing a spatial-temporal localized warping computation process on the video sequences to determine a plurality of target warping maps; warping a plurality of frames among the video sequences into a plurality of target virtual frames using the target warping maps; performing a spatial-temporal content-based seam finding process on the target virtual frames to determine a plurality of target seam maps; and stitching the video sequences together using the target seam maps.
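The seam finding step can be illustrated with a minimal dynamic-programming seam finder over a per-pixel cost map, in the style of seam carving. This is a stand-in sketch for the spatial-temporal content-based seam finding process, not the patented algorithm; the cost-map input and function name are assumptions.

```python
def find_seam(cost):
    """Return, for each row of the cost map, the column of a minimum-cost
    8-connected vertical seam (a toy stand-in for content-based seam
    finding between overlapping video frames)."""
    rows, cols = len(cost), len(cost[0])
    dp = [row[:] for row in cost]             # dp[r][c]: cheapest seam ending at (r, c)
    for r in range(1, rows):
        for c in range(cols):
            lo, hi = max(0, c - 1), min(cols, c + 2)
            dp[r][c] += min(dp[r - 1][lo:hi])
    # backtrack from the cheapest bottom cell
    seam = [min(range(cols), key=lambda c: dp[-1][c])]
    for r in range(rows - 2, -1, -1):
        c = seam[-1]
        lo, hi = max(0, c - 1), min(cols, c + 2)
        seam.append(min(range(lo, hi), key=lambda cc: dp[r][cc]))
    return seam[::-1]
```

In a stitching pipeline of the kind the abstract describes, the cost map would encode content differences between warped target virtual frames in their overlap region, so the recovered seam avoids cutting through misaligned or moving content.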