Abstract:
A system encodes videos of a moving object in a scene, acquired by multiple fixed cameras. Calibration data are first determined for each camera and associated with the corresponding video. A segmentation mask is determined for each frame of each video; the mask identifies only the foreground pixels in the frame that are associated with the object. A shape encoder then encodes the segmentation masks, a position encoder encodes the position of each pixel, and a color encoder encodes the color of each pixel. The encoded data can be combined into a single bitstream and transferred to a decoder. At the decoder, the bitstream is decoded into an output video with an arbitrary, user-selected viewpoint. A dynamic 3D point model defines the geometry of the moving object. Splat sizes and surface normals used during rendering can be determined explicitly by either the encoder or the decoder.
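The separation into shape, position and color data described in this abstract can be illustrated with a minimal sketch. It is not the patented encoder: the function name `split_foreground`, the use of a per-pixel depth value as the "position", and the nested-list frame layout are all assumptions made for illustration, and no actual compression is performed.

```python
from typing import List, Sequence, Tuple

Color = Tuple[int, int, int]  # RGB triple (assumed representation)


def split_foreground(
    mask: Sequence[Sequence[int]],
    depth: Sequence[Sequence[float]],
    color: Sequence[Sequence[Color]],
) -> Tuple[List[int], List[float], List[Color]]:
    """Split one frame into three streams (hypothetical helper).

    The shape stream is the full binary mask in row-major order.
    The position and color streams hold values for foreground pixels only,
    so background pixels never reach the position or color encoders.
    """
    shape_bits = [1 if bit else 0 for row in mask for bit in row]
    positions: List[float] = []
    colors: List[Color] = []
    for y, row in enumerate(mask):
        for x, bit in enumerate(row):
            if bit:  # foreground pixel belonging to the object
                positions.append(depth[y][x])  # depth used as "position" (assumption)
                colors.append(color[y][x])
    return shape_bits, positions, colors
```

Each of the three returned lists would then be passed to its own encoder, and a decoder could use the mask to place the position and color values back at the right pixels.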
Abstract:
A method and system generate 3D video images from point samples obtained from primary video data in a 3D coordinate system. Each point sample contains 3D coordinates, as well as colour and/or intensity information. During subsequent rendering, the point samples are modified continuously as the 3D primary video data are updated. The point samples are arranged in a hierarchical data structure such that each point sample is an end point, or leaf node, of a hierarchical tree, and the branch points of the tree are average values of the nodes below them in the hierarchy.
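The hierarchical structure described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented data structure: the names `Node`, `make_branch` and `refresh` are hypothetical, and each branch is taken to be the unweighted mean of its direct children, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    """A leaf is a point sample; a branch stores averages of its children."""
    position: Vec3
    color: Vec3
    children: List["Node"] = field(default_factory=list)


def _mean(vectors: Sequence[Vec3]) -> Vec3:
    """Component-wise unweighted mean of 3-vectors (assumed averaging rule)."""
    n = len(vectors)
    return (
        sum(v[0] for v in vectors) / n,
        sum(v[1] for v in vectors) / n,
        sum(v[2] for v in vectors) / n,
    )


def make_branch(children: Sequence[Node]) -> Node:
    """Build a branch node whose values are the averages of its children."""
    return Node(
        position=_mean([c.position for c in children]),
        color=_mean([c.color for c in children]),
        children=list(children),
    )


def refresh(node: Node) -> None:
    """Recompute branch averages bottom-up after leaf samples were updated."""
    if not node.children:
        return
    for child in node.children:
        refresh(child)
    node.position = _mean([c.position for c in node.children])
    node.color = _mean([c.color for c in node.children])
```

When a point sample changes because the primary video data were updated, calling `refresh` on the root brings the averaged branch values back in line with the leaves.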
Abstract:
A method for generating a 3D representation of a dynamically changing 3D scene includes the steps of: acquiring at least two synchronised video streams (120) from at least two cameras located at different locations and observing the same 3D scene (102); determining camera parameters, which comprise the orientation and zoom setting, for the at least two cameras (103); tracking the movement of objects (310a,b, 312a,b; 330a,b, 331a,b, 332a,b; 410a,b, 411a,b; 430a,b, 431a,b; 420a,b, 421a,b) in the at least two video streams (104); determining the identity of the objects in the at least two video streams (105); and determining the 3D position of the objects by combining the information from the at least two video streams (106). The tracking step (104) uses position information derived from the 3D position of the objects at one or more earlier instants in time. As a result, the quality, speed and robustness of the 2D tracking in the video streams are improved.
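One way to picture how an earlier 3D position can feed the 2D tracking step is to project that position into each camera image and search near the projected point. The sketch below assumes a pinhole camera described by a 3x4 projection matrix and a square search window; the names `project` and `search_window` are hypothetical, and the abstract does not state that the method works this way.

```python
from typing import Sequence, Tuple

Matrix3x4 = Sequence[Sequence[float]]


def project(P: Matrix3x4, xyz: Tuple[float, float, float]) -> Tuple[float, float]:
    """Project a 3D point into image coordinates with a 3x4 pinhole matrix."""
    x, y, z = xyz
    # Homogeneous image coordinates: h = P * [x, y, z, 1]^T
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in P]
    return (h[0] / h[2], h[1] / h[2])


def search_window(
    P: Matrix3x4,
    prev_xyz: Tuple[float, float, float],
    half_size: float,
) -> Tuple[float, float, float, float]:
    """Return (u_min, v_min, u_max, v_max) centred on the projected point.

    The 2D tracker for this camera would look for the object inside this
    window instead of scanning the whole frame.
    """
    u, v = project(P, prev_xyz)
    return (u - half_size, v - half_size, u + half_size, v + half_size)
```

In this sketch, each camera has its own matrix `P`, built from its calibrated parameters, so a single 3D position from an earlier instant yields one search window per video stream.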