Abstract:
A method for real-time 2D to 3D video conversion includes receiving a decoded 2D video frame having an original resolution, downscaling the decoded 2D video frame into an associated 2D video frame having a lower resolution, and segmenting objects present in the downscaled 2D video frame into background objects and foreground objects. The method also includes generating a background depth map and a foreground depth map for the downscaled 2D video frame based on the segmented background and foreground objects, and deriving a frame depth map in the original resolution based on the background depth map and the foreground depth map. The method further includes providing a 3D video frame for display at a real-time playback rate. The 3D video frame is generated in the original resolution based on the frame depth map.
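The flow described above (downscale, segment, build background and foreground depth, then return to the original resolution) can be sketched as follows. This is only an illustration under stated assumptions: the brightness-threshold segmentation, the top-to-bottom background gradient, the constant foreground depth of 200, the nearest-neighbour upscaling, and all function names are hypothetical stand-ins and are not taken from the abstract. Larger depth values are assumed to mean "nearer to the viewer".

```python
def downscale(frame, factor):
    """Average each factor x factor block of a grayscale frame (list of rows)."""
    h, w = len(frame) // factor, len(frame[0]) // factor
    return [[sum(frame[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor)) / factor ** 2
             for x in range(w)] for y in range(h)]


def segment_foreground(small, threshold):
    """Hypothetical segmentation: pixels brighter than the threshold are foreground."""
    return [[value > threshold for value in row] for row in small]


def background_depth(h, w):
    """Hypothetical background model: far (0) at the top, nearer (128) at the bottom."""
    return [[128.0 * y / max(h - 1, 1)] * w for y in range(h)]


def upscale(small, factor):
    """Nearest-neighbour upscaling back to the original resolution."""
    return [[small[y // factor][x // factor]
             for x in range(len(small[0]) * factor)]
            for y in range(len(small) * factor)]


def frame_depth_map(frame, factor=2, threshold=128, foreground_value=200.0):
    """Run the low-resolution steps and return a depth map at the original size."""
    small = downscale(frame, factor)
    mask = segment_foreground(small, threshold)
    h, w = len(small), len(small[0])
    bg = background_depth(h, w)
    # Foreground pixels override the background depth (assumed merge rule).
    merged = [[foreground_value if mask[y][x] else bg[y][x] for x in range(w)]
              for y in range(h)]
    return upscale(merged, factor)


frame = [[0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 255, 255],
         [0, 0, 255, 255]]
depth = frame_depth_map(frame)
```

Doing the segmentation and depth estimation on the downscaled frame is what keeps the per-frame cost low; only the final depth map is brought back to the original resolution. Generating the 3D frame from the depth map is not shown here.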
Abstract:
A method for personalized video depth adjustment includes receiving a video frame, obtaining a frame depth map based on the video frame, and determining content genre of the video frame by classifying content of the video frame into one or more categories. The method also includes identifying a user viewing the video frame, retrieving depth preference information for the user from a user database, and deriving depth adjustment parameters based on the content genre and the depth preference information for the user. The method further includes adjusting the frame depth map based on the depth adjustment parameters, and providing a 3D video frame for display at a real-time playback rate on a user device of the user. The 3D video frame is generated based on the adjusted frame depth map.
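As a rough illustration of deriving adjustment parameters from the content genre and a user's stored preference, the sketch below scales depth around a mid-plane. The genre gain table, the `strength` and `offset` preference fields, the 0 to 255 depth range, and the function names are all assumptions made for this example; the abstract does not specify them.

```python
# Hypothetical per-genre depth gains; the values are illustrative only.
GENRE_GAIN = {"sports": 1.2, "news": 0.6, "movie": 1.0}


def derive_params(genre, user_pref):
    """Combine a genre gain with the user's preferred strength and offset."""
    gain = GENRE_GAIN.get(genre, 1.0) * user_pref.get("strength", 1.0)
    offset = user_pref.get("offset", 0)
    return gain, offset


def adjust_depth(depth_map, gain, offset, mid=128):
    """Scale depth around the mid-plane, shift it, and clamp to 0..255."""
    def adjust(d):
        return max(0, min(255, round(mid + gain * (d - mid) + offset)))
    return [[adjust(d) for d in row] for row in depth_map]


# Usage: a user who prefers half-strength depth watching a movie.
gain, offset = derive_params("movie", {"strength": 0.5})
adjusted = adjust_depth([[128, 228, 28]], gain, offset)
```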
Abstract:
A monoscopic low-power mobile device is capable of creating real-time stereo images and videos from a single captured view. The device uses statistics from an autofocusing process to create a block depth map of the single captured view. Artifacts in the block depth map are reduced and an image depth map is created. Stereo three-dimensional (3D) left and right views are created from the image depth map using a Z-buffer based 3D surface recovery process and a disparity map that is a function of the geometry of binocular vision.
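One common way to express disparity as a function of binocular geometry is the pinhole stereo relation, disparity = focal length × baseline / depth. The sketch below applies that relation per pixel. It is an assumption that this particular formula matches the device described; the parameter names and the sample values are hypothetical.

```python
def disparity_map(depth_m, focal_px, baseline_m):
    """Per-pixel disparity in pixels from depth in metres (pinhole stereo model).

    depth_m    : rows of positive depths in metres
    focal_px   : focal length expressed in pixels (assumed known)
    baseline_m : separation of the two virtual eyes in metres (assumed)
    """
    return [[focal_px * baseline_m / z for z in row] for row in depth_m]


# Usage: nearer points (smaller depth) get larger disparity.
disparities = disparity_map([[2.0, 4.0]], focal_px=1000.0, baseline_m=0.06)
```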
Abstract:
A method is provided for a content recommendation module. The method includes receiving a user input related to viewing contents from a user and determining whether a recommendation pool containing a plurality of selected recommendation candidates has been changed corresponding to the input. The method also includes, when the recommendation pool has been changed, mapping the plurality of selected recommendation candidates in the changed recommendation pool into a hierarchical data structure with a plurality of levels such that each of the plurality of levels acts as a stage of a zoom operation on the selected recommendation candidates. Further, the method includes rendering mapped recommendation candidates from the plurality of levels to be displayed to the user.
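A minimal sketch of mapping a recommendation pool into levels that act as zoom stages is shown below. The ranking by score, the doubling of level size at each stage, and the names `map_to_levels` and `base` are assumptions for illustration; the abstract does not say how candidates are assigned to levels.

```python
def map_to_levels(candidates, base=2):
    """Split (name, score) candidates into zoom levels of growing size.

    Level 0 holds the top `base` candidates, level 1 the next 2 * base,
    and so on, so each zoom step reveals a larger set (assumed policy).
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    levels, start, size = [], 0, base
    while start < len(ranked):
        levels.append([name for name, _ in ranked[start:start + size]])
        start += size
        size *= 2
    return levels


pool = [("a", 0.9), ("b", 0.8), ("c", 0.7), ("d", 0.6),
        ("e", 0.5), ("f", 0.4), ("g", 0.3)]
levels = map_to_levels(pool)
```

Under this policy the mapping only needs to be recomputed when the pool changes, which matches the check described in the abstract.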
Abstract:
A method and apparatus for generating stereoscopic images of a scene are described. The apparatus may have a first image sensor, a second image sensor spaced apart from the first image sensor, a diversity combine module to combine image data from the first and second image sensors, and an image processing module configured to process combined image data from the diversity combine module. The apparatus may be used to generate stereoscopic images of a scene.
Abstract:
Embodiments of the present invention include systems and methods for processing and coding image data. In one embodiment, image data is coded using a first image coding process. If a bit rate constraint is satisfied, the image data is output. If the bit rate constraint is not satisfied, the image data is coded using a second different coding process. In one embodiment, the second coding process is a layered coding process. In another embodiment, if the constraint is satisfied, quantization data may be included in the output, and may be coded using layered coding. Variable length coding processes and hardware implementations are further disclosed for efficient image processing.
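The fallback logic (try a first coding process, and switch to a second one only if the bit rate constraint is not met) can be sketched as below. The two coders are passed in as callables because the abstract does not define them; `toy_first` and `toy_second` are hypothetical placeholders, not real image coders, and the function name is an assumption.

```python
def code_with_constraint(data, first_coder, second_coder, max_bits):
    """Return (label, coded bytes), using the second coder only on overflow."""
    coded = first_coder(data)
    if len(coded) * 8 <= max_bits:
        return "first", coded
    # Constraint not satisfied: fall back to the second, different process.
    return "second", second_coder(data)


def toy_first(data):
    """Placeholder first coder: passes the bytes through unchanged."""
    return bytes(data)


def toy_second(data):
    """Placeholder second coder: keeps every other byte (lossy stand-in)."""
    return bytes(data[::2])


small_result = code_with_constraint(b"abcd", toy_first, toy_second, max_bits=64)
large_result = code_with_constraint(b"abcdefghij", toy_first, toy_second, max_bits=64)
```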
Abstract:
The disclosure is directed to decoder-side region-of-interest (ROI) video processing. A video decoder determines whether ROI assistance information is available. If not, the decoder defaults to decoder-side ROI processing. The decoder-side ROI processing may estimate the reliability of ROI extraction in the bitstream domain. If ROI reliability is favorable, the decoder applies bitstream domain ROI extraction. If ROI reliability is unfavorable, the decoder applies pixel domain ROI extraction. The decoder may apply different ROI extraction processes for intra-coded (I) and inter-coded (P or B) data. The decoder may use color-based ROI generation for intra-coded data, and coded block pattern (CBP)-based ROI generation for inter-coded data. ROI refinement may involve shape-based refinement for intra-coded data, and motion- and color-based refinement for inter-coded data.
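The decision flow above can be summarized in a short sketch. The numeric reliability score, the 0.5 threshold, the returned labels, and the function names are assumptions for illustration; the abstract only states that reliability is judged favorable or unfavorable, and it is assumed here that available assistance information is used directly.

```python
def select_roi_processing(assist_available, reliability, threshold=0.5):
    """Pick the ROI extraction path for a frame."""
    if assist_available:
        return "assisted"  # assumed: use the provided ROI assistance information
    # Decoder-side processing: choose the domain by estimated reliability.
    return "bitstream-domain" if reliability >= threshold else "pixel-domain"


def roi_generation_method(frame_type):
    """Color-based generation for intra-coded (I) data, CBP-based for P or B."""
    return "color-based" if frame_type == "I" else "CBP-based"


def roi_refinement_method(frame_type):
    """Shape-based refinement for I data, motion- and color-based for P or B."""
    return "shape-based" if frame_type == "I" else "motion-and-color-based"
```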
Abstract:
A mobile device comprising a first image sensor, a second image sensor configured to change position with respect to the first image sensor, a controller configured to control the position of the second image sensor, and an image processing module configured to process and combine images captured by the first and second image sensors.
Abstract:
The rendering of 3D video images on a stereo-enabled display (e.g., stereoscopic or autostereoscopic display) is described. The process includes culling facets facing away from a viewer, defining foreground facets for Left and Right Views and common background facets, determining lighting for these facets, and performing screen mapping and scene rendering for one view (e.g., Right View) using computational results for facets of the other view (i.e., Left View). In one embodiment, visualization of images is provided on the stereo-enabled display of a low-power device, such as a mobile phone, a computer, a video game platform, or a Personal Digital Assistant (PDA) device.
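The culling step can be illustrated with a standard back-face test: a triangular facet is kept only if its normal points toward the viewer. The sketch assumes counter-clockwise vertex winding for front faces and a single eye position; these conventions and the function names are assumptions, not details from the abstract.

```python
def sub(u, v):
    """Component-wise difference of two 3D points."""
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3D vectors."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def faces_viewer(facet, eye):
    """True if the facet (a, b, c), wound counter-clockwise, faces the eye."""
    a, b, c = facet
    normal = cross(sub(b, a), sub(c, a))
    return dot(normal, sub(eye, a)) > 0


def cull_back_faces(facets, eye):
    """Keep only the facets whose front side is visible from the eye."""
    return [f for f in facets if faces_viewer(f, eye)]


front = ((0, 0, 0), (1, 0, 0), (0, 1, 0))   # normal points toward +z
back = ((0, 0, 0), (0, 1, 0), (1, 0, 0))    # same triangle, reversed winding
visible = cull_back_faces([front, back], eye=(0, 0, 5))
```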