摘要:
Adaptive video processing for a target display panel may be implemented in or by a server/encoding pipeline. The adaptive video processing methods may obtain and take into account video content and display panel-specific information including display characteristics and environmental conditions (e.g., ambient lighting and viewer location) when processing and encoding video content to be streamed to the target display panel in an ambient setting or environment. The server-side adaptive video processing methods may use this information to adjust one or more video processing functions as applied to the video data to generate video content in the color gamut and dynamic range of the target display panel that is adapted to the display panel characteristics and ambient viewing conditions.
摘要:
Intelligent systems are disclosed that respond to user intent and desires based upon activity that may or may not be expressly directed at the intelligent system. In some embodiments, the intelligent system acquires a depth image of a scene surrounding the system. A scene geometry may be extracted from the depth image and elements of the scene may be monitored. In certain embodiments, user activity in the scene is monitored and analyzed to infer user desires or intent with respect to the system. The interpretation of the user's intent as well as the system's response may be affected by the scene geometry surrounding the user and/or the system. In some embodiments, techniques and systems are disclosed for interpreting express user communication, e.g., expressed through hand gesture movements. In some embodiments, such gesture movements may be interpreted based on real-time depth information obtained from, e.g., optical or non-optical type depth sensors.
摘要:
An estimate of distance between a user and a camera on a device is used to determine an illumination pattern density used for speckle pattern illumination of the user in subsequent images. The distance may be estimated using an image captured when the user is illuminated with flood infrared illumination. Either a sparse speckle (dot) pattern illumination pattern or a dense speckle pattern illumination pattern is used depending on the distance between the user's face and the camera.
摘要:
Varying embodiments of intelligent systems are disclosed that respond to user intent and desires based upon activity that may or may not be expressly directed at the intelligent system. In some embodiments, the intelligent system acquires a depth image of a scene surrounding the system. A scene geometry may be extracted from the depth image and elements of the scene, such as walls, furniture, and humans may be evaluated and monitored. In certain embodiments, user activity in the scene is monitored and analyzed to infer user desires or intent with respect to the system. The interpretation of the user's intent or desire as well as the system's response may be affected by the scene geometry surrounding the user and/or the system. In some embodiments, techniques and systems are disclosed for interpreting express user communication, for example, expressed through fine hand gesture movements. In some embodiments, such gesture movements may be interpreted based on real-time depth information obtained from, for example, optical or non-optical type depth sensors. The depth information may be interpreted in “slices” (three-dimensional regions of space having a relatively small depth) until one or more candidate hand structures are detected. Once detected, each candidate hand structure may be confirmed or rejected based on its own unique physical properties (e.g., shape, size and continuity to an arm structure). Each confirmed hand structure may be submitted to a depth-aware filtering process before its own unique three-dimensional features are quantified into a high-dimensional feature vector. A two-step classification scheme may be applied to the feature vectors to identify a candidate gesture (step 1), and to reject candidate gestures that do not meet a gesture-specific identification operation (step-2). The identified gesture may be used to initiate some action controlled by a computer system.
摘要:
A method and system for adaptively mixing video components with graphics/UI components, where the video components and graphics/UI components may be of different types, e.g., different dynamic ranges (such as HDR, SDR) and/or color gamut (such as WCG). The mixing may result in a frame optimized for a display device's color space, ambient conditions, viewing distance and angle, etc., while accounting for characteristics of the received data. The methods include receiving video and graphics/UI elements, converting the video to HDR and/or WCG, performing statistical analysis of received data and any additional applicable rendering information, and assembling a video frame with the received components based on the statistical analysis. The assembled video frame may be matched to a color space and displayed. The video data and graphics/UI data may have or be adjusted to have the same white point and/or primaries.
摘要:
Some embodiments provide a novel method for in-conference adjustment of encoded video pictures captured by a mobile device having at least first and second cameras. The method may involve real-time modifications of composite video displays that are generated by the mobile devices involved in such a conference. Specifically, in some embodiments, the mobile devices generate composite displays that simultaneously display multiple videos captured by multiple cameras of one or more devices. In some cases, the composite displays place the videos in adjacent display areas (e.g., in adjacent windows). In other cases, the composite display is a picture-in-picture (PIP) display that includes at least two display areas that show two different videos where one of the display areas is a background main display area and the other is a foreground inset display area that overlaps the background main display area.
摘要:
A video enhancement processing system improves perceptual quality of video data with limited processing complexity. The system may perform spatial denoising using filter weights that may vary based on estimated noise of an input image. Specifically, estimated noise of the input image may alter a search neighborhood over which the denoising filter operates, may alter a profile of weights to be applied based on pixel distances and may alter a profile of weights to be applied based on similarity of pixels for denoising processes. As such, the system finds application in consumer devices that perform such enhancement techniques in real time using general purpose processors such as CPUs or GPUs.
摘要:
The present disclosure generally relates to systems and methods for image data processing. In certain embodiments, an image processing pipeline may be configured to receive a frame of the image data having a plurality of pixels acquired using a digital image sensor. The image processing pipeline may then be configured to determine a first plurality of correction factors that may correct each pixel in the plurality of pixels for fixed pattern noise. The first plurality of correction factors may be determined based at least in part on fixed pattern noise statistics that correspond to the frame of the image data. After determining the first plurality of correction factors, the image processing pipeline may be configured to configured to apply the first plurality of correction factors to the plurality of pixels, thereby reducing the fixed pattern noise present in the plurality of pixels.
摘要:
Intelligent systems are disclosed that respond to user intent and desires based upon activity that may or may not be expressly directed at the intelligent system. In some embodiments, the intelligent system acquires a depth image of a scene surrounding the system. A scene geometry may be extracted from the depth image and elements of the scene may be monitored. In certain embodiments, user activity in the scene is monitored and analyzed to infer user desires or intent with respect to the system. The interpretation of the user's intent as well as the system's response may be affected by the scene geometry surrounding the user and/or the system. In some embodiments, techniques and systems are disclosed for interpreting express user communication, e.g., expressed through hand gesture movements. In some embodiments, such gesture movements may be interpreted based on real-time depth information obtained from, e.g., optical or non-optical type depth sensors.
摘要:
The present disclosure generally relates to systems and methods for image data processing. In certain embodiments, an image processing pipeline may be configured to receive a frame of the image data having a plurality of pixels acquired using a digital image sensor. The image processing pipeline may then be configured to determine a first plurality of correction factors that may correct each pixel in the plurality of pixels for fixed pattern noise. The first plurality of correction factors may be determined based at least in part on fixed pattern noise statistics that correspond to the frame of the image data. After determining the first plurality of correction factors, the image processing pipeline may be configured to configured to apply the first plurality of correction factors to the plurality of pixels, thereby reducing the fixed pattern noise present in the plurality of pixels.