Abstract:
The subject disclosure relates to face recognition in video. Face detection data in frames of input data are used to generate face galleries, which are labeled and used in recognizing faces throughout the video. Metadata that associate each video frame with the faces in it are generated and maintained for subsequent identification. Faces other than those found by face detection may be found by face tracking, in which facial landmarks found by the face detection are used to track a face over previous and/or subsequent video frames. Once generated, the maintained metadata may be accessed to efficiently determine the identity of a person corresponding to a viewer-selected face.
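The metadata lookup described above might be sketched as follows. This is a minimal illustration, not the disclosed implementation: the names `FaceRecord` and `FaceMetadataIndex`, the bounding-box representation, and the click-based lookup are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class FaceRecord:
    """Hypothetical metadata entry tying a labeled face to one video frame."""
    frame: int
    box: Tuple[int, int, int, int]  # x, y, width, height (assumed layout)
    label: str


class FaceMetadataIndex:
    """Hypothetical per-frame index of face records for fast lookup."""

    def __init__(self) -> None:
        self._by_frame: Dict[int, List[FaceRecord]] = {}

    def add(self, record: FaceRecord) -> None:
        # Records may come from detection or from tracking across frames.
        self._by_frame.setdefault(record.frame, []).append(record)

    def identify(self, frame: int, x: int, y: int) -> Optional[str]:
        """Return the label of the face containing point (x, y), if any."""
        for rec in self._by_frame.get(frame, []):
            bx, by, bw, bh = rec.box
            if bx <= x < bx + bw and by <= y < by + bh:
                return rec.label
        return None


index = FaceMetadataIndex()
index.add(FaceRecord(frame=10, box=(10, 10, 40, 40), label="Alice"))
index.add(FaceRecord(frame=11, box=(12, 10, 40, 40), label="Alice"))
```

Because the records are keyed by frame, resolving a viewer-selected face only scans the handful of faces stored for that frame rather than re-running recognition.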
Abstract:
Techniques and tools are presented for controlling artifacts such as banding artifacts, ringing artifacts and film scan artifacts in video. For example, before encoding, a pre-processor performs combined filtering and dithering on video such that the weight of dithering at a location depends on the results of filtering at the location. For the combined filtering and dithering, the pre-processor can determine a lowpass signal and highpass residual, weight dithering based on local characteristics of the highpass residual, and then combine the lowpass signal with the weighted dithering. Or, to determine the relative weight, the pre-processor can use a filter whose normalization factor varies depending on how many sample values around a location are within a threshold of similarity to a current sample value at the location. The filtering and dithering can use different strengths for luma and chroma channels.
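The lowpass/highpass variant of the combined filtering and dithering can be illustrated on a one-dimensional signal. This is a rough sketch under stated assumptions: the box filter, the mean-absolute-residual measure, the linear weighting rule, and the parameter names (`radius`, `max_dither`, `flat_threshold`) are hypothetical choices, not taken from the abstract.

```python
import random
from typing import List


def lowpass_filter(samples: List[float], radius: int) -> List[float]:
    """Box-filter lowpass (an assumed filter choice) with edge clamping."""
    n = len(samples)
    out = []
    for i in range(n):
        window = samples[max(0, i - radius):min(n, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def filter_and_dither(samples: List[float], radius: int = 2,
                      max_dither: float = 1.0, flat_threshold: float = 2.0,
                      seed: int = 0) -> List[float]:
    """Combine a lowpass signal with dithering weighted by the local residual."""
    rng = random.Random(seed)
    n = len(samples)
    lowpass = lowpass_filter(samples, radius)
    # Highpass residual: what the lowpass filter removed at each location.
    residual = [s - l for s, l in zip(samples, lowpass)]
    out = []
    for i in range(n):
        window = residual[max(0, i - radius):min(n, i + radius + 1)]
        energy = sum(abs(r) for r in window) / len(window)
        # Assumed rule: flat regions (low residual energy) get full dither,
        # textured regions get none.
        weight = max(0.0, 1.0 - energy / flat_threshold)
        out.append(lowpass[i] + weight * rng.uniform(-max_dither, max_dither))
    return out
```

With this weighting, a flat gradient prone to banding receives dither, while a detailed region whose residual is already large is left as the plain lowpass output.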
Abstract:
Whether interlaced video fields form a progressive video frame can be automatically determined. The presence or absence of a first characteristic of two or more video fields can be determined by analysis of the fields and/or related information such as flags, cadence, previous determinations, and others. Similarly, the presence or absence of a second characteristic can be detected. In accordance with the detecting, whether (or how likely it is that) the video fields form a progressive video frame can be determined based on a possibly predetermined likelihood that fields of progressive video frames in general have or do not have the first characteristic and based on a possibly predetermined likelihood that fields of interlaced video frames in general have or do not have the second characteristic.
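One way to combine such likelihoods is a naive Bayes style calculation, sketched below. The abstract does not name a specific formula, so the independence assumption, the function name `progressive_probability`, the characteristic names, and the numeric likelihoods are all hypothetical.

```python
from typing import Dict, Tuple


def progressive_probability(observations: Dict[str, bool],
                            likelihoods: Dict[str, Tuple[float, float]],
                            prior: float = 0.5) -> float:
    """Estimate the probability that the fields form a progressive frame.

    likelihoods[name] = (P(characteristic | progressive),
                         P(characteristic | interlaced)).
    Characteristics are treated as independent (an assumption).
    """
    p_prog, p_int = prior, 1.0 - prior
    for name, present in observations.items():
        given_prog, given_int = likelihoods[name]
        p_prog *= given_prog if present else 1.0 - given_prog
        p_int *= given_int if present else 1.0 - given_int
    return p_prog / (p_prog + p_int)


# Hypothetical, possibly predetermined likelihoods for two characteristics.
LIKELIHOODS = {
    "low_combing": (0.9, 0.2),   # first characteristic
    "field_motion": (0.1, 0.8),  # second characteristic
}
```

A caller could threshold the returned value to make a yes/no decision, or keep it as a graded likelihood that feeds later determinations.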
Abstract:
A multiple bitrate (MBR) video encoding management tool utilizes available processing units for parallel MBR video encoding. For example, instead of focusing only on multi-threading of encoding tasks for a single picture or group of pictures (GOP), the management tool parallelizes the encoding of multiple GOPs between different processing units and/or different computing systems. With this parallel MBR video encoding architecture, different GOPs can be encoded in parallel. To facilitate such parallel encoding, data dependencies between GOPs are removed. The management tool can adjust the number of GOPs to encode in parallel on a computing system so as to favor parallelism of encoding for different GOPs at the expense of parallelism of encoding inside a GOP, or vice versa, and thereby set a suitable balance between encoding latency and throughput.
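The trade-off between parallelism across GOPs and parallelism inside a GOP can be sketched with a thread pool. This is an illustration only: `encode_gop` is a placeholder rather than a real encoder, and the names `split_into_gops`, `encode_parallel`, and `gops_in_parallel` are assumptions introduced here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split_into_gops(frames: List[int], gop_size: int) -> List[List[int]]:
    """Cut the frame sequence into independent groups of pictures."""
    return [frames[i:i + gop_size] for i in range(0, len(frames), gop_size)]


def encode_gop(gop: List[int], threads_per_gop: int) -> bytes:
    """Placeholder encoder: a real one would use threads_per_gop internally."""
    return bytes(f % 256 for f in gop)


def encode_parallel(frames: List[int], gop_size: int,
                    total_workers: int, gops_in_parallel: int) -> bytes:
    """Encode GOPs concurrently, splitting workers between and within GOPs."""
    gops_in_parallel = max(1, min(gops_in_parallel, total_workers))
    # More GOPs in flight leaves fewer threads for each individual GOP.
    threads_per_gop = max(1, total_workers // gops_in_parallel)
    gops = split_into_gops(frames, gop_size)
    with ThreadPoolExecutor(max_workers=gops_in_parallel) as pool:
        # map() returns results in input order, so the stream stays in sequence.
        chunks = list(pool.map(lambda g: encode_gop(g, threads_per_gop), gops))
    return b"".join(chunks)
```

Raising `gops_in_parallel` favors throughput, while lowering it leaves more threads per GOP and so favors the latency of each individual GOP. The sketch relies on GOPs having no data dependencies on one another.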
Abstract:
A video encoding system encodes video streams for multiple bit rate video streaming using an approach that permits the encoded resolution to vary based, at least in part, on motion complexity. The video encoding system dynamically decides an encoding resolution for segments of the multiple bit rate video streams that varies with video complexity so as to achieve a better visual experience for multiple bit rate streaming. Motion complexity may be considered separately, or along with spatial complexity, in making the resolution decision.
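A resolution decision of this kind might look like the sketch below. The abstract gives no formula, so everything here is assumed: the resolution ladder, the bits-per-pixel budget model, the weights, and the normalization of motion and spatial complexity to the range 0 to 1.

```python
from typing import List, Tuple

# Hypothetical resolution ladder, largest first.
LADDER: List[Tuple[int, int]] = [(1920, 1080), (1280, 720), (854, 480)]


def choose_resolution(motion: float, spatial: float, bitrate_kbps: float,
                      fps: float = 30.0, base_bpp: float = 0.04,
                      motion_weight: float = 2.0,
                      spatial_weight: float = 1.0) -> Tuple[int, int]:
    """Pick the largest resolution whose bits-per-pixel budget suffices.

    motion and spatial are complexity scores assumed to lie in [0, 1].
    """
    # Assumed model: more complex content needs more bits per pixel.
    required_bpp = base_bpp * (1.0 + motion_weight * motion
                               + spatial_weight * spatial)
    for width, height in LADDER:
        available_bpp = bitrate_kbps * 1000.0 / (width * height * fps)
        if available_bpp >= required_bpp:
            return (width, height)
    return LADDER[-1]  # fall back to the smallest rung
```

At a fixed bit rate, a high-motion segment is encoded at a lower resolution so that each pixel still gets enough bits, while a static segment keeps the full resolution.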
Abstract:
Strategies are set forth herein for quantizing and dithering original image information to produce quantized image information. According to one exemplary implementation, the strategies involve: quantizing a sum that combines an original value taken from the original image information, a noise value, and an error term, to produce a quantized value; and calculating an error term for a subsequent quantizing operation by computing a difference between the quantized value and the original value. By virtue of this process, the strategies essentially add noise information to the quantization process, not the original image information, which results in quantized image information having reduced artifacts. The strategies can be used in conjunction with the Floyd-Steinberg error dispersion algorithm. According to another feature, the noise value is computed using a random number generator having a long repeat period, which further reduces artifacts.
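The quantization step can be sketched in one dimension as follows. This is an interpretation rather than the disclosed method: the error is carried only to the next sample instead of being dispersed in two dimensions as Floyd-Steinberg does, the "original value" used for the error is assumed to include the carried error, and the names `quantize_row`, `step`, and `noise_amplitude` are hypothetical. Python's `random.Random` is a Mersenne Twister with period 2**19937 - 1, used here as one example of a long-period generator.

```python
import random
from typing import List


def quantize_row(values: List[float], step: float = 1.0,
                 noise_amplitude: float = 0.5, seed: int = 12345) -> List[float]:
    """Quantize a row, adding noise to the decision but not to the error."""
    rng = random.Random(seed)  # long-period generator (Mersenne Twister)
    error = 0.0
    out = []
    for original in values:
        noise = rng.uniform(-noise_amplitude, noise_amplitude) * step
        target = original + error  # original value plus carried error
        # The noise influences which level is chosen...
        quantized = round((target + noise) / step) * step
        # ...but is excluded from the error passed to the next sample.
        error = target - quantized
        out.append(quantized)
    return out
```

Because the noise never enters the error term, the running sum of the output tracks the running sum of the input to within one quantization step, while the noise breaks up the regular patterns that plain error diffusion can produce.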
Abstract:
Strategies for effectively discovering, selecting, configuring, and controlling components used in media processing applications are described. According to one exemplary implementation, the strategies described configure the components based on profile information, configuration information, and a hierarchical ordering of configuration parameters. The hierarchical ordering may combine different coding paradigms, where one or more high level nodes in the ordering may define configuration parameters which are common to multiple coding paradigms. In this ordering, selection of a configuration parameter may cascade down to affect lower-ranking dependent parameters in the hierarchical ordering. According to one advantage, the hierarchical ordering provides a more uniform, extensible, and problem-free approach to configuring components than unstructured approaches to configuration. Moreover, applications can utilize the hierarchical ordering at different levels of granularity.
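The cascading behavior of a hierarchical ordering can be illustrated with a small tree of parameters. This is a sketch only: the class `ParamNode`, its methods, and the example parameters (`profile`, `max_bitrate_kbps`, `buffer_kbits`) with their values are hypothetical and do not come from the abstract.

```python
from typing import Any, Callable, List, Tuple


class ParamNode:
    """Hypothetical configuration parameter with dependent parameters."""

    def __init__(self, name: str, value: Any = None) -> None:
        self.name = name
        self.value = value
        self._dependents: List[Tuple["ParamNode", Callable[[Any], Any]]] = []

    def add_dependent(self, child: "ParamNode",
                      rule: Callable[[Any], Any]) -> "ParamNode":
        """Register a lower-ranking parameter derived from this one."""
        self._dependents.append((child, rule))
        return child

    def select(self, value: Any) -> None:
        """Set this parameter and cascade the change down the hierarchy."""
        self.value = value
        for child, rule in self._dependents:
            child.select(rule(value))


# A high-level node whose selection drives two lower levels.
profile = ParamNode("profile")
max_bitrate = profile.add_dependent(
    ParamNode("max_bitrate_kbps"), lambda p: {"low": 1000, "high": 8000}[p])
buffer_size = max_bitrate.add_dependent(
    ParamNode("buffer_kbits"), lambda kbps: kbps * 2)
profile.select("high")
```

An application working at a coarse granularity would set only `profile`, while one needing finer control could call `select` directly on a lower node such as `max_bitrate`.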
Abstract:
Strategies are described for processing image information in a linear form to reduce the amount of artifacts (compared to processing the data in nonlinear form). Exemplary types of processing operations can include scaling, compositing, alpha-blending, edge detection, and so forth. In a more specific implementation, strategies are described for processing image information that is: a) linear; b) in the RGB color space; c) high precision (e.g., provided by floating point representation); d) progressive; and e) full channel. Other improvements provide strategies for: a) processing image information in a pseudo-linear space to improve processing speed; b) implementing an improved error dispersion technique; c) dynamically calculating and applying filter kernels; d) producing pipeline code in an optimal manner; and e) implementing various processing tasks using novel pixel shader techniques.
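The benefit of processing in linear form can be seen in a simple alpha-blend. The sketch below assumes the nonlinear data uses the sRGB transfer function, which the abstract does not specify; the function names are introduced here for illustration.

```python
def srgb_to_linear(c: float) -> float:
    """Convert a nonlinear sRGB channel value in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(l: float) -> float:
    """Convert a linear-light channel value in [0, 1] back to sRGB."""
    if l <= 0.0031308:
        return 12.92 * l
    return 1.055 * l ** (1.0 / 2.4) - 0.055


def blend_nonlinear(a: float, b: float, alpha: float) -> float:
    """Alpha-blend directly on the nonlinear values (the artifact-prone way)."""
    return (1.0 - alpha) * a + alpha * b


def blend_linear(a: float, b: float, alpha: float) -> float:
    """Alpha-blend in linear light, using floating point throughout."""
    la, lb = srgb_to_linear(a), srgb_to_linear(b)
    return linear_to_srgb((1.0 - alpha) * la + alpha * lb)
```

Blending black and white at 50% gives 0.5 when done on the nonlinear values but roughly 0.735 when done in linear light, which is why nonlinear blending tends to darken edges and gradients.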