摘要:
A video encoder/decoder utilizes a bistream syntax that provides an independently decodable, partial picture unit, which may be in the form of a unit containing one or more contiguous rows of macroblocks (called a slice). This slice layer provides a flexible combination of error-resilience and compression efficiency. The slice layer encodes an efficient addressing mechanism (e.g., a syntax element specifying a beginning macroblock row of the slice layer), as well as an efficient mechanism to optionally retransmit picture header information. The slice layer provides decoding and reconstruction independence by disabling all forms of prediction, overlap and loop-filtering across slice-boundaries. This permits a slice coded in intra-mode to be reconstructed error-free, irrespective of errors in other regions of the picture.
摘要:
An array of microphones placed on a mobile robot provides multiple channels of audio signals. A received set of audio signals is called an audio segment, which is divided into multiple frames. A phase analysis is performed on a frame of the signals from each pair of microphones. If both microphones are in an active state during the frame, a candidate angle is generated for each such pair of microphones. The result is a list of candidate angles for the frame. This list is processed to select a final candidate angle for the frame. The list of candidate angles is tracked over time to assist in the process of selecting the final candidate angle for an audio segment.
摘要:
Architecture for enhancing the compression (e.g., luma, chroma) of a video signal and improving the perceptual quality of the video compression schemes. The architecture operates to reshape the normal multimodal energy distribution of the input video signal to a new energy distribution. In the context of luma, the algorithm maps the black and white (or contrast) information of a picture to a new energy distribution. For example, the contrast can be enhanced in the middle range of the luma spectrum, thereby improving the contrast between a light foreground object and a dark background. At the same time, the algorithm reduces the bit-rate requirements at a particular quantization step size. The algorithm can be utilized also in post-processing to improve the quality of decoded video.
摘要:
Techniques and tools are described for compensating for rounding when estimating sample-domain distortion in the transform domain. For example, a video encoder estimates pixel-domain distortion in the transform domain for a block of transform coefficients after compensating for rounding in the DC coefficient of the block. In this way, the video encoder improves the accuracy of pixel-domain distortion estimation but retains the computational advantages of performing the estimation in the transform domain. Rounding compensation includes, for example, looking up an index (from a de-quantized transform coefficient) in a rounding offset table to determine a rounding offset, then adjusting the coefficient by the offset. Other techniques and tools described herein are directed to creating rounding offset tables and encoders that make encoding decisions after considering rounding effects that occur after an inverse frequency transform on de-quantized transform coefficient values.
摘要:
Embodiments for implementing a speech recognition system that includes a speech classifier ensemble are disclosed. In accordance with one embodiment, the speech recognition system includes a classifier ensemble to convert feature vectors that represent a speech vector into log probability sets. The classifier ensemble includes a plurality of classifiers. The speech recognition system includes a decoder ensemble to transform the log probability sets into output symbol sequences. The speech recognition system further includes a query component to retrieve one or more speech utterances from a speech database using the output symbol sequences.
摘要:
Various new and non-obvious apparatus and methods for using frame caching to improve packet loss recovery are disclosed. One of the disclosed embodiments is a method for using periodical and synchronized frame caching within an encoder and its corresponding decoder. When the decoder discovers packet loss, it informs the encoder which then generates a frame based on one of the shared frames stored at both the encoder and the decoder. When the decoder receives this generated frame it can decode it using its locally cached frame.
摘要:
Various new and non-obvious apparatus and methods for using frame caching to improve packet loss recovery are disclosed. One of the disclosed embodiments is a method for using periodical and synchronized frame caching within an encoder and its corresponding decoder. When the decoder discovers packet loss, it informs the encoder which then generates a frame based on one of the shared frames stored at both the encoder and the decoder. When the decoder receives this generated frame it can decode it using its locally cached frame.
摘要:
Various new and non-obvious apparatus and methods for using frame caching to improve packet loss recovery are disclosed. One of the disclosed embodiments is a method for using periodical and synchronized frame caching within an encoder and its corresponding decoder. When the decoder discovers packet loss, it informs the encoder which then generates a frame based on one of the shared frames stored at both the encoder and the decoder. When the decoder receives this generated frame it can decode it using its locally cached frame.
摘要:
Techniques and tools are described for compensating for rounding when estimating sample-domain distortion in the transform domain. For example, a video encoder estimates pixel-domain distortion in the transform domain for a block of transform coefficients after compensating for rounding in the DC coefficient of the block. In this way, the video encoder improves the accuracy of pixel-domain distortion estimation but retains the computational advantages of performing the estimation in the transform domain. Rounding compensation includes, for example, looking up an index (from a de-quantized transform coefficient) in a rounding offset table to determine a rounding offset, then adjusting the coefficient by the offset. Other techniques and tools described herein are directed to creating rounding offset tables and encoders that make encoding decisions after considering rounding effects that occur after an inverse frequency transform on de-quantized transform coefficient values.
摘要:
Techniques and tools for switching distortion metrics during motion estimation are described. For example, a video encoder determines a distortion metric selection criterion for motion estimation. The criterion can be based on initial results of the motion estimation. To evaluate the criterion, the encoder can compare the criterion to a threshold that depends on a current quantization parameter. The encoder selects between multiple available distortion metrics, which can include a sample-domain distortion metric (e.g., SAD) and a transform-domain distortion metric (e.g., SAHD). The encoder uses the selected distortion metric in the motion estimation. Selectively switching between SAD and SAHD provides rate-distortion performance superior to using only SAD or only SAHD. Moreover, due to the lower complexity of SAD, the computational complexity of motion estimation with SAD-SAHD switching is typically less than motion estimation that always uses SAHD.