Abstract:
Techniques and tools for signaling and using image tiling information (such as syntax elements relating to index tables and header size), signaling and using windowing information (such as techniques for using windowing parameters when rotating, cropping, or flipping images), and signaling and using alpha channel information are described.
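As a hedged illustration of the windowing aspect only, the sketch below shows how a display window expressed as (left, top, width, height) might be remapped when the decoded image is flipped horizontally or rotated 90 degrees clockwise. The coordinate conventions and function names are assumptions for illustration, not the signaled syntax.

    def remap_window_hflip(window, image_width):
        # After a horizontal flip, the window's left edge is measured
        # from the opposite side of the image.
        left, top, width, height = window
        return (image_width - left - width, top, width, height)

    def remap_window_rot90_cw(window, image_height):
        # A 90-degree clockwise rotation swaps the axes: the distance from
        # the window's bottom edge to the image bottom becomes the new left
        # offset, and width/height are exchanged.
        left, top, width, height = window
        return (image_height - top - height, left, height, width)

    # Example: a 100x60 window at (10, 20) in a 640x480 image.
    print(remap_window_hflip((10, 20, 100, 60), image_width=640))      # (530, 20, 100, 60)
    print(remap_window_rot90_cw((10, 20, 100, 60), image_height=480))  # (400, 10, 60, 100)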
Abstract:
Embodiments for implementing a speech recognition system that includes a speech classifier ensemble are disclosed. In accordance with one embodiment, the speech recognition system includes a classifier ensemble to convert feature vectors that represent a speech vector into log probability sets. The classifier ensemble includes a plurality of classifiers. The speech recognition system includes a decoder ensemble to transform the log probability sets into output symbol sequences. The speech recognition system further includes a query component to retrieve one or more speech utterances from a speech database using the output symbol sequences.
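A toy sketch of the pipeline described above, with invented interfaces: each ensemble member maps a feature vector to log probabilities over a small symbol set, each decoder reduces those log probability sets to an output symbol sequence, and a query step matches the sequences against an indexed utterance database. None of the names or the scoring model come from the patent; they are stand-ins to show the data flow.

    import math

    SYMBOLS = ["a", "b", "c"]

    def classifier(feature_vector, weights):
        # One member of the ensemble: a linear scorer followed by log-softmax.
        scores = [sum(w * f for w, f in zip(ws, feature_vector)) for ws in weights]
        log_norm = math.log(sum(math.exp(s) for s in scores))
        return [s - log_norm for s in scores]

    def decoder(log_prob_sets):
        # Trivial "decoder": pick the most likely symbol per frame.
        return [SYMBOLS[max(range(len(lp)), key=lp.__getitem__)] for lp in log_prob_sets]

    def query(database, symbol_sequences):
        # Return utterances whose index key matches a decoded sequence.
        keys = {"".join(seq) for seq in symbol_sequences}
        return [utt for key, utt in database if key in keys]

    features = [[0.9, 0.1], [0.2, 0.8]]                 # one feature vector per frame
    ensemble = [[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],   # two classifiers with
                [[0.8, 0.2], [0.1, 0.9], [0.4, 0.6]]]   # different weights
    log_prob_sets = [[classifier(f, w) for f in features] for w in ensemble]
    sequences = [decoder(lp) for lp in log_prob_sets]
    print(query([("ab", "utterance_17"), ("cc", "utterance_3")], sequences))  # ['utterance_17']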
Abstract:
An array of microphones placed on a mobile robot provides multiple channels of audio signals. A received set of audio signals is called an audio segment, which is divided into multiple frames. A phase analysis is performed on a frame of the signals from each pair of microphones. If both microphones are in an active state during the frame, a candidate angle is generated for each such pair of microphones. The result is a list of candidate angles for the frame. This list is processed to select a final candidate angle for the frame. The list of candidate angles is tracked over time to assist in the process of selecting the final candidate angle for an audio segment.
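For the per-pair phase analysis, a single candidate angle can be recovered from a phase difference at one frequency via the far-field time-difference-of-arrival relation. The sketch below is a minimal illustration with assumed microphone spacing and frequency, not the system's actual frame processing or tracking.

    import math

    SPEED_OF_SOUND = 343.0   # m/s

    def candidate_angle(phase_diff_rad, freq_hz, mic_spacing_m):
        # Phase difference -> time difference of arrival -> angle, using the
        # far-field relation: delay = spacing * sin(angle) / c.
        delay = phase_diff_rad / (2.0 * math.pi * freq_hz)
        sin_angle = max(-1.0, min(1.0, delay * SPEED_OF_SOUND / mic_spacing_m))
        return math.degrees(math.asin(sin_angle))

    # One candidate angle for one microphone pair in one frame; the list of
    # such candidates could then be reduced, e.g. by preferring the candidate
    # closest to the angle tracked over previous frames.
    print(candidate_angle(phase_diff_rad=0.5, freq_hz=1000.0, mic_spacing_m=0.1))  # about 15.8 degrees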
Abstract:
Techniques and tools for performing fading compensation in video processing applications are described. For example, during encoding, a video encoder performs fading compensation using fading parameters comprising a scaling parameter and a shifting parameter on one or more reference images. During decoding, a video decoder performs corresponding fading compensation on the one or more reference images.
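A minimal sketch of the fading compensation step, assuming a simple fixed-point layout in which the scaling parameter is expressed in 1/64 units and results are clipped to 8 bits. The actual parameter ranges and rounding are defined by the codec, not by this illustration.

    def fade_compensate(reference_samples, scale_q6, shift):
        def clip8(v):
            return max(0, min(255, v))
        # sample' = (scale * sample) / 64 + shift, with rounding and clipping.
        return [clip8((scale_q6 * s + 32) // 64 + shift) for s in reference_samples]

    # Scale by 0.75 (48/64) and shift by 16 before motion compensation.
    print(fade_compensate([0, 64, 128, 255], scale_q6=48, shift=16))  # [16, 64, 112, 207]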
Abstract:
A decoder processes a first bitstream element (e.g., a pull-down flag) in a first syntax layer (e.g., sequence layer or entry point layer) above frame layer in a bitstream for a video sequence, the bitstream comprising encoded source video having a source type (e.g., progressive or interlace). The decoder processes frame data in a second syntax layer (e.g., frame layer) of the bitstream for a frame (such as an interlaced frame or progressive frame, depending on source type, or a skipped frame) in the video sequence. The first bitstream element indicates whether a repeat-picture element (e.g., a repeat-frame element or a repeat-field element) is present or absent in the frame data in the second syntax layer.
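The parsing dependency can be illustrated as follows: a flag read once in the higher syntax layer determines whether a repeat-picture element is read in each frame's data. The element names, bit widths, and layout below are placeholders, not the actual bitstream syntax.

    class BitReader:
        def __init__(self, bits):
            self.bits = bits
            self.pos = 0
        def read(self, n=1):
            value = int(self.bits[self.pos:self.pos + n], 2)
            self.pos += n
            return value

    def parse_sequence_header(reader):
        return {"pulldown": reader.read(1), "interlace": reader.read(1)}

    def parse_frame_header(reader, seq):
        frame = {}
        if seq["pulldown"]:
            # The repeat-picture element is present only when the higher-layer
            # flag says so: a repeat-frame count for progressive sources, a
            # repeat-first-field flag for interlaced sources (assumed layout).
            if seq["interlace"]:
                frame["repeat_first_field"] = reader.read(1)
            else:
                frame["repeat_frame_count"] = reader.read(2)
        return frame

    reader = BitReader("10" + "01" + "11")   # sequence layer, then two frames
    seq = parse_sequence_header(reader)
    print(parse_frame_header(reader, seq), parse_frame_header(reader, seq))
    # {'repeat_frame_count': 1} {'repeat_frame_count': 3}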
Abstract:
Filter taps for filters are specified by filter coefficient parameters. The filter taps are greater in number than the coefficient parameters from which the filter taps are calculated. For example, two coefficient parameters are used to specify a four-tap filter. Filter information can be signaled in a bitstream, such as by signaling one or more family parameters for a filter family and, for each filter in a family, signaling one or more filter tap parameters from which filter taps can be derived. Family parameters can include a number of filters parameter, a resolution parameter, a scaling bits parameter, and/or a full integer position filter present parameter that indicates whether or not the filters include an integer position filter. Filter parameters can be signaled and used to determine coefficient parameters from which filter taps are calculated.
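As a sketch of deriving more taps than signaled parameters, assume (purely for illustration) that the two coefficient parameters give the first two taps of a four-tap filter, the remaining taps follow by mirror symmetry, and the scaling bits parameter gives the right shift applied after filtering. This is one plausible derivation rule, not the signaled syntax.

    def derive_taps(c0, c1):
        # Four taps from two coefficient parameters via mirror symmetry.
        return [c0, c1, c1, c0]

    def apply_filter(samples, pos, taps, scaling_bits):
        # Filter for the fractional position between samples[pos+1] and samples[pos+2].
        acc = sum(t * s for t, s in zip(taps, samples[pos:pos + 4]))
        rounding = 1 << (scaling_bits - 1)
        return (acc + rounding) >> scaling_bits

    taps = derive_taps(-4, 36)   # taps sum to 64, i.e. normalized for scaling_bits=6
    print(taps)                                                    # [-4, 36, 36, -4]
    print(apply_filter([10, 20, 30, 40, 50], 0, taps, scaling_bits=6))  # 25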
Abstract:
Techniques and tools are described for flexible range reduction of samples of video. For example, an encoder signals a first set of one or more syntax elements for range reduction of luma samples and signals a second set of one or more syntax elements for range reduction of chroma samples. The encoder selectively scales down the luma samples and chroma samples in a manner consistent with the first syntax element(s) and second syntax element(s), respectively. Or, an encoder signals range reduction syntax element(s) in an entry point header for an entry point segment, where the syntax element(s) apply to pictures in the entry point segment. If range reduction is used for the pictures, the encoder scales down samples of the pictures. Otherwise, the encoder skips the scaling down. A decoder performs corresponding parsing and scaling up operations.
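The scale-down/scale-up idea can be sketched as halving the distance of each sample from 128 at the encoder and doubling it back at the decoder; the exact fixed-point arithmetic, rounding, and clipping defined by the bitstream are not reproduced here.

    def scale_down(samples):
        # Encoder side: halve the sample's offset from mid-gray (128).
        return [((s - 128) >> 1) + 128 for s in samples]

    def scale_up(samples):
        # Decoder side: double the offset back and clip to the 8-bit range.
        def clip8(v):
            return max(0, min(255, v))
        return [clip8(((s - 128) << 1) + 128) for s in samples]

    luma = [0, 100, 128, 200, 255]
    reduced = scale_down(luma)          # [64, 114, 128, 164, 191]
    print(reduced, scale_up(reduced))   # scale_up gives [0, 100, 128, 200, 254]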
Abstract:
A decoder receives a field start code for an entry point key frame. The field start code indicates a second coded interlaced video field in the entry point key frame following a first coded interlaced video field in the entry point key frame and indicates a point to begin decoding of the second coded interlaced video field. The first coded interlaced video field is a predicted field, and the second coded interlaced video field is an intra-coded field. The decoder decodes the second field without decoding the first field. The field start code can be followed by a field header. The decoder can receive a frame header for the entry point key frame. The frame header may comprise a syntax element indicating a frame coding mode for the entry point key frame and/or a syntax element indicating field types for the first and second coded interlaced video fields.
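The random-access behavior can be sketched as scanning the frame's data for a field start code and handing only the second (intra-coded) field to the field decoder. The start code value and the decode_field helper below are illustrative placeholders, not the real syntax.

    FIELD_START_CODE = bytes([0x00, 0x00, 0x01, 0x0C])   # placeholder value

    def decode_from_second_field(buffer, decode_field):
        offset = buffer.find(FIELD_START_CODE)
        if offset < 0:
            raise ValueError("no field start code in this frame's data")
        # Skip the first (predicted) field entirely and decode the second
        # (intra-coded) field, which begins after the start code.
        return decode_field(buffer[offset + len(FIELD_START_CODE):])

    print(decode_from_second_field(
        b"first-field-data" + FIELD_START_CODE + b"second-field-data",
        decode_field=lambda payload: payload))   # b'second-field-data'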
Abstract:
Described tools and techniques relate to signaling for DC coefficients at small quantization step sizes. The techniques and tools can be used in combination or independently. For example, a tool such as a video encoder or decoder processes a VLC that indicates a DC differential for a DC coefficient, an FLC that indicates a value refinement for the DC differential, and a third code that indicates the sign of the DC differential. Even at the small quantization step sizes, the tool uses a VLC table designed for DC differentials of DC coefficients at quantization step sizes above the small quantization step sizes. The FLCs for DC differentials have lengths that vary depending on quantization step size.
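How the three codes might combine at the decoder can be sketched as below; the VLC table, the FLC lengths per quantization step size, and the bit-reading helpers are invented for illustration and do not match the actual tables.

    def decode_dc_differential(read_vlc, read_bits, quant_step):
        base = read_vlc()                        # DC differential from the VLC table
        if base != 0:
            # Extra precision at small quantization step sizes: an FLC whose
            # length depends on the step size (assumed: 2 bits at step 1,
            # 1 bit at step 2, none otherwise).
            flc_len = {1: 2, 2: 1}.get(quant_step, 0)
            refinement = read_bits(flc_len) if flc_len else 0
            value = (base << flc_len) | refinement
            sign = read_bits(1)                  # third code: the sign
            return -value if sign else value
        return 0

    stream = iter([3, 1])                        # FLC refinement value, then sign bit
    print(decode_dc_differential(
        read_vlc=lambda: 5,
        read_bits=lambda n: next(stream),
        quant_step=1))                           # -((5 << 2) | 3) = -23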