Abstract:
An improved system for an interactive voice recognition system (400) includes a voice prompt generator (401) for generating a voice prompt in a first frequency band (501). A speech detector (406) detects presence of speech energy in a second frequency band (502). The first and second frequency bands (501, 502) are essentially conjugate frequency bands. A voice data generator (412) generates voice data based on an output of the voice prompt generator (401) and audible speech of a voice response generator (402). A control signal (422) controls the voice prompt generator (401) based on whether the speech detector (406) detects presence of speech energy in the second frequency band (502). A back end (405) of the interactive voice recognition system (400) is configured to operate on an extracted front end voice feature based on whether the speech detector (406) detects presence of speech energy in the second frequency band (502).
Abstract:
Techniques for intensity compensation in video processing are provided. In one configuration, a wireless communication device (e.g., a cellular phone) compliant with the VC1-SMPTE standard comprises a processor that is configured to execute instructions operative to reconstruct reference frames from a received video bitstream. A non-intensity-compensated copy of a reference frame of the bitstream is stored in a memory of the device and used for defining the displayable images and for on-the-fly generation of a stream of intensity-compensated pixels to perform motion compensation calculations for frames of the video bitstream.
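The idea of keeping only an uncompensated reference frame and compensating pixels as they are fetched can be sketched as follows. This is a minimal illustration, not the method claimed in the abstract: the linear scale-and-shift mapping, the 1/64 fixed-point scale, and the names `build_lut` and `compensated_pixels` are all assumptions made for the example, and the formula is a generic one rather than the one defined by the VC-1 standard.

```python
def build_lut(scale_q6, shift):
    """Build a 256-entry lookup table for a linear intensity mapping.

    scale_q6 is a hypothetical fixed-point scale in units of 1/64;
    results are clipped to the 8-bit range.
    """
    lut = []
    for p in range(256):
        value = ((scale_q6 * p + 32) >> 6) + shift
        lut.append(min(255, max(0, value)))
    return lut


def compensated_pixels(reference, lut, x0, y0, width, height):
    """Yield intensity-compensated pixels for one block on the fly.

    The stored reference frame is only read, never modified, so the same
    copy can still be used for display.
    """
    for y in range(y0, y0 + height):
        for x in range(x0, x0 + width):
            yield lut[reference[y][x]]
```

Because the generator only reads `reference`, a single stored frame serves both display and motion compensation, which is the memory saving the abstract points to.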
Abstract:
Motion estimation in video compression systems. A programmable motion estimator may be used to estimate a motion vector for a macroblock in a current frame by searching for a matching macroblock in a previous frame. A controller may be used to program the motion estimator to perform a particular search.
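As a rough software analogy for a programmable estimator, the sketch below treats the "program" as a list of candidate offsets that a controller hands to a block-matching routine. The sum-of-absolute-differences (SAD) criterion, the two search patterns, and all function names are assumptions for illustration; the abstract does not specify them.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a displaced reference block."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )


def full_pattern(search_range):
    """Hypothetical search program: every offset in a square window."""
    return [(dx, dy)
            for dy in range(-search_range, search_range + 1)
            for dx in range(-search_range, search_range + 1)]


# Hypothetical cheaper program: the centre and its four direct neighbours.
CROSS_PATTERN = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]


def estimate_motion(cur, ref, bx, by, size, pattern):
    """Return (motion_vector, cost) for the block at (bx, by) using the given pattern."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dx, dy in pattern:
        x0, y0 = bx + dx, by + dy
        # Skip candidates whose block would fall outside the reference frame.
        if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
            continue
        cost = sad(cur, ref, bx, by, dx, dy, size)
        if best_cost is None or cost < best_cost:
            best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

A controller would then choose, for example, `full_pattern(2)` when an exhaustive search is affordable and `CROSS_PATTERN` when it is not, without changing the estimator itself.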
Abstract:
This disclosure describes rate control techniques that can improve video coding based on a “two-pass” approach. The first pass codes a video sequence using a first set of quantization parameters (QPs) for the purpose of estimating rate-distortion characteristics of the video sequence based on the statistics of the first pass. A second set of QPs can then be defined for a second coding pass. The estimated rate-distortion characteristics of the first pass are used to select QPs for the second pass in a manner that minimizes quality fluctuation between the frames of the video sequence. Furthermore, selection of the second set of QPs may also substantially maximize quality of the frames at the substantially minimized quality fluctuation in order to achieve low average frame distortion with the minimized quality fluctuation.
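One simple way to picture the second-pass QP selection is a search for the lowest common distortion cap that still fits a bit budget. The sketch below is an assumption-heavy illustration, not the disclosed algorithm: it supposes the first pass yields, per frame, a table mapping each QP to estimated bits and distortion, that bits fall and distortion rises as QP grows, and that a total `bit_budget` is given. The function name and data layout are hypothetical.

```python
def select_second_pass_qps(rd_tables, bit_budget):
    """Pick one QP per frame so that all frames stay under a common distortion cap.

    rd_tables: one dict per frame, mapping QP -> (estimated_bits, estimated_distortion).
    Returns a list of QPs, or None if no cap fits the budget.
    """
    # Candidate caps: every distortion value seen in the first-pass estimates.
    caps = sorted({d for table in rd_tables for (_, d) in table.values()})
    for cap in caps:  # lowest cap first, i.e. best quality first
        choice = []
        for table in rd_tables:
            feasible = [qp for qp, (_, d) in table.items() if d <= cap]
            if not feasible:
                break  # this frame cannot meet the cap at any QP
            # Coarsest QP that still meets the cap; assumed to cost the fewest bits.
            choice.append(max(feasible))
        else:
            total_bits = sum(table[qp][0] for table, qp in zip(rd_tables, choice))
            if total_bits <= bit_budget:
                return choice
    return None
```

Holding every frame near the same cap keeps frame-to-frame quality variation small, and trying the lowest cap first favors low distortion among the choices that fit the budget.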
Abstract:
This disclosure describes electronic video image stabilization techniques for imaging and video devices. The techniques involve determining motion and spatial statistics for individual macroblocks of a frame, and determining a global motion vector for the frame based on the statistics of each of the macroblocks. In one embodiment, a method of performing electronic image stabilization includes performing spatial estimation on each of a plurality of macroblocks within a frame of an image to obtain spatial statistics for each of the macroblocks, performing motion estimation on each of the plurality of macroblocks to obtain motion statistics for each of the macroblocks, integrating the spatial statistics and the motion statistics of each of the macroblocks to determine a global motion vector for the frame, and offsetting the image with respect to a reference window according to the global motion vector.
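The integration step can be illustrated with an activity-weighted average of per-macroblock motion vectors. This is only a sketch under stated assumptions: the abstract does not say how the statistics are combined, so the weighting scheme, the use of a single "activity" number as the spatial statistic, the sign convention of the window offset, and the function names are all hypothetical.

```python
def global_motion_vector(blocks):
    """Combine per-macroblock statistics into one frame-level motion vector.

    blocks: list of (motion_vector, activity) pairs, where activity is a
    non-negative spatial statistic (for example, local texture strength).
    Flat blocks (activity 0) contribute nothing, since their motion
    estimates tend to be unreliable.
    """
    total = sum(activity for _, activity in blocks)
    if total == 0:
        return (0, 0)
    gx = sum(mv[0] * activity for mv, activity in blocks) / total
    gy = sum(mv[1] * activity for mv, activity in blocks) / total
    return (round(gx), round(gy))


def offset_window(origin, gmv):
    """Shift the reference window origin by the global motion vector."""
    return (origin[0] + gmv[0], origin[1] + gmv[1])
```

For instance, a flat block with a stray vector such as `((-5, 4), 0)` has no influence on the result, while textured blocks dominate the estimate.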
Abstract:
A method for processing digitized speech signals by analyzing redundant features to provide more robust voice recognition. A primary transformation is applied to a source speech signal to extract primary features therefrom. Each of at least one secondary transformation is applied to the source speech signal or extracted primary features to yield at least one set of secondary features statistically dependent on the primary features. At least one predetermined function is then applied to combine the primary features with the secondary features. A recognition answer is generated by pattern matching this combination against predetermined voice recognition templates.
Abstract:
The disclosure describes FGS video coding techniques that use cycle-aligned fragments (CAFs). The techniques may perform cycle-based coding of FGS video data block coefficients and syntax elements, and encapsulate cycles in fragments for transmission. The fragments may be cycle-aligned such that a start of a payload of each of the fragments substantially coincides with a start of one of the cycles. In this manner, cycles can be readily accessed via individual fragments. Some cycles may be controlled with a vector mode to scan to a predefined position within a block before moving to another block. In this manner, the number of cycles can be reduced, reducing the number of fragments and associated overhead. The CAFs may be entropy coded independently of one another so that each fragment may be readily accessed and decoded without waiting for decoding of other fragments. Independent entropy coding may permit parallel decoding and simultaneous processing of fragments.
Abstract:
An embodiment is directed to a method for selecting a predictive macroblock partition from a plurality of candidate macroblock partitions in motion estimation and compensation in a video encoder. The method includes determining a bit rate signal for each of the candidate macroblock partitions, generating a distortion signal for each of the candidate macroblock partitions, calculating a cost for each of the candidate macroblock partitions based on respective bit rate and distortion signals to produce a plurality of costs, and determining a motion vector from the costs. The motion vector designates the predictive macroblock partition.
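The cost-based selection can be sketched with a Lagrangian cost of the form distortion plus lambda times bits, a form commonly used in video encoders. The abstract only says the cost is based on the bit rate and distortion signals, so this particular formula, the `Candidate` record, and the `lam` parameter are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    partition: str          # e.g. "16x16" or "8x8" (labels are illustrative)
    motion_vector: tuple    # (dx, dy) found for this partition
    bits: int               # bit rate signal: estimated bits to code it
    distortion: int         # distortion signal: e.g. a SAD or SSD value


def select_partition(candidates, lam):
    """Return (best_candidate, costs) using the assumed cost D + lam * R."""
    costs = [c.distortion + lam * c.bits for c in candidates]
    best_index = min(range(len(candidates)), key=costs.__getitem__)
    return candidates[best_index], costs
```

The motion vector of the returned candidate identifies the chosen partition. A larger `lam` penalizes bit rate more heavily and so tends to favor coarser partitions.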
Abstract:
This disclosure describes identifying key frames from a sequence of video frames. A first set of information generated by operating on uncompressed data is accessed. A second set of information generated by compressing the data is also accessed. The first and second sets of information are used to identify key frames from the video frames.
Abstract:
The disclosure is directed to scalable motion estimation techniques for video encoding. According to the motion estimation techniques, a motion vector search is scaled according to the computing resources available. For example, the extent of the search may be dynamically adjusted according to available computing resources. A more extensive search may be performed when computing resources permit. When computing resources are scarce, the search may be more limited. In this manner, the scalable motion estimation technique balances video quality, computing overhead and power consumption. The scalable motion estimation technique may search a series of concentric regions, starting at a central anchor point and moving outward across several concentric regions. The number of concentric regions searched for a particular video frame or macroblock is adjusted according to computing resources. Upon searching the anchor point, the search proceeds outward to the next concentric region, and continues as permitted by available computing resources.
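The concentric search can be sketched as rings of candidate points around the anchor, with the number of rings limited by a budget of cost evaluations. The square ring shape, the use of an evaluation count as a stand-in for "available computing resources", and the names `ring_offsets` and `scalable_search` are assumptions for illustration only.

```python
def ring_offsets(radius):
    """Offsets on the square ring at the given radius around the anchor (radius >= 1)."""
    return [(dx, dy)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if max(abs(dx), abs(dy)) == radius]


def scalable_search(cost_fn, anchor, max_rings, budget):
    """Search outward from the anchor until the evaluation budget runs out.

    cost_fn maps a candidate point (x, y) to a matching cost, e.g. a SAD.
    Returns (best_point, best_cost, evaluations_used).
    """
    # The anchor point itself is always searched first.
    best_point, best_cost = anchor, cost_fn(anchor)
    evaluations = 1
    for radius in range(1, max_rings + 1):
        ring = ring_offsets(radius)
        # Stop before a ring that the remaining budget cannot cover in full.
        if evaluations + len(ring) > budget:
            break
        for dx, dy in ring:
            point = (anchor[0] + dx, anchor[1] + dy)
            cost = cost_fn(point)
            evaluations += 1
            if cost < best_cost:
                best_point, best_cost = point, cost
    return best_point, best_cost, evaluations
```

An encoder could recompute `budget` per frame or per macroblock from its current load, so the same routine performs a wide search when resources permit and a narrow one when they are scarce.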