Abstract:
In an image/video encoding and decoding system employing an artifact evaluator, a method and/or apparatus for processing video blocks comprises a decoder operable to synthesize an un-filtered reconstructed video block or frame, and an artifact filter operable to receive the un-filtered reconstructed video block or frame and generate a filtered reconstructed video block or frame. A memory buffer is operable to store either the filtered reconstructed video block or frame or the un-filtered reconstructed video block or frame, and an artifact evaluator is operable to update the memory buffer after evaluating and determining which of the filtered video block or frame or the un-filtered video block or frame yields better image/video quality.
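The selection step described above can be sketched as follows. This is a minimal illustration, not the patented method: it assumes the evaluator has access to the original block and uses mean squared error as the quality metric, and `deblock_stub`, `mse`, and `update_buffer` are hypothetical names introduced only for this example.

```python
def mse(a, b):
    """Mean squared error between two equally sized blocks (lists of rows)."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count


def deblock_stub(block):
    """Hypothetical stand-in for an artifact filter: 3-tap horizontal smoothing."""
    out = []
    for row in block:
        n = len(row)
        out.append([
            (row[max(i - 1, 0)] + 2 * row[i] + row[min(i + 1, n - 1)] + 2) // 4
            for i in range(n)
        ])
    return out


def update_buffer(original, unfiltered, artifact_filter=deblock_stub):
    """Return the label and block that the memory buffer should hold.

    Assumption: quality is judged by MSE against the original block; the
    abstract does not name a specific metric.
    """
    filtered = artifact_filter(unfiltered)
    if mse(original, filtered) < mse(original, unfiltered):
        return "filtered", filtered
    return "unfiltered", unfiltered
```

For example, `update_buffer([[10, 10, 10, 10]], [[10, 14, 6, 10]])` keeps the filtered block because smoothing moves it closer to the flat original, while a block that already matches the original is stored un-filtered.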
Abstract:
Video encoding techniques are described. In one example, a video encoding technique includes identifying a pixel location associated with a video block in a search space based on motion vectors associated with a set of video blocks within a video frame to be encoded, wherein the video blocks in the set are spatially located at defined locations relative to a current video block of the video frame to be encoded. A motion estimation routine can then be initialized for the current video block at the identified pixel location. By identifying a pixel location associated with a video block in a search space based on motion vectors associated with a set of video blocks within a video frame, the phenomenon of spatial redundancy can be more readily exploited to accelerate and improve the encoding process.
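One way to picture the initialization step is shown below. This is a sketch under stated assumptions: the abstract only says the location is based on the motion vectors of a set of neighboring blocks, so the component-wise median used here is an illustrative choice, and `median_mv` and `initial_search_location` are hypothetical names.

```python
def median_mv(neighbor_mvs):
    """Component-wise median of a list of (dx, dy) motion vectors.

    Assumption: the median is used to combine neighbors; the abstract does
    not specify the combining rule.
    """
    xs = sorted(mv[0] for mv in neighbor_mvs)
    ys = sorted(mv[1] for mv in neighbor_mvs)
    mid = len(neighbor_mvs) // 2
    return xs[mid], ys[mid]


def initial_search_location(block_xy, neighbor_mvs):
    """Pixel location in the search space where motion estimation starts."""
    px, py = median_mv(neighbor_mvs)
    return block_xy[0] + px, block_xy[1] + py


# Current block at (32, 16); neighbors, e.g. left, top, and top-right blocks.
start = initial_search_location((32, 16), [(2, -1), (4, 0), (3, 5)])
```

Starting the search near where neighboring blocks moved tends to shorten the search when motion is spatially coherent, which is the spatial redundancy the abstract refers to.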
Abstract:
A system and method for forming a segmented speech signal from an input speech signal having a plurality of frames. The input speech signal is converted from a time domain signal to a frequency domain signal having a plurality of speech frames, wherein each speech frame in the frequency domain signal is represented by at least one spectral value associated with the speech frame. A spectral difference value is then determined for each pair of adjacent frames in the frequency domain signal, wherein the spectral difference value for each pair of adjacent frames is representative of a difference between the at least one spectral value associated with each frame in the pair of adjacent frames. An initial cluster boundary is set between each pair of adjacent frames in the frequency domain signal, and a variance value is assigned to each cluster in the frequency domain signal, wherein the variance value for each cluster is equal to one of the determined spectral difference values. Next, a plurality of cluster merge parameters is calculated, wherein each of the cluster merge parameters is associated with a pair of adjacent clusters in the frequency domain signal. A minimum cluster merge parameter is selected from the plurality of cluster merge parameters. A merged cluster is then formed by canceling a cluster boundary between the clusters associated with the minimum merge parameter and assigning a merged variance value to the merged cluster, wherein the merged variance value is representative of the variance values assigned to the clusters associated with the minimum merge parameter. The process is repeated in order to form a plurality of merged clusters, and the segmented speech signal is formed in accordance with the plurality of merged clusters.
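The merge loop described above can be sketched as a bottom-up clustering of adjacent frames. The abstract does not define the cluster merge parameter or the merged variance, so this example substitutes a Ward-style cost (the increase in within-cluster squared deviation) and uses one scalar spectral value per frame; both choices, and the names `sse` and `segment`, are assumptions made for illustration.

```python
def sse(values):
    """Sum of squared deviations from the mean of a cluster."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def segment(frames, num_segments):
    """Merge adjacent clusters until num_segments remain.

    frames: one scalar spectral value per frame (a simplifying assumption).
    Start with a boundary between every pair of adjacent frames, then
    repeatedly cancel the boundary with the smallest merge cost.
    """
    clusters = [[v] for v in frames]
    while len(clusters) > num_segments:
        # Assumed merge parameter: growth in squared deviation if merged.
        costs = [
            sse(clusters[i] + clusters[i + 1])
            - sse(clusters[i])
            - sse(clusters[i + 1])
            for i in range(len(clusters) - 1)
        ]
        i = costs.index(min(costs))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters


segments = segment([1.0, 1.1, 0.9, 5.0, 5.2, 9.0], 3)
```

Because only adjacent clusters are ever merged, each resulting segment is a contiguous run of frames, which is what a segmented speech signal requires.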
Abstract:
A distributed voice recognition system includes a digital signal processor (DSP), a nonvolatile storage medium, and a microprocessor. The DSP is configured to extract parameters from digitized input speech samples and provide the extracted parameters to the microprocessor. The nonvolatile storage medium contains a database of speech templates. The microprocessor is configured to read the contents of the nonvolatile storage medium, compare the parameters with the contents, and select a speech template based upon the comparison. The nonvolatile storage medium may be a flash memory. The DSP may be a vocoder. If the DSP is a vocoder, the parameters may be diagnostic data generated by the vocoder. The distributed voice recognition system may reside on an application specific integrated circuit (ASIC).
Abstract:
A method and apparatus for communicating both voice and control data between a communication device (such as a cellular phone) and an external accessory (such as a hands-free kit) over a data bus. The method includes formatting a sequence of bits into a repeating sequence of first time slots and second time slots, transmitting the voice data in the first time slot, and transmitting the control data in the second time slot. Notably, a first bit of each of the second time slots comprises a clock bit that alternates between a high value and a low value (e.g. a ‘1’ or a ‘0’) as between consecutive second time slots.
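The slot layout can be illustrated with a small bit-packing sketch. The slot widths (16 bits each) and the starting clock value are assumptions chosen for the example, not values given in the abstract, and `to_bits` and `build_frames` are hypothetical helpers.

```python
VOICE_BITS = 16    # assumed width of the first (voice) time slot
CONTROL_BITS = 16  # assumed width of the second (control) slot, clock bit included


def to_bits(value, width):
    """Most-significant-bit-first list of bits for an unsigned integer."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def build_frames(voice_words, control_words):
    """Interleave voice and control slots into one bit sequence.

    The first bit of every control slot is a clock bit that toggles between
    consecutive control slots.
    """
    bits = []
    clock = 1  # assumed starting value
    for voice, control in zip(voice_words, control_words):
        bits.extend(to_bits(voice, VOICE_BITS))
        bits.append(clock)
        bits.extend(to_bits(control, CONTROL_BITS - 1))
        clock ^= 1
    return bits


stream = build_frames([0xFFFF, 0x0000, 0x1234], [0, 0, 0])
```

A receiver can use the toggling bit to locate control-slot boundaries: in this layout the bit at offset 16 of each 32-bit frame alternates 1, 0, 1, and so on.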
Abstract:
A mobile user interface suitable for mobile computing devices uses device position/orientation in real space to select a portion of content that is displayed. Content (e.g., documents, files or a desktop) is presumed fixed in virtual space with the mobile user interface displaying a portion of the content as if viewed through a camera or magnifying glass. Data from motion, distance or position sensors are used to determine the relative position/orientation of the device with respect to the content to select the portion for display. Content elements can be selected by centering the display on the desired portion, obviating the need for cursors and pointing devices (e.g., mouse or touchscreen). Magnification can be manipulated by moving the device away from or towards the user. 3-D content viewing may be enabled by sensing the device orientation and displaying content that is above or below the display in 3-D virtual space.
Abstract:
This disclosure is directed to encoding techniques that can be used to improve encoding of digital video data. The techniques can be implemented by an encoder of a digital video device in order to reduce the number of computations and possibly reduce power consumption during video encoding. More specifically, video encoding techniques are described which utilize one or more programmable thresholds in order to terminate the execution of various computations when the computations would be unlikely to improve the encoding. By terminating computations prematurely, the amount of processing required for video encoding can be reduced, and power can be conserved.
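A common instance of threshold-based termination is an early exit from a block-matching cost computation. The sketch below assumes the computation is a sum of absolute differences (SAD) and that the threshold tightens to the best cost found so far; the abstract names neither, and `sad_with_early_exit` and `best_match` are hypothetical names.

```python
def sad_with_early_exit(current, candidate, threshold):
    """Sum of absolute differences, abandoned once it reaches the threshold.

    Returns the SAD, or None if the computation was terminated early because
    the candidate could no longer beat the threshold.
    """
    total = 0
    for row_c, row_r in zip(current, candidate):
        for pc, pr in zip(row_c, row_r):
            total += abs(pc - pr)
        if total >= threshold:
            return None  # remaining rows cannot improve the result
    return total


def best_match(current, candidates, initial_threshold):
    """Index and SAD of the best candidate below the programmable threshold."""
    best_index, best_sad = None, initial_threshold
    for index, cand in enumerate(candidates):
        sad = sad_with_early_exit(current, cand, best_sad)
        if sad is not None:
            best_index, best_sad = index, sad
    return best_index, best_sad


block = [[1, 2], [3, 4]]
candidates = [[[5, 5], [5, 5]], [[1, 2], [3, 5]], [[9, 9], [9, 9]]]
result = best_match(block, candidates, initial_threshold=100)
```

Here the third candidate is abandoned after its first row, since its partial cost already exceeds the best match found so far.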
Abstract:
A method and apparatus for eighth-rate random number generation for speech coders includes a random number generator configured to generate values of a first random variable. A lookup table is used to store values of a second random variable. The lookup table is addressed with the values of the first random variable. The second random variable is an inverse transform of a cumulative distribution function of the first random variable. A codec encodes input silence frames with the values of the first and second random variables, and regenerates the silence frames with the values of the first and second random variables. The speech coder may be an enhanced variable rate coder, and the silence frames may be encoded at eighth rate. The random variables are advantageously Gaussian random variables with values that are uniformly distributed between zero and one.
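The table-lookup idea can be sketched as inverse transform sampling. This example follows one reading of the abstract: a first variable that is uniform on [0, 1) addresses a table holding the inverse of the standard normal cumulative distribution function, yielding Gaussian values. The table size, the generator constants, and the names `Lcg`, `GAUSS_TABLE`, and `gaussian_samples` are assumptions for illustration only.

```python
from statistics import NormalDist

TABLE_SIZE = 256  # assumed table size

# Entry k holds the standard-normal quantile at the midpoint of the k-th
# equal-width bin of (0, 1), i.e. a sampled inverse CDF.
GAUSS_TABLE = [
    NormalDist().inv_cdf((k + 0.5) / TABLE_SIZE) for k in range(TABLE_SIZE)
]


class Lcg:
    """16-bit linear congruential generator (illustrative constants)."""

    def __init__(self, seed):
        self.state = seed & 0xFFFF

    def next_uniform(self):
        """Next value of the first random variable, uniform on [0, 1)."""
        self.state = (521 * self.state + 259) & 0xFFFF
        return self.state / 65536.0


def gaussian_samples(seed, count):
    """Address the lookup table with uniform values to get Gaussian values."""
    rng = Lcg(seed)
    return [
        GAUSS_TABLE[int(rng.next_uniform() * TABLE_SIZE)] for _ in range(count)
    ]
```

Because the generator is deterministic, an encoder and a decoder that share the seed regenerate the same sequence, so the noise itself never has to be transmitted.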
Abstract:
A method and apparatus for implementing a vocoder in an application specific integrated circuit (ASIC) are provided. The apparatus contains a DSP core that performs computations in accordance with a reduced instruction set computer (RISC) architecture. The circuit further includes a specifically designed slave processor to the DSP core, referred to as the minimization processor. The apparatus further comprises specifically designed block normalization circuitry.