摘要:
A video transcoder for converting an encoded input video bit-stream having one spatial resolution into an encoded output video bit-stream having a lower spatial resolution, wherein learned statistics of intra-mode transcoding are used to constrain the search of intra modes for the output video bit-stream. The statistics of intra-mode transcoding can be gathered, e.g., by applying brute-force downsizing to a training set of video frames and then analyzing the observed intra-mode transcoding patterns to determine a transition-probability matrix for use during normal operation of the transcoder. The transition-probability matrix enables the transcoder to select appropriate intra modes for the output video bit-stream without performing a corresponding exhaustive full search, which advantageously reduces the computational complexity and processor load compared to those of a comparably performing prior-art video transcoder.
摘要:
A video transcoder for converting an encoded input video bit-stream having one spatial resolution into an encoded output video bit-stream having a lower spatial resolution, wherein learned statistics of intra-mode transcoding are used to constrain the search of intra modes for the output video bit-stream. The statistics of intra-mode transcoding can be gathered, e.g., by applying brute-force downsizing to a training set of video frames and then analyzing the observed intra-mode transcoding patterns to determine a transition-probability matrix for use during normal operation of the transcoder. The transition-probability matrix enables the transcoder to select appropriate intra modes for the output video bit-stream without performing a corresponding exhaustive full search, which advantageously reduces the computational complexity and processor load compared to those of a comparably performing prior-art video transcoder.
摘要:
A video transcoder for converting an encoded input video bit-stream having one spatial resolution into an encoded output video bit-stream having a lower spatial resolution, wherein motion-vector dispersion observed at the higher spatial resolution is quantified and used to configure the motion-vector search at the lower spatial resolution. For example, for video-frame areas characterized by relatively low motion-vector dispersion values, the motion-vector search may be performed over a relatively small vector space and with the use of fewer search patterns and/or hierarchical search levels. These constraints enable the transcoder to find appropriate motion vectors for inter-prediction coding without having to perform an exhaustive motion-vector search for these video-frame areas, which advantageously reduces the computational complexity and processor load compared to those of a comparably performing prior-art video transcoder.
摘要:
A video transcoder for converting an encoded input video bit-stream having one spatial resolution into an encoded output video bit-stream having a lower spatial resolution, wherein motion-vector dispersion observed at the higher spatial resolution is quantified and used to configure the motion-vector search at the lower spatial resolution. For example, for video-frame areas characterized by relatively low motion-vector dispersion values, the motion-vector search may be performed over a relatively small vector space and with the use of fewer search patterns and/or hierarchical search levels. These constraints enable the transcoder to find appropriate motion vectors for inter-prediction coding without having to perform an exhaustive motion-vector search for these video-frame areas, which advantageously reduces the computational complexity and processor load compared to those of a comparably performing prior-art video transcoder.
摘要:
A video transcoder for converting a compressed input video bit-stream having one spatial resolution into a compressed output video bit-stream having a different spatial resolution in a manner that enables the transcoder to dynamically change the amount of computational resources allocated to the conversion process. In one embodiment, the video transcoder has a plurality of configurable processing paths whose configuration determines the amount of allocated computational resources. Exemplary processing-path configuration changes may include, but are not limited to engaging or disengaging a processing path, redirecting a data flow from flowing through one processing path to flowing through another processing path, and attaching or detaching one or more processing modules to an engaged processing path. The capability to make these and other configuration changes enables the video transcoder to adjust the computational complexity and picture quality on the fly, without interrupting the video sequence in the output video bit-stream.
摘要:
A video transcoder for converting a compressed input video bit-stream having one spatial resolution into a compressed output video bit-stream having a different spatial resolution in a manner that enables the transcoder to dynamically change the amount of computational resources allocated to the conversion process. In one embodiment, the video transcoder has a plurality of configurable processing paths whose configuration determines the amount of allocated computational resources. Exemplary processing-path configuration changes may include, but are not limited to engaging or disengaging a processing path, redirecting a data flow from flowing through one processing path to flowing through another processing path, and attaching or detaching one or more processing modules to an engaged processing path. The capability to make these and other configuration changes enables the video transcoder to adjust the computational complexity and picture quality on the fly, without interrupting the video sequence in the output video bit-stream.
摘要:
A video transcoder for converting a compressed input video bit-stream having one spatial resolution into a compressed output video bit-stream having a different spatial resolution using a plurality of resizing channels. The transcoder has a kernel that partially decodes the compressed input video bit-stream to generate partially decoded video data. The data segments corresponding to picture portions that have both intra- and inter-predicted blocks in close spatial proximity to one another are applied to a mixed-mode resizing channel that is specifically designed for processing such data segments. For each received data segment, the control logic of the channel selects, from a bank of pre-configured resizers, a resizer that is deemed to be most suitable for resizing the image portion represented by that data segment in a computationally efficient manner. The data segment is processed in the selected resizer to generate the corresponding resized-image data. The resized-image data generated by the mixed-mode resizing channel are combined with the resized data generated by other resizing channels of the transcoder and then re-encoded to generate the compressed output video bit-stream.
摘要:
In one embodiment, an acoustic echo control (AEC) module receives an outgoing signal and an incoming signal, which, at various times, contains acoustic echo corresponding to the outgoing signal. The AEC module has a delay estimation block that estimates, in the time domain, the echo delay using an adaptive filtering technique. This delay estimation is used to align samples of the incoming signal having acoustic echo with the corresponding samples of the outgoing signal from which the acoustic echo originated. The AEC module determines whether or not samples of the incoming signal contain acoustic echo based on the aligned outgoing signal, and the determinations are applied to a hangover counter. The AEC module then suppresses acoustic echo in the incoming signal and adds comfort noise to the incoming signal. The amount of echo suppression performed is gradually increased or decreased based on comparisons of the counter to a hangover threshold.
摘要:
In one embodiment, an acoustic echo control (AEC) module receives an outgoing signal and an incoming signal, which, at various times, contains acoustic echo corresponding to the outgoing signal. The AEC module has a delay estimation block that estimates, in the time domain, the echo delay using an adaptive filtering technique. This delay estimation is used to align samples of the incoming signal having acoustic echo with the corresponding samples of the outgoing signal from which the acoustic echo originated. The AEC module determines whether or not samples of the incoming signal contain acoustic echo based on the aligned outgoing signal, and the determinations are applied to a hangover counter. The AEC module then suppresses acoustic echo in the incoming signal and adds comfort noise to the incoming signal. The amount of echo suppression performed is gradually increased or decreased based on comparisons of the counter to a hangover threshold.
摘要:
In one embodiment, a DSP having four arithmetic logic units (ALUs) and able to have two read/write operations per clock cycle performs silence detection and tone detection for data frames containing samples of an audio signal. The ALUs are used together in parallel to process the samples in the data frames received by the DSP. A received data frame is filtered by the silence detection so that substantially silent frames are dropped and non-silent frames are further processed. In the tone detection, a filtered data frame is processed, four samples at a time, to determine the power of the signal at a given frequency, where the power determination is used to determine whether a given tone (i.e., a signal at a given frequency) is present in the data frame.