Abstract:
A method of prefetching data into a cache to minimize CPU stall time uses a rough predictor to make rough predictions about which cache lines the CPU will need next. An address difference generator uses the rough prediction and the actual cache miss address to determine the address difference. A prefetch engine builds a data structure to represent address differences and weights them according to the accumulated stall time produced by the cache misses that occur when the corresponding address is not prefetched. This stall time is modeled as a loss function of the form $L = \sum_{j=0}^{n} L(x_j)$, where $L(x_j) = \sum_{i=0}^{sl(j)-1} C_i(b_{j-i}, x_j)$. The weights in the data structure change as the prefetch engine learns more information. The prefetch engine's goal is to predict the cache line needed and prefetch it before the CPU requests it.
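As a rough illustration of how the loss in this abstract accumulates over a trace, the Python sketch below evaluates the double sum directly. The abstract does not define C_i, b, or sl(j) beyond the formula, so several things here are assumptions made only for illustration: the function name `total_loss`, the reading of b_{j-i} as the prefetch decision made i steps before request x_j, the toy cost model used in the usage note, and the choice to skip terms whose index j - i would fall before the start of the trace.

```python
from typing import Callable, Sequence


def total_loss(
    x: Sequence[int],
    b: Sequence[int],
    sl: Callable[[int], int],
    cost: Callable[[int, int, int], float],
) -> float:
    """Evaluate L = sum_j L(x_j), with L(x_j) = sum_{i=0}^{sl(j)-1} C_i(b_{j-i}, x_j).

    x    : requested cache-line addresses x_0 .. x_n
    b    : hypothetical prefetch decisions b_0 .. b_n
    sl   : number of look-back terms for step j (stands in for sl(j))
    cost : cost(i, b_value, x_value), standing in for C_i(b_{j-i}, x_j)
    """
    loss = 0.0
    for j, xj in enumerate(x):
        for i in range(sl(j)):
            if j - i < 0:
                break  # assumption: no decision exists before the trace starts
            loss += cost(i, b[j - i], xj)
    return loss
```

For example, with a toy cost of one stall unit whenever a look-back decision did not name the requested line, `total_loss([10, 11, 12], [11, 12, 99], lambda j: 2, lambda i, bv, xv: 0.0 if bv == xv else 1.0)` sums three unit penalties.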
Abstract:
An explicit DST-based filter comprises a trigonometric transform module, first and second transform coefficient processors (TCPs), an inverse trigonometric transform module and first and second summing arrangements. The trigonometric transform module applies a trigonometric transform to blocks of DCT coefficients related to input blocks of DCT coefficients to generate corresponding input blocks of transform coefficients of a second type ("second coefficients"). The first TCP includes matrix multipliers that generate a multiplied block of DCT coefficients and a multiplied block of second coefficients by multiplying, by diagonal multiplying matrices, intermediate blocks of DCT coefficients derived from the input blocks of DCT coefficients. The second TCP includes matrix multipliers that generate a multiplied block of DCT coefficients and a multiplied block of second coefficients by multiplying, by diagonal multiplying matrices, intermediate blocks of second coefficients derived from the input blocks of second coefficients. The first summing arrangement sums the multiplied blocks of DCT coefficients to generate a first final block of DCT coefficients, and sums the multiplied blocks of second coefficients to generate a first final block of second coefficients. The inverse trigonometric transform module applies an inverse trigonometric transform to the first final block of second coefficients to generate a second final block of DCT coefficients. The second summing arrangement sums the first and second final blocks of DCT coefficients to generate a block of DCT coefficients constituting a block of a filtered information signal.
Abstract:
Downsampling and inverse motion compensation are performed on compressed domain representations for video. By directly manipulating the compressed domain representation instead of the spatial domain representation, computational complexity is significantly reduced. For downsampling, the compressed stream is processed in the compressed (DCT) domain without explicit decompression and spatial domain downsampling so that the resulting compressed stream corresponds to a scaled down image, ensuring that the resulting compressed stream conforms to the standard syntax of 8×8 DCT matrices. For typical data sets, this approach of downsampling in the compressed domain results in computation savings of around 80% compared with traditional spatial domain methods for downsampling from compressed data. For inverse motion compensation, motion compensated compressed video is converted into a sequence of DCT domain blocks corresponding to the spatial domain blocks in the current picture alone. By performing inverse motion compensation directly in the compressed domain, the reduction in computational complexity is around 68% compared with traditional spatial domain methods for inverse motion compensation from compressed data. The techniques for downsampling and inverse motion compensation can be used in a variety of applications, such as multipoint video conferencing and video editing.
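To make the idea of downsampling without leaving the DCT domain concrete, here is a minimal Python sketch that merges a 2×2 grid of 8×8 DCT blocks into a single 8×8 DCT block. It is the direct linear-algebra form, not the fast algorithm behind the savings quoted in the abstract. The 2×2 pixel-averaging filter and the names `dct_matrix`, `T`, and `downsample_dct` are assumptions chosen for illustration.

```python
import numpy as np

N = 8


def dct_matrix(n: int = N) -> np.ndarray:
    """Orthonormal DCT-II matrix C, so that C @ v is the DCT of vector v."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c


C = dct_matrix()

# 8x16 spatial operator that averages adjacent sample pairs (assumed filter).
A = np.zeros((N, 2 * N))
for r in range(N):
    A[r, 2 * r] = A[r, 2 * r + 1] = 0.5

# DCT-domain versions of the left and right 8x8 halves, precomputed once.
T = [C @ A[:, :N] @ C.T, C @ A[:, N:] @ C.T]


def downsample_dct(blocks):
    """Merge blocks[i][j] (2x2 grid of 8x8 DCT blocks) into one 8x8 DCT block."""
    out = np.zeros((N, N))
    for i in range(2):
        for j in range(2):
            # Rows are reduced by T[i], columns by T[j]; no inverse DCT is taken.
            out += T[i] @ blocks[i][j] @ T[j].T
    return out
```

Because the DCT matrix is orthonormal, the result equals the DCT of the spatially averaged 16×16 region, while the output remains an ordinary 8×8 block of DCT coefficients.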
Abstract:
A transform domain watermarking technique is based on a new encoding scheme referred to as scaled bin encoding. The scheme encodes a message in a set of transform coefficients by modifying their values in a way that preserves high image quality (i.e., low distortion levels) and adapts to the expected noise level. Recapturing of the watermark image is performed via a decoding method using a maximum likelihood procedure (i.e., maximum likelihood decoding), based on the statistics of the transform coefficients and a worst case statistical model of the noise introduced to these coefficients by image processing operations or attack noise.
Abstract:
A motion vector between a current block and a reference block of a reference frame is determined by calculating the exact linear cross-correlation between the current block and the potential reference blocks of the reference picture. The current block is orthogonally transformed using DCT/DST transforms of a first type without prior zero padding of the current block to generate a current quadruple of transform coefficient blocks. The current quadruple is processed together with a reference quadruple of transform coefficient blocks generated from four of the search region blocks to generate a quadruple of processed transform coefficient blocks. The quadruple of processed transform coefficient blocks is inversely transformed using inverse DCT/DST transforms of a second type to generate a block of exact cross-correlations between the current block and the search region. The motion vector is determined from the block of exact cross-correlations.
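For reference, the block of exact cross-correlations described in this abstract can be computed by brute force in the spatial domain. The Python sketch below does that and is useful mainly as a check on what a transform-domain implementation should reproduce; it does not use DCT/DST transforms. The function names are hypothetical, and picking the peak of the raw cross-correlation as the motion vector is an assumption, since the abstract only says the vector is determined from the block of cross-correlations.

```python
import numpy as np


def cross_correlation_block(current: np.ndarray, search: np.ndarray) -> np.ndarray:
    """Linear cross-correlation of `current` with every same-size window of `search`."""
    bh, bw = current.shape
    sh, sw = search.shape
    out = np.empty((sh - bh + 1, sw - bw + 1))
    for dy in range(out.shape[0]):
        for dx in range(out.shape[1]):
            window = search[dy:dy + bh, dx:dx + bw]
            out[dy, dx] = np.sum(current * window)
    return out


def motion_vector(current: np.ndarray, search: np.ndarray) -> tuple:
    """Return the (dy, dx) offset in the search region with the largest correlation."""
    corr = cross_correlation_block(current, search)
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return int(dy), int(dx)
```

Note that an unnormalized correlation peak can be pulled toward bright areas of the search region, so a practical matcher may normalize or combine the correlations with block energies.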
Abstract:
A multiplier-free implementation of an approximation of the DCT used in image and video processing. In accordance with the primary aspect of the present invention, image and video processing is performed with no multiplications and fewer operations through the application of a modified Arai, Agui, and Nakajima (AAN) scheme for the eight-point DCT.
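The general idea behind a multiplier-free transform can be illustrated by replacing one constant multiplication with shifts and adds. The sketch below approximates multiplication by cos(π/4) ≈ 0.7071, a constant that appears in the AAN factorization. The particular shift pattern and the function name are assumptions for illustration only; the abstract does not say which approximations the modified scheme uses.

```python
def mul_cos_pi_over_4(x: int) -> int:
    """Approximate x * cos(pi/4) using only right shifts and additions.

    0.70703125 = 1/2 + 1/8 + 1/16 + 1/64 + 1/256, which is within about
    1e-4 of cos(pi/4) = 0.70710678...
    """
    return (x >> 1) + (x >> 3) + (x >> 4) + (x >> 6) + (x >> 8)
```

Using more shift terms tightens the approximation at the cost of extra additions, which is the basic trade-off in any design of this kind.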
Abstract:
An implicit DST-based filter having characteristics defined by a linear convolution kernel that may be causal or noncausal-symmetric. The filter filters an information signal composed of blocks of discrete cosine transform (DCT) coefficients to generate a filtered information signal also composed of blocks of DCT coefficients. The filter comprises multiplying matrices, a deriving module, matrix multiplying modules and a summing module. The multiplying matrices are obtained by absorbing a cosine-to-sine transform and a sine-to-cosine transform into kernel matrices derived from the linear convolution kernel. The deriving module derives intermediate blocks of DCT coefficients from neighboring ones of the blocks of DCT coefficients constituting the information signal. The matrix multiplying modules multiply the intermediate blocks of DCT coefficients by the multiplying matrices. The summing module sums the blocks of DCT coefficients generated by the matrix multiplying modules to generate the blocks of DCT coefficients constituting the filtered information signal.
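The derive, multiply, and sum structure of this abstract can be sketched in one dimension. The Python example below filters a block of DCT coefficients using its two neighbors and three precomputed 8×8 matrices. It is a plain formulation in which the kernel matrices are simply conjugated by the DCT; it does not reproduce the absorbed cosine-to-sine and sine-to-cosine transforms of the patented filter. All names and the three-tap kernel in the usage note are assumptions.

```python
import numpy as np

N = 8


def dct_matrix(n: int = N) -> np.ndarray:
    """Orthonormal DCT-II matrix C, so that C @ v is the DCT of vector v."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c


C = dct_matrix()


def kernel_matrices(h: np.ndarray):
    """Split convolution by an odd-length kernel h (taps -r..r, r <= 8) into
    three 8x8 pieces acting on the previous, current, and next block."""
    r = len(h) // 2
    full = np.zeros((N, 3 * N))
    for row in range(N):
        for t in range(-r, r + 1):
            # y[n] = sum_t h[t] * x[n - t], with n offset into the middle block
            full[row, N + row - t] = h[t + r]
    return full[:, :N], full[:, N:2 * N], full[:, 2 * N:]


def dct_domain_matrices(h: np.ndarray):
    """Move each spatial kernel matrix into the DCT domain: M = C H C^T."""
    return [C @ H @ C.T for H in kernel_matrices(h)]


def filter_block(X_prev, X_cur, X_next, mats):
    """Filtered DCT block from three neighboring DCT blocks."""
    M_prev, M_cur, M_next = mats
    return M_prev @ X_prev + M_cur @ X_cur + M_next @ X_next
```

With `mats = dct_domain_matrices(np.array([0.2, 0.5, 0.3]))`, calling `filter_block` on the DCTs of three consecutive 8-sample blocks yields the DCT of the filtered middle block, which matches spatial convolution after an inverse DCT.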
Abstract:
A method is described for filtering compressed images represented in the discrete-cosine-transform (DCT) domain. The filter includes three sparse, vertical submatrices which are sparse versions of the vertical filter components (VFCs) of a desired filter function that have been combined in such a way as to eliminate many of the non-zero elements. The filter also includes three sparse, horizontal transpose submatrices, which, like the vertical submatrices, are sparse versions of the horizontal filter components of the filter function. The sparseness of these sparse submatrices yields a significant reduction in the number of computations required to filter the image in the DCT domain. To take advantage of this discovery, the input DCT data blocks are "butterflied" to retain the relationship between the input data blocks and the filtered output data blocks as a function of these sparse submatrices. The sparseness of the vertical and horizontal submatrices reduces the number of computations required to filter the image. The sparseness of the DCT data blocks can also be used to further reduce the number of computations required.