Abstract:
Systems and methods for encoding and decoding document images are disclosed. A mask is generated from a document image so as to reduce an estimate of the combined compressed size of the mask and the multiple layers of the document image. The mask is then employed to segment the document image into multiple layers by allocating each pixel of the document image to a respective layer. The layers are non-binary images and can, for example, comprise a foreground image and a background image. The mask and the layers are then processed and encoded separately, which improves both the overall compression of the document image and the speed of encoding.
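As a rough illustration of the segmentation step only, the sketch below splits a small grayscale image into a foreground and a background layer according to a binary mask. The names `split_layers` and `merge_layers`, the use of `None` for pixels a layer does not own, and the convention that a mask value of 1 selects the foreground are assumptions made for this example; the abstract does not specify them, and mask generation itself is not shown.

```python
def split_layers(image, mask):
    """Split a 2-D image (list of rows) into foreground and background layers.

    Assumption: mask[y][x] == 1 assigns the pixel to the foreground and 0
    assigns it to the background. Pixels a layer does not own are stored as
    None ("don't care"), so a later coder is free to fill them as it likes.
    """
    if len(image) != len(mask) or any(len(r) != len(m) for r, m in zip(image, mask)):
        raise ValueError("image and mask must have the same shape")
    foreground = [[p if m else None for p, m in zip(row, mrow)]
                  for row, mrow in zip(image, mask)]
    background = [[None if m else p for p, m in zip(row, mrow)]
                  for row, mrow in zip(image, mask)]
    return foreground, background


def merge_layers(foreground, background, mask):
    """Rebuild the image by taking each pixel from the layer the mask selects."""
    return [[f if m else b for f, b, m in zip(frow, brow, mrow)]
            for frow, brow, mrow in zip(foreground, background, mask)]
```

Because every pixel lands in exactly one layer, merging the two layers with the same mask recovers the input image; each layer could then be handed to its own (hypothetical) layer coder.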
Abstract:
A quality level determining the extent to which each image file is compressed is automatically computed for each image file in a set to ensure that the total size of the compressed image files does not exceed a predefined limit. The compressed size of each image file is initially determined when compressed at a predefined minimum acceptable level and at a nominal level. The relative complexity of the image files is determined based upon their high frequency energy content. As a function of the image file complexity, and starting with the compressed sizes initially determined, the appropriate quality level is determined for compressing each of the image files in an iterative process that ensures the total size of the compressed image files does not exceed the predefined limit, while retaining acceptable quality. Thus, a set of image files can be compressed optimally to fit within limited storage.
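The iterative budget fit described above might look roughly like the following sketch. It is not the claimed method: the function `allocate_quality`, the dictionary keys, the default quality values 20 and 75, the linear size model between the two measured sizes, and the rule that less complex files give up quality first are all assumptions chosen to keep the example short.

```python
def allocate_quality(files, limit, q_min=20, q_nom=75):
    """Pick one quality level per file so the estimated total size fits `limit`.

    files: list of dicts with 'size_min' (size at q_min), 'size_nom'
    (size at q_nom) and 'complexity' (e.g. high-frequency energy).
    Assumption: size grows linearly with quality between q_min and q_nom.
    """
    def est_size(f, q):
        t = (q - q_min) / (q_nom - q_min)
        return f["size_min"] + t * (f["size_nom"] - f["size_min"])

    def total(qs):
        return sum(est_size(f, q) for f, q in zip(files, qs))

    if total([q_min] * len(files)) > limit:
        raise ValueError("set cannot fit even at the minimum acceptable quality")

    quality = [q_nom] * len(files)
    # Assumed policy: visit files from least to most complex on each pass.
    order = sorted(range(len(files)), key=lambda i: files[i]["complexity"])
    while total(quality) > limit:
        for i in order:
            if quality[i] > q_min:
                quality[i] -= 1  # give up one quality step
                if total(quality) <= limit:
                    break
    return quality
```

A real implementation would presumably re-measure compressed sizes rather than trust a linear model, but the loop structure (start from nominal quality, step down until the total fits, never go below the minimum) follows the description.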
Abstract:
A method and system of lossless compression of integer data using a novel backward-adaptive technique. The adaptive Run-Length and Golomb/Rice (RLGR) encoder and decoder (codec) and method switch between a Golomb/Rice (G/R)-only encoder mode and a mode that combines the G/R encoder with a Run-Length encoder. The backward-adaptive technique includes novel adaptation rules that adjust the encoder parameters after each encoded symbol. An encoding mode parameter and a G/R parameter are adapted. The encoding mode parameter controls whether the adaptive RLGR encoder and method uses Run-Length encoding and, if so, how it is used. The G/R parameter is used in both modes, either to encode every input value (in the G/R-only mode) or to encode the number or value after an incomplete run of zeros (in the RLGR mode). The adaptive RLGR codec and method also includes a decoder that can be precisely implemented based on the inverse of the encoder rules.
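To make the idea of backward adaptation concrete, here is a minimal sketch of the G/R-only mode with a parameter that is updated after every symbol. The adaptation rule in `adapt` is a deliberately simplified stand-in and not the rule set described in the abstract; the run-length mode is omitted entirely, and the function names, the starting value `k=2`, and the zigzag mapping of signed integers are assumptions for this example.

```python
def gr_encode(u, k):
    """Golomb/Rice code of non-negative u: unary quotient, then k remainder bits."""
    q = u >> k
    bits = "1" * q + "0"
    if k:
        bits += format(u & ((1 << k) - 1), "0{}b".format(k))
    return bits


def adapt(k, u):
    """Simplified backward adaptation (an assumption, not the patented rules)."""
    q = u >> k
    if q == 0 and k > 0:
        return k - 1  # values are small: shrink the parameter
    if q > 1:
        return k + 1  # values are large: grow the parameter
    return k


def encode(values, k=2):
    out = []
    for v in values:
        u = 2 * v if v >= 0 else -2 * v - 1  # zigzag map: signed -> unsigned
        out.append(gr_encode(u, k))
        k = adapt(k, u)  # update uses only data the decoder will also have
    return "".join(out)


def decode(bits, count, k=2):
    values, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":  # read the unary quotient
            q += 1
            pos += 1
        pos += 1  # skip the terminating 0
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        u = (q << k) | r
        values.append(u >> 1 if u % 2 == 0 else -((u + 1) >> 1))
        k = adapt(k, u)  # mirror the encoder's update exactly
    return values
```

Since the parameter is updated only from symbols that have already been coded, the decoder can repeat the same updates and stay in step with the encoder without any side information, which is the property the abstract relies on.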
Abstract:
A system for communicating audio data signals comprises a source computer that performs an action, generates an event message corresponding to the action, converts the event message into an audio data signal, and communicates the audio data signal through its speaker. A source telephone receives a voice signal from a participant and the audio data signal through its microphone and communicates the audio data signal and voice as coherent sound via an audio communications medium. A recipient telephone receives the audio data signal from the coherent sound communicated via the audio communications medium and communicates the audio data signal via its speaker. A recipient computer receives the audio data signal through its microphone, extracts the event message from the audio data signal, and performs an action based on the event message from the audio data signal. The audio communications medium can comprise a telephone communications system or air.
Abstract:
The present invention is embodied in a system and method for compressing image data using a lapped biorthogonal transform (LBT). The present invention encodes data by generating coefficients using a hierarchical LBT, reorders the coefficients in a data-independent manner into groups of similar data, and encodes the reordered coefficients using adaptive run-length encoding. The hierarchical LBT computes multiresolution representations. The use of the LBT allows the present invention to encode image data in a single pass at any desired compression ratio and to make use of existing discrete cosine transform (DCT) software and hardware modules for fast processing and easy implementation.
Abstract:
Compression of images that have masked or “don't care” regions which are delineated by a binary image mask is achieved using “masked wavelet transforms.” A unique mask-dependent lifting scheme is used to compute invertible wavelet transforms of the input image for use in encoding and decoding the input image. These mask-dependent wavelet transforms are derived from the input image based on the masked regions within the image. Masked wavelet coding automatically generates an appropriate linear combination of available, unmasked, neighboring pixels, for both the prediction and the update steps of “lifting” for each pixel. This pixel availability is then used to change the wavelet function on a case-by-case basis as a function of the mask by using a polynomial of degree k−1 for interpolation in both the predict and update steps of lifting where at least k unmasked neighboring pixel values are available.
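The mask-dependent prediction can be sketched in one dimension as follows. This is a minimal illustration and not the scheme from the abstract: it covers only the predict step, looks at no more than the two nearest even-indexed neighbors (so k is at most 2), and the function names and the fallback prediction of 0 when no neighbor is available are assumptions.

```python
def predict(even, even_ok, i):
    """Predict odd sample i, which sits between even[i] and even[i + 1]."""
    neighbors = [even[j] for j in (i, i + 1) if j < len(even) and even_ok[j]]
    if len(neighbors) == 2:
        return (neighbors[0] + neighbors[1]) / 2  # k = 2: degree-1 (linear) interpolation
    if len(neighbors) == 1:
        return neighbors[0]  # k = 1: degree-0 (constant) interpolation
    return 0.0  # no unmasked neighbor: assumed fallback


def forward_predict(signal, ok):
    """ok[n] is True when sample n is unmasked. Returns (even samples, details)."""
    even, odd = signal[0::2], signal[1::2]
    even_ok = ok[0::2]
    detail = [x - predict(even, even_ok, i) for i, x in enumerate(odd)]
    return even, detail


def inverse_predict(even, detail, ok):
    """Undo forward_predict by adding the same mask-dependent prediction back."""
    even_ok = ok[0::2]
    odd = [d + predict(even, even_ok, i) for i, d in enumerate(detail)]
    out = []
    for i, e in enumerate(even):
        out.append(e)
        if i < len(odd):
            out.append(odd[i])
    return out
```

The step stays invertible because the prediction depends only on the even samples and the mask, both of which the decoder has before it reconstructs the odd samples.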
Abstract:
A system and method for performing trainable nonlinear prediction of transform coefficients in data compression such that the number of bits required to represent the data is reduced. The nonlinear prediction data compression system includes a nonlinear predictor for generating predicted transform coefficients, a nonlinear prediction encoder that uses the predicted transform coefficients to encode original data, and a nonlinear prediction decoder that uses the predicted transform coefficients to decode the encoded bitstream and reconstruct the original data. The nonlinear predictor may be trained using training techniques, including a novel in-loop training technique of the present invention. The present invention also includes a method for using a nonlinear predictor to encode and decode data. The method also includes improving the performance of the nonlinear prediction data compression and decompression using several novel speedup techniques.
Abstract:
The coder/decoder (codec) system of the present invention includes a coder and a decoder. The coder includes a multi-resolution transform processor, such as a modulated lapped transform (MLT) transform processor, a weighting processor, a uniform quantizer, a masking threshold spectrum processor, an entropy encoder, and a communication device, such as a multiplexor (MUX) for multiplexing (combining) signals received from the above components for transmission over a single medium. The decoder comprises inverse components of the encoder, such as an inverse multi-resolution transform processor, an inverse weighting processor, an inverse uniform quantizer, an inverse masking threshold spectrum processor, an inverse entropy encoder, and an inverse MUX. With these components, the present invention is capable of performing resolution switching, spectral weighting, digital encoding, and parametric modeling.