Abstract:
A processor for encoding fixed-length code words into variable-length code words and for decoding variable-length code words into fixed-length code words. The fixed-length code words are assigned to a number of groups and to one of several possible coding sets, determined by the probability of each word occurring after a preceding word. Each of the fixed-length code words is stored in a first associative memory unit along with its group number, its coding set assignment and a number of addresses arranged in groups. An input fixed-length code word is compared against the contents of the memory and matches the corresponding stored fixed-length word. One of the addresses is read out of the memory. The particular group from which the address is read out is determined by the group number of the previously received fixed-length code word. The selected address that is read out and the coding set membership number of the previous word are entered into a second associative memory containing all the addresses arranged in several coding sets along with the variable-length code words. A match in the second memory unit on an address and a coding set number produces the variable-length code word for the input fixed-length code word. The first memory unit also provides the group number and coding set number for the next input fixed-length code word to be encoded. Decoding is performed in a similar but reverse manner, starting with the variable-length coded data being entered into the second memory.
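The two-stage lookup described above can be illustrated in software. The following is a minimal sketch, not the patented hardware: the associative memories are modeled as Python dictionaries, and the table contents, the names `Entry`, `FIRST_MEMORY`, `SECOND_MEMORY` and `START_STATE`, and the choice of an initial group and coding set of 0 are all hypothetical values chosen for illustration. It assumes the codes within each coding set are prefix-free, so the decoder can match bit by bit.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    group: int         # group number of this word (selects the address for the NEXT word)
    coding_set: int    # coding set of this word (selects the code table for the NEXT word)
    addresses: tuple   # addresses[g] is read out when the PREVIOUS word's group is g


# Hypothetical first memory: fixed-length word -> group, coding set, addresses.
FIRST_MEMORY = {
    "00": Entry(group=0, coding_set=0, addresses=(0, 2)),
    "01": Entry(group=0, coding_set=1, addresses=(1, 0)),
    "10": Entry(group=1, coding_set=1, addresses=(2, 1)),
}

# Hypothetical second memory: (address, coding set) -> variable-length code.
# Codes are prefix-free within each coding set.
SECOND_MEMORY = {
    (0, 0): "0", (1, 0): "10", (2, 0): "11",
    (0, 1): "00", (1, 1): "01", (2, 1): "1",
}

# Assumed initial (group, coding set) before any word has been seen.
START_STATE = (0, 0)


def encode(words):
    """Encode fixed-length words into one variable-length bit string."""
    group, coding_set = START_STATE
    out = []
    for word in words:
        entry = FIRST_MEMORY[word]                       # match in first memory
        address = entry.addresses[group]                 # pick address by previous group
        out.append(SECOND_MEMORY[(address, coding_set)])  # match in second memory
        group, coding_set = entry.group, entry.coding_set
    return "".join(out)


def decode(bits):
    """Reverse the process: second memory first, then first memory."""
    code_to_address = {(code, s): a for (a, s), code in SECOND_MEMORY.items()}
    group, coding_set = START_STATE
    words, buffer = [], ""
    for bit in bits:
        buffer += bit
        address = code_to_address.get((buffer, coding_set))
        if address is None:
            continue  # no complete code word yet, keep reading bits
        word = next(w for w, e in FIRST_MEMORY.items()
                    if e.addresses[group] == address)
        words.append(word)
        group, coding_set = FIRST_MEMORY[word].group, FIRST_MEMORY[word].coding_set
        buffer = ""
    return words


encoded = encode(["00", "10", "01"])
print(encoded)          # 01100
print(decode(encoded))  # ['00', '10', '01']
```

In this toy table, the addresses for any given previous group are distinct across words, which is what lets the decoder recover a unique fixed-length word from an address and the previous group number.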
Abstract:
The present invention relates to a method practicable on a general-purpose electronic computer for statistically analyzing a data set and for producing a set of encoding and decoding (E/D) tables for achieving compaction of the original data set utilizing a variable-length code. The method disclosed may operate under constraints of available core, desired compaction rate and speed of compaction/decompaction to produce differing sets of encoding/decoding tables depending upon the constraints imposed. The method would most normally be provided and utilized as a software package wherein the primary inputs are the data set itself and the above-enumerated constraints. By utilizing a variable-length code wherein the code assignment is dependent upon the characteristic of preceding data, good compaction rates may be achieved utilizing reasonable amounts of memory for the E/D tables. The method comprises three principal steps. The first is the construction of a matrix showing the probability of occurrence of every member of the data set with respect to the immediately preceding member. The second step comprises grouping various rows or columns of this matrix having similar probabilities of occurrence. The third step comprises a reordering of all of the previously grouped rows or columns. Finally, a second clustering into coding sets may be performed.
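The first two steps can be sketched in a few lines. This is an illustration under stated assumptions, not the disclosed method: the abstract does not say how similarity between rows is measured, so the sketch uses a hypothetical greedy rule that joins a row to the first group whose representative row lies within an L1-distance `threshold`. The function names `transition_matrix` and `group_rows` are likewise invented for this example, and the reordering and second clustering steps are omitted.

```python
from collections import Counter


def transition_matrix(data, alphabet):
    """Step 1: row p holds the probability of each symbol occurring right after p."""
    counts = {p: Counter() for p in alphabet}
    for prev, cur in zip(data, data[1:]):
        counts[prev][cur] += 1
    matrix = {}
    for p in alphabet:
        total = sum(counts[p].values())
        matrix[p] = [counts[p][c] / total if total else 0.0 for c in alphabet]
    return matrix


def group_rows(matrix, threshold):
    """Step 2 (assumed rule): greedily group rows whose L1 distance to a
    group's first row is at most `threshold`."""
    groups = []  # list of (representative_row, member_symbols)
    for symbol, row in matrix.items():
        for representative, members in groups:
            distance = sum(abs(a - b) for a, b in zip(representative, row))
            if distance <= threshold:
                members.append(symbol)
                break
        else:
            groups.append((row, [symbol]))
    return [members for _, members in groups]


matrix = transition_matrix("abcbabcbab", "abc")
print(matrix["b"])              # [0.5, 0.0, 0.5]
print(group_rows(matrix, 0.5))  # [['a', 'c'], ['b']]
```

In the sample string, "a" and "c" are always followed by "b", so their rows are identical and fall into one group, while "b" forms its own. Symbols in the same group could then share address columns in the E/D tables, which is where the memory saving would come from.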