Data compression method and system
    1.
    Granted invention patent
    Data compression method and system (Expired)

    Publication (announcement) number: US5572206A

    Publication (announcement) date: 1996-11-05

    Application number: US271125

    Filing date: 1994-07-06

    Abstract: A method and system for compressing an input stream of data bytes into a compressed stream of data bytes using an LZ77-based compression scheme. The method and system also include a decompressor for decompressing the compressed stream into a decompressed stream of data bytes that is identical to the input stream. The compression system encodes matches using token offsets rather than the byte offsets used by prior art LZ77-based compression schemes. The compression system uses knowledge of the internal format of the input stream to identify tokens that are used to determine the token offsets. Preferably, the method parses the input stream by dividing it into tokens and assigning a token type to each token. The method searches the input stream for a matching sequence of already processed tokens that is identical to a current sequence of tokens. If a matching sequence is found, the method determines whether the token type of a selected token, such as the first token, of the current sequence matches the token type of a corresponding token of the matching sequence. The method determines a token offset indicating the number of bytes of the matching token type occurring between the selected byte and the corresponding byte of the matching sequence. The method determines the length of the match and encodes the current sequence as a match pair that includes the token offset and the length of the match.
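
The scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The tokenizer (words, numbers, single other bytes), the names `tokenize`, `compress` and `decompress`, the brute-force search, the match length measured in tokens, and the choice to store the token type inside each match are all assumptions made for illustration. The sketch counts whole tokens of the matching type between the current position and the match, which is one possible reading of the token offset described above.

```python
import re

# Hypothetical token grammar: runs of letters, runs of digits, or any single other byte.
TOKEN_RE = re.compile(rb"[A-Za-z]+|[0-9]+|[^A-Za-z0-9]")


def tokenize(data: bytes):
    """Split data into (type, bytes) tokens; joining the bytes restores data."""
    tokens = []
    for m in TOKEN_RE.finditer(data):
        text = m.group()
        if text[:1].isalpha():
            kind = "word"
        elif text[:1].isdigit():
            kind = "num"
        else:
            kind = "other"
        tokens.append((kind, text))
    return tokens


def compress(data: bytes, min_len: int = 2):
    """Encode data as literals and (type, token offset, length) matches."""
    tokens = tokenize(data)
    out = []
    i = 0
    while i < len(tokens):
        best_pos, best_len = -1, 0
        # Brute-force search over already processed tokens (quadratic, for clarity only).
        for j in range(i):
            k = 0
            while i + k < len(tokens) and tokens[j + k] == tokens[i + k]:
                k += 1
            if k > 0 and k >= best_len:  # ties go to the nearest match
                best_pos, best_len = j, k
        if best_len >= min_len:
            kind = tokens[i][0]
            # Token offset: how many tokens of this type lie between the match
            # start (inclusive) and the current position (exclusive).
            offset = sum(1 for t in tokens[best_pos:i] if t[0] == kind)
            out.append(("match", kind, offset, best_len))
            i += best_len
        else:
            out.append(("lit", tokens[i][0], tokens[i][1]))
            i += 1
    return out


def decompress(items):
    """Rebuild the original bytes from the output of compress."""
    tokens = []
    for item in items:
        if item[0] == "lit":
            tokens.append((item[1], item[2]))
            continue
        _, kind, offset, length = item
        # Walk backwards, counting only tokens of the recorded type.
        pos, seen = len(tokens), 0
        while seen < offset:
            pos -= 1
            if tokens[pos][0] == kind:
                seen += 1
        for k in range(length):  # token-by-token copy, so overlapping matches work
            tokens.append(tokens[pos + k])
    return b"".join(text for _, text in tokens)
```

For the input `b"item 10 cost 10 item 10 cost 10"`, the second half is found 8 tokens back, but only 2 of those tokens are words, so the match is stored with a token offset of 2. Counting only tokens of one type keeps offsets small, which is the advantage over byte offsets that the abstract points to.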
