Granted invention patent
- Patent title: Data compression method and system
- Patent title (Chinese): 数据压缩方法和系统
- Application number: US271125
- Filing date: 1994-07-06
- Publication number: US5572206A
- Publication date: 1996-11-05
- Inventors: John W. Miller, Ben W. Slivka
- Applicants: John W. Miller, Ben W. Slivka
- Applicant address: Redmond, WA
- Assignee: Microsoft Corporation
- Current assignee: Microsoft Corporation
- Current assignee address: Redmond, WA
- Main classification: G06F5/00
- IPC classifications: G06F5/00; G06T9/00; H03M7/30; H03M7/40; H03M7/46
Abstract:
A method and system for compressing an input stream of data bytes into a compressed stream of data bytes using an LZ77-based compression scheme. The method and system also includes a decompressor for decompressing the compressed stream into a decompressed stream of data bytes that is identical to the input stream. The compression system encodes matches using token offsets rather than the byte offsets used by prior art LZ77-based compression schemes. The compression system uses knowledge of the internal format of the input stream to identify tokens that are used to determine the token offsets. Preferably, the method parses the input stream by dividing it into tokens and assigning a token type to each token. The method searches the input stream for a matching sequence of already processed tokens that is identical to a current sequence of tokens. If a matching sequence is found, the method determines whether the token type of a selected token, such as the first token, of the current sequence matches the token type of a corresponding token of the matching sequence. The method determines a token offset indicating the number of bytes of the matching token type occurring between the selected byte and the corresponding byte of the matching sequence. The method determines the length of the match and encodes the current sequence as a match pair that includes the token offset and the length of the match.
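The token-offset idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The tokenizer and its three token types (`word`, `num`, `other`), the function names, and the `min_len` threshold are all hypothetical, and the sketch reads the token offset as a count of tokens of the matching type, which is an interpretation of the abstract. As a further simplification, each match carries its token type explicitly so the decoder can resolve the offset without re-parsing.

```python
import re

# Hypothetical tokenizer: three illustrative token types, not those of the patent.
TOKEN_RE = re.compile(r"(?P<word>[A-Za-z]+)|(?P<num>[0-9]+)|(?P<other>[^A-Za-z0-9])")


def tokenize(text):
    """Split text into (token_type, value) pairs covering every character."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


def encode(tokens, min_len=2):
    """Greedy LZ77-style encoder that emits token offsets instead of byte offsets."""
    items, i = [], 0
    while i < len(tokens):
        best_len, best_j = 0, -1
        for j in range(i):
            length = 0
            # Extend the match while type and value both agree (overlap is allowed).
            while i + length < len(tokens) and tokens[j + length] == tokens[i + length]:
                length += 1
            if length and length >= best_len:  # ties go to the nearest match
                best_len, best_j = length, j
        if best_len >= min_len:
            ttype = tokens[i][0]
            # Token offset: same-type tokens from the match start up to position i.
            offset = sum(1 for t in tokens[best_j:i] if t[0] == ttype)
            items.append(("match", ttype, offset, best_len))
            i += best_len
        else:
            items.append(("lit", tokens[i]))
            i += 1
    return items


def decode(items):
    """Rebuild the token list from literals and (type, offset, length) matches."""
    out = []
    for item in items:
        if item[0] == "lit":
            out.append(item[1])
            continue
        _, ttype, offset, length = item
        start, seen = len(out), 0
        while seen < offset:  # walk back to the offset-th earlier same-type token
            start -= 1
            if out[start][0] == ttype:
                seen += 1
        for k in range(length):  # copy one token at a time so overlaps work
            out.append(out[start + k])
    return out


text = "mov ax, 1\nmov bx, 1\n"
tokens = tokenize(text)
items = encode(tokens)
restored = "".join(value for _, value in decode(items))
```

In this example the second `mov` starts seven tokens after the first, but only two `word` tokens lie in that span, so the match is encoded with a token offset of 2. Smaller offsets of this kind are what the scheme is meant to exploit.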