METHOD, APPARATUS, SYSTEM, AND COMPUTER PROGRAM PRODUCT FOR DATA COMPRESSION

    Publication No.: US20170255670A1

    Publication Date: 2017-09-07

    Application No.: US15444256

    Filing Date: 2017-02-27

    Inventor: Peng LEI

    IPC Classification: G06F17/30

    Abstract: According to one aspect of the present application, a method for data compression comprises: creating a first trie for a first set of strings, the first set of strings comprising a plurality of raw data strings, wherein a trie consists of a plurality of nodes linked through parent-child relations, and wherein each edge of the trie carries at least one character and corresponds to a state transition from the parent node of the edge to the child node of the edge; collecting the edges of the first trie that are longer than a predetermined length as a first subset of strings of the first trie; segmenting a string in the first subset of strings into two or more fragments when the string satisfies a predetermined condition, and collecting all segmented fragments and all un-segmented strings in the first subset of strings as a segmented set of strings; and storing the first set of strings using the first trie and the segmented set of strings so as to compress the raw data strings.
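
    The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names Node, insert, contains, long_edges and segment are hypothetical, and because the abstract does not state the "predetermined condition", the sketch assumes a simple one (an edge string longer than max_len is cut into chunks of at most max_len characters).

```python
class Node:
    """A trie node; each outgoing edge is labeled with one or more characters."""
    def __init__(self):
        self.children = {}     # edge label (str) -> child Node
        self.terminal = False  # True if a stored string ends at this node


def insert(root, word):
    """Insert word into a trie with multi-character edges, splitting edges as needed."""
    node = root
    while word:
        for label, child in list(node.children.items()):
            # Length of the common prefix of the edge label and the remaining word.
            k = 0
            while k < min(len(label), len(word)) and label[k] == word[k]:
                k += 1
            if k == 0:
                continue
            if k < len(label):
                # Partial match: split the edge at position k.
                mid = Node()
                mid.children[label[k:]] = child
                del node.children[label]
                node.children[label[:k]] = mid
                child = mid
            node, word = child, word[k:]
            break
        else:
            # No edge shares a prefix with the remaining word: add a new leaf edge.
            leaf = Node()
            leaf.terminal = True
            node.children[word] = leaf
            return
    node.terminal = True


def contains(root, word):
    """Follow edges (state transitions) from the root; True if word was stored."""
    node = root
    while word:
        for label, child in node.children.items():
            if word.startswith(label):
                node, word = child, word[len(label):]
                break
        else:
            return False
    return node.terminal


def long_edges(root, min_len):
    """Collect every edge label longer than min_len (the 'first subset of strings')."""
    found, stack = set(), [root]
    while stack:
        node = stack.pop()
        for label, child in node.children.items():
            if len(label) > min_len:
                found.add(label)
            stack.append(child)
    return found


def segment(strings, max_len):
    """Assumed condition: split strings longer than max_len into fixed-size chunks."""
    result = set()
    for s in strings:
        if len(s) > max_len:
            result.update(s[i:i + max_len] for i in range(0, len(s), max_len))
        else:
            result.add(s)
    return result


raw = ["http://example.com/index.html", "http://example.com/about.html", "ftp://x"]
root = Node()
for s in raw:
    insert(root, s)

subset = long_edges(root, min_len=8)
segmented = segment(subset, max_len=10)
```

    In this sketch the shared prefix "http://example.com/" becomes a single edge, so it is stored once rather than once per raw string; it exceeds both thresholds and is therefore collected and cut into two fragments. How the trie and the segmented set are then encoded for storage is outside what the abstract describes and is not shown.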