Patent application
US20100030796A1 EFFICIENT COLUMN BASED DATA ENCODING FOR LARGE-SCALE DATA STORAGE (in force)

EFFICIENT COLUMN BASED DATA ENCODING FOR LARGE-SCALE DATA STORAGE
Abstract:
The subject disclosure relates to column based data encoding where raw data to be compressed is organized by columns, and then, as first and second layers of reduction of the data size, dictionary encoding and/or value encoding are applied to the data as organized by columns, to create integer sequences that correspond to the columns. Next, a hybrid greedy run length encoding and bit packing compression algorithm further compacts the data according to an analysis of bit savings. Synergy of the hybrid data reduction techniques in concert with the column-based organization, coupled with gains in scanning and querying efficiency owing to the representation of the compact data, results in substantially improved data compression at a fraction of the cost of conventional systems.
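
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the patented algorithm: it assumes a single column, skips value encoding, and makes the run length versus bit packing decision per run with a simple bit-cost comparison. The function names (`dictionary_encode`, `hybrid_encode`, `hybrid_decode`) and the `run_header_bits` parameter are hypothetical and chosen only for illustration. The "packed" segments hold the integer IDs that a real implementation would store at `width` bits each.

```python
from itertools import groupby
from math import ceil, log2


def dictionary_encode(column):
    """Map each distinct value to a small integer ID (first-seen order)."""
    ids = {}
    encoded = [ids.setdefault(v, len(ids)) for v in column]
    return encoded, list(ids)  # list(ids) is the ID -> value lookup table


def bits_per_value(n_distinct):
    """Bits needed to bit-pack one ID when there are n_distinct IDs."""
    return max(1, ceil(log2(n_distinct))) if n_distinct > 1 else 1


def hybrid_encode(ids, n_distinct, run_header_bits=32):
    """Store a run as (value, length) only when that is cheaper in bits
    than bit-packing every element of the run (assumed cost model)."""
    width = bits_per_value(n_distinct)
    segments = []
    for value, group in groupby(ids):
        length = len(list(group))
        packed_cost = length * width          # bits if each element is packed
        rle_cost = width + run_header_bits    # bits for one (value, length) pair
        if rle_cost < packed_cost:
            segments.append(("rle", value, length))
        elif segments and segments[-1][0] == "packed":
            segments[-1][1].extend([value] * length)  # grow the open packed block
        else:
            segments.append(("packed", [value] * length))
    return width, segments


def hybrid_decode(segments, lookup):
    """Rebuild the original column from segments and the dictionary."""
    out = []
    for seg in segments:
        if seg[0] == "rle":
            out.extend([lookup[seg[1]]] * seg[2])
        else:
            out.extend(lookup[i] for i in seg[1])
    return out
```

With a column such as 40 repetitions of "US", then "CA", "MX", "CA", "US", then 50 repetitions of "MX", this sketch stores the two long runs as run length pairs and the four mixed values in between as one bit-packed block of 2-bit IDs.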