发明授权
US08972201B2 Compression of genomic data file 有权
压缩基因组数据文件

Compression of genomic data file
摘要:
Systems and methods for compression of a genomic data file are described herein. In one embodiment, genomic sequences, sequence headers, and quality sequences associated with a plurality of data streams provided in a genomic data file are identified. Each of the genomic sequences includes at least one of primary characters and secondary characters. Further, the secondary characters from each of the genomic sequences may be removed to obtain an intermediate genomic sequence file and a quality score corresponding to the secondary character may be modified in quality sequences to obtain an intermediate quality sequence file. Based on the intermediate genomic sequence file and the intermediate quality sequence file, a modified genomic sequence file and a modified quality sequence file, respectively are generated. A compressed genomic data file is obtained using at least the modified genomic sequence and the modified quality sequence.
公开/授权文献
信息查询
0/0