Method and apparatus for processing biological sequence FASTQ files, comprising lossless compression and decompression
Abstract:
This application provides a method for processing biological sequence data. A target base is selected from the bases in a biological sequence FASTQ file according to characteristic information of each base, and a base patch file is generated from the characteristic information of the target base. The FASTQ file and the base patch file are each losslessly compressed, yielding a compressed FASTQ file and a compressed patch file. Both compressed files are then decompressed. If the characteristic information of the target base in the decompressed patch file is inconsistent with the characteristic information of that base in the decompressed FASTQ file, the information in the decompressed FASTQ file is modified to match the decompressed patch file.
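The workflow in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. Several details are assumptions made only for illustration: the selection rule (a base is a target if it is `N` or its Phred quality is at most 5), the patch layout (a JSON list of record index, position, base, and quality character), the use of `zlib` as the lossless compressor, and the function names `parse_fastq`, `build_patch`, and `apply_patch`. The abstract does not specify any of these.

```python
import json
import zlib


def parse_fastq(text):
    """Return a list of (header, sequence, quality) records from FASTQ text."""
    lines = text.strip().split("\n")
    return [(lines[i], lines[i + 1], lines[i + 3]) for i in range(0, len(lines), 4)]


def build_patch(records, max_quality=5):
    """Select target bases and record their characteristic information.

    Assumed rule (not from the source): a base is a target when it is 'N'
    or its Phred+33 quality score is <= max_quality.
    """
    patch = []
    for r, (_, seq, qual) in enumerate(records):
        for p, (base, q) in enumerate(zip(seq, qual)):
            if base == "N" or ord(q) - 33 <= max_quality:
                patch.append([r, p, base, q])
    return patch


def apply_patch(records, patch):
    """Overwrite any target base whose base or quality disagrees with the patch."""
    fixed = [(h, list(s), list(q)) for h, s, q in records]
    for r, p, base, q in patch:
        _, seq, qual = fixed[r]
        if seq[p] != base or qual[p] != q:
            seq[p], qual[p] = base, q
    return [(h, "".join(s), "".join(q)) for h, s, q in fixed]


fastq_text = "@read1\nACGTN\n+\nIIII!\n@read2\nGGCTA\n+\nII#II\n"
records = parse_fastq(fastq_text)
patch = build_patch(records)

# Lossless compression of the FASTQ file and of the patch file (zlib as a stand-in).
compressed_fastq = zlib.compress(fastq_text.encode())
compressed_patch = zlib.compress(json.dumps(patch).encode())

# Decompress both files.
restored = parse_fastq(zlib.decompress(compressed_fastq).decode())
restored_patch = json.loads(zlib.decompress(compressed_patch).decode())

# Simulate an inconsistency in a target base, then repair it from the patch.
h, s, q = restored[0]
damaged = [(h, s[:4] + "A", q)] + restored[1:]
repaired = apply_patch(damaged, restored_patch)
```

In this sketch the patch acts as an independent record of the selected bases, so a discrepancy introduced anywhere in the FASTQ round trip can be detected and corrected for those positions.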