Publication No.: WO2019232357A1
Publication date: 2019-12-05
Application No.: PCT/US2019/034885
Filing date: 2019-05-31
IPC classification: C12Q1/6888, G16B20/00, G16B30/00
Abstract: Systems and methods for metagenomic analysis are provided. A method of metagenome sequence analysis of two or more samples can include (i) counting the abundance of each k-mer deconstructed from sequencing reads of nucleic acids in each sample, and (ii) using a vector space model to compute the genetic distance between each of the two or more samples according to the abundance of the k-mers. In some embodiments, counting includes (a) constructing a k-mer histogram containing the distribution of k-mers for each sample, and (b) dividing k-mers into partitions having approximately an equal number of k-mers based on the histogram, preparing an inverted index of the k-mers in each partition, and assigning a weight to each k-mer according to its abundance. Methods of developing diagnostic and prognostic information using the methods of sequence analysis are also provided.
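
The two core steps of the abstract, k-mer abundance counting and a vector space distance between samples, can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (`count_kmers`, `build_inverted_index`, `cosine_distance`), the use of raw counts as k-mer weights, and the choice of cosine distance as the vector space measure are all assumptions made for illustration; the abstract does not specify them. The histogram-based partitioning step is omitted.

```python
from collections import Counter
from math import sqrt


def count_kmers(reads, k):
    """Count the abundance of every k-mer found in a sample's reads."""
    counts = Counter()
    for read in reads:
        # Slide a window of length k over the read.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def build_inverted_index(sample_counts):
    """Map each k-mer to the samples containing it and its count there.

    sample_counts: dict of sample name -> Counter of k-mer abundances.
    """
    index = {}
    for sample, counts in sample_counts.items():
        for kmer, n in counts.items():
            index.setdefault(kmer, {})[sample] = n
    return index


def cosine_distance(a, b):
    """Cosine distance between two k-mer abundance vectors.

    Assumption: raw counts serve as weights; the abstract only says that
    weights are assigned according to abundance.
    """
    dot = sum(w * b[kmer] for kmer, w in a.items() if kmer in b)
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty sample as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical toy samples, one short read each.
samples = {
    "sample_1": ["ACGTACGT"],
    "sample_2": ["ACGTACGT"],
    "sample_3": ["TTTTTTTT"],
}
profiles = {name: count_kmers(reads, k=3) for name, reads in samples.items()}
index = build_inverted_index(profiles)
```

Calling `cosine_distance(profiles["sample_1"], profiles["sample_3"])` on these toy samples compares two profiles with no shared k-mers, while `index["ACG"]` lists the samples in which that k-mer occurs together with its count in each.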