METHODS FOR COMPARATIVE METAGENOMIC ANALYSIS

    Publication No.: WO2019232357A1

    Publication date: 2019-12-05

    Application No.: PCT/US2019/034885

    Filing date: 2019-05-31

    Abstract: Systems and methods for metagenomic analysis are provided. A method of metagenome sequence analysis of two or more samples can include (i) counting the abundance of each k-mer deconstructed from sequencing reads of nucleic acids in each sample, and (ii) using a vector space model to compute the genetic distance between each of the two or more samples according to the abundance of the k-mers. In some embodiments, counting includes (a) constructing a k-mer histogram containing the distribution of k-mers for each sample, and (b) dividing the k-mers into partitions containing approximately equal numbers of k-mers based on the histogram, preparing an inverted index of the k-mers in each partition, and assigning a weight to each k-mer according to its abundance. Methods of developing diagnostic and prognostic information using the methods of sequence analysis are also provided.
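
The two steps named in the abstract, k-mer abundance counting and a vector space distance, can be illustrated with a minimal Python sketch. It is not the method claimed in the application. The function names (`count_kmers`, `build_inverted_index`, `cosine_distance`), the toy reads, and the choice of k = 3 are hypothetical. Cosine distance stands in for the vector space model, raw counts stand in for the abundance-based weights, and the histogram-based partitioning is omitted, since the abstract does not specify any of these details.

```python
from collections import Counter
from math import sqrt


def count_kmers(reads, k):
    """Count every length-k substring across one sample's reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def build_inverted_index(samples, k):
    """Map each k-mer to {sample name: abundance} over all samples."""
    index = {}
    for name, reads in samples.items():
        for kmer, n in count_kmers(reads, k).items():
            index.setdefault(kmer, {})[name] = n
    return index


def cosine_distance(index, a, b):
    """Return 1 - cosine similarity of two samples' abundance vectors.

    Assumption: raw counts are used as weights; the source only says
    weights are assigned "according to abundance".
    """
    dot = norm_a = norm_b = 0.0
    for postings in index.values():
        x = postings.get(a, 0)
        y = postings.get(b, 0)
        dot += x * y
        norm_a += x * x
        norm_b += y * y
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty sample shares nothing with any other
    return 1.0 - dot / (sqrt(norm_a) * sqrt(norm_b))


# Hypothetical toy reads; real input would be sequencing reads per sample.
samples = {
    "s1": ["ACGTACGT", "TTGACA"],
    "s2": ["ACGTACGT", "TTGACA"],
    "s3": ["GGGGCCCC"],
}
index = build_inverted_index(samples, k=3)
d_same = cosine_distance(index, "s1", "s2")
d_diff = cosine_distance(index, "s1", "s3")
```

Here `s1` and `s2` contain identical reads, so their distance is essentially zero, while `s3` shares no 3-mers with `s1`, so their distance is at its maximum. Storing the counts in an inverted index keyed by k-mer means the distance loop touches only k-mers that occur in at least one sample, which is one plausible reason to index by k-mer rather than by sample.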