-
1.
Publication number: US20150317433A1
Publication date: 2015-11-05
Application number: US14701248
Filing date: 2015-04-30
Applicant: Complete Genomics, Inc.
Inventor: Bahram Ghaffarzadeh Kermani, Somayeh Bakhtiari
IPC: G06F19/22
CPC classification number: G16B30/00
Abstract: Systems, methods, and apparatuses are provided for determining a sequence of a heteropolymer molecule. For example, all or part of a chromosome or a protein can be determined using sequence data from a plurality of heteropolymer fragments corresponding to the heteropolymer molecule. As one example, a position in the sequence read of a DNA fragment can be identified where a single base call is not clear. A multiplet base call can then be used, where the multiplet base call includes two or more bases at the position, along with a score for each base. The scores can be carried through mapping and assembly procedures, where the scores can be used to determine a final base call for the position in a chromosome of a genome of an organism. Other examples can be used for other monomer units besides bases.
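The idea of carrying per-base scores from a multiplet base call through to a final call can be illustrated with a minimal sketch. This is not the method claimed in the application: the dict-based representation, the function name `consensus_call`, and the rule of summing scores and taking the highest total are all assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical representation (not from the application): each read contributes
# one call per position, stored as a dict mapping base -> score.
# A clear single call has one entry; a multiplet call has two or more.

def consensus_call(calls):
    """Sum per-base scores over all reads covering one position.

    Returns (best_base, totals). Summation followed by arg-max is an
    assumed combination rule chosen for simplicity.
    """
    totals = defaultdict(float)
    for call in calls:
        for base, score in call.items():
            totals[base] += score
    if not totals:
        raise ValueError("no calls cover this position")
    best = max(totals, key=totals.get)
    return best, dict(totals)

# Three reads mapped to the same reference position; two are multiplet calls.
pileup = [
    {"A": 0.6, "G": 0.4},  # ambiguous read: A slightly favored
    {"G": 1.0},            # clear single call
    {"G": 0.7, "T": 0.3},  # ambiguous read: G favored
]
best, totals = consensus_call(pileup)
print(best, totals)
```

In this sketch the first read alone would favor A, but keeping its lower-scoring G alternative lets the evidence from the other reads settle the position on G, which is the motivation the abstract gives for retaining scores through mapping and assembly.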