METHOD AND SYSTEM FOR DETERMINING COPY NUMBER VARIATION
    4.
    Type: invention application
    Status: under examination (published)

    Publication No.: US20150056619A1

    Publication date: 2015-02-26

    Application No.: US14389898

    Filing date: 2012-04-05

    IPC classification: C12Q1/68

    Abstract: Disclosed are a method and a system for determining genome copy number variation (CNV), relating to the technical field of bioinformatics. The method comprises: obtaining reads; determining sequence labels according to the reads; counting the number of sequence labels falling into each window; performing GC correction on the sequence label number of each window, and a further correction according to an expected sequence label number adjusted by a control set, to obtain a corrected sequence label number; selecting demarcation points with small significance values as candidate CNV breaking points; and, in each round, rejecting the least significant candidate CNV breaking point, updating the difference significance values of the two candidate CNV breaking points to the left and right of the rejected one, and iterating until the difference significance values of all candidate CNV breaking points are smaller than a termination threshold value, thereby determining the CNV breaking points. The method and system of the present invention are clinically feasible and can precisely detect a micro-deletion/micro-duplication area of 0.5 M using about 50 M of data.
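
The steps named in this abstract (window counting, GC correction, control-set normalization, iterative rejection of the least significant candidate breaking point) can be illustrated with a rough Python sketch. It is not the patented implementation. All function names, the GC-bin median scaling, the noise estimate, and the z-like difference score are assumptions made for illustration. The abstract speaks of significance values where smaller means more significant; the sketch instead uses a score where larger means more significant, so the stopping rule is inverted. It also recomputes every score in each round rather than updating only the two neighbours of the rejected point, which is simpler but slower.

```python
from math import sqrt
from statistics import mean, median


def count_per_window(positions, window_size, n_windows):
    """Count sequence labels (given by start position) falling into each window."""
    counts = [0] * n_windows
    for pos in positions:
        idx = pos // window_size
        if 0 <= idx < n_windows:
            counts[idx] += 1
    return counts


def gc_correct(counts, gc_fractions, n_bins=10):
    """Assumed GC correction: scale each window so that the median of its
    GC bin matches the overall median."""
    def bin_of(gc):
        return min(int(gc * n_bins), n_bins - 1)

    overall = median(counts)
    bins = {}
    for c, gc in zip(counts, gc_fractions):
        bins.setdefault(bin_of(gc), []).append(c)
    bin_median = {b: median(v) for b, v in bins.items()}
    corrected = []
    for c, gc in zip(counts, gc_fractions):
        m = bin_median[bin_of(gc)]
        corrected.append(c * overall / m if m > 0 else float(c))
    return corrected


def normalize_by_control(corrected, expected):
    """Divide by the expected label number derived from a control set."""
    return [c / e if e > 0 else 0.0 for c, e in zip(corrected, expected)]


def noise_sigma(values):
    """Robust noise estimate from differences of adjacent windows (assumption)."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    sigma = 1.4826 * median(diffs) / sqrt(2) if diffs else 0.0
    return sigma or 1e-9


def difference_score(left, right, sigma):
    """z-like score for the difference between two neighbouring segments.
    Larger means more significant (the opposite direction of a p-value)."""
    se = sigma * sqrt(1 / len(left) + 1 / len(right))
    return abs(mean(left) - mean(right)) / se


def find_breakpoints(values, min_score=5.0):
    """Start with every window boundary as a candidate, then repeatedly
    reject the least significant one until all remaining candidates
    score at least min_score."""
    sigma = noise_sigma(values)
    bps = list(range(1, len(values)))
    while bps:
        bounds = [0] + bps + [len(values)]
        scores = [
            difference_score(values[bounds[i]:bounds[i + 1]],
                             values[bounds[i + 1]:bounds[i + 2]], sigma)
            for i in range(len(bps))
        ]
        worst = min(range(len(bps)), key=scores.__getitem__)
        if scores[worst] >= min_score:
            break
        del bps[worst]
    return bps
```

A typical call chain under these assumptions would be `find_breakpoints(normalize_by_control(gc_correct(counts, gc), expected))`, where the returned indices mark window boundaries at which the normalized depth changes.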


    Error correcting method of test sequence, corresponding system and gene assembly equipment
    8.
    Type: granted invention patent
    Status: in force

    Publication No.: US08751165B2

    Publication date: 2014-06-10

    Application No.: US13132038

    Filing date: 2009-12-11

    IPC classification: G01N33/48 G06F19/00 G06F19/24

    Abstract: The present invention provides an error correcting method for test sequences, which involves: receiving test sequences; configuring a high frequency short string list based on a preset high frequency threshold value; traversing each received test sequence and, in combination with the high frequency short string list, searching for the area with the largest number of continuous high frequency short strings on each test sequence; configuring the whole left sequence and/or right sequence of high frequency short strings at the left side and/or right side of the searched area according to the corresponding received test sequence and the high frequency short string list; and constituting the corresponding test sequence from the configured left and/or right sequence and the searched area. The present invention also provides a corresponding error correcting system for test sequences and gene assembly equipment.
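
The idea in this abstract can be sketched in Python as follows, reading "short strings" as k-mers. This is an illustrative sketch only, not the patented method: the function names, the choice to accept a substitution only when exactly one alternative base yields a high frequency k-mer, and the fallback of keeping the original base are all assumptions.

```python
from collections import Counter


def high_freq_kmers(reads, k, min_count):
    """Build the high frequency short string (k-mer) list from all reads."""
    counts = Counter(r[i:i + k] for r in reads for i in range(len(r) - k + 1))
    return {kmer for kmer, c in counts.items() if c >= min_count}


def longest_trusted_region(read, k, trusted):
    """Return (start, end) of the read slice covered by the longest run of
    consecutive high frequency k-mers, or None if there is no such k-mer."""
    best = None
    run_start = None
    n = len(read) - k + 1
    for i in range(n + 1):
        ok = i < n and read[i:i + k] in trusted
        if ok and run_start is None:
            run_start = i
        elif not ok and run_start is not None:
            if best is None or i - run_start > best[1] - best[0]:
                best = (run_start, i)
            run_start = None
    if best is None:
        return None
    return best[0], best[1] + k - 1  # end is exclusive, in bases


def _choose(original, make_kmer, trusted):
    """Keep the original base if it gives a trusted k-mer; otherwise accept
    a replacement only if exactly one base does (assumed tie rule)."""
    if make_kmer(original) in trusted:
        return original
    fixes = [b for b in "ACGT" if make_kmer(b) in trusted]
    return fixes[0] if len(fixes) == 1 else original


def correct_read(read, k, trusted):
    """Rebuild the read by extending the trusted region to the right and left."""
    region = longest_trusted_region(read, k, trusted)
    if region is None:
        return read
    start, end = region
    seq = read[start:end]
    for pos in range(end, len(read)):  # extend to the right
        context = seq[len(seq) - k + 1:]
        seq += _choose(read[pos], lambda b: context + b, trusted)
    for pos in range(start - 1, -1, -1):  # extend to the left
        context = seq[:k - 1]
        seq = _choose(read[pos], lambda b: b + context, trusted) + seq
    return seq
```

For example, with five copies of a reference read plus one copy carrying a single substitution, `correct_read` on the erroneous copy would restore the reference sequence, provided `k` and `min_count` are chosen so that only the error-free k-mers pass the threshold.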


    METHOD AND SYSTEM TO DETERMINE BIOMARKERS RELATED TO ABNORMAL CONDITION
    9.
    Type: invention application
    Status: under examination (published)

    Publication No.: US20150376697A1

    Publication date: 2015-12-31

    Application No.: US13640448

    Filing date: 2012-08-22

    IPC classification: C12Q1/68

    CPC classification: C12Q1/6883 C12Q2600/112

    Abstract: A method and system to determine biomarkers related to an abnormal condition in a subject are provided, comprising: sequencing nucleic acid samples from a first and a second subject in order to obtain first and second sequencing results, each consisting of multiple sequences, wherein the first subject is in the abnormal condition, the second subject is not in the abnormal condition, the nucleic acid samples from the first and the second subject are both isolated from samples of the same type, and the first and the second subject belong to the same species; and determining the biomarkers related to the abnormal condition in the subject based on the difference between the first and the second sequencing results.


    METHOD OF DETECTING FUSED TRANSCRIPTS AND SYSTEM THEREOF

    Publication No.: US20140323320A1

    Publication date: 2014-10-30

    Application No.: US14369566

    Filing date: 2011-12-31

    IPC classification: G06F19/22

    Abstract: Provided is a method of detecting fusion transcripts in a sample to be analyzed. The method may comprise: subjecting the sample to be analyzed, which contains an RNA transcriptome, to paired-end sequencing, to obtain paired-end RNA-Seq data of the sample; aligning the paired-end RNA-Seq data to a human reference genome sequence, to obtain first paired-end mapped reads, first single-end mapped reads, and first unmapped reads; evaluating an insert size between the two ends of the paired-end mapped reads by means of the first paired-end mapped reads, to obtain a proportion of paired-end mapped reads with overlapped 3′-ends; aligning the first unmapped reads to annotated transcripts, to obtain second single-end mapped reads and second unmapped reads; aligning the second unmapped reads to the annotated transcripts, to filter out unmapped reads caused by indels and obtain third unmapped reads; merging all single-end mapped reads, to obtain a set of single-end mapped reads; obtaining gene pairs linked by a cross-read as a primary set of candidate gene pairs, based on the set of single-end mapped reads combined with the relationship of the mapped paired-end reads; subjecting the primary set of candidate gene pairs to filtration, to obtain a candidate set of fused gene pairs; bisecting each third unmapped read, to obtain half-unmapped reads; aligning the half-unmapped reads to gene-junction sequences in the candidate set of fused gene pairs, to obtain a potent region of a fused junction site in the gene in which the half-unmapped read is located; outputting the original reads of the mapped half-unmapped reads, to obtain useful unmapped reads; subjecting the candidate set of fused gene pairs to a fusion simulation; aligning the useful unmapped reads to a junction library, to obtain a fused gene supported by the useful unmapped reads; and calculating and gathering the fused sequences supported by the useful unmapped reads, to obtain information on the fused gene. A system for detecting fusion transcripts is also provided.
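
Three of the steps in this abstract (collecting gene pairs linked by cross-reads, bisecting unmapped reads, and checking reads against a simulated junction sequence) can be illustrated with a short Python sketch. It is a toy illustration under stated assumptions, not the patented pipeline: the function names, the `min_support` filter, and the use of exact substring matching in place of a real aligner are all hypothetical, and the gene names in the usage note are placeholders.

```python
from collections import Counter


def candidate_gene_pairs(pair_hits, min_support=2):
    """pair_hits holds one (gene_of_end1, gene_of_end2) tuple per read pair,
    with None for an end that did not map. A cross-read is a pair whose two
    ends map to different genes. Returns gene pairs with enough support."""
    counts = Counter()
    for gene1, gene2 in pair_hits:
        if gene1 and gene2 and gene1 != gene2:
            counts[tuple(sorted((gene1, gene2)))] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


def bisect_read(seq):
    """Split an unmapped read into two halves (half-unmapped reads)."""
    mid = len(seq) // 2
    return seq[:mid], seq[mid:]


def spans_junction(read, left_exon, right_exon):
    """Return True if the read matches the simulated fusion sequence
    (left_exon + right_exon) exactly and covers the junction site.
    Exact matching stands in for a real aligner here (assumption)."""
    fused = left_exon + right_exon
    junction = len(left_exon)
    start = fused.find(read)
    return start != -1 and start < junction < start + len(read)


def supporting_reads(unmapped_reads, left_exon, right_exon):
    """Collect the unmapped reads that support a simulated junction."""
    return [r for r in unmapped_reads if spans_junction(r, left_exon, right_exon)]
```

For instance, `candidate_gene_pairs([("GENE_A", "GENE_B"), ("GENE_B", "GENE_A"), ("GENE_A", "GENE_A")])` would keep only the pair `("GENE_A", "GENE_B")` with a support of 2, since the third read pair maps both ends to the same gene.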