Methods for estimating genome-wide copy number variations
    1.
    Granted invention patent
    Status: In force

    Publication No.: US08725422B2

    Publication date: 2014-05-13

    Application No.: US13270989

    Filing date: 2011-10-11

    IPC classification: G01N33/48 G06F19/00 G06F19/18

    Abstract: Methods for determining the copy number of a genomic region at a detection position of a target sequence in a sample are disclosed. Genomic regions of a target sequence in a sample are sequenced and measurement data for sequence coverage is obtained. Sequence coverage bias is corrected and may be normalized against a baseline sample. Hidden Markov Model (HMM) segmentation, scoring, and output are performed, and in some embodiments population-based no-calling and identification of low-confidence regions may also be performed. A total copy number value and region-specific copy number value for a plurality of regions are then estimated.
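As a rough illustration of the pipeline this abstract outlines (coverage normalized against a baseline sample, followed by HMM segmentation into copy-number states), the following is a minimal sketch. It is not the patented method: the Gaussian emission model, the candidate states 0 to 4, the `stay_prob` and `sigma` parameters, and the function name `segment_copy_number` are all assumptions made for illustration, and GC-bias correction, scoring, and no-calling are omitted.

```python
import numpy as np

def segment_copy_number(coverage, baseline, states=(0, 1, 2, 3, 4),
                        stay_prob=0.99, sigma=0.25):
    """Viterbi decoding of copy-number states from windowed coverage.

    Hypothetical model: normalized coverage ratio ~ Normal(cn / 2, sigma),
    so a diploid window has an expected ratio of 1.0.
    """
    coverage = np.asarray(coverage, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    ratio = coverage / baseline          # normalize against the baseline sample
    means = np.array(states, dtype=float) / 2.0
    n, k = len(ratio), len(states)

    # Log emission probabilities (Gaussian, constant terms dropped).
    log_emit = -0.5 * ((ratio[:, None] - means[None, :]) / sigma) ** 2

    # Log transition matrix: high probability of staying in the same state.
    switch = (1.0 - stay_prob) / (k - 1)
    log_trans = np.log(np.full((k, k), switch))
    np.fill_diagonal(log_trans, np.log(stay_prob))

    # Viterbi recursion with a uniform prior over states.
    score = log_emit[0].copy()
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        cand = score[:, None] + log_trans    # cand[i, j]: move from state i to j
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(k)] + log_emit[t]

    # Trace back the most likely state path.
    path = np.zeros(n, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(n - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return [states[i] for i in path]
```

Called with per-window read depths for the sample and the baseline, the function returns one copy-number state per window; runs of identical states correspond to segments. A high `stay_prob` discourages spurious single-window state changes.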


    METHODS FOR DETERMINING ABSOLUTE GENOME-WIDE COPY NUMBER VARIATIONS OF COMPLEX TUMORS
    2.
    Invention patent application
    Status: Under examination (published)

    Publication No.: US20130316915A1

    Publication date: 2013-11-28

    Application No.: US13888146

    Filing date: 2013-05-06

    IPC classification: C12Q1/68

    Abstract: Methods for interpreting absolute copy number of complex tumors and for determining the copy number of a genomic region at a detection position of a target sequence in a sample are disclosed. In certain aspects, genomic regions of a target sequence in a sample are sequenced and measurement data for sequence coverage is obtained. Sequence coverage bias is corrected and may be normalized against a baseline sample. Hidden Markov Model (HMM) segmentation, scoring, and output are performed, and in some embodiments population-based no-calling and identification of low-confidence regions may also be performed. A total copy number value and region-specific copy number value for a plurality of regions are then estimated.


    DETERMINING VARIANTS IN GENOME OF A HETEROGENEOUS SAMPLE
    3.
    Invention patent application
    Status: Under examination (published)

    Publication No.: US20130110407A1

    Publication date: 2013-05-02

    Application No.: US13621716

    Filing date: 2012-09-17

    IPC classification: G06F17/18

    CPC classification: G16B40/00 G16B30/00

    Abstract: After DNA fragments are sequenced and mapped to a reference, various hypotheses for the sequences in a variant region can be scored to find which sequence hypotheses are more likely. A hypothesis can include a specific variable fraction for the plurality of alleles that comprise the sequence hypothesis in the region. A likelihood of each hypothesis can be determined using a probability that accounts for the fraction of the alleles specified in the respective sequence hypothesis. Thus, other hypotheses besides standard homozygous and equal heterozygous (i.e., one chromosome with A and one with B in a cell) can be explored by explicitly including the variable fractions of the alleles as a parameter in the optimization. Also, a variant score can be determined for a variant relative to a reference. The variant score can be used to determine a variant calibrated score indicating a likelihood that the variant call is correct.
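The idea of scoring hypotheses that carry an explicit allele fraction can be sketched with a simple binomial read-count model. This is only an illustration under stated assumptions, not the scoring described in the application: the binomial model, the fixed `error_rate`, and the function names `log_likelihood` and `best_fraction` are hypothetical.

```python
import math

def log_likelihood(alt_reads, total_reads, fraction, error_rate=0.01):
    """Binomial log-likelihood of the read counts given an allele fraction."""
    # Probability that a read shows allele B, allowing for sequencing error.
    p = fraction * (1.0 - error_rate) + (1.0 - fraction) * error_rate
    log_binom = (math.lgamma(total_reads + 1)
                 - math.lgamma(alt_reads + 1)
                 - math.lgamma(total_reads - alt_reads + 1))
    return (log_binom + alt_reads * math.log(p)
            + (total_reads - alt_reads) * math.log(1.0 - p))

def best_fraction(alt_reads, total_reads, fractions):
    """Return the hypothesized allele fraction with the highest likelihood."""
    return max(fractions, key=lambda f: log_likelihood(alt_reads, total_reads, f))
```

With 20 reads of allele B out of 100, a hypothesis set limited to 0.0, 0.5, and 1.0 (homozygous or equal heterozygous) cannot fit the data well, whereas adding an intermediate fraction such as 0.2 lets the optimization select it. A nonzero `error_rate` keeps the likelihood finite for the 0.0 and 1.0 hypotheses.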


    METHODS FOR ESTIMATING GENOME-WIDE COPY NUMBER VARIATIONS
    4.
    Invention patent application
    Status: In force

    Publication No.: US20120095697A1

    Publication date: 2012-04-19

    Application No.: US13270989

    Filing date: 2011-10-11

    Abstract: Methods for determining the copy number of a genomic region at a detection position of a target sequence in a sample are disclosed. Genomic regions of a target sequence in a sample are sequenced and measurement data for sequence coverage is obtained. Sequence coverage bias is corrected and may be normalized against a baseline sample. Hidden Markov Model (HMM) segmentation, scoring, and output are performed, and in some embodiments population-based no-calling and identification of low-confidence regions may also be performed. A total copy number value and region-specific copy number value for a plurality of regions are then estimated.


    Duplication and deletion detection using transformation processing of depth vectors

    Publication No.: US09773031B1

    Publication date: 2017-09-26

    Application No.: US15489473

    Filing date: 2017-04-17

    IPC classification: G06F17/30

    Abstract: Techniques for accurately identifying duplications and deletions using depth vectors. A depth vector is generated for each of multiple clients based on a set of reads that is received and aligned to a reference data set. A transformation processing of the depth vectors is performed to produce multiple components. Each of the components is assigned an order based on the extent to which it accounts for cross-client differences in the depth vectors. Each of the components includes an intensity, multiple values, and multiple client weights. A subset of the components is identified based on the order. A sparse indicator and positional data for the sparse indicator can be determined from the components in the subset, and one or more clients can be identified as being associated with the components.
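The abstract does not name the transformation it applies. One plausible reading, assumed here purely for illustration, is a singular value decomposition of the client-by-position depth matrix: the singular values play the role of intensities, the right singular vectors give per-position values, and the left singular vectors give client weights. The function name `decompose_depth_vectors` and the dictionary keys are hypothetical.

```python
import numpy as np

def decompose_depth_vectors(depth):
    """depth: array of shape (clients, positions). Returns ordered components."""
    depth = np.asarray(depth, dtype=float)
    # Remove the per-position mean so components capture cross-client differences.
    centered = depth - depth.mean(axis=0, keepdims=True)
    # SVD: centered = U @ diag(s) @ Vt, with s sorted in descending order.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return [
        {"order": i, "intensity": s[i], "values": vt[i], "client_weights": u[:, i]}
        for i in range(len(s))
    ]
```

Under this reading, a duplication or deletion carried by one client shows up as a leading component whose values are large only at the affected positions and whose client weights single out that client.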

    Allele-specific expression patterns
    7.
    Invention patent application
    Status: Under examination (published)

    Publication No.: US20050003410A1

    Publication date: 2005-01-06

    Application No.: US10845316

    Filing date: 2004-05-12

    IPC classification: C12Q1/68

    Abstract: The invention provides methods of analyzing genes for differential relative allelic expression patterns. Haplotype blocks throughout the genomes of individuals are analyzed to identify haplotype patterns that are associated with specific differential relative allelic expression patterns. Haplotype blocks that contain associated haplotype patterns may be further investigated to identify genes or variants of genes involved in differential relative allelic expression patterns.


    Load balancing and conflict processing in workflow with task dependencies

    Publication No.: US09811391B1

    Publication date: 2017-11-07

    Application No.: US15449579

    Filing date: 2017-03-03

    Abstract: Embodiments in the disclosure are directed to the use of distributed computing to align reads against multiple portions of a reference dataset. Aligned portions of the reference dataset that correspond with an above-threshold alignment score can be assessed for the presence of sparse indicators that can be categorized and used to influence a determination of a state transition likelihood. Various tasks associated with the processing of reads (e.g., alignment, sparse indicator detection, and/or determination of a state transition likelihood) may be able to take advantage of parallel processing and can be distributed among the machines while considering the resource utilization of those machines. Different load-balancing mechanisms can be employed in order to achieve even resource utilization across the machines, and in some cases may involve assessing various processing characteristics that reflect a predicted resource expenditure and/or time profile for each task to be processed by a machine.
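One simple load-balancing mechanism that uses a predicted cost per task is greedy "largest task first, least-loaded machine" assignment. The sketch below is a generic illustration, not the mechanism claimed in the patent: the function name `assign_tasks`, the single scalar cost per task, and the task names in the usage note are assumptions, and task dependencies are ignored.

```python
import heapq

def assign_tasks(predicted_cost, n_machines):
    """Greedy assignment: largest predicted cost first, to the least-loaded machine."""
    heap = [(0.0, m) for m in range(n_machines)]   # (current load, machine id)
    heapq.heapify(heap)
    assignment = {m: [] for m in range(n_machines)}
    for task, cost in sorted(predicted_cost.items(), key=lambda kv: -kv[1]):
        load, machine = heapq.heappop(heap)        # least-loaded machine so far
        assignment[machine].append(task)
        heapq.heappush(heap, (load + cost, machine))
    return assignment
```

For example, with hypothetical costs `{"align_a": 8, "align_b": 7, "detect": 3, "score": 2}` and two machines, the two alignment tasks land on different machines and the smaller tasks fill in so that both machines end with a predicted load of 10.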