Systems and methods for smart tools in sequence pipelines
    81.
    Granted patent
    Legal status: in force

    Publication No.: US09558321B2

    Publication date: 2017-01-31

    Application No.: US14877378

    Filing date: 2015-10-07

    IPC classification: G06F9/44 G06F19/18 G06F19/28

    Abstract: A tool in a bioinformatics pipeline can include a smart wrapper and an executable. The smart wrapper can cause the executable to analyze the sequence data it receives and can also selectively cause a change to the pipeline when circumstances warrant. In certain aspects, a system for genomic analysis includes a processor coupled to a non-transitory memory. The system is operable to present to a user a plurality of genomic tools organized into a pipeline. At least a first one of the tools comprises an executable and a wrapper script. The system can receive sequence data and instructions from the user, the instructions calling for the sequence data to be analyzed by the pipeline, and can select, using the wrapper script, a change to the pipeline.
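The wrapper-plus-executable idea in this abstract can be illustrated with a minimal Python sketch. Everything named here (`Tool`, `run_pipeline`, `drop_ambiguous`, `toy_align`, `align_wrapper`) is hypothetical and not taken from the patent; the "change to the pipeline" is assumed, for illustration only, to be the insertion of a filtering step when the wrapper sees reads containing ambiguous `N` bases.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Reads = List[str]


@dataclass
class Tool:
    """One pipeline step: an executable plus an optional smart wrapper."""
    name: str
    executable: Callable[[Reads], Reads]
    # Hypothetical hook: inspects the incoming reads and may return an
    # extra Tool to run first (a change to the pipeline), or None.
    wrapper: Optional[Callable[[Reads], Optional["Tool"]]] = None


def run_pipeline(tools: List[Tool], reads: Reads) -> Tuple[Reads, List[str]]:
    """Run each tool in order, letting its wrapper insert a step before it."""
    executed = []
    for tool in tools:
        extra = tool.wrapper(reads) if tool.wrapper else None
        if extra is not None:
            reads = extra.executable(reads)
            executed.append(extra.name)
        reads = tool.executable(reads)
        executed.append(tool.name)
    return reads, executed


def drop_ambiguous(reads: Reads) -> Reads:
    """Toy filter: discard reads that contain an ambiguous base."""
    return [r for r in reads if "N" not in r]


def toy_align(reads: Reads) -> Reads:
    """Stand-in for a real aligner executable."""
    return [r + ":aligned" for r in reads]


def align_wrapper(reads: Reads) -> Optional[Tool]:
    """Assumed policy: add a filter step only when the data calls for it."""
    if any("N" in r for r in reads):
        return Tool("filter", drop_ambiguous)
    return None


pipeline = [Tool("align", toy_align, align_wrapper)]
dirty_result, dirty_steps = run_pipeline(pipeline, ["ACGT", "ACNT"])
clean_result, clean_steps = run_pipeline(pipeline, ["ACGT"])
```

With the input containing an `N`, the wrapper inserts the filter before alignment; with clean input the pipeline runs unchanged.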

    METHODS AND SYSTEMS FOR DETECTING SEQUENCE VARIANTS

    Publication No.: US20160306921A1

    Publication date: 2016-10-20

    Application No.: US15196345

    Filing date: 2016-06-29

    Inventor: Deniz Kural

    IPC classification: G06F19/22 G06F19/28

    CPC classification: G06F19/22 G06F19/28

    Abstract: The invention provides methods for identifying rare variants near a structural variation in a genetic sequence, for example, in a nucleic acid sample taken from a subject. The invention additionally includes methods for aligning reads (e.g., nucleic acid reads) to a reference sequence construct accounting for the structural variation, methods for building a reference sequence construct accounting for the structural variation or the structural variation and the rare variant, and systems that use the alignment methods to identify rare variants. The method is scalable, and can be used to align millions of reads to a construct thousands of bases long, or longer.

    Methods and systems for detecting sequence variants

    Publication No.: US09390226B2

    Publication date: 2016-07-12

    Application No.: US14811057

    Filing date: 2015-07-28

    Inventor: Deniz Kural

    CPC classification: G06F19/22 G06F19/28

    Abstract: The invention provides methods for identifying rare variants near a structural variation in a genetic sequence, for example, in a nucleic acid sample taken from a subject. The invention additionally includes methods for aligning reads (e.g., nucleic acid reads) to a reference sequence construct accounting for the structural variation, methods for building a reference sequence construct accounting for the structural variation or the structural variation and the rare variant, and systems that use the alignment methods to identify rare variants. The method is scalable, and can be used to align millions of reads to a construct thousands of bases long, or longer.

    SYSTEMS AND METHODS FOR ANALYZING SEQUENCE DATA
    84.
    Patent application
    Legal status: in force

    Publication No.: US20150227685A1

    Publication date: 2015-08-13

    Application No.: US14177958

    Filing date: 2014-02-11

    Inventor: Deniz Kural

    IPC classification: G06F19/22 G06F19/26

    Abstract: The invention provides methods for comparing one set of genetic sequences to another without discarding any information within either set. A set of genetic sequences is represented using a directed acyclic graph (DAG), avoiding any unwarranted reduction to a linear data structure. The invention provides a way to align one sequence DAG to another to produce an alignment that can itself be stored as a DAG. DAG-to-DAG alignment is a natural choice wherever a set of genomic information consisting of more than one string needs to be compared to any non-linear reference. For example, a subpopulation DAG could be compared to a population DAG in order to compare the genetic features of that subpopulation to those of the population.
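To make the idea of storing a set of sequences as a DAG concrete, the following Python sketch builds two small sequence DAGs and enumerates the strings their paths spell. The class `SequenceDAG`, the helper `build`, and the example sequences are all hypothetical. The final comparison is a plain set-containment check over the enumerated strings, used here only as a stand-in; it is not the DAG-to-DAG alignment the abstract refers to.

```python
from typing import Dict, List, Set, Tuple


class SequenceDAG:
    """Nodes carry sequence fragments; edges say which fragment may follow."""

    def __init__(self) -> None:
        self.label: Dict[int, str] = {}
        self.succ: Dict[int, List[int]] = {}

    def add_node(self, node_id: int, seq: str) -> None:
        self.label[node_id] = seq
        self.succ.setdefault(node_id, [])

    def add_edge(self, src: int, dst: int) -> None:
        self.succ[src].append(dst)

    def sequences(self, start: int) -> Set[str]:
        """Return every string spelled by a path from start to a sink node."""
        found: Set[str] = set()
        stack: List[Tuple[int, str]] = [(start, self.label[start])]
        while stack:
            node, prefix = stack.pop()
            if not self.succ[node]:
                found.add(prefix)
            for nxt in self.succ[node]:
                stack.append((nxt, prefix + self.label[nxt]))
        return found


def build(nodes: Dict[int, str], edges: List[Tuple[int, int]]) -> SequenceDAG:
    """Convenience constructor for the toy examples below."""
    dag = SequenceDAG()
    for node_id, seq in nodes.items():
        dag.add_node(node_id, seq)
    for src, dst in edges:
        dag.add_edge(src, dst)
    return dag


# Population: ACG, then either T or C, then GA.
population = build({0: "ACG", 1: "T", 2: "C", 3: "GA"},
                   [(0, 1), (0, 2), (1, 3), (2, 3)])
# Subpopulation: only the T allele is observed.
subpopulation = build({0: "ACG", 1: "T", 2: "GA"}, [(0, 1), (1, 2)])

pop_seqs = population.sequences(0)
sub_seqs = subpopulation.sequences(0)
sub_is_covered = sub_seqs <= pop_seqs
```

Enumerating every path is exponential in the number of branch sites, which is one reason a real comparison would operate on the graphs directly instead of on the expanded strings.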

    METHODS AND SYSTEMS FOR GENOTYPING GENETIC SAMPLES
    85.
    Patent application
    Legal status: under examination (published)

    Publication No.: US20150199472A1

    Publication date: 2015-07-16

    Application No.: US14517406

    Filing date: 2014-10-17

    Inventor: Deniz Kural

    IPC classification: G06F19/22

    Abstract: The invention provides methods and systems for making specific base calls at specific loci using a reference sequence construct, e.g., a directed acyclic graph (DAG) that represents known variants at each locus of the genome. Because the sequence reads are aligned directly to the DAG, the subsequent step of comparing a mutation, vis-à-vis the reference genome, to a table of known mutations can be eliminated. The disclosed methods and systems are notably efficient in dealing with structural variations within a genome or mutations that are within a structural variation.

    METHODS AND SYSTEMS FOR ALIGNING SEQUENCES
    86.
    Patent application
    Legal status: in force

    Publication No.: US20150057946A1

    Publication date: 2015-02-26

    Application No.: US14016833

    Filing date: 2013-09-03

    Inventor: Deniz Kural

    IPC classification: G06F19/22

    CPC classification: G06F19/22

    Abstract: The invention includes methods for aligning reads (e.g., nucleic acid reads, amino acid reads) to a reference sequence construct, methods for building the reference sequence construct, and systems that use the alignment methods and constructs to produce sequences. The method is scalable, and can be used to align millions of reads to a construct thousands of bases or amino acids long. The invention additionally includes methods for identifying a disease or a genotype based upon alignment of nucleic acid reads to a location in the construct.
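As a rough illustration of aligning a read to a graph-shaped reference, the sketch below applies a Smith-Waterman style local alignment recurrence to a DAG whose nodes each hold a single base. This is a generic textbook-style formulation chosen for brevity, not the algorithm claimed in the patent; the function name `align_to_dag`, the scoring values, and the toy graph are all assumptions.

```python
from typing import List


def align_to_dag(read: str, bases: List[str], preds: List[List[int]],
                 match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score of read against a single-base DAG.

    bases[v] is the base at node v and preds[v] lists its predecessors.
    Nodes must be numbered in topological order.
    """
    n, m = len(bases), len(read)
    # score[v][j]: best alignment ending at node v and read position j.
    score = [[0] * (m + 1) for _ in range(n)]
    best = 0
    for v in range(n):
        for j in range(1, m + 1):
            s = match if bases[v] == read[j - 1] else mismatch
            # Consume node v and read[j-1], arriving from any predecessor.
            diag = max((score[p][j - 1] for p in preds[v]), default=0) + s
            # Consume node v only (gap in the read).
            up = max((score[p][j] for p in preds[v]), default=0) + gap
            # Consume read[j-1] only (gap in the graph).
            left = score[v][j - 1] + gap
            score[v][j] = max(0, diag, up, left)
            best = max(best, score[v][j])
    return best


# Graph: A-C-G, then a T/C branch, then G-A.
graph_bases = ["A", "C", "G", "T", "C", "G", "A"]
graph_preds = [[], [0], [1], [2], [2], [3, 4], [5]]

# Linear reference carrying only the T allele.
linear_bases = list("ACGTGA")
linear_preds = [[], [0], [1], [2], [3], [4]]

graph_score = align_to_dag("ACGCGA", graph_bases, graph_preds)
linear_score = align_to_dag("ACGCGA", linear_bases, linear_preds)
```

A read carrying the C allele matches the graph exactly (score 12) but incurs a mismatch against the linear reference (score 9), which is the practical benefit of encoding known variation in the reference construct.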

    System and method for sequence identification in reassembly variant calling

    Publication No.: US12046325B2

    Publication date: 2024-07-23

    Application No.: US16276070

    Filing date: 2019-02-14

    Inventor: Ivan Johnson

    Abstract: In one embodiment, a method for identifying candidate sequences for genotyping a genomic sample comprises obtaining a plurality of sequence reads mapping to a genomic region of interest. The plurality of sequence reads are assembled into a directed acyclic graph (DAG) comprising a plurality of branch sites representing variation present in the set of sequence reads, each branch site comprising two or more branches. A path through the DAG comprises a set of successive branches over two or more branch sites and represents a possible candidate sequence of the genomic sample. One or more paths through the DAG are ranked by calculating scores for one or more branch sites, wherein the calculated score comprises a number of sequence reads that span multiple branch sites in a given path. At least one path is selected as a candidate sequence based at least in part on its rank.
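The ranking step described in this abstract can be sketched in a simplified form. The Python below assumes a hypothetical data model in which each branch site is a list of branch labels and each read is a dictionary from branch-site index to the branch it supports. A path's score is taken to be the number of reads that cover at least two branch sites and agree with the path at every site they cover. This is one plausible reading of "reads that span multiple branch sites", not the patent's exact scoring, and `rank_paths` is an invented name.

```python
from itertools import product
from typing import Dict, List, Tuple

Path = Tuple[str, ...]


def rank_paths(branch_sites: List[List[str]],
               reads: List[Dict[int, str]]) -> List[Tuple[int, Path]]:
    """Score every path by its number of supporting multi-site reads."""
    scored = []
    for path in product(*branch_sites):
        support = sum(
            1 for read in reads
            # Only reads spanning two or more branch sites count.
            if len(read) >= 2
            and all(path[site] == branch for site, branch in read.items())
        )
        scored.append((support, path))
    # Highest support first; ties keep enumeration order (stable sort).
    scored.sort(key=lambda item: -item[0])
    return scored


sites = [["A", "G"], ["C", "T"], ["-", "TT"]]
observed = [
    {0: "A", 1: "C"},
    {0: "A", 1: "C"},
    {1: "C", 2: "TT"},
    {0: "G", 1: "T"},
    {2: "-"},  # covers a single site, so it is ignored by the score
]

ranking = rank_paths(sites, observed)
top_support, top_path = ranking[0]
```

Exhaustive enumeration with `product` is only workable for a handful of branch sites; it is used here to keep the scoring rule easy to read.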

    SYSTEMS AND METHODS FOR ADAPTIVE LOCAL ALIGNMENT FOR GRAPH GENOMES

    Publication No.: US20240096450A1

    Publication date: 2024-03-21

    Application No.: US18464597

    Filing date: 2023-09-11

    IPC classification: G16B30/10 G16B30/00

    CPC classification: G16B30/10 G16B30/00

    Abstract: Systems and methods for analyzing genomic information can include obtaining a sequence read including genetic information; identifying, within a graph representing a reference genome, a plurality of candidate mapping positions that relate to the genetic information, the graph comprising nodes representing genetic sequences and edges connecting pairs of nodes; determining, by means of a computer system, whether an alignment with the graph surrounding each of the plurality of candidate mapping positions is advanced or basic; and performing for each candidate mapping position, by means of the computer system, a local alignment based on whether the local alignment is advanced or basic. The advanced local alignment can include a first-local-alignment algorithm, and the basic local alignment can include a second-local-alignment algorithm. Based on the local alignments, the mapped position of the sequence read can be identified within the genome.
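The abstract does not say how an alignment is classified as advanced or basic. Purely as an assumption for illustration, the Python sketch below labels a candidate position "advanced" when the graph branches within a fixed number of edges downstream of it, and "basic" otherwise. The names `region_branches` and `choose_aligners`, the `radius` parameter, and the toy graph are hypothetical.

```python
from typing import Dict, List


def region_branches(succ: Dict[int, List[int]], start: int, radius: int) -> bool:
    """True if any node within radius edges downstream of start has >1 successor."""
    frontier, seen = [start], {start}
    for _ in range(radius + 1):
        next_frontier = []
        for node in frontier:
            out = succ.get(node, [])
            if len(out) > 1:
                return True
            for nxt in out:
                if nxt not in seen:
                    seen.add(nxt)
                    next_frontier.append(nxt)
        frontier = next_frontier
    return False


def choose_aligners(succ: Dict[int, List[int]], candidates: List[int],
                    radius: int = 2) -> Dict[int, str]:
    """Pick an alignment mode per candidate mapping position (assumed rule)."""
    return {
        pos: "advanced" if region_branches(succ, pos, radius) else "basic"
        for pos in candidates
    }


# Toy graph: 0 -> 1 -> 2, a branch at 2 into 3 and 4, rejoining at 5 -> 6.
toy_succ = {0: [1], 1: [2], 2: [3, 4], 3: [5], 4: [5], 5: [6], 6: []}

modes = choose_aligners(toy_succ, [0, 5], radius=2)
modes_narrow = choose_aligners(toy_succ, [0], radius=1)
```

The design intent this illustrates is cost control: a cheaper aligner can handle candidates in locally linear regions, while the more expensive graph-aware aligner is reserved for regions where the graph actually branches.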

    Methods and systems for detecting sequence variants

    Publication No.: US11837328B2

    Publication date: 2023-12-05

    Application No.: US17933260

    Filing date: 2022-09-19

    Inventor: Deniz Kural

    Abstract: The invention provides methods for identifying rare variants near a structural variation in a genetic sequence, for example, in a nucleic acid sample taken from a subject. The invention additionally includes methods for aligning reads (e.g., nucleic acid reads) to a reference sequence construct accounting for the structural variation, methods for building a reference sequence construct accounting for the structural variation or the structural variation and the rare variant, and systems that use the alignment methods to identify rare variants. The method is scalable, and can be used to align millions of reads to a construct thousands of bases long, or longer.

    SYSTEMS AND METHODS FOR ANALYZING VIRAL NUCLEIC ACIDS

    Publication No.: US20230366046A1

    Publication date: 2023-11-16

    Application No.: US18324799

    Filing date: 2023-05-26

    IPC classification: C12Q1/70 G16B30/00 C12Q1/6809

    Abstract: The invention provides systems and methods for analyzing viruses by representing viral genetic diversity with a directed acyclic graph (DAG), which allows genetic sequencing technology to detect rare variations and represent otherwise difficult-to-document diversity within a sample. Additionally, a host-specific sequence DAG can be used to effectively segregate viral nucleic acid sequence reads from host sequence reads when a sample from a host is subject to sequencing. Known viral genomes can be represented using a viral reference DAG, and the viral sequence reads from the sample can be compared to the viral DAG to identify the viral species or strains from which the reads were derived. Where the viral sequence reads indicate great genetic diversity in the virus that was infecting the host, those reads can be assembled into a DAG that itself properly represents that diversity.