-
Publication No.: US11810648B2
Publication Date: 2023-11-07
Application No.: US16663243
Filing Date: 2019-10-24
Inventors: Kaushik Ghose, Wan-Ping Lee
Abstract: Systems and methods for analyzing genomic information can include obtaining a sequence read including genetic information; identifying, within a graph representing a reference genome, a plurality of candidate mapping positions that relate to the genetic information, the graph comprising nodes representing genetic sequences and edges connecting pairs of nodes; determining, by means of a computer system, whether an alignment with the graph surrounding each of the plurality of candidate mapping positions is advanced or basic; and performing for each candidate mapping position, by means of the computer system, a local alignment based on whether the local alignment is advanced or basic. The advanced local alignment can include a first-local-alignment algorithm, and the basic local alignment includes a second-local-alignment algorithm. Based on the local alignments, the mapped position of the sequence read can be identified within the genome.
-
Publication No.: US11649495B2
Publication Date: 2023-05-16
Application No.: US16798759
Filing Date: 2020-02-24
Inventors: Devin Locke, Piotr Szamel
IPC Classification: G01N33/48, C12Q1/6874, C12Q1/6888, G16B20/00, G16B30/10
CPC Classification: C12Q1/6874, C12Q1/6888, G16B20/00, G16B30/10, C12Q2600/156
Abstract: The invention provides methods of analyzing an individual's mtDNA by transforming available reference sequences into a directed graph that compactly represents all the information without duplication and comparing sequence reads from the mtDNA to the graph to identify the individual or describe their mtDNA. A directed graph can represent all of the genetic variation found among the mitochondrial genomes across all of a number of reference organisms while providing a single article to which sequence reads can be aligned or compared. Thus any sequence read or other sequence fragment can be compared, in a single operation, to the article that represents all of the reference mitochondrial sequences.
-
Publication No.: US11447828B2
Publication Date: 2022-09-20
Application No.: US16106996
Filing Date: 2018-08-21
Inventors: Deniz Kural
IPC Classification: G16B20/20, G16B20/00, G16B30/10, C12Q1/6886, C12Q1/6883
Abstract: The invention includes methods and systems for identifying disease-induced mutations by producing multi-dimensional reference sequence constructs that account for variations between individuals, different diseases, and different stages of those diseases. Once constructed, these reference sequence constructs can be used to align sequence reads corresponding to genetic samples from patients suspected of having a disease, or who have had the disease and are in suspected remission. The reference sequence constructs also provide insight into the genetic progression of the disease.
-
Publication No.: US20220261384A1
Publication Date: 2022-08-18
Application No.: US17729896
Filing Date: 2022-04-26
Inventors: Vladimir Semenyuk
IPC Classification: G06F16/22, G06F16/901, G06F16/2455, G16B50/00, G16B50/20, G16B30/10, G16B50/30, G16B50/10, G16B50/50
Abstract: Methods of the invention include representing biological data in a memory subsystem within a computer system with a data structure that is particular to a location in the memory subsystem and serializing the data structure into a stream of bytes that can be deserialized into a clone of the data structure. In a preferred genomic embodiment, the biological data comprises genomic sequences and the data structure comprises a genomic directed acyclic graph (DAG) in which objects have adjacency lists of pointers that indicate the location of any object adjacent to that object. After serialization and deserialization, the clone genomic DAG has the same structure as the original to represent the same sequences and relationships among them as the original.
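Illustrative note: the serialization idea in this abstract can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not the patented method: the `Node` class, the little-endian 32-bit layout, and the `serialize`/`deserialize` names are all hypothetical. In-memory references (the "pointers") are replaced by node indices on the way out and rewired on the way back in, so the clone has the same structure at new memory locations.

```python
import struct
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """One DAG object (hypothetical): a sequence plus references to adjacent nodes."""
    seq: str
    out: list = field(default_factory=list)  # adjacency list of Node references


def serialize(nodes):
    """Flatten nodes into bytes, replacing in-memory references with indices."""
    index = {id(n): i for i, n in enumerate(nodes)}
    buf = bytearray(struct.pack("<I", len(nodes)))
    for n in nodes:
        seq = n.seq.encode("ascii")
        buf += struct.pack("<I", len(seq)) + seq
        buf += struct.pack("<I", len(n.out))
        for m in n.out:
            buf += struct.pack("<I", index[id(m)])
    return bytes(buf)


def deserialize(data):
    """Rebuild a clone: create every node first, then rewire the references."""
    pos = 0

    def read_u32():
        nonlocal pos
        (value,) = struct.unpack_from("<I", data, pos)
        pos += 4
        return value

    count = read_u32()
    nodes, edges = [], []
    for _ in range(count):
        length = read_u32()
        nodes.append(Node(data[pos:pos + length].decode("ascii")))
        pos += length
        degree = read_u32()
        edges.append([read_u32() for _ in range(degree)])
    for node, targets in zip(nodes, edges):
        node.out = [nodes[t] for t in targets]
    return nodes


# Usage: a small "bubble" graph ACG -> (T | G) -> TTA.
a, b, c, d = Node("ACG"), Node("T"), Node("G"), Node("TTA")
a.out, b.out, c.out = [b, c], [d], [d]
clone = deserialize(serialize([a, b, c, d]))
```

Storing indices rather than raw addresses is one common way to make such a byte stream independent of where the original objects lived in memory.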
-
Publication No.: US20210398616A1
Publication Date: 2021-12-23
Application No.: US17359338
Filing Date: 2021-06-25
Inventors: Deniz Kural
IPC Classification: G16B30/10, G16B30/00, C12Q1/6869
Abstract: The invention includes methods for aligning reads (e.g., nucleic acid reads) comprising repeating sequences, methods for building reference sequence constructs comprising repeating sequences, and systems that can be used to align reads comprising repeating sequences. The method is scalable, and can be used to align millions of reads to a construct thousands of bases long. The methods and systems can additionally account for variability within a repeating sequence, or near a repeating sequence, due to genetic mutation.
-
Publication No.: US20210258399A1
Publication Date: 2021-08-19
Application No.: US17191187
Filing Date: 2021-03-03
Inventors: Nemanja Zbiljic
IPC Classification: H04L29/08, G06F16/182, G06F12/0862, G16B50/00, G06F12/02
Abstract: A method for stream-processing biomedical data includes receiving, by a file system on a computing device, a first request for access to at least a first portion of a file stored on a remotely located storage device. The method includes receiving, by the file system, a second request for access to at least a second portion of the file. The method includes determining, by a pre-fetching component executing on the computing device, whether the first request and the second request are associated with a sequential read operation. The method includes automatically retrieving, by the pre-fetching component, a third portion of the requested file, before receiving a third request for access to at least the third portion of the file, based on a determination that the first request and the second request are associated with the sequential read operation.
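Illustrative note: the pre-fetching behavior described in this abstract can be sketched as follows. This is a minimal sketch under assumptions, not the patented implementation: `SequentialPrefetcher`, its `fetch` callback (standing in for a remote read), and the rule "a request is sequential if it starts where the previous one ended" are all hypothetical choices made for illustration.

```python
class SequentialPrefetcher:
    """Hypothetical pre-fetching component in front of a remotely stored file."""

    def __init__(self, fetch):
        self.fetch = fetch      # callable(offset, size) -> bytes; the remote read
        self.last_end = None    # end offset of the previous request
        self.cache = {}         # offset -> bytes retrieved ahead of time

    def read(self, offset, size):
        # Serve from the pre-fetched data when it fully covers the request.
        cached = self.cache.pop(offset, None)
        if cached is not None and len(cached) >= size:
            data = cached[:size]
        else:
            data = self.fetch(offset, size)

        # Assumed rule: two requests are sequential when the second one
        # starts exactly where the first one ended.
        sequential = self.last_end is not None and offset == self.last_end
        self.last_end = offset + size

        # On a sequential pattern, retrieve the next portion before it is requested.
        if sequential and self.last_end not in self.cache:
            self.cache[self.last_end] = self.fetch(self.last_end, size)
        return data


# Usage with an in-memory stand-in for the remote file.
backing = bytes(range(100))
calls = []

def remote_fetch(offset, size):
    calls.append((offset, size))
    return backing[offset:offset + size]

reader = SequentialPrefetcher(remote_fetch)
reader.read(0, 10)    # first request
reader.read(10, 10)   # second request: sequential, so bytes 20..29 are pre-fetched
```

After the second call, a third request for offset 20 is answered from the cache without a new remote read for that range.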
-
Publication No.: US10832797B2
Publication Date: 2020-11-10
Application No.: US14517419
Filing Date: 2014-10-17
Inventors: Deniz Kural
Abstract: The invention includes methods for aligning reads (e.g., nucleic acid reads, amino acid reads) to a reference sequence construct, methods for building the reference sequence construct, and systems that use the alignment methods and constructs to produce sequences. The invention also includes methods and systems for evaluating the quality of the alignment between the reads and the reference sequence construct. The method is scalable, and can be used to align millions of reads to a construct thousands of bases or amino acids long. The invention additionally includes methods for identifying a disease or a genotype based upon alignment of nucleic acid reads to a location in the construct.
-
Publication No.: US10793895B2
Publication Date: 2020-10-06
Application No.: US15007874
Filing Date: 2016-01-27
Inventors: Devin Locke, Wan-Ping Lee
IPC Classification: G01N33/48, G01N31/00, C12Q1/6806, C12Q1/6874, G16B30/00, C12Q1/6869
Abstract: The invention provides systems and methods for determining patterns of modification to a genome of a subject by representing the genome using a graph, such as a directed acyclic graph (DAG) with divergent paths for regions that are potentially subject to modification, profiling segments of the genome for evidence of epigenetic modification, and aligning the profiled segments to the DAG to determine locations and patterns of the epigenetic modification within the genome.
-
Publication No.: US20200232029A1
Publication Date: 2020-07-23
Application No.: US16798759
Filing Date: 2020-02-24
Inventors: Devin Locke, Piotr Szamel
IPC Classification: C12Q1/6874, C12Q1/6888
Abstract: The invention provides methods of analyzing an individual's mtDNA by transforming available reference sequences into a directed graph that compactly represents all the information without duplication and comparing sequence reads from the mtDNA to the graph to identify the individual or describe their mtDNA. A directed graph can represent all of the genetic variation found among the mitochondrial genomes across all of a number of reference organisms while providing a single article to which sequence reads can be aligned or compared. Thus any sequence read or other sequence fragment can be compared, in a single operation, to the article that represents all of the reference mitochondrial sequences.
-
Publication No.: US10678613B2
Publication Date: 2020-06-09
Application No.: US16176833
Filing Date: 2018-10-31
Inventors: Christian Frech, Raunaq Malhotra
Abstract: Some embodiments relate to systems for processing one or more computational workflows. In one embodiment, a description of a computational workflow comprises a plurality of applications, in which applications are represented as nodes and edges connecting the nodes indicate the flow of data elements between applications. A task execution module is configured to create and execute tasks. An application programming interface (API) is in communication with the task execution module and comprises a plurality of function calls for controlling at least one function of the task execution module. An API script includes instructions to the API to create and execute a plurality of tasks corresponding to the execution of the computational workflow for a plurality of samples. A graphical user interface (GUI) is in communication with the task execution module and configured to receive input from an end user to initiate execution of the API script.
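Illustrative note: the workflow-as-graph idea in this abstract can be sketched with Python's standard `graphlib` module (Python 3.9+). This is a minimal sketch under assumptions, not the patented system: the `WORKFLOW` contents, the application names, and the `create_tasks`/`execute` helpers are hypothetical stand-ins for the task execution module and the API script.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow description: each application (node) maps to the
# applications whose outputs it consumes (incoming edges).
WORKFLOW = {
    "align": set(),
    "sort": {"align"},
    "call_variants": {"sort"},
    "report": {"call_variants", "sort"},
}


def create_tasks(workflow, samples):
    """Create one task per sample, with applications in dependency order."""
    order = list(TopologicalSorter(workflow).static_order())
    return [{"sample": sample, "steps": order} for sample in samples]


def execute(task, log):
    """Stand-in for task execution: record each application as it would run."""
    for app in task["steps"]:
        log.append(f"{task['sample']}:{app}")


# Usage: the "script" creates and executes a task for every sample.
log = []
for task in create_tasks(WORKFLOW, ["sample_1", "sample_2"]):
    execute(task, log)
```

A topological order guarantees that every application runs only after the applications that feed it data.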