-
Publication number: US11681917B2
Publication date: 2023-06-20
Application number: US17111257
Application date: 2020-12-03
Applicant: Deep Genomics Incorporated
Inventor: Hui Yuan Xiong , Andrew Delong , Brendan Frey
Abstract: Systems and methods for training a neural network or an ensemble of neural networks are described. A hyper-parameter that controls the variance of the ensemble predictors is used to address overfitting. For larger values of the hyper-parameter, the predictions from the ensemble have more variance, so there is less overfitting. This technique can be applied to ensemble learning with various cost functions, structures and parameter sharing. A cost function is provided and a set of techniques for learning are described.
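Illustrative note: the abstract does not state the cost function, so the following is only a minimal sketch of one well-known form with the described behavior, not the patented method. It assumes squared error, and the names `ensemble_cost` and `lam` are hypothetical. The cost is the mean member error minus `lam` times the spread of the member predictions around the ensemble mean, so a larger `lam` rewards more variance among the predictors.

```python
from statistics import fmean


def ensemble_cost(predictions, target, lam):
    """Hypothetical cost: mean member squared error minus lam * prediction spread."""
    mean_pred = fmean(predictions)
    # Average squared error of the individual ensemble members.
    member_error = fmean((p - target) ** 2 for p in predictions)
    # Variance of the member predictions around the ensemble mean.
    spread = fmean((p - mean_pred) ** 2 for p in predictions)
    return member_error - lam * spread


preds = [0.2, 0.6, 1.0]
cost_independent = ensemble_cost(preds, 0.5, 0.0)  # members scored separately
cost_ensemble = ensemble_cost(preds, 0.5, 1.0)     # equals the error of the mean prediction
```

Under this assumed form, `lam = 0` scores each member on its own, while `lam = 1` reduces to the squared error of the averaged prediction.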
-
Publication number: US11183271B2
Publication date: 2021-11-23
Application number: US15841106
Application date: 2017-12-13
Applicant: Deep Genomics Incorporated
Inventor: Brendan Frey , Andrew Delong
Abstract: We describe a system and a method that ascertains the strengths of links between pairs of biological sequence variants, by determining numerical link distances that measure the similarity of the molecular phenotypes of the variants. The link distances may be used to associate knowledge about labeled variants to other variants and to prioritize the other variants for subsequent analysis or interpretation. The molecular phenotypes are determined using a neural network, called a molecular phenotype neural network, and may include numerical or descriptive attributes, such as those describing protein-DNA interactions, protein-RNA interactions, protein-protein interactions, splicing patterns, polyadenylation patterns, and microRNA-RNA interactions. Linked genetic variants may be used to ascertain pathogenicity in genetic testing, to identify drug targets, to identify patients that respond similarly to a drug, to ascertain health risks, or to connect patients that have similar molecular phenotypes.
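Illustrative note: the idea of a link distance can be sketched as a distance between molecular phenotype vectors. The abstract does not say which distance is used, so Euclidean distance is an assumption here, and the function names, variant names, and numbers are all hypothetical. In the patent the phenotype vectors come from a neural network; in this sketch they are supplied directly.

```python
import math


def link_distance(phenotype_a, phenotype_b):
    """Assumed link distance: Euclidean distance between two phenotype vectors."""
    return math.dist(phenotype_a, phenotype_b)


def rank_by_link(labeled_phenotype, candidates):
    """Order candidate variant names from most to least similar to a labeled variant."""
    return sorted(
        candidates,
        key=lambda name: link_distance(labeled_phenotype, candidates[name]),
    )


# Made-up phenotype vectors for one labeled variant and two unlabeled ones.
labeled = [0.9, 0.1]
unlabeled = {"var_a": [0.8, 0.2], "var_b": [0.1, 0.9]}
ranking = rank_by_link(labeled, unlabeled)
```

Variants at the front of `ranking` would be the first candidates to inherit knowledge from the labeled variant or to be prioritized for interpretation.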
-
Publication number: US20190252041A1
Publication date: 2019-08-15
Application number: US16197146
Application date: 2018-11-20
Applicant: Deep Genomics Incorporated
Inventor: Brendan Frey , Michael K.K. Leung , Andrew Thomas Delong , Hui Yuan Xiong , Babak Alipanahi , Leo J. Lee , Hannes Bretschneider
Abstract: Described herein are systems and methods that receive as input a DNA or RNA sequence, extract features, and apply layers of processing units to compute one or more condition-specific cell variables, corresponding to cellular quantities measured under different conditions. The system may be applied to a sequence containing a genetic variant, and also to a corresponding reference sequence to determine how much the condition-specific cell variables change because of the variant. The change in the condition-specific cell variables is used to compute a score for how deleterious a variant is, to classify a variant's level of deleteriousness, to prioritize variants for subsequent processing, and to compare a test variant to variants of known deleteriousness. By modifying the variant or the extracted features so as to incorporate the effects of DNA editing, oligonucleotide therapy, DNA- or RNA-binding protein therapy or other therapies, the system may be used to determine if the deleterious effects of the original variant can be reduced.
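Illustrative note: the reference-versus-variant comparison can be sketched as follows. This is a minimal sketch, not the patented scoring rule: it assumes the model's condition-specific outputs are already available as dictionaries keyed by condition, and it assumes the score is the largest absolute change. The function name, condition names, and values are hypothetical.

```python
def variant_score(reference_values, variant_values):
    """Return per-condition changes and an assumed score (largest absolute change)."""
    # Change in each condition-specific cell variable caused by the variant.
    deltas = {
        condition: variant_values[condition] - reference_values[condition]
        for condition in reference_values
    }
    score = max(abs(delta) for delta in deltas.values())
    return deltas, score


# Made-up model outputs for a reference sequence and its variant.
reference = {"brain": 0.80, "liver": 0.75}
variant = {"brain": 0.30, "liver": 0.70}
deltas, score = variant_score(reference, variant)
```

Applying the same comparison to a variant modified to reflect a therapy would indicate whether the change, and hence the score, shrinks.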
-
Publication number: US20220336049A1
Publication date: 2022-10-20
Application number: US17706227
Application date: 2022-03-28
Applicant: Deep Genomics Incorporated
Inventor: Brendan Frey , Michael Ka Kit Leung
IPC: G16B25/00 , G06N20/00 , G16B20/20 , G16B40/00 , G16B30/00 , C12N15/113 , C12Q1/6869 , G06N3/00
Abstract: The present disclosure provides systems and methods for determining effects of genetic variants on selection of polyadenylation sites (PAS) during polyadenylation processes. In an aspect, the present disclosure provides a polyadenylation code, a computational model that can predict alternative polyadenylation patterns from transcript sequences. A score can be calculated that describes or corresponds to the strength of a PAS, or the efficiency with which it is recognized by the 3′-end processing machinery. The polyadenylation model may be used, for example, to assess the effects of anti-sense oligonucleotides to alter transcript abundance. As another example, the polyadenylation model may be used to scan the 3′-UTR of a human genome to find potential PAS.
-
Publication number: US20210133573A1
Publication date: 2021-05-06
Application number: US17111257
Application date: 2020-12-03
Applicant: Deep Genomics Incorporated
Inventor: Hui Yuan Xiong , Andrew Delong , Brendan Frey
Abstract: Systems and methods for training a neural network or an ensemble of neural networks are described. A hyper-parameter that controls the variance of the ensemble predictors is used to address overfitting. For larger values of the hyper-parameter, the predictions from the ensemble have more variance, so there is less overfitting. This technique can be applied to ensemble learning with various cost functions, structures and parameter sharing. A cost function is provided and a set of techniques for learning are described.
-
Publication number: US10885435B2
Publication date: 2021-01-05
Application number: US16541683
Application date: 2019-08-15
Applicant: Deep Genomics Incorporated
Inventor: Hui Yuan Xiong , Andrew Delong , Brendan Frey
Abstract: Systems and methods for training a neural network or an ensemble of neural networks are described. A hyper-parameter that controls the variance of the ensemble predictors is used to address overfitting. For larger values of the hyper-parameter, the predictions from the ensemble have more variance, so there is less overfitting. This technique can be applied to ensemble learning with various cost functions, structures and parameter sharing. A cost function is provided and a set of techniques for learning are described.
-
Publication number: US10185803B2
Publication date: 2019-01-22
Application number: US14739432
Application date: 2015-06-15
Applicant: DEEP GENOMICS INCORPORATED
Inventor: Brendan Frey , Michael K. K. Leung , Andrew Thomas Delong , Hui Yuan Xiong , Babak Alipanahi , Leo J. Lee , Hannes Bretschneider
Abstract: Described herein are systems and methods that receive as input a DNA or RNA sequence, extract features, and apply layers of processing units to compute one or more condition-specific cell variables, corresponding to cellular quantities measured under different conditions. The system may be applied to a sequence containing a genetic variant, and also to a corresponding reference sequence to determine how much the condition-specific cell variables change because of the variant. The change in the condition-specific cell variables is used to compute a score for how deleterious a variant is, to classify a variant's level of deleteriousness, to prioritize variants for subsequent processing, and to compare a test variant to variants of known deleteriousness. By modifying the variant or the extracted features so as to incorporate the effects of DNA editing, oligonucleotide therapy, DNA- or RNA-binding protein therapy or other therapies, the system may be used to determine if the deleterious effects of the original variant can be reduced.
-
Publication number: US20210407622A1
Publication date: 2021-12-30
Application number: US17378404
Application date: 2021-07-16
Applicant: Deep Genomics Incorporated
Inventor: Brendan Frey , Andrew DeLong
Abstract: We describe a system and a method that ascertains the strengths of links between pairs of biological sequence variants, by determining numerical link distances that measure the similarity of the molecular phenotypes of the variants. The link distances may be used to associate knowledge about labeled variants to other variants and to prioritize the other variants for subsequent analysis or interpretation. The molecular phenotypes are determined using a neural network, called a molecular phenotype neural network, and may include numerical or descriptive attributes, such as those describing protein-DNA interactions, protein-RNA interactions, protein-protein interactions, splicing patterns, polyadenylation patterns, and microRNA-RNA interactions. Linked genetic variants may be used to ascertain pathogenicity in genetic testing, to identify drug targets, to identify patients that respond similarly to a drug, to ascertain health risks, or to connect patients that have similar molecular phenotypes.
-
Publication number: US20210241852A1
Publication date: 2021-08-05
Application number: US17162224
Application date: 2021-01-29
Applicant: Deep Genomics Incorporated
Inventor: Brendan Frey , Michael Ka Kit Leung
IPC: G16B25/00 , C12N15/113 , G16B40/00
Abstract: The present disclosure provides systems and methods for determining effects of genetic variants on selection of polyadenylation sites (PAS) during polyadenylation processes. In an aspect, the present disclosure provides a polyadenylation code, a computational model that can predict alternative polyadenylation patterns from transcript sequences. A score can be calculated that describes or corresponds to the strength of a PAS, or the efficiency with which it is recognized by the 3′-end processing machinery. The polyadenylation model may be used, for example, to assess the effects of anti-sense oligonucleotides to alter transcript abundance. As another example, the polyadenylation model may be used to scan the 3′-UTR of a human genome to find potential PAS.
-
Publication number: US20190220740A1
Publication date: 2019-07-18
Application number: US16230149
Application date: 2018-12-21
Applicant: Deep Genomics Incorporated
Inventor: Hui Yuan Xiong , Brendan Frey
IPC: G06N3/08 , G16H10/40 , G16H50/70 , G16H50/30 , G16H50/20 , G16B5/00 , G16B20/40 , G16B20/20 , G16B40/30 , G16B50/20
CPC classification number: G06N3/08 , G06N3/0445 , G06N3/0454 , G06N3/082 , G06N3/084 , G16B5/00 , G16B20/00 , G16B20/20 , G16B20/40 , G16B40/00 , G16B40/30 , G16B50/20 , G16H10/40 , G16H50/20 , G16H50/30 , G16H50/70
Abstract: We describe systems and methods for generating and training convolutional neural networks using biological sequences and relevance scores derived from structural, biochemical, population and evolutionary data. The convolutional neural networks take as input biological sequences and additional information and output molecular phenotypes. Biological sequences may include DNA, RNA and protein sequences. Molecular phenotypes may include protein-DNA interactions, protein-RNA interactions, protein-protein interactions, splicing patterns, polyadenylation patterns, and microRNA-RNA interactions, which may be described using numerical, categorical or ordinal attributes. Intermediate layers of the convolutional neural networks are weighted using relevance score sequences, for example, conservation tracks. The resulting molecular phenotype convolutional neural networks may be used in genetic testing, to identify drug targets, to identify patients that respond similarly to a drug, to ascertain health risks, or to connect patients that have similar molecular phenotypes.
-