Neural network architectures for linking biological sequence variants based on molecular phenotype, and systems and methods therefor

    Publication number: US11183271B2

    Publication date: 2021-11-23

    Application number: US15841106

    Filing date: 2017-12-13

    Abstract: We describe a system and a method that ascertains the strengths of links between pairs of biological sequence variants, by determining numerical link distances that measure the similarity of the molecular phenotypes of the variants. The link distances may be used to associate knowledge about labeled variants to other variants and to prioritize the other variants for subsequent analysis or interpretation. The molecular phenotypes are determined using a neural network, called a molecular phenotype neural network, and may include numerical or descriptive attributes, such as those describing protein-DNA interactions, protein-RNA interactions, protein-protein interactions, splicing patterns, polyadenylation patterns, and microRNA-RNA interactions. Linked genetic variants may be used to ascertain pathogenicity in genetic testing, to identify drug targets, to identify patients that respond similarly to a drug, to ascertain health risks, or to connect patients that have similar molecular phenotypes.
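As a rough illustration of the link-distance idea in this abstract, the sketch below ranks labeled variants by how close their molecular phenotypes are to that of a query variant. The abstract does not say which distance is used, so Euclidean distance is an assumption here. The phenotype vectors are hard-coded stand-ins for the output of a molecular phenotype neural network, and the names `link_distance`, `rank_by_link`, `variant_A`, and `variant_B` are hypothetical.

```python
import math


def link_distance(phenotype_a, phenotype_b):
    """Euclidean distance between two phenotype vectors (assumed metric)."""
    if len(phenotype_a) != len(phenotype_b):
        raise ValueError("phenotype vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(phenotype_a, phenotype_b)))


def rank_by_link(query, labeled):
    """Return (name, distance) pairs, smallest distance (strongest link) first."""
    scored = [(name, link_distance(query, vec)) for name, vec in labeled.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical phenotype vectors; in practice these would be network outputs.
labeled_variants = {
    "variant_A": [0.9, 0.1, 0.4],
    "variant_B": [0.2, 0.8, 0.5],
}
query_variant = [0.85, 0.15, 0.35]

ranking = rank_by_link(query_variant, labeled_variants)
```

Under this sketch, knowledge attached to the top-ranked labeled variant would be the first candidate to associate with the query variant.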

    SYSTEM AND METHOD FOR TRAINING NEURAL NETWORKS

    Publication number: US20210133573A1

    Publication date: 2021-05-06

    Application number: US17111257

    Filing date: 2020-12-03

    Abstract: Systems and methods for training a neural network or an ensemble of neural networks are described. A hyper-parameter that controls the variance of the ensemble predictors is used to address overfitting. For larger values of the hyper-parameter, the predictions from the ensemble have more variance, so there is less overfitting. This technique can be applied to ensemble learning with various cost functions, structures and parameter sharing. A cost function is provided and a set of techniques for learning are described.
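One way to picture a hyper-parameter that controls the variance of ensemble predictors is a cost that subtracts a weighted spread term from the average member error. The sketch below is written in the spirit of negative correlation learning and is an assumption for illustration, not the cost function described in the patent. The names `ensemble_cost` and `variance_weight` are hypothetical.

```python
def ensemble_cost(member_predictions, target, variance_weight):
    """Average member squared error minus a weighted spread of the members.

    A larger variance_weight rewards members for disagreeing with the
    ensemble mean, which pushes the ensemble toward higher variance.
    """
    n = len(member_predictions)
    ensemble_mean = sum(member_predictions) / n
    mean_member_error = sum((p - target) ** 2 for p in member_predictions) / n
    spread = sum((p - ensemble_mean) ** 2 for p in member_predictions) / n
    return mean_member_error - variance_weight * spread


# Three hypothetical ensemble members predicting a target of 2.0.
predictions = [1.0, 2.0, 3.0]
cost_no_reward = ensemble_cost(predictions, 2.0, 0.0)
cost_full_reward = ensemble_cost(predictions, 2.0, 1.0)
```

In this sketch, setting `variance_weight` to 1 reduces the cost to the squared error of the ensemble mean, while setting it to 0 trains each member as if it stood alone.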

    System and method for training neural networks

    Publication number: US10885435B2

    Publication date: 2021-01-05

    Application number: US16541683

    Filing date: 2019-08-15

    Abstract: Systems and methods for training a neural network or an ensemble of neural networks are described. A hyper-parameter that controls the variance of the ensemble predictors is used to address overfitting. For larger values of the hyper-parameter, the predictions from the ensemble have more variance, so there is less overfitting. This technique can be applied to ensemble learning with various cost functions, structures and parameter sharing. A cost function is provided and a set of techniques for learning are described.

    NEURAL NETWORK ARCHITECTURES FOR LINKING BIOLOGICAL SEQUENCE VARIANTS BASED ON MOLECULAR PHENOTYPE, AND SYSTEMS AND METHODS THEREFOR

    Publication number: US20210407622A1

    Publication date: 2021-12-30

    Application number: US17378404

    Filing date: 2021-07-16

    Abstract: We describe a system and a method that ascertains the strengths of links between pairs of biological sequence variants, by determining numerical link distances that measure the similarity of the molecular phenotypes of the variants. The link distances may be used to associate knowledge about labeled variants to other variants and to prioritize the other variants for subsequent analysis or interpretation. The molecular phenotypes are determined using a neural network, called a molecular phenotype neural network, and may include numerical or descriptive attributes, such as those describing protein-DNA interactions, protein-RNA interactions, protein-protein interactions, splicing patterns, polyadenylation patterns, and microRNA-RNA interactions. Linked genetic variants may be used to ascertain pathogenicity in genetic testing, to identify drug targets, to identify patients that respond similarly to a drug, to ascertain health risks, or to connect patients that have similar molecular phenotypes.

    SYSTEMS AND METHODS FOR DETERMINING EFFECTS OF THERAPIES AND GENETIC VARIATION ON POLYADENYLATION SITE SELECTION

    Publication number: US20210241852A1

    Publication date: 2021-08-05

    Application number: US17162224

    Filing date: 2021-01-29

    Abstract: The present disclosure provides systems and methods for determining effects of genetic variants on selection of polyadenylation sites (PAS) during polyadenylation processes. In an aspect, the present disclosure provides a polyadenylation code, a computational model that can predict alternative polyadenylation patterns from transcript sequences. A score can be calculated that describes or corresponds to the strength of a PAS, or the efficiency with which it is recognized by the 3′-end processing machinery. The polyadenylation model may be used, for example, to assess the effects of anti-sense oligonucleotides to alter transcript abundance. As another example, the polyadenylation model may be used to scan the 3′-UTR of a human genome to find potential PAS.
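To make the idea of scanning a 3′-UTR for potential PAS concrete, the sketch below looks for the canonical polyadenylation signal hexamer AATAAA and its common variant ATTAAA in a DNA-alphabet sequence. This simple motif scan is a stand-in chosen for illustration and is not the learned polyadenylation model described in the abstract, which would assign a strength score to each site. The function name `find_candidate_pas` and the example sequence are hypothetical.

```python
# Canonical signal and its most common variant, written in the DNA alphabet.
PAS_HEXAMERS = ("AATAAA", "ATTAAA")


def find_candidate_pas(sequence):
    """Return (position, hexamer) pairs for each signal hexamer found.

    Positions are 0-based offsets of the first base of the hexamer.
    RNA input is accepted by mapping U to T before scanning.
    """
    normalized = sequence.upper().replace("U", "T")
    hits = []
    for start in range(len(normalized) - 5):
        window = normalized[start:start + 6]
        if window in PAS_HEXAMERS:
            hits.append((start, window))
    return hits


# Hypothetical 3'-UTR fragment containing one copy of each hexamer.
candidates = find_candidate_pas("GGAATAAACCATTAAAGG")
```

A model-based approach would replace the exact-match test with a predicted score per position and keep the positions whose score exceeds a chosen threshold.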
