-
Publication No.: US20190258776A1
Publication date: 2019-08-22
Application No.: US15900048
Filing date: 2018-02-20
Inventors: FILIPPO UTRO, ALDO GUZMAN SAENZ, CHAYA LEVOVITZ, LAXMI PARIDA
Abstract: A computer-implemented method includes generating, by a processor, a set of training data for each phenotype in a database including a set of subjects. The set of training data is generated by dividing genomic information of N subjects selected with or without repetition into windows, computing a distribution of genomic events in the windows for each of N subjects, and extracting, for each window, a tensor that represents the distribution of genomic events for each of N subjects. A set of test data is generated for each phenotype in the database, a distribution of genomic events in windows for each phenotype is computed, and a tensor is extracted for each window that represents a distribution of genomic events for each phenotype. The method includes classifying each phenotype of the test data with a classifier, and assigning a phenotype to a patient.
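Illustrative sketch: the windowing step described in this abstract can be pictured as counting events per genomic window for each subject and then collecting, for each window, the counts across subjects. The Python below is only a minimal sketch under stated assumptions and is not taken from the patent: events are modeled as integer positions, windows are fixed-size and non-overlapping, and the per-window "tensor" is reduced to a one-dimensional vector of counts. The names `window_counts` and `per_window_vectors` are hypothetical.

```python
from typing import List, Sequence


def window_counts(event_positions: Sequence[int],
                  genome_length: int,
                  window_size: int) -> List[int]:
    """Count events falling into each window [k*w, (k+1)*w).

    Assumption: an "event" is just an integer coordinate; positions
    outside [0, genome_length) are ignored.
    """
    n_windows = -(-genome_length // window_size)  # ceiling division
    counts = [0] * n_windows
    for pos in event_positions:
        if 0 <= pos < genome_length:
            counts[pos // window_size] += 1
    return counts


def per_window_vectors(subjects: Sequence[Sequence[int]],
                       genome_length: int,
                       window_size: int) -> List[List[int]]:
    """Return one vector per window holding each subject's event count."""
    rows = [window_counts(s, genome_length, window_size) for s in subjects]
    # Transpose the subjects-by-windows table into windows-by-subjects.
    return [list(col) for col in zip(*rows)]


# Three hypothetical subjects on a toy genome of length 30, windows of 10.
subjects = [[1, 2, 15, 27], [3, 11, 12, 29], [25, 26]]
vectors = per_window_vectors(subjects, genome_length=30, window_size=10)
```

Each entry of `vectors` could then serve as the feature vector of one window for a downstream classifier; which classifier the method uses is not stated in the abstract.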
-
Publication No.: US20200251182A1
Publication date: 2020-08-06
Application No.: US16266733
Filing date: 2019-02-04
IPC classification: G16B40/00, G06N20/00, G16B20/00, C12Q1/6883, G16B30/10
Abstract: Embodiments of the present invention are directed to methods for adapting machine learning, redescription, and computational homology techniques to the identification of pathogenic pathways. A non-limiting example of the computer-implemented method includes receiving genetic and biological data and generating a data matrix based on the data. The data matrix can include one or more features, and each feature can be associated with a known feature value. A collection of sets of features representing pathways, genes, or a genetic combination of genotype values can be determined. The method also includes determining a first prediction for a feature value of a selected feature to be predicted in the collection, permuting one or more rows of the data matrix, and recalculating a second prediction for the feature value based on the permutation. A prediction score can be determined based on the first prediction, the second prediction, and a known feature value.
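Illustrative sketch: the predict, permute, re-predict sequence in this abstract resembles a permutation test of how much a feature contributes to a prediction. The Python below is a minimal sketch under stated assumptions, not the patented method: the predictor is a leave-one-out nearest-neighbor rule on a single feature column, the permutation reorders that column relative to the known values, and the score is the increase in mean absolute error after permutation. The abstract does not specify the predictor or the scoring formula, so all three choices, and the names `loo_nearest_neighbor`, `mean_abs_error`, and `prediction_score`, are hypothetical.

```python
from typing import List, Sequence


def loo_nearest_neighbor(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Predict y[i] from the sample whose x is closest to x[i], excluding i."""
    preds = []
    for i in range(len(x)):
        others = [j for j in range(len(x)) if j != i]
        nearest = min(others, key=lambda j: abs(x[j] - x[i]))
        preds.append(y[nearest])
    return preds


def mean_abs_error(pred: Sequence[float], known: Sequence[float]) -> float:
    """Average absolute difference between predictions and known values."""
    return sum(abs(p - k) for p, k in zip(pred, known)) / len(known)


def prediction_score(x: Sequence[float],
                     y: Sequence[float],
                     permutation: Sequence[int]) -> float:
    """Assumed score: error after permuting x minus error before.

    A score near zero suggests x carried little information about y;
    a large positive score suggests the prediction depended on x.
    """
    first = loo_nearest_neighbor(x, y)          # first prediction
    x_perm = [x[i] for i in permutation]        # permute rows of x
    second = loo_nearest_neighbor(x_perm, y)    # second prediction
    return mean_abs_error(second, y) - mean_abs_error(first, y)


# Toy data: x separates the two known values of y cleanly.
x = [1.0, 2.0, 10.0, 11.0]
y = [0.0, 0.0, 1.0, 1.0]
score = prediction_score(x, y, permutation=[0, 2, 1, 3])
```

In practice the permutation would usually be drawn at random and repeated many times to estimate a distribution of scores; a fixed permutation is used here only to keep the example deterministic.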
-