-
Publication No.: US20230245305A1
Publication Date: 2023-08-03
Application No.: US18160855
Filing Date: 2023-01-27
Inventors: Tobias Hamp, Hong Gao, Kai-How Farh
CPC Classification: G06T7/0012, G06T7/70, G06T17/00, G06T2207/20084
Abstract: Described herein are technologies for classifying a protein structure (such as technologies for classifying the pathogenicity of a protein structure related to a nucleotide variant). Such a classification is based on two-dimensional images taken from a three-dimensional image of the protein structure. With respect to some implementations, described herein are multi-view convolutional neural networks (CNNs) for classifying a protein structure based on inputs of two-dimensional images taken from a three-dimensional image of the protein structure. In some implementations, a computer-implemented method of determining pathogenicity of variants includes accessing a structural rendition of amino acids, capturing images of those parts of the structural rendition that contain a target amino acid from the amino acids, and, based on the images, determining pathogenicity of a nucleotide variant that mutates the target amino acid into an alternate amino acid.
-
Publication No.: US11538555B1
Publication Date: 2022-12-27
Application No.: US17533091
Filing Date: 2021-11-22
Inventors: Tobias Hamp, Hong Gao, Kai-How Farh
Abstract: The technology disclosed relates to determining pathogenicity of nucleotide variants. In particular, the technology disclosed relates to specifying a particular amino acid at a particular position in a protein as a gap amino acid, and specifying remaining amino acids at remaining positions in the protein as non-gap amino acids. The technology disclosed further relates to generating a gapped spatial representation of the protein that includes spatial configurations of the non-gap amino acids, and excludes a spatial configuration of the gap amino acid, and determining a pathogenicity of a nucleotide variant based at least in part on the gapped spatial representation, and a representation of an alternate amino acid created by the nucleotide variant at the particular position.
-
Publication No.: US11515010B2
Publication Date: 2022-11-29
Application No.: US17468411
Filing Date: 2021-09-07
Inventors: Tobias Hamp, Hong Gao, Kai-How Farh
Abstract: The technology disclosed relates to determining pathogenicity of variants. In particular, the technology disclosed relates to generating amino acid-wise distance channels for a plurality of amino acids in a protein. Each of the amino acid-wise distance channels has voxel-wise distance values for voxels in a plurality of voxels. A tensor includes the amino acid-wise distance channels and at least an alternative allele of the protein expressed by a variant. A deep convolutional neural network determines a pathogenicity of the variant based at least in part on processing the tensor. The technology disclosed further augments the tensor with supplemental information like a reference allele of the protein, evolutionary conservation data about the protein, annotation data about the protein, and structure confidence data about the protein.
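One way to picture the amino acid-wise distance channels is the following minimal Python sketch. It is not taken from the patent: the function name `distance_channels`, the grid parameters, and the choice of "distance from each voxel centre to the nearest atom of each amino acid type" are assumptions made purely for illustration.

```python
import numpy as np

def distance_channels(coords, residue_types, center,
                      grid_size=3, voxel_width=2.0, n_types=20):
    """Return an array of shape (n_types, grid_size, grid_size, grid_size).

    Channel t holds, for every voxel, the distance from the voxel centre to
    the nearest atom whose amino acid type is t (np.inf if no such atom).
    All parameter names and defaults are hypothetical.
    """
    coords = np.asarray(coords, dtype=float)
    residue_types = np.asarray(residue_types)
    # Voxel-centre offsets along one axis, symmetric around the grid centre.
    offsets = (np.arange(grid_size) - (grid_size - 1) / 2) * voxel_width
    gx, gy, gz = np.meshgrid(offsets, offsets, offsets, indexing="ij")
    voxels = np.stack([gx, gy, gz], axis=-1) + np.asarray(center, dtype=float)

    channels = np.full((n_types, grid_size, grid_size, grid_size), np.inf)
    for t in range(n_types):
        pts = coords[residue_types == t]
        if len(pts) == 0:
            continue  # no atom of this type: channel stays at infinity
        # Pairwise distances between every voxel centre and every atom of type t.
        d = np.linalg.norm(voxels[..., None, :] - pts, axis=-1)
        channels[t] = d.min(axis=-1)
    return channels
```

In a full pipeline the resulting tensor would be concatenated with further channels (for example an encoding of the alternative allele) before being passed to a 3D convolutional network; that step is omitted here.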
-
Publication No.: US20240120024A1
Publication Date: 2024-04-11
Application No.: US18483313
Filing Date: 2023-10-09
Applicant: Illumina Software, Inc., Illumina, Inc., Illumina Australia Pty Ltd, Illumina Netherlands BV, Illumina France SARL, Illumina Cambridge Limited
Inventors: Yair Field, Jacob Christopher Ulirsch, Cinzia Malangone, Miguel Madrid-Mencia, Geoffrey Nilsen, Pam Tang Cheng, Ileena Mitra, Petko Plamenov Fiziev, Sabrina Rashid, Anthonius Petrus Nicolaas de Boer, Pierrick Wainschtein, Vlad Mihai Sima, Francois Aguet, Kai-How Farh
Abstract: Genome-wide association studies may allow for detection of variants that are statistically significantly associated with disease risk. However, inferring which genes underlie these variant associations may be difficult. The presently disclosed approaches utilize machine learning techniques to predict genes from genome-wide association study summary statistics, substantially improving causal gene identification in terms of both precision and recall compared to other techniques.
-
Publication No.: US11705219B2
Publication Date: 2023-07-18
Application No.: US16247487
Filing Date: 2019-01-14
IPC Classification: G16B40/20, G16B20/20, G06F18/214, G06F18/2431, G06N3/045, G16B40/00, G16B20/00, G06F9/38, G06N3/04, G06N3/084
CPC Classification: G16B40/20, G06F9/3877, G06F18/2148, G06F18/2431, G06N3/04, G06N3/045, G06N3/084, G16B20/00, G16B20/20, G16B40/00
Abstract: The technology disclosed directly operates on sequencing data and derives its own feature filters. It processes a plurality of aligned reads that span a target base position. It combines elegant encoding of the reads with a lightweight analysis to produce good recall and precision using lightweight hardware. For instance, one million training examples of target base variant sites with 50 to 100 reads each can be trained on a single GPU card in less than 10 hours with good recall and precision. A single GPU card is desirable because a computer with a single GPU is inexpensive and almost universally within reach for users looking at genetic data. It is readily available on cloud-based platforms.
-
Publication No.: US20240242075A1
Publication Date: 2024-07-18
Application No.: US18513367
Filing Date: 2023-11-17
Applicant: Illumina, Inc.
Abstract: We disclose computational models that alleviate the effects of human ascertainment biases in curated pathogenic non-coding variant databases by generating pathogenicity scores for variants occurring in the promoter regions (referred to herein as promoter single nucleotide variants (pSNVs)). We train deep learning networks (referred to herein as pathogenicity classifiers) using a semi-supervised approach to discriminate between a set of labeled benign variants and an unlabeled set of variants that were matched to remove biases.
-
Publication No.: US11244246B2
Publication Date: 2022-02-08
Application No.: US16799071
Filing Date: 2020-02-24
Applicant: ILLUMINA, INC.
Inventors: Kai-How Farh, Donavan Cheng, John Shon, Jorg Hakenberg, Eugene Bolotin, James Casey Geaney, Hong Gao, Pam Cheng, Inderjit Singh, Daniel Roche, Milan Karangutkar
IPC Classification: G06F7/00, G06N20/00, G06F16/248, G06F16/2458, G06F16/25, G16H70/60, G06N5/02, G06Q50/00, G06N5/04, H04L29/08, G16H80/00, G16H10/60, H04L29/06
Abstract: Systems, computer-implemented methods, and non-transitory computer readable media are provided for sharing medical data. The disclosed systems may be configured to create a first workgroup having a first knowledgebase. This first knowledgebase may be federated with a common knowledgebase, and with a second knowledgebase of a second workgroup. At least one of the first knowledgebase, common knowledgebase, and second knowledgebase may be configured to store data items comprising associations, signs, and evidence. The signs may comprise measurements and contexts, and the associations may describe the relationships between the measurements and contexts. The evidence may support these associations. The disclosed systems may be configured to receive a request from a user in the first workgroup, retrieve matching data items, and optionally then output to the user at least some of the retrieved matching data items. The request may comprise at least one of a first association and a first measurement.
-
Publication No.: US10558915B2
Publication Date: 2020-02-11
Application No.: US16413476
Filing Date: 2019-05-15
Applicant: Illumina, Inc.
Abstract: The technology disclosed relates to constructing a convolutional neural network-based classifier for variant classification. In particular, it relates to training a convolutional neural network-based classifier on training data using a backpropagation-based gradient update technique that progressively matches outputs of the convolutional neural network-based classifier with corresponding ground truth labels. The convolutional neural network-based classifier comprises groups of residual blocks. Each group of residual blocks is parameterized by a number of convolution filters in the residual blocks, a convolution window size of the residual blocks, and an atrous convolution rate of the residual blocks. The convolution window size and the atrous convolution rate both vary between groups of residual blocks. The training data includes benign training examples and pathogenic training examples of translated sequence pairs generated from benign variants and pathogenic variants.
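As a rough illustration of a residual block built around an atrous (dilated) convolution, consider the single-channel NumPy sketch below. It is a deliberately simplified assumption, not the patented architecture: the function names are hypothetical, there is one filter rather than many, and normalization layers and learned weights are left out.

```python
import numpy as np

def atrous_conv1d(x, kernel, rate):
    """Zero-padded 1D convolution (cross-correlation, as in deep learning
    libraries) whose kernel taps are spaced `rate` positions apart.
    The output has the same length as the input."""
    k = len(kernel)
    total = rate * (k - 1)          # total padding needed for 'same' output
    left = total // 2
    xp = np.pad(x, (left, total - left))
    return np.array([
        sum(kernel[j] * xp[i + j * rate] for j in range(k))
        for i in range(len(x))
    ])

def residual_block(x, kernel, rate):
    """Skip connection around a ReLU followed by an atrous convolution."""
    return x + atrous_conv1d(np.maximum(x, 0.0), kernel, rate)

def run_groups(x, groups):
    """Apply groups of residual blocks; each group is described by a
    hypothetical tuple (kernel, rate, n_blocks), so window size and
    atrous rate can differ from one group to the next."""
    for kernel, rate, n_blocks in groups:
        for _ in range(n_blocks):
            x = residual_block(x, kernel, rate)
    return x
```

For example, `residual_block(np.array([1., 2., 3., 4., 5.]), [1., 0., 1.], 2)` returns `[4., 6., 9., 6., 8.]`: with rate 2 the two non-zero taps sit four positions apart, so each output adds the inputs two steps to the left and right (zero beyond the edges) onto the skip connection.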
-
Publication No.: USD829738S1
Publication Date: 2018-10-02
Application No.: US29575141
Filing Date: 2016-08-22
Applicant: Illumina, Inc.
Designers: Kai-How Farh, Donavan Cheng, Andrew Warren, Ian D. Patrick
-
Publication No.: US20240167020A1
Publication Date: 2024-05-23
Application No.: US18549343
Filing Date: 2022-03-08
Applicant: Illumina, Inc.
Inventors: Hongxia Xu, Tong Liu, Shi Min Xiao, Dan Cao, Victor Quijano, Kai-How Farh, Mohan Sun
IPC Classification: C12N15/10
CPC Classification: C12N15/1082, C12N15/1062, C12N15/1065
Abstract: Analyzing expression of protein-coding variants in cells is provided herein. A method may include replacing a protein-coding region of the DNA in a cell with a donor vector including a variant of the protein-coding region and a first barcode identifying that variant. The cell may generate mRNA including an expression of the variant and an expression of the first barcode. A second barcode corresponding to the cell may be coupled to the mRNA. The mRNA, having the second barcode coupled thereto, may be reverse transcribed into complementary DNA (cDNA). The cDNA may be sequenced. The donor vector or cDNA may be sequenced using amplicon sequencing. The donor vector sequence and the cDNA sequence may be correlated to identify the variant and the cell's expression of the variant.
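The final correlation step can be thought of as a join on the first (variant) barcode. The Python sketch below is an illustrative assumption only: the function name, the input shapes, and the use of read counts as a stand-in for expression are all hypothetical and not taken from the patent.

```python
from collections import Counter

def correlate_barcodes(barcode_to_variant, cdna_reads):
    """Count reads per (cell barcode, variant) pair.

    barcode_to_variant: dict mapping a variant barcode to its variant,
        as might be derived from sequencing the donor vectors.
    cdna_reads: iterable of (cell_barcode, variant_barcode) pairs,
        as might be parsed from the cDNA sequencing reads.
    """
    counts = Counter()
    for cell_barcode, variant_barcode in cdna_reads:
        variant = barcode_to_variant.get(variant_barcode)
        if variant is not None:  # skip reads with an unknown variant barcode
            counts[(cell_barcode, variant)] += 1
    return counts
```

Here the read count serves as a crude proxy for how strongly a given cell expresses a given variant; a real analysis would also need barcode error correction and normalization, which are not shown.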