-
Publication No.: US20210125690A1
Publication date: 2021-04-29
Application No.: US17026353
Filing date: 2020-09-21
Inventors: Thomas Joseph, Aditya Ramkrishna Rao, Saipradeep Vangala Govindakrishnan, Naveen Sivadasan, Uma Sunderam, Sujatha Kotte, Rajgopal Srinivasan
Abstract: Diagnosis of rare human diseases using DNA sequencing is a fast-growing area of research. Conventional methods carry a risk of incorrect phenotype interpretation. Moreover, obtaining a correct genotype and phenotype matching is challenging. A system for matching phenotype descriptions and pathogenic variants provides a one-to-one mapping of the phenotypes and genotypes of a plurality of subjects under test. Initially, a plurality of phenotypes and a plurality of genome sequences are segmented based on metadata. A phenotype-driven gene prioritization and a variant prioritization are applied to the segmented data. A similarity score is calculated between the phenotype-driven gene prioritization output and the variant prioritization output. The similarity score is further utilized to obtain a one-to-one matching of the plurality of phenotypes and the plurality of genotype sequences of the plurality of subjects under test.
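The scoring and matching steps in this abstract can be illustrated with a minimal sketch. The abstract does not name the similarity measure or the matching algorithm, so the choices here are assumptions: Jaccard overlap between the two prioritized gene lists serves as the similarity score, and an exhaustive search over permutations (practical only for a handful of subjects) picks the assignment with the highest total score. The names `jaccard` and `match_one_to_one` are hypothetical.

```python
from itertools import permutations


def jaccard(a, b):
    """Assumed similarity score: overlap of two gene collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_one_to_one(pheno_genes, geno_genes):
    """Pair every phenotype with exactly one genotype.

    pheno_genes: dict phenotype_id -> genes from phenotype-driven prioritization
    geno_genes:  dict genotype_id  -> genes from variant prioritization
    Returns the dict phenotype_id -> genotype_id with the highest total score.
    """
    p_ids = sorted(pheno_genes)
    g_ids = sorted(geno_genes)
    if len(p_ids) != len(g_ids):
        raise ValueError("one-to-one matching needs equally many of each")
    best_total, best = -1.0, None
    # Brute force over all assignments; fine for illustration, O(n!) in general.
    for perm in permutations(g_ids):
        total = sum(jaccard(pheno_genes[p], geno_genes[g])
                    for p, g in zip(p_ids, perm))
        if total > best_total:
            best_total, best = total, dict(zip(p_ids, perm))
    return best
```

For larger cohorts, a polynomial-time assignment solver would replace the permutation loop; the input and output shapes would stay the same.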
-
Publication No.: US20220392565A1
Publication date: 2022-12-08
Application No.: US17737106
Filing date: 2022-05-05
Abstract: This disclosure relates generally to a method and a system for profiling of metagenome samples. Most state-of-the-art techniques for metagenomic profiling use a homology-based, curated database of identified marker sequences generated after complex and costly pre-processing. The disclosed method and system for profiling of metagenome samples are non-homology-based, non-marker-based, alignment-free strain-level profiling tools for microbe profiling. The disclosure works with several k-mer-based indexing techniques for constructing a compact and comprehensive multi-level index, wherein the multi-level index includes an L1-Index and an L2-Index. The multi-level index is used for profiling metagenomes by abundance estimation, wherein the abundance estimation includes a relative abundance and an absolute abundance.
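As a rough illustration of alignment-free, k-mer-based profiling with a two-level index, the sketch below builds two lookup tables and estimates relative abundance from reads. The abstract does not define what the L1-Index and L2-Index contain, so the split used here (first level: k-mer to species; second level: species and k-mer to strains) is purely an assumption, as are the function names and the voting rule. Absolute abundance is not covered.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(references, k):
    """references: dict (species, strain) -> genome sequence.

    Returns two hypothetical levels:
      l1: k-mer -> set of species containing it
      l2: (species, k-mer) -> set of strains of that species containing it
    """
    l1, l2 = defaultdict(set), defaultdict(set)
    for (species, strain), seq in references.items():
        for km in kmers(seq, k):
            l1[km].add(species)
            l2[(species, km)].add(strain)
    return l1, l2


def relative_abundance(reads, l1, l2, k):
    """Assign each read to the strain(s) sharing the most k-mers with it,
    then normalize the per-strain read counts so they sum to 1."""
    counts = Counter()
    for read in reads:
        votes = Counter()
        for km in kmers(read, k):
            for species in l1.get(km, ()):
                for strain in l2[(species, km)]:
                    votes[(species, strain)] += 1
        if votes:
            top = max(votes.values())
            winners = [s for s, v in votes.items() if v == top]
            for w in winners:  # split ties evenly
                counts[w] += 1 / len(winners)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()} if total else {}
```

Reads with no indexed k-mer are ignored, so the fractions describe only the classified portion of the sample.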
-
Publication No.: US11915792B2
Publication date: 2024-02-27
Application No.: US17737106
Filing date: 2022-05-05
CPC classification: G16B10/00, G06F16/2228, G16B30/10, G16B40/20
Abstract: This disclosure relates generally to a method and a system for profiling of metagenome samples. Most state-of-the-art techniques for metagenomic profiling use a homology-based, curated database of identified marker sequences generated after complex and costly pre-processing. The disclosed method and system for profiling of metagenome samples are non-homology-based, non-marker-based, alignment-free strain-level profiling tools for microbe profiling. The disclosure works with several k-mer-based indexing techniques for constructing a compact and comprehensive multi-level index, wherein the multi-level index includes an L1-Index and an L2-Index. The multi-level index is used for profiling metagenomes by abundance estimation, wherein the abundance estimation includes a relative abundance and an absolute abundance.
-
Publication No.: US11348693B2
Publication date: 2022-05-31
Application No.: US16378265
Filing date: 2019-04-08
Inventors: Thomas Joseph, Aditya Rao, Naveen Sivadasan, Saipradeep Govindakrishnan Vangala, Sujatha Kotte, Rajgopal Srinivasan
IPC classification: G16H70/60, G16H10/20, G06F16/28, G06F16/90, G16B45/00, G16B35/10, G16B50/10, G06N5/04, G06F16/901
Abstract: This disclosure relates generally to a method and system for graph convolution-based gene prioritization on heterogeneous networks. The method includes obtaining a set of entities for human rare diseases from one or more sources containing rare diseases, genes, phenotypes for rare diseases, and biological pathways; constructing an initial heterogeneous network using each entity from the set of entities; applying Graph Convolution-based Association Scoring (GCAS) to the initial heterogeneous network to derive inferred associations; creating a Heterogeneous Association Network for Rare Diseases (HANRD) by adding the inferred associations to the initial heterogeneous network; and generating a prioritized set of genes for an input query received in the HANRD.
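The general idea of scoring genes by propagating a query over a heterogeneous network can be sketched as follows. This is not GCAS as claimed in the patent, whose details the abstract does not give; it is a generic, assumed stand-in that repeatedly applies the symmetrically normalized adjacency (with self-loops) commonly used in graph convolution, starting from the query nodes. Node labels, function names, and the number of steps are all hypothetical.

```python
from math import sqrt


def propagate(edges, seeds, steps=2):
    """Spread seed scores over an undirected network.

    edges: iterable of (node, node) pairs; nodes may be diseases, genes,
           phenotypes, or pathways (any hashable, sortable label).
    seeds: nodes that make up the query; they start with score 1.0.
    Each step computes score[n] = sum over neighbors m (including n itself)
    of score[m] / sqrt(deg[n] * deg[m]).
    """
    nodes = sorted({n for e in edges for n in e} | set(seeds))
    neighbors = {n: {n} for n in nodes}  # self-loop on every node
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    deg = {n: len(neighbors[n]) for n in nodes}
    score = {n: (1.0 if n in seeds else 0.0) for n in nodes}
    for _ in range(steps):
        score = {n: sum(score[m] / sqrt(deg[n] * deg[m]) for m in neighbors[n])
                 for n in nodes}
    return score


def rank_genes(edges, seeds, genes, steps=2):
    """Return the given gene nodes ordered by propagated score, best first."""
    score = propagate(edges, seeds, steps)
    return sorted((g for g in genes if g in score),
                  key=lambda g: (-score[g], g))
```

Genes reachable from the query within `steps` hops receive a positive score; unreachable genes stay at zero and sort last, with ties broken by label.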
-
Publication No.: US10176188B2
Publication date: 2019-01-08
Application No.: US13752620
Filing date: 2013-01-29
Inventors: Rajgopal Srinivasan, Thomas Joseph, Venkat Raghavan Ganesh Sekar, Saipradeep Govindakrishnan Vangala, Naveen Sivadasan
Abstract: Systems and methods for automated creation of a dictionary of scientific terms are described herein. Initially, input data is filtered to obtain a primary file having a plurality of term-ID pairs, with each term-ID pair having a unique term ID and a scientific term. Further, a remove-term file is generated based on one or more term-ID pairs identified from the primary file such that the scientific term of each identified term-ID pair corresponds to one of additional terms, frequent scientific terms, and undesirable terms. At least one term-ID pair from among the one or more term-ID pairs is altered to obtain a modified term-ID pair based on modification rules. The modified term-ID pair is added to an add-term file, and a modified file is obtained based on the remove-term file and the add-term file. Duplicate term-ID pairs present in the modified file are removed to obtain the dictionary of scientific terms.
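The pipeline described in this abstract (filter, remove-term file, add-term file, modified file, de-duplication) can be sketched with in-memory lists standing in for the files. The tab-separated input format, the case-insensitive lookup against a set of unwanted terms, and the `modify` callback standing in for the "modification rules" are assumptions made for illustration; the abstract does not specify them.

```python
def build_dictionary(raw_lines, unwanted_terms, modify):
    """Build a de-duplicated list of (term_id, term) pairs.

    raw_lines:      iterable of "ID<TAB>term" strings (assumed input format)
    unwanted_terms: lowercase terms to pull out (frequent, undesirable, ...)
    modify:         hypothetical rule; returns a replacement term or None
    """
    # Step 1: filter the input into the "primary file" of well-formed pairs.
    primary = []
    for line in raw_lines:
        parts = line.strip().split("\t")
        if len(parts) == 2 and parts[0] and parts[1]:
            primary.append((parts[0], parts[1]))

    # Step 2: the "remove-term file" holds pairs whose term is unwanted.
    remove = [(tid, term) for tid, term in primary
              if term.lower() in unwanted_terms]

    # Step 3: the "add-term file" holds altered versions of some removed pairs.
    add = []
    for tid, term in remove:
        new_term = modify(term)
        if new_term is not None:
            add.append((tid, new_term))

    # Step 4: modified file = primary minus removed pairs plus added pairs.
    removed = set(remove)
    modified = [pair for pair in primary if pair not in removed] + add

    # Step 5: drop duplicate pairs while keeping first-seen order.
    seen, dictionary = set(), []
    for pair in modified:
        if pair not in seen:
            seen.add(pair)
            dictionary.append(pair)
    return dictionary
```

A rule such as `lambda t: "CAT gene" if t == "cat" else None` would keep an ambiguous gene symbol in a disambiguated form while common words are simply dropped.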
-
Publication No.: US20130218849A1
Publication date: 2013-08-22
Application No.: US13752620
Filing date: 2013-01-29
Inventors: Rajgopal Srinivasan, Thomas Joseph, Venkat Raghavan Ganesh Sekar, Saipradeep Govindakrishnan Vangala, Naveen Sivadasan
IPC classification: G06F17/30
CPC classification: G06F17/30156, G06F17/2735, G06F17/30731, G06F19/28
Abstract: Systems and methods for automated creation of a dictionary of scientific terms are described herein. Initially, input data is filtered to obtain a primary file having a plurality of term-ID pairs, with each term-ID pair having a unique term ID and a scientific term. Further, a remove-term file is generated based on one or more term-ID pairs identified from the primary file such that the scientific term of each identified term-ID pair corresponds to one of additional terms, frequent scientific terms, and undesirable terms. At least one term-ID pair from among the one or more term-ID pairs is altered to obtain a modified term-ID pair based on modification rules. The modified term-ID pair is added to an add-term file, and a modified file is obtained based on the remove-term file and the add-term file. Duplicate term-ID pairs present in the modified file are removed to obtain the dictionary of scientific terms.
-