Gene discovery through comparisons of networks of structural and functional relationships among known genes and proteins
    1.
    Type: Invention patent application
    Legal status: Under examination (published)

    Publication No.: US20060069512A1

    Publication date: 2006-03-30

    Application No.: US10921286

    Filing date: 2004-08-18

    IPC classification: G06F19/00

    Abstract: The present invention relates to methods for identifying novel genes comprising: (i) generating one or more specialized databases containing information on gene/protein structure, function and/or regulatory interactions; and (ii) searching the specialized databases for homology or for a particular motif and thereby identifying a putative novel gene of interest. The invention may further comprise performing simulation and hypothesis testing to identify or confirm that the putative gene is a novel gene of interest. The present invention also relates to natural language processing and extraction of relational information associated with genes and proteins that are found in genomics journal articles. To enable access to information in textual form, the natural language processing system of the present invention provides a method for extracting and structuring information found in the literature in a form appropriate for subsequent applications.
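As a rough illustration of step (ii) in the abstract above (searching a specialized database for a particular motif), the following Python sketch scans a small in-memory collection of protein sequences for a PROSITE-style pattern. It is not the patented method: the database contents, the sequence names, and the helper functions `prosite_to_regex` and `search_motif` are hypothetical and chosen only for this example. The pattern used, N-{P}-[ST]-{P}, is the standard PROSITE N-glycosylation site pattern.

```python
import re

# Hypothetical "specialized database": protein name -> amino-acid sequence.
# The names and sequences are made up for illustration.
PROTEIN_DB = {
    "protA": "MKTAYIAKQRNGTQISFVKSHFSRQ",
    "protB": "MSLLPAPGGGSW",
    "protC": "MANNSTPLQ",
}


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern (e.g. N-{P}-[ST]-{P}) to a regex.

    Only three element kinds are handled in this sketch:
    'x' (any residue), '{...}' (any residue except these), and
    literal residues or '[...]' classes, which pass through unchanged.
    """
    parts = []
    for element in pattern.split("-"):
        if element == "x":
            parts.append(".")
        elif element.startswith("{") and element.endswith("}"):
            parts.append("[^" + element[1:-1] + "]")
        else:
            parts.append(element)
    return "".join(parts)


def search_motif(db: dict, pattern: str) -> dict:
    """Return {protein name: [0-based start positions]} for non-overlapping hits."""
    regex = re.compile(prosite_to_regex(pattern))
    hits = {}
    for name, sequence in db.items():
        positions = [match.start() for match in regex.finditer(sequence)]
        if positions:
            hits[name] = positions
    return hits


if __name__ == "__main__":
    print(search_motif(PROTEIN_DB, "N-{P}-[ST]-{P}"))
```

A homology search, the other option named in the abstract, would normally rely on a sequence-alignment tool rather than a regular expression; the sketch covers only the motif case.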

    Methods for extracting information on interactions between biological entities from natural language text data
    3.
    Type: Granted invention patent
    Legal status: Lapsed

    Publication No.: US06950753B1

    Publication date: 2005-09-27

    Application No.: US09549827

    Filing date: 2000-04-14

    Abstract: The present invention relates to methods for identifying novel genes comprising: (i) generating one or more specialized databases containing information on gene/protein structure, function and/or regulatory interactions; and (ii) searching the specialized databases for homology or for a particular motif and thereby identifying a putative novel gene of interest. The invention may further comprise performing simulation and hypothesis testing to identify or confirm that the putative gene is a novel gene of interest. The present invention also relates to natural language processing and extraction of relational information associated with genes and proteins that are found in genomics journal articles. To enable access to information in textual form, the natural language processing system of the present invention provides a method for extracting and structuring information found in the literature in a form appropriate for subsequent applications.
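To make the idea of "extracting and structuring" relational information from text more concrete, here is a minimal Python sketch that pulls agent/action/target triples out of sentences. It is a deliberately simple pattern-matching stand-in, not the natural language processing system described in the patent: the verb list `INTERACTION_VERBS`, the regular expression, and the function `extract_interactions` are all assumptions made for this example, and a real system would need proper parsing and entity recognition.

```python
import re

# Hypothetical controlled vocabulary of interaction verbs (illustrative only).
INTERACTION_VERBS = ("activates", "inhibits", "binds", "phosphorylates")

# Matches "<entity> <verb> <entity>", where an entity is a single token made of
# letters, digits, and hyphens (e.g. "Raf-1", "p53").
PATTERN = re.compile(
    r"\b([A-Za-z0-9-]+)\s+(" + "|".join(INTERACTION_VERBS) + r")\s+([A-Za-z0-9-]+)\b"
)


def extract_interactions(text: str) -> list:
    """Return one structured record per '<entity> <verb> <entity>' match."""
    return [
        {"agent": m.group(1), "action": m.group(2), "target": m.group(3)}
        for m in PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    sample = "Raf-1 activates MEK1. In these cells, p53 inhibits MDM2 expression."
    for record in extract_interactions(sample):
        print(record)
```

Records in this form could then be loaded into a database of gene/protein interactions of the kind mentioned in step (i) of the abstract.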

    Gene discovery through comparisons of networks of structural and functional relationships among known genes and proteins
    5.
    Type: Granted invention patent
    Legal status: Lapsed

    Publication No.: US06633819B2

    Publication date: 2003-10-14

    Application No.: US09327983

    Filing date: 1999-06-08

    IPC classification: G01N31/00

    Abstract: The present invention relates to methods for identifying novel genes comprising: (i) generating one or more specialized databases containing information on gene/protein structure, function and/or regulatory interactions; and (ii) searching the specialized databases for homology or for a particular motif and thereby identifying a putative novel gene of interest. The invention may further comprise performing simulation and hypothesis testing to identify or confirm that the putative gene is a novel gene of interest.

    SYSTEMS AND METHODS FOR USING MOLECULAR NETWORKS IN GENETIC LINKAGE ANALYSIS OF COMPLEX TRAITS
    6.
    Type: Invention patent application
    Legal status: Under examination (published)

    Publication No.: US20090138203A1

    Publication date: 2009-05-28

    Application No.: US12207024

    Filing date: 2008-09-09

    IPC classification: G06F17/18 G06F19/00

    Abstract: The present disclosed subject matter relates to methods of using molecular networks in whole genome genetic linkage analysis of complex inherited disorders, including determining gene-specific linkage probability values for one or more genes represented in a predetermined molecular interaction network. The present disclosed subject matter further relates to methods of identifying one or more genes that are associated with one or more heritable diseases, and methods of diagnosing the heritable diseases.

    Methods and systems for extracting synonymous gene and protein terms from biological literature
    7.
    Type: Invention patent application
    Legal status: Under examination (published)

    Publication No.: US20050033568A1

    Publication date: 2005-02-10

    Application No.: US10915168

    Filing date: 2004-08-09

    IPC classification: G06F17/27 G06F7/00

    Abstract: The present invention generally provides methods for extracting gene and/or protein synonyms from text by processing a plurality of documents making up a text corpus, tagging a plurality of terms, each term identifying at least one of a gene and a protein from the text corpus, and determining whether at least two of the tagged terms are synonyms identifying a common gene or protein using one or more of expert knowledge or machine learning techniques, including unsupervised, partially supervised, and supervised machine learning techniques.