Patent Application
- Patent title: Methods of Clustering Gene and Protein Sequences
- Patent title (Chinese): 聚类基因和蛋白质序列的方法
- Application number: US12086717
- Filing date: 2006-12-19
- Publication number: US20090327170A1
- Publication date: 2009-12-31
- Inventors: Claudio Donati, Duccio Medini, Antonello Covacci
- Applicants: Claudio Donati, Duccio Medini, Antonello Covacci
- International application: PCT/IB2006/003901 WO 20061219
- Main classification: G06F15/18
- IPC classification: G06F15/18; C07K14/00; C07K16/18; C12N15/11; A61K39/395; A61K39/00; A61K31/7088; A61K38/16
Abstract:
The invention relates to methods for clustering gene and protein sequences. In particular, it involves generating networks of sequences in which the interconnections are based upon a measure of similarity. The invention also provides methods of optimizing and improving the networks by re-wiring them based upon the overlap of the nearest neighbors of given pairs of nodes. The invention further provides methods of identifying clusters of sequences within the networks and the optimized networks based upon the topology of the network. The clusters identified represent groups of sequences that are related by function and/or evolution. The invention is particularly applicable to the annotation of sequences in databases and to the identification of functional homologs. Such homologs can be very useful as novel therapeutic and diagnostic targets when they belong to a cluster or family that contains a known sequence, such as a diagnostic sequence, antigen or other therapeutic target.
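
The three steps named in the abstract (similarity network, re-wiring by neighbor overlap, topology-based clusters) can be illustrated with a minimal Python sketch. The abstract does not specify the similarity measure, the overlap formula, the thresholds, or the clustering rule, so everything concrete here is an assumption made for illustration: the scores are made-up numbers, overlap is taken as the Jaccard index of closed neighborhoods, and clusters are taken as connected components. It is not the method claimed in the patent.

```python
from itertools import combinations


def build_network(similarity, threshold):
    """Link two sequences when their similarity score reaches the threshold."""
    graph = {}
    for (a, b), score in similarity.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def neighbor_overlap(graph, a, b):
    """Jaccard index of closed neighborhoods (assumed overlap measure)."""
    na = graph[a] | {a}
    nb = graph[b] | {b}
    return len(na & nb) / len(na | nb)


def rewire(graph, min_overlap):
    """Rebuild the edge set: link every pair whose overlap is high enough."""
    rewired = {node: set() for node in graph}
    for a, b in combinations(sorted(graph), 2):
        if neighbor_overlap(graph, a, b) >= min_overlap:
            rewired[a].add(b)
            rewired[b].add(a)
    return rewired


def find_clusters(graph):
    """Return connected components (assumed stand-in for topology-based clusters)."""
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return result


# Hypothetical scores: two tight groups joined by one weak link (C-D).
similarity = {
    ("A", "B"): 0.90, ("A", "C"): 0.80, ("B", "C"): 0.85,
    ("C", "D"): 0.60,
    ("D", "E"): 0.90, ("D", "F"): 0.80, ("E", "F"): 0.90,
}
network = build_network(similarity, threshold=0.5)
clusters_before = find_clusters(network)
clusters_after = find_clusters(rewire(network, min_overlap=0.5))
```

In this toy input the single C-D link merges all six sequences into one component. C and D share few neighbors, so the re-wiring step drops that link and the two groups separate.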