Aligning and clustering sequence patterns to reveal classificatory functionality of sequences
Abstract:
A system and method for discovering sequence patterns with variations are provided. The method includes accessing or acquiring a data set that includes a family of sequences or related families of sequences, and then: a) applying a pattern discovery process to the sequences; b) grouping and aligning similar patterns, which may have different lengths, into one or more Aligned Pattern Clusters; c) discovering the co-occurrence relations between Aligned Patterns and/or Aligned Pattern Clusters to reveal the distal function between the segments represented by the Aligned Pattern Clusters; and d) breaking down an Aligned Pattern Cluster into sub-clusters with a stable cluster configuration, revealing the distinct and shared characteristics among sub-families of the sequences.
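Steps b) and c) can be illustrated with a minimal sketch. It is not the claimed method: the abstract does not specify a similarity measure or a co-occurrence statistic, so this example assumes `difflib.SequenceMatcher` ratios for pattern similarity, a greedy grouping rule with a hypothetical `threshold` parameter, and a Jaccard index for co-occurrence. The function names and the toy patterns and sequences are likewise made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1] between two patterns of possibly different lengths."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_patterns(patterns, threshold=0.6):
    """Greedy grouping (an assumed stand-in for step b).

    A pattern joins the first existing cluster that has at least one
    member with similarity >= threshold; otherwise it starts a new cluster.
    """
    clusters = []
    for p in patterns:
        for cluster in clusters:
            if any(similarity(p, q) >= threshold for q in cluster):
                cluster.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def cooccurrence(cluster_a, cluster_b, sequences):
    """Jaccard index (an assumed stand-in for step c).

    Compares the set of sequences containing any pattern of cluster_a
    with the set containing any pattern of cluster_b.
    """
    hits_a = {i for i, s in enumerate(sequences) if any(p in s for p in cluster_a)}
    hits_b = {i for i, s in enumerate(sequences) if any(p in s for p in cluster_b)}
    union = hits_a | hits_b
    return len(hits_a & hits_b) / len(union) if union else 0.0


# Toy data, invented for this example only.
patterns = ["ACGT", "ACGTT", "ACCT", "GGGA", "GGGAA"]
sequences = ["TTACGTCCGGGA", "ACCTGGGAA", "ACGTT", "CCCC"]

clusters = cluster_patterns(patterns)
score = cooccurrence(clusters[0], clusters[1], sequences)
print(clusters)
print(round(score, 3))
```

A high co-occurrence score between two clusters suggests that the segments they represent tend to appear in the same sequences, which is the kind of relation step c) looks for. Step d) would then split a cluster further, which this sketch does not attempt.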