METHODS FOR IDENTIFYING SEQUENCE MOTIFS, AND APPLICATIONS THEREOF
    Document type: Invention application
    Status: Under examination, published

    Publication No.: US20090208955A1

    Publication date: 2009-08-20

    Application No.: US12302199

    Filing date: 2006-11-30

    IPC classification: C12Q1/68

    CPC classification: G16B30/00 C12P21/00 G16B10/00

    Abstract: The present invention relates to methods and algorithms that can be used to identify sequence motifs that are either under- or over-represented in a given nucleotide sequence as compared to the frequency of those sequences that would be expected to occur by chance, or that are either under- or over-represented as compared to the frequency of those sequences that occur in other nucleotide sequences, and to methods of scoring sequences based on the occurrence of these sequence motifs. Such sequence motifs may be biologically significant; for example, they may constitute transcription factor binding sites, mRNA stability/instability signals, epigenetic signals, and the like. The methods of the invention can also be used, inter alia, to classify sequences or organisms in terms of their phylogenetic relationships, or to identify the likely host of a pathogenic organism. The methods of the present invention can also be used to optimize expression of proteins.
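
The abstract's core idea, comparing how often a motif occurs with how often it would be expected by chance, can be illustrated with a minimal Python sketch. This is not the method claimed in the application, which this record does not describe. The function name `kmer_representation`, the choice of fixed-length k-mers as motifs, and the chance model (independent bases drawn at the sequence's own mononucleotide frequencies) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_representation(seq, k=2):
    """Return {kmer: observed / expected} for k-mers over A, C, G, T.

    Hypothetical helper, not taken from the patent. The expected count
    assumes independent bases at the sequence's own mononucleotide
    frequencies. A ratio below 1 suggests under-representation, a ratio
    above 1 suggests over-representation.
    """
    seq = seq.upper()
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        raise ValueError("sequence is shorter than k")

    # Mononucleotide frequencies, ignoring any non-ACGT symbols.
    base_counts = Counter(seq)
    total = sum(base_counts[b] for b in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    base_freq = {b: base_counts[b] / total for b in "ACGT"}

    # Observed counts from overlapping windows of length k.
    observed = Counter(seq[i:i + k] for i in range(n_windows))

    ratios = {}
    for kmer in map("".join, product("ACGT", repeat=k)):
        probability = 1.0
        for base in kmer:
            probability *= base_freq[base]
        expected = probability * n_windows
        # Skip k-mers that cannot occur under the chance model.
        if expected > 0:
            ratios[kmer] = observed[kmer] / expected
    return ratios
```

Calling `kmer_representation("ACACACAC", k=2)` gives a ratio above 1 for `AC` and `CA` and a ratio of 0 for `AA` and `CC`. A real analysis would also need a significance test and a richer background model, which this sketch leaves out.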
