Ribonucleic acid interference molecules and binding sites derived by analyzing intergenic and intronic regions of genomes
    1.
    Patent application
    Status: in force

    Publication No.: US20110178283A1

    Publication date: 2011-07-21

    Application No.: US11408557

    Filing date: 2006-04-21

    IPC classification: C07H21/02

    CPC classification: G06F19/22 G06F19/18

    Abstract: Sequences that can be used in the context of controlled gene regulation are provided. In one aspect, at least one sequence comprising at least one of one or more sequences having SEQ ID NO: 1 through SEQ ID NO: 747,326 is provided. One or more of the provided sequences may be computationally predicted, e.g., from publicly available genomes, using a method based on pattern discovery. In another aspect, a method for regulating the expression of a transcript comprises the step of said transcript containing a region that corresponds to at least one of the provided sequences having SEQ ID NO: 1 through SEQ ID NO: 747,326, the region being targeted either by a naturally occurring, or appropriately designed, interfering RNA molecule that regulates the expression of said transcript through post-transcriptional silencing. In a third aspect, a method for regulating the expression of a transcript comprises the step of at least one of the provided sequences having SEQ ID NO: 1 through SEQ ID NO: 747,326 being used to design an interfering RNA molecule that contains a region that corresponds to the reverse complement of one or more of the one or more sequences having SEQ ID NO: 1 through SEQ ID NO: 747,326, the interfering molecule regulating, through post-transcriptional silencing, one or more transcripts that contain said sequence of the one or more sequences having SEQ ID NO: 1 through SEQ ID NO: 747,326, or a substantial fraction thereof.
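
The third aspect of the abstract above rests on taking the reverse complement of a target sequence. A minimal Python sketch of that single step follows. The function names and the sample transcripts are hypothetical and are not taken from the patent, and the sketch says nothing about how the listed sequences were predicted.

```python
# Illustrative only: names and data are assumptions, not the patent's method.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(target):
    """Return the reverse complement of an RNA sequence, read 5' to 3'."""
    rna = target.upper().replace("T", "U")  # accept DNA-style input as well
    if set(rna) - set("ACGU"):
        raise ValueError("sequence contains characters other than A, C, G, U/T")
    return rna.translate(_RNA_COMPLEMENT)[::-1]


def transcripts_containing(site, transcripts):
    """List the IDs of transcripts (dict of id -> sequence) that contain the site."""
    site_rna = site.upper().replace("T", "U")
    return [name for name, seq in transcripts.items()
            if site_rna in seq.upper().replace("T", "U")]


if __name__ == "__main__":
    site = "AUGC"
    print(reverse_complement_rna(site))
    print(transcripts_containing(site, {"t1": "GGAUGCAA", "t2": "CCCC"}))
```

An interfering molecule designed this way would carry the reverse complement of the site, so the transcripts returned by `transcripts_containing` are the candidates it could pair with under this simplified exact-match assumption.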

    RIBONUCLEIC ACID INTERFERENCE MOLECULES OF ARABIDOPSIS THALIANA
    2.
    Patent application
    Status: in force

    Publication No.: US20080311661A1

    Publication date: 2008-12-18

    Application No.: US12183166

    Filing date: 2008-07-31

    IPC classification: C12N15/82 C12N15/11 C12N15/00

    Abstract: Sequences of ribonucleic acid interference molecules are provided. For example, in one aspect, at least one nucleic acid molecule comprising at least one of one or more precursor sequences having SEQ_ID NO: 1 through SEQ_ID NO: 3,197 and one or more corresponding mature sequences having SEQ_ID NO: 3,198 through SEQ_ID NO: 6,565 is provided. Techniques are also provided for regulating gene expression.

    Method and system for the detection of atypical sequences via generalized compositional methods
    3.
    Patent application
    Status: lapsed

    Publication No.: US20050267692A1

    Publication date: 2005-12-01

    Application No.: US10855367

    Filing date: 2004-05-28

    CPC classification: G06F19/22 Y10S707/99936

    Abstract: A method and system for determining whether a sequence fragment g is atypical with respect to a reference sequence G using compositional methods and including constructing a template from G and g respectively containing a sequence of characters for a comparison with one another, wherein a number of characters contained in the template exceeds two. For the case where the sequences at hand are genetic, the atypicality detection can be used to determine whether a given sequence fragment g is the result of a horizontal transfer event.
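
One common way to compare the composition of a fragment g with a reference G is to compare k-mer frequencies. The sketch below does that with contiguous k-mers of length greater than two, echoing the abstract's requirement that templates contain more than two characters. The L1 distance, the threshold value, and all function names are assumptions made for illustration; the abstract does not specify the template shape or the comparison measure.

```python
from collections import Counter


def kmer_frequencies(seq, k):
    """Relative frequency of every contiguous k-mer in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def composition_distance(fragment, reference, k=3):
    """L1 distance between k-mer frequency profiles (0 = identical, 2 = disjoint)."""
    if k <= 2:
        raise ValueError("this sketch assumes templates longer than two characters")
    f = kmer_frequencies(fragment, k)
    r = kmer_frequencies(reference, k)
    return sum(abs(f.get(m, 0.0) - r.get(m, 0.0)) for m in set(f) | set(r))


def is_atypical(fragment, reference, k=3, threshold=1.0):
    """Flag the fragment when its composition differs by more than an assumed threshold."""
    return composition_distance(fragment, reference, k) > threshold


if __name__ == "__main__":
    print(composition_distance("ACGTACGT", "ACGTACGT"))
    print(is_atypical("AAAAAA", "CCCCCC"))
```

In practice the threshold would have to be calibrated against fragments drawn from G itself; the value used here is a placeholder.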

    Methods and apparatus for performing sequence homology detection
    4.
    Granted patent
    Status: lapsed

    Publication No.: US06785672B1

    Publication date: 2004-08-31

    Application No.: US09582044

    Filing date: 2000-06-21

    IPC classification: G06F17/30

    Abstract: In a sequence homology detection aspect of the invention, a computer-based method of detecting homologies between a plurality of sequences in a database and a query sequence comprises the following steps. First, the method includes accessing patterns associated with the database, each pattern representing at least a portion of one or more sequences in the database. Next, the query sequence is compared to the patterns to detect whether one or more portions of the query sequence are homologous to portions of the sequences of the database represented by the patterns. Then, a score is generated for each sequence detected to be homologous to the query sequence, wherein the sequence score is based on individual scores generated in accordance with each homologous portion of the sequence detected, and the sequence score represents a degree of homology between the query sequence and the detected sequence.
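
The three steps in the abstract above (access patterns, compare the query to them, score each database sequence) can be illustrated with a small Python sketch. It assumes patterns are stored as regular expressions mapped to the IDs of the database sequences they represent, and it scores each sequence by summing the lengths of the query portions matched by its patterns. Both choices, and the name `score_homologs`, are illustrative assumptions rather than the patented scoring scheme.

```python
import re
from collections import defaultdict


def score_homologs(query, pattern_index):
    """Score database sequences by the query portions their patterns match.

    pattern_index maps a regular-expression pattern to the list of database
    sequence IDs that the pattern represents (an assumed data layout).
    """
    scores = defaultdict(int)
    for pattern, seq_ids in pattern_index.items():
        # Individual score per homologous portion: length of the matched text.
        portion_score = sum(len(m.group(0)) for m in re.finditer(pattern, query))
        if portion_score:
            for seq_id in seq_ids:
                scores[seq_id] += portion_score
    return dict(scores)


if __name__ == "__main__":
    index = {"K.L": ["s1", "s2"], "AAG": ["s1"], "WW": ["s3"]}
    print(score_homologs("MKVLAAGG", index))
```

Sequences whose patterns never match the query do not appear in the result, which corresponds to "not detected as homologous" in this simplified reading.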

    Tandem repeat detection using pattern discovery
    5.
    Granted patent
    Status: lapsed

    Publication No.: US06446011B1

    Publication date: 2002-09-03

    Application No.: US09528601

    Filing date: 2000-03-20

    IPC classification: G01N33/48

    Abstract: An algorithm which detects tandem repeats (TR) is provided. In an illustrative embodiment, a set of repeating units contained in an input sequence is identified, wherein each given repeating unit satisfies at least the following conditions: (a) a first measure of similarity between adjacent repeating units in the set is greater than a first user defined threshold, and (b) the given repeating unit includes at least one unit having a second measure of similarity with any other unit in the set that is greater than a second user defined threshold. The method then provides for reporting positions in the input sequence that are covered by the set of repeating units.
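
Condition (a) of the abstract above, together with the final reporting step, can be sketched in Python as follows. The sketch assumes a fixed, user-supplied repeat period and uses the fraction of identical positions as the similarity measure. It does not implement condition (b) or the pattern-discovery step named in the title, and the function names are hypothetical.

```python
def similarity(a, b):
    """Fraction of positions at which two equal-length units agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def tandem_repeat_positions(seq, period, min_similarity=0.8, min_units=2):
    """Return sorted positions covered by runs of similar adjacent units."""
    covered = set()
    start = 0
    while start + 2 * period <= len(seq):
        end = start + period  # exclusive end of the current run of units
        # Extend the run while the next unit is similar enough to the last one.
        while (end + period <= len(seq)
               and similarity(seq[end - period:end],
                              seq[end:end + period]) > min_similarity):
            end += period
        if (end - start) // period >= min_units:
            covered.update(range(start, end))
            start = end
        else:
            start += 1
    return sorted(covered)


if __name__ == "__main__":
    # Three copies of "GAC" start at index 1 of this toy sequence.
    print(tandem_repeat_positions("GGACGACGACGTT", period=3))
```

Lowering `min_similarity` lets the run tolerate mismatches between adjacent units, which is the role the first user-defined threshold plays in the abstract.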

    Method and apparatus for pattern discovery in protein sequences
    6.
    Granted patent
    Status: lapsed

    Publication No.: US06373971B1

    Publication date: 2002-04-16

    Application No.: US09023792

    Filing date: 1998-02-13

    IPC classification: G06K9/68

    Abstract: The method of the present invention discovers patterns in protein sequences in two phases. In a sampling phase, preferably proper templates corresponding to a group of protein sequences are generated. Patterns corresponding to the templates are then generated and stored in memory. In a convolution phase, the patterns stored in memory are combined to identify a set of maximal patterns.

    Sequence pattern descriptors for transmembrane structural details
    7.
    Granted patent
    Status: lapsed

    Publication No.: US07698067B2

    Publication date: 2010-04-13

    Application No.: US10305552

    Filing date: 2002-11-27

    IPC classification: G01N33/48 G01N31/00 G06G7/48

    CPC classification: G06F19/24 G06F19/16 G06F19/22

    Abstract: The relationship between an amino acid sequence of a protein and its three-dimensional structure is at the very core of structural biology and bioinformatics. The occurrence and conservation of non-canonical conformations is a “local” phenomenon, i.e., non-canonical conformations are encoded intra-helically by short peptide sequences (heptapeptides at most). Effective descriptors can be formed for these short sequences employing training sets. Multiple, distinct patterns are created representing these sequences. A composite descriptor is formed by selecting from among the patterns discovered. The composite descriptor has a high level of sensitivity and specificity and, at the same time, a boosted signal-to-noise ratio.

    Apparatus, machine-readable medium, and system for the detection of atypical sequences via generalized compositional methods
    8.
    Granted patent
    Status: lapsed

    Publication No.: US07613662B2

    Publication date: 2009-11-03

    Application No.: US10855367

    Filing date: 2004-05-28

    IPC classification: G06N3/12

    CPC classification: G06F19/22 Y10S707/99936

    Abstract: A method and system for determining whether a sequence fragment g is atypical with respect to a reference sequence G using compositional methods and including constructing a template from G and g respectively containing a sequence of characters for a comparison with one another, wherein a number of characters contained in the template exceeds two. For the case where the sequences at hand are genetic, the atypicality detection can be used to determine whether a given sequence fragment g is the result of a horizontal transfer event.

    System and method for identifying genes
    9.
    Granted patent

    Publication No.: US07561974B2

    Publication date: 2009-07-14

    Application No.: US10059421

    Filing date: 2002-01-31

    IPC classification: G01N33/48

    CPC classification: G06F19/22

    Abstract: A system and method for identifying genes that employs a pattern database, an input device for inputting a DNA sequence, and a processor for processing the DNA sequence and patterns to identify a putative gene. The processor may determine open reading frames (ORFs) in the DNA sequence, generate an amino acid translation for each ORF, and identify a match of a pattern in the amino acid translation.
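
The processing chain in the abstract above (determine ORFs, translate each ORF, match patterns against the translation) can be sketched in Python. The sketch assumes the standard genetic code, scans only the three forward reading frames, and treats the pattern database as a plain list of regular expressions. These simplifications and all function names are assumptions, not details from the patent.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def find_orfs(dna, min_codons=2):
    """Return (start, end) pairs for ATG...stop ORFs in the forward frames."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and CODON_TABLE.get(codon) == "*":
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # end includes the stop codon
                start = None
    return orfs


def translate(orf_dna):
    """Translate an ORF; unknown codons become 'X', the trailing stop is dropped."""
    codons = (orf_dna[i:i + 3] for i in range(0, len(orf_dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons).rstrip("*")


def putative_genes(dna, patterns):
    """Report ORFs whose translation matches at least one pattern."""
    hits = []
    for start, end in find_orfs(dna):
        protein = translate(dna[start:end].upper())
        matched = [p for p in patterns if re.search(p, protein)]
        if matched:
            hits.append((start, end, protein, matched))
    return hits


if __name__ == "__main__":
    print(putative_genes("CCATGAAATGGTAACC", ["M.W", "QQ"]))
```

A fuller treatment would also scan the reverse-complement strand and handle alternative start codons; both are omitted here to keep the example short.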

    Techniques for Linking Non-Coding and Gene-Coding Deoxyribonucleic Acid Sequences and Applications Thereof
    10.
    Patent application
    Status: in force

    Publication No.: US20080052008A1

    Publication date: 2008-02-28

    Application No.: US11928611

    Filing date: 2007-10-30

    IPC classification: G06F19/00 C07H21/04 C12Q1/68

    CPC classification: G06F19/22 G06F19/18

    Abstract: Techniques for linking non-coding and gene coding regions of a genome are provided. In one aspect, a method of determining associations between non-coding sequences and gene coding sequences in a genome of an organism comprises the following steps. At least one conserved region is identified from one or more non-coding sequences. Additional instances of the conserved region are located in the untranslated or amino acid coding regions of one or more genes in the organism under consideration, and the conserved region is associated with the one or more biological processes in which these one or more genes participate.