Abstract:
Sequences that can be used in the context of controlled gene regulation are provided. In one aspect, at least one sequence comprising at least one of one or more sequences having SEQ ID NO: 1 through SEQ ID NO: 747,326 is provided. One or more of the provided sequences may be computationally predicted, e.g., from publicly available genomes, using a method based on pattern discovery. In another aspect, a method for regulating the expression of a transcript comprises the step of said transcript containing a region that corresponds to at least one of the provided sequences having SEQ ID NO: 1 through SEQ ID NO: 747,326, the region being targeted either by a naturally occurring, or appropriately designed, interfering RNA molecule that regulates the expression of said transcript through post-transcriptional silencing. In a third aspect, a method for regulating the expression of a transcript comprises the step of at least one of the provided sequences having SEQ ID NO: 1 through SEQ ID NO: 747,326 being used to design an interfering RNA molecule that contains a region that corresponds to the reverse complement of one or more of the one or more sequences having SEQ ID NO: 1 through SEQ ID NO: 747,326, the interfering molecule regulating, through post-transcriptional silencing, one or more transcripts that contain said sequence of the one or more sequences having SEQ ID NO: 1 through SEQ ID NO: 747,326, or a substantial fraction thereof.
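As a minimal illustration of the third aspect, the sketch below derives an interfering RNA region as the reverse complement of a target sequence and locates perfectly complementary sites in a transcript. The function names and the example sequences are hypothetical and are not taken from the sequence listing. Perfect Watson-Crick pairing is assumed for simplicity.

```python
# Watson-Crick complement table for the RNA alphabet (A-U, G-C).
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(target: str) -> str:
    """Return the RNA reverse complement of a target region (DNA or RNA input)."""
    rna = target.upper().replace("T", "U")
    if set(rna) - set("ACGU"):
        raise ValueError("target must contain only A, C, G and T/U")
    return rna.translate(_COMPLEMENT)[::-1]


def find_target_sites(transcript: str, guide: str) -> list[int]:
    """Return 0-based start positions where the guide pairs perfectly with the transcript."""
    # The site on the transcript is the reverse complement of the guide.
    site = reverse_complement_rna(guide)
    rna = transcript.upper().replace("T", "U")
    return [i for i in range(len(rna) - len(site) + 1) if rna.startswith(site, i)]


# Hypothetical example: design a guide against a 9-nt target region.
guide = reverse_complement_rna("AUGGCUAAC")
sites = find_target_sites("GGAUGGCUAACGG", guide)
```

Here `guide` is the designed region and `sites` lists where a transcript contains the corresponding target.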
Abstract:
Sequences of ribonucleic acid interference molecules are provided. For example, in one aspect, at least one nucleic acid molecule comprising at least one of one or more precursor sequences having SEQ_ID NO: 1 through SEQ_ID NO: 3,197 and one or more corresponding mature sequences having SEQ_ID NO: 3,198 through SEQ_ID NO: 6,565 is provided. Techniques are also provided for regulating gene expression.
Abstract:
A method and system for determining whether a sequence fragment g is atypical with respect to a reference sequence G using compositional methods and including constructing a template from G and g respectively containing a sequence of characters for a comparison with one another, wherein a number of characters contained in the template exceeds two. For the case where the sequences at hand are genetic, the atypicality detection can be used to determine whether a given sequence fragment g is the result of a horizontal transfer event.
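One simple way to picture a compositional comparison is to build frequency profiles of three-character words (more than two characters, as the abstract requires) from G and from g and then measure how far apart the profiles are. The sketch below does this with an L1 distance. The choice of contiguous 3-mers, the distance measure, and the function names are assumptions made for illustration, not the method claimed.

```python
from collections import Counter


def kmer_profile(seq: str, k: int = 3) -> dict[str, float]:
    """Relative frequency of each k-character word in the sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def composition_distance(reference: str, fragment: str, k: int = 3) -> float:
    """L1 distance between the k-mer profiles of the reference G and the fragment g."""
    p = kmer_profile(reference, k)
    q = kmer_profile(fragment, k)
    # Sum absolute frequency differences over every word seen in either profile.
    return sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in set(p) | set(q))
```

A fragment whose distance from the reference exceeds a chosen cutoff would be flagged as atypical. The cutoff itself would have to be calibrated on fragments known to be native to G.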
Abstract:
In a sequence homology detection aspect of the invention, a computer-based method of detecting homologies between a plurality of sequences in a database and a query sequence comprises the following steps. First, the method includes accessing patterns associated with the database, each pattern representing at least a portion of one or more sequences in the database. Next, the query sequence is compared to the patterns to detect whether one or more portions of the query sequence are homologous to portions of the sequences of the database represented by the patterns. Then, a score is generated for each sequence detected to be homologous to the query sequence, wherein the sequence score is based on individual scores generated in accordance with each homologous portion of the sequence detected, and the sequence score represents a degree of homology between the query sequence and the detected sequence.
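The three steps above can be sketched as follows, under strong simplifying assumptions: patterns are written as regular expressions, each pattern carries the identifiers of the database sequences it represents, and the score of a matched portion is simply its length. None of these choices is specified by the abstract, and all names are hypothetical.

```python
import re
from collections import defaultdict


def score_homologs(query: str, patterns: list[tuple[str, set[str]]]) -> dict[str, int]:
    """Score database sequences by the pattern matches they share with the query.

    patterns: (regex, ids of the database sequences the pattern represents).
    """
    scores: dict[str, int] = defaultdict(int)
    for regex, seq_ids in patterns:
        for match in re.finditer(regex, query):
            # Assumed per-portion score: length of the matched portion.
            portion_score = match.end() - match.start()
            for seq_id in seq_ids:
                scores[seq_id] += portion_score
    return dict(scores)


# Hypothetical pattern set and query.
patterns = [("G.GK[ST]", {"seqA", "seqB"}), ("DE.D", {"seqA"})]
result = score_homologs("MAGSGKTLLDEADQ", patterns)
```

In this example `seqA` accumulates the scores of both matched portions, while `seqB` receives only the score of the first.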
Abstract:
An algorithm which detects tandem repeats (TR) is provided. In an illustrative embodiment, a set of repeating units contained in an input sequence is identified, wherein each given repeating unit satisfies at least the following conditions: (a) a first measure of similarity between adjacent repeating units in the set is greater than a first user-defined threshold, and (b) the given repeating unit includes at least one unit having a second measure of similarity with any other unit in the set that is greater than a second user-defined threshold. The method then provides for reporting positions in the input sequence that are covered by the set of repeating units.
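A much simplified scan in the spirit of condition (a) is sketched below. It assumes a fixed, known unit length and uses the fraction of identical positions as the similarity measure between adjacent units. Condition (b) is not implemented, and the function names and default threshold are hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length units agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def tandem_repeat_spans(seq: str, unit_len: int, threshold: float = 0.8) -> list[tuple[int, int]]:
    """Return half-open (start, end) spans covered by runs of similar adjacent units."""
    spans = []
    i = 0
    while i + 2 * unit_len <= len(seq):
        j = i
        # Extend the run while the next unit is similar enough to the current one.
        while (j + 2 * unit_len <= len(seq)
               and identity(seq[j:j + unit_len],
                            seq[j + unit_len:j + 2 * unit_len]) > threshold):
            j += unit_len
        if j > i:
            spans.append((i, j + unit_len))
            i = j + unit_len  # continue after the reported run
        else:
            i += 1
    return spans


spans = tandem_repeat_spans("GGCATCATCATTT", unit_len=3)
```

For the example input the scan reports the single span covering `CATCATCAT`.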
Abstract:
The method of the present invention discovers patterns in protein sequences in two phases. In a sampling phase, preferably proper templates corresponding to a group of protein sequences are generated. Patterns corresponding to the templates are then generated and stored in memory. In a convolution phase, the patterns stored in memory are combined to identify a set of maximal patterns.
Abstract:
The relationship between an amino acid sequence of a protein and its three-dimensional structure is at the very core of structural biology and bioinformatics. The occurrence and conservation of non-canonical conformations is a “local” phenomenon, i.e., non-canonical conformations are encoded intra-helically by short peptide sequences (heptapeptides at most). Effective descriptors can be formed for these short sequences employing training sets. Multiple, distinct patterns are created representing these sequences. A composite descriptor is formed by selecting from among the patterns discovered. The composite descriptor has a high level of sensitivity and specificity and, at the same time, a boosted signal-to-noise ratio.
Abstract:
A system and method for identifying genes that employs a pattern database, an input device for inputting a DNA sequence, and a processor for processing the DNA sequence and patterns to identify a putative gene. The processor may determine open reading frames (ORFs) in the DNA sequence, generate an amino acid translation for each ORF, and identify a match of a pattern in the amino acid translation.
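The processing steps described (find ORFs, translate each one, look for a pattern match) can be sketched as below. The sketch assumes the standard genetic code, scans only the three forward reading frames, treats an ORF as running from ATG to the next in-frame stop codon, and represents patterns as regular expressions. These choices and all names are illustrative assumptions rather than the system's actual design.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, str]]:
    """Return (start position, translation) for ATG...stop ORFs in the forward frames."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        codons = [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]
        start = None
        for idx, codon in enumerate(codons):
            if start is None and codon == "ATG":
                start = idx
            elif start is not None and CODON_TABLE.get(codon) == "*":
                # Unknown codons (e.g. containing N) are translated as 'X'.
                protein = "".join(CODON_TABLE.get(c, "X") for c in codons[start:idx])
                if len(protein) >= min_codons:
                    orfs.append((frame + 3 * start, protein))
                start = None
    return orfs


def putative_genes(dna: str, patterns: list[str]) -> list[tuple[int, str]]:
    """Keep ORFs whose translation matches at least one pattern."""
    return [(pos, protein) for pos, protein in find_orfs(dna)
            if any(re.search(p, protein) for p in patterns)]


hits = putative_genes("CCATGGGTAAAACCTAAGG", ["G.T"])
```

A fuller treatment would also scan the reverse-complement strand and handle ORFs that run off the end of the sequence.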
Abstract:
Techniques for linking non-coding and gene coding regions of a genome are provided. In one aspect, a method of determining associations between non-coding sequences and gene coding sequences in a genome of an organism comprises the following steps. At least one conserved region is identified from one or more non-coding sequences. Additional instances of the conserved region are located in the untranslated or amino acid coding regions of one or more genes in the organism under consideration, and the conserved region is associated with the one or more biological processes in which these one or more genes participate.