Process and apparatus for using the sets of pseudo random subsequences present in genomes for identification of species
    1.
    Invention application
    Status: Under examination (published)

    Publication No.: US20050255459A1

    Publication date: 2005-11-17

    Application No.: US10879061

    Filing date: 2004-06-30

    Abstract: Our research on the genome sequences of more than 250 species (viral, microbial, and multicellular organisms, including human) led to the discovery that the occurrence of a particular subsequence (a so-called “motif” or “n-mer”, where n is the length of the subsequence and can be up to 25 or more) in the genome of a particular species can be considered a nearly random event, and that the occurrences of a particular subsequence in the genomes of different species can be considered nearly independent events (except when extremely closely related species are compared). The set of subsequences that occur in a particular species' genome can therefore be used as a genomic “fingerprint” of that species. This discovery leads to the concept of using a set of pseudo-randomly designed subsequences for species identification or discrimination. These subsequences (probes, primers, motifs, n-mers) can be used with hybridization-based technologies (including, but not limited to, microarray or PCR technologies) and with any other technology that can detect the presence or absence of a particular subsequence in genomic DNA. The same approach can also be used to identify individuals of the same species (including humans), to estimate the genome size of unknown organisms, and to estimate the total genome size in samples containing several viral, microbial, and eukaryotic genomes. The identification methods currently in use for these purposes require sequencing the genomes of the species or individuals of interest. The proposed computational method eliminates this requirement and will greatly reduce the cost of these tests.
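
The presence/absence "fingerprint" idea in this abstract can be illustrated with a minimal Python sketch. It is not the applicants' implementation. The genomes, probe list, and function names (`nmer_set`, `fingerprint`, `identify`) are hypothetical toy values chosen for illustration. The sketch also assumes single-stranded exact matching and ignores reverse complements and hybridization chemistry.

```python
def nmer_set(genome, n):
    """Return the set of distinct n-mers that occur in a genome string."""
    genome = genome.upper()
    return {genome[i:i + n] for i in range(len(genome) - n + 1)}


def fingerprint(genome, probes):
    """Presence/absence of each probe in the genome, in probe order.

    Assumes all probes have the same length n.
    """
    present = nmer_set(genome, len(probes[0]))
    return tuple(p in present for p in probes)


def identify(observed, references):
    """Return the reference species whose fingerprint differs from the
    observed one in the fewest probe positions (Hamming distance)."""
    def distance(name):
        return sum(a != b for a, b in zip(observed, references[name]))
    return min(references, key=distance)


# Toy data (hypothetical, far shorter than any real genome).
genomes = {
    "species_a": "ACGTACGTGGA",
    "species_b": "TTGACCATGCA",
}
probes = ["ACG", "TTG", "GGA", "CAT"]

references = {name: fingerprint(seq, probes) for name, seq in genomes.items()}
observed = fingerprint("ACGTACGTGGA", probes)  # pretend this came from an assay
print(references)
print(identify(observed, references))
```

In this toy setup each species yields a distinct pattern over the four probes, so the nearest-fingerprint lookup recovers the species without sequencing the sample itself. A real probe set would need far more probes, and longer ones, for the near-independence assumption to give useful discrimination.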


    Compositions, processes and algorithms for microbial detection
    2.
    Invention application
    Status: Under examination (published)

    Publication No.: US20170039316A1

    Publication date: 2017-02-09

    Application No.: US10973113

    Filing date: 2004-10-25

    IPC classification: G06F19/20 G06F19/22 C12Q1/68

    Abstract: Processes for identifying whether any parasite or other organism is present in a host, comprising: (a) scanning for non-host signatures; (b) scanning for one-error-removed non-host signatures; and (c) scanning for N-error-removed non-host signatures, where N is selected to give the desired statistical certainty of the presence or absence of any parasite in the host. Algorithms useful for such detection, and listings of specific “signatures” (sequences or subsequences) for identifying specific microorganisms, are also provided.
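
The abstract does not define "N-error-removed" precisely. The Python sketch below adopts one plausible reading as an assumption: a sample k-mer counts as an N-error-removed non-host signature if it differs from every host k-mer by more than N substitutions. The function names, the brute-force comparison, and the toy sequences are all hypothetical and are not taken from the application.

```python
def kmers(seq, k):
    """Return the set of distinct k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def non_host_signatures(sample, host, k, n_errors=0):
    """Sample k-mers that are more than n_errors substitutions away from
    every host k-mer (assumed reading of 'N-error-removed').

    n_errors=0 gives plain non-host signatures. Brute force: fine for toy
    inputs, far too slow for real genomes.
    """
    host_kmers = kmers(host, k)
    return sorted(
        s for s in kmers(sample, k)
        if all(hamming(s, h) > n_errors for h in host_kmers)
    )


# Toy data (hypothetical).
host = "AAAAAAAA"
sample = "AAAATAAACCCC"

for n in (0, 1, 2):
    print(n, non_host_signatures(sample, host, k=4, n_errors=n))
```

Raising N shrinks the signature list to k-mers that cannot be explained as host sequence with a few errors, which matches the abstract's point that N trades sensitivity for statistical certainty.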


    METHOD AND APPARATUS FOR SEQUENCING DATA SAMPLES
    3.
    Invention application
    Status: Under examination (published)

    Publication No.: US20100049445A1

    Publication date: 2010-02-25

    Application No.: US12487496

    Filing date: 2009-06-18

    IPC classification: G06F19/00

    CPC classification: G16B30/00

    Abstract: A method for identifying a non-host nucleic acid sequence using sequence data. The method can include sequencing a sample into sequences, associating the sequences with a host genome, and then excluding any sequences that are associated with the host genome. The method can then associate the remaining sequences with any known genomes and exclude any sequences that are associated with a known genome. The sequences that remain can be used as seed sequences to assemble a non-host nucleic acid.
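
The two-stage subtraction described in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed method. "Associating" a read with a genome is reduced here to exact substring matching, whereas a real pipeline would use a read aligner. The function name `seed_reads` and all sequences are hypothetical.

```python
def seed_reads(reads, host_genome, known_genomes):
    """Filter reads in two passes and return candidate seed sequences.

    Pass 1 drops reads found in the host genome.
    Pass 2 drops reads found in any known (non-host) genome.
    Exact substring matching stands in for real alignment (assumption).
    """
    non_host = [r for r in reads if r not in host_genome]
    return [r for r in non_host
            if not any(r in g for g in known_genomes)]


# Toy data (hypothetical).
host_genome = "ACGTACGTTTGA"
known_genomes = ["GGGCCCAAAT"]
reads = ["ACGT", "GGCC", "TTAGC", "CGTT"]

print(seed_reads(reads, host_genome, known_genomes))
```

Only the read that matches neither the host nor a known genome survives both passes. Such survivors are what the abstract proposes to use as seeds when assembling the unknown non-host sequence.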
