Abstract:
The invention relates to an automated method for high-throughput DNA sequencing from high density DNA arrays by (a) initiating a first sequencing reaction on a first high density DNA array; and imaging said first high density DNA array using a detector, and (b) initiating a first sequencing reaction on a second high density DNA array; and imaging said second high density DNA array using the detector, wherein the first sequencing reaction in (a) is initiated before the first sequencing reaction in (b) is initiated such that the sequencing reactions in (a) and (b) are staggered. By using asynchronous sequencing reactions and imaging two separate arrays with one detector, imaging can be carried out on one array while sequencing reactions are carried out on the other, reducing the idle time of the imaging system.
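The effect of staggering on detector idle time can be illustrated with a small simulation. This is a minimal sketch, not the patented method: the function name `detector_idle_fraction`, the fixed reaction and imaging durations, and the choice of staggering each array by one imaging period are all assumptions made for illustration.

```python
def detector_idle_fraction(n_arrays, n_cycles, reaction_time, imaging_time):
    """Fraction of the run during which the single shared detector is idle.

    Each array alternates between a sequencing reaction (runs on the array
    itself, so arrays can react in parallel) and imaging (needs the detector).
    """
    # Assumption: array i starts its first reaction i * imaging_time after array 0.
    ready = [i * imaging_time + reaction_time for i in range(n_arrays)]
    remaining = [n_cycles] * n_arrays
    detector_free = 0.0  # time at which the detector finishes its current image
    busy = 0.0           # total time the detector has spent imaging

    while any(remaining):
        # Image whichever unfinished array becomes ready first.
        i = min((k for k in range(n_arrays) if remaining[k]), key=lambda k: ready[k])
        start = max(detector_free, ready[i])
        detector_free = start + imaging_time
        busy += imaging_time
        ready[i] = detector_free + reaction_time  # next reaction starts after imaging
        remaining[i] -= 1

    return 1.0 - busy / detector_free


single = detector_idle_fraction(1, 5, 10.0, 10.0)
staggered = detector_idle_fraction(2, 5, 10.0, 10.0)
print(f"one array: {single:.2f} idle, two staggered arrays: {staggered:.2f} idle")
```

With equal reaction and imaging times, a single array leaves the detector idle half the time, while two staggered arrays keep it busy except during the initial reaction.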
Abstract:
The invention provides methods and kits for ordering sequence information derived from one or more target polynucleotides. In one aspect, one or more tiers or levels of fragmentation and aliquoting are generated, after which sequence information is obtained from fragments in a final level or tier. Each fragment in such final tier is from a particular aliquot, which, in turn, is from a particular aliquot of a prior tier, and so on. For every fragment of an aliquot in the final tier, the aliquots from which it was derived at every prior tier are known, or can be discerned. Thus, identical sequences from overlapping fragments from different aliquots can be distinguished and grouped as being derived from the same or different fragments from prior tiers. When the fragments in the final tier are sequenced, overlapping sequence regions of fragments in different aliquots are used to register the fragments so that non-overlapping regions are ordered. In one aspect, this process is carried out in a hierarchical fashion until the one or more target polynucleotides are characterized, e.g. by their nucleic acid sequences, or by an ordering of sequence segments, or by an ordering of single nucleotide polymorphisms (SNPs), or the like.
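The tiered bookkeeping described above can be sketched as follows. This is an illustrative assumption about how aliquot provenance might be recorded, not the representation used in the invention: each final-tier fragment is modeled as a `(sequence, path)` pair, where `path` is a hypothetical tuple of aliquot indices, one per tier.

```python
from collections import defaultdict


def group_by_ancestor(fragments, tier):
    """Group final-tier fragment sequences by their aliquot ancestry up to `tier`.

    fragments: iterable of (sequence, path) pairs; path[t] is the aliquot index
               at tier t (hypothetical encoding, chosen for illustration).
    tier:      0-based tier index; fragments sharing path[:tier + 1] are grouped.
    """
    groups = defaultdict(list)
    for sequence, path in fragments:
        groups[path[: tier + 1]].append(sequence)
    return dict(groups)


fragments = [
    ("ACGT", (0, 3)),  # tier-1 aliquot 0, tier-2 aliquot 3
    ("ACGT", (1, 2)),  # identical sequence, but from tier-1 aliquot 1
    ("CGTA", (0, 5)),  # overlaps the first fragment, same tier-1 aliquot
]
print(group_by_ancestor(fragments, 0))
```

Here the two identical `ACGT` reads land in different groups because they trace back to different first-tier aliquots, while `ACGT` and `CGTA` from aliquot 0 are candidates for overlap-based ordering.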
Abstract:
The present invention is directed to logic for analysis of nucleic acid sequence data that employs algorithms that lead to a substantial improvement in sequence accuracy and that can be used to phase sequence variations, e.g., in connection with the use of the long fragment read (LFR) process.
Abstract:
Systems, methods, and apparatuses are provided for determining a sequence of a heteropolymer molecule. For example, all or part of a chromosome or a protein can be determined using sequence data from a plurality of heteropolymer fragments corresponding to the heteropolymer molecule. As one example, a position in the sequence read of a DNA fragment can be identified where a single base call is not clear. A multiplet base call can then be used, where the multiplet base call includes two or more bases at the position, along with a score for each base. The scores can be carried through mapping and assembly procedures, where the scores can be used to determine a final base call for the position in a chromosome of a genome of an organism. Other examples can be used for other monomer units besides bases.
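A multiplet base call and its use in a final call can be sketched as below. The abstract does not say how scores are combined during assembly, so summing per-base scores across overlapping reads is an assumption made here for illustration, as are the function name `consensus_call` and the dictionary representation.

```python
from collections import Counter


def consensus_call(multiplets):
    """Pick a final base for one genome position from several multiplet calls.

    multiplets: list of dicts mapping base -> score, one dict per read that
                covers the position. A clear single call is a dict with one key.
    Returns (best_base, totals), where totals holds the summed score per base.
    """
    totals = Counter()
    for call in multiplets:
        totals.update(call)  # Counter.update adds the scores of a mapping
    best_base = max(totals, key=totals.get)
    return best_base, totals


calls_at_position = [
    {"A": 0.6, "G": 0.4},  # ambiguous read: multiplet of two bases
    {"G": 0.9},            # clear single-base call
    {"A": 0.3, "G": 0.7},  # another ambiguous read
]
base, totals = consensus_call(calls_at_position)
print(base, dict(totals))
```

Keeping both candidate bases until this step means an ambiguous read still contributes evidence instead of being discarded or forced to a possibly wrong single call.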
Abstract:
Long fragment read techniques can be used to identify deletions and resolve base calls by utilizing shared labels (e.g., shared aliquots) of a read with any reads corresponding to heterozygous loci (hets) of a haplotype. For example, the linking of a locus to a haplotype of multiple hets can increase the reads available at the locus for determining a base call for a particular haplotype. For a hemizygous deletion, a region can be linked to one or more hets, and the labels for a particular haplotype can be used to identify which reads in the region correspond to which haplotype. In this manner, since the reads for a particular haplotype can be identified, a hemizygous deletion can be determined. Further, a phasing rate of pulses can be used to identify large deletions. A deletion can be identified when the phasing rate is sufficiently low, and other criteria can be used.
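The idea of linking reads to a haplotype through shared aliquot labels can be sketched as follows. This is a simplified illustration under stated assumptions: reads are `(base, aliquot)` pairs, each haplotype is summarized by the set of aliquots seen at its phased hets, and a read is assigned only when its aliquot matches exactly one haplotype. The names `assign_reads` and `hap_aliquots` are hypothetical.

```python
from collections import Counter


def assign_reads(reads, hap_aliquots):
    """Count bases per haplotype at one locus using shared aliquot labels.

    reads:        list of (base, aliquot_label) observed at the locus.
    hap_aliquots: dict mapping haplotype name -> set of aliquot labels seen
                  in reads covering that haplotype's heterozygous loci.
    Reads whose aliquot matches no haplotype, or more than one, are skipped.
    """
    by_hap = {hap: Counter() for hap in hap_aliquots}
    for base, aliquot in reads:
        matches = [hap for hap, labels in hap_aliquots.items() if aliquot in labels]
        if len(matches) == 1:
            by_hap[matches[0]][base] += 1
    return by_hap


hap_aliquots = {"hap1": {3, 17, 42}, "hap2": {5, 17, 88}}
reads = [("A", 3), ("A", 42), ("C", 5), ("C", 88), ("T", 17), ("G", 99)]
by_hap = assign_reads(reads, hap_aliquots)
print({hap: dict(counts) for hap, counts in by_hap.items()})
```

In this toy data, a region where one haplotype received no assigned reads while the other did would be a candidate for a hemizygous deletion; the thresholds for making that call are not specified in the abstract and are left out here.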
Abstract:
In a genome sequencing system and methodology, a protocol is provided to achieve precise alignment and accurate registration of an image of a planar array of nanoballs subject to optical analysis. Precise alignment correcting for fractional offsets is achieved by correcting for errors in subperiod x-y offset, scale and rotation by use of minimization techniques and Moiré averaging. In Moiré averaging, magnification is intentionally set so that the pixel period of the imaging element is a noninteger multiple of the site period. Accurate registration is achieved by providing for pre-defined pseudo-random sets of sites, herein deletion or reserved sites, where nanoballs are prevented from attachment to the substrate so that the sites of the array can be used in a pattern matching scheme as registration markers for absolute location identification. Information can be extracted with a high degree of confidence that it is correlated to a known location, while at the same time the amount of information that can be packed on a chip is maximized.
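The reason a noninteger ratio between the site period and the pixel period helps can be illustrated numerically. This is a one-dimensional sketch under assumptions: the site period is expressed in pixel units as an exact `Fraction`, and the function `subpixel_phases` is a hypothetical helper, not part of the described system. It shows only that sites sample many sub-pixel phases when the ratio is noninteger; it does not implement the minimization or registration steps.

```python
from fractions import Fraction


def subpixel_phases(site_period_in_pixels, n_sites):
    """Return the set of distinct sub-pixel positions of site centers.

    site_period_in_pixels: Fraction giving the site pitch in units of pixels.
    n_sites:               number of consecutive sites along one axis.
    The phase of a site is the fractional part of its position in pixels.
    """
    return {(i * site_period_in_pixels) % 1 for i in range(n_sites)}


integer_ratio = subpixel_phases(Fraction(2), 100)          # site pitch = 2 pixels
noninteger_ratio = subpixel_phases(Fraction(21, 10), 100)  # site pitch = 2.1 pixels
print(len(integer_ratio), len(noninteger_ratio))
```

With an integer ratio every site falls at the same phase relative to the pixel grid, so a fractional offset shifts all sites identically. With a noninteger ratio the sites cycle through many phases, so averaging across the array carries information about the sub-period offset.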
Abstract:
The present invention is directed to methods and compositions for acquiring nucleotide sequence information of target sequences. In particular, the present invention provides methods and compositions for improving the efficiency of sequencing reactions by using fewer labels to distinguish between nucleotides and by detecting nucleotides at multiple detection positions in a target sequence.
Abstract:
This disclosure provides a technology for users to gain first-hand knowledge and experience with interpreting whole genomes. The technology graphically depicts variations in genome sequences in an expandable display, and provides a platform whereby the user may find and research the biological significance of such variants. The technology also provides a unique collaborative environment designed to capture and improve the collective knowledge of the participating community.
Abstract:
The present invention is directed to methods and compositions for long fragment read sequencing. The present invention encompasses methods and compositions for preparing long fragments of genomic DNA, for processing genomic DNA for long fragment read sequencing methods, as well as software and algorithms for processing and analyzing sequence data.