Abstract:
Systems and methods for determining variants can receive mapped reads and determine a distribution of matched-filter residuals from a plurality of reads at a homopolymer region. The distribution of matched-filter residuals can be fit to uni-modal and bi-modal models. Based on the model that best fits the distribution of matched-filter residuals, the heterozygosity of the sample and the absence or presence of an insertion/deletion in the homopolymer can be determined.
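The model comparison described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the residuals are modeled as Gaussian, the bi-modal model is a two-component mixture with a shared variance fitted by EM, the models are compared by BIC, and the hypothetical `shift_threshold` decides whether a mode is far enough from zero to suggest an insertion/deletion.

```python
import math
from statistics import NormalDist

def _log_normal(x, mu, var):
    """Log density of a Gaussian N(mu, var) at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)

def _bic(loglik, n_params, n):
    """Bayesian information criterion; lower is better."""
    return n_params * math.log(n) - 2 * loglik

def fit_unimodal(xs):
    """Fit a single Gaussian (2 parameters: mean, variance)."""
    n = len(xs)
    mu = sum(xs) / n
    var = max(sum((x - mu) ** 2 for x in xs) / n, 1e-6)
    ll = sum(_log_normal(x, mu, var) for x in xs)
    return {"modes": [mu], "bic": _bic(ll, 2, n)}

def fit_bimodal(xs, iters=200):
    """Fit a two-component, shared-variance Gaussian mixture by EM (4 parameters)."""
    n = len(xs)
    s = sorted(xs)
    mu1, mu2 = s[n // 4], s[(3 * n) // 4]  # start from the quartiles
    mean = sum(xs) / n
    var = max(sum((x - mean) ** 2 for x in xs) / n, 1e-6)
    w = 0.5
    for _ in range(iters):
        # E-step: responsibility of component 1 for each residual
        resp = []
        for x in xs:
            d = (math.log(1 - w) + _log_normal(x, mu2, var)) - (
                math.log(w) + _log_normal(x, mu1, var))
            resp.append(0.0 if d > 700 else 1.0 / (1.0 + math.exp(d)))
        n1 = sum(resp)
        if n1 < 1e-9 or n - n1 < 1e-9:
            break  # one component emptied out; keep the previous estimates
        # M-step: update means, shared variance, and mixing weight
        mu1 = sum(r * x for r, x in zip(resp, xs)) / n1
        mu2 = sum((1 - r) * x for r, x in zip(resp, xs)) / (n - n1)
        var = max(sum(r * (x - mu1) ** 2 + (1 - r) * (x - mu2) ** 2
                      for r, x in zip(resp, xs)) / n, 1e-6)
        w = min(max(n1 / n, 1e-9), 1 - 1e-9)
    ll = 0.0
    for x in xs:
        a = math.log(w) + _log_normal(x, mu1, var)
        b = math.log(1 - w) + _log_normal(x, mu2, var)
        m = max(a, b)
        ll += m + math.log(math.exp(a - m) + math.exp(b - m))
    return {"modes": [mu1, mu2], "bic": _bic(ll, 4, n)}

def classify_homopolymer(residuals, shift_threshold=0.5):
    """Pick the better-fitting model and derive (assumed) zygosity and indel flags."""
    uni, bi = fit_unimodal(residuals), fit_bimodal(residuals)
    bimodal = bi["bic"] < uni["bic"]
    best = bi if bimodal else uni
    return {
        "model": "bimodal" if bimodal else "unimodal",
        "heterozygous": bimodal,
        "indel": any(abs(m) > shift_threshold for m in best["modes"]),
    }

# Synthetic residuals for illustration only (not real sequencing data).
hom_ref = [0.1 * NormalDist().inv_cdf((i + 0.5) / 40) for i in range(40)]
het_indel = hom_ref[::2] + [x + 1.0 for x in hom_ref[1::2]]
print(classify_homopolymer(hom_ref))
print(classify_homopolymer(het_indel))
```

The shared variance is a deliberate simplification: it keeps the mixture likelihood bounded, so the EM fit cannot collapse one component onto a single read.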
Abstract:
Disclosed are systems and methods for polynucleotide sequencing where detection and correction of base calling errors can be achieved without reliance on a reference sequence. In certain embodiments, redundant information can be introduced during measurement so as to allow such detection of errors. Such redundant information and measurements can be facilitated by encoding the nucleotide sequence being measured. Various examples of such encoding, redundancy introduction, and decoding are provided.
Abstract:
A method for nucleic acid sequencing includes: receiving a signal comprising measurements of a parameter measured in response to a plurality of nucleotide flows flowed in a space comprising a sample nucleic acid; normalizing the signal to obtain a normalized signal; adaptively normalizing the normalized signal to obtain an adaptively normalized signal; and predicting a sequence of base calls corresponding to the sample nucleic acid using the adaptively normalized signal.
Abstract:
A method for nucleic acid sequencing includes receiving observed or measured nucleic acid sequencing data from a sequencing instrument that receives and processes a sample nucleic acid in a termination sequencing-by-synthesis process. The method also includes generating a set of candidate sequences of bases for the observed or measured nucleic acid sequencing data by determining a predicted signal for candidate sequences using a simulation framework. The simulation framework incorporates an estimated carry forward rate (CFR), an estimated incomplete extension rate (IER), an estimated droop rate (DR), an estimated reactivated molecules rate (RMR), and an estimated termination failure rate (TFR), the RMR being greater than or equal to zero and the TFR being less than one. The method also includes identifying, from the set of candidate sequences of bases, one candidate sequence leading to optimization of a solver function as corresponding to the sequence for the sample nucleic acid.
Abstract:
A method for nucleic acid sequencing includes receiving a plurality of observed or measured signals indicative of a parameter observed or measured for a plurality of defined spaces; determining, for at least some of the defined spaces, whether the defined space comprises one or more sample nucleic acids; processing, for at least some of the defined spaces, the observed or measured signal to improve a quality of the observed or measured signal; generating, for at least some of the defined spaces, a set of candidate sequences of bases for the defined space using one or more metrics adapted to associate a score or penalty to the candidate sequences of bases; and selecting the candidate sequence leading to a highest score or a lowest penalty as corresponding to the correct sequence for the one or more sample nucleic acids in the defined space.
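The final selection step of this abstract, choosing the candidate with the lowest penalty, can be sketched as follows. The abstract does not specify the metric, so this example assumes a flow-based signal, an idealized prediction in which each flow's signal equals the homopolymer length incorporated in that flow, and a squared-error penalty. The function names and the flow order `"TACG"` are hypothetical.

```python
def ideal_flowgram(seq, flow_order, n_flows):
    """Predicted signal per flow under an idealized, error-free model (assumption)."""
    signal = []
    pos = 0
    for f in range(n_flows):
        nuc = flow_order[f % len(flow_order)]
        count = 0
        # Incorporate every consecutive base that matches the flowed nucleotide.
        while pos < len(seq) and seq[pos] == nuc:
            count += 1
            pos += 1
        signal.append(float(count))
    return signal

def penalty(observed, predicted):
    """Squared-error penalty between observed and predicted signals (assumed metric)."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))

def select_candidate(observed, candidates, flow_order="TACG"):
    """Return the candidate sequence whose predicted signal has the lowest penalty."""
    return min(
        candidates,
        key=lambda s: penalty(observed, ideal_flowgram(s, flow_order, len(observed))),
    )

# Illustrative values only: a noisy signal that is closest to the candidate "TTAC".
observed = [1.9, 1.1, 0.9, 0.1]
best = select_candidate(observed, ["TAC", "TTAC", "TTTAC"])
print(best)
```

A practical metric would also account for signal-quality effects; the idealized predictor here only shows the shape of the score-and-select loop.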
Abstract:
A method for nucleic acid sequencing includes (a) disposing a plurality of template polynucleotide strands in a plurality of defined spaces disposed on a sensor array, at least some of the template polynucleotide strands comprising a test or control sequence; (b) exposing a plurality of the template polynucleotide strands in the defined spaces to a series of flows of nucleotide species flowed according to a predetermined ordering; and (c) determining sequence information for a plurality of the template polynucleotide strands in the defined spaces based on the flows of nucleotide species to generate a plurality of sequencing reads corresponding to the template polynucleotide strands, wherein the test or control sequence comprises a sequence determined by identifying, using a variant caller, loci with systematic errors present in a plurality of sequencing runs included in a training set of sequencing runs.
Abstract:
Methods and systems for quantification of a target nucleic acid in a sample are provided. The method includes forming a plurality of discrete sample portions, each comprising a portion of the sample and a reaction mixture. The method further includes amplifying the plurality of discrete sample portions to form a plurality of discrete processed sample portions, at least one of which contains nucleic acid amplification reaction products. Fluorescence signals are detected from at least one of the plurality of discrete processed sample portions to determine a presence of the at least one target nucleic acid. The method also includes determining the respective volumes of the plurality of discrete processed sample portions and estimating the number of copies-per-unit-volume of the at least one target nucleic acid in the sample. The estimate is based on the number of discrete processed sample portions determined to contain the at least one target nucleic acid.
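One common way to turn positive and negative portions with known volumes into a copies-per-unit-volume estimate is a Poisson maximum-likelihood fit. The sketch below assumes that model (a portion of volume v is negative with probability exp(-c·v)); the abstract does not state which estimator is used, and the function name and input layout are hypothetical.

```python
import math

def estimate_concentration(portions, iters=200):
    """Poisson maximum-likelihood estimate of copies per unit volume.

    portions: iterable of (volume, is_positive) pairs, one per discrete portion.
    """
    portions = list(portions)
    pos = [v for v, hit in portions if hit]
    neg = [v for v, hit in portions if not hit]
    if not pos:
        return 0.0
    if not neg:
        raise ValueError("all portions are positive; concentration is not estimable")
    neg_volume = sum(neg)

    def score(c):
        # Derivative of the log-likelihood with respect to c; decreasing in c.
        total = -neg_volume
        for v in pos:
            x = c * v
            if x < 700:  # beyond this the term is effectively zero
                total += v / math.expm1(x)
        return total

    # Bracket the root, then bisect.
    lo, hi = 0.0, len(portions) / sum(v for v, _ in portions)
    while score(hi) > 0:
        hi *= 2
    for _ in range(iters):
        mid = (lo + hi) / 2
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Illustrative data: 100 equal portions of 0.001 volume units, 30 positive.
equal = [(0.001, i < 30) for i in range(100)]
print(estimate_concentration(equal))
```

With equal volumes v, k positives, and n portions, the estimate reduces to the familiar closed form -ln(1 - k/n) / v. The iterative solve only matters when the measured volumes differ.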
Abstract:
A method for designing test or control sequences may include identifying, using a variant caller, loci with systematic errors present in a plurality of sequencing runs included in a training set of sequencing runs obtained using sequencing-by-synthesis; and selecting a representative set of loci, including selecting from the identified loci an approximately equal number of loci involving errors in A, T, C, and G homopolymers and selecting from the identified loci an approximately equal number of loci involving homopolymers having a length of two, three, and four.
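The balancing step can be sketched as a stratified selection. This example assumes each identified locus is given as a hypothetical `(locus_id, base, homopolymer_length)` tuple, and it takes the same number of loci from every (base, length) cell, which is one simple way to get equal counts across A, T, C, and G and across lengths two, three, and four.

```python
from collections import defaultdict

BASES = ("A", "T", "C", "G")
LENGTHS = (2, 3, 4)

def select_representative_loci(loci, per_cell=None):
    """Select an equal number of loci from every (base, homopolymer length) cell.

    loci: iterable of (locus_id, base, hp_length) tuples (hypothetical layout).
    per_cell: optional cap on how many loci to take from each cell.
    """
    cells = defaultdict(list)
    for locus in loci:
        _, base, length = locus
        if base in BASES and length in LENGTHS:
            cells[(base, length)].append(locus)
    # The smallest cell limits how many loci every cell can contribute.
    k = min(len(cells[(b, n)]) for b in BASES for n in LENGTHS)
    if per_cell is not None:
        k = min(k, per_cell)
    selected = []
    for b in BASES:
        for n in LENGTHS:
            selected.extend(cells[(b, n)][:k])
    return selected

# Illustrative input: cell sizes vary from 2 to 13 loci.
demo_loci = []
size = 2
for b in BASES:
    for n in LENGTHS:
        demo_loci += [(f"{b}{n}_{i}", b, n) for i in range(size)]
        size += 1
chosen = select_representative_loci(demo_loci)
print(len(chosen))
```

Taking the minimum cell size gives exactly equal counts. Because the abstract only asks for approximately equal numbers, a looser rule, such as capping each cell without requiring every cell to be full, would also fit.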