Abstract:
Methods, systems, and kits are provided for sample identification and, more specifically, for designing, making, and/or using sample-discriminating codes or barcodes for identifying sample nucleic acids or other biomolecules or polymers. For example, a plurality of flowspace codewords may be generated, each codeword comprising a string of characters. A location for at least one padding character within the flowspace codewords may be determined. The padding character may be inserted into the flowspace codewords at the determined location. After the inserting, a plurality of the flowspace codewords may be selected based on satisfying a predetermined minimum distance criterion, wherein the selected codewords correspond to valid base space sequences according to a predetermined flow order. The barcode sequences corresponding to the selected codewords may then be manufactured.
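The codeword workflow described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the patented procedure: codewords are taken to be binary flow vectors (1 = one base incorporated in that flow), the distance criterion is Hamming distance, selection is a simple greedy pass, and `FLOW_ORDER`, `PAD_LOCATION`, and `MIN_DISTANCE` are hypothetical values chosen for the example. A codeword counts as valid here if converting it to bases and back to flows reproduces it.

```python
from itertools import product

FLOW_ORDER = "TACG"   # hypothetical flow order
PAD_LOCATION = 3      # hypothetical index at which the padding character goes
MIN_DISTANCE = 3      # hypothetical minimum Hamming distance

def flows_to_bases(codeword, flow_order=FLOW_ORDER):
    """Translate a flowspace codeword into its base space sequence."""
    return "".join(flow_order[i % len(flow_order)] * n
                   for i, n in enumerate(codeword))

def bases_to_flows(bases, n_flows, flow_order=FLOW_ORDER):
    """Encode bases into n_flows flows; None if the bases do not fit."""
    flows, pos = [], 0
    for i in range(n_flows):
        nuc = flow_order[i % len(flow_order)]
        count = 0
        while pos < len(bases) and bases[pos] == nuc:
            count += 1
            pos += 1
        flows.append(count)
    return tuple(flows) if pos == len(bases) else None

def is_valid(codeword):
    """A codeword is valid if the round trip flows -> bases -> flows is exact."""
    bases = flows_to_bases(codeword)
    return bases_to_flows(bases, len(codeword)) == tuple(codeword)

def insert_padding(codeword, location, pad=0):
    """Insert one padding character at the given index."""
    return codeword[:location] + (pad,) + codeword[location:]

def hamming(a, b):
    """Number of positions at which two equal-length codewords differ."""
    return sum(x != y for x, y in zip(a, b))

def select_codewords(candidates, min_distance):
    """Greedily keep codewords at least min_distance from all kept so far."""
    selected = []
    for cw in candidates:
        if all(hamming(cw, s) >= min_distance for s in selected):
            selected.append(cw)
    return selected

raw = product((0, 1), repeat=6)                        # generate codewords
padded = [insert_padding(cw, PAD_LOCATION) for cw in raw]
valid = [cw for cw in padded if any(cw) and is_valid(cw)]
barcodes = select_codewords(valid, MIN_DISTANCE)
barcode_sequences = [flows_to_bases(cw) for cw in barcodes]
```

The round-trip check rejects vectors such as `(1, 0, 0, 0, 1, 0, 0)`: under the assumed flow order both incorporations are T, so the second T would already have been incorporated in the first flow.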
Abstract:
A method for nucleic acid sequencing includes receiving observed or measured nucleic acid sequencing data from a sequencing instrument that receives and processes a sample nucleic acid in a termination sequencing-by-synthesis process. The method also includes generating a set of candidate sequences of bases for the observed or measured nucleic acid sequencing data by determining a predicted signal for candidate sequences using a simulation framework. The simulation framework incorporates an estimated carry forward rate (CFR), an estimated incomplete extension rate (IER), an estimated droop rate (DR), an estimated reactivated molecules rate (RMR), and an estimated termination failure rate (TFR), the RMR being greater than or equal to zero and the TFR being less than one. The method also includes identifying, from the set of candidate sequences of bases, one candidate sequence leading to optimization of a solver function as corresponding to the sequence for the sample nucleic acid.
Abstract:
A method for nucleic acid sequencing includes: receiving a signal comprising measurements of a parameter measured in response to a plurality of nucleotide flows flowed in a space comprising a sample nucleic acid; normalizing the signal to obtain a normalized signal; adaptively normalizing the normalized signal to obtain an adaptively normalized signal; and predicting a sequence of base calls corresponding to the sample nucleic acid using the adaptively normalized signal.
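The two-stage normalization described above might look like the following sketch. The abstract does not say how either stage is computed, so everything here is an assumption: the first stage scales the signal so that a set of known 1-mer key flows (`KEY_FLOWS`, hypothetical) averages 1.0, and the adaptive stage re-estimates the 1-mer level inside fixed windows to compensate for slow signal decay. The raw signal is synthetic.

```python
def key_normalize(signal, key_flows):
    """Scale the signal so the known 1-mer key flows average 1.0."""
    scale = sum(signal[i] for i in key_flows) / len(key_flows)
    return [s / scale for s in signal]

def adaptive_normalize(normalized, window=8):
    """Rescale each window by the mean of flows that look like 1-mers."""
    out = []
    for start in range(0, len(normalized), window):
        chunk = normalized[start:start + window]
        ones = [s for s in chunk if 0.5 <= s < 1.5]
        gain = sum(ones) / len(ones) if ones else 1.0
        out.extend(s / gain for s in chunk)
    return out

# Synthetic example: true homopolymer lengths with a 2% per-flow signal decay.
ideal = [1, 0, 1, 1, 0, 2, 1, 0, 1, 1, 0, 2, 0, 1, 1, 0]
raw = [100.0 * n * 0.98 ** i for i, n in enumerate(ideal)]
KEY_FLOWS = [0, 2, 3]  # hypothetical flows known to carry exactly one base

normalized = key_normalize(raw, KEY_FLOWS)
adapted = adaptive_normalize(normalized)
base_calls = [round(s) for s in adapted]  # per-flow homopolymer length calls
```

In this toy case the second stage pulls late-flow 1-mers back toward 1.0, which the single global scale from the key flows cannot do on its own.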
Abstract:
A method for nucleic acid sequencing includes receiving nucleic acid sequencing data from a sequencing instrument that receives and processes a sample nucleic acid in a sequencing-by-synthesis process. The method also includes generating a set of candidate sequences of bases for the observed or measured nucleic acid sequencing data by determining a predicted signal for candidate sequences using a simulation framework. The simulation framework incorporates an estimated carry forward rate (CFR), an estimated incomplete extension rate (IER), an estimated droop rate (DR), an estimated reactivated molecules rate (RMR), and an estimated termination failure rate (TFR), the RMR being greater than or equal to zero and the TFR being less than one. The method also includes identifying, from the set of candidate sequences of bases, a candidate sequence as corresponding to the sequence for the sample nucleic acid.
Abstract:
Systems and methods for determining variants can receive mapped reads and determine a distribution of matched-filter residuals from a plurality of reads at a homopolymer region. The distribution of matched-filter residuals can be fit to uni-modal and bi-modal models. Based on the model that best fits the distribution of matched-filter residuals, the heterozygosity of the sample and the absence or presence of an insertion/deletion in the homopolymer can be determined.
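The model comparison described above can be illustrated with a small sketch. The abstract does not name the models or the fit criterion, so this example makes several assumptions: the uni-modal model is a single Gaussian, the bi-modal model is two Gaussians with a shared variance fitted by trying every split of the sorted residuals (a hard-assignment simplification, not an EM fit), and the better model is the one with the lower Bayesian information criterion (BIC). The residual values are synthetic.

```python
import math

VAR_FLOOR = 1e-6  # keeps the variance positive for degenerate inputs

def mean_and_ss(xs):
    """Return the mean and the sum of squared deviations from it."""
    mu = sum(xs) / len(xs)
    return mu, sum((x - mu) ** 2 for x in xs)

def gaussian_loglik(n, ss, var):
    """Log-likelihood of n points with squared deviations ss and variance var."""
    return -0.5 * n * math.log(2 * math.pi * var) - ss / (2 * var)

def fit_unimodal(xs):
    """Single Gaussian: 2 free parameters (mean, variance)."""
    n = len(xs)
    mu, ss = mean_and_ss(xs)
    ll = gaussian_loglik(n, ss, max(ss / n, VAR_FLOOR))
    return 2 * math.log(n) - 2 * ll, (mu,)

def fit_bimodal(xs):
    """Two Gaussians, shared variance: 4 free parameters, best hard split."""
    s, n = sorted(xs), len(xs)
    best_ll, best_means = -math.inf, None
    for cut in range(2, n - 1):          # both groups keep at least 2 points
        mu1, ss1 = mean_and_ss(s[:cut])
        mu2, ss2 = mean_and_ss(s[cut:])
        ss = ss1 + ss2
        ll = (cut * math.log(cut / n) + (n - cut) * math.log((n - cut) / n)
              + gaussian_loglik(n, ss, max(ss / n, VAR_FLOOR)))
        if ll > best_ll:
            best_ll, best_means = ll, (mu1, mu2)
    return 4 * math.log(n) - 2 * best_ll, best_means

def classify(residuals):
    """Return ('unimodal' | 'bimodal', fitted means) using the lower BIC."""
    bic_uni, means_uni = fit_unimodal(residuals)
    bic_bi, means_bi = fit_bimodal(residuals)
    return ("bimodal", means_bi) if bic_bi < bic_uni else ("unimodal", means_uni)

# Synthetic residuals: one cluster near 0, or two clusters near 0 and 1.
offsets = [-0.10, -0.05, 0.0, 0.05, 0.10]
one_cluster = offsets * 8
two_clusters = offsets * 4 + [1.0 + d for d in offsets] * 4

label_one, means_one = classify(one_cluster)
label_two, means_two = classify(two_clusters)
```

How a label and its fitted means map onto a genotype call (for example, two modes roughly one homopolymer unit apart suggesting a heterozygous insertion/deletion) is left out here, since the abstract does not specify the decision rule.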
Abstract:
A method for sequencing a nucleic acid template includes: (a) performing a first sequencing process including flowing nucleotides and/or reagents to the nucleic acid template according to a first predetermined ordering of nucleotides and/or reagents to obtain a first sequencing result; (b) after the first sequencing process, performing a second sequencing process including flowing nucleotides and/or reagents to the nucleic acid template according to a second predetermined ordering of nucleotides and/or reagents to obtain a second sequencing result, the second predetermined ordering of nucleotides and/or reagents being different from the first predetermined ordering of nucleotides and/or reagents and at least one of the first and second predetermined orderings of nucleotides and/or reagents being designed for repeat sequencing; and (c) determining a sequence of bases corresponding to at least a portion of the nucleic acid template using both the first sequencing result and the second sequencing result.
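Steps (a) through (c) above can be sketched as follows. The abstract does not say how the two results are combined, so this example assumes one simple possibility: each run's flowgram is reduced to a list of (base, signal) homopolymer runs, the two lists are matched run by run, and the averaged signal is rounded to a length. The flow orders `ORDER_1` and `ORDER_2`, the template, and the noise offsets are all hypothetical.

```python
ORDER_1 = "TACG"  # hypothetical first flow ordering
ORDER_2 = "CTGA"  # hypothetical second, different flow ordering

def ideal_flowgram(template, flow_order, n_flows):
    """Noise-free homopolymer length seen in each flow."""
    flows, pos = [], 0
    for i in range(n_flows):
        nuc = flow_order[i % len(flow_order)]
        count = 0
        while pos < len(template) and template[pos] == nuc:
            count += 1
            pos += 1
        flows.append(float(count))
    return flows

def runs_from_flowgram(flowgram, flow_order, threshold=0.5):
    """Keep flows with an incorporation as (base, signal) pairs."""
    return [(flow_order[i % len(flow_order)], s)
            for i, s in enumerate(flowgram) if s >= threshold]

def call_single(runs):
    """Base calls from one run only: round each signal to a length."""
    return "".join(base * round(s) for base, s in runs)

def call_combined(runs_a, runs_b):
    """Base calls from both runs: average matching signals, then round."""
    if [b for b, _ in runs_a] != [b for b, _ in runs_b]:
        raise ValueError("runs disagree on the order of bases")
    return "".join(base * round((sa + sb) / 2)
                   for (base, sa), (_, sb) in zip(runs_a, runs_b))

template = "TTACGGGA"
# Synthetic measurement noise; the first run overestimates the GGG run.
noise_1 = [0.1, -0.1, 0.05, 0.6, 0.1, 0.0, 0.05, 0.0]
noise_2 = [0.05, -0.1, 0.1, 0.1, -0.05, 0.0, 0.1, -0.1, 0.0, 0.0, 0.0, 0.0]

flow_1 = [s + e for s, e in zip(ideal_flowgram(template, ORDER_1, 8), noise_1)]
flow_2 = [s + e for s, e in zip(ideal_flowgram(template, ORDER_2, 12), noise_2)]

runs_1 = runs_from_flowgram(flow_1, ORDER_1)
runs_2 = runs_from_flowgram(flow_2, ORDER_2)
first_only = call_single(runs_1)
combined = call_combined(runs_1, runs_2)
```

Because the two orderings place the same homopolymer in different flows, their errors need not coincide; in this toy case the first run alone miscalls the G run as four bases, while the combined call recovers the template.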
Abstract:
A method for nucleic acid sequencing includes receiving a plurality of observed or measured signals indicative of a parameter observed or measured for a plurality of defined spaces; determining, for at least some of the defined spaces, whether the defined space comprises one or more sample nucleic acids; processing, for at least some of the defined spaces, the observed or measured signal to improve a quality of the observed or measured signal; generating, for at least some of the defined spaces, a set of candidate sequences of bases for the defined space using one or more metrics adapted to associate a score or penalty to the candidate sequences of bases; and selecting the candidate sequence leading to a highest score or a lowest penalty as corresponding to the correct sequence for the one or more sample nucleic acids in the defined space.
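The final selection step above, picking the candidate with the lowest penalty, can be sketched as below. The metric is an assumption: each candidate's predicted signal is taken to be its noise-free flowgram under a hypothetical flow order `FLOW_ORDER`, and the penalty is the sum of squared differences from the observed signal. The observed values and the candidate list are synthetic, and the earlier steps (detecting occupied spaces, signal clean-up) are not shown.

```python
FLOW_ORDER = "TACG"  # hypothetical flow order

def predicted_signal(candidate, n_flows, flow_order=FLOW_ORDER):
    """Noise-free flowgram a candidate sequence would produce."""
    flows, pos = [], 0
    for i in range(n_flows):
        nuc = flow_order[i % len(flow_order)]
        count = 0
        while pos < len(candidate) and candidate[pos] == nuc:
            count += 1
            pos += 1
        flows.append(float(count))
    return flows

def penalty(candidate, observed):
    """Sum of squared residuals between predicted and observed signal."""
    predicted = predicted_signal(candidate, len(observed))
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

def select_candidate(candidates, observed):
    """Return the candidate sequence with the lowest penalty."""
    return min(candidates, key=lambda c: penalty(c, observed))

# Synthetic observed signal for one defined space.
observed = [1.1, 0.0, 0.9, 2.2, 0.1, 0.0, 1.0, 0.1]
candidates = ["TCGC", "TCGGC", "TCGGGC", "TAGGC"]
best = select_candidate(candidates, observed)
```

A score-based variant would negate the penalty and take the maximum; the choice between the two is only a matter of convention.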