Abstract:
A method for detecting a gene fusion includes amplifying a nucleic acid sample in the presence of a primer pool to produce a plurality of amplicons. The primer pool includes primers targeting a plurality of exon-exon junctions of a driver gene, and the amplicons correspond to those exon-exon junctions. The amplicons are sequenced and aligned to a reference sequence. The number of reads corresponding to each amplicon is normalized to give a normalized read count. A baseline correction is applied to the normalized read counts for the amplicons to form corrected read counts. A binary segmentation score is calculated for each corrected read count. A predicted breakpoint for the gene fusion is determined based on the amplicon index corresponding to the maximum absolute binary segmentation score. Gene fusion events may be detected in a partner-agnostic manner, i.e., without prior knowledge of the specific fusion partner genes or specific breakpoint information.
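The scoring steps in this abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption: normalization by total read count, baseline correction as a log2 ratio against a hypothetical per-amplicon baseline, and a standard two-segment mean-difference statistic as the binary segmentation score. This version scores each split between adjacent amplicons rather than each amplicon.

```python
import math

def normalize(read_counts):
    """Assumed normalization: each amplicon's share of all reads."""
    total = sum(read_counts)
    if total == 0:
        raise ValueError("no reads to normalize")
    return [c / total for c in read_counts]

def baseline_correct(normalized, baseline, eps=1e-9):
    """Assumed correction: log2 ratio against a per-amplicon baseline."""
    return [math.log2((x + eps) / (b + eps)) for x, b in zip(normalized, baseline)]

def binary_segmentation_scores(values):
    """Score each split point between adjacent amplicons.

    scores[k] compares amplicons 0..k with amplicons k+1..n-1 using a
    scaled difference of segment means (an assumed statistic).
    """
    n = len(values)
    total = sum(values)
    scores = []
    left = 0.0
    for i in range(1, n):
        left += values[i - 1]
        mean_left = left / i
        mean_right = (total - left) / (n - i)
        scores.append((mean_right - mean_left) / math.sqrt(1.0 / i + 1.0 / (n - i)))
    return scores

def predict_breakpoint(values):
    """Return (k, score) for the split with the largest absolute score."""
    scores = binary_segmentation_scores(values)
    best = max(range(len(scores)), key=lambda k: abs(scores[k]))
    return best, scores[best]

# Hypothetical data: coverage rises after the fourth amplicon (index 3).
counts = [100, 100, 100, 100, 400, 400, 400, 400]
baseline = [1.0 / len(counts)] * len(counts)
corrected = baseline_correct(normalize(counts), baseline)
index, score = predict_breakpoint(corrected)
```

In this sketch the returned index is the last amplicon before the shift in corrected read counts. A real assay would presumably derive the baseline from control samples rather than a uniform vector.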
Abstract:
A method for compressing nucleic acid sequence data is provided, in which each sequence read is associated with a molecular tag sequence and a portion of the sequence read alignments correspond to sequence reads mapped to a targeted fusion reference sequence. The method includes determining a consensus sequence read for each family of sequence reads based on flow space signal measurements corresponding to the family, and determining a consensus sequence alignment for each family, wherein a portion of the consensus sequence alignments correspond to consensus sequence reads aligned with the targeted fusion reference sequence. A compressed data structure comprising consensus compressed data is generated, the consensus compressed data including the consensus sequence read and the consensus sequence alignment for each family. A fusion is detected using the consensus sequence reads and the consensus sequence alignments from the compressed data structure.
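As a rough illustration of the consensus step, the sketch below groups reads into families by molecular tag and averages their flow space signals. The averaging rule, the rounding-based base call, the cyclic flow order "TACG", and all function names are assumptions for illustration; the abstract does not specify how the consensus is computed.

```python
from collections import defaultdict

def consensus_flow_signals(reads):
    """Average flow signals within each tag family.

    reads: iterable of (tag, signals) pairs, where signals is a list of
    per-flow measurements. Families are truncated to their shortest read.
    """
    families = defaultdict(list)
    for tag, signals in reads:
        families[tag].append(signals)
    consensus = {}
    for tag, members in families.items():
        n_flows = min(len(m) for m in members)
        consensus[tag] = [
            sum(m[i] for m in members) / len(members) for i in range(n_flows)
        ]
    return consensus

def call_bases(signals, flow_order="TACG"):
    """Round each flow signal to a homopolymer length and emit bases."""
    bases = []
    for i, s in enumerate(signals):
        run = max(0, int(s + 0.5))  # nearest non-negative integer
        bases.append(flow_order[i % len(flow_order)] * run)
    return "".join(bases)

# Hypothetical reads: two in family "AAC", one in family "GTT".
reads = [
    ("AAC", [1.1, 0.0, 2.2, 0.1]),
    ("AAC", [0.9, 0.1, 1.9, 0.0]),
    ("GTT", [0.0, 1.0, 0.1, 1.1]),
]
consensus = consensus_flow_signals(reads)
consensus_reads = {tag: call_bases(sig) for tag, sig in consensus.items()}
```

Storing one consensus read per family instead of every member read is what would give the compression in this sketch. Consensus alignment and fusion detection are not shown.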
Abstract:
Systems and methods for determining variants can receive mapped reads and align flow space information to a flow space representation of a corresponding portion of the reference. Reads spanning a position with a potential variant can be evaluated in a context-specific manner. A list of probable variants can be provided.
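To make "flow space representation" concrete, the sketch below converts a base-space reference segment into expected per-flow homopolymer lengths and compares them with measured signals. The cyclic flow order "TACG" and the function names are assumptions; the abstract does not describe the conversion or the alignment procedure.

```python
def to_flow_space(sequence, flow_order="TACG"):
    """Convert a base sequence to expected homopolymer counts per flow.

    Flows cycle through flow_order; each flow consumes the run of
    matching bases at the current position (possibly zero bases).
    """
    flows = []
    pos = 0
    k = 0
    # Each full cycle consumes at least one valid base, so this bound suffices.
    limit = len(flow_order) * len(sequence)
    while pos < len(sequence) and k < limit:
        nucleotide = flow_order[k % len(flow_order)]
        run = 0
        while pos < len(sequence) and sequence[pos] == nucleotide:
            run += 1
            pos += 1
        flows.append(run)
        k += 1
    if pos < len(sequence):
        raise ValueError("sequence contains a base not in the flow order")
    return flows

def flow_residuals(measured, expected):
    """Per-flow difference between measured signals and expected counts."""
    return [m - e for m, e in zip(measured, expected)]

# Hypothetical reference segment and measured read signals.
expected = to_flow_space("TTACGG")
residuals = flow_residuals([2.1, 0.9, 1.0, 2.8], expected)
```

A large residual at a single flow, such as the last flow here, is the kind of signal a variant caller might evaluate as a candidate homopolymer length change.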
Abstract:
Systems and methods for determining variants can receive mapped reads and determine a distribution of matched-filter residuals from a plurality of reads at a homopolymer region. The distribution of matched-filter residuals can be fit to uni-modal and bi-modal models. Based on the model that best fits the distribution of matched-filter residuals, the heterozygosity of the sample and the absence or presence of an insertion/deletion in the homopolymer can be determined.
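One way to picture the model comparison is sketched below: a single Gaussian and a two-component Gaussian mixture are fit to the residuals, and the model with the lower Bayesian information criterion (BIC) is chosen. The Gaussian form, the EM fitting, the BIC criterion, and the variance floor are all assumptions; the abstract does not name the models or the selection rule.

```python
import math

def gaussian_logpdf(x, mu, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)

def fit_unimodal(xs, min_var=1e-4):
    """Log-likelihood of a single Gaussian fit by maximum likelihood."""
    n = len(xs)
    mu = sum(xs) / n
    var = max(sum((x - mu) ** 2 for x in xs) / n, min_var)
    return sum(gaussian_logpdf(x, mu, var) for x in xs)

def fit_bimodal(xs, iters=100, min_var=1e-4):
    """Log-likelihood of a two-component Gaussian mixture fit by EM."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two residuals")
    s = sorted(xs)
    half = n // 2
    # Initialize the means from the lower and upper halves of the data.
    mu = [sum(s[:half]) / half, sum(s[half:]) / (n - half)]
    overall = sum(xs) / n
    v0 = max(sum((x - overall) ** 2 for x in xs) / n, min_var)
    var = [v0, v0]
    w = [0.5, 0.5]

    def component_logps(x):
        return [math.log(w[k]) + gaussian_logpdf(x, mu[k], var[k]) for k in range(2)]

    for _ in range(iters):
        # E-step: responsibility of each component for each residual.
        resp = []
        for x in xs:
            logp = component_logps(x)
            m = max(logp)
            p = [math.exp(l - m) for l in logp]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: update means, variances (floored), and weights.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-9)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = max(
                sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk, min_var
            )
            w[k] = nk / n

    loglik = 0.0
    for x in xs:
        logp = component_logps(x)
        m = max(logp)
        loglik += m + math.log(sum(math.exp(l - m) for l in logp))
    return loglik

def choose_model(xs):
    """Pick the model with the lower BIC (2 vs. 5 free parameters)."""
    n = len(xs)
    bic_uni = 2 * math.log(n) - 2 * fit_unimodal(xs)
    bic_bi = 5 * math.log(n) - 2 * fit_bimodal(xs)
    return "bimodal" if bic_bi < bic_uni else "unimodal"

# Hypothetical residuals: one cluster near 0 and one near 1.
low = [-0.05, -0.03, -0.02, -0.01, 0.0, 0.0, 0.01, 0.02, 0.03, 0.05]
high = [x + 1.0 for x in low]
model = choose_model(low + high)
```

In this sketch a bimodal best fit would suggest two read populations with different homopolymer lengths, consistent with a heterozygous insertion/deletion. Mapping the fitted modes to a specific genotype call is left out.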
Abstract:
The present disclosure provides compositions and methods, as well as combinations, kits, and systems that include the compositions and methods, for amplification, detection, characterization, assessment, profiling and/or measurement of nucleic acids in samples, particularly biological samples. Compositions and methods provided herein include combinations of microbial species target-specific nucleic acid primers for selective amplification and/or combinations of primers for amplification of nucleic acids from a large group of taxonomically related microorganisms. In one aspect, amplified nucleic acids obtained using the compositions and methods can be used in various processes, including nucleic acid sequencing, to detect the presence of microbial species and assess microbial populations in a variety of samples. In accordance with these teachings and principles, new methods, systems and non-transitory machine-readable storage media are provided to compress reference sequence databases used in mapping sequence reads for analysis and profiling of microbial populations.