Abstract:
Methods for determining the copy number of a genomic region at a detection position of a target sequence in a sample are disclosed. Genomic regions of a target sequence in a sample are sequenced and measurement data for sequence coverage is obtained. Sequence coverage bias is corrected and may be normalized against a baseline sample. Hidden Markov Model (HMM) segmentation, scoring, and output are performed, and in some embodiments population-based no-calling and identification of low-confidence regions may also be performed. A total copy number value and region-specific copy number value for a plurality of regions are then estimated.
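
The normalization and HMM segmentation steps can be illustrated with a minimal sketch. It is not the disclosed method: the total-depth scaling, the Gaussian emission model, and the names `normalize_coverage`, `viterbi_copy_number`, `sigma`, and `stay` are all assumptions chosen for illustration, and scoring and no-calling are left out.

```python
import math

def normalize_coverage(sample, baseline):
    """Scale the sample to the baseline's total depth and return per-bin ratios."""
    scale = sum(baseline) / sum(sample)
    return [s * scale / b for s, b in zip(sample, baseline)]

def viterbi_copy_number(ratios, states=(0, 1, 2, 3, 4), ploidy=2, sigma=0.15, stay=0.99):
    """Most likely copy-number state per bin under a toy Gaussian-emission HMM.

    Assumed model: a bin with copy number cn has expected ratio cn / ploidy,
    and the chain keeps its state with probability `stay`.
    """
    n = len(states)
    log_stay = math.log(stay)
    log_switch = math.log((1.0 - stay) / (n - 1))

    def log_emit(ratio, cn):
        # Gaussian log-likelihood up to a constant shared by all states.
        return -0.5 * ((ratio - cn / ploidy) / sigma) ** 2

    def step(i, j):
        return log_stay if i == j else log_switch

    score = [log_emit(ratios[0], cn) for cn in states]  # uniform start
    back = []
    for ratio in ratios[1:]:
        new_score, pointers = [], []
        for j, cn in enumerate(states):
            best_i = max(range(n), key=lambda i: score[i] + step(i, j))
            new_score.append(score[best_i] + step(best_i, j) + log_emit(ratio, cn))
            pointers.append(best_i)
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [states[i] for i in path]
```

With these assumed parameters, a run of five bins at ratio 1.5 is segmented as copy number 3, while a single outlying bin is absorbed into the surrounding copy-number-2 segment because one noisy bin does not outweigh the cost of two state switches.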
Abstract:
Compositions and methods for use in the therapeutic and preventative treatment, study, diagnosis and prognosis of PD-related disease are disclosed. Also provided are kits and reagents for prognosis and diagnosis of PD-related disease and related conditions.
Abstract:
The invention provides methods of analyzing genes for differential relative allelic expression patterns. Haplotype blocks throughout the genomes of individuals are analyzed to identify haplotype patterns that are associated with specific differential relative allelic expression patterns. Haplotype blocks that contain associated haplotype patterns may be further investigated to identify genes or variants of genes involved in differential relative allelic expression patterns.
Abstract:
Techniques, systems, and products for analyzing sparse indicators and generating communications based on bucketing of sparse indicators are disclosed.
Abstract:
Embodiments in the disclosure are directed to the use of distributed computing to align reads against multiple portions of a reference dataset. Aligned portions of the reference dataset that correspond with an above-threshold alignment score can be assessed for the presence of sparse indicators that can be categorized and used to influence a determination of a state transition likelihood. Various tasks associated with the processing of reads (e.g., alignment, sparse indicator detection, and/or determination of a state transition likelihood) may be able to take advantage of parallel processing and can be distributed among the machines while considering the resource utilization of those machines. Different load-balancing mechanisms can be employed in order to achieve even resource utilization across the machines, and in some cases may involve assessing various processing characteristics that reflect a predicted resource expenditure and/or time profile for each task to be processed by a machine.
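
One simple load-balancing mechanism of the kind described is a greedy "largest task first, least-loaded machine next" assignment. The sketch below assumes each task already has a single predicted cost. The function name `assign_tasks` and the task names in the usage note are hypothetical, and the abstract does not say which mechanism is actually used.

```python
import heapq

def assign_tasks(predicted_cost, machine_count):
    """Assign tasks to machines, largest predicted cost first.

    predicted_cost: dict mapping task name -> predicted resource expenditure.
    Returns a dict mapping machine index -> list of task names.
    """
    # Min-heap of (current load, machine index); the least-loaded machine pops first.
    heap = [(0.0, m) for m in range(machine_count)]
    heapq.heapify(heap)
    assignment = {m: [] for m in range(machine_count)}

    # Placing expensive tasks first tends to even out the final loads.
    ordered = sorted(predicted_cost.items(), key=lambda kv: (-kv[1], kv[0]))
    for task, cost in ordered:
        load, machine = heapq.heappop(heap)
        assignment[machine].append(task)
        heapq.heappush(heap, (load + cost, machine))
    return assignment
```

For example, costs of 8, 7, 4, 3, and 2 spread over two machines end up as loads of 13 and 11. A real scheduler would also account for the time profile of each task and for machines whose utilization changes while tasks run.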
Abstract:
The invention provides a collection of polymorphic sites associated with variations in human skin color, and genes containing or proximal to the sites.
Abstract:
Techniques are disclosed for accurately identifying duplications and deletions using depth vectors. A depth vector is generated for each of multiple clients based on a set of reads that is received and aligned to a reference data set. The depth vectors are transformed to produce multiple components. Each component is assigned an order based on the extent to which it accounts for cross-client differences in the depth vectors, and each component includes an intensity, multiple values, and multiple client weights. A subset of the components is identified based on the order. A sparse indicator and positional data for the sparse indicator can be determined from the components in the subset, and one or more clients can be identified as being associated with those components.
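
One transformation that yields ordered components with an intensity, per-position values, and per-client weights is a singular value decomposition of the mean-centered depth matrix. The abstract does not name the transformation, so the sketch below is an assumption: it uses power iteration with deflation in plain Python, and `top_components`, `call_from_component`, and the 0.5 relative threshold are hypothetical.

```python
import math

def top_components(depth, count=2, iterations=200, eps=1e-9):
    """Return components in decreasing order of intensity (singular value).

    depth: list of per-client depth vectors (rows = clients, columns = positions).
    """
    n_clients, n_pos = len(depth), len(depth[0])
    # Center each position on its cross-client mean so that components
    # describe differences between clients rather than shared depth.
    means = [sum(row[j] for row in depth) / n_clients for j in range(n_pos)]
    resid = [[row[j] - means[j] for j in range(n_pos)] for row in depth]
    components = []
    for _ in range(count):
        # Deterministic start: the residual row with the largest norm.
        values = list(max(resid, key=lambda r: sum(x * x for x in r)))
        norm = math.sqrt(sum(x * x for x in values))
        if norm < eps:
            break  # nothing left to explain
        values = [x / norm for x in values]
        for _ in range(iterations):
            weights = [sum(r[j] * values[j] for j in range(n_pos)) for r in resid]
            values = [sum(resid[i][j] * weights[i] for i in range(n_clients))
                      for j in range(n_pos)]
            norm = math.sqrt(sum(x * x for x in values))
            values = [x / norm for x in values]
        weights = [sum(r[j] * values[j] for j in range(n_pos)) for r in resid]
        intensity = math.sqrt(sum(w * w for w in weights))
        weights = [w / intensity for w in weights]
        components.append({"intensity": intensity, "values": values,
                           "client_weights": weights})
        # Deflate: remove this component before searching for the next one.
        resid = [[resid[i][j] - intensity * weights[i] * values[j]
                  for j in range(n_pos)] for i in range(n_clients)]
    return components

def call_from_component(component, rel_threshold=0.5):
    """Derive a candidate event, its positions, and its client from one component."""
    values, weights = component["values"], component["client_weights"]
    peak = max(abs(v) for v in values)
    positions = [j for j, v in enumerate(values) if abs(v) >= rel_threshold * peak]
    client = max(range(len(weights)), key=lambda i: abs(weights[i]))
    sign = weights[client] * sum(values[j] for j in positions)
    return {"client": client, "positions": positions,
            "kind": "duplication" if sign > 0 else "deletion"}
```

In this sketch a client whose depth rises over two adjacent positions dominates the first component, and the sign of its weight times the component values separates a gain from a loss.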
Abstract:
Techniques are disclosed for processing and aligning incomplete data. A stream of data that includes a plurality of reads is received from a data source. While the stream is being received, and before all of the plurality of reads have arrived, a set of reads is extracted from the plurality of reads. Each read in the set is aligned to a corresponding portion of a reference data set. For each particular position of a plurality of particular positions of the reference data set, a subset of the aligned reads is identified. A value of a client data set is generated based on the subset of reads. A variable is generated based on the client data set. Data is routed when a condition, based on the variable, is satisfied.
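
The idea of acting on a partial stream can be sketched as follows. Everything specific here is an assumption made for illustration: the toy mismatch-counting aligner, the consensus base as the client value, the mismatch count as the variable, and the names `align`, `process_stream`, and `min_depth`.

```python
from collections import Counter, defaultdict

def align(read, reference):
    """Toy aligner: the offset with the fewest mismatches (leftmost on ties)."""
    offsets = range(len(reference) - len(read) + 1)
    return min(offsets, key=lambda s: sum(a != b for a, b in zip(read, reference[s:])))

def process_stream(read_stream, reference, positions, min_depth=3):
    """Consume reads one at a time and stop as soon as the routing condition holds.

    Returns a summary dict when routing is triggered, or None if the stream
    ends without the condition being met.
    """
    pileup = defaultdict(Counter)  # watched position -> counts of observed bases
    for consumed, read in enumerate(read_stream, start=1):
        start = align(read, reference)
        for pos in positions:
            if start <= pos < start + len(read):
                pileup[pos][read[pos - start]] += 1
        # Client value: consensus base at each position with enough reads so far.
        client_values = {
            pos: pileup[pos].most_common(1)[0][0]
            for pos in positions
            if sum(pileup[pos].values()) >= min_depth
        }
        # Variable: number of watched positions that differ from the reference.
        mismatches = sum(base != reference[pos] for pos, base in client_values.items())
        if mismatches >= 1:  # assumed routing condition
            return {"reads_consumed": consumed,
                    "client_values": client_values,
                    "mismatches": mismatches}
    return None
```

Because the function returns from inside the loop, any reads still in the stream are never pulled from the source, which is the point of evaluating the condition before the data is complete.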
Abstract:
Techniques are disclosed for using functional testing to detect run-time impacts of code modifications. A method includes accessing a workflow that includes a plurality of stages for processing reads. The stages are defined based on modifiable code and include a first stage for aligning reads with a corresponding portion of a reference data set and a second stage for collectively analyzing data corresponding to the aligned reads. The method includes identifying functional testing specifications that correspond to the workflow, including a definition of which stages are to be performed during functional testing, a reduced reference data set, and a set of reads. The method further includes performing the functional testing using the reduced reference data set and the set of reads, detecting a result generated by that performance, and outputting the result.
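
A functional testing specification of this shape can be sketched as a small dictionary that a test runner applies to the workflow. The stage functions, the `WORKFLOW` mapping, and the spec keys (`stages`, `reduced_reference`, `reads`, `check_key`, `expected`) are hypothetical names chosen for illustration and do not come from the disclosure.

```python
def align_stage(ctx):
    """First stage: locate each read in the reference (toy exact matching)."""
    ref = ctx["reference"]
    ctx["alignments"] = [(read, ref.find(read)) for read in ctx["reads"]]
    return ctx

def analyze_stage(ctx):
    """Second stage: summarize all aligned reads together."""
    offsets = [offset for _, offset in ctx["alignments"]]
    ctx["summary"] = {"aligned": sum(o >= 0 for o in offsets),
                      "unaligned": sum(o < 0 for o in offsets)}
    return ctx

# Stage order is the insertion order of the dict.
WORKFLOW = {"align": align_stage, "analyze": analyze_stage}

def run_functional_test(workflow, spec):
    """Run only the stages named in the spec on the reduced inputs."""
    ctx = {"reference": spec["reduced_reference"], "reads": list(spec["reads"])}
    for name, stage in workflow.items():
        if name in spec["stages"]:
            ctx = stage(ctx)
    actual = ctx.get(spec["check_key"])
    return {"passed": actual == spec["expected"], "actual": actual}

SPEC = {
    "stages": ["align", "analyze"],
    "reduced_reference": "ACGTTGCA",  # small stand-in for the full reference
    "reads": ["ACGT", "TTGC", "GGGG"],
    "check_key": "summary",
    "expected": {"aligned": 2, "unaligned": 1},
}
```

Running `run_functional_test(WORKFLOW, SPEC)` before and after a code change gives a quick signal: if a modified alignment stage stops placing reads, the summary no longer matches the expected result and the test reports a failure.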
Abstract:
Methods for interpreting absolute copy number of complex tumors and for determining the copy number of a genomic region at a detection position of a target sequence in a sample are disclosed. In certain aspects, genomic regions of a target sequence in a sample are sequenced and measurement data for sequence coverage is obtained. Sequence coverage bias is corrected and may be normalized against a baseline sample. Hidden Markov Model (HMM) segmentation, scoring, and output are performed, and in some embodiments population-based no-calling and identification of low-confidence regions may also be performed. A total copy number value and region-specific copy number value for a plurality of regions are then estimated.