GENOTYPING VARIABLE NUMBER TANDEM REPEATS

    Publication No.: US20230019053A1

    Publication Date: 2023-01-19

    Application No.: US17839075

    Filing Date: 2022-06-13

    Applicant: Illumina, Inc.

    IPC Classification: G16B30/00 G16B20/10 G16B40/00

    Abstract: Disclosed herein are systems, devices, and methods for determining a variable number tandem repeat (VNTR) status. Haplotypes of a VNTR can be determined using long sequence reads of reference samples aligned to the VNTR in a reference. Short reads of a test sample of a test subject can be aligned to the haplotypes determined using the long sequence reads to determine a VNTR status of the test subject (e.g., one or more haplotypes or a genotype) based on the probability indications of the haplotypes.
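The abstract does not say how the probability indications are combined into a genotype. The sketch below shows one common way to do it, as an illustration only and not the method claimed in the application. It assumes each short read already has a likelihood under every candidate haplotype (for example, derived from alignment scores), assumes a diploid sample where either haplotype is equally likely to have produced a read, and picks the haplotype pair with the highest total log-likelihood. The function name `call_genotype` and the input layout are hypothetical.

```python
import math
from itertools import combinations_with_replacement


def call_genotype(read_likelihoods, haplotypes, floor=1e-12):
    """Pick the most likely diploid genotype (an unordered pair of haplotypes).

    read_likelihoods: list of dicts, one per read, mapping
        haplotype name -> P(read | haplotype). Hypothetical input layout.
    haplotypes: list of candidate haplotype names.
    floor: small value that keeps log() defined when a likelihood is zero.
    """
    scores = {}
    for h1, h2 in combinations_with_replacement(haplotypes, 2):
        log_lik = 0.0
        for lk in read_likelihoods:
            # Assumption: a read comes from either haplotype with probability 1/2.
            p = 0.5 * lk[h1] + 0.5 * lk[h2]
            log_lik += math.log(max(p, floor))
        scores[(h1, h2)] = log_lik
    best = max(scores, key=scores.get)
    return best, scores


# Three reads favor haplotype A and three favor haplotype B.
reads = (
    [{"A": 0.90, "B": 0.05, "C": 0.05}] * 3
    + [{"A": 0.05, "B": 0.90, "C": 0.05}] * 3
)
best, scores = call_genotype(reads, ["A", "B", "C"])
print(best)
```

With evenly split support, the heterozygous pair scores higher than either homozygous pair, because every read is explained reasonably well by one of the two haplotypes.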

    METHODS AND SYSTEMS FOR VISUALIZING SHORT READS IN REPETITIVE REGIONS OF THE GENOME

    Publication No.: US20220254442A1

    Publication Date: 2022-08-11

    Application No.: US17547297

    Filing Date: 2021-12-10

    Applicant: Illumina, Inc.

    Abstract: The disclosed embodiments concern methods, apparatus, systems and computer program products for genotyping and visualizing repeat sequences such as medically significant short tandem repeats (STRs). Some implementations can be used to genotype and visualize repeat sequences each including two or more repeat sub-sequences. Some implementations provide a computer tool that generates sequence read pileups for visualizing repeat sequences in samples that have different genotypes of the repeat sequence, each sequence pileup including reads aligned to two or more different haplotypes.
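To make the idea of a pileup over multiple haplotypes concrete, here is a minimal text-rendering sketch. It is not the tool described in the application. It assumes each read has already been assigned to a haplotype with a zero-based offset, and the function name `render_pileup` and the tuple layout are hypothetical.

```python
def render_pileup(haplotypes, alignments):
    """Render a plain-text pileup, one block per haplotype.

    haplotypes: dict mapping haplotype name -> haplotype sequence.
    alignments: list of (haplotype name, zero-based offset, read sequence)
        tuples. Hypothetical layout; reads are assumed to be pre-aligned.
    """
    lines = []
    for name, hap_seq in haplotypes.items():
        lines.append(f"{name}: {hap_seq}")
        # Indent reads so that offset 0 lines up with the first haplotype base.
        indent = " " * (len(name) + 2)
        for hap, offset, read in sorted(alignments, key=lambda a: a[1]):
            if hap == name:
                lines.append(indent + " " * offset + read)
    return "\n".join(lines)


haplotypes = {
    "hap1": "ACGTCAGCAGTTGA",      # two CAG units between the flanks
    "hap2": "ACGTCAGCAGCAGTTGA",   # three CAG units between the flanks
}
alignments = [
    ("hap1", 4, "CAGCAG"),
    ("hap2", 0, "ACGTCAG"),
    ("hap1", 0, "ACGT"),
]
print(render_pileup(haplotypes, alignments))
```

Each haplotype gets its own block, with reads stacked beneath it at their aligned positions, so differences in repeat length between haplotypes are visible at a glance.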

    SEQUENCE-GRAPH BASED TOOL FOR DETERMINING VARIATION IN SHORT TANDEM REPEAT REGIONS

    Publication No.: US20200286586A1

    Publication Date: 2020-09-10

    Application No.: US16811919

    Filing Date: 2020-03-06

    Applicant: Illumina, Inc.

    Abstract: The disclosed embodiments concern methods, apparatus, systems and computer program products for genotyping repeat sequences such as medically significant short tandem repeats (STRs). The methods involve aligning reads to a repeat sequence represented by a sequence graph, and using the aligned reads to genotype the repeat sequence. The sequence graph is a directed graph including at least one self-loop representing a repeat sub-sequence. In some implementations, the reads are paired-end reads, and both mates of each read pair may be used to genotype the repeat sequences. Some implementations can be used to determine degenerate codon repeats. Some implementations can be used to genotype repeat sequences each including two or more repeat sub-sequences. Some implementations can be used to genotype nucleic acid sequences each including at least one repeat sub-sequence and another genetic variant such as an insertion, deletion, or substitution.
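The sequence-graph representation can be illustrated with a toy example: a left flank, a repeat unit with a self-loop, and a right flank. The sketch below is a simplification and not the alignment method of the application. It assumes exact matching, non-empty node sequences, and a read that spans both flanks completely; real aligners tolerate mismatches and partial flank overlap. The names `SequenceGraph`, `build_str_graph`, and `spell_path` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SequenceGraph:
    nodes: dict  # node name -> nucleotide sequence (must be non-empty)
    edges: dict  # node name -> list of successor node names


def build_str_graph(left_flank, repeat_unit, right_flank):
    """Directed graph LF -> REP -> RF, where REP has a self-loop."""
    return SequenceGraph(
        nodes={"LF": left_flank, "REP": repeat_unit, "RF": right_flank},
        edges={
            "LF": ["REP", "RF"],    # the repeat may be absent entirely
            "REP": ["REP", "RF"],   # self-loop: the unit can repeat any number of times
            "RF": [],
        },
    )


def spell_path(graph, read, start="LF"):
    """Return the node path that spells the read exactly, or None.

    Depth-first search; it terminates because every step consumes at
    least one base of the read.
    """
    stack = [(start, 0, [])]
    while stack:
        node, pos, path = stack.pop()
        seq = graph.nodes[node]
        if not read.startswith(seq, pos):
            continue
        new_pos, new_path = pos + len(seq), path + [node]
        if new_pos == len(read) and not graph.edges[node]:
            return new_path
        for successor in graph.edges[node]:
            stack.append((successor, new_pos, new_path))
    return None


graph = build_str_graph("ACGT", "CAG", "TTGA")
read = "ACGT" + "CAG" * 5 + "TTGA"
path = spell_path(graph, read)
print(path.count("REP"))
```

The number of times the path traverses the self-loop node gives the repeat count supported by that read, which is the quantity a genotyper would aggregate across reads.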