-
Publication No.: US20220284986A1
Publication Date: 2022-09-08
Application No.: US17699439
Application Date: 2022-03-21
Applicant: Life Technologies Corporation
Inventor: Paolo Vatta, Onur Sakarya, Heinz Breu, Liviu Popescu, Asim Siddiqui, Fiona Hyland
Abstract: Identification of exon junctions includes obtaining a first read sequence based on a detected plurality of signals of a first sequence. A list of exon prefix and suffix sequences is generated by identifying exons of the human genome whose prefix sequence maps to a suffix sequence of the first read sequence, and by identifying exons whose suffix sequence maps to a prefix sequence of the first read sequence. A pair of exon sequences is selected, with the first exon sequence being one of the exon suffix sequences and the second exon sequence being one of the exon prefix sequences. A fusion junction is identified using the sum of three terms: the number of sequence elements of the first exon sequence that overlap the prefix of the first read sequence, the number of sequence elements of the second exon sequence that overlap the suffix of the first read sequence, and a constant.
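The candidate-list and pairing steps in this abstract can be sketched as follows. This is a minimal illustration, not the claimed method: it assumes exact string matching in place of real read mapping, compares the sum against the read length (the abstract does not state the comparison), and the names `overlap`, `candidate_lists`, `junction_pairs`, `min_overlap`, and the default `constant=0` are all hypothetical.

```python
from itertools import product


def overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def candidate_lists(read: str, exons: dict[str, str], min_overlap: int = 3):
    """Build the two candidate lists (exact matching is an assumption).

    suffix_hits: exons whose suffix matches a prefix of the read.
    prefix_hits: exons whose prefix matches a suffix of the read.
    Each maps exon name -> number of overlapping sequence elements.
    """
    suffix_hits: dict[str, int] = {}
    prefix_hits: dict[str, int] = {}
    for name, seq in exons.items():
        k = overlap(seq, read)  # exon suffix vs. read prefix
        if k >= min_overlap:
            suffix_hits[name] = k
        k = overlap(read, seq)  # read suffix vs. exon prefix
        if k >= min_overlap:
            prefix_hits[name] = k
    return suffix_hits, prefix_hits


def junction_pairs(read: str, exons: dict[str, str],
                   constant: int = 0, min_overlap: int = 3):
    """Return (first_exon, second_exon) pairs whose overlaps plus the
    constant add up to the read length (assumed comparison)."""
    suffix_hits, prefix_hits = candidate_lists(read, exons, min_overlap)
    return [
        (first, second)
        for (first, k1), (second, k2) in product(suffix_hits.items(),
                                                 prefix_hits.items())
        if first != second and k1 + k2 + constant == len(read)
    ]


# Toy data, invented for illustration only.
exons = {"E1": "GGGACGT", "E2": "TTGCAAA", "E3": "CCCCCCC"}
pairs = junction_pairs("ACGTTTGC", exons)
```

With this toy input, `E1` contributes four elements at the start of the read and `E2` four at the end, so the pair covers the eight-element read exactly.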
-
Publication No.: US20180276338A1
Publication Date: 2018-09-27
Application No.: US15928202
Application Date: 2018-03-22
Applicant: LIFE TECHNOLOGIES CORPORATION
Inventor: Paolo Vatta, Onur Sakarya, Heinz Breu, Liviu Popescu, Asim Siddiqui, Fiona Hyland
IPC: G06F19/22
CPC classification number: G16B30/00
Abstract: Systems and methods are used to identify an exon junction from a single read of a transcript. A transcript sample is interrogated and a read sequence is produced using a nucleic acid sequencer. A first exon sequence and a second exon sequence are obtained using the processor. The first exon sequence is mapped to a prefix of the read sequence using the processor. The second exon sequence is mapped to a suffix of the read sequence using the processor. The processor calculates the sum of three terms: the number of sequence elements of the first exon sequence that overlap the prefix of the read sequence, the number of sequence elements of the second exon sequence that overlap the suffix of the read sequence, and a constant. If the sum equals the length of the read sequence, a junction is identified in the read using the processor.
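The length test in this abstract can be illustrated for a single read and a single exon pair. This is a minimal sketch under stated assumptions: mapping is reduced to exact string comparison, the abstract does not say what the constant represents, so it is left as a free parameter defaulting to 0, and the function names are hypothetical.

```python
def prefix_overlap(first_exon: str, read: str) -> int:
    """Number of elements at the end of `first_exon` that match the start of `read`."""
    for k in range(min(len(first_exon), len(read)), 0, -1):
        if first_exon[-k:] == read[:k]:
            return k
    return 0


def suffix_overlap(second_exon: str, read: str) -> int:
    """Number of elements at the start of `second_exon` that match the end of `read`."""
    for k in range(min(len(second_exon), len(read)), 0, -1):
        if second_exon[:k] == read[-k:]:
            return k
    return 0


def has_junction(read: str, first_exon: str, second_exon: str,
                 constant: int = 0) -> bool:
    """True when both overlaps exist and overlap + overlap + constant
    equals the read length, as the abstract describes."""
    a = prefix_overlap(first_exon, read)
    b = suffix_overlap(second_exon, read)
    return a > 0 and b > 0 and a + b + constant == len(read)


# Toy sequences, invented for illustration only.
found = has_junction("ACGTTTGC", "GGGACGT", "TTGCAAA")
```

Here the read prefix `ACGT` matches the end of the first exon and the read suffix `TTGC` matches the start of the second, so 4 + 4 + 0 equals the read length of 8. Taking the longest exact overlap on each side is a simplification; when the bases around the junction are ambiguous, the two longest overlaps can together exceed the read length, and a fuller implementation would need to try shorter ones.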
-