- Patent title: METHOD OF DETECTING FUSED TRANSCRIPTS AND SYSTEM THEREOF
- Application number: US14369566
- Filing date: 2011-12-31
- Publication number: US20140323320A1
- Publication date: 2014-10-30
- Inventors: Wenlong Jia, Kunlong Qiu, Guangwu Guo, Minghui He, Jun Wang, Jian Wang, Huanming Yang
- Applicants: Wenlong Jia, Kunlong Qiu, Guangwu Guo, Minghui He, Jun Wang, Jian Wang, Huanming Yang
- Applicant address: CN Shenzhen
- Assignee: BGI TECH SOLUTIONS CO., LTD.
- Current assignee: BGI TECH SOLUTIONS CO., LTD.
- Current assignee address: CN Shenzhen
- International application: PCT/CN2011/085216 WO 20111231
- Main classification: G06F19/22
- IPC classification: G06F19/22
Abstract:
Provided is a method of detecting fusion transcripts in a sample to be analyzed. The method may comprise: subjecting the sample to be analyzed, which contains an RNA transcriptome, to paired-end sequencing, to obtain paired-end RNA-Seq data of the sample; aligning the paired-end RNA-Seq data to a human reference genome sequence, to obtain first paired-end mapped reads, first single-end mapped reads, and first unmapped reads; evaluating the insert size between the two ends of the paired-end mapped reads by means of the first paired-end mapped reads, to obtain a proportion of paired-end mapped reads with overlapping 3′-ends; aligning the first unmapped reads to annotated transcripts, to obtain second single-end mapped reads and second unmapped reads; aligning the second unmapped reads to the annotated transcripts, to filter out unmapped reads caused by indels and obtain third unmapped reads; merging all single-end mapped reads, to obtain a set of single-end mapped reads; obtaining gene pairs linked by a cross-read as a primary set of candidate gene pairs, based on the set of single-end mapped reads in combination with the relationships of the mapped paired-end reads; subjecting the primary set of candidate gene pairs to filtration, to obtain a candidate set of fused gene pairs; bisecting each third unmapped read, to obtain half-unmapped reads; aligning the half-unmapped reads to gene-junction sequences in the candidate set of fused gene pairs, to obtain a potential region of a fused junction site in the gene in which each half-unmapped read is located; outputting the original reads of the mapped half-unmapped reads, to obtain useful unmapped reads; subjecting the candidate set of fused gene pairs to a fusion simulation; aligning the useful unmapped reads to a junction library, to obtain fused genes supported by the useful unmapped reads; and calculating and gathering the fused sequences supported by the useful unmapped reads, to obtain information on the fused genes.
A system for detecting fusion transcripts is also provided.
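The read-bisection, fusion-simulation, and junction-alignment steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that a junction library is built by joining the 3′ end of each exon of one candidate gene to the 5′ start of each exon of its partner, and it substitutes exact substring matching for a real read aligner. The function names (`bisect_read`, `build_junction_library`, `find_supporting_junctions`), the `flank` and `min_anchor` parameters, and the toy sequences are all hypothetical.

```python
from itertools import product


def bisect_read(read):
    """Split an unmapped read into two halves (a 'half-unmapped read' pair)."""
    mid = len(read) // 2
    return read[:mid], read[mid:]


def build_junction_library(exons_a, exons_b, flank):
    """Simulate fusions between gene A and gene B (assumed scheme).

    For every exon pair, join the last `flank` bases of the gene A exon
    to the first `flank` bases of the gene B exon. Returns a dict mapping
    (exon_index_a, exon_index_b) to (junction_sequence, breakpoint_offset).
    """
    library = {}
    for (i, exon_a), (j, exon_b) in product(enumerate(exons_a), enumerate(exons_b)):
        left = exon_a[-flank:]
        right = exon_b[:flank]
        library[(i, j)] = (left + right, len(left))
    return library


def find_supporting_junctions(read, library, min_anchor):
    """Return keys of junctions that the read spans.

    Exact matching stands in for an aligner. A read supports a junction
    only if it covers at least `min_anchor` bases on each side of the
    breakpoint.
    """
    hits = []
    for key, (seq, breakpoint) in library.items():
        pos = seq.find(read)
        while pos != -1:
            left_overhang = breakpoint - pos
            right_overhang = pos + len(read) - breakpoint
            if left_overhang >= min_anchor and right_overhang >= min_anchor:
                hits.append(key)
                break
            pos = seq.find(read, pos + 1)
    return hits


if __name__ == "__main__":
    # Toy exon sequences for a hypothetical candidate gene pair.
    gene_a_exons = ["ACGTACGTAAGG", "TTGACCATGCAT"]
    gene_b_exons = ["GGCATTCAGTCA", "CCTAGGATCCGA"]
    unmapped_read = "TGCATGGCAT"

    print(bisect_read(unmapped_read))
    junctions = build_junction_library(gene_a_exons, gene_b_exons, flank=8)
    print(find_supporting_junctions(unmapped_read, junctions, min_anchor=4))
```

Requiring a minimum anchor on both sides of the breakpoint is a common design choice for junction-spanning reads, since a read that barely touches one side could equally come from an unfused transcript. Whether the patent applies such a threshold is not stated in the abstract.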