Publication No.: US10309968B2
Publication date: 2019-06-04
Application No.: US15599431
Filing date: 2017-05-18
Inventors: Ngoc Hieu Tran, Mohammad Ziaur Rahman, Lin He, Lei Xin, Baozhen Shan, Ming Li
Abstract: Methods and systems are provided for determining the amino acid sequence of a polypeptide or protein from mass spectrometry data using a weighted de Bruijn graph. Extracted and purified protein is cleaved into a mixture of peptides, which is then analyzed by mass spectrometry. A list of peptide sequences is derived from the mass spectrometry fragment data by de novo sequencing, and amino acid confidence scores are determined from fragment ion peak intensity. A weighted de Bruijn graph is constructed for the list of peptide sequences, with node weights defined by (k−1)-mer confidence scores. At least one contig is assembled from the de Bruijn graph by identifying the nodes with the highest (k−1)-mer confidence scores.
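
The graph construction and contig assembly described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names `build_weighted_graph` and `assemble_contig`, the input format (a peptide string paired with per-residue confidence scores), the rule that a (k−1)-mer's weight is the mean of its residue scores summed over all occurrences, and the greedy extension strategy are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict


def build_weighted_graph(peptides, k):
    """Build a weighted de Bruijn graph from de novo peptide sequences.

    peptides: list of (sequence, scores), where scores[i] is the confidence
    of residue i (hypothetical input format).
    Returns (node_weight, edges): nodes are (k-1)-mers, edges link the
    prefix and suffix (k-1)-mers of every k-mer.
    """
    node_weight = defaultdict(float)
    edges = defaultdict(set)
    for seq, scores in peptides:
        # Each (k-1)-mer is a node. Assumed weighting: mean residue
        # confidence, accumulated over every occurrence.
        for i in range(len(seq) - k + 2):
            node = seq[i:i + k - 1]
            node_weight[node] += sum(scores[i:i + k - 1]) / (k - 1)
        # Each k-mer contributes one directed edge: prefix -> suffix.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            edges[kmer[:-1]].add(kmer[1:])
    return node_weight, edges


def assemble_contig(node_weight, edges):
    """Greedily assemble one contig by following the heaviest nodes.

    Assumed strategy: seed at the highest-weight node, then extend right
    and left, always choosing the unvisited neighbour with the highest weight.
    """
    if not node_weight:
        return ""
    # Reverse adjacency, needed for leftward extension.
    preds = defaultdict(set)
    for u, successors in edges.items():
        for v in successors:
            preds[v].add(u)

    seed = max(node_weight, key=node_weight.get)
    visited = {seed}
    contig = seed

    node = seed
    while True:  # extend to the right
        candidates = [v for v in edges.get(node, ()) if v not in visited]
        if not candidates:
            break
        node = max(candidates, key=node_weight.get)
        visited.add(node)
        contig += node[-1]

    node = seed
    while True:  # extend to the left
        candidates = [u for u in preds.get(node, ()) if u not in visited]
        if not candidates:
            break
        node = max(candidates, key=node_weight.get)
        visited.add(node)
        contig = node[0] + contig

    return contig


# Toy data: two overlapping high-confidence peptides and one low-confidence
# peptide that introduces a competing branch after "DE".
peptides = [
    ("ACDEF", [0.9] * 5),
    ("DEFGH", [0.8] * 5),
    ("CDEK", [0.2] * 4),
]
node_weight, edges = build_weighted_graph(peptides, k=3)
contig = assemble_contig(node_weight, edges)
print(contig)
```

In this toy input the node "DE" has two successors, "EF" and "EK". Because "EF" accumulates a much higher weight than "EK", the greedy walk follows the well-supported branch, which is the role the confidence-weighted nodes play in the abstract.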