Abstract:
A sample is fragmented separately with each of multiple enzymes that cleave at different sites, and the resulting peptide-fragment mixtures are subjected to mass spectrometry. De novo sequencing is performed on the obtained results to deduce partial sequence candidates for the various fragments (S1 and S2). Because a specific amino acid residue should appear at the cleavage site for a given enzyme, partial sequence candidates that include a terminal of the original amino acid sequence can be extracted from the many candidates (S6). Non-terminal partial sequence candidates that share an overlapping portion are then searched for and combined, and this task is repeated (S7 and S8). The sequence candidates that include a terminal are subsequently connected to the ends of the sequence obtained through the repeated task (S9). The amino acid sequence eventually obtained is highly likely to be the correct solution (S10 and S11).
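The extraction and overlap steps described above can be illustrated with a minimal Python sketch. It is not the claimed method: the enzyme table, the function names, the `min_overlap` parameter, and the toy fragments are all hypothetical. It assumes each enzyme cuts immediately after the listed residues (Glu-C is simplified to cutting after E only) and that the de novo candidates are error-free strings. For brevity it grows the chain in one direction from a seed fragment rather than attaching the terminal candidates last.

```python
# Hypothetical cleavage rules: residues after which each enzyme is assumed to cut.
CLEAVAGE_RESIDUES = {"trypsin": set("KR"), "glu_c": set("E")}


def is_c_terminal_candidate(fragment, enzyme):
    """A fragment that does not end with the enzyme's cleavage residue
    must include the C-terminal of the original sequence."""
    return fragment[-1] not in CLEAVAGE_RESIDUES[enzyme]


def merge_on_overlap(left, right, min_overlap=2):
    """Join two fragments on their longest suffix/prefix overlap, else None."""
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    return None


def extend_rightwards(seed, fragments, min_overlap=2):
    """Repeatedly append the first unused fragment that overlaps the growing end."""
    sequence, unused = seed, list(fragments)
    progress = True
    while progress:
        progress = False
        for fragment in unused:
            merged = merge_on_overlap(sequence, fragment, min_overlap)
            # Skip fragments that overlap but add no new residues.
            if merged is not None and len(merged) > len(sequence):
                sequence = merged
                unused.remove(fragment)
                progress = True
                break
    return sequence


# Toy digests of one made-up peptide (illustrative data only).
tryptic = ["SAGELFK", "TVEDNYR", "LIEGWK", "AH"]
glu_c = ["SAGE", "LFKTVE", "DNYRLIE", "GWKAH"]
assembled = extend_rightwards("SAGELFK", tryptic[1:] + glu_c[1:])
```

The greedy choice of the first overlapping fragment is a simplification. With real data several candidates may overlap the same end, so alternative chains would need to be tracked and scored.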
Abstract:
The amino acid sequence is deduced by de novo sequencing in a manner that prevents the correct amino acid sequence from failing to rank high among the candidates. Amino acid sequence candidates are computed by finding the longest path with a branch-and-bound method, based on the spectrum data of the target peptide and the known amino acid sequence. A tree-structured directed graph is used in which amino acid sequences are set as nodes and the peak intensities corresponding to the amino acids are set as branches. In the sequence at a node in the highest layer, an amino acid is placed at a terminal; as the layers go deeper, amino acids are placed sequentially from both terminals toward the center of the sequence. The final score is estimated from the remaining amino acids, and if the estimated score is small, the search along that branch is halted.
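The search described above can be sketched as a small branch-and-bound routine in Python. This is an illustration under strong assumptions, not the claimed method: the function `best_sequence`, the `composition` argument (the residues to place, assumed known), and the `intensity` table keyed by `(position, residue)` are hypothetical stand-ins for real spectrum scoring. Positions are filled alternately from the two terminals toward the center, and a branch is abandoned when an optimistic estimate of its final score cannot beat the best complete sequence found so far.

```python
from collections import Counter


def best_sequence(composition, intensity):
    """Return (sequence, score) maximizing the summed peak intensities.

    composition: string of residues to place (assumed known).
    intensity: dict mapping (position, residue) to a peak intensity;
               missing entries count as 0 (hypothetical scoring table).
    """
    n = len(composition)
    # Fill order alternates between the terminals: 0, n-1, 1, n-2, ...
    order, lo, hi = [], 0, n - 1
    while lo <= hi:
        order.append(lo)
        if lo != hi:
            order.append(hi)
        lo, hi = lo + 1, hi - 1

    best = {"score": float("-inf"), "sequence": None}
    placed = [None] * n
    remaining = Counter(composition)

    def upper_bound(depth):
        # Optimistic estimate: every open position takes its best remaining
        # residue type, ignoring that residues cannot be reused.
        return sum(
            max(intensity.get((pos, r), 0.0) for r in remaining)
            for pos in order[depth:]
        )

    def search(depth, score):
        if depth == n:
            if score > best["score"]:
                best["score"], best["sequence"] = score, "".join(placed)
            return
        if score + upper_bound(depth) <= best["score"]:
            return  # bound: this branch cannot beat the current best
        pos = order[depth]
        for residue in sorted(remaining):
            placed[pos] = residue
            remaining[residue] -= 1
            if remaining[residue] == 0:
                del remaining[residue]
            search(depth + 1, score + intensity.get((pos, residue), 0.0))
            remaining[residue] += 1  # undo the placement
            placed[pos] = None

    search(0, 0.0)
    return best["sequence"], best["score"]


# Toy table (illustrative values only).
toy_intensity = {(0, "G"): 5, (0, "A"): 4, (1, "A"): 3,
                 (1, "G"): 1, (2, "K"): 6, (2, "A"): 2}
sequence, score = best_sequence("GAK", toy_intensity)
```

The bound here is deliberately loose so that it never underestimates a branch. A tighter estimate prunes more of the tree but must remain optimistic, otherwise the best sequence could be discarded.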