Abstract:
Even when a mass spectrometry apparatus using, for example, a MALDI ion source yields only mass spectra whose peak intensities are poorly reproducible, shifts in retention time (RT) across a plurality of specimens are corrected with good precision using total ion chromatograms (TICs). For each mass spectrum, variable scaling is executed that combines a first scaling, which equalizes the extent of variation in signal intensity values within one mass spectrum across different mass spectra, and a second scaling, which applies a weighting according to the relative variation in signal intensity values of each mass spectrum (S3). The scaled signal intensity values are summed to obtain a total signal intensity value for one measurement time point (S4). A TIC is created from the plurality of total signal intensity values thus obtained (S6). RT alignment is executed using these TICs (S8). As a result, the similarity between TIC waveforms increases, and RT alignment can be performed suitably.
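Steps S3 to S6 can be illustrated with a minimal sketch. The abstract does not give the scaling formulas, so the choices here are assumptions: the first scaling divides each intensity by the standard deviation of its spectrum, and the second multiplies by the mean-to-standard-deviation ratio of that spectrum. The function names `scale_spectrum` and `build_tic` are hypothetical.

```python
from statistics import mean, pstdev


def scale_spectrum(intensities):
    """Apply an ASSUMED variable scaling to one mass spectrum.

    First scaling (assumption): divide by the spectrum's standard deviation,
    so every spectrum has the same extent of variation.
    Second scaling (assumption): weight by mean / standard deviation,
    a measure of the spectrum's relative variation.
    """
    mu = mean(intensities)
    sd = pstdev(intensities)  # population standard deviation
    if sd == 0:
        raise ValueError("flat spectrum: variation is zero, scaling undefined")
    weight = mu / sd
    return [(x / sd) * weight for x in intensities]


def build_tic(spectra):
    """Sum the scaled intensities of each spectrum (S4) into a TIC (S6).

    `spectra` holds one intensity list per measurement time point.
    """
    return [sum(scale_spectrum(s)) for s in spectra]


# Two spectra with the same shape but a tenfold intensity difference.
tic = build_tic([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])
```

Under these assumed formulas the two spectra yield the same total signal intensity value, which shows how scaling can remove an overall intensity difference between measurements before RT alignment.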
Abstract:
To improve the reliability of mutual diagnosis in a cancer determination by machine learning, m/z values of ions originating from tumor markers or similar substances used in other related tests are stored in a particular m/z-value database. A spectrum information filtering section deletes the signal intensities at the m/z values stored in the particular m/z-value database from a large number of mass spectra classified by the presence or absence of cancer. Using the data that remain after the deletion as training data, a training processor obtains training-result information and stores it in a training-result database. A judgment processor similarly deletes the signal intensities at the stored m/z values from mass spectrum data obtained for a target sample to be judged. Then, based on the training-result information stored in the training-result database, the judgment processor determines whether the target sample should be classified into a cancerous group or a non-cancerous group.
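The filtering step applied to both the training spectra and the target spectrum can be sketched as follows. The data layout (a list of `(m/z, intensity)` pairs), the function name `filter_spectrum`, and the matching tolerance `tol` are all assumptions made for illustration; the abstract does not say how an m/z value is matched against the database.

```python
def filter_spectrum(spectrum, excluded_mz, tol=0.5):
    """Drop peaks whose m/z lies within `tol` of any excluded m/z value.

    spectrum    -- list of (mz, intensity) pairs
    excluded_mz -- m/z values taken from the (hypothetical) database
    tol         -- assumed matching tolerance in m/z units
    """
    return [
        (mz, intensity)
        for mz, intensity in spectrum
        if all(abs(mz - ex) > tol for ex in excluded_mz)
    ]


# The same filter would be applied to training data and to the target sample.
marker_mz = [250.0]
spectrum = [(100.0, 5.0), (250.2, 9.0), (300.0, 1.0)]
filtered = filter_spectrum(spectrum, marker_mz)
```

Applying the identical filter on both sides keeps the classifier from seeing, at training or judgment time, the signals already used by the other tests.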
Abstract:
MS1 and MS2 measurements of fractionated samples are performed. Based on the identification results and the S/N ratios of the MS1 peaks, an identification probability estimation model is created that shows the relationship between the cumulative number of MS1 peaks and the number of MS1 peaks successfully identified when the MS2 measurements and identifications are performed in ascending order of S/N ratio. The S/N ratios of the MS1 peaks obtained by MS1 measurements are determined, and the identification probabilities of substances in a target sample are estimated from these S/N ratios using the aforementioned model. The optimization of precursor-ion selection and data-accumulation number is defined as the problem of maximizing the sum of the identification probabilities of the MS1 peaks selected for MS2 measurement, and is formulated as an objective function using 0-1 variables. This function is solved as a 0-1 integer programming problem under preset conditions. The optimal precursor ions and data-accumulation numbers are determined from the variables of the solution.
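A toy version of the 0-1 formulation can make the optimization concrete. In this sketch each peak is assigned an accumulation number k (0 meaning "not selected"), which corresponds to one 0-1 variable per (peak, k) pair with at most one variable set per peak. The probability table `prob`, the single budget constraint, and the function name `optimize` are assumptions; the abstract does not list the preset conditions, and exhaustive search stands in here for a real integer programming solver.

```python
from itertools import product


def optimize(prob, budget):
    """Maximize the summed identification probability by exhaustive search.

    prob[i][k-1] -- ASSUMED identification probability of MS1 peak i when
                    it is measured with k data accumulations
    budget       -- ASSUMED limit on the total number of accumulations
    Returns (best_value, choice), where choice[i] is the accumulation
    number for peak i and 0 means the peak is not selected.
    """
    options = [range(len(p) + 1) for p in prob]
    best_value, best_choice = float("-inf"), None
    for choice in product(*options):
        if sum(choice) > budget:
            continue  # violates the accumulation budget
        value = sum(prob[i][k - 1] for i, k in enumerate(choice) if k > 0)
        if value > best_value:
            best_value, best_choice = value, choice
    return best_value, best_choice


prob = [[0.2, 0.5], [0.6, 0.7], [0.1, 0.15]]
value, choice = optimize(prob, budget=3)
```

With this table and a budget of three accumulations, the search spends two accumulations on the first peak and one on the second, and leaves the third peak unmeasured. Exhaustive search is only practical for a handful of peaks, which is why a dedicated solver would be used on real data.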
Abstract:
When imaging mass analysis is conducted for a region to be measured on a sample, an individual reference value calculating part obtains the maximum value of Pi/Ii over the respective measuring points and stores this value together with the measured data as an individual reference value. When comparison analysis is performed on a plurality of data sets obtained from different samples, a common reference value determining part reads out the corresponding plurality of individual reference values and determines their minimum value as a common reference value Fmin. A normalization calculation processing part normalizes the respective intensity values by multiplying the intensity values read out from an external memory device by a normalization coefficient long_Max×(Fmin/Pi), which is obtained from the common reference value Fmin, the TIC values Pi at the respective measuring points, and a maximum allowable value long_Max of the variable that stores the intensity values during the operation.
Abstract:
This analytical data analysis method uses machine learning of analysis result data (31) measured by an analyzer (1), and includes generating simulated data (32) in which a data variation has been added to the analysis result data (31) within a range that does not affect identification, performing the machine learning using the generated simulated data (32), and performing discrimination using a discrimination criterion (23b) obtained through the machine learning.
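The data-generation step can be sketched as below. The abstract does not say what kind of variation is added or how its range is chosen, so this example assumes a bounded relative perturbation of each intensity; `simulate`, `rel_range`, and the fixed seed are hypothetical names and values.

```python
import random


def simulate(spectrum, n_copies, rel_range=0.02, seed=0):
    """Generate simulated spectra by adding a small bounded variation.

    Each intensity is multiplied by (1 + d), with d drawn uniformly from
    [-rel_range, +rel_range]. The bound is an ASSUMED stand-in for
    "a range that does not affect identification".
    """
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    return [
        [x * (1.0 + rng.uniform(-rel_range, rel_range)) for x in spectrum]
        for _ in range(n_copies)
    ]


measured = [120.0, 35.0, 980.0, 4.0]
simulated = simulate(measured, n_copies=5)
# The measured spectrum and its simulated copies share one class label and
# would be passed together to the machine learning step.
training_set = [measured] + simulated
```

A uniform perturbation is used instead of Gaussian noise so that every simulated value stays strictly inside the stated range.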
Abstract:
If the spatial measurement point intervals in the imaging mass analysis data of two samples to be compared are different and the degrees of spatial spreading of substance distributions are to be compared, one of the data sets is defined as a reference, the measurement point intervals in the other data set are redefined so as to be equal to those of the reference, and a mass spectrum at each virtual measurement point set as a result of the redefinition is obtained through interpolation or extrapolation based on the mass spectra at actual measurement points. If the arrays of m/z values of the mass spectra differ between samples, the m/z value positions of the mass spectrum in one of the data sets are defined as a reference, and the intensity values corresponding to the reference m/z values are obtained through interpolation or extrapolation for the mass spectrum of the other data set.
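The m/z-axis case can be sketched with linear interpolation, which is one possible choice; the abstract does not specify the interpolation method. The function name `resample` is hypothetical, and the sketch assumes at least two points with strictly increasing m/z values.

```python
from bisect import bisect_left


def resample(mz, intensity, ref_mz):
    """Estimate intensities at the reference m/z positions.

    Uses linear interpolation between the two neighbouring points and
    linear extrapolation from the outermost segment when a reference
    value lies outside the measured range.
    """
    out = []
    for x in ref_mz:
        j = bisect_left(mz, x)
        # Clamp so that the segment (j - 1, j) always exists.
        j = min(max(j, 1), len(mz) - 1)
        x0, x1 = mz[j - 1], mz[j]
        y0, y1 = intensity[j - 1], intensity[j]
        out.append(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
    return out


mz = [100.0, 101.0, 102.0]
intensity = [0.0, 10.0, 20.0]
values = resample(mz, intensity, [100.5, 101.0, 103.0])
```

The same routine applies to the spatial case if the measurement point coordinate takes the place of m/z and one intensity per m/z channel is resampled at a time. Linear extrapolation can produce negative intensities, so a real implementation would probably clip or otherwise restrict it.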
Abstract:
When an amino acid sequence is deduced by de novo sequencing, the correct amino acid sequence is prevented from failing to rank high among the candidates. Amino acid sequence candidates are computed by finding the longest path using a branch and bound method based on the spectrum data of the target peptide and the known amino acid sequence. A tree-structured directed graph is used in which amino acid sequences are set as nodes and the peak intensities corresponding to the amino acids are set as branches. In the sequence at a node in the highest layer, an amino acid is placed at a terminal, and as the layers go deeper, amino acids are placed sequentially from both terminals toward the center of the sequence. The final score is estimated based on the remaining amino acids, and if the estimated score is small, the search is halted.
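The search strategy can be illustrated with a heavily simplified branch and bound sketch. Everything specific here is an assumption: the per-position score table stands in for peak intensities, the alphabet is limited to three residues with nominal integer masses, and the only constraint is that the residue masses sum to a target value. What the sketch keeps from the abstract is the placement order (from both terminals toward the center) and the halting rule (stop when the estimated final score cannot beat the best candidate found).

```python
MASS = {"G": 57, "A": 71, "S": 87}  # nominal residue masses


def placement_order(n):
    """Positions filled alternately from both terminals toward the center."""
    order, lo, hi = [], 0, n - 1
    while lo <= hi:
        order.append(lo)
        if hi != lo:
            order.append(hi)
        lo, hi = lo + 1, hi - 1
    return order


def best_sequence(score, target_mass):
    """score[pos][aa] is an ASSUMED score for residue aa at position pos."""
    n = len(score)
    order = placement_order(n)
    # best_rest[d]: optimistic score still obtainable from depth d onward.
    best_rest = [0.0] * (n + 1)
    for d in range(n - 1, -1, -1):
        best_rest[d] = best_rest[d + 1] + max(score[order[d]].values())
    best = [float("-inf"), None]
    seq = [None] * n

    def search(depth, total, mass):
        if depth == n:
            if mass == target_mass and total > best[0]:
                best[0], best[1] = total, "".join(seq)
            return
        if total + best_rest[depth] <= best[0]:
            return  # estimated final score too small: halt this branch
        pos = order[depth]
        for aa, s in score[pos].items():
            if mass + MASS[aa] <= target_mass:
                seq[pos] = aa
                search(depth + 1, total + s, mass + MASS[aa])
        seq[pos] = None

    search(0, 0.0, 0)
    return best[1], best[0]


score = [
    {"G": 1.0, "A": 3.0, "S": 2.0},
    {"G": 2.0, "A": 1.0, "S": 3.0},
    {"G": 3.0, "A": 2.0, "S": 1.0},
]
sequence, total = best_sequence(score, target_mass=215)
```

The bound used here is the sum of the best remaining per-position scores, which never underestimates the achievable total, so pruning cannot discard the optimal candidate in this toy setting.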