-
Publication number: US11694769B2
Publication date: 2023-07-04
Application number: US16226575
Filing date: 2018-12-19
Applicant: BIOINFORMATICS SOLUTIONS INC.
Inventor: Baozhen Shan , Ngoc Hieu Tran , Ming Li , Lei Xin , Rui Qiao , Xin Chen , Chuyi Liu
CPC classification number: G16B40/10 , G01N33/6818 , G01N33/6848 , G06N3/02 , G16B30/20 , G16B40/20 , H01J49/0036
Abstract: The present systems and methods introduce deep learning to de novo peptide sequencing from tandem mass spectrometry data, in particular mass spectrometry data obtained by data-independent acquisition. The systems and methods achieve improvements in sequencing accuracy over existing systems and methods and enable complete assembly of novel protein sequences without assisting databases. To sequence peptides from mass spectrometry data obtained by data-independent acquisition, two inputs are fed into a neural network: precursor profiles, which represent intensities of one or more precursor ion signals associated with a precursor retention time, and fragment ion spectra, which represent signals from fragment ions and fragment retention times.
-
Publication number: US20240013860A1
Publication date: 2024-01-11
Application number: US18347105
Filing date: 2023-07-05
Applicant: BIOINFORMATICS SOLUTIONS INC.
CPC classification number: G16B20/30 , G16B20/20 , G01N33/6878
Abstract: Personalized machine learning systems and methods are provided to predict the collective response of a patient's CD8+ T cells by modeling positive and negative selection processes. For each individual patient, HLA-I self peptides are used for negative selection, and allele-matched immunogenic T cell epitopes for positive selection. The negative and positive peptides are used to train a binary classification model, which is then applied to predict the immunogenicity of candidate neoantigens of that patient.
-
Publication number: US11573239B2
Publication date: 2023-02-07
Application number: US16037949
Filing date: 2018-07-17
Applicant: BIOINFORMATICS SOLUTIONS INC.
Inventor: Baozhen Shan , Ngoc Hieu Tran , Ming Li , Lei Xin , Xianglilan Zhang
IPC: G01N33/48 , G01N33/50 , G01N33/68 , G06F17/16 , G16B20/00 , G16B40/00 , G16B50/00 , G16B30/20 , G16B40/10 , G16B40/20 , G16B50/20 , G16B50/10
Abstract: The present systems and methods introduce deep learning to de novo peptide sequencing from tandem mass spectrometry data. The systems and methods achieve improvements in sequencing accuracy over existing systems and methods and enable complete assembly of novel protein sequences without assisting databases. The present systems and methods are re-trainable to adapt to new sources of data and provide a complete end-to-end training and prediction solution, which is advantageous given the rapidly growing amount of data. The systems and methods combine deep learning and dynamic programming to solve optimization problems.
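The abstract does not say how dynamic programming enters the method. One plausible role, assumed here purely for illustration, is a knapsack-style mass filter: at each sequencing step, only residues whose mass still allows the remaining precursor mass to be filled exactly are kept as candidates. The sketch below uses integer nominal residue masses for five amino acids; the function names, the reduced alphabet, and the score dictionary are hypothetical and not taken from the patent.

```python
# Nominal (integer) residue masses in Da for a few amino acids.
# Assumption: a real system would use the full alphabet and finer mass bins.
RESIDUE_MASS = {"G": 57, "A": 71, "S": 87, "P": 97, "V": 99}


def reachable_masses(max_mass, residue_mass=RESIDUE_MASS):
    """reachable[m] is True if m is a sum of residue masses (with repetition)."""
    reachable = [False] * (max_mass + 1)
    reachable[0] = True  # the empty suffix has mass 0
    for m in range(1, max_mass + 1):
        reachable[m] = any(
            r <= m and reachable[m - r] for r in residue_mass.values()
        )
    return reachable


def feasible_next(remaining, reachable, residue_mass=RESIDUE_MASS):
    """Residues that can come next so the remaining mass can still be filled."""
    return sorted(
        aa
        for aa, r in residue_mass.items()
        if r <= remaining and reachable[remaining - r]
    )


def mask_scores(scores, remaining, reachable, residue_mass=RESIDUE_MASS):
    """Drop model scores for residues that the mass constraint rules out."""
    allowed = set(feasible_next(remaining, reachable, residue_mass))
    return {aa: s for aa, s in scores.items() if aa in allowed}


reachable = reachable_masses(200)
# Hypothetical per-residue scores, standing in for a neural network output.
filtered = mask_scores({"G": 0.2, "A": 0.5, "S": 0.3}, 128, reachable)
```

With 128 Da remaining, only G and A survive the filter, because 128 = 57 + 71 and no other combination of the listed masses fits. The table is built once per precursor and reused at every step, which keeps the per-step check cheap.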
-
Publication number: US11644470B2
Publication date: 2023-05-09
Application number: US16846817
Filing date: 2020-04-13
Applicant: BIOINFORMATICS SOLUTIONS INC.
Inventor: Rui Qiao , Ngoc Hieu Tran , Lei Xin , Xin Chen , Baozhen Shan , Ali Ghodsi , Ming Li
CPC classification number: G01N33/6848 , G16B20/00 , G16B30/00 , G16B40/00
Abstract: The present systems and methods are directed to de novo identification of peptide sequences from tandem mass spectrometry data. The systems and methods use unconverted mass spectrometry data from which features are extracted. Using unconverted mass spectrometry data reduces the loss of information and provides more accurate sequencing of peptides. The systems and methods combine deep learning and neural networks to sequence peptides.
-
Publication number: US10309968B2
Publication date: 2019-06-04
Application number: US15599431
Filing date: 2017-05-18
Applicant: BIOINFORMATICS SOLUTIONS INC.
Inventor: Ngoc Hieu Tran , Mohammad Ziaur Rahman , Lin He , Lei Xin , Baozhen Shan , Ming Li
Abstract: Methods and systems for determining the amino acid sequence of a polypeptide or protein from mass spectrometry data are provided, using a weighted de Bruijn graph. Extracted and purified protein is cleaved into a mixture of peptides and then analyzed using mass spectrometry. A list of peptide sequences is derived from mass spectrometry fragment data by de novo sequencing, and amino acid confidence scores are determined from peak fragment ion intensity. A weighted de Bruijn graph is constructed for the list of peptide sequences, with node weights defined by (k−1)-mer confidence scores. At least one contig is assembled from the de Bruijn graph by identifying node weights having the highest (k−1)-mer confidence scores.
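The assembly idea in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: node weight is taken to be the mean per-residue confidence of a (k−1)-mer, summed over all peptides that contain it, and a contig is grown greedily in both directions from the heaviest node. The function names, the weighting rule, and the toy peptides and confidence values are all hypothetical.

```python
from collections import defaultdict


def build_graph(peptides, k=3):
    """Build a weighted de Bruijn graph.

    peptides: list of (sequence, per-residue confidence list) pairs.
    Returns (node_weight, successors), where nodes are (k-1)-mers.
    """
    node_weight = defaultdict(float)
    successors = defaultdict(set)
    for seq, conf in peptides:
        # Every (k-1)-mer is a node; add its mean residue confidence.
        for i in range(len(seq) - k + 2):
            node_weight[seq[i:i + k - 1]] += sum(conf[i:i + k - 1]) / (k - 1)
        # Every k-mer is an edge from its prefix node to its suffix node.
        for i in range(len(seq) - k + 1):
            successors[seq[i:i + k - 1]].add(seq[i + 1:i + k])
    return node_weight, successors


def assemble_contig(peptides, k=3):
    """Greedily extend a contig left and right from the heaviest node."""
    weight, successors = build_graph(peptides, k)
    predecessors = defaultdict(set)
    for u, vs in successors.items():
        for v in vs:
            predecessors[v].add(u)

    start = max(weight, key=weight.get)
    visited = {start}
    contig = start

    node = start
    while True:  # extend to the right
        candidates = [n for n in successors.get(node, ()) if n not in visited]
        if not candidates:
            break
        node = max(candidates, key=weight.get)
        visited.add(node)
        contig += node[-1]

    node = start
    while True:  # extend to the left
        candidates = [n for n in predecessors.get(node, ()) if n not in visited]
        if not candidates:
            break
        node = max(candidates, key=weight.get)
        visited.add(node)
        contig = node[0] + contig

    return contig


# Two made-up overlapping peptides with made-up confidence scores.
toy = [("PEPTIDE", [0.9] * 7), ("TIDESK", [0.8] * 6)]
contig = assemble_contig(toy, k=3)
```

On the toy input the two peptides overlap on "TIDE", so the shared (k−1)-mers accumulate the highest weights and the contig is extended from there to "PEPTIDESK". Greedy extension is the simplest choice; it can take a wrong branch when low-confidence peptides disagree, which is the situation the confidence weighting is meant to mitigate.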
-
Publication number: US09652586B2
Publication date: 2017-05-16
Application number: US11561327
Filing date: 2006-11-17
Applicant: Ming Li , Bin Ma , John Tromp
Inventor: Ming Li , Bin Ma , John Tromp
Abstract: An area of research in the field of bioinformatics deals with the identification of similarities within one DNA sequence or between two DNA sequences. Current techniques are quite slow and many matches are missed. The invention provides a faster and more sensitive solution, by using “optimized spaced seeds” to perform these biological sequence homology searches. Various techniques are shown for identifying seeds which are optimized to improve the sensitivity or speed of the searching. In the preferred embodiment, optimized spaced seeds are determined by the parameters of the search and are independent of the actual databases being searched (for example, using the length and weight of the spaced seed, as well as the probability of a hit in a similar region). Thus, these optimized seeds can be stored in libraries which are accessed as required.
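A spaced seed can be written as a string of 1s (positions that must match) and 0s (positions that are ignored). The sketch below, offered as an illustration rather than the patent's implementation, shows two things: finding seed hits between two sequences, and computing by brute force the probability that a seed hits a similar region of a given length, assuming each position matches independently with probability p. The function names and the short example seed "1101" are hypothetical.

```python
from collections import defaultdict
from itertools import product


def spaced_seed_hits(query, target, seed="1101"):
    """Return (query_pos, target_pos) pairs where all '1' positions agree."""
    care = [i for i, c in enumerate(seed) if c == "1"]
    span = len(seed)
    # Index the target by the letters found at the seed's '1' positions.
    index = defaultdict(list)
    for t in range(len(target) - span + 1):
        index["".join(target[t + i] for i in care)].append(t)
    hits = []
    for q in range(len(query) - span + 1):
        key = "".join(query[q + i] for i in care)
        hits.extend((q, t) for t in index.get(key, []))
    return hits


def seed_sensitivity(seed, region_len, p):
    """Probability that the seed hits somewhere in a region of region_len
    positions, each matching independently with probability p.

    Enumerates all 2**region_len match patterns, so it is only practical
    for short regions.
    """
    care = [i for i, c in enumerate(seed) if c == "1"]
    span = len(seed)
    total = 0.0
    for bits in product((0, 1), repeat=region_len):
        hit = any(
            all(bits[s + i] for i in care)
            for s in range(region_len - span + 1)
        )
        if hit:
            matches = sum(bits)
            total += p ** matches * (1 - p) ** (region_len - matches)
    return total


# The mismatch at the third position is ignored by the spaced seed
# but breaks a contiguous seed of the same length.
spaced = spaced_seed_hits("ACGT", "ACTT", seed="1101")
contiguous = spaced_seed_hits("ACGT", "ACTT", seed="1111")
```

Because `seed_sensitivity` depends only on the seed, the region length, and p, seeds can be ranked once for a given set of search parameters and reused across databases, which matches the abstract's point about storing optimized seeds in libraries.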
-