METHODS AND SYSTEMS FOR PERSONALIZED NEOANTIGEN PREDICTION

    Publication No.: US20240013860A1

    Publication date: 2024-01-11

    Application No.: US18347105

    Application date: 2023-07-05

    Inventors: Hieu TRAN, Ming LI

    CPC classification number: G16B20/30 G16B20/20 G01N33/6878

    Abstract: Personalized machine learning systems and methods are provided to predict the collective response of a patient's CD8+ T cells by modeling positive and negative selection processes. For each individual patient, HLA-I self peptides are used as the negative selection set, and allele-matched immunogenic T cell epitopes as the positive selection set. The negative and positive peptides are used to train a binary classification model, which is then applied to predict the immunogenicity of that patient's candidate neoantigens.
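
For illustration, the per-patient training scheme in this abstract can be sketched as a small binary classifier. This is only a sketch under stated assumptions, not the patented model: the amino-acid composition encoding, the logistic-regression learner, the function names (`featurize`, `train_classifier`, `predict_immunogenicity`), and the peptide strings are all hypothetical and chosen for brevity.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def featurize(peptide):
    """Amino-acid composition vector (an assumed, deliberately simple encoding)."""
    return [peptide.count(a) / len(peptide) for a in AMINO_ACIDS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_classifier(negatives, positives, epochs=500, lr=0.5):
    """Fit logistic regression by stochastic gradient descent.

    negatives: the patient's HLA-I self peptides (label 0).
    positives: allele-matched immunogenic epitopes (label 1).
    """
    data = [(featurize(p), 0.0) for p in negatives]
    data += [(featurize(p), 1.0) for p in positives]
    weights = [0.0] * len(AMINO_ACIDS)
    bias = 0.0
    for _ in range(epochs):
        for x, y in data:
            pred = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = pred - y  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict_immunogenicity(model, peptide):
    """Score a candidate neoantigen in [0, 1] with the patient-specific model."""
    weights, bias = model
    x = featurize(peptide)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Toy, made-up peptides used only to exercise the code.
self_peptides = ["AAALAAGAA", "LLGAALAAG", "GALAGLAAL"]  # negative selection
epitopes = ["KWKRKWKKR", "RRWKKWRKW", "WKRWKRWKR"]       # positive selection
model = train_classifier(self_peptides, epitopes)
for candidate in ["KRWKKRWKW", "ALGAALGAL"]:
    print(candidate, round(predict_immunogenicity(model, candidate), 3))
```

One model is trained per patient, so the negative set and the allele-matched positive set would both change from one patient to the next.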

    Methods and systems for assembly of protein sequences

    Publication No.: US10309968B2

    Publication date: 2019-06-04

    Application No.: US15599431

    Application date: 2017-05-18

    Abstract: Methods and systems for determining the amino acid sequence of a polypeptide or protein from mass spectrometry data are provided, using a weighted de Bruijn graph. Extracted and purified protein is cleaved into a mixture of peptides and then analyzed using mass spectrometry. A list of peptide sequences is derived from mass spectrometry fragment data by de novo sequencing, and amino acid confidence scores are determined from peak fragment ion intensity. A weighted de Bruijn graph is constructed for the list of peptide sequences, with node weights defined by k−1 mer confidence scores. At least one contig is assembled from the de Bruijn graph by identifying the nodes with the highest k−1 mer confidence scores.
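
For illustration, the assembly step in this abstract can be sketched as follows. The sketch rests on several assumptions that the abstract does not specify: a (k−1)-mer's score is taken as the mean of its residue confidences, summed over every peptide that contains it, and the contig is grown greedily from the heaviest node. The function names and the peptide reads are hypothetical.

```python
from collections import defaultdict


def build_weighted_graph(peptides, k):
    """Build a de Bruijn graph whose nodes are (k-1)-mers.

    peptides: list of (sequence, per-residue confidence list) pairs, as
    might come out of de novo sequencing.
    """
    m = k - 1
    node_weight = defaultdict(float)
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for seq, conf in peptides:
        mers = [seq[i:i + m] for i in range(len(seq) - m + 1)]
        for i, mer in enumerate(mers):
            # Assumed scoring: mean residue confidence, accumulated per node.
            node_weight[mer] += sum(conf[i:i + m]) / m
        # Consecutive (k-1)-mers share k-2 residues, i.e. one k-mer edge.
        for left, right in zip(mers, mers[1:]):
            successors[left].add(right)
            predecessors[right].add(left)
    return node_weight, successors, predecessors


def assemble_contig(node_weight, successors, predecessors):
    """Greedy walk: start at the heaviest node, extend right, then left."""
    start = max(node_weight, key=node_weight.get)
    path = [start]
    visited = {start}
    while True:  # extend to the right
        candidates = [n for n in successors[path[-1]] if n not in visited]
        if not candidates:
            break
        nxt = max(candidates, key=node_weight.get)
        path.append(nxt)
        visited.add(nxt)
    while True:  # extend to the left
        candidates = [n for n in predecessors[path[0]] if n not in visited]
        if not candidates:
            break
        prev = max(candidates, key=node_weight.get)
        path.insert(0, prev)
        visited.add(prev)
    # Each step along the path contributes one new residue.
    return path[0] + "".join(node[-1] for node in path[1:])


# Toy reads: three overlapping peptides plus one low-confidence read
# that disagrees at a single residue.
reads = [
    ("ACDEFG", [0.9] * 6),
    ("DEFGHIK", [0.8] * 7),
    ("GHIKLMN", [0.95] * 7),
    ("DEFGHLK", [0.2] * 7),
]
weights, succ, pred = build_weighted_graph(reads, k=4)
print(assemble_contig(weights, succ, pred))
```

At the branch after `FGH`, the walk follows `GHI` rather than `GHL` because the former carries the larger accumulated confidence, which is the role the node weights play in this sketch.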

    Method and system for faster and more sensitive homology searching

    Publication No.: US09652586B2

    Publication date: 2017-05-16

    Application No.: US11561327

    Application date: 2006-11-17

    CPC classification number: G06F19/22 G06F19/24

    Abstract: An area of research in the field of bioinformatics deals with the identification of similarities within one DNA sequence or between two DNA sequences. Current techniques are quite slow and miss many matches. The invention provides a faster and more sensitive solution by using “optimized spaced seeds” to perform these biological sequence homology searches. Various techniques are shown for identifying seeds which are optimized to improve the sensitivity or speed of the searching. In the preferred embodiment, optimized spaced seeds are determined by the parameters of the search, independent of the actual databases being searched (for example, using the length and weight of the spaced seed, as well as the probability of a hit in a similar region). Thus, these optimized seeds can be stored in libraries which are accessed as required.
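
For illustration, a spaced seed can be written as a string of `1` (position must match) and `0` (position is ignored). The sketch below shows how such a seed finds hits between two sequences, and how the probability of at least one hit in a similar region can be computed from the seed and the match probability alone, without reference to any database. The seed `1101`, the sequences, and the function names are hypothetical; the brute-force enumeration is an assumption made for clarity and is practical only for short regions.

```python
from collections import defaultdict
from itertools import product


def masked_key(seq, start, seed):
    """Residues of seq under the seed's '1' positions, starting at start."""
    return "".join(seq[start + i] for i, c in enumerate(seed) if c == "1")


def find_seed_hits(query, target, seed):
    """Return (query_pos, target_pos) pairs that agree at every '1' position."""
    span = len(seed)
    index = defaultdict(list)
    for j in range(len(target) - span + 1):
        index[masked_key(target, j, seed)].append(j)
    hits = []
    for i in range(len(query) - span + 1):
        for j in index.get(masked_key(query, i, seed), []):
            hits.append((i, j))
    return hits


def hit_probability(seed, region_length, match_prob):
    """Probability of at least one seed hit in a region of the given length.

    Each position of the region is assumed to match independently with
    probability match_prob; all 2**region_length match patterns are enumerated.
    """
    care = [i for i, c in enumerate(seed) if c == "1"]
    total = 0.0
    for region in product((0, 1), repeat=region_length):
        hit = any(
            all(region[s + i] for i in care)
            for s in range(region_length - len(seed) + 1)
        )
        if hit:
            matches = sum(region)
            total += match_prob ** matches * (1 - match_prob) ** (region_length - matches)
    return total


# The two toy sequences differ at index 2, which falls on the seed's '0'.
print(find_seed_hits("ACGTAC", "ACTTAC", "1101"))
# Compare two weight-3 seeds on the same region parameters.
for seed in ("111", "1101"):
    print(seed, hit_probability(seed, region_length=12, match_prob=0.7))
```

Because `hit_probability` depends only on the seed, the region length, and the match probability, candidate seeds can be ranked once and stored for later searches.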
