Abstract:
The subject invention provides systems and methods that facilitate AIDS vaccine cocktail assembly via machine learning techniques such as a cost function, a greedy algorithm, an expectation-maximization (EM) algorithm, etc. Such assembly can be utilized to generate vaccine cocktails for species of pathogens that evolve quickly under the immune pressure of the host. For example, the systems and methods of the subject invention can be utilized to facilitate design of T cell vaccines for pathogens such as HIV. In addition, the systems and methods of the subject invention can be utilized in connection with other applications, such as sequence alignment, motif discovery, classification, and recombination hot spot detection. The novel techniques described herein can improve on traditional approaches to designing vaccines by constructing vaccine cocktails with higher epitope coverage than, for example, cocktails of consensus sequences, tree nodes, and random strains from the data.
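As an illustration of the greedy approach to epitope coverage mentioned above, the following is a minimal sketch and not the patented method. It assumes that potential epitopes can be approximated by fixed-length k-mers and that coverage is the frequency-weighted fraction of population k-mers present in the cocktail; the function names `kmers` and `greedy_cocktail` are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of length-k substrings of seq (stand-ins for epitopes)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_cocktail(strains, size, k=9):
    """Greedily pick `size` strains that maximize weighted k-mer coverage.

    Each k-mer is weighted by the number of strains that contain it
    (an assumed cost function, chosen for illustration only).
    """
    freq = Counter()
    for s in strains:
        freq.update(kmers(s, k))
    total = sum(freq.values())

    covered, chosen = set(), []
    candidates = list(strains)
    for _ in range(min(size, len(candidates))):
        # Marginal gain: total weight of k-mers not yet covered.
        best = max(candidates,
                   key=lambda s: sum(freq[e] for e in kmers(s, k) - covered))
        chosen.append(best)
        covered |= kmers(best, k)
        candidates.remove(best)

    coverage = sum(freq[e] for e in covered) / total if total else 0.0
    return chosen, coverage
```

For example, `greedy_cocktail(["ABCD", "ABCE", "ABCD", "XYZW"], size=2, k=3)` first selects the most common strain and then the strain that adds the most uncovered k-mers, rather than a near-duplicate of the first pick.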
Abstract:
Epitope prediction models are described herein. By way of example, a system for predicting epitope information relating to an epitope can include a classification model (e.g., a logistic regression model). The trained classification model can execute one or more logistic functions on received protein data and incorporate one or more of hidden binary variables and shift variables that, when processed, represent the identification (e.g., prediction) of one or more desired epitopes. The classification model can be configured to predict the epitope information by processing data including various features of an epitope, MHC, MHC supertype, and Boolean combinations thereof.
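The feature scheme described above can be sketched as a plain logistic regression over indicator features. This is an assumption-laden illustration: the feature names, the toy training data, and the functions `featurize`, `predict`, and `train` are all hypothetical, and the hidden binary variables and shift variables mentioned in the abstract are omitted.

```python
import math


def featurize(peptide, mhc, supertype):
    """Indicator features for the peptide, MHC, supertype, and conjunctions."""
    feats = {"bias": 1.0, f"mhc={mhc}": 1.0, f"supertype={supertype}": 1.0}
    for pos, aa in enumerate(peptide):
        feats[f"aa{pos}={aa}"] = 1.0
        # Boolean combination: residue at this position AND the supertype.
        feats[f"aa{pos}={aa}&supertype={supertype}"] = 1.0
    return feats


def predict(weights, feats):
    """Logistic function applied to the weighted feature sum."""
    z = sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=500, lr=0.2):
    """Fit the weights by stochastic gradient ascent on the log-likelihood."""
    weights = {}
    for _ in range(epochs):
        for feats, label in examples:
            err = label - predict(weights, feats)
            for name, value in feats.items():
                weights[name] = weights.get(name, 0.0) + lr * err * value
    return weights


# Toy data (invented for illustration): leucine at position 1 marks epitopes.
toy = [("ALA", 1), ("GLG", 1), ("AKA", 0), ("GKG", 0)]
examples = [(featurize(p, "A*02:01", "A2"), y) for p, y in toy]
weights = train(examples)
```

After training, `predict(weights, featurize("ALA", "A*02:01", "A2"))` exceeds 0.5 while the same call for `"AKA"` falls below it, because the position-1 residue features separate the two classes in this toy set.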
Abstract:
A tool for providing health and/or wellness services is described herein. Data about a plurality of subjects is received; the data need not be clean, and the subjects can be self-selected or non-selected. The data can be aggregated and mined at least in part by employing a statistical algorithm, a data-mining algorithm and/or a machine-learning algorithm. The data can be further employed to provide health and/or wellness services to participants.
Abstract:
Cluster models are described herein. By way of example, a system for predicting binding information relating to a binding of a protein and a ligand can include a trained binding model and a prediction component. The trained binding model can include a probability distribution and a hidden variable that represents a cluster of protein sequences, and/or a set of hidden variables representing learned supertypes. The prediction component can be configured to predict the binding information by employing information about the protein's sequence, the ligand's sequence and the trained binding model.
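One way to read the role of the hidden cluster variable is as a mixture: the prediction marginalizes over which cluster the protein belongs to. The sketch below assumes that reading and uses made-up parameter tables; `predict_binding` and both dictionaries are hypothetical and do not come from the source.

```python
def predict_binding(cluster_given_protein, bind_given_cluster, ligand):
    """Return P(bind | protein, ligand) by summing over the hidden cluster.

    cluster_given_protein: {cluster: P(cluster | protein sequence)}
    bind_given_cluster:    {cluster: {ligand: P(bind | cluster, ligand)}}
    Both tables stand in for parameters a trained model would supply.
    """
    return sum(p_cluster * bind_given_cluster[cluster][ligand]
               for cluster, p_cluster in cluster_given_protein.items())


# Illustrative numbers only.
cluster_given_protein = {"c1": 0.75, "c2": 0.25}
bind_given_cluster = {"c1": {"SLYNTVATL": 0.8}, "c2": {"SLYNTVATL": 0.4}}
p_bind = predict_binding(cluster_given_protein, bind_given_cluster, "SLYNTVATL")
```

Here `p_bind` is 0.75 × 0.8 + 0.25 × 0.4 = 0.7. Sharing binding parameters within a cluster lets proteins with little data of their own borrow strength from similar sequences.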
Abstract:
Systems and methodologies for efficient vaccine design are disclosed herein. A methodology for efficient vaccine design in accordance with one or more embodiments disclosed herein may be operable to receive a graph having vertices corresponding to epitope sequences present in a pathogen population, weights for respective vertices corresponding to the frequencies with which the corresponding epitope sequences appear in the pathogen population, and directed edges that connect vertices corresponding to overlapping epitope sequences. Such a methodology may also be operable to determine a candidate vaccine sequence of overlapping epitope sequences by identifying a path through the graph, corresponding to a series of connected vertices and directed edges, that maximizes the total weight of the vertices in the path for a desired vaccine sequence length.
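The path search described above can be sketched with dynamic programming over path length. This is a simplified illustration under stated assumptions: every epitope has the same length k, an edge u→v exists when the last k−1 residues of u equal the first k−1 residues of v, and a vertex's weight is counted each time it is visited (so repeated vertices would need extra handling). The function name `best_vaccine_sequence` is hypothetical.

```python
def best_vaccine_sequence(weights, length):
    """Return (total_weight, sequence) for the best path, or None if none exists.

    weights: {epitope: frequency in the pathogen population}, equal-length keys.
    length:  desired vaccine sequence length in residues.
    """
    k = len(next(iter(weights)))
    n_vertices = length - k + 1  # a path of n vertices spells k + n - 1 residues
    if n_vertices < 1:
        raise ValueError("length must be at least the epitope length")

    # Directed edges: u -> v when u's suffix overlaps v's prefix by k-1 residues.
    preds = {v: [u for u in weights if u[1:] == v[:-1]] for v in weights}

    # best[v] = (score, sequence) of the best path ending at v so far.
    best = {v: (w, v) for v, w in weights.items()}
    for _ in range(n_vertices - 1):
        extended = {}
        for v, w in weights.items():
            options = [best[u] for u in preds[v] if u in best]
            if options:
                score, seq = max(options)
                extended[v] = (score + w, seq + v[-1])
        best = extended

    return max(best.values()) if best else None


# Toy frequencies (invented for illustration).
toy_weights = {"ABC": 0.5, "BCD": 0.25, "BCE": 0.125, "CDF": 0.0625}
```

With these toy weights, `best_vaccine_sequence(toy_weights, 4)` chooses the two-vertex path ABC→BCD, and asking for length 5 extends it through CDF, the only three-vertex path available.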
Abstract:
The methods and systems described herein facilitate large-scale data collection and aggregation. One exemplary system that facilitates large-scale reporting of health-related data comprises a data collection component, a database and an aggregation component. The data collection component can collect health-related data on a large scale from a non-selected population. The database can store at least some of the health-related data. The aggregation component can facilitate automatically ascertaining at least one pattern from the health-related data at least in part by applying one or more statistical, data-mining or machine-learning techniques to the database. One exemplary method of extracting health observations from information obtained on a macro scale comprises receiving information about a plurality of self-selected subjects, pooling the information, mining the pooled information at least in part by employing a data-mining algorithm to infer one or more health observations from the pooled information, and monetizing the one or more health observations.