Systems and methods that utilize machine learning algorithms to facilitate assembly of AIDS vaccine cocktails
    1.
    Granted patent
    Legal status: in force

    Publication No.: US08478535B2

    Publication date: 2013-07-02

    Application No.: US11324506

    Filing date: 2005-12-30

    IPC classification: G01N33/50

    Abstract: The subject invention provides systems and methods that facilitate AIDS vaccine cocktail assembly via machine learning algorithms such as a cost function, a greedy algorithm, an expectation-maximization (EM) algorithm, etc. Such assembly can be utilized to generate vaccine cocktails for species of pathogens that evolve quickly under immune pressure of the host. For example, the systems and methods of the subject invention can be utilized to facilitate design of T cell vaccines for pathogens such as HIV. In addition, the systems and methods of the subject invention can be utilized in connection with other applications, such as, for example, sequence alignment, motif discovery, classification, and recombination hot spot detection. The novel techniques described herein can provide for improvements over traditional approaches to designing vaccines by constructing vaccine cocktails with higher epitope coverage, for example, in comparison with cocktails of consensi, tree nodes and random strains from data.
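The greedy, coverage-driven assembly named in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: it assumes that candidate epitopes are approximated by fixed-length windows (k-mers), that coverage is weighted by how many population strains contain each k-mer, and that cocktail members are picked from the observed strains. The names `kmers` and `greedy_cocktail` are hypothetical.

```python
from collections import Counter


def kmers(seq, k=9):
    """Return the set of length-k windows in seq (stand-ins for candidate epitopes)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_cocktail(strains, size, k=9):
    """Greedily pick up to `size` strains that maximize frequency-weighted k-mer coverage."""
    # Assumed cost function: each k-mer is weighted by the number of strains containing it.
    weights = Counter(km for s in strains for km in kmers(s, k))
    covered, chosen = set(), []
    candidates = list(dict.fromkeys(strains))  # de-duplicate, keep input order
    for _ in range(min(size, len(candidates))):
        # Marginal gain of a strain: total weight of its k-mers not yet covered.
        def gain(s):
            return sum(weights[km] for km in kmers(s, k) - covered)

        best = max((s for s in candidates if s not in chosen), key=gain)
        chosen.append(best)
        covered |= kmers(best, k)
    total = sum(weights.values())
    coverage = sum(weights[km] for km in covered) / total if total else 0.0
    return chosen, coverage
```

Calling `greedy_cocktail(sequences, size=3)` returns the selected strains and the fraction of weighted k-mer occurrences they cover. A greedy choice is a common heuristic for coverage objectives of this kind; the abstract also mentions an EM algorithm, which this sketch does not attempt.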

    T-CELL EPITOPE PREDICTION
    2.
    Patent application
    Legal status: in force

    Publication No.: US20080172215A1

    Publication date: 2008-07-17

    Application No.: US11963081

    Filing date: 2007-12-21

    IPC classification: G06G7/60

    CPC classification: G06F19/24 G06F19/16 G06F19/18

    Abstract: Epitope prediction models are described herein. By way of example, a system for predicting epitope information relating to an epitope can include a classification model (e.g., logistic regression model). The trained classification model can illustratively operatively execute one or more logistic functions on received protein data, and incorporate one or more of hidden binary variables and shift variables that when processed represent the identification (e.g., prediction) of one or more desired epitopes. The classification model can be configured to predict the epitope information by processing data including various features of an epitope, MHC, MHC supertype, and Boolean combinations thereof.
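As a rough illustration of the feature scheme this abstract describes, the Python sketch below scores a peptide with a logistic function over Boolean features of the epitope, the MHC allele, the MHC supertype, and their conjunctions. The feature naming, the `features` and `predict` helpers, and the weight dictionary are assumptions made for the example. The hidden binary variables and shift variables mentioned in the abstract are omitted, and training is not shown.

```python
import math


def features(peptide, mhc, supertype):
    """Build sparse Boolean features: residue at position, alone and AND-ed with MHC or supertype."""
    feats = {"bias"}
    for pos, aa in enumerate(peptide):
        base = f"pos{pos}={aa}"
        feats.add(base)
        feats.add(f"{base}&mhc={mhc}")
        feats.add(f"{base}&supertype={supertype}")
    return feats


def predict(weights, peptide, mhc, supertype):
    """Apply the logistic function to the summed weights of the active features."""
    z = sum(weights.get(f, 0.0) for f in features(peptide, mhc, supertype))
    return 1.0 / (1.0 + math.exp(-z))
```

With a weight placed on a supertype conjunction, for example `{"pos1=L&supertype=A2": 2.0}`, the score rises only for peptides presented by alleles assigned to that supertype, which is how such features can share evidence across related alleles.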

    Cluster modeling, and learning cluster specific parameters of an adaptive double threading model
    4.
    Granted patent
    Legal status: in force

    Publication No.: US08396671B2

    Publication date: 2013-03-12

    Application No.: US11770684

    Filing date: 2007-06-28

    IPC classification: G01N33/48

    CPC classification: G06F19/24 G06F19/16 G06G7/48

    Abstract: Cluster models are described herein. By way of example, a system for predicting binding information relating to a binding of a protein and a ligand can include a trained binding model and a prediction component. The trained binding model can include a probability distribution and a hidden variable that represents a cluster of protein sequences, and/or a set of hidden variables representing learned supertypes. The prediction component can be configured to predict the binding information by employing information about the protein's sequence, the ligand's sequence and the trained binding model.
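One way to read the hidden cluster variable in this abstract is as a mixture component that is summed out at prediction time. The Python sketch below assumes a per-cluster logistic binding model and a given posterior over clusters for the protein. Both are illustrative choices and do not reproduce the adaptive double threading model named in the title. The function and parameter names are hypothetical.

```python
import math


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_binding(x, cluster_posterior, cluster_weights):
    """Marginalize a per-cluster logistic binding model over the hidden cluster variable.

    x                 -- feature vector for the protein/ligand pair (assumed given)
    cluster_posterior -- dict mapping cluster id to P(cluster | protein)
    cluster_weights   -- dict mapping cluster id to that cluster's weight vector
    """
    return sum(
        p_c * sigmoid(sum(w * xi for w, xi in zip(cluster_weights[c], x)))
        for c, p_c in cluster_posterior.items()
    )
```

Because the prediction is a posterior-weighted average, a protein whose cluster membership is uncertain borrows strength from every cluster it might belong to.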

    T-cell epitope prediction
    5.
    Granted patent
    Legal status: in force

    Publication No.: US08121797B2

    Publication date: 2012-02-21

    Application No.: US11963081

    Filing date: 2007-12-21

    IPC classification: G01N33/50

    CPC classification: G06F19/24 G06F19/16 G06F19/18

    Abstract: Epitope prediction models are described herein. By way of example, a system for predicting epitope information relating to an epitope can include a classification model (e.g., logistic regression model). The trained classification model can illustratively operatively execute one or more logistic functions on received protein data, and incorporate one or more of hidden binary variables and shift variables that when processed represent the identification (e.g., prediction) of one or more desired epitopes. The classification model can be configured to predict the epitope information by processing data including various features of an epitope, MHC, MHC supertype, and Boolean combinations thereof.

    Vaccine design methodology
    6.
    Granted patent
    Legal status: in force

    Publication No.: US08452541B2

    Publication date: 2013-05-28

    Application No.: US11764402

    Filing date: 2007-06-18

    IPC classification: G01N33/48 G06F19/00

    Abstract: Systems and methodologies for efficient vaccine design are disclosed herein. A methodology for efficient vaccine design in accordance with one or more embodiments disclosed herein may be operable to receive a graph having vertices corresponding to epitope sequences present in the pathogen population, weights for respective vertices corresponding to respective frequencies with which corresponding epitope sequences appear in the pathogen population, and directed edges that connect vertices that correspond to overlapping epitope sequences. Such a methodology may also be operable to determine a candidate vaccine sequence of overlapping epitope sequences by identifying a path through the graph corresponding to a series of connected vertices and directed edges that maximizes the total weight of the vertices in the path for a desired vaccine sequence length.
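The graph search described in this abstract can be sketched as a dynamic program over path length. The Python sketch below makes several assumptions that are not stated in the abstract: all epitopes have the same length k, two epitopes overlap when the last k-1 residues of one equal the first k-1 residues of the next, and the desired length is given as a number of vertices. It also scores walks rather than simple paths, so a vertex that is revisited would be counted again; preventing that needs extra bookkeeping that is left out here. The name `best_vaccine_sequence` is hypothetical.

```python
def best_vaccine_sequence(weights, n_vertices):
    """Find a max-weight walk of n_vertices epitopes where consecutive ones overlap by k-1.

    weights -- dict mapping each epitope string (all of equal length k) to its frequency
    Returns (total weight, assembled sequence).
    """
    nodes = list(weights)
    # Directed edge u -> v when u's last k-1 residues equal v's first k-1 residues.
    succ = {u: [v for v in nodes if u[1:] == v[:-1]] for u in nodes}
    # best[v] = (total weight, sequence) of the best walk found so far that ends at v.
    best = {v: (weights[v], v) for v in nodes}
    for _ in range(n_vertices - 1):
        nxt = {}
        for u, (w, seq) in best.items():
            for v in succ[u]:
                cand = (w + weights[v], seq + v[-1])
                if v not in nxt or cand[0] > nxt[v][0]:
                    nxt[v] = cand
        best = nxt
    return max(best.values(), default=(0.0, ""))
```

Each extension appends one residue, so a walk of `n_vertices` epitopes of length k yields a sequence of length k + n_vertices - 1, which is how a desired vaccine sequence length would translate into the `n_vertices` argument under these assumptions.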

    Large-scale information collection and mining
    7.
    Granted patent
    Legal status: in force

    Publication No.: US07814035B2

    Publication date: 2010-10-12

    Application No.: US12180705

    Filing date: 2008-07-28

    Abstract: The methods/systems described herein facilitate large-scale data collection and aggregation. One exemplary system that facilitates large-scale reporting of health-related data comprises a data collection component, a database and an aggregation component. The data collection component can collect health-related data on a large scale from a non-selected population. The database can store at least some of the health-related data. The aggregation component can facilitate automatically ascertaining at least one pattern from the health-related data at least in part by applying one or more statistical, data-mining or machine-learning techniques to the database. One exemplary method of extracting health observations from information obtained on a macro-scale comprises receiving information about a plurality of self-selected subjects, pooling the information, mining the pooled information at least in part by employing a data-mining algorithm to infer one or more health observations from the pooled information, and monetizing the one or more health observations.
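The three components named in this abstract (collection, database, aggregation) can be mocked up in a few lines of Python. This is a toy sketch under stated assumptions: records are sets of self-reported attribute strings, the "database" is an in-memory list, and the mined pattern is pairwise co-occurrence scored by lift. The class `HealthDataStore`, the function `mine_pairs`, and the `min_lift` threshold are hypothetical and are not taken from the patent.

```python
from collections import Counter
from itertools import combinations


class HealthDataStore:
    """Toy stand-in for the data collection component plus the database."""

    def __init__(self):
        self.records = []

    def collect(self, attributes):
        # Each self-reported record is stored as a set of attribute strings.
        self.records.append(frozenset(attributes))


def mine_pairs(store, min_lift=1.5):
    """Aggregation step: report attribute pairs that co-occur more often than chance."""
    n = len(store.records)
    single = Counter(a for r in store.records for a in r)
    pair = Counter(p for r in store.records for p in combinations(sorted(r), 2))
    found = []
    for (a, b), c in pair.items():
        # Lift = P(a and b) / (P(a) * P(b)); values above 1 suggest association.
        lift = (c / n) / ((single[a] / n) * (single[b] / n))
        if lift >= min_lift:
            found.append((a, b, round(lift, 2)))
    return sorted(found)
```

Lift is only one simple association measure; with self-selected subjects, as the abstract notes, any pattern found this way reflects the reporting population and would need further statistical checking.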

    VACCINE DESIGN METHODOLOGY
    8.
    Patent application
    Legal status: in force

    Publication No.: US20080312095A1

    Publication date: 2008-12-18

    Application No.: US11764402

    Filing date: 2007-06-18

    IPC classification: G06G7/58

    Abstract: Systems and methodologies for efficient vaccine design are disclosed herein. A methodology for efficient vaccine design in accordance with one or more embodiments disclosed herein may be operable to receive a graph having vertices corresponding to epitope sequences present in the pathogen population, weights for respective vertices corresponding to respective frequencies with which corresponding epitope sequences appear in the pathogen population, and directed edges that connect vertices that correspond to overlapping epitope sequences. Such a methodology may also be operable to determine a candidate vaccine sequence of overlapping epitope sequences by identifying a path through the graph corresponding to a series of connected vertices and directed edges that maximizes the total weight of the vertices in the path for a desired vaccine sequence length.

    LARGE-SCALE INFORMATION COLLECTION AND MINING
    9.
    Patent application
    Legal status: in force

    Publication No.: US20080294465A1

    Publication date: 2008-11-27

    Application No.: US12180705

    Filing date: 2008-07-28

    IPC classification: G06Q50/00

    Abstract: The methods/systems described herein facilitate large-scale data collection and aggregation. One exemplary system that facilitates large-scale reporting of health-related data comprises a data collection component, a database and an aggregation component. The data collection component can collect health-related data on a large scale from a non-selected population. The database can store at least some of the health-related data. The aggregation component can facilitate automatically ascertaining at least one pattern from the health-related data at least in part by applying one or more statistical, data-mining or machine-learning techniques to the database. One exemplary method of extracting health observations from information obtained on a macro-scale comprises receiving information about a plurality of self-selected subjects, pooling the information, mining the pooled information at least in part by employing a data-mining algorithm to infer one or more health observations from the pooled information, and monetizing the one or more health observations.

    Large-scale information collection and mining
    10.
    Granted patent
    Legal status: in force

    Publication No.: US07406453B2

    Publication date: 2008-07-29

    Application No.: US11266974

    Filing date: 2005-11-04

    Abstract: The methods/systems described herein facilitate large-scale data collection and aggregation. One exemplary system that facilitates large-scale reporting of health-related data comprises a data collection component, a database and an aggregation component. The data collection component can collect health-related data on a large scale from a non-selected population. The database can store at least some of the health-related data. The aggregation component can facilitate automatically ascertaining at least one pattern from the health-related data at least in part by applying one or more statistical, data-mining or machine-learning techniques to the database. One exemplary method of extracting health observations from information obtained on a macro-scale comprises receiving information about a plurality of self-selected subjects, pooling the information, mining the pooled information at least in part by employing a data-mining algorithm to infer one or more health observations from the pooled information, and monetizing the one or more health observations.
