Machine learning using relational databases
    11.
    Granted patent
    Legal status: Active

    Publication number: US08364612B2

    Publication date: 2013-01-29

    Application number: US12559921

    Filing date: 2009-09-15

    IPC classification: G06F15/18

    CPC classification: G06N99/005

    Abstract: Machine learning using relational databases is described. In an embodiment, a model of a probabilistic relational database is formed by augmenting relation schemas of a relational database with probabilistic attributes. In an example, the model comprises constraints introduced by linking the probabilistic attributes using factor statements. For example, a compiler translates the model into a factor graph data structure which may be passed to an inference engine to carry out machine learning. For example, this enables machine learning to be integrated with the data and it is not necessary to pre-process or reformat large scale data sets for a particular problem domain. In an embodiment, a machine learning system for estimating skills of players in an online gaming environment is provided. In another example, a machine learning system for data mining of medical data is provided. In some examples, missing attribute values are filled using machine learning results.
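The compilation step described in the abstract can be illustrated with a minimal sketch. It assumes a toy skill-rating schema (a list of players and a list of game results) and a hypothetical `compile_model` function; the variable and factor names are invented for illustration and are not the patent's modelling language or compiler.

```python
from dataclasses import dataclass, field


@dataclass
class FactorGraph:
    variables: list = field(default_factory=list)  # names of variable nodes
    factors: list = field(default_factory=list)    # (factor name, [variable names])


def compile_model(players, games):
    """Hypothetical compiler: each player row gains a probabilistic 'skill'
    attribute, each game row gains per-player 'perf' attributes, and factor
    statements link them into a factor graph."""
    graph = FactorGraph()
    for player in players:
        skill = f"skill[{player}]"
        graph.variables.append(skill)
        graph.factors.append(("prior", [skill]))
    for game_id, winner, loser in games:
        perf_w = f"perf[{game_id},{winner}]"
        perf_l = f"perf[{game_id},{loser}]"
        graph.variables += [perf_w, perf_l]
        graph.factors.append(("noise", [f"skill[{winner}]", perf_w]))
        graph.factors.append(("noise", [f"skill[{loser}]", perf_l]))
        # The observed outcome constrains the two performances.
        graph.factors.append(("greater_than", [perf_w, perf_l]))
    return graph


players = ["alice", "bob"]
games = [(1, "alice", "bob")]  # (game id, winner, loser)
graph = compile_model(players, games)
```

The resulting `graph` is only a data structure; running inference on it would be the job of a separate inference engine, which this sketch does not attempt.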

    FEATURE VECTOR CONSTRUCTION
    12.
    Patent application
    Legal status: Under examination (published)

    Publication number: US20120158791A1

    Publication date: 2012-06-21

    Application number: US12975177

    Filing date: 2010-12-21

    IPC classification: G06F17/30

    CPC classification: G06F16/9024

    Abstract: Feature vector construction techniques are described. In one or more implementations, an input is received at a computing device that describes a graph query that specifies one of a plurality of entities to be used to query a knowledge base graph that represents the plurality of entities. A feature vector is constructed, by the computing device, having a number of indicator variables, each of which indicates observance of a sub-graph feature represented by a respective indicator variable in the knowledge base graph.
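A minimal sketch of an indicator-variable feature vector follows. It assumes the knowledge base graph is a set of (subject, relation, object) triples and that each sub-graph feature is a single (relation, object) pattern; both representations, and the name `build_feature_vector`, are illustrative assumptions rather than details from the application.

```python
def build_feature_vector(graph, entity, subgraph_features):
    """Return one 0/1 indicator per candidate sub-graph feature.

    graph: set of (subject, relation, object) triples.
    subgraph_features: list of (relation, object) patterns to look for
    around the queried entity (a hypothetical, simplified feature form).
    """
    return [1 if (entity, rel, obj) in graph else 0
            for rel, obj in subgraph_features]


knowledge_base = {
    ("Paris", "capital_of", "France"),
    ("Paris", "type", "City"),
    ("Berlin", "type", "City"),
}
features = [("type", "City"), ("capital_of", "France"), ("capital_of", "Germany")]

paris_vector = build_feature_vector(knowledge_base, "Paris", features)
berlin_vector = build_feature_vector(knowledge_base, "Berlin", features)
```

Each position in the vector corresponds to one feature, so vectors for different entities are directly comparable.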

    INFORMATION PROPAGATION PROBABILITY FOR A SOCIAL NETWORK
    13.
    Patent application
    Legal status: Active

    Publication number: US20120158630A1

    Publication date: 2012-06-21

    Application number: US12971191

    Filing date: 2010-12-17

    IPC classification: G06N3/00 G06F15/173

    Abstract: One or more techniques and/or systems are disclosed for predicting propagation of a message on a social network. A predictive model is trained to determine a probability of propagation of information on the social network using both positive and negative information propagation feedback, which may be collected while monitoring the social network over a desired period of time for information propagation. A particular message can be input to the predictive model, which can determine a probability of propagation of the message on the social network, such as how many connections may receive at least a portion of the message and/or a likelihood of at least a portion of the message reaching respective connections in the social network.
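The abstract does not name a model family, so the following is only a sketch under an assumption: a small logistic regression trained by gradient descent on positive (propagated) and negative (not propagated) examples. The two features and all data values are invented for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, labels, steps=2000, lr=0.5):
    """Fit logistic-regression weights with batch gradient descent."""
    n_features = len(examples[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(examples, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += error
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
        bias -= lr * grad_b / len(examples)
        weights = [w - lr * g / len(examples) for w, g in zip(weights, grad_w)]
    return weights, bias


def propagation_probability(message, weights, bias):
    """Estimated probability that a message with these features propagates."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, message)))


# Hypothetical features: [contains_link, normalized_follower_count]
observed = [[1, 1.0], [1, 0.8], [0, 0.2], [0, 0.1]]
propagated = [1, 1, 0, 0]  # positive and negative feedback

weights, bias = train(observed, propagated)
p_likely = propagation_probability([1, 1.0], weights, bias)
p_unlikely = propagation_probability([0, 0.1], weights, bias)
```

Using negative examples alongside positive ones is what lets the model assign low probabilities as well as high ones.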

    HUMAN-ASSISTED TRAINING OF AUTOMATED CLASSIFIERS
    14.
    Patent application
    Legal status: Active

    Publication number: US20120158620A1

    Publication date: 2012-06-21

    Application number: US12970158

    Filing date: 2010-12-16

    IPC classification: G06F15/18

    CPC classification: G06N99/005 G06N3/08

    Abstract: Many computing scenarios involve the classification of content items within one or more categories. The content item set may be too large for humans to classify, but an automated classifier (e.g., an artificial neural network) may not be able to classify all content items with acceptable accuracy. Instead, the automated classifier may calculate a classification confidence while classifying respective content items. Content items having a low classification confidence may be sent to a human classifier, and may be added, along with the categories identified by the human classifier, to a training set. The automated classifier may then be retrained using the training set, thereby incrementally improving the classification confidence of the automated classifier while conserving the involvement of human classifiers. Additionally, human classifiers may be rewarded for classifying the content items, and the costs of such rewards may be considered while selecting content items for the training set.
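The routing step can be sketched as follows. The threshold value, the toy classifier, and the stand-in human labeler are all hypothetical; retraining and the reward-cost consideration mentioned in the abstract are left out.

```python
def route_items(items, classify, threshold=0.8):
    """Split items into automatically labelled results and a human-review queue.

    classify(item) must return (label, confidence); items whose confidence
    falls below the threshold are queued for a human classifier.
    """
    auto, needs_human = [], []
    for item in items:
        label, confidence = classify(item)
        target = auto if confidence >= threshold else needs_human
        target.append((item, label, confidence))
    return auto, needs_human


def toy_classifier(text):
    # Hypothetical automated classifier returning (label, confidence).
    if "refund" in text:
        return "billing", 0.95
    return "other", 0.55


def human_labeler(text):
    # Stand-in for a person choosing the category.
    return "shipping"


items = ["refund please", "where is my parcel"]
auto, queue = route_items(items, toy_classifier)

# Human-provided labels become new training examples for the next retraining.
training_set = [(item, human_labeler(item)) for item, _, _ in queue]
```

Only the low-confidence item reaches the human, which is how the scheme conserves human effort.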

    Parallelization of Online Learning Algorithms
    15.
    Patent application
    Legal status: Active

    Publication number: US20110320767A1

    Publication date: 2011-12-29

    Application number: US12822918

    Filing date: 2010-06-24

    IPC classification: G06F15/76 G06F9/02

    CPC classification: G06N99/005

    Abstract: Methods, systems, and media are provided for a dynamic batch strategy utilized in parallelization of online learning algorithms. The dynamic batch strategy provides a merge function on the basis of a threshold level difference between the original model state and an updated model state, rather than according to a constant or pre-determined batch size. The merging includes reading a batch of incoming streaming data, retrieving any missing model beliefs from partner processors, and training on the batch of incoming streaming data. The steps of reading, retrieving, and training are repeated until the measured difference in states exceeds a set threshold level. The measured differences which exceed the threshold level are merged for each of the plurality of processors according to attributes. The merged differences which exceed the threshold level are combined with the original partial model states to obtain an updated global model state.
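The threshold-driven loop on a single processor can be sketched as below. The model state is assumed to be a dictionary of weights, the update rule and distance measure are invented placeholders, and the cross-processor retrieval and merge steps are omitted.

```python
def train_until_drift(stream, state, update, distance, threshold):
    """Consume batches until the local state drifts past the threshold.

    Returns (updated_state, batches_consumed). The caller would then merge
    the difference with partner processors (not shown here).
    """
    original = dict(state)
    batches = 0
    for batch in stream:
        state = update(state, batch)
        batches += 1
        if distance(original, state) > threshold:
            break
    return state, batches


def update(state, batch):
    # Hypothetical training step: add each observed delta to its weight.
    new_state = dict(state)
    for key, delta in batch:
        new_state[key] = new_state.get(key, 0.0) + delta
    return new_state


def distance(a, b):
    # Largest absolute change in any weight (one possible difference measure).
    keys = set(a) | set(b)
    return max(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys)


stream = iter([[("w1", 0.2)]] * 4)
state, used = train_until_drift(stream, {"w1": 0.0}, update, distance, threshold=0.5)
```

The number of batches consumed before merging depends on how fast the state changes, not on a fixed batch size.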

    Machine Learning Using Relational Databases
    16.
    Patent application
    Legal status: Active

    Publication number: US20110066577A1

    Publication date: 2011-03-17

    Application number: US12559921

    Filing date: 2009-09-15

    IPC classification: G06F15/18 G06N5/04 G06F17/30

    CPC classification: G06N99/005

    Abstract: Machine learning using relational databases is described. In an embodiment, a model of a probabilistic relational database is formed by augmenting relation schemas of a relational database with probabilistic attributes. In an example, the model comprises constraints introduced by linking the probabilistic attributes using factor statements. For example, a compiler translates the model into a factor graph data structure which may be passed to an inference engine to carry out machine learning. For example, this enables machine learning to be integrated with the data and it is not necessary to pre-process or reformat large scale data sets for a particular problem domain. In an embodiment, a machine learning system for estimating skills of players in an online gaming environment is provided. In another example, a machine learning system for data mining of medical data is provided. In some examples, missing attribute values are filled using machine learning results.

    Reward-driven adaptive agents for video games
    17.
    Granted patent
    Legal status: Active

    Publication number: US07837543B2

    Publication date: 2010-11-23

    Application number: US10837415

    Filing date: 2004-04-30

    Abstract: Adaptive agents are driven by rewards they receive based on the outcome of their behavior during actual game play. Accordingly, the adaptive agents are able to learn from experience within the gaming environment. Reward-driven adaptive agents can be trained at either or both of game-time or development time. Computer-controlled agents receive rewards (either positive or negative) at individual action intervals based on the effectiveness of the agents' actions (e.g., compliance with defined goals). The adaptive computer-controlled agent is motivated to perform actions that maximize its positive rewards and minimize its negative rewards.
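As a minimal sketch of reward-driven adaptation, the agent below keeps one running value estimate per action and prefers the action with the highest estimate. This stateless, bandit-style simplification, the class name, and the reward values are assumptions for illustration; the patent's actual learning method is not described in the abstract.

```python
import random


class RewardDrivenAgent:
    def __init__(self, actions, learning_rate=0.5, epsilon=0.1, seed=0):
        self.values = {action: 0.0 for action in actions}  # value estimate per action
        self.learning_rate = learning_rate
        self.epsilon = epsilon  # probability of trying a random action
        self.rng = random.Random(seed)

    def choose(self):
        """Explore with probability epsilon, otherwise take the best-valued action."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def learn(self, action, reward):
        """Move the action's value estimate toward the reward just received."""
        self.values[action] += self.learning_rate * (reward - self.values[action])


# Hypothetical rewards: positive for progress toward a goal, negative otherwise.
rewards = {"accelerate": 1.0, "brake": -1.0}
agent = RewardDrivenAgent(["brake", "accelerate"], epsilon=0.0)
for _ in range(10):
    for action, reward in rewards.items():
        agent.learn(action, reward)
preferred = agent.choose()
```

With exploration switched off (`epsilon=0.0`), the agent settles on the action that earned positive rewards.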

    Scalable Clustering
    18.
    Patent application
    Legal status: Active

    Publication number: US20100262568A1

    Publication date: 2010-10-14

    Application number: US12421853

    Filing date: 2009-04-10

    IPC classification: G06N5/02 G06F15/18

    CPC classification: G06N99/005 G06K9/6226

    Abstract: A scalable clustering system is described. In an embodiment, the clustering system is operable for extremely large scale applications where millions of items having tens of millions of features are clustered. In an embodiment, the clustering system uses a probabilistic cluster model which models uncertainty in the data set, where the data set may be, for example, advertisements which are subscribed to keywords, text documents containing text keywords, images having associated features, or other items. In an embodiment, the clustering system is used to generate additional features for associating with a given item. For example, additional keywords are suggested which an advertiser may like to subscribe to. The additional features that are generated have associated probability values which may be used to rank those features in some embodiments. User feedback about the generated features is received and used to revise the feature generation process in some examples.
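The feature-suggestion step can be sketched under one assumption: an item's probability of having a keyword is the sum, over clusters, of the item's cluster membership probability times the keyword's probability within that cluster. The function name and all numbers are invented for illustration; the clustering itself is not shown.

```python
def suggest_keywords(cluster_posterior, keyword_prob, existing, top_k=2):
    """Rank keywords the item does not yet have by their mixture probability.

    cluster_posterior: {cluster: P(cluster | item)}
    keyword_prob: {cluster: {keyword: P(keyword | cluster)}}
    existing: keywords the item already subscribes to.
    """
    scores = {}
    for cluster, p_cluster in cluster_posterior.items():
        for keyword, p_keyword in keyword_prob[cluster].items():
            if keyword not in existing:
                scores[keyword] = scores.get(keyword, 0.0) + p_cluster * p_keyword
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Hypothetical advertisement that most likely belongs to a "shoes" cluster.
posterior = {"shoes": 0.8, "travel": 0.2}
keyword_prob = {
    "shoes": {"sneakers": 0.9, "boots": 0.6, "flights": 0.0},
    "travel": {"sneakers": 0.1, "boots": 0.1, "flights": 0.9},
}
suggestions = suggest_keywords(posterior, keyword_prob, existing={"sneakers"})
```

Each suggestion carries its probability, so the list can be ranked or cut off at a chosen value.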

    Mixture model for motion lines in a virtual reality environment
    19.
    Granted patent
    Legal status: Active

    Publication number: US07525546B2

    Publication date: 2009-04-28

    Application number: US12028012

    Filing date: 2008-02-08

    IPC classification: G06T15/70 A63F9/14

    Abstract: Improved human-like realism of computer opponents in racing or motion-related games is provided by using a mixture model to determine a dynamically prescribed racing line that the AI driver is to follow for a given segment of the race track. This dynamically prescribed racing line may vary from segment to segment and lap to lap, roughly following an ideal line with some variation. As such, the AI driver does not appear to statically follow the ideal line perfectly throughout the race. Instead, within each segment of the course, the AI driver's path may smoothly follow a probabilistically-determined racing line defined relative to at least one prescribed racing line.
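One way to picture the per-segment variation is to blend two prescribed lines with a randomly drawn mixture weight. The sketch below assumes each line is a list of lateral offsets (one per track segment) and draws the weight from a clipped half-Gaussian so the result stays close to the ideal line; these choices, and the omission of smoothing between segments, are simplifications made for illustration.

```python
import random


def sample_racing_line(ideal, alternate, rng, spread=0.2):
    """Blend two prescribed lines segment by segment.

    A weight of 0 reproduces the ideal line and a weight of 1 the alternate
    line; weights are drawn near 0 so the result roughly follows the ideal.
    """
    line = []
    for ideal_offset, alt_offset in zip(ideal, alternate):
        weight = min(1.0, abs(rng.gauss(0.0, spread)))
        line.append((1.0 - weight) * ideal_offset + weight * alt_offset)
    return line


# Hypothetical lateral offsets (in metres from the track centre) per segment.
ideal = [0.0, 0.5, 1.0, 0.5]
alternate = [1.0, 1.5, 0.0, -0.5]

lap_one = sample_racing_line(ideal, alternate, random.Random(1))
lap_two = sample_racing_line(ideal, alternate, random.Random(2))
```

Drawing fresh weights for every lap gives the lap-to-lap variation the abstract describes.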

    Bayesian scoring
    20.
    Granted patent
    Legal status: Active

    Publication number: US07376474B2

    Publication date: 2008-05-20

    Application number: US11276184

    Filing date: 2006-02-16

    IPC classification: G06F19/00

    CPC classification: G06Q10/06 A63B71/06 G09B7/02

    Abstract: Players in a gaming environment, particularly, electronic on-line gaming environments, may be scored relative to each other or to a predetermined scoring system. The scoring of each player may be based on the outcomes of games between players who compete against each other in one or more teams of one or more players. Each player's score may be represented as a distribution over potential scores which may indicate a confidence level in the distribution representing the player's score. The score distribution for each player may be modeled with a Gaussian distribution and may be determined through a Bayesian inference algorithm. The scoring may be used to track a player's progress and/or standing within the gaming environment, used in a leaderboard indication of rank, and/or may be used to match players with each other in a future game.
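The abstract states only that scores are Gaussian and inferred by Bayesian methods. As a sketch under that description, the code below applies a standard truncated-Gaussian moment-matching update for a single two-player game with a clear winner. The performance-noise parameter `beta`, the starting values, and the omission of draws and teams are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def update_scores(winner, loser, beta=25 / 6):
    """Update (mean, standard deviation) score pairs after one win/loss game."""
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser
    # Total uncertainty about the performance difference.
    c = sqrt(2 * beta**2 + sigma_w**2 + sigma_l**2)
    t = (mu_w - mu_l) / c
    # Mean and variance correction terms for a truncated Gaussian.
    v = STANDARD_NORMAL.pdf(t) / STANDARD_NORMAL.cdf(t)
    w = v * (v + t)
    new_winner = (mu_w + sigma_w**2 / c * v,
                  sigma_w * sqrt(1 - sigma_w**2 / c**2 * w))
    new_loser = (mu_l - sigma_l**2 / c * v,
                 sigma_l * sqrt(1 - sigma_l**2 / c**2 * w))
    return new_winner, new_loser


# Two new players with the same hypothetical prior score distribution.
prior = (25.0, 25 / 3)
alice, bob = update_scores(winner=prior, loser=prior)
```

After the game the winner's mean rises, the loser's falls, and both standard deviations shrink, reflecting increased confidence in each score.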
