JOINT APPROACH TO FEATURE AND DOCUMENT LABELING
    1.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20160203209A1

    Publication date: 2016-07-14

    Application number: US14594622

    Filing date: 2015-01-12

    CPC classification number: G06N99/005 G06F17/30707

    Abstract: Documents of a set of documents are represented by bag-of-words (BOW) vectors. L labeled topics are provided, each labeled with a word list comprising words of a vocabulary that are representative of the labeled topic and possibly a list of relevant documents. Probabilistic classification of the documents generates for each labeled topic a document vector whose elements store scores of the documents for the labeled topic and a word vector whose elements store scores of the words of the vocabulary for the labeled topic. Non-negative matrix factorization (NMF) is performed to generate a document-topic model that clusters the documents into k topics where k>L. NMF factors representing L topics of the k topics are initialized to the document and word vectors for the L labeled topics. In some embodiments the NMF factors representing the L topics initialized to the document and word vectors are frozen, that is, are not updated by the NMF after the initialization.
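
The seeding and freezing step described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes a Frobenius-loss NMF with standard multiplicative updates, and the names `seeded_nmf`, `doc_seed`, and `word_seed` are hypothetical. The probabilistic classifier that produces the seed vectors is not shown.

```python
import numpy as np

def seeded_nmf(X, doc_seed, word_seed, k, n_iter=200, freeze=True, eps=1e-9, seed=0):
    """Factor a non-negative BOW matrix X (n_docs x n_words) as W @ H with k topics.

    doc_seed:  (n_docs, L) document scores for the L labeled topics.
    word_seed: (L, n_words) word scores for the L labeled topics.
    The first L columns of W and the first L rows of H are initialized
    from the seeds; with freeze=True they are never updated afterwards.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = X.shape
    L = doc_seed.shape[1]
    assert k > L, "the abstract requires k > L"

    W = rng.random((n_docs, k))
    H = rng.random((k, n_words))
    W[:, :L] = doc_seed
    H[:L, :] = word_seed

    start = L if freeze else 0  # first factor index that NMF may update
    for _ in range(n_iter):
        # Multiplicative updates (assumed here; the abstract does not name a solver).
        H_new = H * (W.T @ X) / (W.T @ W @ H + eps)
        H[start:] = H_new[start:]
        W_new = W * (X @ H.T) / (W @ H @ H.T + eps)
        W[:, start:] = W_new[:, start:]
    return W, H
```

With `freeze=False` the seeded factors serve only as an initialization and are refined along with the remaining k - L topics.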

    OVERLAPPING TRACE NORMS FOR MULTI-VIEW LEARNING
    2.
    Type: Patent application
    Status: Granted

    Publication number: US20160026925A1

    Publication date: 2016-01-28

    Application number: US14339994

    Filing date: 2014-07-24

    CPC classification number: G06N99/005 G06F17/30 G06F17/30601 G06N7/00

    Abstract: In multi-view learning, optimized prediction matrices are determined for V≥2 views of n objects, and a prediction of a view of an object is generated based on the optimized prediction matrix for that view. An objective is optimized over a set of parameters that includes at least the V prediction matrices and a concatenated matrix comprising a concatenation of the prediction matrices. The objective comprises a sum including at least a loss function for each view, a trace norm of the prediction matrix for each view, and a trace norm of the concatenated matrix. The parameter set may further include a sparse matrix for each view, with the objective further including an element-wise norm of the sparse matrix for each view. The objective may further include regularization parameters scaling the trace norms of the prediction matrices and the trace norm of the concatenated matrix.
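
One way to write the objective described in this abstract is sketched below. All symbols are assumptions introduced for illustration, since the listing does not preserve the original notation: X_v is the prediction matrix for view v, X_0 the concatenated matrix, S_v the optional sparse matrix, Y_v the observed data, \ell_v the per-view loss, and \lambda_v, \alpha_v the regularization parameters. The choice of the element-wise L1 norm and the way X_v and S_v enter the loss are also assumptions.

```latex
\mathcal{O}(\Theta) =
  \sum_{v=1}^{V} \ell_v\!\left(Y_v,\; X_v + S_v\right)
  + \sum_{v=1}^{V} \lambda_v \,\lVert X_v \rVert_{*}
  + \lambda_0 \,\lVert X_0 \rVert_{*}
  + \sum_{v=1}^{V} \alpha_v \,\lVert S_v \rVert_{1},
\qquad
X_0 = [\,X_1, \dots, X_V\,],
\quad
\Theta = \{X_0, X_1, \dots, X_V, S_1, \dots, S_V\}
```

Here \lVert \cdot \rVert_{*} is the trace (nuclear) norm. The per-view terms and the term on X_0 overlap because X_0 contains every X_v.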

    Convex collective matrix factorization
    3.
    Type: Granted patent
    Status: Granted

    Publication number: US09058303B2

    Publication date: 2015-06-16

    Application number: US13689955

    Filing date: 2012-11-30

    CPC classification number: G06F17/16 G06F17/30867 G06N5/043

    Abstract: A method operates on observed relationship data between pairs of entities of a set of entities including entities of at least two (and optionally at least three) different entity types. An observed collective symmetric matrix is constructed in which element (n,m)=element (m,n) stores the observed relationship between entities indexed n and m when the observed relationship data includes this observed relationship. A prediction collective symmetric matrix is optimized in order to minimize a loss function comparing the observed collective symmetric matrix and the prediction collective symmetric matrix. A relationship between two entities of the set of entities is predicted using the optimized prediction collective symmetric matrix. Entities of the same entity type may be indexed using a contiguous set of indices such that the entity type maps to a contiguous set of rows and corresponding contiguous set of columns in the observed collective symmetric matrix.
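
The construction of the observed collective symmetric matrix can be sketched as below, assuming each entity type is given a contiguous block of indices as the abstract suggests. The function name, the input layout, and the boolean mask of observed cells are hypothetical, and the optimization of the prediction matrix is not shown.

```python
import numpy as np

def build_collective_matrix(type_sizes, observations):
    """Build the observed collective symmetric matrix.

    type_sizes:   dict mapping entity type -> number of entities; insertion
                  order fixes the contiguous index block of each type.
    observations: iterable of ((type_a, i), (type_b, j), value) triples.
    Returns (M, mask, offsets), where mask marks the observed cells.
    """
    offsets, total = {}, 0
    for entity_type, size in type_sizes.items():
        offsets[entity_type] = total  # first global index of this type
        total += size

    M = np.zeros((total, total))
    mask = np.zeros((total, total), dtype=bool)
    for (ta, i), (tb, j), value in observations:
        n, m = offsets[ta] + i, offsets[tb] + j
        M[n, m] = M[m, n] = value        # element (n,m) = element (m,n)
        mask[n, m] = mask[m, n] = True
    return M, mask, offsets
```

A loss comparing observed and predicted matrices would then be evaluated only on the cells where `mask` is true.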

    Overlapping trace norms for multi-view learning
    4.
    Type: Granted patent
    Status: Granted

    Publication number: US09542654B2

    Publication date: 2017-01-10

    Application number: US14339994

    Filing date: 2014-07-24

    CPC classification number: G06N99/005 G06F17/30 G06F17/30601 G06N7/00

    Abstract: In multi-view learning, optimized prediction matrices are determined for V≥2 views of n objects, and a prediction of a view of an object is generated based on the optimized prediction matrix for that view. An objective is optimized over a set of parameters that includes at least the V prediction matrices and a concatenated matrix comprising a concatenation of the prediction matrices. The objective comprises a sum including at least a loss function for each view, a trace norm of the prediction matrix for each view, and a trace norm of the concatenated matrix. The parameter set may further include a sparse matrix for each view, with the objective further including an element-wise norm of the sparse matrix for each view. The objective may further include regularization parameters scaling the trace norms of the prediction matrices and the trace norm of the concatenated matrix.

    USER PROFILING FOR ESTIMATING PRINTING PERFORMANCE
    5.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20140180651A1

    Publication date: 2014-06-26

    Application number: US13774020

    Filing date: 2013-02-22

    CPC classification number: G16B40/00 G06Q10/06

    Abstract: A computer-implemented system and method compute a reference behavior for a user, such as a new user of a set of shared devices or services. The method includes acquiring usage data for an initial set of users of the devices and extracting features from the usage data. A model is learned with the extracted features for predicting a user role profile for a new user based on features extracted from the new user's usage data. The user role profile associates the user with at least one of a set of roles. A new user's usage data is received and, with the trained model, a user role profile is predicted for the new user based on features extracted from the new user's usage data. A reference behavior is computed for the user based on the predicted user role profile and the reference behaviors for roles in the set of roles.
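
The final step, combining a predicted role profile with per-role reference behaviors, might look like the sketch below. It assumes the role profile is a vector of non-negative role weights and that the user's reference behavior is the profile-weighted average of the role references. The abstract does not specify the combination rule, so this is only one plausible reading; the function name and data are hypothetical.

```python
import numpy as np

def reference_behavior(role_profile, role_references):
    """Combine per-role reference behaviors using a user's role profile.

    role_profile:    length-R vector of non-negative role weights.
    role_references: (R, B) matrix; row r is the reference behavior of role r.
    """
    weights = np.asarray(role_profile, dtype=float)
    weights = weights / weights.sum()  # normalize to a distribution over roles
    return weights @ np.asarray(role_references, dtype=float)

# Hypothetical example: two roles, two behavior measures per role.
profile = [0.5, 0.5]
references = [[100.0, 10.0], [300.0, 30.0]]
user_reference = reference_behavior(profile, references)
```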

    Joint approach to feature and document labeling
    6.

    Publication number: US10055479B2

    Publication date: 2018-08-21

    Application number: US14594622

    Filing date: 2015-01-12

    CPC classification number: G06N20/00 G06F16/353

    Abstract: Documents of a set of documents are represented by bag-of-words (BOW) vectors. L labeled topics are provided, each labeled with a word list comprising words of a vocabulary that are representative of the labeled topic and possibly a list of relevant documents. Probabilistic classification of the documents generates for each labeled topic a document vector whose elements store scores of the documents for the labeled topic and a word vector whose elements store scores of the words of the vocabulary for the labeled topic. Non-negative matrix factorization (NMF) is performed to generate a document-topic model that clusters the documents into k topics where k>L. NMF factors representing L topics of the k topics are initialized to the document and word vectors for the L labeled topics. In some embodiments the NMF factors representing the L topics initialized to the document and word vectors are frozen, that is, are not updated by the NMF after the initialization.

    Spectral diagnostic engine for customer support call center
    7.

    Publication number: US09813555B2

    Publication date: 2017-11-07

    Application number: US14569095

    Filing date: 2014-12-12

    Abstract: A collective matrix is constructed, having a diagnostic sessions dimension and a diagnostic state descriptors dimension. The diagnostic state descriptors dimension includes a plurality of symptom fields, a plurality of root cause fields, and a plurality of solution fields. Collective matrix factorization of the collective matrix is performed to generate a factored collective matrix comprising a sessions factor matrix embedding diagnostic sessions and a descriptors factor matrix embedding diagnostic state descriptors. An in-progress diagnostic session is embedded in the factored collective matrix. A symptom or solution is recommended for evaluation in the in-progress diagnostic session based on the embedding. The diagnostic state descriptors dimension may further include at least one information field storing a representation (for example, a bag-of-words representation) of a semantic description of a problem being diagnosed by the in-progress diagnostic session.

    Transportation network micro-simulation with pre-emptive decomposition
    8.
    Type: Granted patent
    Status: Granted

    Publication number: US09400680B2

    Publication date: 2016-07-26

    Application number: US14532127

    Filing date: 2014-11-04

    CPC classification number: G06N7/005 G06F9/4887 G06F2209/483 G06F2209/484

    Abstract: In a parallel computing method performed by a parallel computing system comprising a plurality of central processing units (CPUs), a main process executes. Tasks are executed in parallel with the main process on CPUs not used in executing the main process. Results of completed tasks are stored in a cache, from which the main process retrieves completed task results when needed. The initiation of task execution is controlled by a priority ranking of tasks based on at least probabilities that task results will be needed by the main process and time limits for executing the tasks. The priority ranking of tasks is from the vantage point of a current execution point in the main process and is updated as the main process executes. An executing task may be pre-empted by a task having higher priority if no idle CPU is available.
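
The priority-driven task initiation with pre-emption can be sketched as below. The scoring rule (probability of being needed divided by the time limit) is an assumption made for illustration, as are the `Task` fields and the `schedule` function. The abstract only states that the ranking depends on at least these two quantities.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    p_needed: float    # estimated probability the main process will need the result
    time_limit: float  # time available before the result would be needed

def priority(task):
    # Hypothetical score: likelier and more urgent tasks rank higher.
    return task.p_needed / max(task.time_limit, 1e-9)

def schedule(pending, running, n_cpus):
    """Decide which tasks to start and which running tasks to pre-empt.

    Intended to be re-run whenever the main process reaches a new execution
    point, since probabilities and time limits change as it advances.
    """
    ranked = sorted(pending, key=priority, reverse=True)
    running = sorted(running, key=priority)  # lowest priority first
    idle = n_cpus - len(running)
    to_start, to_preempt = [], []
    for task in ranked:
        if idle > 0:
            idle -= 1
            to_start.append(task)
        elif running and priority(task) > priority(running[0]):
            # No idle CPU: pre-empt the lowest-priority running task.
            to_preempt.append(running.pop(0))
            to_start.append(task)
        else:
            break
    return to_start, to_preempt
```

Results of completed tasks would be written to a cache keyed by task, which the main process consults before computing a result itself.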

    TRANSPORTATION NETWORK MICRO-SIMULATION PRE-EMPTIVE DECOMPOSITION
    9.
    Type: Patent application
    Status: Granted

    Publication number: US20160124770A1

    Publication date: 2016-05-05

    Application number: US14532127

    Filing date: 2014-11-04

    CPC classification number: G06N7/005 G06F9/4887 G06F2209/483 G06F2209/484

    Abstract: In a parallel computing method performed by a parallel computing system comprising a plurality of central processing units (CPUs), a main process executes. Tasks are executed in parallel with the main process on CPUs not used in executing the main process. Results of completed tasks are stored in a cache, from which the main process retrieves completed task results when needed. The initiation of task execution is controlled by a priority ranking of tasks based on at least probabilities that task results will be needed by the main process and time limits for executing the tasks. The priority ranking of tasks is from the vantage point of a current execution point in the main process and is updated as the main process executes. An executing task may be pre-empted by a task having higher priority if no idle CPU is available.

    PROBABILISTIC RELATIONAL DATA ANALYSIS
    10.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20140156231A1

    Publication date: 2014-06-05

    Application number: US13690071

    Filing date: 2012-11-30

    CPC classification number: G06F17/18 G06N7/005

    Abstract: A multi-relational data set is represented by a probabilistic multi-relational data model in which each entity of the multi-relational data set is represented by a D-dimensional latent feature vector. The probabilistic multi-relational data model is trained using a collection of observations of relations between entities of the multi-relational data set. The collection of observations includes observations of at least two different relation types. A prediction is generated for an observation of a relation between two or more entities of the multi-relational data set based on a dot product of the optimized D-dimensional latent feature vectors representing the two or more entities. The training may comprise optimizing the D-dimensional latent feature vectors to maximize likelihood of the collection of observations, for example by Bayesian inference performed using Gibbs sampling.
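
The prediction step can be illustrated with a short sketch. For two entities it is the ordinary dot product of their latent feature vectors. For three or more it uses the element-wise product summed over the D dimensions, which is an assumed generalization that the abstract does not spell out. The function name is hypothetical, and training of the vectors (for example by Gibbs sampling) is not shown.

```python
import numpy as np

def predict_relation(*latent_vectors):
    """Score a relation among entities from their D-dimensional latent vectors.

    Two vectors give the usual dot product; more vectors give the sum over
    dimensions of the element-wise product (assumed multi-way form).
    """
    stacked = np.stack([np.asarray(v, dtype=float) for v in latent_vectors])
    return float(np.prod(stacked, axis=0).sum())
```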
