Semantic search via role labeling
    1.
    Granted patent
    Semantic search via role labeling (legal status: in force)

    Publication No.: US08392436B2

    Publication date: 2013-03-05

    Application No.: US12364041

    Filing date: 2009-02-02

    IPC classification: G06F7/00 G06F17/00

    CPC classification: G06F17/30654

    Abstract: A method and a system for searching for information contained in a database of documents each include an offline part and an online part. The offline part includes predicting, in a first computer process, semantic data for sentences of the documents contained in the database and storing this data in a database. The online part includes querying the database for information with a semantically-sensitive query, predicting, in a real-time computer process, semantic data for the query, and determining, in a second computer process, a matching score against all the documents in the database, the score incorporating the semantic data for both the sentences and the query.

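    The offline/online split described in this abstract can be sketched roughly as follows. This is a toy illustration, not the patented method: `predict_roles` is a hypothetical rule-based stand-in for the learned semantic predictor, the role names are invented, and the matching score (number of shared role/word pairs) is an assumption chosen only to show how semantic data can enter the score.

```python
def predict_roles(sentence):
    """Hypothetical stand-in for the learned semantic role predictor.

    Assumes a simple "agent verb patient" sentence; a real system
    would call a trained model here.
    """
    words = sentence.lower().rstrip(".?!").split()
    return {("AGENT", words[0]), ("VERB", words[1]), ("PATIENT", words[-1])}


def build_index(documents):
    """Offline part: predict and store semantic data for every sentence."""
    return {
        doc_id: [predict_roles(s) for s in sentences]
        for doc_id, sentences in documents.items()
    }


def search(index, query):
    """Online part: predict roles for the query, then score every document."""
    query_roles = predict_roles(query)
    scores = {
        doc_id: max(len(query_roles & roles) for roles in sentence_roles)
        for doc_id, sentence_roles in index.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


documents = {"a": ["dog bites man"], "b": ["man bites dog"]}
index = build_index(documents)
results = search(index, "dog bites man")
print(results)
```

    Both toy documents contain the same words, so a bag-of-words score could not separate them; the role-aware score ranks document "a" first because agent and patient match the query.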

    Large Scale Manifold Transduction
    2.
    Patent application
    Large Scale Manifold Transduction (legal status: in force)

    Publication No.: US20090204556A1

    Publication date: 2009-08-13

    Application No.: US12364059

    Filing date: 2009-02-02

    IPC classification: G06F15/18

    CPC classification: G06K9/6276

    Abstract: A method for training a learning machine for use in discriminative classification and regression includes randomly selecting, in a first computer process, an unclassified datapoint associated with a phenomenon of interest; determining, in a second computer process, a set of datapoints associated with the phenomenon of interest that is likely to be in the same class as the selected unclassified datapoint; predicting, in a third computer process, a class label for the selected unclassified datapoint; predicting, in a fourth computer process, a class label for the set of datapoints; combining, in a fifth computer process, the predicted class labels to predict a composite class label that describes the selected unclassified datapoint and the set of datapoints; and using, in a sixth computer process, the composite class label to adjust at least one parameter of the learning machine.

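    The six steps listed in this abstract can be sketched as a single update on a linear classifier. Several choices here are assumptions for illustration and are not taken from the patent: the "likely same class" set is taken to be the k nearest neighbours, the composite label is the sign of the summed outputs, and the parameter update is a hinge-loss gradient step. All function names are hypothetical.

```python
import random


def output(w, b, x):
    """Linear decision function w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def nearest(points, i, k):
    """Indices of the k points closest to points[i] (squared Euclidean distance)."""
    def dist(j):
        return sum((a - c) ** 2 for a, c in zip(points[i], points[j]))
    return sorted((j for j in range(len(points)) if j != i), key=dist)[:k]


def transduction_step(w, b, unlabeled, rng, k=2, lr=0.1):
    i = rng.randrange(len(unlabeled))                      # 1. random unlabeled point
    group = [i] + nearest(unlabeled, i, k)                 # 2. points likely in its class
    outputs = [output(w, b, unlabeled[j]) for j in group]  # 3-4. predictions for both
    label = 1.0 if sum(outputs) >= 0 else -1.0             # 5. composite label
    for j in group:                                        # 6. adjust parameters
        if label * output(w, b, unlabeled[j]) < 1:         #    (hinge-loss step)
            w = [wi + lr * label * xi for wi, xi in zip(w, unlabeled[j])]
            b += lr * label
    return w, b


# Two toy clusters; the small initial weight stands in for prior training.
points = [(2.0, 0.0), (2.2, 0.1), (1.9, -0.1), (-2.0, 0.0), (-2.1, 0.1), (-1.8, -0.1)]
w, b = [0.1, 0.0], 0.0
rng = random.Random(0)
for _ in range(50):
    w, b = transduction_step(w, b, points, rng)
signs = [output(w, b, p) > 0 for p in points]
print(signs)
```

    Each step pushes a whole neighbourhood toward one side of the decision boundary, so points in the same cluster end up with the same predicted class.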

    Fast semantic extraction using a neural network architecture
    3.
    Granted patent
    Fast semantic extraction using a neural network architecture (legal status: in force)

    Publication No.: US08180633B2

    Publication date: 2012-05-15

    Application No.: US12039965

    Filing date: 2008-02-29

    IPC classification: G10L15/16

    CPC classification: G06F17/2785

    Abstract: A system and method for semantic extraction using a neural network architecture includes indexing each word in an input sentence into a dictionary and using these indices to map each word to a d-dimensional vector (the features of which are learned). In addition, position information for a word of interest (the word to be labeled) and a verb of interest (the verb for which the semantic role is being predicted) with respect to a given word is also used. These positions are integrated by employing a linear layer that is adapted to the input sentence. Several linear transformations and squashing functions are then applied to output class probabilities for semantic role labels. All the weights for the whole architecture are trained by backpropagation.

    Deep neural networks and methods for using same
    4.
    Granted patent
    Deep neural networks and methods for using same (legal status: in force)

    Publication No.: US08504361B2

    Publication date: 2013-08-06

    Application No.: US12367788

    Filing date: 2009-02-09

    IPC classification: G10L15/16

    CPC classification: G06F17/277

    Abstract: A method and system for labeling a selected word of a sentence using a deep neural network includes, in one exemplary embodiment, determining an index term corresponding to each feature of the word, transforming the index term or terms of the word into a vector, and predicting a label for the word using the vector. The method and system, in another exemplary embodiment, includes determining, for each word in the sentence, an index term corresponding to each feature of the word, transforming the index term or terms of each word in the sentence into a vector, applying a convolution operation to the vector of the selected word and at least one of the vectors of the other words in the sentence, to transform the vectors into a matrix of vectors, each of the vectors in the matrix including a plurality of row values, constructing a single vector from the vectors in the matrix, and predicting a label for the selected word using the single vector.

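    The second embodiment in this abstract (index terms, vectors, convolution, single vector, label) can be sketched as a forward pass. The sketch below makes several assumptions that the abstract does not state: the two features per word are its dictionary index and its clipped distance to the selected word, the single vector is built by taking the maximum of each row over all positions, and the weights are random rather than trained. The vocabulary, label names, and sizes are hypothetical.

```python
import random

random.seed(0)
WORD_DIM, DIST_DIM, HIDDEN, WINDOW, MAX_DIST = 4, 2, 3, 3, 2
VOCAB = {"the": 0, "cat": 1, "sat": 2}
LABELS = ["ARG0", "PRED", "OTHER"]


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


word_table = rand_matrix(len(VOCAB), WORD_DIM)       # word index -> vector
dist_table = rand_matrix(2 * MAX_DIST + 1, DIST_DIM)  # distance index -> vector
conv_w = rand_matrix(HIDDEN, WINDOW * (WORD_DIM + DIST_DIM))
out_w = rand_matrix(len(LABELS), HIDDEN)


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def word_vector(word, position, selected):
    """Look up one vector per feature and concatenate them."""
    dist = max(-MAX_DIST, min(MAX_DIST, position - selected)) + MAX_DIST
    return word_table[VOCAB[word]] + dist_table[dist]


def predict_label(sentence, selected):
    vectors = [word_vector(w, t, selected) for t, w in enumerate(sentence)]
    pad = [[0.0] * (WORD_DIM + DIST_DIM)] * (WINDOW // 2)
    padded = pad + vectors + pad
    # Convolution: the same weights are applied to every window of words,
    # giving one column of HIDDEN row values per position.
    matrix = [
        matvec(conv_w, [x for v in padded[t:t + WINDOW] for x in v])
        for t in range(len(vectors))
    ]
    # Build a single vector: maximum of each row over all positions.
    single = [max(column[k] for column in matrix) for k in range(HIDDEN)]
    scores = matvec(out_w, single)
    return LABELS[scores.index(max(scores))]


label = predict_label(["the", "cat", "sat"], selected=1)
print(label)
```

    Because the weights are untrained, the printed label is arbitrary; the point is only the shape of the computation, where a variable-length sentence is reduced to one fixed-size vector before the label is predicted.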

    SUPERVISED SEMANTIC INDEXING AND ITS EXTENSIONS
    5.
    Patent application
    SUPERVISED SEMANTIC INDEXING AND ITS EXTENSIONS (legal status: in force)

    Publication No.: US20100179933A1

    Publication date: 2010-07-15

    Application No.: US12562802

    Filing date: 2009-09-18

    IPC classification: G06F17/30 G06F15/18

    CPC classification: G06F17/30663 G06F17/30616

    Abstract: A system and method for determining a similarity between a document and a query includes building a weight vector for each of a plurality of documents in a corpus of documents stored in memory and building a weight vector for a query input into a document retrieval system. A weight matrix is generated which distinguishes between relevant documents and lower ranked documents by comparing document/query tuples using a gradient step approach. A similarity score is determined between weight vectors of the query and documents in a corpus by determining a product of a document weight vector, a query weight vector and the weight matrix.

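    The similarity score in this abstract, a product of the query weight vector, the weight matrix, and the document weight vector, can be written as q^T W d. A minimal sketch follows. The abstract only says that the matrix is learned by comparing tuples with a gradient step, so the specific margin ranking loss used below is an assumption, as are the function names, the learning rate, and the toy vectors.

```python
def score(q, W, d):
    """Similarity as the product q^T W d."""
    return sum(q[i] * W[i][j] * d[j] for i in range(len(q)) for j in range(len(d)))


def gradient_step(W, q, d_pos, d_neg, lr=0.1, margin=1.0):
    """One step on the assumed loss max(0, margin - f(q, d_pos) + f(q, d_neg))."""
    if margin - score(q, W, d_pos) + score(q, W, d_neg) > 0:
        for i in range(len(q)):
            for j in range(len(d_pos)):
                W[i][j] += lr * q[i] * (d_pos[j] - d_neg[j])
    return W


# Toy vocabulary of three terms; W starts as the identity matrix,
# which makes the initial score a plain dot product.
W = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
q = [1.0, 0.0, 0.0]
d_pos = [0.0, 1.0, 0.0]  # relevant document, shares no term with the query
d_neg = [0.0, 0.0, 1.0]  # lower ranked document

before = score(q, W, d_pos) - score(q, W, d_neg)
for _ in range(20):
    gradient_step(W, q, d_pos, d_neg)
after = score(q, W, d_pos) - score(q, W, d_neg)
print(before, after)
```

    With the identity matrix neither document matches the query, so the two scores tie. After training on the tuple, the off-diagonal entries of W link the query term to the term in the relevant document, which is how such a matrix can capture relationships between different words.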

    Deep Neural Networks and Methods for Using Same
    6.
    Patent application
    Deep Neural Networks and Methods for Using Same (legal status: in force)

    Publication No.: US20090210218A1

    Publication date: 2009-08-20

    Application No.: US12367788

    Filing date: 2009-02-09

    IPC classification: G06F17/27

    CPC classification: G06F17/277

    Abstract: A method and system for labeling a selected word of a sentence using a deep neural network includes, in one exemplary embodiment, determining an index term corresponding to each feature of the word, transforming the index term or terms of the word into a vector, and predicting a label for the word using the vector. The method and system, in another exemplary embodiment, includes determining, for each word in the sentence, an index term corresponding to each feature of the word, transforming the index term or terms of each word in the sentence into a vector, applying a convolution operation to the vector of the selected word and at least one of the vectors of the other words in the sentence, to transform the vectors into a matrix of vectors, each of the vectors in the matrix including a plurality of row values, constructing a single vector from the vectors in the matrix, and predicting a label for the selected word using the single vector.

    FAST SEMANTIC EXTRACTION USING A NEURAL NETWORK ARCHITECTURE
    7.
    Patent application
    FAST SEMANTIC EXTRACTION USING A NEURAL NETWORK ARCHITECTURE (legal status: in force)

    Publication No.: US20080221878A1

    Publication date: 2008-09-11

    Application No.: US12039965

    Filing date: 2008-02-29

    IPC classification: G10L15/16

    CPC classification: G06F17/2785

    Abstract: A system and method for semantic extraction using a neural network architecture includes indexing each word in an input sentence into a dictionary and using these indices to map each word to a d-dimensional vector (the features of which are learned). In addition, position information for a word of interest (the word to be labeled) and a verb of interest (the verb for which the semantic role is being predicted) with respect to a given word is also used. These positions are integrated by employing a linear layer that is adapted to the input sentence. Several linear transformations and squashing functions are then applied to output class probabilities for semantic role labels. All the weights for the whole architecture are trained by backpropagation.

    METHOD FOR TRAINING A LEARNING MACHINE HAVING A DEEP MULTI-LAYERED NETWORK WITH LABELED AND UNLABELED TRAINING DATA
    8.
    Patent application
    METHOD FOR TRAINING A LEARNING MACHINE HAVING A DEEP MULTI-LAYERED NETWORK WITH LABELED AND UNLABELED TRAINING DATA (legal status: in force)

    Publication No.: US20090204558A1

    Publication date: 2009-08-13

    Application No.: US12367278

    Filing date: 2009-02-06

    IPC classification: G06F15/18

    CPC classification: G06N3/08 G06K9/6251

    Abstract: A method for training a learning machine having a deep network with a plurality of layers includes applying a regularizer to one or more of the layers of the deep network; training the regularizer with unlabeled data; and training the deep network with labeled data. Also disclosed is an apparatus for use in discriminative classification and regression, including an input device for inputting unlabeled and labeled data associated with a phenomenon of interest; a processor; and a memory communicating with the processor. The memory includes instructions executable by the processor for implementing a learning machine having a deep network structure and training the learning machine by applying a regularizer to one or more of the layers of the deep network; training the regularizer with unlabeled data; and training the deep network with labeled data.

    Method and Apparatus for Transductive Support Vector Machines
    9.
    Patent application
    Method and Apparatus for Transductive Support Vector Machines (legal status: in force)

    Publication No.: US20070265991A1

    Publication date: 2007-11-15

    Application No.: US11688928

    Filing date: 2007-03-21

    IPC classification: G06F15/18

    CPC classification: G06K9/6269 G06K9/6287

    Abstract: Disclosed is a method for training a transductive support vector machine. The support vector machine is trained based on labeled training data and unlabeled test data. A non-convex objective function which optimizes a hyperplane classifier for classifying the unlabeled test data is decomposed into a convex function and a concave function. A local approximation of the concave function at a hyperplane is calculated, and the approximation of the concave function is combined with the convex function such that the result is a convex problem. The convex problem is then solved to determine an updated hyperplane. This method is performed iteratively until the solution converges.

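    The loop in this abstract (decompose, linearize the concave part, solve the convex problem, repeat) is the pattern often called the concave-convex procedure. The sketch below applies that loop to a toy one-dimensional objective, not to a support vector machine: f(x) = x^4 - 2x^2 is split into the convex part x^4 and the concave part -2x^2. The objective, function names, and tolerance are all assumptions chosen so that the convex subproblem has a closed-form solution.

```python
import math


def objective(x):
    """Non-convex toy objective: convex part x**4 plus concave part -2*x**2."""
    return x ** 4 - 2.0 * x ** 2


def concave_convex(x, tol=1e-10, max_iter=1000):
    for _ in range(max_iter):
        # Local (linear) approximation of the concave part at the current x:
        # its slope is the derivative of -2*x**2, that is, -4*x.
        slope = -4.0 * x
        # Convex problem: minimise x**4 + slope * x.
        # Setting the derivative 4*x**3 + slope to zero gives a cube root.
        target = -slope / 4.0
        new_x = math.copysign(abs(target) ** (1.0 / 3.0), target)
        if abs(new_x - x) < tol:  # solution has converged
            return new_x
        x = new_x
    return x


x_pos = concave_convex(0.2)
x_neg = concave_convex(-0.2)
print(x_pos, x_neg, objective(x_pos))
```

    Starting from 0.2 the iterates climb toward the local minimum at x = 1, and starting from -0.2 they move toward x = -1. Each convex subproblem is an upper bound on the objective that touches it at the current point, so the objective never increases from one iteration to the next.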

    Supervised semantic indexing and its extensions
    10.
    Granted patent
    Supervised semantic indexing and its extensions (legal status: in force)

    Publication No.: US08341095B2

    Publication date: 2012-12-25

    Application No.: US12562802

    Filing date: 2009-09-18

    IPC classification: G06F15/18 G06F7/00 G06F3/00

    CPC classification: G06F17/30663 G06F17/30616

    Abstract: A system and method for determining a similarity between a document and a query includes building a weight vector for each of a plurality of documents in a corpus of documents stored in memory and building a weight vector for a query input into a document retrieval system. A weight matrix is generated which distinguishes between relevant documents and lower ranked documents by comparing document/query tuples using a gradient step approach. A similarity score is determined between weight vectors of the query and documents in a corpus by determining a product of a document weight vector, a query weight vector and the weight matrix.