Patent search: ap:("Jingrui He" OR "Richard D. Lawrence" OR "Prem Melville" OR "Vikas Sindhwani" OR "Vijil E. Chenthamarakshan") AND inv:"Richard D. Lawrence" (page 1)

1.

Invention application
SYSTEM AND METHOD FOR AUTOMATED LABELING OF TEXT DOCUMENTS USING ONTOLOGIES (status: under examination, published)

Publication number: US20130018828A1

Publication date: 2013-01-17

Application number: US13619059

Filing date: 2012-09-14

Applicants: Jingrui He, Richard D. Lawrence, Prem Melville, Vikas Sindhwani, Vijil E. Chenthamarakshan

Inventors: Jingrui He, Richard D. Lawrence, Prem Melville, Vikas Sindhwani, Vijil E. Chenthamarakshan

IPC classification: G06N5/02, G06F15/18

CPC classification: G06N5/022, G06F16/353, G06F16/367

Abstract: A first mapping function automatically maps a plurality of documents each with a concept of ontology to create a documents-to-ontology distribution. An ontology-to-class distribution that maps concepts in the ontology to class labels, respectively, is received, and a classifier is generated that labels a selected document with an associated class identified based on the documents-to-ontology distribution and the ontology-to-class distribution.
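
Illustrative sketch (not taken from the patent): one simple way to combine the two distributions named in the abstract is to sum, over ontology concepts, the product of a document-to-concept weight and a concept-to-class weight, and then pick the highest-scoring class. The function name `label_document`, the concept names, the class names, and all probabilities below are hypothetical, and the patent may combine the distributions differently.

```python
def label_document(doc_to_concept, concept_to_class):
    """Score each class as sum_c P(concept c | doc) * P(class | concept c).

    Both arguments are plain dicts; this marginalisation is an assumed
    combination rule, not necessarily the one claimed in the patent.
    """
    scores = {}
    for concept, p_concept in doc_to_concept.items():
        for label, p_label in concept_to_class.get(concept, {}).items():
            scores[label] = scores.get(label, 0.0) + p_concept * p_label
    best = max(scores, key=scores.get)  # class with the highest combined score
    return best, scores


# Hypothetical documents-to-ontology distribution for one document.
doc_to_concept = {"Laptop": 0.7, "Election": 0.3}

# Hypothetical ontology-to-class distribution.
concept_to_class = {
    "Laptop": {"tech": 0.9, "other": 0.1},
    "Election": {"tech": 0.2, "other": 0.8},
}

label, scores = label_document(doc_to_concept, concept_to_class)
print(label, scores)
```

Concepts that have no entry in the ontology-to-class mapping contribute nothing to any class in this sketch.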

2.

Invention application
SYSTEM AND METHOD FOR AUTOMATED LABELING OF TEXT DOCUMENTS USING ONTOLOGIES (status: under examination, published)

Publication number: US20130018827A1

Publication date: 2013-01-17

Application number: US13184156

Filing date: 2011-07-15

Applicants: Jingrui He, Richard D. Lawrence, Prem Melville, Vikas Sindhwani, Vijil E. Chenthamarakshan

Inventors: Jingrui He, Richard D. Lawrence, Prem Melville, Vikas Sindhwani, Vijil E. Chenthamarakshan

IPC classification: G06F15/18, G06N5/02

CPC classification: G06N5/022, G06F16/353, G06F16/367

Abstract: A first mapping function automatically maps a plurality of documents each with a concept of ontology to create a documents-to-ontology distribution. An ontology-to-class distribution that maps concepts in the ontology to class labels, respectively, is received, and a classifier is generated that labels a selected document with an associated class identified based on the documents-to-ontology distribution and the ontology-to-class distribution.

3.

Invention application
INFERRING EMERGING AND EVOLVING TOPICS IN STREAMING TEXT (status: in force)

Publication number: US20130151520A1

Publication date: 2013-06-13

Application number: US13315798

Filing date: 2011-12-09

Applicants: Saha Ankan, Arindam Banerjee, Shiva P. Kasiviswanathan, Richard D. Lawrence, Prem Melville, Vikas Sindhwani, Edison L. Ting

Inventors: Saha Ankan, Arindam Banerjee, Shiva P. Kasiviswanathan, Richard D. Lawrence, Prem Melville, Vikas Sindhwani, Edison L. Ting

IPC classification: G06F17/30

CPC classification: G06F17/2785, G06F17/30619

Abstract: A method, system and computer program product for inferring topic evolution and emergence in a set of documents. In one embodiment, the method comprises forming a group of matrices using text in the documents, and analyzing these matrices to identify a first group of topics as evolving topics and a second group of topics as emerging topics. The matrices includes a first matrix X identifying a multitude of words in each of the documents, a second matrix W identifying a multitude of topics in each of the documents, and a third matrix H identifying a multitude of words for each of the multitude of topics. These matrices are analyzed to identify the evolving and emerging topics. In an embodiment, the documents form a streaming dataset, and two forms of temporal regularizers are used to help identify the evolving topics and the emerging topics in the streaming dataset.
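
Illustrative sketch (not the patented method): the three matrices in the abstract have the shapes of a non-negative matrix factorization, with X (documents by words) approximated by W (documents by topics) times H (topics by words). The code below uses the standard multiplicative-update rule for plain NMF in pure Python. The temporal regularizers and the evolving/emerging distinction mentioned in the abstract are deliberately omitted, and the helper names, the toy matrix, and the choice of two topics are assumptions.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frob_error(X, W, H):
    """Squared Frobenius norm of X - W H."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))


def nmf(X, k, iters=200, seed=0, eps=1e-9):
    """Plain NMF with multiplicative updates (no temporal regularization)."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, X)
        den = matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(X, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H


# Toy document-by-word count matrix: 4 documents, 4 words.
X = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 4, 1],
    [0, 0, 1, 4],
]
W, H = nmf(X, k=2)
print("reconstruction error:", frob_error(X, W, H))
```

In a streaming setting, a factorization like this would be recomputed or updated as new documents arrive; how the patent constrains successive factorizations is not shown here.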

4.

Granted invention patent
Inferring emerging and evolving topics in streaming text (status: in force)

Publication number: US08909643B2

Publication date: 2014-12-09

Application number: US13315798

Filing date: 2011-12-09

Applicants: Saha Ankan, Arindam Banerjee, Shiva P. Kasiviswanathan, Richard D. Lawrence, Prem Melville, Vikas Sindhwani, Edison L. Ting

Inventors: Saha Ankan, Arindam Banerjee, Shiva P. Kasiviswanathan, Richard D. Lawrence, Prem Melville, Vikas Sindhwani, Edison L. Ting

IPC classification: G06F17/30

CPC classification: G06F17/2785, G06F17/30619

Abstract: A method, system and computer program product for inferring topic evolution and emergence in a set of documents. In one embodiment, the method comprises forming a group of matrices using text in the documents, and analyzing these matrices to identify a first group of topics as evolving topics and a second group of topics as emerging topics. The matrices includes a first matrix X identifying a multitude of words in each of the documents, a second matrix W identifying a multitude of topics in each of the documents, and a third matrix H identifying a multitude of words for each of the multitude of topics. These matrices are analyzed to identify the evolving and emerging topics. In an embodiment, the documents form a streaming dataset, and two forms of temporal regularizers are used to help identify the evolving topics and the emerging topics in the streaming dataset.

5.

Invention application
INFERRING EMERGING AND EVOLVING TOPICS IN STREAMING TEXT (status: under examination, published)

Publication number: US20130151525A1

Publication date: 2013-06-13

Application number: US13616403

Filing date: 2012-09-14

Applicants: Saha Ankan, Arindam Banerjee, Shiva P. Kasiviswanathan, Richard D. Lawrence, Prem Melville, Vikas Sindhwani, Edison L. Ting

Inventors: Saha Ankan, Arindam Banerjee, Shiva P. Kasiviswanathan, Richard D. Lawrence, Prem Melville, Vikas Sindhwani, Edison L. Ting

IPC classification: G06F17/30

CPC classification: G06F17/2785, G06F16/316

Abstract: A method, system and computer program product for inferring topic evolution and emergence in a set of documents. In one embodiment, the method comprises forming a group of matrices using text in the documents, and analyzing these matrices to identify evolving topics and emerging topics. The matrices includes a matrix X identifying a multitude of words in each of the documents, a matrix W identifying a multitude of topics in each of the documents, and a matrix H identifying a multitude of words for each of the multitude of topics. These matrices are analyzed to identify the evolving and emerging topics. In an embodiment, two forms of temporal regularizers are used to help identify the evolving and emerging topics. In another embodiment, a two stage approach involving detection and clustering is used to help identify the evolving and emerging topics.

6.

Granted invention patent
System and method for domain adaption with partial observation (status: in force)

Publication number: US08856050B2

Publication date: 2014-10-07

Application number: US13006245

Filing date: 2011-01-13

Applicants: Vijil E. Chenthamarakshan, Richard D. Lawrence, Yan Liu, Dan Zhang

Inventors: Vijil E. Chenthamarakshan, Richard D. Lawrence, Yan Liu, Dan Zhang

IPC classification: G06F15/18

CPC classification: G06N99/005, G06F17/3071

Abstract: A novel domain adaption/transfer learning method applied to the problem of classifying abbreviated documents, e.g., short text messages, instant messages, tweets. The method uses a large number of multi-labeled examples (source domain) to improve the learning on the partial observations (target domain). Specifically, a hidden, higher-level abstraction space is learned that is meaningful for the multi-labeled examples in the source domain. This is done by simultaneously minimizing the document reconstruction error and the error in a classification model learned in the hidden space using known labels from the source domain. The partial observations in the target space are then mapped to the same hidden space, and classified into the label space determined by the source domain.
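
Illustrative sketch (not taken from the patent): the "simultaneous" minimization in the abstract can be pictured as a single objective with two terms. All symbols here are assumptions introduced for illustration: X_s is the source document matrix, Y_s the known source labels, Z the hidden representation, B a basis mapping the hidden space back to words, f_theta a classifier on the hidden space, lambda a trade-off weight, and Omega_t the set of entries actually observed in a target document x_t. The patent's actual formulation may differ.

```latex
% Joint learning on the source domain (assumed form):
\min_{Z,\,B,\,\theta}\;
  \underbrace{\lVert X_s - Z B \rVert_F^2}_{\text{document reconstruction error}}
  \;+\; \lambda\,
  \underbrace{\mathcal{L}\bigl(Y_s,\; f_\theta(Z)\bigr)}_{\text{classification error in the hidden space}}

% Mapping a partially observed target document into the same hidden space,
% then classifying it (assumed form):
\hat{z}_t \;=\; \arg\min_{z}\; \bigl\lVert (x_t - z B)_{\Omega_t} \bigr\rVert_2^2,
\qquad
\hat{y}_t \;=\; f_\theta(\hat{z}_t)
```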

7.

Invention application
SYSTEM AND METHOD FOR DOMAIN ADAPTION WITH PARTIAL OBSERVATION (status: in force)

Publication number: US20120185415A1

Publication date: 2012-07-19

Application number: US13006245

Filing date: 2011-01-13

Applicants: Vijil E. Chenthamarakshan, Richard D. Lawrence, Yan Liu, Dan Zhang

Inventors: Vijil E. Chenthamarakshan, Richard D. Lawrence, Yan Liu, Dan Zhang

IPC classification: G06F15/18

CPC classification: G06N99/005, G06F17/3071

Abstract: System, method and computer program product provides a novel domain adaption/transfer learning approach applied to the problem of classifying abbreviated documents, e.g., short text messages, instant messages, tweets. The proposed method uses a large number of multi-labeled examples (source domain) to improve the learning on the partial observations (target domain). Specifically, a hidden, higher-level abstraction space is learned that is meaningful for the multi-labeled examples in the source domain. This is done by simultaneously minimizing the document reconstruction error and the error in a classification model learned in the hidden space using known labels from the source domain. The partial observations in the target space are then mapped to the same hidden space, and classified into the label space determined by the source domain. Exemplary results provided for a Twitter dataset demonstrate that the method identifies meaningful hidden topics and provides useful classifications of specific tweets.

8.

Granted invention patent
System and method for domain adaption with partial observation (status: in force)

Publication number: US08856052B2

Publication date: 2014-10-07

Application number: US13618603

Filing date: 2012-09-14

Applicants: Vijil E. Chenthamarakshan, Richard D. Lawrence, Yan Liu, Dan Zhang

Inventors: Vijil E. Chenthamarakshan, Richard D. Lawrence, Yan Liu, Dan Zhang

IPC classification: G06F15/18

CPC classification: G06N99/005, G06F17/3071

Abstract: A novel domain adaption/transfer learning method applied to the problem of classifying abbreviated documents, e.g., short text messages, instant messages, tweets. The method uses a large number of multi-labeled examples (source domain) to improve the learning on the partial observations (target domain). Specifically, a hidden, higher-level abstraction space is learned that is meaningful for the multi-labeled examples in the source domain. This is done by simultaneously minimizing the document reconstruction error and the error in a classification model learned in the hidden space using known labels from the source domain. The partial observations in the target space are then mapped to the same hidden space, and classified into the label space determined by the source domain.

9.

Invention application
SYSTEM AND METHOD FOR DOMAIN ADAPTION WITH PARTIAL OBSERVATION (status: in force)

Publication number: US20130013539A1

Publication date: 2013-01-10

Application number: US13618603

Filing date: 2012-09-14

Applicants: Vijil E. Chenthamarakshan, Richard D. Lawrence, Yan Liu, Dan Zhang

Inventors: Vijil E. Chenthamarakshan, Richard D. Lawrence, Yan Liu, Dan Zhang

IPC classification: G06F15/18

CPC classification: G06N99/005, G06F17/3071

Abstract: System, method and computer program product provides a novel domain adaption/transfer learning approach applied to the problem of classifying abbreviated documents, e.g., short text messages, instant messages, tweets. The proposed method uses a large number of multi-labeled examples (source domain) to improve the learning on the partial observations (target domain). Specifically, a hidden, higher-level abstraction space is learned that is meaningful for the multi-labeled examples in the source domain. This is done by simultaneously minimizing the document reconstruction error and the error in a classification model learned in the hidden space using known labels from the source domain. The partial observations in the target space are then mapped to the same hidden space, and classified into the label space determined by the source domain. Exemplary results provided for a Twitter dataset demonstrate that the method identifies meaningful hidden topics and provides useful classifications of specific tweets.

10.

Invention application
METHOD AND SYSTEM USING MACHINE LEARNING TO AUTOMATICALLY DISCOVER HOME PAGES ON THE INTERNET (status: in force)

Publication number: US20090210419A1

Publication date: 2009-08-20

Application number: US12033160

Filing date: 2008-02-19

Applicants: Upendra Chitnis, Wojciech Gryc, Ildar Khabibrakhmanov, Richard D. Lawrence, Prem Melville, Cezar Pendus

Inventors: Upendra Chitnis, Wojciech Gryc, Ildar Khabibrakhmanov, Richard D. Lawrence, Prem Melville, Cezar Pendus

IPC classification: G06F17/30

CPC classification: G06F17/30864

Abstract: A method for automatically determining an Internet home page corresponding to a named entity identified by a specified descriptor including building a trained machine-learning model, generating candidate matches from the specified descriptor, wherein each candidate match includes an Internet address, extracting content-based features from websites associated with the Internet addresses of the candidate matches, determining a model score for each candidate match based on the content-based features using the trained machine-learning model, and determining a match from among the candidate matches according to the scores, wherein the match is returned as the Internet home page corresponding to the named entity.
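
Illustrative sketch (not taken from the patent): the score-and-select step in the abstract can be pictured as scoring each candidate address with a model and returning the highest-scoring one. Everything specific below is hypothetical: the feature names, the hand-set logistic weights standing in for a trained model, and the example.com addresses. Candidate generation and feature extraction from live websites are not shown.

```python
import math

# Hypothetical weights standing in for a trained machine-learning model.
WEIGHTS = {"name_in_title": 2.0, "name_in_url": 1.5, "is_directory_listing": -2.5}
BIAS = -1.0


def model_score(features):
    """Logistic score in (0, 1) computed from content-based features."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def pick_home_page(candidates):
    """Return the candidate address with the highest model score."""
    scored = [(model_score(feats), url) for url, feats in candidates.items()]
    return max(scored)[1]


# Hypothetical candidate matches for one named entity, with features that
# would in practice be extracted from the content of each website.
candidates = {
    "http://example.com/": {
        "name_in_title": 1, "name_in_url": 1, "is_directory_listing": 0,
    },
    "http://listing.example.org/acme": {
        "name_in_title": 1, "name_in_url": 0, "is_directory_listing": 1,
    },
}

print(pick_home_page(candidates))
```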
