-
Publication No.: US08645298B2
Publication date: 2014-02-04
Application No.: US12912428
Filing date: 2010-10-26
Applicant: Philipp Hennig, David Stern, Thore Graepel, Ralf Herbrich
Inventor: Philipp Hennig, David Stern, Thore Graepel, Ralf Herbrich
CPC classification: G06N99/005, G06N7/005
Abstract: Machine learning techniques may be used to train computing devices to understand a variety of documents (e.g., text files, web pages, articles, spreadsheets, etc.). Machine learning techniques may be used to address the issue that computing devices may lack the human intellect used to understand such documents, such as their semantic meaning. Accordingly, a topic model may be trained by sequentially processing documents and/or their features (e.g., document author, geographical location of author, creation date, social network information of author, and/or document metadata). Additionally, as provided herein, the topic model may be used to predict probabilities that words, features, documents, and/or document corpora, for example, are indicative of particular topics.
-
Publication No.: US20120101965A1
Publication date: 2012-04-26
Application No.: US12912428
Filing date: 2010-10-26
Applicant: Philipp Hennig, David Stern, Thore Graepel, Ralf Herbrich
Inventor: Philipp Hennig, David Stern, Thore Graepel, Ralf Herbrich
CPC classification: G06N99/005, G06N7/005
Abstract: Machine learning techniques may be used to train computing devices to understand a variety of documents (e.g., text files, web pages, articles, spreadsheets, etc.). Machine learning techniques may be used to address the issue that computing devices may lack the human intellect used to understand such documents, such as their semantic meaning. Accordingly, a topic model may be trained by sequentially processing documents and/or their features (e.g., document author, geographical location of author, creation date, social network information of author, and/or document metadata). Additionally, as provided herein, the topic model may be used to predict probabilities that words, features, documents, and/or document corpora, for example, are indicative of particular topics.
-