Publication number: US20150178383A1
Publication date: 2015-06-25
Application number: US14576907
Filing date: 2014-12-19
Applicant: Google Inc.
Inventor: Gregory Sean Corrado, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea L. Frome, Jeffrey Adgate Dean, Mohammad Norouzi
IPC: G06F17/30
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying data objects. One of the methods includes obtaining data that associates each term in a vocabulary of terms with a respective high-dimensional representation of the term; obtaining classification data for a data object, wherein the classification data includes a respective score for each of a plurality of categories, and wherein each of the categories is associated with a respective category label; computing an aggregate high-dimensional representation for the data object from high-dimensional representations for the category labels associated with the categories and the respective scores; identifying a first term in the vocabulary of terms having a high-dimensional representation that is closest to the aggregate high-dimensional representation; and selecting the first term as a category label for the data object.
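The steps in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. The abstract does not say how the aggregate is computed or what "closest" means, so the sketch assumes a score-weighted average of the category-label vectors and cosine similarity as the closeness measure. The function names and the three-dimensional toy vectors are hypothetical and chosen only for readability.

```python
import math


def aggregate_representation(vocab_vectors, scores):
    """Combine category-label vectors into one vector.

    Assumption: a score-weighted average; the abstract only says the
    aggregate is computed from the label vectors and the scores.
    """
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    dim = len(next(iter(vocab_vectors.values())))
    aggregate = [0.0] * dim
    for label, score in scores.items():
        vector = vocab_vectors[label]  # each category label is a vocabulary term
        for i in range(dim):
            aggregate[i] += (score / total) * vector[i]
    return aggregate


def cosine_similarity(a, b):
    """Assumed closeness measure between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_label(vocab_vectors, scores):
    """Return the vocabulary term whose vector is closest to the aggregate."""
    aggregate = aggregate_representation(vocab_vectors, scores)
    return max(
        vocab_vectors,
        key=lambda term: cosine_similarity(vocab_vectors[term], aggregate),
    )


# Hypothetical toy data: real term representations would have far more dimensions.
vocab_vectors = {
    "lion": [1.0, 0.0, 0.0],
    "tiger": [0.0, 1.0, 0.0],
    "liger": [0.7, 0.7, 0.0],
    "car": [0.0, 0.0, 1.0],
}
# Classifier scores over the categories the classifier knows about.
scores = {"lion": 0.5, "tiger": 0.4, "car": 0.1}

print(select_label(vocab_vectors, scores))
```

With these toy values the selected term is "liger", a vocabulary term that is not among the scored categories. This shows how the selected label can come from the wider vocabulary rather than from the classifier's own category set.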