Patent Application
- Patent title: Hierarchical Content Classification Into Deep Taxonomies
- Patent title (Chinese): 分层内容分类成深入分类法
- Application number: US12777260
- Filing date: 2010-05-11
- Publication number: US20110282858A1
- Publication date: 2011-11-17
- Inventors: Ron Karidi, Liat Segal, Oded Elyada
- Applicants: Ron Karidi, Liat Segal, Oded Elyada
- Applicant address: US WA Redmond
- Assignee: Microsoft Corporation
- Current assignee: Microsoft Corporation
- Current assignee address: US WA Redmond
- Main classification: G06F17/30
- IPC classification: G06F17/30
Abstract:
A document may be classified by traversing a hierarchical classification tree and comparing the words in the document to words in documents representing the nodes on the classification tree. The document may be classified by traversing the classification tree and generating a comparison score based on word comparisons. The score may be used to trim the classification tree or to advance to another node on the tree. The score may be based on a scarcity or importance of individual words in the document compared to the scarcity or importance of words in the category. The result may be a set of classifications with scores for those classifications.
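
The traversal described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify a scoring formula, so everything concrete here is an assumption made for illustration: the IDF-style scarcity weight computed across sibling categories, the normalised overlap score, the pruning threshold of 0.3, and the names `Node`, `word_weights`, `score`, and `classify`.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """A taxonomy category, represented by the words of its documents."""
    name: str
    words: Counter
    children: list = field(default_factory=list)


def word_weights(nodes):
    """Assumed scarcity measure: words found in fewer sibling categories weigh more."""
    n = len(nodes)
    doc_freq = Counter()
    for node in nodes:
        doc_freq.update(set(node.words))
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in doc_freq.items()}


def score(doc_words, node, weights):
    """Weighted share of the document's known words that also occur in the category."""
    total = sum(weights.get(w, 0.0) * c for w, c in doc_words.items())
    if total == 0:
        return 0.0
    shared = sum(weights.get(w, 0.0) * c
                 for w, c in doc_words.items() if w in node.words)
    return shared / total


def classify(doc_words, root, threshold=0.3):
    """Walk the tree top-down; descend only into children scoring >= threshold."""
    results = []
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        if not node.children:
            continue
        weights = word_weights(node.children)
        for child in node.children:
            s = score(doc_words, child, weights)
            if s >= threshold:
                path = f"{prefix}/{child.name}" if prefix else child.name
                results.append((path, s))
                stack.append((child, path))  # advance to this node
            # children below the threshold are trimmed with their subtrees
    return sorted(results, key=lambda r: -r[1])


# Hypothetical toy taxonomy and document, for illustration only.
soccer = Node("soccer", Counter(["goal", "ball", "team", "pitch"]))
tennis = Node("tennis", Counter(["racket", "ball", "serve", "court"]))
sports = Node("sports", Counter(["ball", "team", "score", "game"]), [soccer, tennis])
cooking = Node("cooking", Counter(["recipe", "oven", "salt", "pan"]))
root = Node("root", Counter(), [sports, cooking])

document = Counter(["team", "goal", "pitch", "ball"])
results = classify(document, root)
for path, s in results:
    print(f"{path}: {s:.2f}")
```

In this toy run the `cooking` branch shares no words with the document and is trimmed at the first level, while `tennis` shares only the common word `ball` and falls below the assumed threshold, so the returned set contains `sports` and `sports/soccer` with their scores.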