Patent application
- Patent title: Method and system for classifying display pages using summaries
- Patent title (Chinese): 使用汇总分类显示页面的方法和系统
- Application number: US10836319
- Filing date: 2004-04-30
- Publication number: US20050246410A1
- Publication date: 2005-11-03
- Inventors: Zheng Chen, Dou Shen, Benyu Zhang, Hua-Jun Zeng, Wei-Ying Ma
- Applicants: Zheng Chen, Dou Shen, Benyu Zhang, Hua-Jun Zeng, Wei-Ying Ma
- Applicant address: Redmond, WA, US
- Assignee: Microsoft Corporation
- Current assignee: Microsoft Corporation
- Current assignee address: Redmond, WA, US
- Main classification: G06F17/30
- IPC classification: G06F17/30; G06F15/16
Abstract:
A method and system for classifying display pages based on automatically generated summaries of display pages. A web page classification system uses a web page summarization system to generate summaries of web pages. The summary of a web page may include the sentences of the web page that are most closely related to the primary topic of the web page. The summarization system may combine the benefits of multiple summarization techniques to identify the sentences of a web page that represent the primary topic of the web page. Once the summary is generated, the classification system may apply conventional classification techniques to the summary to classify the web page. The classification system may use conventional classification techniques such as a Naïve Bayesian classifier or a support vector machine to identify the classifications of a web page based on the summary generated by the summarization system.
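The summarize-then-classify flow described in the abstract can be illustrated with a minimal sketch. The abstract does not say which summarization techniques are combined, so the word-frequency sentence scorer below is only an assumed stand-in, and every name here (`tokenize`, `summarize`, `NaiveBayes`, `classify_page`, the stopword list) is hypothetical rather than taken from the application. The classifier is a plain multinomial Naive Bayes with add-one smoothing.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption, not from the application).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "are", "for", "on", "with"}


def tokenize(text):
    """Return lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(page_text, max_sentences=2):
    """Keep the sentences whose words are most frequent across the page.

    Assumed stand-in for the summarization system: the abstract only says
    that several techniques may be combined, not which ones.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", page_text) if s.strip()]
    freq = Counter(tokenize(page_text))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Emit the selected sentences in their original page order.
    return " ".join(s for s in sentences if s in top)


class NaiveBayes:
    """Multinomial Naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training texts
        self.vocab = set()

    def train(self, text, label):
        words = tokenize(text)
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = max(len(self.vocab), 1)
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            logp = math.log(n_docs / total_docs)  # class prior
            total = sum(self.word_counts[label].values())
            for w in words:
                logp += math.log((self.word_counts[label][w] + 1) / (total + vocab_size))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


def classify_page(page_text, classifier):
    """Classify a page from its summary rather than from its full text."""
    return classifier.classify(summarize(page_text))
```

In this sketch the classifier only ever sees the summary, so sentences unrelated to the page's main topic (navigation text, sign-up prompts) are less likely to influence the predicted class. A support vector machine could be substituted for `NaiveBayes` without changing `classify_page`.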