Invention Application
- Patent title: PRODUCT LINE EXTRACTION
- Patent title (Chinese): 产品线提取
- Application number: US12110390
- Filing date: 2008-04-28
- Publication number: US20090271367A1
- Publication date: 2009-10-29
- Inventors: Nimish G. Dharawat, Meera Mahabala, Gitika Gupta
- Applicants: Nimish G. Dharawat, Meera Mahabala, Gitika Gupta
- Applicant address: Redmond, WA, US
- Assignee: MICROSOFT CORPORATION
- Current assignee: MICROSOFT CORPORATION
- Current assignee address: Redmond, WA, US
- Main classification: G06F17/30
- IPC classification: G06F17/30
Abstract:
Methods, systems and computer readable media for extracting product lines from a plurality of product titles are provided. In one embodiment, the plurality of product titles are broken into tokens. Association rules are calculated for individual tokens and pairs of tokens. Brand specific terms and product class specific terms within the product titles are identified. In one embodiment, a token tree is used to identify product lines within the list of product titles using the association rules, the brand specific terms, and the product class specific terms.
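The tokenization and association-rule steps named in the abstract can be illustrated with a minimal Python sketch. This is not the patented method: the whitespace tokenizer, the use of rule confidence as the association measure, the `min_support` threshold, the function names, and the sample titles (with made-up brands) are all assumptions chosen for illustration. The token tree and the identification of brand-specific and product-class-specific terms are not shown.

```python
from collections import Counter
from itertools import combinations


def tokenize(title):
    """Split a product title into lowercase tokens (assumed: whitespace split)."""
    return title.lower().split()


def association_rules(titles, min_support=2):
    """Count single tokens and token pairs, then derive rule confidences.

    Returns (token_counts, rules), where rules[(a, b)] is the fraction of
    titles containing token `a` that also contain token `b`.
    """
    token_counts = Counter()
    pair_counts = Counter()
    for title in titles:
        # Use a set so each token counts at most once per title.
        tokens = set(tokenize(title))
        token_counts.update(tokens)
        pair_counts.update(combinations(sorted(tokens), 2))

    rules = {}
    for (a, b), n in pair_counts.items():
        if n < min_support:
            continue  # ignore pairs seen in too few titles
        rules[(a, b)] = n / token_counts[a]  # confidence of a -> b
        rules[(b, a)] = n / token_counts[b]  # confidence of b -> a
    return token_counts, rules


if __name__ == "__main__":
    # Hypothetical titles; "Acme" and "Bolt" are made-up brands.
    sample_titles = [
        "Acme Zoom X100 digital camera",
        "Acme Zoom X200 digital camera",
        "Acme Pixel P5 digital camera",
        "Bolt Flash F1 digital camera",
    ]
    counts, rules = association_rules(sample_titles)
    for (a, b), confidence in sorted(rules.items()):
        print(f"{a} -> {b}: {confidence:.2f}")
```

In this toy data, "zoom" always co-occurs with "acme" while "acme" also appears without "zoom", an asymmetry that hints at "zoom" being a product-line candidate under the "acme" brand. Tokens such as "digital" and "camera" appear across brands, which is the kind of signal that might mark them as product-class terms.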
Published/granted documents
- US07853597B2 Product line extraction; publication/grant date: 2010-12-14