-
Publication No.: US20090271367A1
Publication date: 2009-10-29
Application No.: US12110390
Filing date: 2008-04-28
Applicant: Nimish G. Dharawat, Meera Mahabala, Gitika Gupta
Inventor: Nimish G. Dharawat, Meera Mahabala, Gitika Gupta
IPC classification: G06F17/30
CPC classification: G06F17/30705
Abstract: Methods, systems and computer readable media for extracting product lines from a plurality of product titles are provided. In one embodiment, the plurality of product titles are broken into tokens. Association rules are calculated for individual tokens and pairs of tokens. Brand specific terms and product class specific terms within the product titles are identified. In one embodiment, a token tree is used to identify product lines within the list of product titles using the association rules, the brand specific terms, and the product class specific terms.
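The abstract names the steps (tokenizing titles, computing association rules for tokens and token pairs, identifying brand and product-class terms, walking a token tree) but gives no formulas or thresholds. The following is only a minimal sketch of how such a pipeline might look, not the patented method. It assumes whitespace tokenization, support and confidence as the association measures, brand and class terms supplied by hand, and a product line defined as a first-level tree token that recurs and co-occurs strongly with a brand. All function names, thresholds, and sample titles are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def tokenize(title):
    """Lowercase a product title and split it on whitespace (an assumed scheme)."""
    return title.lower().split()


def association_rules(titles):
    """Return (support, confidence) computed over a list of titles.

    support[t]          = fraction of titles containing token t
    confidence[(a, b)]  = fraction of titles containing a that also contain b
    """
    token_counts = Counter()
    pair_counts = Counter()
    for title in titles:
        tokens = set(tokenize(title))
        token_counts.update(tokens)
        pair_counts.update(combinations(sorted(tokens), 2))
    n = len(titles)
    support = {t: c / n for t, c in token_counts.items()}
    confidence = {}
    for (a, b), c in pair_counts.items():
        confidence[(a, b)] = c / token_counts[a]
        confidence[(b, a)] = c / token_counts[b]
    return support, confidence


def build_token_tree(titles, skip_terms):
    """Build a prefix tree of tokens, ignoring skip_terms.

    Each node records how many titles pass through it.
    """
    def node():
        return {"count": 0, "children": defaultdict(node)}

    root = node()
    for title in titles:
        current = root
        for token in tokenize(title):
            if token in skip_terms:
                continue
            current = current["children"][token]
            current["count"] += 1
    return root


def candidate_product_lines(titles, brand_terms, class_terms,
                            min_count=2, min_conf=0.9):
    """Pick first-level tree tokens that recur and are tied to some brand."""
    _, confidence = association_rules(titles)
    tree = build_token_tree(titles, brand_terms | class_terms)
    lines = []
    for token, child in tree["children"].items():
        tied_to_brand = any(
            confidence.get((token, brand), 0.0) >= min_conf
            for brand in brand_terms
        )
        if child["count"] >= min_count and tied_to_brand:
            lines.append(token)
    return sorted(lines)


# Hypothetical sample data, chosen only to exercise the sketch.
TITLES = [
    "Canon PowerShot A590 Digital Camera",
    "Canon PowerShot SD1100 Digital Camera",
    "Canon EOS Rebel XSi Digital Camera",
    "Canon EOS 40D Digital Camera",
    "Nikon Coolpix S210 Digital Camera",
]

if __name__ == "__main__":
    print(candidate_product_lines(
        TITLES, {"canon", "nikon"}, {"digital", "camera"}))
```

In this sketch the brand and class terms are removed before the tree is built, so the tokens directly under the root are the ones that follow the brand in each title; a real system would presumably derive those term sets from the data rather than take them as input.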
-
Publication No.: US20110066622A1
Publication date: 2011-03-17
Application No.: US12951680
Filing date: 2010-11-22
Applicant: Nimish G. Dharawat, Meera Mahabala, Gitika Gupta
Inventor: Nimish G. Dharawat, Meera Mahabala, Gitika Gupta
IPC classification: G06F17/30
CPC classification: G06F16/35
Abstract: Methods, systems and computer readable media for extracting product lines from a plurality of product titles are provided. In one embodiment, the plurality of product titles are broken into tokens. Association rules are calculated for individual tokens and pairs of tokens. Brand specific terms and product class specific terms within the product titles are identified. In one embodiment, a token tree is used to identify product lines within the list of product titles using the association rules, the brand specific terms, and the product class specific terms.
-
Publication No.: US07853597B2
Publication date: 2010-12-14
Application No.: US12110390
Filing date: 2008-04-28
Applicant: Nimish G. Dharawat, Meera Mahabala, Gitika Gupta
Inventor: Nimish G. Dharawat, Meera Mahabala, Gitika Gupta
IPC classification: G06F17/30
CPC classification: G06F17/30705
Abstract: Methods, systems and computer readable media for extracting product lines from a plurality of product titles are provided. In one embodiment, the plurality of product titles are broken into tokens. Association rules are calculated for individual tokens and pairs of tokens. Brand specific terms and product class specific terms within the product titles are identified. In one embodiment, a token tree is used to identify product lines within the list of product titles using the association rules, the brand specific terms, and the product class specific terms.
-
-