-
Publication No.: US20140172858A1
Publication date: 2014-06-19
Application No.: US14100990
Filing date: 2013-12-09
Applicant: eBay Inc.
Inventor: Badrul M. Sarwar, John A. Mount
CPC classification number: G06F17/3053, G06F17/277, G06F17/2775, G06F17/30312, G06F17/30598
Abstract: A method and a system to automatically segment text based on header tokens is described. A relevance value and an irrelevance value are determined for each token in a description, assuming no tokens are left out of computations. The irrelevance value is based on occurrences of a token in a sample set of descriptions. The relevance value is an estimated probability of relevance based on the header of the description being segmented.
-
Publication No.: US20150261761A1
Publication date: 2015-09-17
Application No.: US14724269
Filing date: 2015-05-28
Applicant: eBay Inc.
Inventor: Badrul M. Sarwar, John A. Mount
IPC: G06F17/30
CPC classification number: G06F17/3053, G06F17/277, G06F17/2775, G06F17/30312, G06F17/30598
Abstract: A method and a system to automatically segment text based on header tokens is described. A relevance value and an irrelevance value are determined for each token in a description, assuming no tokens are left out of computations. The irrelevance value is based on occurrences of a token in a sample set of descriptions. The relevance value is an estimated probability of relevance based on the header of the description being segmented.
-
Publication No.: US09053091B2
Publication date: 2015-06-09
Application No.: US14100990
Filing date: 2013-12-09
Applicant: eBay Inc.
Inventor: Badrul M. Sarwar, John A. Mount
CPC classification number: G06F17/3053, G06F17/277, G06F17/2775, G06F17/30312, G06F17/30598
Abstract: A method and a system to automatically segment text based on header tokens is described. A relevance value and an irrelevance value are determined for each token in a description, assuming no tokens are left out of computations. The irrelevance value is based on occurrences of a token in a sample set of descriptions. The relevance value is an estimated probability of relevance based on the header of the description being segmented.
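Illustrative sketch (not taken from the patent text): the abstracts above state only that each description token receives an irrelevance value derived from a sample set of descriptions and a relevance value estimated from the header. The Python below is one possible reading under stated assumptions. The function names, the document-frequency estimate of irrelevance, the fixed `header_hit`/`header_miss` probabilities, the smoothing floor, and the log-ratio span selection are all hypothetical choices made for illustration; the claimed method may differ.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is sufficient for the sketch.
    return text.lower().split()


def irrelevance_values(sample_descriptions):
    """Return (token -> fraction of sample descriptions containing it, sample size).

    Assumption: irrelevance is estimated as plain document frequency.
    """
    n = len(sample_descriptions)
    doc_freq = Counter()
    for desc in sample_descriptions:
        doc_freq.update(set(tokenize(desc)))
    return {tok: count / n for tok, count in doc_freq.items()}, n


def score_tokens(header, description, sample_descriptions,
                 header_hit=0.9, header_miss=0.1):
    """Attach a (relevance, irrelevance) pair to every description token.

    Assumption: relevance is a fixed probability that depends only on whether
    the token also appears in the header. No token is skipped.
    """
    header_tokens = set(tokenize(header))
    irrel, n = irrelevance_values(sample_descriptions)
    floor = 1.0 / (n + 1)  # keeps unseen tokens away from zero irrelevance
    scored = []
    for tok in tokenize(description):
        relevance = header_hit if tok in header_tokens else header_miss
        irrelevance = max(irrel.get(tok, 0.0), floor)
        scored.append((tok, relevance, irrelevance))
    return scored


def best_segment(scored):
    """Return (start, end) of the contiguous span with the largest summed
    log(relevance / irrelevance), found with a maximum-subarray scan.

    Assumption: segmentation means picking one best contiguous span.
    """
    best_sum, best_start, best_end = -math.inf, 0, 0
    cur_sum, cur_start = 0.0, 0
    for i, (_, rel, irr) in enumerate(scored):
        gain = math.log(rel / irr)
        if cur_sum <= 0:
            cur_sum, cur_start = gain, i  # restart the span at this token
        else:
            cur_sum += gain
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i + 1
    return best_start, best_end


# Made-up example data, for illustration only.
header = "red leather wallet"
description = "free shipping red leather wallet fast delivery"
samples = [
    "free shipping on all orders fast delivery",
    "free shipping blue cotton shirt",
    "fast delivery free shipping wooden chair",
    "vintage camera free shipping",
]

scored = score_tokens(header, description, samples)
start, end = best_segment(scored)
print(" ".join(tok for tok, _, _ in scored[start:end]))  # red leather wallet
```

In this toy data, tokens such as "free" and "shipping" occur in every sample description, so their irrelevance is high and they fall outside the selected span, while the header tokens score as relevant and form the retained segment.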
-