-
Publication No.: US20150051899A1
Publication date: 2015-02-19
Application No.: US13965492
Filing date: 2013-08-13
IPC classification: G06F17/27
CPC classification: G06F17/27
Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.
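The four steps named in the abstract (count per page, weight, merge, compute probabilities) can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the choice to use the raw view count as the weight, and the unsmoothed maximum-likelihood estimate are all assumptions made for illustration, since the abstract does not specify a weighting function or a probability formula.

```python
from collections import Counter


def count_ngrams(tokens, n):
    """Count the n-grams (as tuples) in one page's token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def weighted_ngram_probabilities(pages, n=2):
    """Estimate P(last word | preceding n-1 words) from view-weighted counts.

    pages: iterable of (tokens, view_count) pairs (hypothetical input shape).
    Assumption: the weight of a page is its raw view count; a ranking-based
    or damped weight could be substituted here.
    """
    merged = Counter()
    for tokens, views in pages:
        weight = float(views)
        # Steps 1 and 2: count n-grams per page, then apply the page weight.
        for ngram, count in count_ngrams(tokens, n).items():
            # Step 3: merge the weighted counts across all pages.
            merged[ngram] += weight * count

    # Step 4: divide each merged count by the total for its history
    # (the first n-1 words), giving an unsmoothed relative frequency.
    history_totals = Counter()
    for ngram, count in merged.items():
        history_totals[ngram[:-1]] += count
    return {
        ngram: count / history_totals[ngram[:-1]]
        for ngram, count in merged.items()
        if history_totals[ngram[:-1]] > 0
    }


# Example: a page viewed 3 times and a page viewed once.
example_pages = [("a b a c".split(), 3), ("a b".split(), 1)]
example_probs = weighted_ngram_probabilities(example_pages, n=2)
```

In this sketch a frequently viewed page contributes proportionally more to the merged counts, so its word sequences receive higher probability than they would under unweighted counting. A practical language model would normally add smoothing for unseen N-grams, which is omitted here.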
-
Publication No.: US09311291B2
Publication date: 2016-04-12
Application No.: US14021607
Filing date: 2013-09-09
CPC classification: G06F17/27
Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.
-
Publication No.: US09251135B2
Publication date: 2016-02-02
Application No.: US13965492
Filing date: 2013-08-13
CPC classification: G06F17/27
Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.
-
Publication No.: US20150051902A1
Publication date: 2015-02-19
Application No.: US14021607
Filing date: 2013-09-09
IPC classification: G06F17/27
CPC classification: G06F17/27
Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.
-