CORRECTING N-GRAM PROBABILITIES BY PAGE VIEW INFORMATION

    Publication No.: US20150051899A1

    Publication date: 2015-02-19

    Application No.: US13965492

    Filing date: 2013-08-13

    IPC classification: G06F17/27

    CPC classification: G06F17/27

    Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.
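The four steps named in the abstract (count, weight, merge, calculate) can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the use of the raw view count as the weight, the skipping of pages with no views, and the unsmoothed maximum-likelihood estimate are all assumptions made for illustration; the abstract does not specify any of them.

```python
from collections import Counter


def count_ngrams(tokens, n):
    """Count the N-grams of order n in one page's token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def weighted_ngram_probabilities(pages, n=2):
    """Estimate P(last word | preceding n-1 words) from view-weighted counts.

    pages: iterable of (tokens, view_count) pairs (hypothetical input shape).
    Assumption: the weight of a page is its raw view count.
    """
    # Steps 1-3: count per page, weight by view count, merge across pages.
    merged = Counter()
    for tokens, views in pages:
        if views <= 0:
            continue  # assumption: pages with no views contribute nothing
        for ngram, count in count_ngrams(tokens, n).items():
            merged[ngram] += views * count

    # Step 4: normalize each merged weighted count by its history total
    # (plain maximum-likelihood estimate, no smoothing; an assumption).
    history_totals = Counter()
    for ngram, weight in merged.items():
        history_totals[ngram[:-1]] += weight
    return {ng: w / history_totals[ng[:-1]] for ng, w in merged.items()}


pages = [
    (["new", "york", "city"], 90),  # frequently viewed page
    (["new", "jersey"], 10),        # rarely viewed page
]
probs = weighted_ngram_probabilities(pages, n=2)
print(probs[("new", "york")], probs[("new", "jersey")])
```

With unweighted counts both continuations of "new" would receive probability 0.5. Weighting by views shifts the estimate toward the text that readers actually see.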

    2.
    Granted patent
    Correcting N-gram probabilities by page view information (in force)

    Publication No.: US09311291B2

    Publication date: 2016-04-12

    Application No.: US14021607

    Filing date: 2013-09-09

    IPC classification: G06F17/27 G06F17/20 G06F17/21

    CPC classification: G06F17/27

    Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.

    Correcting N-gram probabilities by page view information

    Publication No.: US09251135B2

    Publication date: 2016-02-02

    Application No.: US13965492

    Filing date: 2013-08-13

    IPC classification: G06F17/27 G06F17/20 G06F17/21

    CPC classification: G06F17/27

    Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.

    4.
    Patent application
    CORRECTING N-GRAM PROBABILITIES BY PAGE VIEW INFORMATION (in force)

    Publication No.: US20150051902A1

    Publication date: 2015-02-19

    Application No.: US14021607

    Filing date: 2013-09-09

    IPC classification: G06F17/27

    CPC classification: G06F17/27

    Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.
