PHRASE-BASED DETECTION OF DUPLICATE DOCUMENTS IN AN INFORMATION RETRIEVAL SYSTEM
    1.
    Type: Invention application
    Status: In force

    Publication No.: US20140156647A1

    Publication date: 2014-06-05

    Application No.: US13919830

    Filing date: 2013-06-17

    Applicant: Google Inc.

    Abstract: An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are identified that predict the presence of other phrases in documents. Documents are then indexed according to their included phrases. Related phrases and phrase extensions are also identified. Phrases in a query are identified and used to retrieve and rank documents. Phrases are also used to cluster documents in the search results, create document descriptions, and eliminate duplicate documents from the search results and from the index.

    Index server architecture using tiered and sharded phrase posting lists
    2.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08943067B1

    Publication date: 2015-01-27

    Application No.: US13842731

    Filing date: 2013-03-15

    Applicant: Google Inc.

    Abstract: An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are extracted from the document collection. Documents are then indexed according to their included phrases, using phrase posting lists. The phrase posting lists are stored in a cluster of index servers. The phrase posting lists can be tiered into groups and sharded into partitions. Phrases in a query are identified based on possible phrasifications. A query schedule is created from the phrases and then optimized to reduce query processing and communication costs. The execution of the query schedule is managed to further reduce or eliminate query processing operations at various ones of the index servers.
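
The tiering and sharding idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how tiers or shards are chosen, so everything specific here is an assumption made for illustration: the function names, tiering by posting-list length against `tier_limits`, and sharding by `doc_id % num_shards` are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


def build_posting_lists(docs):
    """Map each phrase to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, phrases in docs.items():
        for phrase in phrases:
            postings[phrase].add(doc_id)
    return {phrase: sorted(ids) for phrase, ids in postings.items()}


def assign_tier(posting_list, tier_limits):
    """Return the first tier whose length limit fits the list.

    Assumption: tiers are chosen by posting-list length. Lists longer than
    every limit fall into one extra, final tier.
    """
    for tier, limit in enumerate(tier_limits):
        if len(posting_list) <= limit:
            return tier
    return len(tier_limits)


def shard_posting_list(posting_list, num_shards):
    """Partition a posting list into shards.

    Assumption: a document goes to shard doc_id % num_shards.
    """
    shards = [[] for _ in range(num_shards)]
    for doc_id in posting_list:
        shards[doc_id % num_shards].append(doc_id)
    return shards


# Toy collection: document id -> phrases already extracted from it.
docs = {
    1: ["information retrieval", "posting list"],
    2: ["information retrieval"],
    3: ["information retrieval", "index server"],
    4: ["posting list"],
}

postings = build_posting_lists(docs)
layout = {
    phrase: (assign_tier(ids, tier_limits=[1, 2]),
             shard_posting_list(ids, num_shards=2))
    for phrase, ids in postings.items()
}
```

In this toy layout, each phrase maps to a `(tier, shards)` pair, and each shard could in principle be placed on a different index server. A real system would likely weigh query frequency and server load as well, which this sketch ignores.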

    MULTIPLE INDEX BASED INFORMATION RETRIEVAL SYSTEM
    5.
    Type: Invention application
    Status: In force

    Publication No.: US20160283474A1

    Publication date: 2016-09-29

    Application No.: US15172717

    Filing date: 2016-06-03

    Applicant: Google Inc.

    Abstract: An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are identified that predict the presence of other phrases in documents. Documents are then indexed according to their included phrases. The document index is partitioned into multiple indexes, including a primary index and a secondary index. The primary index stores phrase posting lists with documents ordered by relevance rank. The secondary index stores excess documents from the posting lists in document order.
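
The primary/secondary split described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_posting_list`, the fixed `primary_capacity` cutoff, the use of a numeric relevance score per document, and the tie-break by document id are all hypothetical and are not specified in the abstract.

```python
def split_posting_list(scored_docs, primary_capacity):
    """Split one phrase's posting list into primary and secondary parts.

    scored_docs: list of (doc_id, relevance_score) pairs.
    Assumption: the primary index keeps the primary_capacity highest-scoring
    documents in rank order; the remainder ("excess") goes to the secondary
    index, sorted by document id.
    """
    # Highest score first; ties broken by doc_id so the order is deterministic.
    ranked = sorted(scored_docs, key=lambda pair: (-pair[1], pair[0]))
    primary = [doc_id for doc_id, _ in ranked[:primary_capacity]]
    secondary = sorted(doc_id for doc_id, _ in ranked[primary_capacity:])
    return primary, secondary


primary, secondary = split_posting_list(
    [(7, 0.2), (3, 0.9), (12, 0.5), (5, 0.1), (9, 0.7)],
    primary_capacity=3,
)
print(primary)    # [3, 9, 12]  rank order
print(secondary)  # [5, 7]      document order
```

Keeping the primary list in rank order lets a query stop after reading the first few entries, while document order in the secondary list suits merging and intersection of long lists.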

    Phrase-based searching in an information retrieval system
    8.
    Type: Granted invention patent
    Status: In force

    Publication No.: US09569505B2

    Publication date: 2017-02-14

    Application No.: US14713374

    Filing date: 2015-05-15

    Applicant: GOOGLE INC.

    Abstract: An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are identified that predict the presence of other phrases in documents. Documents are then indexed according to their included phrases. Related phrases and phrase extensions are also identified. Phrases in a query are identified and used to retrieve and rank documents. Phrases are also used to cluster documents in the search results, create document descriptions, and eliminate duplicate documents from the search results and from the index.

    PHRASE-BASED DETECTION OF DUPLICATE DOCUMENTS IN AN INFORMATION RETRIEVAL SYSTEM
    10.
    Type: Invention application
    Status: In force

    Publication No.: US20150248415A1

    Publication date: 2015-09-03

    Application No.: US14713374

    Filing date: 2015-05-15

    Applicant: GOOGLE INC.

    Abstract: An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are identified that predict the presence of other phrases in documents. Documents are then indexed according to their included phrases. Related phrases and phrase extensions are also identified. Phrases in a query are identified and used to retrieve and rank documents. Phrases are also used to cluster documents in the search results, create document descriptions, and eliminate duplicate documents from the search results and from the index.
