- Title: Phrase extraction using subphrase scoring
- Application number: US13615541
- Filing date: 2012-09-13
- Publication number: US09355169B1
- Publication date: 2016-05-31
- Inventors: Soham Mazumdar, Viktor Przebinda, Yonatan Zunger
- Applicants: Soham Mazumdar, Viktor Przebinda, Yonatan Zunger
- Applicant address: US CA Mountain View
- Assignee: GOOGLE INC.
- Current assignee: GOOGLE INC.
- Current assignee address: US CA Mountain View
- Main classification: G06F17/30
- IPC classification: G06F17/30
Abstract:
An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are extracted from the document collection. Documents are then indexed according to their included phrases, using phrase posting lists. The phrase posting lists are stored in a cluster of index servers. The phrase posting lists can be tiered into groups, and sharded into partitions. Phrases in a query are identified based on possible phrasifications. A query schedule is created from the phrases, and then optimized to reduce query processing and communication costs. The execution of the query schedule is managed to further reduce or eliminate query processing operations at various ones of the index servers.
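
The indexing and query steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the function names (`build_posting_lists`, `shard_posting_list`, `phrasifications`), the toy documents, the fixed phrase set, and the sharding rule (document ID modulo the shard count) are all assumptions made for illustration. The abstract does not specify how phrases are scored, how tiers are formed, or how the query schedule is optimized, so none of that is modeled here.

```python
from collections import defaultdict


def build_posting_lists(docs, phrases):
    """Map each known phrase to the sorted list of doc IDs that contain it."""
    postings = defaultdict(list)
    for doc_id in sorted(docs):
        # Normalize whitespace and pad with spaces so matches fall on word boundaries.
        text = " " + " ".join(docs[doc_id].lower().split()) + " "
        for phrase in phrases:
            if " " + phrase + " " in text:
                postings[phrase].append(doc_id)
    return dict(postings)


def shard_posting_list(doc_ids, num_shards):
    """Split one posting list into partitions (assumed rule: doc_id % num_shards)."""
    shards = [[] for _ in range(num_shards)]
    for doc_id in doc_ids:
        shards[doc_id % num_shards].append(doc_id)
    return shards


def phrasifications(tokens, phrases):
    """Enumerate ways to segment query tokens into known phrases or single words."""
    if not tokens:
        return [[]]
    results = []
    for end in range(1, len(tokens) + 1):
        candidate = " ".join(tokens[:end])
        # Single words are always allowed; longer spans must be known phrases.
        if end == 1 or candidate in phrases:
            for rest in phrasifications(tokens[end:], phrases):
                results.append([candidate] + rest)
    return results


# Hypothetical toy collection and phrase set.
docs = {
    1: "new york restaurants",
    2: "restaurants in new jersey",
    3: "york is a city",
}
phrases = {"new york", "restaurants", "new york restaurants"}

postings = build_posting_lists(docs, phrases)
shards = shard_posting_list(postings["restaurants"], 2)
candidates = phrasifications("new york restaurants".split(), phrases)
```

In this sketch each candidate phrasification is a list of phrases whose posting lists would have to be fetched and intersected. A query schedule in the sense of the abstract would then decide which index servers to contact and in what order, which is beyond what this example attempts.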