-
Publication No.: US20140164388A1
Publication date: 2014-06-12
Application No.: US13709064
Filing date: 2012-12-10
Inventors: Li Zhang, Mihai Budiu, Yuan Yu, Gordon D. Plotkin
IPC classification: G06F17/30
CPC classification: G06F17/30911, G06F17/2247, G06F17/30011, G06F17/30625, Y10S707/956
Abstract: A document index is generated from a set of documents and is used to identify documents that match one or more queries. A tree is generated for each document, with a node corresponding to each object of the document. The nodes of the generated trees are merged or combined to generate the document index, which is itself a tree. In addition, an inverted index is generated for each node of the index, identifying the tree(s) that the node originated from. When a query is received, it is first executed against the document index tree; during execution, the appropriate set operations are applied to the inverted indices associated with the nodes matched by the query. The resulting set identifies the documents that may match the query. The query is then executed on the identified documents.
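The indexing scheme in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes documents are nested dictionaries, treats any non-dictionary value as a leaf, and models a query as a set of key paths with required leaf values. All names (`IndexNode`, `build_index`, `candidates`, `run_query`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class IndexNode:
    children: dict = field(default_factory=dict)  # label -> IndexNode
    doc_ids: set = field(default_factory=set)     # inverted index for this node


def add_document(node, obj, doc_id):
    """Merge one document's tree into the index, recording doc_id at each node."""
    node.doc_ids.add(doc_id)
    if isinstance(obj, dict):
        for label, child in obj.items():
            add_document(node.children.setdefault(label, IndexNode()), child, doc_id)


def build_index(docs):
    """Build a single index tree from a mapping of doc_id -> nested dict."""
    root = IndexNode()
    for doc_id, doc in docs.items():
        add_document(root, doc, doc_id)
    return root


def lookup(root, path):
    """Return the doc ids stored at the index node reached by following path."""
    node = root
    for label in path:
        node = node.children.get(label)
        if node is None:
            return set()
    return set(node.doc_ids)


def candidates(root, paths):
    """Intersect the inverted indices of all queried paths (documents that MAY match)."""
    result = set(root.doc_ids)
    for path in paths:
        result &= lookup(root, path)
    return result


def get_path(doc, path):
    """Follow path inside one document; return None if the path is absent."""
    for label in path:
        if not isinstance(doc, dict) or label not in doc:
            return None
        doc = doc[label]
    return doc


def run_query(docs, root, conditions):
    """conditions maps a path tuple to a required value; only candidates are checked."""
    cands = candidates(root, conditions.keys())
    return sorted(d for d in cands
                  if all(get_path(docs[d], p) == v for p, v in conditions.items()))


docs = {
    "d1": {"author": {"name": "Ann"}, "year": 2012},
    "d2": {"author": {"name": "Bob"}, "title": "Trees"},
    "d3": {"author": {"name": "Ann"}, "title": "Graphs"},
}
index = build_index(docs)
maybe = sorted(candidates(index, [("author", "name"), ("title",)]))  # ['d2', 'd3']
hits = run_query(docs, index, {("author", "name"): "Ann", ("title",): "Graphs"})  # ['d3']
```

The index only records which documents contain a given path, so `candidates` can return false positives (here `d2`). That is why the query is executed again on the identified documents, as the abstract's final step describes.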
-
Publication No.: US09208254B2
Publication date: 2015-12-08
Application No.: US13709064
Filing date: 2012-12-10
Inventors: Li Zhang, Mihai Budiu, Yuan Yu, Gordon D. Plotkin
CPC classification: G06F17/30911, G06F17/2247, G06F17/30011, G06F17/30625, Y10S707/956
Abstract: A document index is generated from a set of documents and is used to identify documents that match one or more queries. A tree is generated for each document, with a node corresponding to each object of the document. The nodes of the generated trees are merged or combined to generate the document index, which is itself a tree. In addition, an inverted index is generated for each node of the index, identifying the tree(s) that the node originated from. When a query is received, it is first executed against the document index tree; during execution, the appropriate set operations are applied to the inverted indices associated with the nodes matched by the query. The resulting set identifies the documents that may match the query. The query is then executed on the identified documents.
-