Invention Grant
- Patent Title: Query generation using structural similarity between documents
- Application No.: US12942950
- Application Date: 2010-11-09
- Publication No.: US08346792B1
- Publication Date: 2013-01-01
- Inventor: Steven D. Baker, Michael Flaster, Nitin Gupta, Paul Haahr, Srinivasan Venkatachary, Yonghui Wu
- Applicant: Steven D. Baker, Michael Flaster, Nitin Gupta, Paul Haahr, Srinivasan Venkatachary, Yonghui Wu
- Applicant Address: US CA Mountain View
- Assignee: Google Inc.
- Current Assignee: Google Inc.
- Current Assignee Address: US CA Mountain View
- Agency: Fish & Richardson P.C.
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
Methods, systems, and apparatus, including computer program products, for generating synthetic queries using seed queries and structural similarity between documents are described. In one aspect, a method includes identifying embedded coding fragments (e.g., HTML tags) from a structured document and a seed query; generating one or more query templates, each query template corresponding to at least one coding fragment and including a generative rule to be used in generating candidate synthetic queries; generating the candidate synthetic queries by applying the query templates to other documents that are hosted on the same web site as the document; identifying terms that match the structure of the query templates as candidate synthetic queries; measuring a performance for each of the candidate synthetic queries; and designating as synthetic queries the candidate synthetic queries that have performance measurements exceeding a performance threshold.
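
The steps named in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: here a query template is assumed to be a tag path plus the literal text surrounding the seed query, and the performance measure is a caller-supplied stub. All names (`TagTextExtractor`, `make_templates`, `apply_template`, `generate_candidates`, `select_synthetic_queries`) are hypothetical.

```python
from html.parser import HTMLParser

# Elements with no closing tag; skipped so they do not pollute the tag path.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TagTextExtractor(HTMLParser):
    """Collect (tag_path, text) pairs, e.g. ("html/body/h1", "blue widget")."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerates unclosed inner tags.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            self.pairs.append(("/".join(self.stack), text))


def extract_pairs(html):
    parser = TagTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


def make_templates(html, seed_query):
    """Assumed template form: (tag_path, literal_prefix, literal_suffix)."""
    seed = seed_query.lower()
    templates = []
    for path, text in extract_pairs(html):
        idx = text.lower().find(seed)
        if idx >= 0:
            templates.append((path, text[:idx], text[idx + len(seed):]))
    return templates


def apply_template(template, html):
    """Return the terms in `html` that fill the template's slot."""
    path, prefix, suffix = template
    terms = []
    for p, text in extract_pairs(html):
        fits = p == path and text.startswith(prefix) and text.endswith(suffix)
        if fits and len(text) > len(prefix) + len(suffix):
            terms.append(text[len(prefix):len(text) - len(suffix)])
    return terms


def generate_candidates(seed_html, seed_query, other_htmls):
    """Apply every template to every other document; de-duplicate in order."""
    candidates = []
    for template in make_templates(seed_html, seed_query):
        for html in other_htmls:
            for term in apply_template(template, html):
                if term not in candidates:
                    candidates.append(term)
    return candidates


def select_synthetic_queries(candidates, measure, threshold):
    """Keep candidates whose measured performance exceeds the threshold."""
    return [q for q in candidates if measure(q) > threshold]


# Toy data: pages assumed to come from the same site and share one layout.
seed_page = ("<html><head><title>Acme Store - blue widget</title></head>"
             "<body><h1>blue widget</h1><p>Ships in 2 days</p></body></html>")
other_pages = [
    "<html><head><title>Acme Store - red gadget</title></head>"
    "<body><h1>red gadget</h1><p>Ships in 5 days</p></body></html>",
    "<html><head><title>Acme Store - green gizmo</title></head>"
    "<body><h1>green gizmo</h1></body></html>",
]
# Stand-in performance scores; a real system would measure query performance.
scores = {"red gadget": 0.8, "green gizmo": 0.1}

candidates = generate_candidates(seed_page, "blue widget", other_pages)
synthetic = select_synthetic_queries(candidates, lambda q: scores.get(q, 0.0), 0.5)
```

In this toy run the seed query appears in both the title and the heading, so two templates are produced, and only candidates scoring above the threshold are kept. How performance is actually measured is not specified in the abstract, so the `scores` lookup is purely a placeholder.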