Abstract:
A stopword detection component detects stopwords (also stop-phrases) in search queries input to keyword-based information retrieval systems. Potential stopwords are initially identified by comparing the terms in the search query to a list of known stopwords. Context data is then retrieved based on the search query and the identified stopwords. In one implementation, the context data includes documents retrieved from a document index. In another implementation, the context data includes categories relevant to the search query. Sets of retrieved context data are compared to one another to determine if they are substantially similar. If the sets of context data are substantially similar, this fact may be used to infer that the removal of the potential stopword(s) is not material to the search. If the sets of context data are not substantially similar, the potential stopword can be considered material to the search and should not be removed from the query.
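The comparison described above can be sketched as follows. This is a minimal illustration, not the component's actual implementation: the stopword list, the `retrieve` callback (standing in for a lookup against the document index), the Jaccard overlap measure, and the 0.8 threshold are all assumptions chosen to make "substantially similar" concrete.

```python
from typing import Callable, List, Set

# Hypothetical list of known stopwords; a real system would use a larger one.
KNOWN_STOPWORDS = {"the", "a", "an", "of", "in", "to"}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two sets of context data (here, document IDs)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def removable_stopwords(
    query: str,
    retrieve: Callable[[str], Set[str]],
    threshold: float = 0.8,  # assumed cutoff for "substantially similar"
) -> List[str]:
    """Return the potential stopwords whose removal does not materially
    change the context data retrieved for the query."""
    terms = query.lower().split()
    baseline = retrieve(" ".join(terms))
    removable = []
    for i, term in enumerate(terms):
        if term not in KNOWN_STOPWORDS:
            continue  # not a potential stopword
        reduced_query = " ".join(terms[:i] + terms[i + 1:])
        if jaccard(baseline, retrieve(reduced_query)) >= threshold:
            removable.append(term)  # removal is not material to the search
    return removable
```

With a toy index in which "the matrix" and "matrix" retrieve disjoint documents, "the" is kept as material to the search; if "history of rome" and "history rome" retrieve the same documents, "of" is reported as removable.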
Abstract:
In general, one aspect of the subject matter described can be embodied in a method that includes obtaining a plurality of search results responsive to an initial search query, the search results including a first search result that identifies a first resource; determining, using a document-to-query-to-document model, that the first resource is relevant to a first suggested query different from the initial search query; generating a presentation of the search results responsive to the initial search query; and providing the presentation of the search results in response to the initial search query. Each search result in the presentation includes a link to a respective resource, wherein the first search result in the presentation includes a link that, upon a selection by a user, can cause the first suggested query to be submitted to a search engine.
Abstract:
A system and method for providing search query refinements are presented. A stored query and a stored document are associated as a logical pairing. A weight is assigned to the logical pairing. The search query is issued and a set of search documents is produced. At least one search document is matched to at least one stored document. The stored query and the assigned weight associated with the matching at least one stored document are retrieved. At least one cluster is formed based on the stored query and the assigned weight associated with the matching at least one stored document. The stored query associated with the matching at least one stored document is scored for the at least one cluster relative to at least one other cluster. At least one such scored search query is suggested as a set of query refinements.
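The steps above can be sketched roughly as follows. The abstract does not say how clusters are formed or scored, so this illustration makes two loud assumptions: each matched stored document forms its own cluster, and a cluster's score is the sum of the weights of its pairings. The function name and the `Pairing` tuple layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (stored_query, stored_document, weight): one logical pairing.
Pairing = Tuple[str, str, float]


def suggest_refinements(
    pairings: Iterable[Pairing],
    search_documents: Iterable[str],
    max_suggestions: int = 3,
) -> List[str]:
    """Suggest stored queries whose paired documents appear in the results."""
    matched = set(search_documents)

    # Assumption: one cluster per matched stored document, mapping each
    # stored query in the cluster to its accumulated weight.
    clusters: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for query, document, weight in pairings:
        if document in matched:
            clusters[document][query] += weight

    # Score clusters relative to one another by total weight (assumption).
    ranked = sorted(clusters.values(), key=lambda c: sum(c.values()), reverse=True)

    # Take the highest-weighted stored query from each cluster, best cluster first.
    suggestions: List[str] = []
    for cluster in ranked:
        best_query = max(cluster, key=cluster.get)
        if best_query not in suggestions:
            suggestions.append(best_query)
    return suggestions[:max_suggestions]
```

Picking one query per cluster is a design choice made here so that the suggestions cover distinct interpretations of the original query rather than several variants of the same one.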
Abstract:
A system and method for generating query refinement suggestions may include collecting refinement data for at least one received source query. The collected refinement data is then clustered to form at least one cluster. At least one potential refinement query suggestion is identified from the refinement data within the at least one cluster.
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using synthetic descriptive text to rank search results. One of the methods includes receiving a search query from a user device; receiving data identifying a plurality of search result resources and respective initial scores for each of the search result resources; determining, from a search engine index, that a particular search result resource of the plurality of search result resources is associated with one or more pieces of synthetic descriptive text, wherein each piece of synthetic descriptive text is generated by applying a respective template to a respective linking resource that links to the particular search result resource; computing a synthetic descriptive text score for the particular search result resource; and adjusting the initial score for the particular search result resource based at least in part on the synthetic descriptive text score.
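The final two steps, computing a synthetic descriptive text score and adjusting the initial score, might look like the sketch below. The abstract does not give either formula, so both are assumptions: the score is taken to be the best fraction of query terms covered by any one piece of synthetic descriptive text, and the adjustment is a multiplicative boost. `synthetic_index` is a hypothetical stand-in for the search engine index lookup.

```python
from typing import Dict, List


def synthetic_text_score(query: str, synthetic_texts: List[str]) -> float:
    """Assumed score: best fraction of query terms found in a single piece
    of synthetic descriptive text (0.0 when there is no text)."""
    query_terms = set(query.lower().split())
    if not query_terms or not synthetic_texts:
        return 0.0
    best = 0.0
    for text in synthetic_texts:
        text_terms = set(text.lower().split())
        best = max(best, len(query_terms & text_terms) / len(query_terms))
    return best


def adjust_scores(
    query: str,
    initial_scores: Dict[str, float],
    synthetic_index: Dict[str, List[str]],
    boost: float = 0.5,  # assumed strength of the adjustment
) -> Dict[str, float]:
    """Adjust each resource's initial score by its synthetic text score."""
    adjusted = {}
    for resource, score in initial_scores.items():
        texts = synthetic_index.get(resource, [])
        adjusted[resource] = score * (1.0 + boost * synthetic_text_score(query, texts))
    return adjusted
```

Resources with no associated synthetic descriptive text keep their initial score unchanged under this scheme.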
Abstract:
Web pages of a Website may be processed to improve search results. For example, information likely to pertain to more than just the Web page it is directly associated with may be identified. One or more other, related Web pages that such information is likely to pertain to are also identified. The identified information is associated with the identified other Web page(s) and this association is saved in a way to affect a search result score of the Web page(s).