Abstract:
A system and computer-implemented method are provided for associating categories with business names in order to generalize search queries. The method includes: identifying one or more businesses within a first geographic region; determining a business name and one or more categories for each of the one or more businesses; generating one or more name components for each of the one or more businesses from the name of the business; generating one or more name component groups from the name components of the one or more businesses, each name component group including one or more identical name components; determining, for each name component group, whether the one or more name components within the name component group are associated with businesses that share one or more common categories; and associating the one or more common categories with the name component of the name component group.
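The grouping step described above can be illustrated with a minimal Python sketch. It is not taken from the abstract: the whitespace tokenization, the `min_group_size` cutoff, and the function names `name_components` and `categories_by_component` are assumptions chosen for illustration, and the sample businesses are invented.

```python
from collections import defaultdict


def name_components(name):
    """Split a business name into lowercase word tokens (assumed tokenization)."""
    return name.lower().split()


def categories_by_component(businesses, min_group_size=2):
    """Map each name component to the categories shared by all businesses
    whose names contain that component.

    `businesses` is an iterable of (name, categories) pairs for one region.
    `min_group_size` is an assumed cutoff; the abstract allows groups of one.
    """
    groups = defaultdict(list)  # component -> category sets of matching businesses
    for name, categories in businesses:
        for component in set(name_components(name)):
            groups[component].append(set(categories))

    associations = {}
    for component, category_sets in groups.items():
        if len(category_sets) < min_group_size:
            continue
        common = set.intersection(*category_sets)
        if common:  # keep only components whose businesses share a category
            associations[component] = common
    return associations


# Invented sample data for one geographic region.
businesses = [
    ("Joe's Pizza", {"restaurant", "pizza"}),
    ("Pizza Palace", {"restaurant", "pizza", "delivery"}),
    ("Palace Cinema", {"movie theater"}),
]
result = categories_by_component(businesses)
```

In this sample, "pizza" is associated with the categories common to both pizza businesses, while "palace" is associated with nothing because its two businesses share no category.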
Abstract:
Provided is a process of and apparatus for spatially indexing geographic items. The process may include obtaining geographic-item data identifying geographic items, the geographic location of each item, and an attribute of each item, wherein the geographic-item data identifies values of key-value pairs to be formed in a spatial index; obtaining a plurality of geographic-location keys each corresponding to a geographic area, the geographic-location keys identifying keys of the key-value pairs to be formed in the spatial index; and pairing each geographic-location key with an item among the geographic-item data. Pairing may be performed by: calculating distances between the geographic location of each of the items and the geographic-location key; weighting each of the distances based on the attribute of the item corresponding to that distance; and selecting the geographic item having the closest attribute-weighted distance as the item to be paired with the geographic-location key.
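The pairing step can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the claimed implementation: locations are planar (x, y) points rather than latitude and longitude, the attribute is a hypothetical prominence `weight`, and the weighting divides distance by that weight. The abstract fixes none of these choices.

```python
import math


def weighted_distance(key_location, item):
    """Planar distance from key to item, divided by the item's weight.

    Dividing by the weight is an assumed weighting scheme: a larger weight
    makes an item count as "closer" to the key.
    """
    dx = key_location[0] - item["location"][0]
    dy = key_location[1] - item["location"][1]
    return math.hypot(dx, dy) / item["weight"]


def build_spatial_index(keys, items):
    """Pair each geographic-location key with the item that has the
    smallest attribute-weighted distance to it."""
    index = {}
    for key, location in keys.items():
        best = min(items, key=lambda item: weighted_distance(location, item))
        index[key] = best["name"]
    return index


# Invented sample data: two grid cells and two items.
keys = {"cell_a": (0.0, 0.0), "cell_b": (3.2, 0.0)}
items = [
    {"name": "corner store", "location": (2.0, 0.0), "weight": 1.0},
    {"name": "stadium", "location": (5.0, 0.0), "weight": 2.0},
]
index = build_spatial_index(keys, items)
```

Here `cell_b` is paired with the stadium even though the corner store is physically nearer, because the stadium's higher weight shrinks its weighted distance.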
Abstract:
A spelling system derives a language model for a particular domain of structured data, the language model enabling determinations of alternative spellings of queries or other strings of text from that domain. More specifically, the spelling system calculates (a) probabilities that the various query entity types—such as STREET, CITY, or STATE for queries in the geographical domain—are arranged in each of the various possible orders, and (b) probabilities that an arbitrary query references given particular ones of the entities, such as the street “El Camino Real.” Based on the calculated probabilities, the spelling system generates a language model that has associated scores (e.g., probabilities) for each of a set of probable entity name orderings, where the total number of entity name orderings is substantially less than the number of all possible orderings. The language model can be applied to determine probabilities of arbitrary queries, and thus to suggest alternative queries more likely to represent what a user intended.
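A toy version of such a language model is sketched below in Python. It assumes training queries are already tagged with entity types, estimates entity probabilities per type, and scores a query as the ordering probability multiplied by the entity probabilities. The function names, the `max_orderings` cutoff, and the sample queries are illustrative assumptions rather than details from the abstract.

```python
from collections import Counter


def build_language_model(tagged_queries, max_orderings=1):
    """Estimate ordering and entity probabilities from tagged queries.

    Each query is a list of (entity_type, entity_name) pairs. Only the
    `max_orderings` most frequent type orderings are kept, so the model
    holds far fewer orderings than all possible permutations.
    """
    ordering_counts = Counter()
    entity_counts = Counter()
    type_counts = Counter()
    for query in tagged_queries:
        ordering_counts[tuple(t for t, _ in query)] += 1
        for entity_type, name in query:
            entity_counts[(entity_type, name)] += 1
            type_counts[entity_type] += 1

    total = sum(ordering_counts.values())
    orderings = {o: c / total
                 for o, c in ordering_counts.most_common(max_orderings)}
    entities = {pair: c / type_counts[pair[0]]
                for pair, c in entity_counts.items()}
    return orderings, entities


def score(query, orderings, entities):
    """Probability of a tagged query; zero for unseen orderings or entities."""
    probability = orderings.get(tuple(t for t, _ in query), 0.0)
    for pair in query:
        probability *= entities.get(pair, 0.0)
    return probability


# Invented training queries in the geographic domain.
training = [
    [("STREET", "el camino real"), ("CITY", "palo alto")],
    [("STREET", "el camino real"), ("CITY", "menlo park")],
    [("CITY", "palo alto"), ("STATE", "ca")],
    [("STREET", "university ave"), ("CITY", "palo alto")],
]
orderings, entities = build_language_model(training)

# Rank candidate rewrites of a misspelled query by model probability.
candidates = [
    [("STREET", "el camino reel"), ("CITY", "palo alto")],
    [("STREET", "el camino real"), ("CITY", "palo alto")],
]
best = max(candidates, key=lambda q: score(q, orderings, entities))
```

The candidate containing the known street name scores higher than the misspelled one, so it would be the suggested alternative.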
Abstract:
Provided is a process for identifying a new business listing, the process including: identifying, from a log of local search queries, a term that does not correspond to a name of a business listing; determining a number of recent search queries containing the term and a number of historical search queries containing the term; determining that a rate based on the number of recent search queries exceeds a threshold rate based on the number of historical search queries; and identifying the term as a name of a new business listing.
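The rate comparison can be sketched in Python as below. Several details are assumptions made for illustration: each log entry is a `(day, term)` pair with an integer day number, the whole query is treated as the term, and the threshold is the historical rate scaled by a `growth` factor with a `min_rate` floor. The window sizes and sample log are invented.

```python
from collections import Counter


def find_new_listing_names(log, known_names, now, recent_days=7,
                           history_days=70, growth=3.0, min_rate=0.3):
    """Return terms whose recent query rate exceeds a threshold derived
    from their historical query rate.

    `log` is an iterable of (day, term) pairs; terms matching a known
    listing name are ignored.
    """
    recent, historical = Counter(), Counter()
    recent_start = now - recent_days
    history_start = recent_start - history_days
    for day, term in log:
        if term in known_names:
            continue
        if recent_start < day <= now:
            recent[term] += 1
        elif history_start < day <= recent_start:
            historical[term] += 1

    new_names = set()
    for term, count in recent.items():
        recent_rate = count / recent_days
        # Assumed threshold: scaled historical rate, with a floor so that a
        # term with no history still needs some minimum recent volume.
        threshold = max(growth * historical[term] / history_days, min_rate)
        if recent_rate > threshold:
            new_names.add(term)
    return new_names


# Invented query log: a steady generic query, a sudden new name, and a
# known listing that is queried often.
log = [(day, "pizza near me") for day in range(30, 100, 5)]
log += [(day, "zephyr bakery") for day in (96, 97, 98, 99, 99)]
log += [(day, "joe's pizza") for day in range(94, 100)]
new_names = find_new_listing_names(log, known_names={"joe's pizza"}, now=100)
```

Only the term that surges without matching a known listing is flagged; the steady query and the known listing are not.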