Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for improving geographic targeting of digital content. In some implementations, a targeting request that identifies a target geographic region is received. Groups of geographic regions that each include the target geographic region and at least another geographic region are identified. Combined targeting accuracies are computed for the groups of geographic regions. One or more of the groups of geographic regions are selected based on their combined targeting accuracies being higher than a targeting accuracy for the target geographic region. Data describing the selected one or more groups of geographic regions is provided for output in response to the targeting request.
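A minimal sketch of the group-selection step described above, under stated assumptions: the abstract does not define "targeting accuracy", so this example treats it as the fraction of users estimated to be in a region (or group) who are actually in it, computed from a hypothetical confusion table. The names `confusion`, `group_accuracy`, and `better_groups` are illustrative only.

```python
from itertools import combinations

# Hypothetical data: confusion[estimated_region][actual_region] = user count.
confusion = {
    "A": {"A": 60, "B": 30, "C": 10},
    "B": {"A": 20, "B": 70, "C": 10},
    "C": {"A": 5, "B": 5, "C": 90},
}

def group_accuracy(confusion, group):
    """Fraction of users estimated to be in `group` who are actually in `group`."""
    group = set(group)
    total = sum(n for est in group for n in confusion[est].values())
    correct = sum(
        n
        for est in group
        for actual, n in confusion[est].items()
        if actual in group
    )
    return correct / total if total else 0.0

def better_groups(confusion, target, neighbors, max_extra=2):
    """Return groups containing `target` whose combined accuracy beats the target alone."""
    base = group_accuracy(confusion, [target])
    results = []
    for k in range(1, max_extra + 1):
        for extra in combinations(neighbors, k):
            group = (target, *extra)
            acc = group_accuracy(confusion, group)
            if acc > base:
                results.append((group, acc))
    # Highest combined accuracy first.
    return sorted(results, key=lambda item: item[1], reverse=True)

suggestions = better_groups(confusion, "A", ["B", "C"])
```

With this toy table, region A alone scores 0.6, while every group that adds a neighbor scores higher, so all three candidate groups would be offered in response to the request.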
Abstract:
Methods, systems, and apparatus, including computer program products, for constructing text classifiers. The method includes receiving a collection of candidate phrases for a given topic; filtering the received candidate phrases to remove erroneously included candidate phrases; assigning weights to the candidate phrases including scoring each candidate phrase using an initial classifier and assigning weights to the candidate phrases based on the scores; and generating a linear classifier using the filtered and weighted candidate phrases, where the linear classifier varies the weights for each phrase candidate depending on the length of the document being classified.
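The pipeline above (filter, weight, then classify with length-dependent weights) can be sketched as follows. This is an illustration under assumptions, not the method of the abstract: the blocklist filter, the `initial_score` callable, and the logarithmic length damping are all hypothetical choices made for the example.

```python
import math

def filter_phrases(candidates, blocklist):
    """Drop candidate phrases flagged as erroneous (hypothetical blocklist filter)."""
    return [p for p in candidates if p not in blocklist]

def assign_weights(phrases, initial_score):
    """Weight each phrase by the score an initial classifier gives it."""
    return {p: initial_score(p) for p in phrases}

def classify(document, weights, threshold=1.0):
    """Linear score over phrase counts, with weights damped for longer documents."""
    text = document.lower()
    n_words = max(len(text.split()), 1)
    # Assumed damping: every phrase weight is scaled by 1 / log(1 + length).
    scale = 1.0 / math.log(1 + n_words)
    # str.count does plain substring matching, which is enough for a sketch.
    score = sum(w * scale * text.count(p) for p, w in weights.items())
    return score >= threshold, score

# Hypothetical initial classifier scores for a "basketball" topic.
seed_scores = {"slam dunk": 2.0, "free throw": 1.5, "click here": 0.1}
phrases = filter_phrases(list(seed_scores), blocklist={"click here"})
weights = assign_weights(phrases, seed_scores.get)

short_doc = "a slam dunk and a free throw"
long_doc = "slam dunk " + "filler " * 98
```

The same single phrase match counts for less in `long_doc` than in `short_doc`, which is the effect the length-dependent weighting is meant to capture.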
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining geographic locations. One of the methods includes obtaining a sequence of events, each of the events including geographical location information, from a first device to be located; determining, for each event and each of a plurality of geographical locations, a probability that the respective event was obtained from a second device given that the second device is located at the respective geographical location; determining a probability that the sequence of events was obtained from the second device, including using a model representing how sequences of events are generated by network devices; and determining for each of the plurality of geographical locations a probability that the first device is located at the respective geographical location using the probability that the sequence of events was obtained.
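The probability computation above can be illustrated with a small Bayesian sketch. It assumes events are conditionally independent given the location, which is a simplification standing in for the abstract's unspecified sequence model; the event names, prior, and probability table are hypothetical.

```python
import math

def location_posterior(events, p_event_given_loc, prior, floor=1e-9):
    """Return P(location | events), assuming independent events given location."""
    log_joint = {}
    for loc, p_loc in prior.items():
        lp = math.log(p_loc)
        for ev in events:
            # Unseen events get a small floor probability instead of zero.
            lp += math.log(p_event_given_loc[loc].get(ev, floor))
        log_joint[loc] = lp
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_joint.values())
    unnorm = {loc: math.exp(lp - m) for loc, lp in log_joint.items()}
    # The normaliser is proportional to the probability of the whole sequence.
    z = sum(unnorm.values())
    return {loc: v / z for loc, v in unnorm.items()}

prior = {"paris": 0.5, "lyon": 0.5}
p_event_given_loc = {
    "paris": {"q:paris": 0.7, "q:lyon": 0.1, "q:other": 0.2},
    "lyon": {"q:paris": 0.2, "q:lyon": 0.6, "q:other": 0.2},
}
posterior = location_posterior(
    ["q:paris", "q:paris", "q:other"], p_event_given_loc, prior
)
```

Here the per-event likelihoods play the role of "the probability that the event was obtained from a device located at that location", and the normaliser plays the role of the probability of the observed sequence.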
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining geographic locations of devices. One of the methods includes obtaining an estimated user location associated with each respective IP address block based on observed events from the IP address block; obtaining an estimate of a probability model p(ev|loc), the probability model p(ev|loc) including a respective probability distribution of interest locations for each of multiple user locations; wherein obtaining the estimate of the probability model p(ev|loc) includes calculating p(ev|loc) from a p(zone|loc) matrix and a p(ev|zone) matrix; and using the estimate for the probability model p(ev|loc) and the observed events to calculate an estimate for multiple probability distributions X(loc) associated with a respective IP address block.
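One plausible reading of "calculating p(ev|loc) from a p(zone|loc) matrix and a p(ev|zone) matrix" is a marginalisation over zones, which amounts to a matrix product. This assumes an event depends on the user location only through the zone, an assumption the abstract does not state explicitly:

```latex
p(\mathrm{ev} \mid \mathrm{loc})
  = \sum_{\mathrm{zone}} p(\mathrm{ev} \mid \mathrm{zone})\, p(\mathrm{zone} \mid \mathrm{loc})
```

If each row of the p(zone|loc) matrix sums to one over zones, and each row of the p(ev|zone) matrix sums to one over events, the product is again a valid distribution over events for every location.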
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for receiving a targeting request that identifies a target geographic region; identifying one or more groups of geographic regions that each include at least two geographic regions, including the target geographic region, wherein the one or more groups of geographic regions are identified based on respective combined targeting accuracies, the respective combined targeting accuracy of each of the one or more groups being higher than a targeting accuracy for the target geographic region; and providing data describing the identified one or more groups of geographic regions in response to the targeting request.
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for inferring the geographical location of devices. One of the methods includes obtaining device information associated with a first device located at a respective geographical location, the device information including a plurality of events obtained from the first device, wherein at least one event of the obtained events contains ambiguous geographical location information that can be interpreted as relating to one of two or more alternative geographical locations; identifying the at least one event containing ambiguous geographical location information; and determining an estimate of the geographical location of the first device based at least in part on the device information taking into account that the at least one identified event contains ambiguous geographical location information.
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying resources using scores from multiple classifiers. In general, one aspect of the subject matter described in this specification can be embodied in methods that include the actions of receiving a collection of documents to classify; receiving a plurality of classifiers for scoring a document with respect to a specified property; for each document in the collection, applying each of the plurality of classifiers, each classifier generating a score associated with a likelihood that the document has the specified property, combining the scores from each classifier including applying a multiple classifier model that uses monotonic regression to combine the plurality of classifiers, and classifying the document as having the specified property based on the combined score.
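As a rough illustration of combining classifier scores with monotonic regression, the sketch below averages the per-classifier scores and then calibrates that average against labels with one-dimensional isotonic regression (pool-adjacent-violators). This is a deliberately simplified stand-in: the abstract does not say how its multiple classifier model is built, and a genuinely multi-dimensional monotonic fit would be more involved. All names and data are hypothetical.

```python
from bisect import bisect_right

def pav(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted

def fit_combiner(score_rows, labels):
    """Fit a monotone map from mean classifier score to P(property)."""
    combined = [sum(row) / len(row) for row in score_rows]
    order = sorted(range(len(combined)), key=combined.__getitem__)
    xs = [combined[i] for i in order]
    ys = pav([float(labels[i]) for i in order])
    return xs, ys

def predict(model, scores):
    """Step-function lookup of the calibrated probability for one document."""
    xs, ys = model
    x = sum(scores) / len(scores)
    i = bisect_right(xs, x) - 1
    return ys[max(i, 0)]

# Hypothetical training data: two classifier scores per document, plus a label.
score_rows = [[0.1, 0.2], [0.4, 0.3], [0.5, 0.6], [0.9, 0.8]]
labels = [0, 1, 0, 1]
model = fit_combiner(score_rows, labels)
has_property = predict(model, [0.9, 0.9]) >= 0.5
```

The monotonicity constraint guarantees that raising any classifier's score never lowers the combined score, which is the property that makes this family of combiners attractive.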
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining geographical locations of devices. One of the methods includes obtaining a first network address of a first device; obtaining first route information associated with at least one data transmission between a first source network address and the first network address; obtaining a second network address associated with a second device; obtaining second route information associated with at least one data transmission between a second source network address and the second network address; obtaining an estimate for a geographical location of the second device; determining a first latency distance between the first network address and the second network address based on the first and second route information; and estimating a geographical location of the first device based on the estimate for the geographical location of the second device and the first latency distance.
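A hedged sketch of how a latency distance might be derived from two routes and turned into a location estimate. The abstract does not define "route information" or "latency distance"; this example assumes traceroute-style lists of (hop address, cumulative round-trip time in ms) taken from the same source, and measures the latency accumulated after the last shared hop. The conversion of roughly 100 km per millisecond of round-trip time follows from light travelling about 200 km/ms in fibre, and is only a loose upper bound.

```python
KM_PER_RTT_MS = 100.0  # assumed upper bound: ~200 km/ms one way in fibre

def latency_distance(route_a, route_b):
    """Latency (ms) from the last shared hop to each target, summed."""
    shared_a = shared_b = 0.0
    for (hop_a, rtt_a), (hop_b, rtt_b) in zip(route_a, route_b):
        if hop_a != hop_b:
            break
        shared_a, shared_b = rtt_a, rtt_b
    return (route_a[-1][1] - shared_a) + (route_b[-1][1] - shared_b)

def estimate_location(known_location, route_unknown, route_known):
    """Place the unknown device within a radius of the known device's location."""
    dist_ms = latency_distance(route_unknown, route_known)
    return {"center": known_location, "radius_km": dist_ms * KM_PER_RTT_MS}

# Hypothetical routes from one source; addresses use documentation ranges.
route_first = [("10.0.0.1", 1.0), ("10.0.1.1", 5.0), ("198.51.100.7", 6.5)]
route_second = [("10.0.0.1", 1.1), ("10.0.1.1", 5.2), ("203.0.113.9", 6.0)]
second_location = (48.8566, 2.3522)  # assumed known (lat, lon) of the second device

estimate = estimate_location(second_location, route_first, route_second)
```

A small latency distance ties the first device closely to the second device's known location; a large one yields a correspondingly wide and less useful radius.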
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for improving geographic targeting of digital content. One of the methods includes receiving a targeting request that identifies a target geographic region; identifying one or more groups of geographic regions that each include at least two geographic regions, including the target geographic region, wherein the one or more groups of geographic regions are identified based on respective combined targeting accuracies, the respective combined targeting accuracy of each of the one or more groups being higher than a targeting accuracy for the target geographic region; and providing data describing the identified one or more groups of geographic regions in response to the targeting request.