Abstract:
Subject matter disclosed herein relates to document classification and/or automated document classifier tuning. In an example embodiment, a document received from a user computing platform in an online database stored in a memory of a server computing platform may be classified based, at least in part, on a training set. Also, in an example embodiment, the training set may be modified based, at least in part, on statistics gathered from user browsing behavior.
Abstract:
A method of generating a diversified vertical search results listing, including listing attribute values related to search criteria and their frequency of occurrence to create a plurality of listings; creating a plurality of interval bands based on the plurality of listings; generating a random diversity score for each listing over a substantially uniform distribution within each of the plurality of bands; and sorting a set of search results for diversified listing in response to a user searching for the search criteria according to the diversity score of each listing.
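A minimal sketch of banded diversity scoring follows. The abstract does not say how the interval bands are derived, so this sketch assumes one possible reading: the i-th listing that shares a given attribute value falls into band [i, i+1), and its score is drawn uniformly within that band. The function name `diversify` and the dictionary-based listing format are hypothetical.

```python
import random
from collections import defaultdict

def diversify(results, attribute, rng=None):
    """Order results so that repeated attribute values are spread out.

    Assumption (not from the abstract): the i-th listing seen for a given
    attribute value is placed in band [i, i + 1).
    """
    rng = rng or random.Random()
    seen = defaultdict(int)          # occurrences so far per attribute value
    scored = []
    for item in results:
        value = item[attribute]
        band = seen[value]           # 0 for the first occurrence, 1 for the second, ...
        seen[value] += 1
        score = band + rng.random()  # uniform draw within [band, band + 1)
        scored.append((score, item))
    scored.sort(key=lambda pair: pair[0])
    return [item for _, item in scored]

listings = [
    {"id": 1, "seller": "A"}, {"id": 2, "seller": "A"}, {"id": 3, "seller": "A"},
    {"id": 4, "seller": "B"}, {"id": 5, "seller": "B"}, {"id": 6, "seller": "C"},
]
ordered = diversify(listings, "seller", random.Random(0))
```

Under this reading, every attribute value appears once before any value appears a second time, while the order inside each band stays random.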
Abstract:
A method and system for allocating inventory in an Internet environment is provided. A method employed by the system may include generating an inventory pool that represents a number of impressions deliverable to all users, then determining, from multiple past orders for booking impressions, a hierarchy of parameters utilized to target users and a number of impressions deliverable to users characterized by the parameters. The inventory pool may then be partitioned into multiple inventory pools according to the hierarchy, where each inventory pool represents a number of impressions deliverable to users characterized by parameters associated with the inventory pool. The hierarchy of pools may then be stored to a database.
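The partitioning step can be illustrated with a small sketch. It assumes, purely for illustration, that the hierarchy orders targeting parameters by how often past orders used them, and that the inventory forecast is a list of (user attributes, impression count) pairs. The names `parameter_hierarchy` and `partition_pool` and the nested-dictionary representation are hypothetical, and the storage step is omitted.

```python
from collections import Counter

def parameter_hierarchy(past_orders):
    """Order targeting parameters by how often past orders used them (assumed rule)."""
    counts = Counter(p for order in past_orders for p in order)
    return [p for p, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]

def partition_pool(impressions, hierarchy):
    """Split a pool into nested pools, one level per parameter in the hierarchy.

    impressions: list of (attributes dict, impression count) pairs.
    Returns a nested dict whose leaves are impression counts.
    """
    if not hierarchy:
        return sum(count for _, count in impressions)
    head, rest = hierarchy[0], hierarchy[1:]
    groups = {}
    for attrs, count in impressions:
        groups.setdefault(attrs.get(head, "other"), []).append((attrs, count))
    return {value: partition_pool(group, rest) for value, group in groups.items()}

past_orders = [["country", "age"], ["country"], ["country", "age"]]
forecast = [
    ({"country": "US", "age": "18-24"}, 100),
    ({"country": "US", "age": "25-34"}, 50),
    ({"country": "CA", "age": "18-24"}, 30),
]
hierarchy = parameter_hierarchy(past_orders)
pools = partition_pool(forecast, hierarchy)
```

Each leaf holds the impressions deliverable to users matching every parameter value on the path from the root, so the leaves sum to the size of the original pool.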
Abstract:
A method of constructing a score-optimal R-tree to support top-k stabbing queries over a set of scored intervals generates a constraint graph from the set. Among the nodes of the constraint graph that have no other nodes pointing to them, the method determines the node with the smallest left endpoint; the interval associated with that node is added to the tree, and the node is removed from the constraint graph.
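The greedy selection loop can be sketched as a topological traversal that always picks the unblocked node with the smallest left endpoint. The abstract does not define how the constraint graph is derived from the scored intervals, so this sketch takes the edges as input. The function name `insertion_order` is hypothetical, and the tree is stood in for by the order in which intervals would be added.

```python
import heapq

def insertion_order(intervals, edges):
    """Return the order in which intervals would be added to the tree.

    intervals: {name: (left, right, score)}
    edges: (u, v) pairs meaning u points to v in the constraint graph
           (assumed given; building the graph is outside this sketch).
    """
    indegree = {name: 0 for name in intervals}
    successors = {name: [] for name in intervals}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    # Nodes with no incoming edges, keyed by left endpoint.
    ready = [(intervals[n][0], n) for n, d in indegree.items() if d == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, name = heapq.heappop(ready)   # smallest left endpoint among unblocked nodes
        order.append(name)               # "add the interval to the tree"
        for nxt in successors[name]:     # "remove the node from the graph"
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                heapq.heappush(ready, (intervals[nxt][0], nxt))
    return order

intervals = {"a": (5, 9, 0.9), "b": (1, 4, 0.5), "c": (2, 6, 0.7), "d": (0, 3, 0.1)}
order = insertion_order(intervals, [("a", "d")])
```

In this example `d` has the smallest left endpoint overall but is added last, because `a` points to it and must be removed from the graph first.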
Abstract:
The subject matter disclosed herein relates to processing information regarding events. In one particular example, a stabbing query may be formulated in response to an event. One or more sets are associated with and/or mapped to nodes of a tree.
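For readers unfamiliar with the term, a stabbing query returns every interval that contains a given point. The sketch below uses a centered interval tree, a standard structure that is not necessarily the one used in the disclosure, in which each node holds the set of intervals that contain its center point. The class name `Node` and the (low, high, label) tuple layout are hypothetical.

```python
class Node:
    """Centered interval tree node over closed intervals (low, high, label)."""

    def __init__(self, intervals):
        points = sorted(p for iv in intervals for p in iv[:2])
        self.center = points[len(points) // 2]
        # Set of intervals mapped to this node: those containing the center.
        self.here = [iv for iv in intervals if iv[0] <= self.center <= iv[1]]
        left = [iv for iv in intervals if iv[1] < self.center]
        right = [iv for iv in intervals if iv[0] > self.center]
        self.left = Node(left) if left else None
        self.right = Node(right) if right else None

    def stab(self, q):
        """Return all intervals that contain the query point q."""
        hits = [iv for iv in self.here if iv[0] <= q <= iv[1]]
        if q < self.center and self.left:
            hits += self.left.stab(q)
        elif q > self.center and self.right:
            hits += self.right.stab(q)
        return hits

tree = Node([(1, 5, "a"), (3, 8, "b"), (6, 9, "c"), (10, 12, "d")])
labels_at_4 = sorted(iv[2] for iv in tree.stab(4))
```

An event carrying a value such as a timestamp or a price could then be matched against stored ranges by calling `stab` with that value.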
Abstract:
Example embodiments described herein may relate to estimating inventory for a display advertising system utilized, for example, in Web-based advertising.
Abstract:
Methods and apparatus for representing probabilistic data using a probabilistic histogram are disclosed. An example method comprises partitioning a plurality of ordered data items into a plurality of buckets, each of the data items capable of having a data value from a plurality of possible data values with a probability characterized by a respective individual probability distribution function (PDF), each bucket associated with a respective subset of the ordered data items bounded by a respective beginning data item and a respective ending data item, and determining a first representative PDF for a first bucket associated with a first subset of the ordered data items by partitioning the plurality of possible data values into a first plurality of representative data ranges and respective representative probabilities based on an error between the first representative PDF and a first plurality of individual PDFs characterizing the first subset of the ordered data items.
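A small sketch of computing one bucket's representative PDF follows. The abstract does not name the error measure and selects the representative ranges based on that error. This sketch instead assumes sum-squared error and takes the ranges as given, in which case the best constant probability for a range is the mean of the individual probabilities it covers. The function name `representative_pdf` is hypothetical.

```python
def representative_pdf(pdfs, ranges):
    """Summarize the individual PDFs of one bucket by a piecewise-constant PDF.

    pdfs:   list of lists, pdfs[i][v] = P(item i takes value v)
    ranges: half-open (start, end) index ranges covering the possible values
            (assumed fixed here; the abstract chooses them based on error)
    Returns ([((start, end), probability per value), ...], sum-squared error).
    """
    rep = []
    error = 0.0
    for start, end in ranges:
        cells = [pdf[v] for pdf in pdfs for v in range(start, end)]
        level = sum(cells) / len(cells)   # mean minimizes squared error
        rep.append(((start, end), level))
        error += sum((c - level) ** 2 for c in cells)
    return rep, error

bucket = [
    [0.5, 0.5, 0.0, 0.0],     # PDF of the first item in the bucket
    [0.25, 0.75, 0.0, 0.0],   # PDF of the second item in the bucket
]
rep, err = representative_pdf(bucket, [(0, 2), (2, 4)])
```

A full construction would also search over bucket boundaries and range boundaries, for example with dynamic programming, which this sketch leaves out.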
Abstract:
A distinct-count estimate is obtained in a guaranteed small footprint using a two-level hash distinct-count sketch. A first hash fills the first-level hash buckets with an exponentially decreasing number of data-elements. These are then uniformly hashed to an array of second-level hash tables, each of which has an associated total-element counter and bit-location counters. These counters are used to identify singletons and so provide a distinct-sample and a distinct-count. An estimate of the total distinct-count is obtained by dividing the distinct-count by the probability of mapping a data-element to that bucket. An estimate of the total distinct-source frequencies of destination addresses can be found in a similar fashion. By further associating the distinct-count sketch with a list of singletons, a total singleton count, and a heap containing the destination addresses ordered by their distinct-source frequencies, a tracking distinct-count sketch may be formed that has considerably improved query time.
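The counter mechanics can be sketched as follows. This is an illustrative reading of the abstract, not the claimed implementation: the hash functions (SHA-256 with a salt), the 32-bit element width, the table sizes, and all class and method names are assumptions. The tracking extensions (singleton list and heap) are omitted.

```python
import hashlib

BITS = 32  # assumed width of a data-element (non-negative integer)

def _hash(x, salt):
    digest = hashlib.sha256(f"{salt}:{x}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

class Cell:
    """Second-level cell: a total-element counter plus one counter per bit location."""

    def __init__(self):
        self.total = 0
        self.bit_counts = [0] * BITS

    def update(self, x, delta=1):
        self.total += delta
        for i in range(BITS):
            if (x >> i) & 1:
                self.bit_counts[i] += delta

    def singleton(self):
        """Return the only distinct element in the cell, or None."""
        if self.total <= 0:
            return None
        x = 0
        for i, c in enumerate(self.bit_counts):
            if c == self.total:
                x |= 1 << i          # every element in the cell has this bit set
            elif c != 0:
                return None          # elements disagree on this bit: a collision
        return x

class DistinctCountSketch:
    def __init__(self, levels=16, table_size=64):
        self.levels = levels
        self.table_size = table_size
        self.cells = [[Cell() for _ in range(table_size)] for _ in range(levels)]

    def level_of(self, x):
        # First-level hash: level b is chosen with probability 2**-(b + 1).
        h = _hash(x, "level")
        trailing_zeros = (h & -h).bit_length() - 1 if h else self.levels
        return min(trailing_zeros, self.levels - 1)

    def update(self, x, delta=1):
        slot = _hash(x, "slot") % self.table_size   # uniform second-level hash
        self.cells[self.level_of(x)][slot].update(x, delta)

    def distinct_sample(self, level):
        found = (cell.singleton() for cell in self.cells[level])
        return {x for x in found if x is not None}

    def estimate(self, level):
        # Distinct-count at this level divided by the probability of reaching it
        # (valid for level < levels - 1, where that probability is 2**-(level + 1)).
        return len(self.distinct_sample(level)) * 2 ** (level + 1)
```

Because the counters are additive, passing `delta=-1` removes an earlier insertion, and a cell that held a collision becomes a recoverable singleton again once the colliding element is deleted.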