Abstract:
A method for performing continuous auctions over a computer network system consisting of a server/seller and multiple clients/buyers. The seller makes information about the type of sale items, the number of sale items, the minimum bid price, the time limits for bids to be submitted, and the estimated time interval to the next auction decision available to the buyers by displaying it on the buyers' computer terminals. Each buyer responds by entering a bid and the bid's duration, within the time limits set by the seller, into the auction system through the buyer's computer terminal. Additionally, each buyer's bid entry time is saved by the system. The response time of the buyers currently present is determined in order to schedule the next auction. At least one auction winner, whose bid is still within its bid duration, is selected through a dynamically adjusted customer selection method.
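The bid-validity check described above can be illustrated with a minimal sketch. The abstract does not specify the "dynamically adjusted customer selection method", so this example assumes the simplest stand-in: highest price wins, with earlier entry time breaking ties. The names `Bid` and `select_winners` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    buyer: str
    price: float
    entry_time: float  # saved by the system when the bid is entered
    duration: float    # how long the buyer keeps the bid open


def select_winners(bids, now, min_price, items_available):
    """Return up to `items_available` winning bids at decision time `now`.

    Assumption (not from the abstract): rank live bids by price,
    breaking ties in favour of the earlier entry time.
    """
    live = [
        b for b in bids
        if b.price >= min_price
        and b.entry_time <= now <= b.entry_time + b.duration
    ]
    live.sort(key=lambda b: (-b.price, b.entry_time))
    return live[:items_available]


bids = [
    Bid("ann", 12.0, 0.0, 5.0),
    Bid("bob", 15.0, 1.0, 2.0),   # expires at t = 3
    Bid("cy", 9.0, 2.0, 10.0),    # below the minimum bid price
    Bid("di", 12.0, 3.0, 10.0),
]
winners = select_winners(bids, now=4.0, min_price=10.0, items_available=2)
print([b.buyer for b in winners])
```

A bid that has outlived its duration is dropped before ranking, which is why the highest nominal bid does not necessarily win.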
Abstract:
A method and system of collaboratively caching information to allow improved caching decisions by a lower level or sibling node. In a caching hierarchy, the client and/or servers may factor in the caching status at the higher level in deciding whether to cache an object and which objects are to be replaced. The PICS protocol may be used to pass the caching information of some or all of the upper hierarchy down the hierarchy. Furthermore, the caching status information can also be used to direct the object request to the closest higher level proxy which has potentially cached the object, instead of blindly requesting it from the next immediate higher level proxy. A selection policy used to select objects for replacement in the cache may be prioritized not only on the size and the frequency of access of the object, but also on the access time required to get the object if it is not cached. The selection policy may also include a selection weight factor wherein each object is assigned a selection weight based on its replacement cost, the object size and how frequently it is modified. Non-uniform size objects may be classified in ranges of selection weights having geometrically increasing intervals. Multiple LRU stacks may be independently maintained wherein each stack contains objects in a certain range of selection weights. In order to choose candidates for replacement, only the least recently used objects in each group need be considered.
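The multiple-LRU-stack idea can be sketched as follows. This is an illustration under stated assumptions, not the patented policy: the selection weight is supplied by the caller rather than derived from cost, size and modification rate; ranges are powers of two; and the victim is the stack tail with the lowest weight-per-idle-time score. The class name `MultiStackLRU` and that scoring rule are hypothetical.

```python
import math
from collections import OrderedDict


class MultiStackLRU:
    """One LRU stack per geometric range of selection weights (a sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity  # total size budget
        self.used = 0
        self.clock = 0            # logical time, bumped on every access
        self.stacks = {}          # range index -> OrderedDict, oldest first
        self.index_of = {}        # key -> range index

    @staticmethod
    def range_index(weight):
        # Range k covers weights in [2**k, 2**(k+1)); weights below 1 share range 0.
        return max(0, math.floor(math.log2(max(weight, 1.0))))

    def access(self, key, size, weight):
        if size > self.capacity:
            return False          # object can never fit
        self.clock += 1
        if key in self.index_of:
            self._remove(key)     # re-inserted below as most recently used
        while self.used + size > self.capacity and self.index_of:
            self._evict()
        k = self.range_index(weight)
        self.stacks.setdefault(k, OrderedDict())[key] = (size, weight, self.clock)
        self.index_of[key] = k
        self.used += size
        return True

    def _remove(self, key):
        k = self.index_of.pop(key)
        size, _, _ = self.stacks[k].pop(key)
        self.used -= size
        if not self.stacks[k]:
            del self.stacks[k]

    def _evict(self):
        # Only the least recently used entry of each stack is a candidate.
        candidates = []
        for stack in self.stacks.values():
            key, (_, weight, last) = next(iter(stack.items()))
            # Assumed score: low weight and long idle time make a good victim.
            candidates.append((weight / (self.clock - last + 1), key))
        _, victim = min(candidates)
        self._remove(victim)
        return victim


cache = MultiStackLRU(capacity=3)
cache.access("a", 1, 1.0)
cache.access("b", 1, 8.0)
cache.access("c", 1, 1.0)
cache.access("d", 1, 1.0)   # forces one eviction
print(sorted(cache.index_of))
```

Because each eviction inspects one entry per stack, the cost of choosing a victim grows with the number of weight ranges rather than with the number of cached objects.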
Abstract:
A system and method are provided to analyze information stored in a computer data base by detecting clusters of related or correlated data values. Data values stored in the data base represent a set of objects. A data value is stored in the data base as an instance of a set of features that characterize the objects. The features are the dimensions of the feature space of the data base. Each cluster includes not only a subset of related data values stored in the data base but also a subset of features. The data values in a cluster are data values that are a short distance apart, in the sense of a metric, when projected onto a subspace that corresponds to the subset of features of the cluster. A set of k clusters may be detected such that the average number of features of the subsets of features of the clusters is l.
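To make the notion of distance in a projected subspace concrete, here is a minimal sketch. It assumes a particular metric, the mean absolute difference over a cluster's own feature subset, so that clusters with different numbers of features remain comparable; the abstract only requires some metric. The names `segmental_distance` and `assign` are hypothetical.

```python
def segmental_distance(x, y, features):
    """Mean absolute difference of x and y over the given feature indices."""
    return sum(abs(x[i] - y[i]) for i in features) / len(features)


def assign(point, clusters):
    """Return the index of the cluster whose center is nearest to `point`,
    measuring distance only in that cluster's own feature subset.

    `clusters` is a list of (center, feature_indices) pairs.
    """
    return min(
        range(len(clusters)),
        key=lambda c: segmental_distance(point, clusters[c][0], clusters[c][1]),
    )


clusters = [
    ((0.0, 0.0, 0.0), (0, 1)),    # cluster 0 is defined on features 0 and 1
    ((10.0, 10.0, 10.0), (2,)),   # cluster 1 is defined on feature 2 only
]
print(assign((0.5, 0.5, 10.0), clusters))  # close to cluster 1 on feature 2
print(assign((0.0, 1.0, 50.0), clusters))  # close to cluster 0 on features 0, 1
```

Two points can be far apart in the full feature space yet close once projected onto a cluster's subspace, which is what allows each cluster to carry its own feature subset.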
Abstract:
A computer method of removing simple and strict redundant association rules generated from large collections of data. A compact set of rules is presented to an end user which is devoid of many redundancies in the discovery of data patterns. The method is directed primarily to on-line applications such as the Internet and Intranet. Given a number of large itemsets as input, simple redundancies are removed by generating all maximal ancestors, the frontier set, for each large itemset. The set of maximal ancestors share a hierarchical relationship with the large itemset from which they were derived and further satisfy an inequality whereby the ratio of respective support values is less than the reciprocal of some user defined confidence value. The resulting compact rule set is displayed to an end user at some specified level of support and confidence. The method is also able to generate the full set of rules from the compact set.
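The support-ratio inequality can be illustrated with a brute-force sketch. It rests on an interpretation that the abstract does not spell out: an "ancestor" of a large itemset X is taken to be a proper non-empty subset Y, Y qualifies when support(Y)/support(X) is at most 1/c (the rule Y => X - Y reaches confidence c), and Y is "maximal" when no proper subset of Y also qualifies. The function name `frontier` and the support table are hypothetical.

```python
from itertools import combinations


def frontier(itemset, support, confidence):
    """Qualifying ancestors of `itemset` that have no qualifying proper subset.

    `support` maps frozensets to support values in (0, 1].
    Brute force over all subsets; intended only for small illustrations.
    """
    x = frozenset(itemset)
    items = sorted(x)
    qualifying = []
    for r in range(1, len(items)):
        for combo in combinations(items, r):
            y = frozenset(combo)
            # support[y] / support[x] <= 1 / confidence, written without division
            if y in support and support[x] >= confidence * support[y]:
                qualifying.append(y)
    return [y for y in qualifying if not any(z < y for z in qualifying)]


support = {
    frozenset("a"): 0.5, frozenset("b"): 0.4, frozenset("c"): 0.8,
    frozenset("ab"): 0.3, frozenset("ac"): 0.4, frozenset("bc"): 0.35,
    frozenset("abc"): 0.3,
}
result = frontier("abc", support, confidence=0.7)
print([sorted(y) for y in result])
```

Under this reading, any rule whose antecedent contains a frontier member has at least the same confidence, so it can be regenerated from the compact set rather than stored.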
Abstract:
A graph taxonomy of information which is represented by a plurality of vectors is generated. The graph taxonomy includes a plurality of nodes and a plurality of edges. The plurality of nodes is generated, and each node of the plurality of nodes is associated with ones of the plurality of vectors. A tree hierarchy is established based on the plurality of nodes. A plurality of distances between ones of the plurality of nodes is calculated. Ones of the plurality of nodes are connected with other ones of the plurality of nodes by ones of the plurality of edges based on the plurality of distances. The information represented by the plurality of vectors may be, for example, a plurality of documents such as Web Pages.
Abstract:
A computerized method of online mining of inference rules in a large database. The method comprises two stages, a preprocessing stage followed by an online rule generation stage. The preprocessing stage is further defined to be a two-step process that involves the generation of large itemsets. The present method defines large itemsets by how the items in the itemsets relate to each other rather than their level of presence. The measure by which itemsets are said to relate to each other is defined by a computed figure of merit, K1. The first substep of the preprocessing stage involves finding those itemsets that possess a minimum computed collective strength of K1. From those found itemsets, a second user-supplied input, K2, is used to prune those itemsets with inference strength below K2.
Abstract:
A computer method of online mining of quantitative association rules consisting of two stages, a preprocessing stage followed by an online rule generation stage. The required computational effort is reduced by the preprocessing stage, which organizes the relationship between antecedent attributes to create a hierarchically arranged multidimensional indexing structure. The resulting structure facilitates the performance of the second stage, online processing, which involves the generation of quantitative association rules. The second stage, online rule generation, utilizes the multidimensional index structure created by the preprocessing stage by first finding the areas in the data which correspond to the rules and then using a merging step to create a merged tree that carefully combines interesting regions to give a hierarchical representation of the rule set. The merged tree is then used to generate the rules.
Abstract:
An apparatus and a method for constructing a multidimensional index tree which minimizes the time to access data objects and is resilient to the skewness of the data. This is achieved through successive partitioning of all given data objects by considering one level at a time starting with one partition and using a top-down approach until each final partition can fit within a leaf node. The data objects are subdivided via a global optimization approach to minimize the area overlap and perimeter of the minimum bounding rectangles covered by each node. The current invention divides the index construction problem into two subproblems: the first one addresses the tightness of the packing (in terms of area, overlap and perimeter) using a small fan out at each index node and the other one handles the fan out issue to improve index page utilization. These two stages are referred to as binarization and compression. The binarization stage constructs a binary tree such that the entries in the leaf nodes correspond to the spatial data objects. The compression stage converts the binary tree into a tree for which all but the leaf nodes and the parent nodes of all leaf nodes have branch factors of M. In the binarization stage, a weighting or skew factor is used to achieve flexibility in determining the number of data objects to be included in each of the partitions to obtain a tree structure with desirable query performance. Thus the index tree constructed is not required to be height balanced. This provides a means to trade off imbalance in the index tree in order to reduce the number of pages which need to be accessed in a query.
Abstract:
Systems and methods for parallel stream item counting are disclosed. A data stream is partitioned into portions and the portions are assigned to a plurality of processing cores. A sequential kernel is executed at each processing core to compute a local count for items in an assigned portion of the data stream for that processing core. The counts are aggregated for all the processing cores to determine a final count for the items in the data stream. A frequency-aware counting method (FCM) for data streams includes dynamically capturing relative frequency phases of items from a data stream and placing the items in a sketch structure using a plurality of hash functions where a number of hash functions is based on the frequency phase of the item. A zero-frequency table is provided to reduce errors due to absent items.
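The partition, local-count and aggregate pipeline can be sketched as follows. Note the substitutions: the sequential kernel here is a plain count-min sketch with a fixed number of hash functions, not the frequency-aware FCM of the abstract, there is no zero-frequency table, and a thread pool stands in for processing cores. All names and the `DEPTH`/`WIDTH` values are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

DEPTH, WIDTH = 4, 256  # hash rows and counters per row (assumed sizes)


def _bucket(item, row):
    """Deterministic hash of `item` for a given row of the sketch."""
    digest = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % WIDTH


def local_sketch(portion):
    """Sequential kernel: build a count-min sketch for one stream portion."""
    table = [[0] * WIDTH for _ in range(DEPTH)]
    for item in portion:
        for row in range(DEPTH):
            table[row][_bucket(item, row)] += 1
    return table


def merge(sketches):
    """Aggregate local sketches by element-wise addition."""
    total = [[0] * WIDTH for _ in range(DEPTH)]
    for table in sketches:
        for r in range(DEPTH):
            for c in range(WIDTH):
                total[r][c] += table[r][c]
    return total


def estimate(table, item):
    """Count-min estimate: never below the true count, possibly above it."""
    return min(table[r][_bucket(item, r)] for r in range(DEPTH))


def parallel_count(stream, workers=4):
    """Partition a list-like stream, sketch each portion, merge the results."""
    portions = [stream[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return merge(pool.map(local_sketch, portions))


stream = ["a"] * 50 + ["b"] * 20 + ["c"] * 5
table = parallel_count(stream)
print(estimate(table, "a"), estimate(table, "b"), estimate(table, "c"))
```

Merging by addition works because every portion is hashed with the same functions, so the merged table equals the sketch that a single pass over the whole stream would have produced.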
Abstract:
A method for dynamically placing objects in slots on a web page in response to a current client request for the web page comprises the steps of classifying users into user groups based on one or more user characteristics, accumulating self-learning data based on user click behavior for each user group, matching the current client request with a corresponding user group, and scheduling real-time selection of the slots for the objects on the web page based on the self-learning data of the corresponding user group.
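The steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the user group is already known for each request, the self-learning data is a per-group click-through rate for each (object, slot) pair, and placement is a greedy match on that rate. The class name `SlotScheduler` and the greedy rule are hypothetical; the abstract does not specify the scheduling algorithm.

```python
from collections import defaultdict


class SlotScheduler:
    """Accumulates click data per user group and places objects in slots."""

    def __init__(self):
        self.shows = defaultdict(int)   # (group, object, slot) -> impressions
        self.clicks = defaultdict(int)  # (group, object, slot) -> clicks

    def record(self, group, obj, slot, clicked):
        """Accumulate self-learning data from one impression."""
        self.shows[(group, obj, slot)] += 1
        if clicked:
            self.clicks[(group, obj, slot)] += 1

    def ctr(self, group, obj, slot):
        """Observed click-through rate, or 0.0 if never shown."""
        shows = self.shows.get((group, obj, slot), 0)
        return self.clicks.get((group, obj, slot), 0) / shows if shows else 0.0

    def place(self, group, objects, slots):
        """Greedy assignment: best (object, slot) pairs first, no reuse."""
        pairs = sorted(
            ((self.ctr(group, o, s), o, s) for o in objects for s in slots),
            reverse=True,
        )
        placement, used = {}, set()
        for _, obj, slot in pairs:
            if slot not in placement and obj not in used:
                placement[slot] = obj
                used.add(obj)
        return placement


scheduler = SlotScheduler()
history = [("ad1", "top", 5), ("ad2", "top", 1), ("ad1", "side", 1), ("ad2", "side", 2)]
for obj, slot, clicks in history:
    for i in range(10):  # ten impressions per pair, the first `clicks` are clicked
        scheduler.record("sports", obj, slot, clicked=i < clicks)

print(scheduler.place("sports", ["ad1", "ad2"], ["top", "side"]))
```

A group with no recorded history yields a rate of 0.0 for every pair, so a real system would need an exploration rule for new groups and objects; that is outside this sketch.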