Abstract:
A method of estimating the query size of two databases T and R is disclosed. The method uses a threshold value to categorize the databases as dense or sparse. A dense-dense procedure is then applied to the two databases to produce a dense-dense estimate (A_d). A sparse-any procedure that suppresses the dense data items from database T is then performed, producing a first sparse-any estimate (A_s1). A second sparse-any estimate (A_s2) is then produced by suppressing the dense data items from database R. Finally, a query size estimate is produced by combining the dense-dense estimate, the first sparse-any estimate and the second sparse-any estimate.
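The decomposition described above can be illustrated with a minimal sketch. It assumes the query is an equi-join of T and R on one attribute, and that a value counts as "dense" in a database when its frequency there is at least the threshold. The abstract does not say how each estimate is computed, so this sketch uses exact frequency counts rather than an estimation procedure. It also assumes that the second sparse-any part covers only values that are dense in T and sparse in R, so that no value is counted twice. The function name and parameters are hypothetical.

```python
from collections import Counter


def join_size_decomposition(t_values, r_values, threshold):
    """Split the exact equi-join size of T and R into (A_d, A_s1, A_s2).

    Assumption: a value is dense in a database when its frequency there
    is >= threshold, and sparse otherwise.
    """
    ft, fr = Counter(t_values), Counter(r_values)
    a_d = a_s1 = a_s2 = 0
    # Only values present in both databases contribute to the join.
    for v in ft.keys() & fr.keys():
        contribution = ft[v] * fr[v]
        t_dense = ft[v] >= threshold
        r_dense = fr[v] >= threshold
        if t_dense and r_dense:
            a_d += contribution      # dense in both databases
        elif not t_dense:
            a_s1 += contribution     # dense items of T suppressed: sparse in T, any in R
        else:
            a_s2 += contribution     # assumed: dense in T, sparse in R (avoids double counting)
    return a_d, a_s1, a_s2


T = [1, 1, 1, 2, 3, 3]
R = [1, 1, 2, 2, 2, 3]
a_d, a_s1, a_s2 = join_size_decomposition(T, R, threshold=2)
query_size = a_d + a_s1 + a_s2  # the combined value equals the exact join size here
```

With exact counts the three parts sum to the true join size; a sampling-based version would replace each part with an estimate.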
Abstract:
A system for, and method of, generating an alias source address for an electronic mail (“e-mail”) message having a real source address and a destination address, and a computer network, such as the Internet, including the system or the method. In one embodiment, the system includes an alias source address generator that employs the destination address to generate the alias source address. The system further includes an alias source address substitutor that substitutes the alias source address for the real source address. This removes the real source address from the e-mail message and thereby renders the sender, located at the real source address, anonymous. Also described are systems and methods for forwarding reply e-mail and filtering reply e-mail based on the alias source address.
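A minimal sketch of the generator and substitutor follows. The abstract states only that the destination address is used to generate the alias; the keyed hash (HMAC-SHA256), the inclusion of the real source address in the hash input, the secret key, and the `alias.example` domain are all assumptions made for illustration. The function names are hypothetical.

```python
import hashlib
import hmac


def generate_alias(real_source, destination, secret_key, alias_domain="alias.example"):
    """Derive a per-destination alias address (assumed scheme: keyed hash)."""
    material = f"{real_source}|{destination}".encode("utf-8")
    digest = hmac.new(secret_key, material, hashlib.sha256).hexdigest()
    # Keep a short prefix of the digest as the local part of the alias.
    return f"{digest[:16]}@{alias_domain}"


def substitute_source(message, secret_key):
    """Return a copy of the message with the real source replaced by its alias.

    `message` is assumed to be a dict with 'from', 'to' and 'body' keys.
    """
    anonymized = dict(message)
    anonymized["from"] = generate_alias(message["from"], message["to"], secret_key)
    return anonymized


key = b"demo-key"  # hypothetical secret held by the aliasing system
msg = {"from": "alice@example.org", "to": "bob@example.net", "body": "hello"}
sent = substitute_source(msg, key)
```

Because the alias depends on the destination, each correspondent sees a different alias for the same sender, which is one way reply mail could later be filtered by alias.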
Abstract:
A parallel processing method involves the steps of determining a sequential ordering of tasks for processing, assigning priorities to available tasks such that tasks earlier in the sequential ordering receive higher priority than later ones, selecting a number of tasks greater than a total number of available parallel processing elements from all available tasks having the highest priorities, partitioning the selected tasks into a number of groups equal to the available number of parallel processing elements, and executing the tasks in the groups in the parallel processing elements. The determining step establishes an ordering with a specific predetermined sequential schedule that is independent of the parallel execution, and the assigning step assigns priorities for parallel execution on the basis of that sequential schedule.
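The steps above can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: the sequential ordering is given as a position table, the number of selected tasks is a fixed multiple of the number of processing elements (the `oversubscribe` factor is hypothetical), groups are formed round-robin, and threads stand in for processing elements.

```python
from concurrent.futures import ThreadPoolExecutor


def schedule_step(available, seq_index, num_workers, oversubscribe=2):
    """Select and partition ready tasks for one parallel step.

    available:  iterable of tasks that are ready to run
    seq_index:  dict mapping each task to its position in the sequential ordering
    """
    # Earlier position in the sequential ordering means higher priority.
    by_priority = sorted(available, key=lambda t: seq_index[t])
    # Select more tasks than there are processing elements (when enough are ready).
    selected = by_priority[: num_workers * oversubscribe]
    # Partition into exactly num_workers groups (round-robin is an assumption).
    return [selected[i::num_workers] for i in range(num_workers)]


def run_groups(groups, task_fn):
    """Run each group on its own worker; tasks within a group run in order."""
    def run_group(group):
        return [task_fn(t) for t in group]

    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        return list(pool.map(run_group, groups))


order = {"a": 0, "b": 1, "c": 2, "d": 3, "e": 4}
groups = schedule_step(["d", "a", "c", "b", "e"], order, num_workers=2)
results = run_groups(groups, str.upper)
```

Here task "e" is left for a later step because only four tasks are selected for two processing elements.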
Abstract:
A method maintains information associated with items in a database of limited memory; this information is used to generate representations of the information, such as high-biased histograms. In a first embodiment of the inventive method, information associated with all items with sales above a threshold, together with approximate counts of the items, is maintained. Appropriate choice of a threshold limits the amount of information that must be maintained in order to generate accurate representations of the information with high probability. In a second embodiment of the inventive method, information used to generate a high-biased histogram is maintained within a fixed allotment of memory by dynamically adjusting a threshold, which is used to determine the probability with which information is retained in the database.
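One plausible reading of the second embodiment is sketched below. The abstract says only that a threshold is adjusted dynamically and that it determines a retention probability; the details here are assumptions: a new item is admitted with probability 1/tau, tracked items are counted exactly afterwards, and tau is doubled whenever the memory allotment is exceeded, with existing entries thinned by coin flips. The class and method names are hypothetical.

```python
import random


class ThresholdedCounts:
    """Approximate counts of frequent items within a fixed number of entries."""

    def __init__(self, max_entries, rng=None):
        self.max_entries = max_entries
        self.tau = 1.0                      # untracked items are admitted with probability 1/tau
        self.counts = {}
        self.rng = rng or random.Random()

    def insert(self, item):
        if item in self.counts:
            self.counts[item] += 1          # tracked items are counted exactly from here on
        elif self.rng.random() < 1.0 / self.tau:
            self.counts[item] = 1
        while len(self.counts) > self.max_entries:
            self._raise_threshold(self.tau * 2)   # doubling is an assumed policy

    def _raise_threshold(self, new_tau):
        for item in list(self.counts):
            # First flip keeps the entry with probability tau/new_tau;
            # each later flip keeps it with probability 1/new_tau.
            keep_probability = self.tau / new_tau
            while self.counts[item] > 0 and self.rng.random() >= keep_probability:
                self.counts[item] -= 1
                keep_probability = 1.0 / new_tau
            if self.counts[item] == 0:
                del self.counts[item]
        self.tau = new_tau

    def top(self, k):
        """Return the k tracked items with the largest retained counts."""
        return sorted(self.counts.items(), key=lambda kv: -kv[1])[:k]


summary = ThresholdedCounts(max_entries=10, rng=random.Random(0))
stream = []
for i in range(200):
    stream += ["hot", "hot", f"cold-{i}"]
for item in stream:
    summary.insert(item)
```

Retained counts never exceed true counts, so the reported values are lower bounds; frequent items are far more likely to survive threshold increases than rare ones.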
Abstract:
A method for scheduling access of data blocks located in a computer system having a plurality of disk drives, each disk drive having a disk cache with a specified fence parameter value and being coupled to a host computer via a common bus. The method, according to one embodiment, comprises the steps of: (a) sequentially accessing each of the disk drives for a predetermined number of iterations to retrieve a predetermined number of data blocks; (b) for a specified number of the iterations, transferring data located in the disk cache to the common bus and requesting that data corresponding to the following iteration be transferred to the disk cache; and (c) repeating steps (a) and (b) until the predetermined iterations are completed.
Abstract:
A system and method for scheduling and controlling delivery of advertising in a communications network, and a communications network and remote computer program employing the system or the method. The system includes: (1) a time allocation controller that allocates time available in a particular advertising region in a display device of a remote computer between at least two advertisements as a function of one of a desired user frequency, a desired time frequency, or a desired geometry, for each of the at least two advertisements and (2) a data communication controller, coupled to the time allocation controller, that delivers the at least two advertisements to the remote computer for display in the advertising region according to the time allocation.
Abstract:
Techniques for maintaining an approximate histogram of a relation in a database, in the presence of updates to the relation. The histogram includes a number of subsets, or "buckets," each representing at least one possible value of an attribute of the relation. Each of the subsets has a count associated therewith indicative of the frequency of occurrence of the corresponding value of the attribute. After an update to the relation, the counts associated with the subsets are compared to a threshold. If the count associated with a given subset exceeds the threshold, the given subset is separated at its median into two separate subsets. After the separation operation, the two subsets with the lowest counts are combined such that a constant number of subsets are maintained in the histogram, if the total combined count of the subsets does not exceed the threshold. If no two subsets have a total combined count which does not exceed the threshold, the histogram is recomputed from a random sample of the relation. The invention substantially reduces the number of times the histogram must be recomputed from the random sample, and is particularly well-suited for use with approximate equi-depth and compressed histograms.
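The split-and-merge step described above can be sketched as follows. To stay self-contained, each bucket here stores its values so the median is exact; an actual approximate histogram would keep only boundaries and counts. The sketch also assumes that only adjacent buckets are merged, that the two halves just produced by a split are not merged back together, and that every bucket is non-empty with a threshold of at least 1. The class name and the `needs_recompute` flag are hypothetical.

```python
import bisect


class SplitMergeHistogram:
    """Fixed number of ordered buckets maintained by split and merge."""

    def __init__(self, buckets, threshold):
        self.buckets = [sorted(b) for b in buckets]  # ordered, non-overlapping ranges
        self.threshold = threshold
        self.needs_recompute = False  # set when no merge is possible

    def counts(self):
        return [len(b) for b in self.buckets]

    def insert(self, value):
        # Find the first bucket whose largest value is >= value (else the last one).
        i = 0
        while i < len(self.buckets) - 1 and self.buckets[i][-1] < value:
            i += 1
        bisect.insort(self.buckets[i], value)
        if len(self.buckets[i]) > self.threshold:
            self._split_and_merge(i)

    def _split_and_merge(self, i):
        b = self.buckets[i]
        mid = len(b) // 2  # split the overflowing bucket at its median
        split = self.buckets[:i] + [b[:mid], b[mid:]] + self.buckets[i + 1:]
        # Adjacent pairs by combined count, excluding the two fresh halves (i, i + 1).
        pairs = [(len(split[j]) + len(split[j + 1]), j)
                 for j in range(len(split) - 1) if j != i]
        if not pairs or min(pairs)[0] > self.threshold:
            # No pair fits under the threshold: signal recomputation from a sample.
            self.needs_recompute = True
            return
        _, j = min(pairs)
        split[j:j + 2] = [split[j] + split[j + 1]]  # merge keeps the bucket count constant
        self.buckets = split


h = SplitMergeHistogram([[1, 2], [10, 11, 12, 13], [20, 21]], threshold=4)
h.insert(12)  # overflows the middle bucket, triggering a split and a merge
```

After the insert the histogram still has three buckets, with the first two original ranges rebalanced.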
Abstract:
Techniques for maintaining a random sample of a relation in a database in the presence of updates to the relation. The random sample of the relation is referred to as a "backing sample," and it is maintained in the presence of insert, modify and delete operations involving the relation. When a new tuple is inserted into the relation, a sample of the given tuple is added to the backing sample if the size of the backing sample is below an upper bound. Otherwise, a randomly-selected tuple of the backing sample is replaced with the new tuple if a sample of the new tuple must be inserted into the backing sample to maintain randomness or another characteristic. When a tuple in the relation is the subject of a modify operation, the backing sample is left unchanged if the modify operation does not affect an attribute of interest to an application which uses the backing sample. Otherwise, a value field in a sample of the tuple in the backing sample is updated. When a tuple is deleted from the relation, any sample of that tuple in the backing sample is removed. A new backing sample may be computed if this removal causes the size of the backing sample to fall below a prespecified lower bound. The backing sample can be of a size which is negligible in comparison to the relation, and need only be modified very infrequently. As a result, its overhead in terms of computation time and storage space is minimal.
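The insert, modify and delete handling described above can be sketched as follows. The replacement rule (replace a random sampled tuple with probability upper/relation size, as in reservoir sampling) is an assumption; the abstract says only that replacement happens when needed to maintain randomness. Tuples are identified by a hypothetical `tuple_id`, and only one attribute value is kept per sampled tuple.

```python
import random


class BackingSample:
    """Bounded random sample of a relation, maintained under updates."""

    def __init__(self, lower, upper, rng=None):
        self.lower, self.upper = lower, upper
        self.sample = {}            # tuple_id -> value of the attribute of interest
        self.relation_size = 0
        self.rng = rng or random.Random()

    def insert(self, tuple_id, value):
        self.relation_size += 1
        if len(self.sample) < self.upper:
            self.sample[tuple_id] = value
        elif self.rng.random() < self.upper / self.relation_size:
            # Assumed reservoir-style rule: evict a random sampled tuple.
            victim = self.rng.choice(list(self.sample))
            del self.sample[victim]
            self.sample[tuple_id] = value

    def modify(self, tuple_id, new_value, affects_attribute=True):
        # The sample changes only if the attribute of interest changed
        # and the tuple is currently sampled.
        if affects_attribute and tuple_id in self.sample:
            self.sample[tuple_id] = new_value

    def delete(self, tuple_id):
        """Remove the tuple; return True if the sample should be recomputed."""
        self.relation_size -= 1
        self.sample.pop(tuple_id, None)
        return len(self.sample) < self.lower


bs = BackingSample(lower=2, upper=3, rng=random.Random(1))
for i in range(100):
    bs.insert(i, i * 10)
```

The caller is expected to rescan the relation and rebuild the sample when `delete` returns True, which should be rare if the lower bound is well below the upper bound.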
Abstract:
Parallel processing is performed by determining sequential ordering of tasks for processing, assigning priorities to the tasks available on the basis of the sequential ordering, selecting a number of tasks greater than a total number of available parallel processing elements from all available tasks having the highest priorities, partitioning the selected tasks into a number of groups equal to the available number of parallel processing elements, and executing the tasks in the parallel processing elements.
Abstract:
The present invention provides a prefetch system for use with a cache memory associated with a database employing indices. In one embodiment, the prefetch system includes a search subsystem configured to prefetch cache lines containing an index of a node of a tree structure associated with the database. The prefetch system also includes a scan subsystem configured to prefetch cache lines based on an index prefetch distance between first and second leaf nodes of the tree structure.