摘要:
An on-line reorganization method of an object-oriented database with physical references involves a novel fuzzy traversal of the database, or a partition thereof, to identify the approximate parents of all migrating objects. Where the entire database is traversed the process begins from its persistent root. For traversals of a partition the process begins from each object with a reference pointing to it from outside the partition. To facilitate the identification of these inter-partitional objects an External Reference Table (“ERT”) is maintained. During the fuzzy traversal all new inserted and deleted references are tracked in a Temporary Reference Table (“TRT”). After the fuzzy traversal is completed, for each migrating object, a lock is obtained on the identified approximate parents and on all new parents in which references to the object were inserted, as indicated by the TRT. Based on the information in the TRT, locks are released on all approximate parents whose references to the object have been deleted. The references to the migrating object in the remaining set of locked parents are updated, the object is relocated and the locks are released. Alternatively, each parent of a migrating object can be individually locked, updated and released.
摘要:
A method and apparatus are disclosed for providing enhanced pay per view in a video server. Specifically, the present invention periodically schedules a group of non pre-emptible tasks corresponding to videos in a video server having a predetermined number of processors, wherein each task begins at predetermined periods and has a set of sub-tasks separated by predetermined intervals. To schedule the group of tasks, the present invention divides the tasks into two groups according to whether they may be scheduled on a single processor. The present invention schedules each group separately. For the group of tasks not scheduleable on a single processor, the present invention determines a number of processors required to schedule such group and schedules such tasks to start at a predetermined time. For the group of tasks scheduleable on a single processor, the present invention determines whether such tasks are scheduleable on the available processors using an array of time slots. If the present invention determines that such group of tasks are not scheduleable on the available processors, then the present invention recursively partitions such group of tasks in subsets and re-performs the second determination of scheduleability. Recursive partitioning continues until the group of tasks is deemed scheduleable or no longer partitionable. In the latter case, the group of tasks is deemed not scheduleable.
摘要:
For use with an active database stored in volatile memory for direct revision thereof, the active database having multiple checkpoints and a stable log, having a tail stored in the volatile memory, for tracking revisions to the active database to allow corresponding revisions to be made to the multiple checkpoints, the active database subject to corruption, a system for, and method of, restoring the active database and a computer system containing the same. The system includes: (1) a checkpoint determination controller that determines which of the multiple checkpoints is a most recently completed checkpoint and copies the most recently completed checkpoint to the volatile memory to serve as an unrevised database for reconstructing the active database and (2) a revision application controller that retrieves selected ones of the revisions from the stable log and the tail and applies the revisions to the unrevised database thereby to restore the active database. In an advantageous embodiment, the applied revisions include log records at an operation level (lower level of abstration than transactions), and the revision application controller, using memory locks while restoring the active database, releases ones of the memory locks as a function of applying ones of the log records.
摘要:
For use with a central database associated with a server of a network, the central database having distributed counterparts stored in volatile memories of clients of the network to allow operations to be performed locally thereon, the central database further having multiple checkpoints and a stable log stored in the server for tracking operations on the central database to allow corresponding operations to be made to the multiple checkpoints, the stable log having tails stored in the volatile memories to track operations on corresponding ones of the distributed counterparts, the distributed counterparts to corruption, a system for, and method of, restoring a distributed counterpart stored in one of the volatile memories. The system includes: (1) a checkpoint determination controller that determines which of the multiple checkpoints is a most recently completed checkpoint and copies the most recently completed checkpoint to the one of the volatile memories to serve as an unrevised database for reconstructing the distributed counterpart and (2) an operation application controller that retrieves selected ones of the operations from the stable log and a tail corresponding to the distributed counterpart and applies the operations to the unrevised database thereby to restore the distributed counterpart.
摘要:
A method, a system and a computer program product for maximizing content spread in a social network are provided. Samples of edges are generated from an initial candidate set of edges. Each edge of the samples of edges has a probability value for content flow. Further, a subset of edges is determined from the samples of edges based on gain corresponding to each edge. Also, each node of the subset of edges is having at least one of less than ‘K’ or equal to ‘K’ incoming edges. Further, the probability of each edge, of the subset of edges, may be incremented. Furthermore, a final set of edges may be determined by ensuring ‘K’ incoming edges. The ‘K’ incoming edges may be ensured by removing one or more incoming edges when a number of the incoming edges for a node of the final set is greater than ‘K’ incoming edge.
摘要:
A method includes generating, electronically, one or more matching patterns for one or more pairs of attribute values. Each pair includes two attribute values. The two attribute values include a first attribute value from a first record and a second attribute value from a second record. The first attribute value and the second attribute value satisfy a first criterion. Further, the method includes identifying, electronically, matching segment between the first attribute value and the second attribute value of a first pair. The method also includes repeating identifying for each pair. Moreover, the method includes computing a similarity score for the first pair using one of the first pair and the matching segment based on the one or more matching patterns and matching segments of the one or more pairs satisfying a second criterion. The method also includes repeating computing for each pair.
摘要:
Methods and apparatus are provided for identifying constraint violation repairs in data that is comprised of a plurality of records, where each record has a plurality of cells. A database is processed, based on a plurality of constraints that data in the database must satisfy. At least one constraint violation to be resolved is identified based on a cost of repair and the corresponding records to be resolved and equivalent cells are identified in the data that violate the identified at least one constraint violation. A value for each of the equivalent cells can optionally be determined, and the determined value can be assigned to each of the equivalent cells. The at least one constraint violation selected for resolution may be, for example, the constraint violation with a lowest cost. The cost of repairing a constraint is based on a distance metric between the attributes values.
摘要:
An example of a method includes determining features of a first type for a web page of a plurality of web pages. The method also includes electronically determining a plurality of rules for an attribute of the first web page, wherein the plurality of rules are determined based on features of the first type. The method also includes electronically identifying a first rule, from the plurality of rules, which satisfies a first predefined criterion. The first predefined criteria include at least one of a first threshold for a precision parameter, a second threshold for a support parameter, a third threshold for a distance parameter and a fourth threshold for a recall parameter. The method further includes storing the first rule to enable extraction of value of the attribute from a second web page.
摘要:
Web pages are efficiently categorized in a data processor without analyzing the content of the web pages. According to at least one embodiment, data is maintained that represents sample URLs grouped into a plurality of clusters. The sample URLs of a cluster are used to produce a URL regular expression pattern (“URL-regex”) that differentiates the sample URLs of the cluster from the sample URLs of other clusters and that covers at least a specified percentage of the sample URLs in the cluster. The process of producing a URL-regex is repeated for each of the clusters producing a URL-regex for each cluster. Web pages are then categorized into one of the clusters by determining which of the URL-regex patterns produced for the clusters match URLs that refer to the web pages. Thus, a web page may be categorized based on a URL that refers to the web page without having to obtain and analyze the content of the web page.
摘要:
A system and method for determining optimal selection of paths for passively monitoring a communications network. A diagnostic set of paths is determined by ensuring that, for all pairs of links in the network, the set contains one path having only one member of that pair. A detection subset of paths is determined by ensuring that, for all the links in the network, one member of the subset contains that link. Selecting a minimum detection and diagnostic set of paths minimizes the communication overhead imposed by monitoring. During normal operation, only the detection subset need be monitored. Once an anomaly is detected, the system may switch to monitoring the full diagnostic set. The cost of deploying and operating the passive monitoring equipment is minimized by determining the minimum set of links on which a probe needs to be placed in order to monitor the diagnostic set of paths.