Abstract:
The present invention discloses a document descriptor extraction method and system. The method and system create a document descriptor by generalizing input sequences within a document; factoring the input sequences and generalized input sequences; and selecting a document descriptor from the input sequences, generalized sequences, and factored sequences, preferably using minimum description length (MDL) principles. Novel algorithms are employed to perform the generalizing, factoring, and selecting.
Abstract:
Physical connectivity is determined between elements such as switches and routers in a multiple subnet communication network. Each element has one or more interfaces, each of which is physically linked with an interface of another network element. Address sets are generated for each interface of the network elements, wherein members of a given address set correspond to network elements that can be reached from the corresponding interface for which the given address set was generated. The members of first address sets generated for corresponding interfaces of a given network element are compared with the members of second address sets generated for corresponding interfaces of network elements other than the given element. A set of candidate connections between an interface of the given network element and one or more interfaces of other network elements is determined. If more than one candidate connection is determined, connections with network elements that are in the same subnet as the given network element are eliminated from the set.
Abstract:
A system for, and method of, ensuring serialization of lazy updates in a distributed database described by a directed acyclic copy graph. In one embodiment, the system includes: (1) a timestamp module that creates a unique timestamp for each of the lazy updates and (2) a propagation module, associated with the timestamp module, that employs edges of the directed acyclic copy graph to propagate the lazy updates among replicas in the distributed database according to said unique timestamp and ensure the serialization.
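The two roles named above, a timestamp module and a propagation module, can be illustrated with a minimal Python sketch. It is not the patented protocol: the `(counter, site_id)` timestamp scheme, the names `TimestampModule`, `propagate` and `apply_in_order`, and the per-replica inbox are all assumptions made for illustration, and the sketch does not by itself establish serializability.

```python
import heapq
from itertools import count


class TimestampModule:
    """Issues timestamps that are unique across sites (assumed scheme)."""

    def __init__(self, site_id):
        self.site_id = site_id
        self._counter = count(1)

    def next_timestamp(self):
        # (local counter, site id): the site id breaks ties between sites.
        return (next(self._counter), self.site_id)


def propagate(dag_edges, origin, timestamp, payload, inboxes):
    """Push one lazy update from `origin` to every replica reachable
    along the edges of the directed acyclic copy graph."""
    seen = {origin}
    stack = [origin]
    while stack:
        site = stack.pop()
        for child in dag_edges.get(site, []):
            if child not in seen:
                seen.add(child)
                heapq.heappush(inboxes[child], (timestamp, payload))
                stack.append(child)


def apply_in_order(inbox):
    """Drain a replica's inbox in ascending timestamp order."""
    return [heapq.heappop(inbox) for _ in range(len(inbox))]
```

In this sketch a replica that receives updates from several upstream sites applies them in one agreed order, because every replica sorts by the same unique timestamps.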
Abstract:
A system for, and method of, ensuring serialization of updates from a replica site in a distributed database that is described by a copy graph and a distributed database incorporating the system or the method. In one embodiment, the system includes: (1) a directed acyclic copy graph (DAG) creation module that identifies backedges in, and removes the backedges from, the copy graph to yield a DAG and (2) a propagation module, associated with the DAG creation module, that initially employs eager updating to propagate the updates along the backedges and thereafter employs lazy updating to propagate the updates along edges of the directed acyclic copy graph to ensure the serialization.
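The DAG creation step can be sketched as follows, assuming a depth-first search is used to identify backedges (the abstract does not say how they are found). The function name `split_backedges` and the adjacency-dictionary representation are hypothetical, and the eager and lazy propagation phases are not modeled.

```python
def split_backedges(copy_graph):
    """Return (dag, backedges) for a directed copy graph.

    `copy_graph` maps each site to the list of sites it sends updates to.
    An edge that points at a site still on the DFS stack closes a cycle,
    so it is treated as a backedge and left out of the DAG.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {site: WHITE for site in copy_graph}
    for targets in copy_graph.values():
        for target in targets:
            color.setdefault(target, WHITE)

    dag = {site: [] for site in color}
    backedges = []

    def visit(site):
        color[site] = GRAY
        for target in copy_graph.get(site, []):
            if color[target] == GRAY:
                backedges.append((site, target))
            else:
                dag[site].append(target)
                if color[target] == WHITE:
                    visit(target)
        color[site] = BLACK

    for site in list(color):
        if color[site] == WHITE:
            visit(site)
    return dag, backedges
```

For a three-site cycle `a -> b -> c -> a` visited from `a`, the sketch keeps `a -> b` and `b -> c` in the DAG and reports `c -> a` as the single backedge.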
Abstract:
For use with a database of data records stored in a memory, a system and method for increasing a memory capacity and a memory database employing the system or the method. The system includes: (1) a time stamping controller that assigns a time stamp to transactions to be performed on the database, the time stamp operating to preserve an order of the transactions, (2) a versioning controller that creates multiple versions of ones of the data records affected by the transactions that are update transactions, and (3) an aging controller, associated with each of the time stamping and versioning controllers, that monitors a measurable characteristic of the memory and deletes ones of the multiple versions of the ones of the data records in response to the time stamp and the measurable characteristic, thereby increasing memory capacity.
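The interplay of the three controllers can be sketched in Python under stated assumptions: the "measurable characteristic" is stood in for by a simple count of stored versions, and a version is considered deletable once a newer version is visible to the oldest active transaction. The class `VersionedStore`, its methods and the `version_limit` threshold are hypothetical names, not taken from the source.

```python
class VersionedStore:
    def __init__(self, version_limit):
        self.version_limit = version_limit  # assumed stand-in for a memory threshold
        self.clock = 0
        self.versions = {}  # key -> list of (time stamp, value), ascending

    def begin(self):
        """Time stamping: each transaction gets the next time stamp."""
        self.clock += 1
        return self.clock

    def write(self, ts, key, value):
        """Versioning: an update adds a version instead of overwriting.
        Assumes writes arrive in ascending time stamp order."""
        self.versions.setdefault(key, []).append((ts, value))

    def read(self, ts, key):
        """Return the newest version written at or before `ts`."""
        visible = [v for t, v in self.versions.get(key, []) if t <= ts]
        return visible[-1] if visible else None

    def total_versions(self):
        return sum(len(chain) for chain in self.versions.values())

    def age(self, oldest_active_ts):
        """Aging: when over the limit, drop versions that no transaction
        with a time stamp >= oldest_active_ts can still read."""
        if self.total_versions() <= self.version_limit:
            return 0
        removed = 0
        for key, chain in self.versions.items():
            keep_from = 0
            for i, (t, _) in enumerate(chain):
                if t <= oldest_active_ts:
                    keep_from = i
            removed += keep_from
            self.versions[key] = chain[keep_from:]
        return removed
```

Aging here never removes the version that the oldest active transaction reads, so readers keep a consistent view while older versions are reclaimed.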
Abstract:
A method of detecting and recovering from data corruption of a database is characterized by the step of logging information about reads of a database in memory to detect errors in data of the database, wherein said errors in data of said database arise from one of bad writes of data to the database, erroneous input of data to the database by users, and logical errors in code of a transaction. The read logging method may be implemented in a plurality of database recovery models including a cache-recovery model, a prior state model, a redo-transaction model and a delete transaction model. In the delete transaction model, it is assumed that logical information is not available to allow a redo of transactions after a possible error; the effects of transactions that read corrupted data are deleted from history, and any data written by a transaction reading corrupted data is treated as corrupted.
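How a read log lets corruption be traced in the delete transaction model can be sketched as below. The log format (a chronological list of `(transaction, operation, item)` tuples) and the function name `trace_corruption` are assumptions for illustration. The sketch also simplifies by never clearing an item once it is marked corrupted, even if a clean transaction later overwrites it.

```python
def trace_corruption(log, initially_corrupted):
    """Return (tainted transactions, corrupted items).

    A transaction is tainted if it read an item after that item became
    corrupted. Every item written by a tainted transaction is treated as
    corrupted from the position of that write onward.
    """
    never = len(log)  # sentinel: item is not known to be corrupted
    corrupt_at = {item: -1 for item in initially_corrupted}
    tainted = set()
    changed = True
    while changed:  # repeat until no new taint is discovered
        changed = False
        for i, (txn, op, item) in enumerate(log):
            if op == "read":
                if txn not in tainted and corrupt_at.get(item, never) < i:
                    tainted.add(txn)
                    changed = True
            elif op == "write":
                if txn in tainted and i < corrupt_at.get(item, never):
                    corrupt_at[item] = i
                    changed = True
    return tainted, set(corrupt_at)
```

Without the logged reads, only the writes would be known, and there would be no way to tell which transactions had consumed corrupted data.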
Abstract:
An on-line reorganization method of an object-oriented database with physical references involves a novel fuzzy traversal of the database, or a partition thereof, to identify the approximate parents of all migrating objects. Where the entire database is traversed, the process begins from its persistent root. For traversals of a partition, the process begins from each object with a reference pointing to it from outside the partition. To facilitate the identification of these inter-partitional objects, an External Reference Table (“ERT”) is maintained. During the fuzzy traversal, all newly inserted and deleted references are tracked in a Temporary Reference Table (“TRT”). After the fuzzy traversal is completed, for each migrating object, a lock is obtained on the identified approximate parents and on all new parents in which references to the object were inserted, as indicated by the TRT. Based on the information in the TRT, locks are released on all approximate parents whose references to the object have been deleted. The references to the migrating object in the remaining set of locked parents are updated, the object is relocated and the locks are released. Alternatively, each parent of a migrating object can be individually locked, updated and released.
Abstract:
A system for, and method of, ensuring serialization of lazy updates in a distributed database described by a directed acyclic copy graph. In one embodiment, the system includes: (1) a forest construction module that creates a forest having trees and edges from the directed acyclic copy graph and (2) a propagation module, associated with the forest construction module, that employs the edges of the forest to propagate the lazy updates among replicas in the distributed database and ensure the serialization.
Abstract:
A method of detecting and recovering from data corruption of a database is characterized by the step of protecting data of the database with codewords, one codeword for each region of the database; and verifying that a codeword matches associated data before the data is read from the database to prevent transaction-carried corruption. A deferred maintenance scheme is recommended for the codewords protecting the database such that the method of detecting and recovering from data corruption of a database may comprise the steps of protecting data of the database with codewords, one codeword for each region of the database; and asynchronously maintaining the codewords to improve concurrency of the database. Moreover, the database may be audited by using the codewords and noting them in a table and protecting regions of the database with latches. Once codeword values are computed and checked against noted values in memory, a flush can cause codewords from outstanding log records to be applied to the stored codeword table.
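The basic idea of one codeword per region, checked before every read, can be sketched as follows. The abstract does not say how a codeword is computed, so CRC-32 from Python's `zlib` is used here purely as a stand-in. The class and method names are hypothetical, and codewords are maintained synchronously, so the deferred maintenance and auditing schemes are not modeled.

```python
import zlib


class CorruptionError(Exception):
    pass


class ProtectedDatabase:
    def __init__(self, size, region_size):
        self.data = bytearray(size)
        self.region_size = region_size
        regions = -(-size // region_size)  # ceiling division
        self.codewords = [self._compute(r) for r in range(regions)]

    def _compute(self, region):
        start = region * self.region_size
        return zlib.crc32(bytes(self.data[start:start + self.region_size]))

    def _regions(self, offset, length):
        first = offset // self.region_size
        last = (offset + length - 1) // self.region_size
        return range(first, last + 1)

    def write(self, offset, payload):
        """A legitimate write updates the codeword of each region it touches.
        Assumes the write stays within the database bounds."""
        self.data[offset:offset + len(payload)] = payload
        for region in self._regions(offset, len(payload)):
            self.codewords[region] = self._compute(region)

    def read(self, offset, length):
        """Verify every touched region before returning any data."""
        for region in self._regions(offset, length):
            if self._compute(region) != self.codewords[region]:
                raise CorruptionError(f"codeword mismatch in region {region}")
        return bytes(self.data[offset:offset + length])
```

A stray write that bypasses `write` (for example `db.data[5] ^= 0xFF`) leaves the stored codeword stale, so the next `read` of that region raises `CorruptionError` instead of handing corrupted data to a transaction.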
Abstract:
For use with a database of data records organized into components, the database stored in a memory, a processing system for, and method of, physically versioning the database. In one embodiment, the processing system includes: (1) a component copier that creates a physical copy of an original component to be affected by an update transaction to be applied to the database, and that causes pointers in nodes of the physical copy to point to other nodes in the physical copy, (2) a data updater, associated with the component copier, that applies the update transaction to the physical copy to create therefrom a new physical version, the original component remaining unaffected by the update transaction and (3) a pointer updater, associated with the data updater, that employs an atomic word write to revise a component pointer, associated with the database, to cause the pointer to point to the new physical version.
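The copy, update and pointer-swap sequence can be sketched in Python under clear assumptions: a component is modeled as a small linked list of hypothetical `Node` objects, `copy.deepcopy` plays the role of the component copier, and rebinding a single attribute stands in for the atomic word write (Python offers no direct equivalent of a hardware word write).

```python
import copy


class Node:
    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next_node = next_node


class Database:
    def __init__(self, component):
        self.component = component  # the component pointer

    def update(self, key, value):
        # (1) Component copier: deepcopy makes the copied nodes point at
        # other copied nodes, never back into the original component.
        new_version = copy.deepcopy(self.component)
        # (2) Data updater: apply the change to the copy only.
        node = new_version
        while node is not None:
            if node.key == key:
                node.value = value
            node = node.next_node
        # (3) Pointer updater: one reference rebinding publishes the new
        # version (stand-in for the atomic word write).
        self.component = new_version


db = Database(Node("a", 1, Node("b", 2)))
snapshot = db.component          # a reader holding the old version
db.update("b", 20)
print(snapshot.next_node.value)      # old version is untouched: 2
print(db.component.next_node.value)  # new version carries the update: 20
```

Readers that obtained the component pointer before the swap keep traversing the original nodes, which is why the update needs no locks on the read path in this sketch.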