Abstract:
An improved approach is described for implementing transformations of data records in high concurrency environments. Each transformation is performed in parallel at the source when the data record is first generated. According to one approach for data integrity validation, record generators compute an integrity checksum for a newly generated record before copying it into a data unit in shared memory. Subsequent generators may incrementally aggregate the integrity checksums of data records into checksums for data units. This approach achieves end-to-end protection of data records against corruption using an efficient method of maintaining verifiable data integrity. In another approach, compression and encryption data transformations may be performed by themselves, or in combination with an integrity checksum transformation.
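The incremental aggregation described above can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not name a checksum function or an aggregation rule, so CRC-32 (`zlib.crc32`) and XOR are assumptions, and `DataUnit` and `record_checksum` are hypothetical names.

```python
import zlib
from dataclasses import dataclass, field


def record_checksum(payload: bytes) -> int:
    """Checksum computed by the generator before the record is copied (CRC-32 assumed)."""
    return zlib.crc32(payload)


@dataclass
class DataUnit:
    """Hypothetical shared data unit holding records and an aggregate checksum."""
    records: list = field(default_factory=list)  # (payload, record checksum) pairs
    checksum: int = 0                            # aggregate over all record checksums

    def append(self, payload: bytes, checksum: int) -> None:
        # Fold the record checksum into the unit checksum incrementally.
        # XOR is an assumed aggregation rule; it is order-independent, so
        # generators could fold in their checksums in any order.
        self.records.append((payload, checksum))
        self.checksum ^= checksum

    def verify(self) -> bool:
        # Recompute every record checksum and the aggregate from scratch.
        aggregate = 0
        for payload, stored in self.records:
            if zlib.crc32(payload) != stored:
                return False
            aggregate ^= stored
        return aggregate == self.checksum


unit = DataUnit()
for payload in (b"record-1", b"record-2"):
    unit.append(payload, record_checksum(payload))
```

In this sketch the per-record checksum travels with the record, so corruption introduced after generation is caught at verification time even though the unit checksum was never recomputed over the full unit.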
Abstract:
A method and system are provided for measuring, guaranteeing, and reducing replication data lag time between a primary system and one or more standby systems. Each standby system determines the lag time between the generation of a consistent version of data on the primary system and the time that the consistent version is applied on the standby system. Applications can request and be guaranteed to receive data from a standby system that is identical to the state on the primary system at the time of the query, or that lags the primary state by no more than a maximum tolerable amount. A standby system may also publish a service that guarantees a maximum lag time and withdraw the service offer when the actual lag time exceeds the guaranteed lag time. Implications for implementing synchronous and asynchronous replication, as well as performance optimizations, are also discussed.
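The lag measurement and the publish/withdraw behavior described above might look like the following sketch. The class and method names are hypothetical, and the code assumes that primary and standby timestamps are comparable (for example, synchronized clocks), which the abstract does not state.

```python
from dataclasses import dataclass


@dataclass
class StandbyLagTracker:
    """Hypothetical tracker kept by one standby system."""
    guaranteed_lag_s: float               # maximum lag the published service promises
    current_lag_s: float = float("inf")   # unknown until the first version is applied
    service_published: bool = False

    def on_apply(self, generated_at_s: float, applied_at_s: float) -> None:
        # Lag = time the consistent version was applied on the standby
        #       minus the time it was generated on the primary.
        self.current_lag_s = max(0.0, applied_at_s - generated_at_s)
        # Publish the service while the guarantee holds; withdraw it otherwise.
        self.service_published = self.current_lag_s <= self.guaranteed_lag_s

    def can_serve(self, max_tolerable_lag_s: float) -> bool:
        # A query is served only if the standby is within the caller's tolerance.
        # A tolerance of 0.0 requests data identical to the primary's state.
        return self.current_lag_s <= max_tolerable_lag_s


tracker = StandbyLagTracker(guaranteed_lag_s=2.0)
tracker.on_apply(generated_at_s=100.0, applied_at_s=101.0)  # lag of 1 second
```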
Abstract:
A method and system are provided for reducing delay to applications connected to a database server that guarantees no data loss during failure or disaster. After storing a log record persistently in a local primary log, the log writer returns control to the application, which continues running concurrently with the database server sending the session's log records to a standby database. A separate back channel is used by the standby to communicate, out-of-band to the primary, the location of the last log record stored persistently in the standby log. An application waiting for a transaction to commit may wait until the transaction's commit record has been persisted. Also described is a technique for reducing application delay when there is contention between nodes of a multi-node cluster for updating the same block. The technique provides for an asynchronous ping protocol that guarantees zero data loss during failure or disaster.
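One way to picture the back channel and the commit wait is the sketch below. It is an illustration under assumptions: log locations are modeled as increasing integers, the commit is assumed to wait on the standby's persisted location, and `StandbyAckTracker` with its methods is a hypothetical name, not part of any database API.

```python
import threading


class StandbyAckTracker:
    """Hypothetical primary-side record of what the standby has persisted."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._standby_persisted = 0  # highest log position the standby has reported

    def on_standby_ack(self, position: int) -> None:
        # Called when the back channel reports the last log record
        # stored persistently in the standby log.
        with self._cond:
            if position > self._standby_persisted:
                self._standby_persisted = position
                self._cond.notify_all()

    def wait_for_commit(self, commit_position: int, timeout: float = None) -> bool:
        # Blocks only the committing session; other work continues concurrently.
        # Returns True once the commit record is known to be persisted on the standby.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._standby_persisted >= commit_position, timeout
            )


acks = StandbyAckTracker()
```

The design point is that only a session that is actually committing blocks, and only until the acknowledged position passes its own commit record.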
Abstract:
Techniques are described for use in an automatic failover configuration having a primary database system, a standby database system, and an observer. In the automatic failover configuration, the primary database system remains available even in the absence of both the standby and the observer, as long as the standby and the observer become absent sequentially. The failover configuration may use asynchronous transfer modes to transfer redo to the standby, and it permits automatic failover only when the observer is present and the failover will not result in data loss, due to the asynchronous transfer mode, beyond a specified maximum. The database systems and the observer hold copies of the failover configuration state. The techniques include propagating the most recent version of the state among the databases and the observer, and using carefully ordered writes to ensure that state changes are propagated in a fashion that prevents divergence.
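The two conditions for automatic failover, and the idea of keeping the most recent version of the configuration state, can be sketched as follows. The field names are hypothetical, and modeling potential data loss as a lag in seconds is an assumption; the abstract only says the loss must not exceed a specified maximum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailoverState:
    """Hypothetical copy of the failover configuration state held by each party."""
    version: int            # increases with every state change
    observer_present: bool
    standby_lag_s: float    # assumed measure of potential data loss


def newest(a: FailoverState, b: FailoverState) -> FailoverState:
    # When two parties exchange state, the copy with the higher version wins.
    return a if a.version >= b.version else b


def may_auto_failover(state: FailoverState, max_lag_s: float) -> bool:
    # Automatic failover is permitted only if the observer is present
    # and the potential data loss stays within the specified maximum.
    return state.observer_present and state.standby_lag_s <= max_lag_s


stale = FailoverState(version=3, observer_present=True, standby_lag_s=1.0)
fresh = FailoverState(version=4, observer_present=False, standby_lag_s=1.0)
```

This sketch covers only the decision and the version comparison; it does not model the ordering of writes that the abstract relies on to prevent divergence.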
Abstract:
A method and apparatus for detecting split brain in a distributed system are provided. After determining that a rogue instance is no longer an active member of the cluster, a recovery instance detects activity associated with a redo log that is updated by the rogue instance to store log records that describe changes made by the rogue instance to data associated with the cluster.
Abstract:
A method and system for replicating database data are provided. One or more standby database replicas can be used for servicing read-only queries, and the amount of storage required is scalable in the size of the primary database storage. One technique is described for combining physical database replication to multiple physical databases residing within a common storage system that performs de-duplication. Having multiple physical databases allows for many read-only queries to be processed, and the de-duplicating storage system provides scalability in the size of the primary database storage. Another technique uses one or more diskless standby database systems that share a read-only copy of physical standby database files. Notification messages provide consistency between each diskless system's in-memory cache and the state of the shared database files. Use of a transaction sequence number ensures that each database system only accesses versions of data blocks that are consistent with a transaction checkpoint.
Abstract:
A computer is programmed to identify failures and perform recovery of data. Specifically, in several embodiments, the computer is programmed to automatically check integrity of data in a storage structure to identify a set of failures related to the storage structure. The computer is further programmed in some embodiments to identify, based on one failure in the set of failures, a group of repairs to fix that one failure. Each repair in the group of repairs is alternative to another repair in the group. The computer is also programmed in some embodiments to execute at least one repair in the group of repairs, so as to generate corrected data to fix the one failure. In certain embodiments, the corrected data is stored in non-volatile storage media of the computer.
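The notion of a group of alternative repairs for one failure can be sketched as below. This is an assumed control flow, not the patented procedure: the abstract says at least one repair in the group is executed, and trying the alternatives in order until one yields valid data is only one possible policy. All names are hypothetical.

```python
from typing import Callable, Iterable, Optional

# A repair takes the damaged data and returns corrected data, or None if it cannot help.
Repair = Callable[[bytes], Optional[bytes]]


def first_successful_repair(
    data: bytes,
    repairs: Iterable[Repair],
    is_valid: Callable[[bytes], bool],
) -> Optional[bytes]:
    # Each repair in the group is an alternative fix for the same failure,
    # so the first one that produces data passing the integrity check is used.
    for repair in repairs:
        candidate = repair(data)
        if candidate is not None and is_valid(candidate):
            return candidate
    return None


def restore_from_backup(data: bytes) -> Optional[bytes]:
    return None  # stand-in: pretend no backup is available


def rebuild_from_log(data: bytes) -> Optional[bytes]:
    return b"good"  # stand-in: pretend the log allows reconstruction


corrected = first_successful_repair(
    b"bad", [restore_from_backup, rebuild_from_log], lambda d: d == b"good"
)
```

The corrected data returned here would then be written to non-volatile storage, as the abstract describes for certain embodiments.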