Abstract:
A distributed system with transaction support may have a transaction component and one or more data components. The transaction component may manage a transaction using a log sequence number for each operation, and then transmit operations to one or more data components with log sequence numbers. The data components may perform the data operations in an idempotent manner and return a reply. The transaction component may then write the operation, its log sequence number, and information from the reply message to its log. The transaction component is able to commit a transaction, as well as retry or undo portions of a transaction, by using the information stored on its log. This may be possible even when a single transaction uses multiple data components, which may be located on different devices or manage separate and independent data sources.
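The flow described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class names, the `apply`, `write`, `retry`, and `undo` methods, the reply layout, and the use of `None` to mean "delete" are all assumptions made for this example.

```python
class DataComponent:
    """Hypothetical data component that applies each LSN at most once."""

    def __init__(self):
        self.data = {}
        self.applied = {}  # lsn -> reply already produced for that operation

    def apply(self, lsn, key, value):
        # Idempotence: a repeated LSN returns the earlier reply unchanged.
        if lsn in self.applied:
            return self.applied[lsn]
        reply = {"lsn": lsn, "key": key, "before": self.data.get(key)}
        if value is None:          # assumption: None means "delete the key"
            self.data.pop(key, None)
        else:
            self.data[key] = value
        self.applied[lsn] = reply
        return reply


class TransactionComponent:
    """Hypothetical transaction component that owns the log and the LSNs."""

    def __init__(self, components):
        self.components = components   # name -> DataComponent
        self.next_lsn = 1
        self.log = []                  # (lsn, dc_name, key, value, reply)

    def write(self, dc_name, key, value):
        lsn = self.next_lsn
        self.next_lsn += 1
        reply = self.components[dc_name].apply(lsn, key, value)
        # Log the operation, its LSN, and information from the reply.
        self.log.append((lsn, dc_name, key, value, reply))
        return lsn

    def retry(self, lsn):
        # Resend a logged operation; the data component deduplicates by LSN.
        for entry_lsn, dc_name, key, value, _ in self.log:
            if entry_lsn == lsn:
                return self.components[dc_name].apply(lsn, key, value)
        raise KeyError(lsn)

    def undo(self):
        # Walk a copy of the log backwards, restoring each logged "before" value.
        for _, dc_name, key, _, reply in reversed(list(self.log)):
            self.write(dc_name, key, reply["before"])
```

Because the "before" value comes back in the reply and is logged by the transaction component, undo in this sketch needs no log on the data component side, even when the transaction spans two data components.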
Abstract:
This patent application relates to enhanced logical recovery techniques for redo recovery operations of a system with an unbundled storage engine. These techniques can be implemented by utilizing an enhanced logical recovery approach in which a dirty page table (DPT) is constructed based on information logged during normal execution. The unbundled storage engine can include a transaction component (TC) that is architecturally independent of a data component (DC). These techniques can enhance redo recovery operations by reducing the resources needed to determine whether previously executed operations sent from the TC to the DC are to be repeated in response to a recovery-initiating event. This can include using the DPT to avoid fetching every data page corresponding to every previously executed operation received by the DC during recovery and/or pre-fetching data pages and/or index pages that correspond to page identifiers (PIDs) in the DPT.
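As a rough illustration of how a DPT can prune page fetches during redo, consider the sketch below. It assumes a simplified setting that the abstract does not specify: each log record already carries a PID, the DPT maps a PID to the LSN of the first operation that dirtied the page, and `fetch_page` is a hypothetical callback that reads a page from stable storage.

```python
def redo(log, dpt, fetch_page):
    """Replay log records, fetching a page only when the DPT says it may be stale.

    log        : list of (lsn, pid, key, value) records in LSN order
    dpt        : dict mapping pid -> LSN of the first update that dirtied the page
    fetch_page : hypothetical callable returning {"page_lsn": int, "rows": dict}
    Returns the list of PIDs that were actually fetched.
    """
    fetched = []
    for lsn, pid, key, value in log:
        first_dirty_lsn = dpt.get(pid)
        # Not in the DPT, or older than the first dirtying update:
        # the effect is already on stable storage, so skip without any I/O.
        if first_dirty_lsn is None or lsn < first_dirty_lsn:
            continue
        page = fetch_page(pid)
        fetched.append(pid)
        # The page LSN test decides whether the operation must be repeated.
        if page["page_lsn"] < lsn:
            page["rows"][key] = value
            page["page_lsn"] = lsn
    return fetched
```

In this simplified model, the set of PIDs in the DPT is also the natural candidate list for pre-fetching before the redo pass starts.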
Abstract:
A method and system for increasing server cluster availability by requiring at a minimum only one node and a quorum replica set of replica members to form and operate a cluster. Replica members maintain cluster operational data. A cluster operates when one node possesses a majority of replica members, which ensures that any new or surviving cluster includes consistent cluster operational data via at least one replica member from the immediately prior cluster. Arbitration provides exclusive ownership by one node of the replica members, including at cluster formation, and when the owning node fails. Arbitration uses a fast mutual exclusion algorithm and a reservation mechanism to challenge for and defend the exclusive reservation of each member. A quorum replica set algorithm brings members online and offline with data consistency, including updating unreconciled replica members, and ensures consistent read and update operations.
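The majority rule can be illustrated with a short sketch. The function names and the `(sequence_number, data)` representation of a replica member are assumptions for this example; the arbitration and reservation mechanisms are not modeled.

```python
def has_quorum(owned_members, replica_set_size):
    """True when one node exclusively owns a strict majority of replica members."""
    return len(owned_members) > replica_set_size // 2


def reconcile(owned_replicas):
    """Bring every owned replica up to the most recent operational data.

    owned_replicas: dict mapping member name -> (sequence_number, data).
    Any two majorities share at least one member, so the newest copy among
    the owned members reflects the immediately prior cluster's state.
    """
    newest = max(owned_replicas.values(), key=lambda replica: replica[0])
    return {name: newest for name in owned_replicas}
```

With a replica set of three members, a node holding two of them may form the cluster, while a node holding one may not; with four members, two is not enough.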
Abstract:
The present invention leverages a unilaterally-based interaction contract that enables applications to have a persistent state and ensures exactly-once execution despite failures between and by communicating entities, permitting disparate software applications to be robustly supported in an environment where little is known about the implementation of the interaction contract. In one instance of the present invention, a web services interaction contract provides a communicating application with duplicate commit request elimination, persistent state transitions, and/or unique persistent reply requests. The present invention permits this interaction contract to be supported by, for example, a persistent application, a workflow, a transaction queue, a database, and a file system, to facilitate idempotent execution of requests from a communicating application.
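Duplicate request elimination with a persistent reply can be sketched as below. This is only an illustration under stated assumptions: the `IdempotentService` class, the request-ID scheme, and the `call_with_retry` helper that simulates lost replies are hypothetical, and an in-memory dictionary stands in for persistent storage.

```python
class IdempotentService:
    """Hypothetical service that executes each request ID exactly once."""

    def __init__(self):
        self.balance = 0
        self.replies = {}  # request_id -> reply; stands in for persistent storage

    def handle(self, request_id, amount):
        # A duplicate request gets the stored reply and no second state change.
        if request_id in self.replies:
            return self.replies[request_id]
        self.balance += amount
        reply = {"request_id": request_id, "balance": self.balance}
        self.replies[request_id] = reply
        return reply


def call_with_retry(service, request_id, amount, lost_replies):
    """Resend the same request until a reply gets through.

    lost_replies simulates how many replies are dropped before one arrives.
    """
    attempts = 0
    while True:
        attempts += 1
        reply = service.handle(request_id, amount)
        if attempts > lost_replies:
            return reply, attempts
```

The caller needs to know nothing about how the service is implemented: resending under the same request ID is safe because the service side alone guarantees that repeated deliveries have no further effect.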
Abstract:
Persistent components are provided across both process and server failures, without the application programmer needing to take actions for component recoverability. Application interactions with a stateful component are transparently intercepted and stably logged to persistent storage. A “virtual” component isolates an application from component failures, permitting the mapping of a component to an arbitrary “physical” component. Component failures are detected and masked from the application. A virtual component is re-mapped to a new physical component, and the operations required to recreate a component and reinstall state up to the point of the last logged interaction are replayed from the log automatically.
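A minimal sketch of the virtual-to-physical mapping follows. It assumes deterministic component methods, uses an in-memory list in place of a stable log, and simulates failure with a hypothetical `crash` method; `Counter` and `VirtualComponent` are illustrative names only.

```python
class Counter:
    """Example stateful "physical" component."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n
        return self.value


class VirtualComponent:
    """Hypothetical proxy that logs interactions and masks component failures."""

    def __init__(self, factory):
        self._factory = factory
        self._log = []               # stands in for a stable log
        self._physical = factory()

    def crash(self):
        # Simulate loss of the physical component and all of its state.
        self._physical = None

    def call(self, method, *args):
        if self._physical is None:
            # Failure detected: map to a new physical component and
            # reinstall state by replaying every logged interaction.
            self._physical = self._factory()
            for logged_method, logged_args in self._log:
                getattr(self._physical, logged_method)(*logged_args)
        self._log.append((method, args))
        return getattr(self._physical, method)(*args)
```

The application only ever holds the `VirtualComponent`, so a call made after a crash returns the same result it would have returned had no failure occurred.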
Abstract:
Recovery processing of logless components is disclosed. Logless components in middle-tier systems can be checkpointed to provide faster recovery. In particular, a client system, executing a persistent component and itself logging, initiates a snapshot method that returns to the client the values of all variables and other state of the logless component during normal execution. The client writes this data to the client log along with information about the initiation call. To recover the logless component, the client invokes a restore method which takes as an argument values returned from the snapshot method and included in the checkpointing portion of the client log relating to the logless component. This information is sufficient for recreating the logless component which is logically identical to the failed logless component and for setting its state to the checkpoint state. This can occur transparently and shorten the recovery time in providing exactly-once execution.
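The snapshot and restore interplay might look like the sketch below. The `LoglessCart` component, the log record layout, and the `checkpoint` and `recover` method names are assumptions for illustration; a Python list stands in for the client's stable log.

```python
class LoglessCart:
    """Hypothetical logless middle-tier component."""

    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)
        return len(self.items)

    def snapshot(self):
        # Return all state needed to recreate this component.
        return {"items": list(self.items)}

    @classmethod
    def restore(cls, state):
        cart = cls()
        cart.items = list(state["items"])
        return cart


class LoggingClient:
    """Hypothetical persistent client that logs on behalf of the logless component."""

    def __init__(self):
        self.log = []                     # stands in for the client's stable log
        self.component = LoglessCart()

    def add(self, item):
        self.log.append(("call", "add", item))
        return self.component.add(item)

    def checkpoint(self):
        self.log.append(("checkpoint", self.component.snapshot()))

    def recover(self):
        """Rebuild the component; return how many calls had to be replayed."""
        component, start = LoglessCart(), 0
        for index, record in enumerate(self.log):
            if record[0] == "checkpoint":   # keep the most recent checkpoint
                component, start = LoglessCart.restore(record[1]), index + 1
        replayed = 0
        for record in self.log[start:]:
            if record[0] == "call":
                getattr(component, record[1])(record[2])
                replayed += 1
        self.component = component
        return replayed
```

Without the checkpoint record, recovery in this sketch would replay every logged call from the start; with it, only the calls made after the checkpoint are replayed.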
Abstract:
A data structure, added to a modified form of the B-link tree data structure, tracks delete states for nodes. The index delete state (DX) indicates whether it is safe to directly access an index node without re-traversing the B-tree. The DX state is maintained globally, outside of the tree structure. The data delete state (DD) indicates whether it is safe to post an index term for a new leaf node. A DD state is maintained in each level 1 node for its leaf nodes. Delete states indicate whether a specific node has not been deleted, or whether it may have been deleted. Delete states are used to remove the necessity for atomic node splits and chains of latches for deletes, while not requiring retraversal. This property of not requiring a retraversal is exploited to simplify the tree modification operations.
Abstract:
A system, method and computer-readable medium for optimizing recovery logging is provided. A calling component stably logs a message from a called component only when sending a second message or sending a second message after a log force that writes the return message from the first message to the stable log. The called component stably logs its return message before the return message is sent.