Distributed transaction processing using two-phase commit protocol with
presumed-commit without log force
    1.
    Granted patent (expired)

    Publication No.: US5335343A

    Publication date: 1994-08-02

    Application No.: US909556

    Application date: 1992-07-06

    Abstract: A two-phase commit protocol for a distributed transaction processing system uses the presumed-commit configuration, except that the coordinator needs to force-write only a "commit" log record for each committed transaction, rather than the two log records previously force-written. To let the coordinator answer inquiries from subordinate processes following a crash or loss of communications, the set of indeterminate transactions is circumscribed. Transactions are numbered in increasing order and identified by a transaction ID (T_ID). The commit protocol is not allowed to begin unless the transaction ID of the committing transaction is within a preselected range of numbers starting from the highest-numbered stably recorded transaction ID. That is, if the transaction number is too far removed from the highest TID of a stably stored log record (one written to disk storage and able to survive a crash), log records are written to disk until this condition holds. This may require writing a log record to disk for the committing transaction. Most committing transactions can, however, proceed without waiting for a disk write (forced log), so performance is improved. Because the set of indeterminate transactions (those for which it is not known whether they committed, aborted, or never started) is circumscribed, the information kept about them is small. It must be "permanently" retained, but the coordinator can store some of it in a cache (volatile memory) to answer inquiries.
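
The range check described in this abstract can be illustrated with a small sketch. It is not the patented implementation: the class name `Coordinator`, the `window` parameter, and the simulated forced write are all assumptions made for illustration. The sketch only shows that a forced log write is needed when the committing TID runs too far ahead of the highest stably recorded TID.

```python
class Coordinator:
    """Toy coordinator (hypothetical) that keeps committing TIDs near the stable log."""

    def __init__(self, window):
        self.window = window        # assumed size of the preselected TID range
        self.stable_high_tid = 0    # highest TID in a log record known to be on disk
        self.forced_writes = 0      # counts simulated forced log writes

    def _force_log(self, tid):
        # Stand-in for a synchronous disk write of a log record for `tid`.
        self.stable_high_tid = max(self.stable_high_tid, tid)
        self.forced_writes += 1

    def begin_commit(self, tid):
        # The commit protocol may start only if `tid` is within `window`
        # of the highest stably recorded TID; otherwise write to disk first.
        if tid - self.stable_high_tid > self.window:
            self._force_log(tid)
        return self.stable_high_tid

    def max_tid_possibly_committing(self):
        # No transaction with a larger TID can have begun the commit protocol.
        return self.stable_high_tid + self.window


coord = Coordinator(window=3)
for tid in range(1, 6):
    coord.begin_commit(tid)
```

In this run only TID 4 triggers a simulated forced write; the other four commits start without waiting for the disk, and after a crash the coordinator knows an upper bound on the TIDs that may have started committing.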


    Method and system for consistent cluster operational data in a server cluster using a quorum of replicas
    4.
    Granted patent (in force)

    Publication No.: US06938084B2

    Publication date: 2005-08-30

    Application No.: US09895810

    Application date: 2001-07-02

    Abstract: A method and system for increasing server cluster availability by requiring, at a minimum, only one node and a quorum replica set of replica members to form and operate a cluster. Replica members, independent of the nodes, maintain cluster operational data. A cluster operates when one node possesses a majority of replica members, which ensures that any new or surviving cluster includes consistent cluster operational data via at least one replica member from the immediately prior cluster. Arbitration provides exclusive ownership of the replica members by one node, including at cluster formation and when the owning node fails. Arbitration uses a fast mutual exclusion algorithm and a reservation mechanism to challenge for and defend the exclusive reservation of each member. A quorum replica set algorithm brings members online and offline with data consistency, including updating unreconciled replica members, and ensures consistent read and update operations.
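
The majority rule in this abstract rests on a simple counting argument: any two majorities of the same replica set share at least one member, so a new cluster always sees a replica from the prior one. The sketch below illustrates only that argument. The function name `has_quorum` and the member labels are hypothetical, and the arbitration and reservation mechanisms are not modeled.

```python
from itertools import combinations


def has_quorum(owned, all_members):
    """Return True if `owned` is a strict majority of `all_members`."""
    return len(set(owned) & set(all_members)) > len(all_members) // 2


# Hypothetical replica set with three members.
members = {"r1", "r2", "r3"}

# Every subset that forms a quorum, for checking the overlap property.
quorums = [
    set(c)
    for size in range(len(members) + 1)
    for c in combinations(sorted(members), size)
    if has_quorum(c, members)
]

# Any two quorums intersect, so consecutive clusters share a replica member.
all_overlap = all(q1 & q2 for q1 in quorums for q2 in quorums)
```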


    Database computer system with application recovery and dependency
handling write cache

    Publication No.: US6067550A

    Publication date: 2000-05-23

    Application No.: US832870

    Application date: 1997-04-04

    Applicant: David B. Lomet

    Inventor: David B. Lomet

    IPC classification: G06F17/00 G06F17/30

    Abstract: This invention concerns a database computer system and method for making applications recoverable from system crashes. The application state (i.e., address space) is treated as a single object which can be atomically flushed in a manner akin to flushing individual pages in database recovery techniques. To enable this monolithic treatment of the application, executions performed by the application are mapped to logical loggable operations which can be posted to the stable log. Any modifications to the application state are accumulated and the application state is periodically flushed to stable storage using an atomic procedure. The application recovery integrates with database recovery, and effectively eliminates or at least substantially reduces the need for checkpointing applications. In addition, optimization techniques are described to make the read, write, and recovery phases more efficient.

    Database computer system with application recovery
    6.
    Granted patent (expired)

    Publication No.: US5946698A

    Publication date: 1999-08-31

    Application No.: US814808

    Application date: 1997-03-10

    Applicant: David B. Lomet

    Inventor: David B. Lomet

    IPC classification: G06F17/30

    CPC classification: G06F11/1438 Y10S707/99953

    Abstract: This invention concerns a database computer system and method for making applications recoverable from system crashes. The application state (i.e., address space) is treated as a single object which can be atomically flushed in a manner akin to flushing individual pages in database recovery techniques. To enable this monolithic treatment of the application, executions performed by the application are mapped to logical loggable operations which can be posted to the stable log. Any modifications to the application state are accumulated and the application state is flushed from time to time to stable storage using an atomic procedure. Applications are recovered by replaying the logged state transition operations, in the same manner that most database systems replay state transformation operations to recover database pages. This application recovery integrates with database recovery, and effectively eliminates or at least substantially reduces the need for checkpointing applications. In addition, optimization techniques are described to make the read, write, and recovery phases more efficient.
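
The log, flush, and replay cycle described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the class `RecoverableApp`, its single "add" operation, and the in-memory stand-ins for the stable log and stable storage are all hypothetical.

```python
import copy


class RecoverableApp:
    """Toy application whose state is recovered by replaying logged operations."""

    def __init__(self):
        self.state = {"total": 0}                  # volatile application state
        self.stable_log = []                       # (lsn, amount) records, assumed durable
        self.stable_snapshot = (0, {"total": 0})   # (last LSN covered, flushed state)

    def execute(self, amount):
        # Log the logical operation first, then apply it to the volatile state.
        lsn = len(self.stable_log) + 1
        self.stable_log.append((lsn, amount))
        self.state["total"] += amount

    def flush(self):
        # Stand-in for atomically flushing the whole application state.
        self.stable_snapshot = (len(self.stable_log), copy.deepcopy(self.state))

    def crash(self):
        # Volatile state is lost; the log and the flushed snapshot survive.
        self.state = None

    def recover(self):
        # Start from the last flushed state and replay later log records.
        flushed_lsn, snapshot = self.stable_snapshot
        self.state = copy.deepcopy(snapshot)
        for lsn, amount in self.stable_log:
            if lsn > flushed_lsn:
                self.state["total"] += amount


app = RecoverableApp()
app.execute(5)
app.execute(7)
app.flush()
app.execute(3)   # logged but not yet covered by a flush
app.crash()
app.recover()
```

After `recover()`, the state again reflects all three operations: the first two come from the flushed snapshot and the third from replaying the log.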


    Order preserving data translation
    7.
    Granted patent (expired)

    Publication No.: US5585793A

    Publication date: 1996-12-17

    Application No.: US258144

    Application date: 1994-06-10

    IPC classification: G06T9/00 H03M7/30

    CPC classification: G06T9/005 H03M7/3088

    Abstract: In a computer system, input strings to be translated are composed of characters selected from a first alphabet. According to a predetermined criterion, a list of sub-strings is selected from the input strings to form entries in a dictionary. The entries of the dictionary are arranged according to a collating order of the first alphabet. An interval including the sub-strings of the input strings is partitioned into an all-inclusive and disjoint set of ranges. The sub-strings of the interval are arranged according to the collating order of the first alphabet, and each sub-string of a particular range has a common prefix, the common prefix being selected from the list of sub-strings. A unique encoding is assigned to each common prefix, the corresponding set of unique encodings being composed of characters selected from a second alphabet. The input strings are parsed, one at a time, into a plurality of tokens, each token corresponding to a sub-string selected from the dictionary. For each token, the corresponding one of the set of unique encodings is placed in an output string.
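
A toy version of this range-based translation is sketched below. Everything specific in it is an assumption for illustration: the three-letter alphabet, the hand-picked `RANGES` table, and the choice to use one integer code per range as the second alphabet. The point it shows is that when the ranges are disjoint, cover all inputs, and are numbered in collating order, comparing code sequences gives the same order as comparing the original strings.

```python
from bisect import bisect_right

# Hypothetical dictionary over the alphabet {a, b, c}.
# Each entry is (lower bound of the range, common prefix of strings in it);
# a range extends up to the next entry's lower bound.
RANGES = [
    ("a", "a"),    # strings starting with "a"
    ("b", "b"),    # exactly the string "b"
    ("ba", "ba"),  # strings starting with "ba"
    ("bb", "b"),   # strings starting with "bb" or "bc"
    ("c", "c"),    # strings starting with "c"
]
BOUNDS = [low for low, _ in RANGES]


def encode(s):
    """Translate `s` into a list of range codes, one code per token."""
    codes = []
    while s:
        i = bisect_right(BOUNDS, s) - 1      # index of the range containing s
        if i < 0 or not s.startswith(RANGES[i][1]):
            raise ValueError("string is outside the assumed alphabet")
        codes.append(i)                      # the range index serves as the code
        s = s[len(RANGES[i][1]):]            # strip the token and continue
    return codes


words = ["bb", "a", "bab", "c", "ba", "bc"]
# Sorting by encoding yields the same order as sorting the strings directly.
same_order = sorted(words, key=encode) == sorted(words)
```

Python compares lists lexicographically, which plays the role of the collating order of the second alphabet here.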


    Concurrency-control method and apparatus in a database management system
utilizing key-valued locking
    8.
    Granted patent (expired)

    Publication No.: US5485607A

    Publication date: 1996-01-16

    Application No.: US14188

    Application date: 1993-02-05

    IPC classification: G06F17/30

    Abstract: The concurrency-control mechanism in a database-management system achieves high concurrency by using a lock-mode set larger than that conventionally employed for multi-granularity locking. In a system of key-valued locking in which locks on key-value ranges are acquired separately from the locks on the key values with which they are associated, the IX lock mode conventionally acquired on a range by update, insert, and delete operations is replaced with three separate lock modes respectively associated with those operations and invoked by them for range locking. In key-valued-locking systems in which ranges are locked together with the key values associated with them, the mode set is further expanded so that each mode represents a different combination of range and key-value locks.
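
For context, the conventional multi-granularity mode set that this abstract takes as its starting point can be written as a compatibility table. The sketch below encodes only that well-known baseline (IS, IX, S, SIX, X); the three replacement modes and the combined range and key-value modes are not specified in the abstract and are not reproduced here. The names `COMPATIBLE` and `can_grant` are hypothetical.

```python
# Conventional multi-granularity lock compatibility (baseline only).
# COMPATIBLE[held] is the set of modes another transaction may still acquire.
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}


def can_grant(requested, held_modes):
    """Return True if `requested` is compatible with every mode already held."""
    return all(requested in COMPATIBLE[held] for held in held_modes)


# With a single IX mode, update, insert, and delete all look alike on a range:
# they are compatible with one another and all conflict with a shared lock.
ix_with_ix = can_grant("IX", ["IX"])
ix_with_s = can_grant("IX", ["S"])
```

Because one IX mode stands for all three operations, the table cannot express compatibilities that differ by operation, which is the limitation the larger mode set is meant to address.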


    Concurrency and recovery for index trees with nodal updates using
multiple atomic actions by which the trees integrity is preserved
during undesired system interruptions
    9.
    Granted patent (expired)

    Publication No.: US5276872A

    Publication date: 1994-01-04

    Application No.: US720405

    Application date: 1991-06-25

    IPC classification: G06F11/00

    Abstract: The present invention includes an approach to index tree structure changes which provides high concurrency while being usable with many recovery schemes and with many varieties of index trees. The present invention permits multiple concurrent structure changes. In addition, all update activity and structure change activity above the data level executes in short independent atomic actions which do not impede normal database activity. Only data node splitting executes in the context of a database transaction. This feature makes the approach usable with diverse recovery mechanisms, while impacting concurrency only modestly. Even this impact can be avoided by re-packaging the atomic actions, at the cost of requiring more from the recovery system.
