Reader bias based locking technique enabling high read concurrency for read-mostly workloads

    Publication number: US10535368B1

    Publication date: 2020-01-14

    Application number: US16290431

    Filing date: 2019-03-01

    Abstract: A data object has a lock and a condition indicator associated with it. Based at least partly on detecting a first setting of the condition indicator, a reader stores an indication that the reader has obtained read access to the data object in an element of a readers structure and reads the data object without acquiring the lock. A writer detects the first setting and replaces it with a second setting, indicating that the lock is to be acquired by readers before reading the data object. Prior to performing a write on the data object, the writer verifies that one or more elements of the readers structure have been cleared.
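
    The reader and writer paths described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class name `BiasedObject`, the boolean `read_bias` indicator, and the dictionary keyed by thread id are all assumptions, and the sketch relies on CPython's global interpreter lock where a real implementation would use atomic operations and memory fences.

```python
import threading
import time


class BiasedObject:
    """Hypothetical data object with a lock, a condition indicator, and a readers structure."""

    def __init__(self, value):
        self.value = value
        self.lock = threading.Lock()   # the lock associated with the data object
        self.read_bias = True          # condition indicator; True stands for the first setting
        self.readers = {}              # readers structure: one element per reading thread

    def read(self):
        tid = threading.get_ident()
        if self.read_bias:
            self.readers[tid] = True           # record that this reader has read access
            try:
                if self.read_bias:             # re-check that no writer changed the setting
                    return self.value          # read without acquiring the lock
            finally:
                self.readers.pop(tid, None)    # clear this reader's element
        with self.lock:                        # second setting: readers must take the lock
            return self.value

    def write(self, value):
        with self.lock:
            self.read_bias = False             # replace the first setting with the second
            while self.readers:                # verify that reader elements have been cleared
                time.sleep(0)                  # yield so in-flight readers can finish
            self.value = value
```

    The sketch never restores the first setting after a write; whether and when to do so is a policy choice the abstract does not describe.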

    Techniques for enhancing progress for hardware transactional memory

    Publication number: US10346196B2

    Publication date: 2019-07-09

    Application number: US15221428

    Filing date: 2016-07-27

    Abstract: Hardware transactional memory (HTM) systems may guarantee that transactions commit without falling back to non-speculative code paths. A transaction that fails to progress may enter a power mode, giving the transaction priority when it conflicts with non-power-mode transactions. If, during execution of a power-mode transaction, another thread attempts, using a non-power-mode transaction, to access a shared resource being accessed by the power-mode transaction, it may be determined whether any actual data conflict occurs between the two transactions. If no data conflict exists, both transactions may continue to completion. If, however, a data conflict does exist, the power-mode transaction may deny the other transaction access to the shared resource. HTM systems may, in some embodiments, ensure that only one power-mode transaction exists at a time. In other embodiments, multiple, concurrent, power-mode transactions may be supported while ensuring that they access disjoint data sets.

    System and method for mitigating the impact of branch misprediction when exiting spin loops

    Publication number: US10191741B2

    Publication date: 2019-01-29

    Application number: US15090554

    Filing date: 2016-04-04

    Abstract: A computer system may recognize a busy-wait loop in program instructions at compile time and/or may recognize busy-wait looping behavior during execution of program instructions. The system may recognize that an exit condition for a busy-wait loop is specified by a conditional branch type instruction in the program instructions. In response to identifying the loop and the conditional branch type instruction that specifies its exit condition, the system may influence or override a prediction made by a dynamic branch predictor, resulting in a prediction that the exit condition will be met and that the loop will be exited regardless of any observed branch behavior for the conditional branch type instruction. The looping instructions may implement waiting for an inter-thread communication event to occur or for a lock to become available. When the exit condition is met, the loop may be exited without incurring a misprediction delay.

    System and Method for Promoting Reader Groups for Lock Cohorting

    Publication number: US20180349211A1

    Publication date: 2018-12-06

    Application number: US16056094

    Filing date: 2018-08-06

    Abstract: NUMA-aware reader-writer locks may leverage lock cohorting techniques that introduce a synthetic level into the lock hierarchy (e.g., one whose nodes do not correspond to the system topology). The synthetic level may include a global reader lock and a global writer lock. A writer thread may acquire a node-level writer lock, then the global writer lock, and then the top-level lock, after which it may access a critical section protected by the lock. The writer may release the lock (if an upper bound on consecutive writers has been met), or may pass the lock to another writer (on the same node or a different node, according to a fairness policy). A reader may acquire the global reader lock (whether or not node-level reader locks are present), and then the top-level lock. However, readers may only hold these locks long enough to increment reader counts associated with them.

    Generic Concurrency Restriction
    Type: Invention application

    Publication number: US20180107514A1

    Publication date: 2018-04-19

    Application number: US15298090

    Filing date: 2016-10-19

    Abstract: Generic Concurrency Restriction (GCR) may divide a set of threads waiting to acquire a lock into two sets: an active set currently able to contend for the lock, and a passive set waiting for an opportunity to join the active set and contend for the lock. The number of threads in the active set may be limited to a predefined maximum or even a single thread. Generic Concurrency Restriction may be implemented as a wrapper around an existing lock implementation. Generic Concurrency Restriction may, in some embodiments, be unfair (e.g., to some threads) over the short term, but may improve the overall throughput of the underlying multithreaded application via passivation of a portion of the waiting threads.
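
    The wrapper idea in this abstract might look something like the sketch below. It is only an illustration under stated assumptions: `GCRLock`, `max_active`, and `peak_active` are hypothetical names, and parking passive threads on a condition variable is a simplification chosen here, not a detail taken from the patent.

```python
import threading


class GCRLock:
    """Hypothetical wrapper that bounds how many threads may contend for an inner lock."""

    def __init__(self, inner, max_active=1):
        self._inner = inner              # the existing lock implementation being wrapped
        self._max_active = max_active    # predefined maximum size of the active set
        self._active = 0                 # threads currently in the active set
        self._cv = threading.Condition()
        self.peak_active = 0             # largest active set observed (for inspection only)

    def acquire(self):
        with self._cv:
            while self._active >= self._max_active:
                self._cv.wait()          # passive set: wait for a chance to join
            self._active += 1            # join the active set
            self.peak_active = max(self.peak_active, self._active)
        self._inner.acquire()            # only active threads contend for the lock

    def release(self):
        self._inner.release()
        with self._cv:
            self._active -= 1            # leave the active set
            self._cv.notify()            # let one passive thread become active

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, *exc):
        self.release()
```

    A caller would wrap an ordinary lock, for example `GCRLock(threading.Lock(), max_active=2)`, and use it in a `with` statement. The sketch makes no attempt at long-term fairness between the two sets.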

    Systems and methods for adaptive integration of hardware and software lock elision techniques

    Publication number: US09619281B2

    Publication date: 2017-04-11

    Application number: US14936619

    Filing date: 2015-11-09

    Abstract: Particular techniques for improving the scalability of concurrent programs (e.g., lock-based applications) may be effective in some environments and for some workloads, but not others. The systems described herein may automatically choose appropriate ones of these techniques to apply when executing lock-based applications at runtime, based on observations of the application in the current environment and with the current workload. In one example, two techniques for improving lock scalability (e.g., transactional lock elision using hardware transactional memory, and optimistic software techniques) may be integrated together. A lightweight runtime library built for this purpose may adapt its approach to managing concurrency by dynamically selecting one or more of these techniques (at different times) during execution of a given application. In this Adaptive Lock Elision approach, the techniques may be selected (based on pluggable policies) at runtime to achieve good performance on different platforms and for different workloads.

    Systems and Methods for Performing Concurrency Restriction and Throttling over Contended Locks
    Type: Invention application
    Legal status: Under examination (published)

    Publication number: US20170039094A1

    Publication date: 2017-02-09

    Application number: US14818213

    Filing date: 2015-08-04

    Inventor: David Dice

    CPC classification number: G06F9/526

    Abstract: A concurrency-restricting lock may divide a set of threads waiting to acquire the lock into an active circulating set (ACS) that contends for the lock, and a passive set (PS) that awaits an opportunity to contend for the lock. The lock, which may include multiple constituent lock types, lists, or queues, may be unfair over the short term, but improve throughput of the underlying multithreaded application. Culling and long-term fairness policies may be applied to the lock to move excess threads from the ACS to the PS or promote threads from the PS to the ACS. These policies may constrain the size or distribution of threads in the ACS (which may be NUMA-aware). A waiting policy may avoid aggressive promotion from the PS to the ACS, and a short-term fairness policy may move a thread from the tail of a list or queue to its head.
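
    The three policy moves named in this abstract can be illustrated on plain queues. The sketch below is an assumption-laden toy: the function names `cull`, `promote`, and `tail_to_head` are hypothetical, threads are represented by strings, and the choice of which thread to cull or promote is one simple possibility rather than the policy the patent specifies.

```python
from collections import deque


def cull(acs, ps, max_active):
    """Culling: move excess threads from the tail of the ACS to the PS."""
    while len(acs) > max_active:
        ps.append(acs.pop())


def promote(acs, ps):
    """Long-term fairness: promote the thread that has been passive the longest."""
    if ps:
        acs.append(ps.popleft())


def tail_to_head(queue):
    """Short-term fairness: move the thread at the tail of a queue to its head."""
    if queue:
        queue.appendleft(queue.pop())


acs = deque(["t1", "t2", "t3", "t4"])   # active circulating set
ps = deque()                            # passive set
cull(acs, ps, max_active=2)             # t4 and t3 become passive
promote(acs, ps)                        # t4 rejoins the ACS
tail_to_head(acs)                       # t4 moves to the head of the ACS
```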


    System and Method for Implementing Reader-Writer Locks Using Hardware Transactional Memory
    Type: Invention application
    Legal status: Under examination (published)

    Publication number: US20160259663A1

    Publication date: 2016-09-08

    Application number: US15156110

    Filing date: 2016-05-16

    CPC classification number: G06F9/467 G06F9/5027 G06F9/528 G06F2209/523

    Abstract: Transactional reader-writer locks may leverage available hardware transactional memory (HTM) to simplify the procedures of the reader-writer lock algorithm and to eliminate a requirement for type stable memory. An HTM-based reader-writer lock may include an ordered list of client-provided nodes, each of which represents a thread that holds (or desires to acquire) the lock, and a tail pointer. The locking and unlocking procedures invoked by readers and writers may access the tail pointer or particular ones of the nodes in the list using various combinations of transactions and non-transactional accesses to insert nodes into the list or to remove nodes from the list. A reader or writer that owns a node at the head of the list (or a reader whose node is preceded in the list only by other readers' nodes) may access a critical section of code or shared resource.
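
    The admission rule in the last sentence of this abstract can be written as a small predicate. The sketch below covers only that rule: it models the ordered list as a Python list of `"R"` and `"W"` markers (an assumption made for illustration) and says nothing about the transactions or the tail pointer. The name `may_enter` is hypothetical.

```python
def may_enter(nodes, index):
    """Return True if the thread owning nodes[index] may enter the critical section.

    nodes is ordered from the head of the list to its tail; each entry is
    "R" for a reader's node or "W" for a writer's node.
    """
    if index == 0:
        return True                      # the owner of the head node may always enter
    if nodes[index] != "R":
        return False                     # a writer must be at the head
    # A reader may enter if every node ahead of it belongs to a reader.
    return all(kind == "R" for kind in nodes[:index])
```

    For the list `["R", "R", "W", "R"]`, the first two readers may enter together, while the writer and the reader queued behind it must wait.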


    System and method for implementing NUMA-aware statistics counters
    Type: Granted invention patent
    Legal status: In force

    Publication number: US08918596B2

    Publication date: 2014-12-23

    Application number: US13722817

    Filing date: 2012-12-20

    CPC classification number: G06F13/18 G06F9/5027 G06F9/52 G06F9/526

    Abstract: The systems and methods described herein may be used to implement scalable statistics counters suitable for use in systems that employ a NUMA style memory architecture. The counters may be implemented as data structures that include a count value portion and a node identifier portion. The counters may be accessible within transactions. The node identifier portion may identify a node on which a thread that most recently incremented the counter was executing or one on which a thread that has requested priority to increment the shared counter was executing. Threads executing on identified nodes may have higher priority to increment the counter than other threads. Threads executing on other nodes may delay their attempts to increment the counter, thus encouraging consecutive updates from threads on a single node. Impatient threads may attempt to update the node identifier portion or may update an anti-starvation variable to indicate a request for priority.
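
    One way to picture a counter with a count value portion and a node identifier portion is to pack both into a single integer. The sketch below does that under several assumptions: the 8-bit node field, the names `pack`, `unpack`, and `increment`, and the returned "should delay" flag are all hypothetical, and a real implementation would update the word atomically (for example with compare-and-swap or inside a transaction). The anti-starvation variable is not modeled.

```python
NODE_BITS = 8                       # assumed width of the node identifier portion
NODE_MASK = (1 << NODE_BITS) - 1


def pack(count, node):
    """Combine a count value and a node identifier into one word."""
    return (count << NODE_BITS) | (node & NODE_MASK)


def unpack(word):
    """Split a word into (count, node)."""
    return word >> NODE_BITS, word & NODE_MASK


def increment(word, node):
    """Increment the counter on behalf of a thread running on `node`.

    Returns the new word and a flag saying whether the thread was on a
    different node than the last incrementer, in which case it would have
    delayed its attempt to favor consecutive updates from a single node.
    """
    count, owner = unpack(word)
    should_delay = owner != node
    return pack(count + 1, node), should_delay
```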

