Global secondary path locking technique enabling high read concurrency for read-mostly workloads

    公开(公告)号:US11056145B2

    公开(公告)日:2021-07-06

    申请号:US16203511

    申请日:2018-11-28

    Abstract: A reader of a set of data accessors that includes readers and writer detects that a particular lock of a first collection of non-global locks associated with a data object of a computing environment is held by another accessor. After checking a blocking indicator, the reader uses a second lock (which is not part of the first collection) to obtain read access to the data object and implements its reads without acquiring the particular lock. Prior to implementing a write on the data object, a writer acquires at least some locks of the first collection, and sets the blocking indicator to prevent readers from using the second lock to obtain read access to the data object.

    Method and system for inter-thread communication using processor messaging

    公开(公告)号:US10776154B2

    公开(公告)日:2020-09-15

    申请号:US14697510

    申请日:2015-04-27

    Abstract: In shared-memory computer systems, threads may communicate with one another using shared memory. A receiving thread may poll a message target location repeatedly to detect the delivery of a message. Such polling may cause excessive cache coherency traffic and/or congestion on various system buses and/or other interconnects. A method for inter-processor communication may reduce such bus traffic by reducing the number of reads performed and/or the number of cache coherency messages necessary to pass messages. The method may include a thread reading the value of a message target location once, and determining that this value has been modified by detecting inter-processor messages, such as cache coherence messages, indicative of such modification. In systems that support transactional memory, a thread may use transactional memory primitives to detect the cache coherence messages. This may be done by starting a transaction, reading the target memory location, and spinning until the transaction is aborted.

    Reader Bias Based Locking Technique Enabling High Read Concurrency For Read-Mostly Workloads

    公开(公告)号:US20200152236A1

    公开(公告)日:2020-05-14

    申请号:US16739851

    申请日:2020-01-10

    Abstract: A data object has a lock and a condition indicator associated with it. Based at least partly on detecting a first setting of the condition indicator, a reader stores an indication that the reader has obtained read access to the data object in an element of a readers structure and reads the data object without acquiring the lock. A writer detects the first setting and replaces it with a second setting, indicating that the lock is to be acquired by readers before reading the data object. Prior to performing a write on the data object, the writer verifies that one or more elements of the readers structure have been cleared.

    Fine-grained hardware transactional lock elision

    公开(公告)号:US10534538B2

    公开(公告)日:2020-01-14

    申请号:US15050393

    申请日:2016-02-22

    Abstract: Concurrent threads may be synchronized at the level of the memory words they access rather than at the level of the lock that protects the execution of critical sections. Each lock may be associated with an array of flags and each flag may indicate ownership of certain memory words. A pessimistic thread may set flags corresponding to memory words it is accessing in the critical section, while an optimistic thread may read the corresponding flag before any memory access to ensure that the flag is not set and that therefore the associated memory word is not being accessed by the other thread. Thus, optimistic threads that do not have conflicts with the pessimistic thread may not have to wait for the pessimistic thread to release the lock before proceeding.

    Systems and methods for performing concurrency restriction and throttling over contended locks

    公开(公告)号:US10417056B2

    公开(公告)日:2019-09-17

    申请号:US14818213

    申请日:2015-08-04

    Inventor: David Dice

    Abstract: A concurrency-restricting lock may divide a set of threads waiting to acquire the lock into an active circulating set (ACS) that contends for the lock, and a passive set (PS) that awaits an opportunity to contend for the lock. The lock, which may include multiple constituent lock types, lists, or queues, may be unfair over the short term, but improve throughput of the underlying multithreaded application. Culling and long-term fairness policies may be applied to the lock to move excess threads from the ACS to the PS or promote threads from the PS to the ACS. These policies may constraint the size or distribution of threads in the ACS (which may be NUMA-aware). A waiting policy may avoid aggressive promotion from the PS to the ACS, and a short-term fairness policy may move a thread from the tail of a list or queue to its head.

    Permuted Memory Access Mapping
    106.
    发明申请

    公开(公告)号:US20180307617A1

    公开(公告)日:2018-10-25

    申请号:US15493035

    申请日:2017-04-20

    Abstract: When performing non-sequential accesses to large data sets, hot spots may be avoided by permuting the memory locations being accesses to more evenly spread those accesses across the memory and across multiple memory channels. A permutation step may be used when accessing data, such as to improve the distribution of those memory accesses within the system. Instead of accessing one memory address, that address may be permuted so that another memory address is accessed. Non-sequential accesses to an array may be modified such that each index to the array is permuted to another index in the array. Collisions between pre- and post-translation addresses may be prevented and one-to-one mappings may be used. Permutation mechanisms may be implemented in software, hardware, or a combination of both, with or without the knowledge of the process performing the memory accesses.

    System and method for implementing shared probabilistic counters storing update probability values
    107.
    发明授权
    System and method for implementing shared probabilistic counters storing update probability values 有权
    用于实现存储更新概率值的共享概率计数器的系统和方法

    公开(公告)号:US09417910B2

    公开(公告)日:2016-08-16

    申请号:US13722839

    申请日:2012-12-20

    CPC classification number: G06F9/4881 G06F9/52 G06F9/524

    Abstract: The systems and methods described herein may implement probabilistic counters and/or update mechanisms for those counters such that they are dependent on the value of a configurable accuracy parameter. The accuracy parameter value may be adjusted to provide fine-grained control over the tradeoff between the accuracy of the counters and the performance of applications that access them. The counters may be implemented as data structures that include a mantissa portion and an exponent portion that collectively represent an update probability value. When updating the counters, the value of the configurable accuracy parameter may affect whether, when, how often, or by what amount the mantissa portion and/or the exponent portion are updated. Updating a probabilistic counter may include multiplying its value by a constant that is dependent on the value of a configurable accuracy parameter. The counters may be accessible within transactions. The counters may have deterministic update policies.

    Abstract translation: 这里描述的系统和方法可以为这些计数器实现概率计数器和/或更新机制,使得它们取决于可配置精度参数的值。 可以调整精度参数值,以便在计数器的准确性和访问它们的应用程序的性能之间进行权衡,以提供细粒度的控制。 计数器可以被实现为数据结构,其包括统一地表示更新概率值的尾数部分和指数部分。 当更新计数器时,可配置的精度参数的值可能会影响是否,何时,多长时间还是影响尾数部分和/或指数部分的更新量。 更新概率计数器可以包括将其值乘以一个取决于可配置精度参数的值的常数。 柜台可能在交易中可访问。 计数器可能具有确定性的更新策略。

    Systems and Methods for Safely Subscribing to Locks Using Hardware Extensions
    108.
    发明申请
    Systems and Methods for Safely Subscribing to Locks Using Hardware Extensions 审中-公开
    使用硬件扩展安全地使用锁定的系统和方法

    公开(公告)号:US20160011915A1

    公开(公告)日:2016-01-14

    申请号:US14736123

    申请日:2015-06-10

    Abstract: Transactional Lock Elision allows hardware transactions to execute unmodified critical sections protected by the same lock concurrently, by subscribing to the lock and verifying that it is available before committing the transaction. A “lazy subscription” optimization, which delays lock subscription, can potentially cause behavior that cannot occur when the critical sections are executed under the lock. Hardware extensions may provide mechanisms to ensure that lazy subscriptions are safe (e.g., that they result in correct behavior). Prior to executing a critical section transactionally, its lock and subscription code may be identified (e.g., by writing their locations to special registers). Prior to committing the transaction, the thread executing the critical section may verify that the correct lock was correctly subscribed to. If not, or if locations identified by the special registers have been modified, the transaction may be aborted. Nested critical sections associated with different lock types may invoke different subscription code.

    Abstract translation: 事务锁定Elision允许硬件事务通过订阅锁并在提交事务之前验证它是否可用来同时执行受同一锁定保护的未修改的关键段。 延迟锁订阅的“延迟订阅”优化可能会导致在锁定下执行关键部分时不会发生的行为。 硬件扩展可以提供机制来确保延迟订阅是安全的(例如,它们导致正确的行为)。 在事务执行关键部分之前,可以识别其锁定和订阅代码(例如,通过将其位置写入特殊寄存器)。 在提交事务之前,执行关键部分的线程可能会验证正确锁定是否正确。 如果不是,或者如果由特殊寄存器识别的位置已被修改,则可能会中止该事务。 与不同锁类型相关联的嵌套关键部分可能会调用不同的订阅代码。

    Supporting targeted stores in a shared-memory multiprocessor system
    109.
    发明授权
    Supporting targeted stores in a shared-memory multiprocessor system 有权
    在共享内存多处理器系统中支持目标存储

    公开(公告)号:US09110718B2

    公开(公告)日:2015-08-18

    申请号:US13625700

    申请日:2012-09-24

    CPC classification number: G06F9/50 G06F9/5066 G06F9/544 G06F12/0888

    Abstract: The present embodiments provide a system for supporting targeted stores in a shared-memory multiprocessor. A targeted store enables a first processor to push a cache line to be stored in a cache memory of a second processor in the shared-memory multiprocessor. This eliminates the need for multiple cache-coherence operations to transfer the cache line from the first processor to the second processor. The system includes an interface, such as an application programming interface (API), and a system call interface or an instruction-set architecture (ISA) that provides access to a number of mechanisms for supporting targeted stores. These mechanisms include a thread-location mechanism that determines a location near where a thread is executing in the shared-memory multiprocessor, and a targeted-store mechanism that targets a store to a location (e.g., cache memory) in the shared-memory multiprocessor.

    Abstract translation: 本实施例提供一种用于在共享存储器多处理器中支持目标存储的系统。 目标商店使得第一处理器能够将存储在共享存储器多处理器中的第二处理器的高速缓冲存储器中的高速缓存行推送。 这消除了对多个高速缓存相干操作的需要,以将高速缓存行从第一处理器传送到第二处理器。 该系统包括诸如应用编程接口(API)的接口以及提供对用于支持目标商店的多种机制的访问的系统调用接口或指令集架构(ISA)。 这些机制包括一个线程定位机制,它确定线程在共享存储器多处理器中执行的位置附近的位置,以及将存储器定位到共享存储器多处理器中的位置(例如高速缓冲存储器)的目标存储机制 。

    System and Method for Implementing NUMA-Aware Statistics Counters
    110.
    发明申请
    System and Method for Implementing NUMA-Aware Statistics Counters 有权
    实现NUMA感知统计计数器的系统和方法

    公开(公告)号:US20140181423A1

    公开(公告)日:2014-06-26

    申请号:US13722817

    申请日:2012-12-20

    CPC classification number: G06F13/18 G06F9/5027 G06F9/52 G06F9/526

    Abstract: The systems and methods described herein may be used to implement scalable statistics counters suitable for use in systems that employ a NUMA style memory architecture. The counters may be implemented as data structures that include a count value portion and a node identifier portion. The counters may be accessible within transactions. The node identifier portion may identify a node on which a thread that most recently incremented the counter was executing or one on which a thread that has requested priority to increment the shared counter was executing. Threads executing on identified nodes may have higher priority to increment the counter than other threads. Threads executing on other nodes may delay their attempts to increment the counter, thus encouraging consecutive updates from threads on a single node. Impatient threads may attempt to update the node identifier portion or may update an anti-starvation variable to indicate a request for priority.

    Abstract translation: 本文描述的系统和方法可以用于实现适用于采用NUMA风格存储器架构的系统中的可伸缩统计计数器。 计数器可以被实现为包括计数值部分和节点标识符部分的数据结构。 柜台可能在交易中可访问。 节点标识符部分可以标识正在执行计数器最近递增的线程的节点,或者正在执行已经请求优先级以增加共享计数器的线程的节点。 在标识节点上执行的线程可能比其他线程具有更高的优先级来增加计数器。 在其他节点上执行的线程可能会延迟其增加计数器的尝试,从而鼓励单个节点上线程的连续更新。 不耐烦的线程可以尝试更新节点标识符部分,或者可以更新反饥饿变量以指示优先级请求。

Patent Agency Ranking