Compact NUMA-aware locks
    Invention Grant

    Publication Number: US11494242B2

    Publication Date: 2022-11-08

    Application Number: US17200610

    Application Date: 2021-03-12

    Abstract: A computer comprising multiple processors and non-uniform memory implements multiple threads that perform a lock operation using a shared lock structure that includes a pointer to the tail of a first-in-first-out (FIFO) queue of threads waiting to acquire the lock. To acquire the lock, a thread allocates a data structure and appends it to the FIFO queue. The lock is released by selecting and notifying a waiting thread to which control is transferred, with the selected thread executing on the same processor socket as the thread controlling the lock. A secondary queue of threads deferred during the selection process is maintained within the data structures of the waiting threads themselves, so that no additional memory is required within the lock structure. If no threads executing on the same processor socket are waiting for the lock, entries in the secondary queue are transferred back to the FIFO queue, preserving FIFO order.
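
    The sketch below is one illustrative reading of this abstract, not the patented implementation: a compact NUMA-aware queue lock whose entire lock structure is a single tail pointer, with the secondary queue of deferred remote waiters threaded through the waiters' own nodes. The Node and CNALock names, the explicit socket argument, and the simplified handling of in-flight enqueues are all assumptions of the sketch.

```cpp
#include <atomic>

struct Node {
    std::atomic<Node*> next{nullptr};
    std::atomic<int> granted{0};   // set nonzero when the lock is handed to us
    int socket = -1;               // socket this waiter runs on
    Node* sec_head = nullptr;      // secondary queue of deferred remote waiters
    Node* sec_tail = nullptr;      // (meaningful only in the holder's node)
};

struct CNALock {
    std::atomic<Node*> tail{nullptr};   // entire lock state: one word

    void lock(Node* me, int my_socket) {
        me->next.store(nullptr, std::memory_order_relaxed);
        me->granted.store(0, std::memory_order_relaxed);
        me->socket = my_socket;
        me->sec_head = me->sec_tail = nullptr;
        Node* prev = tail.exchange(me, std::memory_order_acq_rel);
        if (prev == nullptr) return;                  // lock was free
        prev->next.store(me, std::memory_order_release);
        while (!me->granted.load(std::memory_order_acquire)) { /* spin */ }
    }

    void unlock(Node* me) {
        Node* succ = me->next.load(std::memory_order_acquire);
        if (succ == nullptr) {
            Node* exp = me;
            if (me->sec_head != nullptr) {
                // Main queue drained: reinstate the deferred waiters as the
                // new main queue, preserving their FIFO order.
                if (tail.compare_exchange_strong(exp, me->sec_tail)) {
                    grant(me->sec_head);
                    return;
                }
            } else if (tail.compare_exchange_strong(exp, nullptr)) {
                return;                               // no waiters at all
            }
            while ((succ = me->next.load(std::memory_order_acquire)) == nullptr) {}
        }
        // Find the first waiter on our socket.
        Node* cur = succ;
        Node* last_skipped = nullptr;
        while (cur != nullptr && cur->socket != me->socket) {
            last_skipped = cur;
            cur = cur->next.load(std::memory_order_acquire);
        }
        Node* heir;
        if (cur == nullptr) {
            // No same-socket waiter: if we hold deferred waiters, splice them
            // in front of the queue head (they arrived earlier, so FIFO order
            // is preserved) and hand off to the oldest of them.
            heir = succ;
            if (me->sec_head != nullptr) {
                me->sec_tail->next.store(succ, std::memory_order_relaxed);
                heir = me->sec_head;
                me->sec_head = me->sec_tail = nullptr;
            }
        } else {
            heir = cur;
            if (cur != succ) {
                // Defer the skipped remote prefix [succ .. last_skipped] onto
                // the secondary queue, stored in the waiters' own nodes.
                if (me->sec_head == nullptr) me->sec_head = succ;
                else me->sec_tail->next.store(succ, std::memory_order_relaxed);
                me->sec_tail = last_skipped;
                last_skipped->next.store(nullptr, std::memory_order_relaxed);
            }
        }
        heir->sec_head = me->sec_head;   // the heir inherits the secondary queue
        heir->sec_tail = me->sec_tail;
        grant(heir);
    }

    static void grant(Node* n) { n->granted.store(1, std::memory_order_release); }
};
```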

    Fine-grained hardware transactional lock elision

    Publication Number: US11487427B2

    Publication Date: 2022-11-01

    Application Number: US16739839

    Application Date: 2020-01-10

    Abstract: Concurrent threads may be synchronized at the level of the memory words they access rather than at the level of the lock that protects the execution of critical sections. Each lock may be associated with an array of flags and each flag may indicate ownership of certain memory words. A pessimistic thread may set flags corresponding to memory words it is accessing in the critical section, while an optimistic thread may read the corresponding flag before any memory access to ensure that the flag is not set and that therefore the associated memory word is not being accessed by the other thread. Thus, optimistic threads that do not have conflicts with the pessimistic thread may not have to wait for the pessimistic thread to release the lock before proceeding.
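
    A minimal sketch of the idea as described: each lock carries an array of flags indexed by a hash of the accessed address; the pessimistic path sets flags before touching words, and the optimistic path reads the flag inside a hardware transaction, so that a later flag write by the pessimistic thread aborts the transaction. Intel RTM is assumed (compile with -mrtm); the flag-array size, the hash, and all names are illustrative, and the pessimistic path's underlying lock is omitted.

```cpp
#include <immintrin.h>
#include <atomic>
#include <cstdint>

constexpr size_t kNumFlags = 1024;

struct ElidableLock {
    std::atomic<uint8_t> flags[kNumFlags]{};   // one flag per hashed word

    static size_t slot(const void* addr) {
        return (reinterpret_cast<uintptr_t>(addr) >> 3) % kNumFlags;
    }
    // Pessimistic thread (holding the lock): claim a word before touching it.
    void claim(const void* addr)   { flags[slot(addr)].store(1, std::memory_order_release); }
    void release(const void* addr) { flags[slot(addr)].store(0, std::memory_order_release); }
    // Optimistic thread (inside a transaction): check the flag before any
    // access. The read enters the transaction's read set, so a pessimistic
    // thread setting it afterwards aborts the transaction automatically.
    bool unclaimed(const void* addr) const {
        return flags[slot(addr)].load(std::memory_order_acquire) == 0;
    }
};

// Optimistic increment: proceeds concurrently with a pessimistic lock
// holder as long as the two threads touch different words.
bool optimistic_increment(ElidableLock& lk, long* counter) {
    if (_xbegin() == _XBEGIN_STARTED) {
        if (!lk.unclaimed(counter)) _xabort(0x01);  // word owned pessimistically
        ++*counter;
        _xend();
        return true;
    }
    return false;   // aborted: retry, or fall back to taking the lock
}
```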

    Intra-Process Caching and Reuse of Threads

    Publication Number: US20220188144A1

    Publication Date: 2022-06-16

    Application Number: US17119998

    Application Date: 2020-12-11

    Abstract: A computer comprising one or more processors and memory implements a thread manager for the multiple threads of an application. The thread manager may implement a process-local cache of standby threads for the application. Upon a request to create a thread for the application, the thread manager may use a standby thread from the process-local cache to create the requested thread, initializing thread-local storage elements and scheduling the thread for execution. Upon a request to terminate a thread of the application, the thread manager may place the thread in an unscheduled state and add it to the process-local cache of standby threads. The thread manager may also add standby threads to, or remove them from, the process-local cache if it determines that the number of standby threads in the cache lies outside a target range.
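
    Below is a minimal sketch of intra-process thread caching under a task-style interface. The names (ThreadCache, Standby, spawn) and the single upper bound standing in for the target range are assumptions of the sketch; a full implementation would also reinitialize thread-local storage when a standby thread is reused, which is only noted in a comment here.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>

class ThreadCache {
    struct Standby {
        ThreadCache* owner;
        std::mutex m;
        std::condition_variable cv;
        std::function<void()> task;    // empty while parked (unscheduled)

        explicit Standby(ThreadCache* o) : owner(o) {}

        void run() {
            for (;;) {
                std::function<void()> body;
                {
                    std::unique_lock<std::mutex> lk(m);
                    cv.wait(lk, [&] { return static_cast<bool>(task); });
                    body = std::move(task);
                    task = nullptr;
                }
                body();   // TLS would be reinitialized before this in full form
                // "Termination": park this thread in the cache instead of
                // exiting, unless the cache is already at its upper bound.
                if (!owner->park(this)) { delete this; return; }
            }
        }
    };

    std::mutex cache_mutex;
    std::deque<Standby*> standby;             // process-local cache
    static constexpr size_t kMaxStandby = 8;  // upper end of the target range

    bool park(Standby* s) {
        std::lock_guard<std::mutex> lk(cache_mutex);
        if (standby.size() >= kMaxStandby) return false;
        standby.push_back(s);
        return true;
    }

public:
    // "Create" a thread: reuse a parked standby thread when one is cached,
    // falling back to a real OS thread otherwise.
    void spawn(std::function<void()> body) {
        Standby* s = nullptr;
        {
            std::lock_guard<std::mutex> lk(cache_mutex);
            if (!standby.empty()) { s = standby.front(); standby.pop_front(); }
        }
        if (s == nullptr) {
            s = new Standby(this);
            std::thread([s] { s->run(); }).detach();
        }
        {
            std::lock_guard<std::mutex> lk(s->m);
            s->task = std::move(body);        // schedule the thread
        }
        s->cv.notify_one();
    }
};
```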

    Systems and methods for performing concurrency restriction and throttling over contended locks

    Publication Number: US11314562B2

    Publication Date: 2022-04-26

    Application Number: US16570952

    Application Date: 2019-09-13

    Inventor: David Dice

    Abstract: A concurrency-restricting lock may divide a set of threads waiting to acquire the lock into an active circulating set (ACS) that contends for the lock and a passive set (PS) that awaits an opportunity to contend for it. The lock, which may include multiple constituent lock types, lists, or queues, may be unfair over the short term but may improve the throughput of the underlying multithreaded application. Culling and long-term fairness policies may be applied to the lock to move excess threads from the ACS to the PS or to promote threads from the PS to the ACS. These policies may constrain the size or distribution of threads in the ACS (which may be NUMA-aware). A waiting policy may avoid aggressive promotion from the PS to the ACS, and a short-term fairness policy may move a thread from the tail of a list or queue to its head.
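
    The following sketch illustrates the core shape of such a lock with deliberately simple stand-ins for the patent's policies: a fixed ACS size bound as the culling policy and a notify-on-release rule as the long-term fairness policy. The class name, the bound of two, and the spin-lock inner lock are all assumptions of the sketch.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>

class RestrictedLock {
    std::atomic<bool> inner{false};       // the underlying lock word
    std::atomic<int> active{0};           // threads admitted to the ACS
    static constexpr int kMaxActive = 2;  // ACS size bound (illustrative)
    std::mutex ps_mutex;                  // guards the passive set
    std::condition_variable ps_cv;        // passive threads block here

public:
    void lock() {
        for (;;) {
            int a = active.load(std::memory_order_relaxed);
            if (a < kMaxActive &&
                active.compare_exchange_weak(a, a + 1, std::memory_order_acquire))
                break;                    // admitted to the ACS
            if (a >= kMaxActive) {        // culled: wait in the passive set
                std::unique_lock<std::mutex> lk(ps_mutex);
                ps_cv.wait(lk, [&] {
                    return active.load(std::memory_order_relaxed) < kMaxActive;
                });
            }
        }
        // Contend with the rest of the ACS for the inner lock.
        while (inner.exchange(true, std::memory_order_acquire)) { /* spin */ }
    }

    void unlock() {
        inner.store(false, std::memory_order_release);
        active.fetch_sub(1, std::memory_order_release);
        // Simple long-term fairness: give one passive thread a chance
        // to join the ACS on each release.
        std::lock_guard<std::mutex> lk(ps_mutex);
        ps_cv.notify_one();
    }
};
```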

    Generic Concurrency Restriction
    Invention Application

    Publication Number: US20220100586A1

    Publication Date: 2022-03-31

    Application Number: US17547505

    Application Date: 2021-12-10

    Abstract: Generic Concurrency Restriction (GCR) may divide a set of threads waiting to acquire a lock into two sets: an active set currently able to contend for the lock, and a passive set waiting for an opportunity to join the active set and contend for the lock. The number of threads in the active set may be limited to a predefined maximum or even a single thread. Generic Concurrency Restriction may be implemented as a wrapper around an existing lock implementation. Generic Concurrency Restriction may, in some embodiments, be unfair (e.g., to some threads) over the short term, but may improve the overall throughput of the underlying multithreaded application via passivation of a portion of the waiting threads.
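
    Because GCR is described as a wrapper around an existing lock implementation, it can be sketched as a thin template over an arbitrary lock type. The admission bound of a single active thread and the yield-based passivation below are illustrative simplifications, not the patented policies.

```cpp
#include <atomic>
#include <mutex>
#include <thread>

template <typename Lock>
class GCR {
    Lock inner;                          // the unmodified underlying lock
    std::atomic<int> active{0};          // size of the active set
    static constexpr int kMaxActive = 1; // could be any predefined maximum

public:
    void lock() {
        // Threads beyond the bound stay passive, yielding until a slot
        // in the active set opens up.
        for (;;) {
            int a = active.load(std::memory_order_relaxed);
            if (a < kMaxActive &&
                active.compare_exchange_weak(a, a + 1, std::memory_order_acquire))
                break;
            std::this_thread::yield();   // passivation (simplified)
        }
        inner.lock();                    // contend on the real lock
    }

    void unlock() {
        inner.unlock();
        active.fetch_sub(1, std::memory_order_release);
    }
};

// Usage: GCR<std::mutex> m; m.lock(); /* critical section */ m.unlock();
```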

    Efficient Condition Variables via Delegated Condition Evaluation

    Publication Number: US20210311773A1

    Publication Date: 2021-10-07

    Application Number: US16837856

    Application Date: 2020-04-01

    Abstract: Efficient use of condition variables for communication between the threads of a multi-threaded application may be ensured using delegated condition evaluation. A thread in a runnable state may request to wait for a change to a condition, the request including instructions that, when executed, return a value indicating whether the wait is to be terminated. The thread may then be placed in a non-runnable state waiting for a change to the condition, and upon a change to the condition, the instructions are executed to obtain the value indicating whether the wait is to be terminated. If the value indicates that the wait is to be terminated, the thread is placed in a runnable state; otherwise, the thread remains in a non-runnable state.
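
    A minimal sketch of delegated condition evaluation: each waiter hands the condition variable a predicate, and the notifying thread evaluates every waiter's predicate on its behalf, waking only the waiters whose predicate says the wait should end. The class and member names are illustrative, and as a simplification the internal mutex here also guards the shared state that the predicates read.

```cpp
#include <condition_variable>
#include <functional>
#include <list>
#include <mutex>

class DelegatingCondVar {
    struct Waiter {
        std::function<bool()> should_wake;  // runs in the notifier's context
        std::condition_variable cv;
        bool woken = false;
    };
    std::mutex m;               // guards the waiter list (and, here, the state)
    std::list<Waiter*> waiters;

public:
    // Block the calling thread until a notifier finds its predicate true.
    void wait(std::function<bool()> pred) {
        Waiter w;
        w.should_wake = std::move(pred);
        std::unique_lock<std::mutex> lk(m);
        if (w.should_wake()) return;        // condition already satisfied
        waiters.push_back(&w);
        w.cv.wait(lk, [&] { return w.woken; });
    }

    // Called after the guarded state changes: evaluate each waiter's
    // predicate on its behalf; only matching waiters become runnable.
    void notify() {
        std::lock_guard<std::mutex> lk(m);
        for (auto it = waiters.begin(); it != waiters.end();) {
            if ((*it)->should_wake()) {
                (*it)->woken = true;
                (*it)->cv.notify_one();
                it = waiters.erase(it);
            } else {
                ++it;                       // wait not terminated: stay parked
            }
        }
    }
};
```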

    Systems and methods for safely subscribing to locks using hardware extensions

    Publication Number: US20210191788A1

    Publication Date: 2021-06-24

    Application Number: US16723766

    Application Date: 2019-12-20

    Abstract: Transactional Lock Elision allows hardware transactions to execute unmodified critical sections protected by the same lock concurrently, by subscribing to the lock and verifying that it is available before committing the transaction. A “lazy subscription” optimization, which delays lock subscription, can potentially cause behavior that cannot occur when the critical sections are executed under the lock. Hardware extensions may provide mechanisms to ensure that lazy subscriptions are safe (e.g., that they result in correct behavior). Prior to executing a critical section transactionally, its lock and subscription code may be identified (e.g., by writing their locations to special registers). Prior to committing the transaction, the thread executing the critical section may verify that the correct lock was correctly subscribed to. If not, or if locations identified by the special registers have been modified, the transaction may be aborted. Nested critical sections associated with different lock types may invoke different subscription code.
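
    The proposed hardware extensions cannot be expressed in portable code, so the sketch below models them in software: the thread records which lock guards the critical section before executing transactionally (standing in for the special registers), and just before committing it performs the late subscription and verifies that the identified lock is free. With Intel RTM (compile with -mrtm), reading the lock word inside the transaction also adds it to the read set, so a concurrent acquisition aborts the transaction rather than letting it commit. All names are illustrative.

```cpp
#include <immintrin.h>   // Intel RTM intrinsics
#include <atomic>

struct TleLock { std::atomic<int> word{0}; };   // 0 = free, nonzero = held

// Software model of the patent's "special register": the lock identified
// before the transactional critical section begins.
thread_local TleLock* identified_lock = nullptr;

template <typename CriticalSection>
bool try_elide(TleLock* lk, CriticalSection body) {
    identified_lock = lk;   // identify the lock prior to executing transactionally
    if (_xbegin() == _XBEGIN_STARTED) {
        body();             // critical section runs without an up-front subscription
        // Commit-time verification that the proposed hardware would enforce:
        // subscribe to the identified lock and confirm it is free. If the
        // lock is held, abort instead of committing.
        if (identified_lock->word.load(std::memory_order_relaxed) != 0)
            _xabort(0x01);
        _xend();
        return true;
    }
    return false;           // abort path: caller acquires lk pessimistically
}
```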
