51. System and method for tracking references to shared objects using byte-addressable per-thread reference counters
    Granted Patent (In Force)

    Publication Number: US08677076B2

    Publication Date: 2014-03-18

    Application Number: US12750455

    Application Date: 2010-03-30

    CPC classification number: G06F12/0261

    Abstract: The system described herein may track references to a shared object by concurrently executing threads using a reference tracking data structure that includes an owner field and an array of byte-addressable per-thread entries, each including a per-thread reference counter and a per-thread counter lock. Slotted threads assigned to a given array entry may increment or decrement the per-thread reference counter in that entry in response to referencing or dereferencing the shared object. Unslotted threads may increment or decrement a shared unslotted reference counter. A thread may update the data structure and/or examine it to determine whether the number of references to the shared object is zero or non-zero using a blocking-optimistic or a non-blocking mechanism. A checking thread may acquire ownership of the data structure, obtain an instantaneous snapshot of all counters, and return a value indicating whether the number of references to the shared object is zero or non-zero.

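    Illustrative sketch: a minimal C rendering of the data-structure shape described in the abstract, assuming 64 slotted-thread entries and invented names (ref_tracker_t, ref_acquire, ref_release, refs_zero); it shows byte-sized per-thread counters plus a shared counter for unslotted threads, while the owner field, the per-entry counter locks, and the blocking-optimistic and non-blocking snapshot protocols are omitted.

        #include <stdatomic.h>
        #include <stdbool.h>

        #define SLOTS 64                      /* assumed number of slotted-thread entries */

        typedef struct {
            atomic_uintptr_t owner;           /* would be claimed by a checking thread */
            atomic_schar     slot[SLOTS];     /* byte-addressable per-thread reference counters */
            atomic_int       unslotted;       /* shared counter for unslotted threads */
        } ref_tracker_t;

        /* A slotted thread (slot >= 0) touches only its own byte, so reference
         * traffic from different slotted threads never contends on one counter.
         * An individual slot may go negative if a thread releases a reference
         * another thread acquired; only the sum across counters matters. */
        void ref_acquire(ref_tracker_t *t, int slot) {
            if (slot >= 0) atomic_fetch_add(&t->slot[slot], 1);
            else           atomic_fetch_add(&t->unslotted, 1);
        }

        void ref_release(ref_tracker_t *t, int slot) {
            if (slot >= 0) atomic_fetch_sub(&t->slot[slot], 1);
            else           atomic_fetch_sub(&t->unslotted, 1);
        }

        /* Naive zero check: a real checking thread would first acquire the owner
         * field (and the per-entry locks) to obtain an instantaneous snapshot. */
        bool refs_zero(ref_tracker_t *t) {
            int sum = atomic_load(&t->unslotted);
            for (int i = 0; i < SLOTS; i++) sum += atomic_load(&t->slot[i]);
            return sum == 0;
        }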

52. System and Method for Implementing NUMA-Aware Reader-Writer Locks
    Patent Application (In Force)

    Publication Number: US20130290967A1

    Publication Date: 2013-10-31

    Application Number: US13458868

    Application Date: 2012-04-27

    CPC classification number: G06F9/526 G06F2209/523

    Abstract: NUMA-aware reader-writer locks may leverage lock cohorting techniques to band together writer requests from a single NUMA node. The locks may relax the order in which the lock schedules the execution of critical sections of code by reader threads and writer threads, allowing lock ownership to remain resident on a single NUMA node for long periods, while also taking advantage of parallelism between reader threads. Threads may contend on node-level structures to get permission to acquire a globally shared reader-writer lock. Writer threads may follow a lock cohorting strategy of passing ownership of the lock in write mode from one thread to a cohort writer thread without releasing the shared lock, while reader threads from multiple NUMA nodes may simultaneously acquire the shared lock in read mode. The reader-writer lock may follow a writer-preference policy, a reader-preference policy or a hybrid policy.

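    Illustrative sketch: a stripped-down C11 reader-writer lock with per-NUMA-node reader counters and a single global writer flag, assuming a fixed node count and invented names (numa_rwlock_t, read_lock, write_lock); it shows readers from different nodes proceeding on node-local state in parallel, but the cohort hand-off of write ownership between same-node writers and the configurable reader/writer preference policies are omitted.

        #include <stdatomic.h>
        #include <sched.h>

        #define NODES 4                            /* assumed NUMA node count */

        typedef struct {
            atomic_int readers;                    /* readers currently active on this node */
            char pad[64 - sizeof(atomic_int)];     /* keep node counters on separate cache lines */
        } node_state_t;

        typedef struct {
            node_state_t node[NODES];
            atomic_int   writer;                   /* 1 while some writer holds the lock */
        } numa_rwlock_t;

        void read_lock(numa_rwlock_t *l, int node) {
            for (;;) {
                while (atomic_load(&l->writer)) sched_yield();
                atomic_fetch_add(&l->node[node].readers, 1);
                if (!atomic_load(&l->writer)) return;        /* no writer raced in: read mode held */
                atomic_fetch_sub(&l->node[node].readers, 1); /* a writer arrived: back off and retry */
            }
        }

        void read_unlock(numa_rwlock_t *l, int node) {
            atomic_fetch_sub(&l->node[node].readers, 1);
        }

        void write_lock(numa_rwlock_t *l) {
            int zero = 0;
            while (!atomic_compare_exchange_weak(&l->writer, &zero, 1)) { zero = 0; sched_yield(); }
            for (int n = 0; n < NODES; n++)                  /* wait for every node's readers to drain */
                while (atomic_load(&l->node[n].readers)) sched_yield();
        }

        void write_unlock(numa_rwlock_t *l) {
            atomic_store(&l->writer, 0);
        }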

53. System and method for optimizing a code section by forcing a code section to be executed atomically
    Granted Patent (In Force)

    Publication Number: US08533699B2

    Publication Date: 2013-09-10

    Application Number: US13077793

    Application Date: 2011-03-31

    CPC classification number: G06F9/467 G06F8/443 G06F9/30087 G06F12/0261

    Abstract: Systems and methods for optimizing code may use transactional memory to optimize one code section by forcing another code section to execute atomically. Application source code may be analyzed to identify instructions in one code section that only need to be executed if there exists the possibility that another code section (e.g., a critical section) could be partially executed or that its results could be affected by interference. In response to identifying such instructions, alternate code may be generated that forces the critical section to be executed as an atomic transaction, e.g., using best-effort hardware transactional memory. This alternate code may replace the original code or may be included in an alternate execution path that can be conditionally selected for execution at runtime. The alternate code may elide the identified instructions (which are rendered unnecessary by the transaction) by removing them, or by including them in the alternate execution path.

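    Illustrative sketch: the alternate-execution-path idea expressed with x86 RTM intrinsics (_xbegin/_xend/_xabort from <immintrin.h>, compiled with -mrtm); the transactional path forces the critical section to run atomically, and the fallback path is the original locked code. The compiler analysis that identifies and elides interference-guard instructions is not shown, and the names (optimized_section, section_lock, counter) are invented for the example.

        #include <immintrin.h>
        #include <stdatomic.h>

        static atomic_int section_lock;       /* 1 while the fallback path holds the lock */
        static long counter;                  /* state updated by the critical section */

        void optimized_section(void) {
            unsigned status = _xbegin();      /* best-effort hardware transaction */
            if (status == _XBEGIN_STARTED) {
                if (atomic_load(&section_lock))
                    _xabort(0xff);            /* subscribe to the lock: a fallback holder aborts us */
                counter++;                    /* critical section body, executed atomically */
                _xend();
                return;
            }
            /* Fallback: the original code path, with its defensive locking intact. */
            int zero = 0;
            while (!atomic_compare_exchange_weak(&section_lock, &zero, 1)) zero = 0;
            counter++;
            atomic_store(&section_lock, 0);
        }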

54. System and Method for Mitigating the Impact of Branch Misprediction When Exiting Spin Loops
    Patent Application (In Force)

    Publication Number: US20130198499A1

    Publication Date: 2013-08-01

    Application Number: US13362903

    Application Date: 2012-01-31

    CPC classification number: G06F9/30058 G06F9/30079 G06F9/325 G06F9/3848

    Abstract: A computer system may recognize a busy-wait loop in program instructions at compile time and/or may recognize busy-wait looping behavior during execution of program instructions. The system may recognize that an exit condition for a busy-wait loop is specified by a conditional branch type instruction in the program instructions. In response to identifying the loop and the conditional branch type instruction that specifies its exit condition, the system may influence or override a prediction made by a dynamic branch predictor, resulting in a prediction that the exit condition will be met and that the loop will be exited regardless of any observed branch behavior for the conditional branch type instruction. The looping instructions may implement waiting for an inter-thread communication event to occur or for a lock to become available. When the exit condition is met, the loop may be exited without incurring a misprediction delay.

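    Illustrative sketch: the busy-wait loop shape the abstract describes, written in C for x86 with GCC/Clang builtins. The patent's mechanism overrides the dynamic branch predictor for the identified exit branch, which has no portable source-level equivalent, so __builtin_expect is used here only to show that branch being statically biased toward "loop will exit", and _mm_pause marks the spin body.

        #include <stdatomic.h>
        #include <immintrin.h>                 /* _mm_pause (x86) */

        /* Busy-wait for an inter-thread communication event: another thread
         * eventually stores 1 to *flag (e.g., a lock becoming available). */
        void spin_until_set(atomic_int *flag) {
            while (__builtin_expect(atomic_load_explicit(flag, memory_order_acquire) == 0, 0))
                _mm_pause();                   /* reduce pipeline and power cost while spinning */
            /* If the exit branch is predicted as taken, control leaves the loop
             * here without paying a branch-misprediction delay. */
        }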

55. System and method for implementing hierarchical queue-based locks using flat combining
    Granted Patent (In Force)

    Publication Number: US08458721B2

    Publication Date: 2013-06-04

    Application Number: US13152079

    Application Date: 2011-06-02

    CPC classification number: G06F9/526

    Abstract: The system and methods described herein may be used to implement a scalable, hierarchical, queue-based lock using flat combining. A thread executing on a processor core in a cluster of cores that share a memory may post a request to acquire a shared lock in a node of a publication list for the cluster using a non-atomic operation. A combiner thread may build an ordered (logical) local request queue that includes its own node and nodes of other threads (in the cluster) that include lock requests. The combiner thread may splice the local request queue into a (logical) global request queue for the shared lock as a sub-queue. A thread whose request has been posted in a node that has been combined into a local sub-queue and spliced into the global request queue may spin on a lock ownership indicator in its node until it is granted the shared lock.

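    Illustrative sketch: a single-cluster C11 reduction of the thread-side protocol, with invented names (pub_node_t, hlock_acquire, hlock_release) and a fixed-size publication array; it keeps two ingredients of the design: a thread posts its request in its own publication-list node and then spins only on the ownership indicator in that node, and the lock holder hands the lock off by scanning the publication list. The combiner's construction of an ordered local queue and its splicing into the global queue are omitted.

        #include <stdatomic.h>
        #include <stdbool.h>
        #include <sched.h>

        #define THREADS 8                      /* assumed cluster size */

        typedef struct {
            atomic_bool wants;                 /* request posted in the publication list */
            atomic_bool granted;               /* lock-ownership indicator this thread spins on */
            char pad[64 - 2 * sizeof(atomic_bool)];
        } pub_node_t;

        static pub_node_t pub[THREADS];        /* publication list for this cluster */
        static atomic_int lock_held;           /* global lock word */

        void hlock_acquire(int me) {
            atomic_store_explicit(&pub[me].wants, true, memory_order_release);
            for (;;) {
                if (atomic_exchange(&pub[me].granted, false))
                    return;                    /* a previous holder handed the lock to us */
                int zero = 0;
                if (atomic_compare_exchange_strong(&lock_held, &zero, 1)) {
                    atomic_store(&pub[me].wants, false);   /* lock was free: took it directly */
                    return;
                }
                sched_yield();                 /* spin locally on our own node */
            }
        }

        void hlock_release(int me) {
            /* Hand off to the next posted request without releasing the global
             * lock word; only release it if nobody is waiting. */
            for (int i = 1; i < THREADS; i++) {
                int next = (me + i) % THREADS;
                if (atomic_exchange(&pub[next].wants, false)) {
                    atomic_store_explicit(&pub[next].granted, true, memory_order_release);
                    return;
                }
            }
            atomic_store(&lock_held, 0);
        }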

56. Techniques for providing improved affinity scheduling in a multiprocessor computer system
    Granted Patent (In Force)

    Publication Number: US08407708B2

    Publication Date: 2013-03-26

    Application Number: US13211762

    Application Date: 2011-08-17

    Applicant: David Dice

    Inventor: David Dice

    CPC classification number: G06F9/5033

    Abstract: Techniques for controlling a thread on a computerized system having multiple processors involve accessing state information of a blocked thread, and maintaining the state information of the blocked thread at current values when the state information indicates that less than a predetermined amount of time has elapsed since the blocked thread ran on the computerized system. Such techniques further involve setting the state information of the blocked thread to identify affinity for a particular processor of the multiple processors when the state information indicates that at least the predetermined amount of time has elapsed since the blocked thread ran on the computerized system. Such operation enables the system to place a cold blocked thread that shares data with another thread on the same processor as that other thread so that, when the blocked thread awakens and runs, it is closer to the shared data.

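    Illustrative sketch: a Linux user-space approximation using sched_setaffinity, with an assumed 2 ms "still warm" threshold and invented helper names (thread_state, place_awakened_thread); if the blocked thread ran recently, its current placement (and warm caches) are left alone, otherwise it is bound to the CPU of the thread it shares data with so that it wakes up near the shared data.

        #define _GNU_SOURCE
        #include <sched.h>
        #include <sys/types.h>
        #include <time.h>
        #include <stdint.h>

        #define WARM_NS (2 * 1000 * 1000ull)    /* assumed threshold: off-CPU < 2 ms still counts as warm */

        struct thread_state {
            pid_t    tid;                       /* kernel thread id of the blocked thread */
            uint64_t last_ran_ns;               /* when it last ran, tracked elsewhere */
        };

        static uint64_t now_ns(void) {
            struct timespec ts;
            clock_gettime(CLOCK_MONOTONIC, &ts);
            return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
        }

        /* Called when `blocked` is about to be made runnable; partner_cpu is the
         * processor of the thread it shares data with. */
        void place_awakened_thread(const struct thread_state *blocked, int partner_cpu) {
            if (now_ns() - blocked->last_ran_ns < WARM_NS)
                return;                         /* ran recently: keep existing affinity */
            cpu_set_t set;
            CPU_ZERO(&set);
            CPU_SET(partner_cpu, &set);         /* cold thread: move it next to the shared data */
            sched_setaffinity(blocked->tid, sizeof(set), &set);
        }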

57. System and method for managing contention in transactional memory using global execution data
    Granted Patent (In Force)

    Publication Number: US08402464B2

    Publication Date: 2013-03-19

    Application Number: US12325870

    Application Date: 2008-12-01

    CPC classification number: G06F9/466

    Abstract: Transactional Lock Elision (TLE) may allow threads in a multi-threaded system to concurrently execute critical sections as speculative transactions. Such speculative transactions may abort due to contention among threads. Systems and methods for managing contention among threads may increase overall performance by considering both local and global execution data in reducing, resolving, and/or mitigating such contention. Global data may include aggregated and/or derived data representing thread-local data of remote thread(s), including transactional abort history, abort causal history, resource consumption history, performance history, synchronization history, and/or transactional delay history. Local and/or global data may be used in determining the mode by which critical sections are executed, including TLE and mutual exclusion, and/or to inform concurrency throttling mechanisms. Local and/or global data may also be used in determining concurrency throttling parameters (e.g., delay intervals) used in delaying a thread when attempting to execute a transaction and/or when retrying a previously aborted transaction.

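    Illustrative sketch: one way local and global statistics could feed such a policy, in C11, with invented names and thresholds (local_stats_t, global_stats_t, should_use_tle, retry_delay); the first function decides between transactional lock elision and mutual exclusion from the thread's own abort streak plus the aggregated abort and commit counts of all threads, and the second computes a concurrency-throttling delay that grows with both local and global contention.

        #include <stdatomic.h>
        #include <stdbool.h>

        typedef struct {
            int recent_aborts;                 /* this thread's abort streak on this critical section */
        } local_stats_t;

        typedef struct {
            atomic_int aborts;                 /* aggregated recent aborts across all threads */
            atomic_int commits;                /* aggregated recent commits across all threads */
        } global_stats_t;

        /* Mode decision: keep eliding the lock only while both our own history and
         * the global success rate look healthy; otherwise fall back to the lock. */
        bool should_use_tle(const local_stats_t *l, global_stats_t *g) {
            if (l->recent_aborts >= 4)
                return false;                  /* we keep losing: take the lock */
            return atomic_load(&g->commits) >= 4 * atomic_load(&g->aborts);
        }

        /* Concurrency throttling: delay (in spin iterations) before retrying an
         * aborted transaction, exponential in local aborts plus a global term. */
        int retry_delay(const local_stats_t *l, global_stats_t *g) {
            int shift = l->recent_aborts < 10 ? l->recent_aborts : 10;
            return (64 << shift) + 16 * atomic_load(&g->aborts);
        }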

58. Efficient implicit privatization of transactional memory
    Granted Patent (In Force)

    Publication Number: US08332374B2

    Publication Date: 2012-12-11

    Application Number: US12101316

    Application Date: 2008-04-11

    CPC classification number: G06F9/466 G06F9/526

    Abstract: Apparatus, methods, and program products are disclosed that provide a technology for implicitly isolating a portion of a transactional memory that is shared between multiple threads for exclusive use by an isolating thread, without the possibility of other transactions modifying the isolated portion of the transactional memory. In some of the described embodiments, read locations of a shared memory are covered by a first set of lock objects and write locations are covered by a second set of lock objects, each lock object in each set having a reader mode and a writer mode. Some of these embodiments acquire each lock object in the first set using the reader mode and acquire each lock object in the second set using the writer mode. These embodiments store result data values at write locations in the shared memory after acquiring the first and second sets of lock objects.

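    Illustrative sketch: a striped reader-writer-lock rendering of the commit step in C, with assumed layout choices (256 stripes, 16-byte granules) and invented names; stripes covering read locations are acquired in reader mode and stripes covering write locations in writer mode, in a single global order so concurrent committers cannot deadlock, and the buffered results are stored only while all covering lock objects are held.

        #include <pthread.h>
        #include <stdint.h>
        #include <stddef.h>

        #define STRIPES 256

        static pthread_rwlock_t stripe[STRIPES];          /* one reader/writer lock object per stripe */

        void privatization_init(void) {
            for (int s = 0; s < STRIPES; s++)
                pthread_rwlock_init(&stripe[s], NULL);
        }

        static size_t stripe_of(const void *addr) {
            return ((uintptr_t)addr >> 4) % STRIPES;      /* map a 16-byte granule to its lock */
        }

        /* Commit nw buffered writes after isolating both the read set and the write set. */
        void commit_isolated(const void *const *reads, size_t nr,
                             long *const *writes, const long *values, size_t nw) {
            enum { NONE, RD, WR } need[STRIPES] = { NONE };
            for (size_t i = 0; i < nr; i++) need[stripe_of(reads[i])]  = RD;
            for (size_t i = 0; i < nw; i++) need[stripe_of(writes[i])] = WR;   /* writer mode wins */

            for (int s = 0; s < STRIPES; s++) {           /* acquire in stripe order */
                if (need[s] == RD) pthread_rwlock_rdlock(&stripe[s]);
                if (need[s] == WR) pthread_rwlock_wrlock(&stripe[s]);
            }

            for (size_t i = 0; i < nw; i++)               /* store results while isolated */
                *writes[i] = values[i];

            for (int s = 0; s < STRIPES; s++)
                if (need[s] != NONE) pthread_rwlock_unlock(&stripe[s]);
        }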

59. Techniques for Providing Improved Affinity Scheduling in a Multiprocessor Computer System
    Patent Application (In Force)

    Publication Number: US20110302585A1

    Publication Date: 2011-12-08

    Application Number: US13211762

    Application Date: 2011-08-17

    Applicant: David Dice

    Inventor: David Dice

    CPC classification number: G06F9/5033

    Abstract: Techniques for controlling a thread on a computerized system having multiple processors involve accessing state information of a blocked thread, and maintaining the state information of the blocked thread at current values when the state information indicates that less than a predetermined amount of time has elapsed since the blocked thread ran on the computerized system. Such techniques further involve setting the state information of the blocked thread to identify affinity for a particular processor of the multiple processors when the state information indicates that at least the predetermined amount of time has elapsed since the blocked thread ran on the computerized system. Such operation enables the system to place a cold blocked thread that shares data with another thread on the same processor as that other thread so that, when the blocked thread awakens and runs, it is closer to the shared data.


60. System and Method for Managing Contention in Transactional Memory Using Global Execution Data
    Patent Application (In Force)

    Publication Number: US20100138841A1

    Publication Date: 2010-06-03

    Application Number: US12325870

    Application Date: 2008-12-01

    CPC classification number: G06F9/466

    Abstract: Transactional Lock Elision (TLE) may allow threads in a multi-threaded system to concurrently execute critical sections as speculative transactions. Such speculative transactions may abort due to contention among threads. Systems and methods for managing contention among threads may increase overall performance by considering both local and global execution data in reducing, resolving, and/or mitigating such contention. Global data may include aggregated and/or derived data representing thread-local data of remote thread(s), including transactional abort history, abort causal history, resource consumption history, performance history, synchronization history, and/or transactional delay history. Local and/or global data may be used in determining the mode by which critical sections are executed, including TLE and mutual exclusion, and/or to inform concurrency throttling mechanisms. Local and/or global data may also be used in determining concurrency throttling parameters (e.g., delay intervals) used in delaying a thread when attempting to execute a transaction and/or when retrying a previously aborted transaction.

