System and method for reducing transactional abort rates using compiler optimization techniques
    41.
    Granted patent (status: in force)

    Publication number: US09424013B2

    Publication date: 2016-08-23

    Application number: US12345189

    Filing date: 2008-12-29

    Applicant: David Dice

    Inventor: David Dice

    CPC classification number: G06F8/4441 G06F9/467

    Abstract: In transactional memory systems, transactional aborts due to conflicts between concurrent threads may cause system performance degradation. A compiler may attempt to minimize runtime abort rates by performing code transformations and/or other optimizations on a transactional memory program in an attempt to minimize store-commit intervals. The compiler may employ store deferral, hoisting of long-latency operations from within a transaction body and/or store-commit interval, speculative hoisting of long-latency operations, and/or redundant store squashing optimizations. The compiler may perform optimizing transformations on source code and/or on any intermediate representation thereof (e.g., parse trees, un-optimized assembly code, etc.). The compiler may preemptively avoid naïve target code constructions. The compiler may perform static and/or dynamic analysis of a program in order to determine which, if any, transformations should be applied and/or may dynamically recompile code sections at runtime, based on execution analysis.
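
    Illustrative sketch (not part of the patent record): the store deferral and redundant store squashing mentioned in the abstract can be pictured on a toy instruction list. The tuple-based IR, the function names `optimize` and `store_commit_interval`, and the restriction to constant store values are all assumptions made for this example. A real compiler would also have to respect data dependences and memory-ordering rules.

```python
def store_commit_interval(body):
    """Number of operations from the first store to the commit (end of body)."""
    for i, op in enumerate(body):
        if op[0] == "store":
            return len(body) - i
    return 0


def optimize(body):
    """Defer stores toward the commit point and squash overwritten stores.

    body is a list of toy ops inside one transaction:
      ("store", addr, constant), ("load", addr), ("compute", label)
    """
    pending = {}  # addr -> latest constant waiting to be stored
    out = []
    for op in body:
        if op[0] == "store":
            _, addr, value = op
            pending.pop(addr, None)   # squash the older store to this address
            pending[addr] = value
        elif op[0] == "load" and op[1] in pending:
            # The load depends on a pending store, so emit that store first.
            out.append(("store", op[1], pending.pop(op[1])))
            out.append(op)
        else:
            out.append(op)
    # All remaining stores are emitted last, right before the commit.
    out.extend(("store", addr, value) for addr, value in pending.items())
    return out


original = [
    ("store", "x", 1),
    ("compute", "slow"),
    ("store", "x", 2),
    ("load", "y"),
    ("store", "z", 3),
]
optimized = optimize(original)
```

    In this toy model the store-commit interval shrinks because the long-latency work now runs before any store has been issued.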

    Method and system for inter-thread communication using processor messaging
    42.
    Granted patent (status: in force)

    Publication number: US09021502B2

    Publication date: 2015-04-28

    Application number: US12345179

    Filing date: 2008-12-29

    CPC classification number: G06F9/466 G06F9/3009 G06F9/544

    Abstract: In shared-memory computer systems, threads may communicate with one another using shared memory. A receiving thread may poll a message target location repeatedly to detect the delivery of a message. Such polling may cause excessive cache coherency traffic and/or congestion on various system buses and/or other interconnects. A method for inter-processor communication may reduce such bus traffic by reducing the number of reads performed and/or the number of cache coherency messages necessary to pass messages. The method may include a thread reading the value of a message target location once, and determining that this value has been modified by detecting inter-processor messages, such as cache coherence messages, indicative of such modification. In systems that support transactional memory, a thread may use transactional memory primitives to detect the cache coherence messages. This may be done by starting a transaction, reading the target memory location, and spinning until the transaction is aborted.
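
    Illustrative sketch (not part of the patent record): Python has no access to hardware transactional memory, so the simulation below only models the control flow described in the abstract. The class `ToyMemory`, the function `receive`, and the use of `threading.Event` as a stand-in for a transaction abort are assumptions made for this example.

```python
import threading


class ToyMemory:
    """Toy shared memory: a write 'aborts' every transaction that read the address."""

    def __init__(self):
        self._data = {}
        self._read_sets = {}          # addr -> abort events of open transactions
        self._guard = threading.Lock()
        self.reads = 0                # counts reads, to contrast with polling

    def read(self, addr, abort_event=None):
        with self._guard:
            self.reads += 1
            if abort_event is not None:
                self._read_sets.setdefault(addr, []).append(abort_event)
            return self._data.get(addr)

    def write(self, addr, value):
        with self._guard:
            self._data[addr] = value
            for event in self._read_sets.pop(addr, []):
                event.set()           # models the coherence-triggered abort


def receive(mem, addr, timeout=5.0):
    """Read the target once, then wait for the abort instead of re-reading."""
    aborted = threading.Event()                 # models an active transaction
    value = mem.read(addr, abort_event=aborted) # single transactional read
    if value is not None:
        return value                            # message had already arrived
    aborted.wait(timeout)                       # stands in for spinning until abort
    return mem.read(addr)                       # one final read of the new value
```

    A receiver that polls would issue one read per loop iteration. Here the receiver issues at most two reads per message, whatever the wait time.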

    Method and system for reducing abort rates in speculative lock elision using contention management mechanisms
    43.
    Granted patent (status: in force)

    Publication number: US08914620B2

    Publication date: 2014-12-16

    Application number: US12345162

    Filing date: 2008-12-29

    Applicant: David Dice

    Inventor: David Dice

    CPC classification number: G06F9/3842 G06F9/526 G06F9/528

    Abstract: Hardware-based transactional memory mechanisms, such as Speculative Lock Elision (SLE), may allow multiple threads to concurrently execute critical sections protected by the same lock as speculative transactions. Such transactions may abort due to contention or due to misidentification of code as a critical section. In various embodiments, speculative execution mechanisms may be augmented with software and/or hardware contention management mechanisms to reduce abort rates. Speculative execution hardware may send a hardware interrupt signal to notify software components of a speculative execution event (e.g., abort). Software components may respond by implementing concurrency-throttling mechanisms and/or by determining a mode of execution (e.g., speculative, non-speculative) for a given section and communicating that determination to the hardware speculative execution mechanisms, e.g., by writing it into a lock predictor cache. Subsequently, hardware speculative execution mechanisms may determine a preferred mode of execution for the section by reading the corresponding entry from the lock predictor cache.
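
    Illustrative sketch (not part of the patent record): the interaction between a software abort handler and a lock predictor cache might look roughly like the following. The class name `LockPredictorCache`, the abort threshold, and the policy of switching to non-speculative mode after repeated aborts are assumptions made for this example. The abstract does not specify a policy.

```python
SPECULATIVE = "speculative"
NON_SPECULATIVE = "non-speculative"


class LockPredictorCache:
    """Toy predictor: software writes a mode per section, hardware reads it."""

    def __init__(self, abort_threshold=3):
        self.abort_threshold = abort_threshold
        self._mode = {}     # section id -> preferred execution mode
        self._aborts = {}   # section id -> consecutive abort count

    def preferred_mode(self, section):
        """Hardware side: look up the mode before executing the section."""
        return self._mode.get(section, SPECULATIVE)

    def on_abort(self, section):
        """Software side: handler invoked when the abort interrupt arrives."""
        count = self._aborts.get(section, 0) + 1
        self._aborts[section] = count
        if count >= self.abort_threshold:
            self._mode[section] = NON_SPECULATIVE

    def on_commit(self, section):
        """A successful speculative run resets the consecutive abort count."""
        self._aborts[section] = 0
```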

    Method and system for hardware feedback in transactional memory
    44.
    Granted patent (status: in force)

    Publication number: US08776063B2

    Publication date: 2014-07-08

    Application number: US12324109

    Filing date: 2008-11-26

    Abstract: Multi-threaded, transactional memory systems may allow concurrent execution of critical sections as speculative transactions. These transactions may abort due to contention among threads. Hardware feedback mechanisms may detect information about aborts and provide that information to software, hardware, or hybrid software/hardware contention management mechanisms. For example, they may detect occurrences of transactional aborts or conditions that may result in transactional aborts, and may update local readable registers or other storage entities (e.g., performance counters) with relevant contention information. This information may include identifying data (e.g., information outlining abort relationships between the processor and other specific physical or logical processors) and/or tallied data (e.g., values of event counters reflecting the number of aborted attempts by the current thread or the resources consumed by those attempts). This contention information may be accessible by contention management mechanisms to inform contention management decisions (e.g. whether to revert transactions to mutual exclusion, delay retries, etc.).
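
    Illustrative sketch (not part of the patent record): a contention manager that consumes tallied abort data could be as simple as the decision function below. The counter fields, the thresholds, and the exponential backoff rule are hypothetical choices for this example and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class AbortCounters:
    """Stand-in for hardware-maintained event counters of the current thread."""
    aborted_attempts: int = 0
    cycles_wasted: int = 0


def decide(counters, max_attempts=4, max_cycles=10_000, base_delay=16):
    """Return (action, delay) for the next attempt of a transaction."""
    if (counters.aborted_attempts >= max_attempts
            or counters.cycles_wasted >= max_cycles):
        return ("fallback_to_lock", 0)      # revert to mutual exclusion
    if counters.aborted_attempts == 0:
        return ("retry", 0)                 # nothing has failed yet
    # Delay doubles with each abort: 16, 32, 64, ... (arbitrary time units).
    return ("retry", base_delay << (counters.aborted_attempts - 1))
```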

    Lock-clustering compilation for software transactional memory
    45.
    Granted patent (status: in force)

    Publication number: US08677331B2

    Publication date: 2014-03-18

    Application number: US13250369

    Filing date: 2011-09-30

    CPC classification number: G06F9/467 G06F8/443 G06F8/457 G06F9/526

    Abstract: A lock-clustering compiler is configured to compile program code for a software transactional memory system. The compiler determines that a group of data structures are accessed together within one or more atomic memory transactions defined in the program code. In response to determining that the group is accessed together, the compiler creates an executable version of the program code that includes clustering code, which is executable to associate the data structures of the group with the same software transactional memory lock. The lock is usable by the software transactional memory system to coordinate concurrent transactional access to the group of data structures by multiple concurrent threads.
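
    Illustrative sketch (not part of the patent record): one way to picture the grouping step is a union-find pass over the sets of data structures that each atomic transaction touches. Using union-find, the function name `cluster_locks`, and the `lock_<name>` labels are assumptions made for this example. The abstract does not say how the compiler forms its groups.

```python
def cluster_locks(transactions):
    """Map each data structure to a lock shared by everything accessed with it.

    transactions is a list of lists, one list of structure names per
    atomic transaction found in the program.
    """
    parent = {}

    def find(name):
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]   # path halving
            name = parent[name]
        return name

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for accessed in transactions:
        if not accessed:
            continue
        first = accessed[0]
        find(first)                    # register structures accessed alone
        for other in accessed[1:]:
            union(first, other)

    return {name: "lock_" + find(name) for name in list(parent)}


locks = cluster_locks([
    ["account_a", "account_b"],
    ["account_b", "ledger"],
    ["cache"],
])
```

    In this example `account_a`, `account_b`, and `ledger` end up behind one lock, while `cache` keeps its own.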

    System and Method for NUMA-Aware Locking Using Lock Cohorts
    46.
    Patent application (status: in force)

    Publication number: US20130290583A1

    Publication date: 2013-10-31

    Application number: US13458871

    Filing date: 2012-04-27

    CPC classification number: G06F9/526

    Abstract: The system and methods described herein may be used to implement NUMA-aware locks that employ lock cohorting. These lock cohorting techniques may reduce the rate of lock migration by relaxing the order in which the lock schedules the execution of critical code sections by various threads, allowing lock ownership to remain resident on a single NUMA node longer than under strict FIFO ordering, thus reducing coherence traffic and improving aggregate performance. A NUMA-aware cohort lock may include a global shared lock that is thread-oblivious, and multiple node-level locks that provide cohort detection. The lock may be constructed from non-NUMA-aware components (e.g., spin-locks or queue locks) that are modified to provide thread-obliviousness and/or cohort detection. Lock ownership may be passed from one thread that holds the lock to another thread executing on the same NUMA node without releasing the global shared lock.
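
    Illustrative sketch (not part of the patent record): the structure described in the abstract can be imitated with ordinary Python locks. The class `CohortLock`, the explicit `node_id` argument, the waiter counter used for cohort detection, and the `max_handoffs` fairness bound are assumptions made for this example. Python threads are not pinned to NUMA nodes, so this shows only the hand-off logic, not any performance effect.

```python
import threading


class _Node:
    def __init__(self):
        self.lock = threading.Lock()    # node-level lock
        self.guard = threading.Lock()   # protects the waiter count
        self.waiters = 0                # cohort detection: threads queued locally
        self.owns_global = False        # read and written only under self.lock
        self.handoffs = 0


class CohortLock:
    def __init__(self, nodes, max_handoffs=64):
        # threading.Lock may be released by a thread other than the one that
        # acquired it, which is the thread-oblivious property needed here.
        self._global = threading.Lock()
        self._nodes = [_Node() for _ in range(nodes)]
        self._max_handoffs = max_handoffs

    def acquire(self, node_id):
        node = self._nodes[node_id]
        with node.guard:
            node.waiters += 1
        node.lock.acquire()
        with node.guard:
            node.waiters -= 1
        if not node.owns_global:        # no cohort member passed us the global lock
            self._global.acquire()
            node.owns_global = True
            node.handoffs = 0

    def release(self, node_id):
        node = self._nodes[node_id]
        with node.guard:
            has_cohort = node.waiters > 0
        if has_cohort and node.handoffs < self._max_handoffs:
            node.handoffs += 1          # keep the global lock on this node
        else:
            node.owns_global = False
            self._global.release()
        node.lock.release()
```

    The hand-off bound keeps one node from holding the global lock indefinitely. Without it, threads on other nodes could starve.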

    System and method for performing incremental register checkpointing in transactional memory
    47.
    Granted patent (status: in force)

    Publication number: US08560816B2

    Publication date: 2013-10-15

    Application number: US12827842

    Filing date: 2010-06-30

    CPC classification number: G06F9/3863 G06F9/3834 G06F9/3859

    Abstract: Systems and methods described herein for performing incremental register checkpointing may employ a special register to indicate which registers have already been checkpointed. This register may include one bit per register. These systems may also include a special pointer register whose value identifies a location in user memory or in dedicated on-chip storage at which a copy of a register's value should be saved by a checkpointing operation. Only registers modified during speculative execution or execution of a transaction may be checkpointed (e.g., when register modifying instructions are encountered) and subsequently restored (e.g., due to misspeculation or transaction abort), rather than all of the registers of the processor. Each register may be checkpointed at most once for a given speculative episode or atomic transaction. Setting a bit in the special register may prevent checkpointing of the corresponding register. Setting all of the bits in the special register may disable checkpointing.
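
    Illustrative sketch (not part of the patent record): the bitmask-driven checkpointing described in the abstract can be modeled in software. The class `CheckpointingRegisterFile`, its method names, and the list used as the checkpoint area are assumptions made for this example.

```python
class CheckpointingRegisterFile:
    """Toy register file that checkpoints a register on its first write."""

    def __init__(self, count=8):
        self.regs = [0] * count
        self.checkpointed_mask = 0   # one bit per register
        self.saved = []              # checkpoint area: (index, old value)
        self.in_transaction = False

    def begin(self):
        self.checkpointed_mask = 0
        self.saved.clear()
        self.in_transaction = True

    def write(self, index, value):
        bit = 1 << index
        if self.in_transaction and not (self.checkpointed_mask & bit):
            # First modification in this transaction: save the old value once.
            self.saved.append((index, self.regs[index]))
            self.checkpointed_mask |= bit
        self.regs[index] = value

    def abort(self):
        for index, old in self.saved:   # restore only the modified registers
            self.regs[index] = old
        self.saved.clear()
        self.in_transaction = False

    def commit(self):
        self.saved.clear()
        self.in_transaction = False
```

    Presetting every bit of `checkpointed_mask` after `begin()` makes `write` skip all saves, which mirrors the abstract's remark that setting all bits may disable checkpointing.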

    System and method for utilizing available best effort hardware mechanisms for supporting transactional memory
    48.
    Granted patent (status: in force)

    Publication number: US08533663B2

    Publication date: 2013-09-10

    Application number: US12250409

    Filing date: 2008-10-13

    CPC classification number: G06F9/466

    Abstract: Systems and methods for managing divergence of best effort transactional support mechanisms in various transactional memory implementations using a portable transaction interface are described. This interface may be implemented by various combinations of best effort hardware features, including none at all. Because the features offered by this interface may be best effort, a default (e.g., software) implementation may always be possible without the need for special hardware support. Software may be written to the interface, and may be executable on a variety of platforms, taking advantage of best effort hardware features included on each one, while not depending on any particular mechanism. Multiple implementations of each operation defined by the interface may be included in one or more portable transaction interface libraries. Systems and/or application software may be written as platform-independent and/or portable, and may call functions of these libraries to implement the operations for a targeted execution environment.
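
    Illustrative sketch (not part of the patent record): a portable interface with an optional best-effort path and a default software path might be shaped like this. The class `PortableTx`, the `try_hardware` callback convention, and the attempt limit are hypothetical. A real hybrid system would also need the hardware path to detect that the fallback lock is held, which this sketch leaves out.

```python
import threading


class PortableTx:
    """Run a function atomically, using a best-effort path when one exists."""

    def __init__(self, try_hardware=None, max_hw_attempts=3):
        # try_hardware(fn) -> (succeeded, result); None means the platform
        # offers no best-effort hardware support at all.
        self._try_hardware = try_hardware
        self._max_hw_attempts = max_hw_attempts
        self._fallback_lock = threading.Lock()

    def run(self, fn):
        if self._try_hardware is not None:
            for _ in range(self._max_hw_attempts):
                succeeded, result = self._try_hardware(fn)
                if succeeded:
                    return result
        # Default software implementation: always available.
        with self._fallback_lock:
            return fn()
```

    Callers write to `run` only, so the same application code works whether or not a best-effort mechanism is plugged in.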

    Lock-Clustering Compilation for Software Transactional Memory
    49.
    Patent application (status: in force)

    Publication number: US20130086348A1

    Publication date: 2013-04-04

    Application number: US13250369

    Filing date: 2011-09-30

    CPC classification number: G06F9/467 G06F8/443 G06F8/457 G06F9/526

    Abstract: A lock-clustering compiler is configured to compile program code for a software transactional memory system. The compiler determines that a group of data structures are accessed together within one or more atomic memory transactions defined in the program code. In response to determining that the group is accessed together, the compiler creates an executable version of the program code that includes clustering code, which is executable to associate the data structures of the group with the same software transactional memory lock. The lock is usable by the software transactional memory system to coordinate concurrent transactional access to the group of data structures by multiple concurrent threads.

    Multi-Lane Concurrent Bag for Facilitating Inter-Thread Communication
    50.
    Patent application (status: in force)

    Publication number: US20130081061A1

    Publication date: 2013-03-28

    Application number: US13241015

    Filing date: 2011-09-22

    Abstract: A method, system, and medium are disclosed for facilitating communication between multiple concurrent threads of execution using a multi-lane concurrent bag. The bag comprises a plurality of independently-accessible concurrent intermediaries (lanes) that are each configured to store data elements. The bag provides an insert function executable to insert a given data element into the bag by selecting one of the intermediaries and inserting the data element into the selected intermediary. The bag also provides a consume function executable to consume a data element from the bag by choosing one of the intermediaries and consuming (removing and returning) a data element stored in the chosen intermediary. The bag guarantees that execution of the consume function consumes a data element if the bag is non-empty and permits multiple threads to execute the insert or consume functions concurrently.
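
    Illustrative sketch (not part of the patent record): a bag built from several independently accessible lanes could be approximated as follows. The class `MultiLaneBag`, the choice of `collections.deque` for each lane, lane selection by thread identifier, and the use of `None` to signal an empty bag are assumptions made for this example. A single scan over the lanes gives only a weaker guarantee than the abstract states if inserts race with the scan.

```python
import collections
import threading


class MultiLaneBag:
    """Unordered container split into lanes to spread out contention."""

    def __init__(self, lanes=4):
        # deque.append and deque.popleft are thread-safe in CPython.
        self._lanes = [collections.deque() for _ in range(lanes)]

    def _home_lane(self):
        return threading.get_ident() % len(self._lanes)

    def insert(self, item):
        """Insert into the lane selected for the calling thread."""
        self._lanes[self._home_lane()].append(item)

    def consume(self):
        """Remove and return some item, or None if every lane looked empty."""
        start = self._home_lane()
        count = len(self._lanes)
        for offset in range(count):
            lane = self._lanes[(start + offset) % count]
            try:
                return lane.popleft()
            except IndexError:
                continue                # this lane is empty, try the next one
        return None
```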
