Methods and apparatus for executing code while avoiding interference
    61.
    Granted invention patent
    Methods and apparatus for executing code while avoiding interference (legal status: in force)

    Publication No.: US06799236B1

    Publication date: 2004-09-28

    Application No.: US10044214

    Filing date: 2001-11-20

    IPC classification: G06F12/14

    Abstract: Mechanisms and techniques operate in a computerized device to execute critical code without interference from interruptions. Critical code is registered for invocation of a critical execution manager in the event of an interruption to the critical code. The critical code is then executed until an interruption to the critical code occurs. After handling the interruption, a critical execution manager is invoked and the critical execution manager detects if an interference signal indicates a reset value. If the interference signal indicates the reset value, the critical execution manager performs a reset operation on the critical code to reset a current state of the critical code to allow execution of the critical code while avoiding interference from handling the interruption and returns to execution of the critical code using the current state of the critical code.

    Relaxed lock protocol
    62.
    Granted invention patent
    Relaxed lock protocol (legal status: lapsed)

    Publication No.: US06735760B1

    Publication date: 2004-05-11

    Application No.: US09708576

    Filing date: 2000-11-08

    Applicant: David Dice

    Inventor: David Dice

    IPC classification: G06F9/45

    CPC classification: G06F9/52

    Abstract: An object-oriented compiler/interpreter allocates monitor records for use in implementing synchronized operations on objects. When a synchronization operation is to be performed on an object, a thread that is to perform the operation “inflates” the object's monitor by placing into its header a pointer to the monitor record as well as an indication of the monitor's inflated status. When a thread is to release its lock on an object, it first consults a reference-count field in the monitor record to determine whether any other threads are synchronized on the object. It then dissociates the object from the monitor record. The dissociation is not atomic with the reference-count check, so the releasing thread checks the reference count again. If that count indicates that further objects had employed the monitor record to synchronize on the object in the interim, then the unlocking thread wakes all waiting threads.

System and method for synchronizing access to shared variables in a virtual machine in a digital computer system
    63.
    Granted invention patent
    System and method for synchronizing access to shared variables in a virtual machine in a digital computer system (legal status: in force)

    Publication No.: US6141794A

    Publication date: 2000-10-31

    Application No.: US174278

    Filing date: 1998-10-16

    IPC classification: G06F9/44 G06F9/45 G06F9/46

    CPC classification: G06F9/526 G06F8/447 G06F8/45

    Abstract: A code generating system generates, from code in a program, native code that is executable by a computer system. The code generating system may be included in a just-in-time compiler used to generate native code that is executable by a computer system, from a program in Java Byte Code form, and specifically generates, in response to Java Byte Code representative of a synchronization statement that synchronizes access by multiple threads of execution to at least one variable contained in the Java Byte Code, one or more native code instructions that implement a wait-free synchronization methodology to synchronize access to the at least one variable. Since the instructions which implement the wait-free synchronization methodology do not require calls to the operating system, they can generally be processed more rapidly than other synchronization techniques which do require operating system calls.

System and method for processing load instruction in accordance with "no-fault" processing facility including arrangement for preserving access fault indicia
    64.
    Granted invention patent
    System and method for processing load instruction in accordance with "no-fault" processing facility including arrangement for preserving access fault indicia (legal status: lapsed)

    Publication No.: US5903739A

    Publication date: 1999-05-11

    Application No.: US013788

    Filing date: 1998-01-26

    Applicant: David Dice

    Inventor: David Dice

    IPC classification: G06F9/312 G06F9/38 G06F9/30

    Abstract: A microprocessor in a computer system processes an instruction stream comprising instructions of a plurality of instruction types including an information retrieval instruction type. The microprocessor comprises a register set, a pending fault flag set, a functional unit, an information retrieval subsystem, and a control subsystem. The register set comprises a plurality of registers, each register for storing information. The pending fault flag set comprises a plurality of pending fault flags each associated with one of said registers, each pending fault flag having selected conditions including a pending fault condition and a no pending fault condition. The functional unit performs processing operations in response to information input thereto. The information retrieval subsystem initiates an information retrieval operation to retrieve information from said information storage subsystem for storage in a register. The control subsystem controls the other elements of the microprocessor in response to the instructions in the instruction stream. In response to an instruction in the instruction stream of the information retrieval type, the control subsystem enables the information retrieval subsystem to initiate an information retrieval operation, and conditions the pending fault flag associated with said one of said registers to the pending fault condition in response to detection of a fault condition during the information retrieval operation. In response to an instruction in the instruction stream of another type, the control subsystem identifies a selected one of said registers as a source register, and enables information to be transferred from said source register to the functional unit for processing if the pending fault flag associated with said source register is in the no pending fault condition.

    System and method for reducing transactional abort rates using compiler optimization techniques
    65.
    Granted invention patent
    System and method for reducing transactional abort rates using compiler optimization techniques (legal status: in force)

    Publication No.: US09424013B2

    Publication date: 2016-08-23

    Application No.: US12345189

    Filing date: 2008-12-29

    Applicant: David Dice

    Inventor: David Dice

    IPC classification: G06F9/45 G06F9/46

    CPC classification: G06F8/4441 G06F9/467

    Abstract: In transactional memory systems, transactional aborts due to conflicts between concurrent threads may cause system performance degradation. A compiler may attempt to minimize runtime abort rates by performing code transformations and/or other optimizations on a transactional memory program in an attempt to minimize store-commit intervals. The compiler may employ store deferral, hoisting of long-latency operations from within a transaction body and/or store-commit interval, speculative hoisting of long-latency operations, and/or redundant store squashing optimizations. The compiler may perform optimizing transformations on source code and/or on any intermediate representation thereof (e.g., parse trees, un-optimized assembly code, etc.). The compiler may preemptively avoid naïve target code constructions. The compiler may perform static and/or dynamic analysis of a program in order to determine which, if any, transformations should be applied and/or may dynamically recompile code sections at runtime, based on execution analysis.
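
The abstract names redundant store squashing without spelling it out. One plausible reading, sketched here as an illustration and not as the patent's actual algorithm, is dead-store elimination over a straight-line transaction body: a store is dropped when a later store overwrites the same address with no load of that address in between. The operation encoding and the function name `squash_redundant_stores` are assumptions made for this example, and the sketch assumes addresses do not alias.

```python
def squash_redundant_stores(ops):
    """Drop stores that are overwritten before being read.

    ops is a list of ("store", addr, value) or ("load", addr) tuples that
    models a straight-line transaction body (hypothetical encoding).
    """
    kept_reversed = []
    overwritten = set()  # addresses with a later store and no load in between
    for op in reversed(ops):
        if op[0] == "store":
            addr = op[1]
            if addr in overwritten:
                continue  # a later store wins and nothing reads this value
            overwritten.add(addr)
        elif op[0] == "load":
            # An earlier store to this address is observed by this load.
            overwritten.discard(op[1])
        kept_reversed.append(op)
    return kept_reversed[::-1]


body = [
    ("store", "x", 1),
    ("store", "y", 2),
    ("store", "x", 3),
    ("load", "y"),
    ("store", "y", 4),
]
squashed = squash_redundant_stores(body)
```

Here the first store to `x` is removed, while the first store to `y` survives because the load of `y` observes it. Fewer stores inside the transaction means fewer written locations that a concurrent thread can conflict with.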

    Method and system for inter-thread communication using processor messaging
    66.
    Granted invention patent
    Method and system for inter-thread communication using processor messaging (legal status: in force)

    Publication No.: US09021502B2

    Publication date: 2015-04-28

    Application No.: US12345179

    Filing date: 2008-12-29

    Abstract: In shared-memory computer systems, threads may communicate with one another using shared memory. A receiving thread may poll a message target location repeatedly to detect the delivery of a message. Such polling may cause excessive cache coherency traffic and/or congestion on various system buses and/or other interconnects. A method for inter-processor communication may reduce such bus traffic by reducing the number of reads performed and/or the number of cache coherency messages necessary to pass messages. The method may include a thread reading the value of a message target location once, and determining that this value has been modified by detecting inter-processor messages, such as cache coherence messages, indicative of such modification. In systems that support transactional memory, a thread may use transactional memory primitives to detect the cache coherence messages. This may be done by starting a transaction, reading the target memory location, and spinning until the transaction is aborted.

    Method and system for reducing abort rates in speculative lock elision using contention management mechanisms
    67.
    Granted invention patent
    Method and system for reducing abort rates in speculative lock elision using contention management mechanisms (legal status: in force)

    Publication No.: US08914620B2

    Publication date: 2014-12-16

    Application No.: US12345162

    Filing date: 2008-12-29

    Applicant: David Dice

    Inventor: David Dice

    IPC classification: G06F9/38 G06F9/52

    Abstract: Hardware-based transactional memory mechanisms, such as Speculative Lock Elision (SLE), may allow multiple threads to concurrently execute critical sections protected by the same lock as speculative transactions. Such transactions may abort due to contention or due to misidentification of code as a critical section. In various embodiments, speculative execution mechanisms may be augmented with software and/or hardware contention management mechanisms to reduce abort rates. Speculative execution hardware may send a hardware interrupt signal to notify software components of a speculative execution event (e.g., abort). Software components may respond by implementing concurrency-throttling mechanisms and/or by determining a mode of execution (e.g., speculative, non-speculative) for a given section and communicating that determination to the hardware speculative execution mechanisms, e.g., by writing it into a lock predictor cache. Subsequently, hardware speculative execution mechanisms may determine a preferred mode of execution for the section by reading the corresponding entry from the lock predictor cache.

    Method and system for hardware feedback in transactional memory
    68.
    Granted invention patent
    Method and system for hardware feedback in transactional memory (legal status: in force)

    Publication No.: US08776063B2

    Publication date: 2014-07-08

    Application No.: US12324109

    Filing date: 2008-11-26

    Abstract: Multi-threaded, transactional memory systems may allow concurrent execution of critical sections as speculative transactions. These transactions may abort due to contention among threads. Hardware feedback mechanisms may detect information about aborts and provide that information to software, hardware, or hybrid software/hardware contention management mechanisms. For example, they may detect occurrences of transactional aborts or conditions that may result in transactional aborts, and may update local readable registers or other storage entities (e.g., performance counters) with relevant contention information. This information may include identifying data (e.g., information outlining abort relationships between the processor and other specific physical or logical processors) and/or tallied data (e.g., values of event counters reflecting the number of aborted attempts by the current thread or the resources consumed by those attempts). This contention information may be accessible by contention management mechanisms to inform contention management decisions (e.g. whether to revert transactions to mutual exclusion, delay retries, etc.).
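
To make the last sentence of the abstract concrete, the following is a minimal sketch of a software contention manager that consumes a tallied abort counter and chooses between speculating, retrying after a delay, and reverting to mutual exclusion. The `AbortFeedback` type, the `next_action` function, the thresholds, and the exponential backoff schedule are all hypothetical choices for illustration. The abstract does not prescribe a policy.

```python
from dataclasses import dataclass


@dataclass
class AbortFeedback:
    # Hypothetical stand-in for a hardware event counter: number of
    # aborted attempts by the current thread for this transaction.
    aborted_attempts: int


def next_action(feedback, max_retries=4, base_delay=16):
    """Return (action, delay) for the next attempt of a transaction.

    max_retries and base_delay are illustrative tuning knobs, not values
    taken from the source.
    """
    if feedback.aborted_attempts == 0:
        return ("speculate", 0)
    if feedback.aborted_attempts >= max_retries:
        # Too much contention: fall back to a conventional lock.
        return ("mutual_exclusion", 0)
    # Exponential backoff: the delay doubles with each recorded abort.
    delay = base_delay * 2 ** (feedback.aborted_attempts - 1)
    return ("retry_after_delay", delay)
```

With these defaults, a thread that has aborted twice waits 32 delay units before retrying, and a thread that has aborted four times stops speculating.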

    Lock-clustering compilation for software transactional memory
    69.
    Granted invention patent
    Lock-clustering compilation for software transactional memory (legal status: in force)

    Publication No.: US08677331B2

    Publication date: 2014-03-18

    Application No.: US13250369

    Filing date: 2011-09-30

    IPC classification: G06F9/45 G06F9/44

    Abstract: A lock-clustering compiler is configured to compile program code for a software transactional memory system. The compiler determines that a group of data structures are accessed together within one or more atomic memory transactions defined in the program code. In response to determining that the group is accessed together, the compiler creates an executable version of the program code that includes clustering code, which is executable to associate the data structures of the group with the same software transactional memory lock. The lock is usable by the software transactional memory system to coordinate concurrent transactional access to the group of data structures by multiple concurrent threads.
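
The clustering step can be sketched as follows, assuming the compiler's analysis has already produced, for each atomic transaction, the set of data structures it touches. The function name `cluster_locks` and the use of union-find to merge overlapping groups transitively are assumptions made for this example. The abstract only states that structures accessed together share one lock.

```python
import threading


def cluster_locks(transactions):
    """Map each data-structure name to a lock shared by its cluster.

    transactions is an iterable of sets of names accessed together in one
    atomic transaction (hypothetical output of a compiler analysis).
    """
    parent = {}

    def find(name):
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for accessed in transactions:
        names = list(accessed)
        for name in names:
            find(name)  # register every structure, even singletons
        for other in names[1:]:
            root_a, root_b = find(names[0]), find(other)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two clusters

    locks_by_root = {}
    lock_for = {}
    for name in list(parent):
        root = find(name)
        if root not in locks_by_root:
            locks_by_root[root] = threading.Lock()
        lock_for[name] = locks_by_root[root]
    return lock_for


lock_for = cluster_locks([{"a", "b"}, {"b", "c"}, {"d"}])
```

In this run `a`, `b`, and `c` end up behind one lock while `d` keeps its own, so a transaction touching `a` and `c` acquires a single lock instead of two.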

    System and Method for NUMA-Aware Locking Using Lock Cohorts
    70.
    Invention patent application
    System and Method for NUMA-Aware Locking Using Lock Cohorts (legal status: in force)

    Publication No.: US20130290583A1

    Publication date: 2013-10-31

    Application No.: US13458871

    Filing date: 2012-04-27

    IPC classification: G06F13/14

    CPC classification: G06F9/526

    Abstract: The system and methods described herein may be used to implement NUMA-aware locks that employ lock cohorting. These lock cohorting techniques may reduce the rate of lock migration by relaxing the order in which the lock schedules the execution of critical code sections by various threads, allowing lock ownership to remain resident on a single NUMA node longer than under strict FIFO ordering, thus reducing coherence traffic and improving aggregate performance. A NUMA-aware cohort lock may include a global shared lock that is thread-oblivious, and multiple node-level locks that provide cohort detection. The lock may be constructed from non-NUMA-aware components (e.g., spin-locks or queue locks) that are modified to provide thread-obliviousness and/or cohort detection. Lock ownership may be passed from one thread that holds the lock to another thread executing on the same NUMA node without releasing the global shared lock.
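
The structure described in the abstract can be sketched in a few lines, with the caveat that this is a toy model and not the patented implementation. The class names, the waiter counter used for cohort detection, and the `max_handoffs` bound are assumptions. Python's `threading.Lock` serves as the thread-oblivious global lock because it may be released by a thread other than the one that acquired it. The caller passes its NUMA node index explicitly, since Python has no portable way to query it.

```python
import threading


class _NodeLock:
    def __init__(self):
        self.lock = threading.Lock()   # node-level lock
        self.meta = threading.Lock()   # guards waiters
        self.waiters = 0               # cohort detection: threads queued on this node
        self.owns_global = False       # guarded by self.lock
        self.handoffs = 0              # guarded by self.lock


class CohortLock:
    def __init__(self, num_nodes, max_handoffs=64):
        self._global = threading.Lock()  # thread-oblivious global lock
        self._nodes = [_NodeLock() for _ in range(num_nodes)]
        self._max_handoffs = max_handoffs  # bounds unfairness to other nodes

    def acquire(self, node):
        n = self._nodes[node]
        with n.meta:
            n.waiters += 1
        n.lock.acquire()
        with n.meta:
            n.waiters -= 1
        if not n.owns_global:
            # No cohort member handed us the global lock, so compete for it.
            self._global.acquire()
            n.owns_global = True
            n.handoffs = 0

    def release(self, node):
        n = self._nodes[node]
        with n.meta:
            has_cohort = n.waiters > 0
        if has_cohort and n.handoffs < self._max_handoffs:
            n.handoffs += 1  # keep the global lock for the next local thread
        else:
            n.owns_global = False
            self._global.release()
        n.lock.release()


cohort_lock = CohortLock(num_nodes=2)
counter = [0]


def worker(node):
    for _ in range(2000):
        cohort_lock.acquire(node)
        counter[0] += 1
        cohort_lock.release(node)


threads = [threading.Thread(target=worker, args=(i % 2,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

When a releasing thread sees a waiter on its own node, it releases only the node-level lock, so ownership stays on that node without touching the global lock. The handoff bound lets threads on other nodes eventually make progress.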
