System and method for mitigating the impact of branch misprediction when exiting spin loops
    101.
    Invention grant (in force)

    Publication No.: US09304776B2

    Publication date: 2016-04-05

    Application No.: US13362903

    Filing date: 2012-01-31

    IPC classification: G06F9/30 G06F9/38 G06F9/32

    Abstract: A computer system may recognize a busy-wait loop in program instructions at compile time and/or may recognize busy-wait looping behavior during execution of program instructions. The system may recognize that an exit condition for a busy-wait loop is specified by a conditional branch type instruction in the program instructions. In response to identifying the loop and the conditional branch type instruction that specifies its exit condition, the system may influence or override a prediction made by a dynamic branch predictor, resulting in a prediction that the exit condition will be met and that the loop will be exited regardless of any observed branch behavior for the conditional branch type instruction. The looping instructions may implement waiting for an inter-thread communication event to occur or for a lock to become available. When the exit condition is met, the loop may be exited without incurring a misprediction delay.
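
    The effect described in this abstract can be illustrated with a toy simulation. The sketch below is not the patented mechanism: it assumes a hypothetical 2-bit saturating counter as the dynamic predictor and a hypothetical `override_exit` flag that forces the "exit" prediction for the loop's conditional branch. It reports whether the final, loop-exiting branch was predicted correctly.

```python
def exit_predicted_correctly(spin_iterations, override_exit):
    """Toy model: was the loop-exit branch predicted correctly?

    The loop's conditional branch is "taken" while the thread keeps
    spinning and "not taken" on the single iteration that exits.
    """
    counter = 0  # hypothetical 2-bit saturating counter; >= 2 predicts "taken"
    outcomes = [True] * spin_iterations + [False]  # spin N times, then exit
    correct = False
    for taken in outcomes:
        # With the override, always predict that the exit condition is met,
        # regardless of the history stored in the counter.
        predicted_taken = False if override_exit else counter >= 2
        correct = predicted_taken == taken
        # Train the dynamic predictor on the observed outcome.
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return correct  # result for the last branch, i.e. the loop exit


if __name__ == "__main__":
    print(exit_predicted_correctly(100, override_exit=False))
    print(exit_predicted_correctly(100, override_exit=True))
```

    In this model the override mispredicts every spinning iteration, but those iterations were only waiting. The one branch that matters for latency, the exit, is predicted correctly.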

    System and method for implementing shared scalable nonzero indicators
    102.
    Invention grant (in force)

    Publication No.: US08909601B2

    Publication date: 2014-12-09

    Application No.: US11939372

    Filing date: 2007-11-13

    IPC classification: G06F9/52 G06F13/14

    CPC classification: G06F9/52

    Abstract: A Scalable NonZero Indicator (SNZI) object in a concurrent computing application may include a shared data portion (e.g., a counter portion) and a shared nonzero indicator portion, and/or may be an element in a hierarchy of SNZI objects that filters changes in non-root nodes to a root node. SNZI objects may be accessed by software applications through an API that includes a query operation to return the value of the nonzero indicator, and arrive (increment) and depart (decrement) operations. Modifications of the data portion and/or the indicator portion may be performed using atomic read-modify-write type operations. Some SNZI objects may support a reset operation. A shared data object may be set to an intermediate value, or an announce bit may be set, to indicate that a modification is in progress that affects its corresponding indicator value. Another process or thread seeing this indication may “help” complete the modification before proceeding.
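
    The arrive/depart/query API and the filtering hierarchy can be sketched as follows. This is a simplified illustration, not the patented design: the class name `SNZINode` is hypothetical, a per-node lock stands in for the atomic read-modify-write operations, and the intermediate-value/announce-bit helping protocol and the reset operation are not modeled.

```python
import threading


class SNZINode:
    """One node in a (hypothetical) SNZI hierarchy."""

    def __init__(self, parent=None):
        self._count = 0
        self._lock = threading.Lock()  # stand-in for an atomic read-modify-write
        self._parent = parent

    def arrive(self):
        with self._lock:
            # Only a 0 -> 1 transition is propagated toward the root.
            if self._count == 0 and self._parent is not None:
                self._parent.arrive()
            self._count += 1

    def depart(self):
        with self._lock:
            if self._count <= 0:
                raise ValueError("depart without a matching arrive")
            self._count -= 1
            # Only a 1 -> 0 transition is propagated toward the root.
            if self._count == 0 and self._parent is not None:
                self._parent.depart()

    def query(self):
        # The indicator is read at the root of the hierarchy.
        node = self
        while node._parent is not None:
            node = node._parent
        with node._lock:
            return node._count > 0
```

    Because a child forwards only its zero/nonzero transitions, repeated arrivals at the same child touch the root once, which is the filtering effect the abstract attributes to the hierarchy.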

    Page-protection based memory access barrier traps
    103.
    Invention grant (in force)

    Publication No.: US08725974B2

    Publication date: 2014-05-13

    Application No.: US11654456

    Filing date: 2007-01-17

    IPC classification: G06F12/00 G06F13/00

    CPC classification: G06F12/0253

    Abstract: A method, apparatus and computer program product for providing page-protection based memory access barrier traps is presented. A value for a user-mode bit (u-bit) is computed for each extant virtual page in an address space, the u-bit indicative that an object on the virtual page is being moved by a Garbage Collector process. An instruction is executed which causes an access protection fault. The state of the u-bit for the virtual page associated with the access protection fault is consulted when the access protection fault is encountered. Additionally, the access protection fault is translated into a user-trap (utrap) and the utrap is serviced when the u-bit is set.
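
    The fault-handling decision in this abstract can be modeled in a few lines. The sketch below is only an illustration: the class and method names, the 4096-byte page size, and the behavior when the u-bit is clear (falling back to ordinary fault handling) are assumptions, not details taken from the patent.

```python
PAGE_SIZE = 4096  # hypothetical page size in bytes


class AddressSpace:
    """Toy model of per-page u-bits consulted on access protection faults."""

    def __init__(self):
        self._u_bit = {}  # virtual page number -> True while the GC moves objects

    def set_moving(self, page_number, moving):
        self._u_bit[page_number] = moving

    def on_access_protection_fault(self, address):
        page_number = address // PAGE_SIZE
        if self._u_bit.get(page_number, False):
            # u-bit set: translate the fault into a user trap and service it.
            return "utrap"
        # Assumption: with the u-bit clear, the fault is handled the usual way.
        return "ordinary-fault"
```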

    Cache index coloring for virtual-address dynamic allocators
    104.
    Invention grant (in force)

    Publication No.: US08707006B2

    Publication date: 2014-04-22

    Application No.: US12899493

    Filing date: 2010-10-06

    Applicant: David Dice

    Inventor: David Dice

    IPC classification: G06F12/00 G06F13/00 G06F13/28

    Abstract: A method for managing a memory, including obtaining a number of indices and a cache line size of a cache memory, computing a cache page size by multiplying the number of indices by the cache line size, calculating a greatest common denominator (GCD) of the cache page size and a first size class, incrementing, in response to the GCD of the cache page size and the first size class exceeding the cache line size, the first size class to generate an updated first size class, calculating a GCD of the cache page size and the updated first size class, creating, in response to the GCD of the cache page size and the updated first size class being less than the cache line size, a first superblock in the memory including a first plurality of blocks of the updated first size class, and creating a second superblock in the memory.
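
    The size-class adjustment in this abstract is simple arithmetic and can be sketched directly. The function name, the 16-byte increment `step`, and the power-of-two restrictions below are assumptions made so the example terminates; the abstract does not state the increment or how a GCD equal to the cache line size is treated (here it is accepted). The GCD is computed as the greatest common divisor via `math.gcd`.

```python
from math import gcd


def _is_power_of_two(n):
    return n > 0 and n & (n - 1) == 0


def adjust_size_class(size_class, num_indices, line_size, step=16):
    """Bump size_class until gcd(cache page size, size_class) <= line_size."""
    # Assumptions for this sketch: cache geometry and step are powers of two,
    # and step does not exceed the line size, so the loop always terminates.
    if not all(_is_power_of_two(n) for n in (num_indices, line_size, step)):
        raise ValueError("num_indices, line_size and step must be powers of two")
    if step > line_size:
        raise ValueError("step must not exceed line_size")

    cache_page_size = num_indices * line_size
    while gcd(cache_page_size, size_class) > line_size:
        size_class += step  # generate the updated size class
    return size_class


if __name__ == "__main__":
    # 64 indices x 64-byte lines gives a 4096-byte cache page.
    print(adjust_size_class(128, num_indices=64, line_size=64))
```

    With a 4096-byte cache page, a 128-byte size class has a GCD of 128 with the page size, so it is bumped to 144 bytes, whose GCD with 4096 is 16.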

    Method and system for providing a current time value
    105.
    Invention grant (in force)

    Publication No.: US08473772B2

    Publication date: 2013-06-25

    Application No.: US12898371

    Filing date: 2010-10-05

    IPC classification: G06F1/14

    CPC classification: G06F1/14

    Abstract: A method for providing applications with a current time value includes receiving a trap for an application to access a time memory page, creating, in a memory map corresponding to the application, a mapping between an address space of the application and the time memory page in response to the trap, accessing, based on the trap, a hardware clock to obtain a time value, and updating the time memory page with the time value. The application reads the time value from the time memory page using the memory map.
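
    The sequence of steps in this abstract can be walked through with a small user-space model. Everything here is a stand-in: `TimePageKernel`, `read_time`, and the use of `time.monotonic_ns` as the "hardware clock" are hypothetical names and choices, and how the page is refreshed after the first trap is not modeled.

```python
import time


class TimePageKernel:
    """Toy model of the trap handler that maps and fills a time page."""

    def __init__(self, clock=time.monotonic_ns):
        self._clock = clock    # stand-in for the hardware clock
        self._time_page = [0]  # shared page holding the latest time value
        self.memory_maps = {}  # application id -> mapped time page

    def handle_trap(self, app_id):
        # Create the mapping in the application's memory map.
        self.memory_maps[app_id] = self._time_page
        # Read the clock and update the time page with the value.
        self._time_page[0] = self._clock()


def read_time(kernel, app_id):
    page = kernel.memory_maps.get(app_id)
    if page is None:
        # First access: the page is not mapped yet, so the access traps.
        kernel.handle_trap(app_id)
        page = kernel.memory_maps[app_id]
    # The application reads the time value through its memory map.
    return page[0]
```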

    System and Method for Reducing Transactional Abort Rates Using Compiler Optimization Techniques
    107.
    Invention application (in force)

    Publication No.: US20100169870A1

    Publication date: 2010-07-01

    Application No.: US12345189

    Filing date: 2008-12-29

    Applicant: David Dice

    Inventor: David Dice

    IPC classification: G06F9/44

    CPC classification: G06F8/4441 G06F9/467

    Abstract: In transactional memory systems, transactional aborts due to conflicts between concurrent threads may cause system performance degradation. A compiler may attempt to minimize runtime abort rates by performing one or more code transformations and/or other optimizations on a transactional memory program in an attempt to minimize one or more store-commit intervals. The compiler may employ store deferral, hoisting of long-latency operations from within a transaction body and/or store-commit interval, speculative hoisting of long-latency operations, and/or redundant store squashing optimizations. The compiler may perform optimizing transformations on source code and/or on any intermediate representation of the source code (e.g., parse trees, un-optimized assembly code, etc.). In some embodiments, the compiler may preemptively avoid naïve target code constructions. The compiler may perform static and/or dynamic analysis of a program in order to determine which, if any, transformations should be applied and/or may dynamically recompile code sections at runtime, based on execution analysis.

    System and Method for Reducing Serialization in Transactional Memory Using Gang Release of Blocked Threads
    108.
    Invention application (in force)

    Publication No.: US20100138836A1

    Publication date: 2010-06-03

    Application No.: US12327659

    Filing date: 2008-12-03

    IPC classification: G06F9/46

    Abstract: Transactional Lock Elision (TLE) may allow multiple threads to concurrently execute critical sections as speculative transactions. Transactions may abort due to various reasons. To avoid starvation, transactions may revert to execution using mutual exclusion when transactional execution fails. Because threads may revert to mutual exclusion in response to the mutual exclusion of other threads, a positive feedback loop may form in times of high congestion, causing a “lemming effect”. To regain the benefits of concurrent transactional execution, the system may allow one or more threads awaiting a given lock to be released from the wait queue and instead attempt transactional execution. A gang release may allow a subset of waiting threads to be released simultaneously. The subset may be chosen dependent on the number of waiting threads, historical abort relationships between threads, analysis of transactions of each thread, sensitivity of each thread to abort, and/or other thread-local or global criteria.
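
    A gang release amounts to removing a chosen subset of waiters from the lock's wait queue in one step. The sketch below is a minimal illustration with hypothetical names; it selects the subset purely by count from the head of the queue, whereas the abstract lists several other possible criteria (abort history, per-thread analysis, and so on) that are not modeled.

```python
from collections import deque


def gang_release(wait_queue, gang_size):
    """Remove up to gang_size threads from the wait queue and return them.

    The returned threads would stop waiting for the lock and attempt
    transactional execution instead.
    """
    released = []
    while wait_queue and len(released) < gang_size:
        released.append(wait_queue.popleft())
    return released


if __name__ == "__main__":
    waiters = deque(["t1", "t2", "t3", "t4"])
    # Hypothetical policy: release half of the current waiters at once.
    print(gang_release(waiters, len(waiters) // 2))
```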

    Method and System for Hardware Feedback in Transactional Memory
    109.
    Invention application (in force)

    Publication No.: US20100131953A1

    Publication date: 2010-05-27

    Application No.: US12324109

    Filing date: 2008-11-26

    IPC classification: G06F9/46 G06F12/00 G06F12/08

    Abstract: Multi-threaded, transactional memory systems may allow concurrent execution of critical sections as speculative transactions. These transactions may abort due to contention among threads. Hardware feedback mechanisms may detect information about aborts and provide that information to software, hardware, or hybrid software/hardware contention management mechanisms. For example, they may detect occurrences of transactional aborts or conditions that may result in transactional aborts, and may update local readable registers or other storage entities (e.g., performance counters) with relevant contention information. This information may include identifying data (e.g., information outlining abort relationships between the processor and other specific physical or logical processors) and/or tallied data (e.g., values of event counters reflecting the number of aborted attempts by the current thread or the resources consumed by those attempts). This contention information may be accessible by contention management mechanisms to inform contention management decisions (e.g., whether to revert transactions to mutual exclusion, delay retries, etc.).
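
    As an illustration of how tallied abort data might inform a contention management decision, the sketch below maps an abort count to either a delayed transactional retry or a fallback to mutual exclusion. The function name, the retry limit, and the exponential backoff schedule are all hypothetical; the abstract only says such information may inform decisions of this kind.

```python
def contention_decision(abort_count, max_retries=8, base_delay_us=10):
    """Return (mode, delay_us) given the aborts tallied for this transaction.

    abort_count stands in for a value read from a hardware event counter.
    """
    if abort_count == 0:
        return ("transactional", 0)  # no contention observed yet
    if abort_count >= max_retries:
        return ("mutual_exclusion", 0)  # give up on speculation
    # Otherwise retry transactionally after an exponentially growing delay.
    return ("transactional", base_delay_us * 2 ** (abort_count - 1))
```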

    System and Method for Integrating Best Effort Hardware Mechanisms for Supporting Transactional Memory
    110.
    Invention application (in force)

    Publication No.: US20090282405A1

    Publication date: 2009-11-12

    Application No.: US12238172

    Filing date: 2008-09-25

    IPC classification: G06F9/46

    CPC classification: G06F9/52 G06F9/467

    Abstract: Systems and methods for integrating multiple best effort hardware transactional support mechanisms, such as Read Set Monitoring (RSM) and Best Effort Hardware Transactional Memory (BEHTM), in a single transactional memory implementation are described. The best effort mechanisms may be integrated such that the overhead associated with support of multiple mechanisms may be reduced and/or the performance of the resulting transactional memory implementations may be improved over those that include any one of the mechanisms, or an un-integrated collection of multiple such mechanisms. Two or more of the mechanisms may be employed concurrently or serially in a single attempt to execute a transaction, without aborting or retrying the transaction. State maintained or used by a first mechanism may be shared with or transferred to another mechanism for use in execution of the transaction. This transfer may be performed automatically by the integrated mechanisms (e.g., without user, programmer, or software intervention).
