Method and apparatus for switching between per-thread and per-processor resource pools in multi-threaded programs
    1.
    Patent application (legal status: in force)

    Publication No.: US20060218557A1

    Publication date: 2006-09-28

    Application No.: US11090398

    Filing date: 2005-03-25

    IPC classification: G06F9/46

    CPC classification: G06F9/5016 G06F2209/507

    Abstract: In a multi-processor multi-threaded computer system, resources are dynamically assigned during program operation to either threads or processors in such a manner that resource usage is maximized. In one embodiment, the choice of whether to assign resources to threads or processors is dependent on the number of threads versus the number of processors. In another embodiment, when the system is operating in one assignment mode, the amount of wasted resources is measured and when this measured amount exceeds a predetermined threshold based on the maximum resources that could be wasted were the system operating in the other assignment mode, the assignment is switched to the other assignment mode.
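
The two embodiments in this abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the tie-breaking rule in `choose_mode_by_count`, the `pool_capacity` unit, the `fraction` tuning knob, and the worst-case waste estimate are assumptions made only so the example runs.

```python
from dataclasses import dataclass

PER_THREAD = "per-thread"
PER_PROCESSOR = "per-processor"


def choose_mode_by_count(num_threads: int, num_processors: int) -> str:
    """First embodiment as read here: compare thread and processor counts.

    Assumption: pools are attached to whichever population is smaller,
    and ties go to per-thread pools.
    """
    return PER_THREAD if num_threads <= num_processors else PER_PROCESSOR


@dataclass
class PoolModeController:
    mode: str
    pool_capacity: int      # hypothetical: resource units reserved per pool
    num_threads: int
    num_processors: int
    fraction: float = 1.0   # hypothetical scaling of the threshold

    def max_waste_in_other_mode(self) -> int:
        # Assumed worst case for the other mode: every pool it would
        # create sits completely unused.
        pools = self.num_processors if self.mode == PER_THREAD else self.num_threads
        return pools * self.pool_capacity

    def observe(self, measured_waste: int) -> str:
        """Second embodiment: switch when measured waste exceeds the threshold."""
        threshold = self.fraction * self.max_waste_in_other_mode()
        if measured_waste > threshold:
            self.mode = PER_PROCESSOR if self.mode == PER_THREAD else PER_THREAD
        return self.mode
```

With 64 threads on 8 processors and 10 units per pool, a controller that starts in per-thread mode switches once measured waste passes 80 units, the most that per-processor pools could waste under these assumptions.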

    Method and apparatus for switching between per-thread and per-processor resource pools in multi-threaded programs
    2.
    Granted patent (legal status: in force)

    Publication No.: US07882505B2

    Publication date: 2011-02-01

    Application No.: US11090398

    Filing date: 2005-03-25

    IPC classification: G06F9/46 G06F13/00

    CPC classification: G06F9/5016 G06F2209/507

    Abstract: In a multi-processor multi-threaded computer system, resources are dynamically assigned during program operation to either threads or processors in such a manner that resource usage is maximized. In one embodiment, the choice of whether to assign resources to threads or processors is dependent on the number of threads versus the number of processors. In another embodiment, when the system is operating in one assignment mode, the amount of wasted resources is measured and when this measured amount exceeds a predetermined threshold based on the maximum resources that could be wasted were the system operating in the other assignment mode, the assignment is switched to the other assignment mode.

    System and method for integrating best effort hardware mechanisms for supporting transactional memory
    3.
    Granted patent (legal status: in force)

    Publication No.: US09367363B2

    Publication date: 2016-06-14

    Application No.: US12238172

    Filing date: 2008-09-25

    IPC classification: G06F9/46 G06F9/52

    CPC classification: G06F9/52 G06F9/467

    Abstract: Systems and methods for integrating multiple best effort hardware transactional support mechanisms, such as Read Set Monitoring (RSM) and Best Effort Hardware Transactional Memory (BEHTM), in a single transactional memory implementation are described. The best effort mechanisms may be integrated such that the overhead associated with support of multiple mechanisms may be reduced and/or the performance of the resulting transactional memory implementations may be improved over those that include any one of the mechanisms, or an un-integrated collection of multiple such mechanisms. Two or more of the mechanisms may be employed concurrently or serially in a single attempt to execute a transaction, without aborting or retrying the transaction. State maintained or used by a first mechanism may be shared with or transferred to another mechanism for use in execution of the transaction. This transfer may be performed automatically by the integrated mechanisms (e.g., without user, programmer, or software intervention).

    Partitioned ticket locks with semi-local spinning
    4.
    Granted patent (legal status: in force)

    Publication No.: US09158596B2

    Publication date: 2015-10-13

    Application No.: US13051877

    Filing date: 2011-03-18

    Applicant: David Dice

    Inventor: David Dice

    IPC classification: G06F9/46 G06F9/52

    CPC classification: G06F9/526 G06F2209/522

    Abstract: A partitioned ticket lock may control access to a shared resource, and may include a single ticket value field and multiple grant value fields. Each grant value may be the sole occupant of a respective cache line, an event count or sequencer instance, or a sub-lock. The number of grant values may be configurable and/or adaptable during runtime. To acquire the lock, a thread may obtain a value from the ticket value field using a fetch-and-increment type operation, and generate an identifier of a particular grant value field by applying a mathematical or logical function to the obtained ticket value. The thread may be granted the lock when the value of that grant value field matches the obtained ticket value. Releasing the lock may include computing a new ticket value, generating an identifier of another grant value field, and storing the new ticket value in the other grant value field.
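
The acquire and release steps described in this abstract can be sketched as follows. This is an illustrative model, not the patented implementation: the class names are hypothetical, the fetch-and-increment is emulated with a mutex rather than a hardware atomic, the mapping function is assumed to be `ticket % partitions`, and a Python list cannot place each grant value on its own cache line.

```python
import threading
import time


class AtomicCounter:
    """Stand-in for a hardware fetch-and-increment (mutex-guarded here)."""

    def __init__(self) -> None:
        self._value = 0
        self._guard = threading.Lock()

    def fetch_and_increment(self) -> int:
        with self._guard:
            old = self._value
            self._value += 1
            return old


class PartitionedTicketLock:
    def __init__(self, partitions: int = 4) -> None:
        self._ticket = AtomicCounter()
        # Slot 0 starts at 0 so the holder of ticket 0 enters immediately.
        # The other slots hold -1, which never equals a valid ticket.
        self._grants = [0] + [-1] * (partitions - 1)

    def _slot(self, ticket: int) -> int:
        # Assumed mapping from a ticket value to a grant field.
        return ticket % len(self._grants)

    def acquire(self) -> int:
        ticket = self._ticket.fetch_and_increment()
        slot = self._slot(ticket)
        # Spin only on this ticket's own grant slot.
        while self._grants[slot] != ticket:
            time.sleep(0)  # yield to other threads while waiting
        return ticket

    def release(self, ticket: int) -> None:
        # Publish the next ticket value in the slot its holder is watching.
        nxt = ticket + 1
        self._grants[self._slot(nxt)] = nxt
```

A caller keeps the ticket returned by `acquire()` and passes it back to `release()`. Because consecutive tickets map to different slots, waiters are spread across the grant fields instead of all polling one location.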

    Method and system for optimizing code for a multi-threaded application
    5.
    Granted patent (legal status: in force)

    Publication No.: US08826249B2

    Publication date: 2014-09-02

    Application No.: US12708014

    Filing date: 2010-02-18

    IPC classification: G06F9/45 G06F9/455

    CPC classification: G06F9/45516 G06F8/456

    Abstract: In modern multi-threaded environments, threads often work cooperatively toward providing collective or aggregate throughput for an application as a whole. Optimizing in the small for “thread local” common path latency is often but not always the best approach for a concurrent system composed of multiple cooperating threads. Some embodiments provide a technique for augmenting traditional code emission with thread-aware policies and optimization strategies for a multi-threaded application. During operation, the system obtains information about resource contention between executing threads of the multi-threaded application. The system analyzes the resource contention information to identify regions of the code to be optimized. The system recompiles these identified regions to produce optimized code, which is then stored for subsequent execution.

    System and Method for Implementing Hierarchical Queue-Based Locks Using Flat Combining
    6.
    Patent application (legal status: in force)

    Publication No.: US20120311606A1

    Publication date: 2012-12-06

    Application No.: US13152079

    Filing date: 2011-06-02

    IPC classification: G06F9/46

    CPC classification: G06F9/526

    Abstract: The system and methods described herein may be used to implement a scalable, hierarchical, queue-based lock using flat combining. A thread executing on a processor core in a cluster of cores that share a memory may post a request to acquire a shared lock in a node of a publication list for the cluster using a non-atomic operation. A combiner thread may build an ordered (logical) local request queue that includes its own node and nodes of other threads (in the cluster) that include lock requests. The combiner thread may splice the local request queue into a (logical) global request queue for the shared lock as a sub-queue. A thread whose request has been posted in a node that has been combined into a local sub-queue and spliced into the global request queue may spin on a lock ownership indicator in its node until it is granted the shared lock.

    CACHE INDEX COLORING FOR VIRTUAL-ADDRESS DYNAMIC ALLOCATORS
    7.
    Patent application (legal status: in force)

    Publication No.: US20120089803A1

    Publication date: 2012-04-12

    Application No.: US12899493

    Filing date: 2010-10-06

    Applicant: David Dice

    Inventor: David Dice

    IPC classification: G06F12/02

    Abstract: A method for managing a memory, including obtaining a number of indices and a cache line size of a cache memory, computing a cache page size by multiplying the number of indices by the cache line size, calculating a greatest common divisor (GCD) of the cache page size and a first size class, incrementing, in response to the GCD of the cache page size and the first size class exceeding the cache line size, the first size class to generate an updated first size class, calculating a GCD of the cache page size and the updated first size class, creating, in response to the GCD of the cache page size and the updated first size class being less than the cache line size, a first superblock in the memory including a first plurality of blocks of the updated first size class, and creating a second superblock in the memory.
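
The GCD test in this abstract can be worked through numerically. The sketch below rests on several assumptions that the abstract does not state: the increment `step` (16 bytes here), the `max_steps` guard, stopping as soon as the GCD no longer exceeds the line size (the abstract only describes the strictly smaller case), and superblocks that begin on a cache-page boundary. The function names are hypothetical.

```python
from math import gcd


def cache_page_size(num_indices: int, line_size: int) -> int:
    """Cache page size = number of indices x cache line size."""
    return num_indices * line_size


def color_size_class(size_class: int, num_indices: int, line_size: int,
                     step: int = 16, max_steps: int = 1024) -> int:
    """Increment size_class until its GCD with the cache page size
    no longer exceeds the cache line size."""
    page = cache_page_size(num_indices, line_size)
    for _ in range(max_steps):
        if gcd(page, size_class) <= line_size:
            return size_class
        size_class += step  # assumed increment; not specified in the abstract
    raise ValueError("no suitable size class found; try a different step")


def start_indices(size_class: int, num_blocks: int,
                  num_indices: int, line_size: int) -> set:
    """Cache indices that block starts map to, assuming the superblock
    begins on a cache-page boundary."""
    return {(i * size_class // line_size) % num_indices
            for i in range(num_blocks)}
```

For a cache with 512 indices and 64-byte lines, blocks of 1024 bytes start on only 32 distinct indices, because the GCD with the 32768-byte cache page is 1024. Under the assumed 16-byte step the size class becomes 1040, whose GCD with the cache page is 16, and block starts spread over many more indices.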

    System and method for transactional locking using reader-lists
    8.
    Granted patent (legal status: in force)

    Publication No.: US08103838B2

    Publication date: 2012-01-24

    Application No.: US12350792

    Filing date: 2009-01-08

    IPC classification: G06F12/16

    CPC classification: G06F9/52

    Abstract: In traditional transactional locking systems, such as Transactional Locking with Read-Write locks (TLRW), threads may frequently update lock metadata, causing system performance degradation. A system and method for implementing transactional locking using reader-lists (TLRL) may associate a respective reader-list with each stripe of data in a shared memory system. Before reading a given stripe as part of a transaction, a thread may add itself to the stripe's reader-list, if the thread is not already on the reader-list. A thread may leave itself on a reader-list after finishing the transaction. Before a thread modifies a stripe, the modifying thread may acquire a write-lock for the stripe. The writer thread may indicate to each reader thread on the stripe's reader-list that if the reader thread is executing a transaction, the reader thread should abort. The indication may include setting an invalidation flag for the reader. The writer thread may clear the reader-list of a stripe it modified.
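
The reader-list bookkeeping described in this abstract can be modeled with a small data-structure sketch. It only illustrates the order of operations: the `Reader` and `Stripe` classes and their methods are hypothetical names, the reader-list is an unsynchronized Python set, and a real implementation would need the concurrent read path to be made safe.

```python
import threading


class Reader:
    """Per-thread transaction state (hypothetical structure)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.invalidated = False  # set by writers, checked at commit

    def begin(self) -> None:
        self.invalidated = False

    def commit(self) -> bool:
        # The transaction succeeds only if no writer invalidated it.
        return not self.invalidated


class Stripe:
    def __init__(self, value: int = 0) -> None:
        self.value = value
        self.readers = set()               # the stripe's reader-list
        self.write_lock = threading.Lock()

    def read(self, reader: Reader) -> int:
        # Join the reader-list only if not already on it, so repeat
        # readers do not touch the lock metadata again.
        if reader not in self.readers:
            self.readers.add(reader)
        return self.value

    def write(self, writer: Reader, value: int) -> None:
        with self.write_lock:
            # Tell every listed reader to abort, then clear the list.
            for r in self.readers:
                if r is not writer:
                    r.invalidated = True
            self.readers.clear()
            self.value = value
```

A reader that commits successfully stays on the reader-list, so its next transaction can read the same stripe without updating shared metadata until a writer clears the list.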

    Methods and apparatus to implement parallel transactions
    9.
    Granted patent (legal status: in force)

    Publication No.: US08065499B2

    Publication date: 2011-11-22

    Application No.: US11475262

    Filing date: 2006-06-27

    IPC classification: G06F12/08

    Abstract: A computer system includes multiple processing threads that execute in parallel. The multiple processing threads have access to a global environment including different types of metadata enabling the processing threads to carry out simultaneous execution depending on a currently selected type of lock mode. A mode controller monitoring the processing threads initiates switching from one type of lock mode to another depending on current operating conditions such as an amount of contention amongst the multiple processing threads to modify the shared data. The mode controller can switch from one lock mode to another regardless of whether any of the multiple processes are in the midst of executing a respective transaction. A most efficient lock mode can be selected to carry out the parallel transactions. In certain cases, switching of lock modes causes one or more of the processing threads to abort and retry a respective transaction according to the new mode.
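
One way to picture the mode controller in this abstract is a contention-driven switch with an epoch counter. The sketch below is an assumption-laden illustration rather than the patented design: the mode names `"optimistic"` and `"pessimistic"`, the `high` and `low` thresholds, and the use of an epoch to force retries are all hypothetical.

```python
class ModeController:
    def __init__(self, high: float = 0.5, low: float = 0.1) -> None:
        self.mode = "optimistic"   # hypothetical lock-mode names
        self.epoch = 0             # bumped on every mode switch
        self.high = high           # switch to pessimistic above this rate
        self.low = low             # switch back below this rate

    def report(self, conflicts: int, attempts: int) -> str:
        """Feed in contention observed over a sampling window."""
        rate = conflicts / attempts if attempts else 0.0
        if self.mode == "optimistic" and rate > self.high:
            self._switch("pessimistic")
        elif self.mode == "pessimistic" and rate < self.low:
            self._switch("optimistic")
        return self.mode

    def _switch(self, new_mode: str) -> None:
        # The switch happens even if transactions are still in flight.
        self.mode = new_mode
        self.epoch += 1


class Transaction:
    def __init__(self, controller: ModeController) -> None:
        self.controller = controller
        self.mode = controller.mode    # mode this attempt runs under
        self.epoch = controller.epoch

    def must_retry(self) -> bool:
        # A transaction started under an older mode aborts and retries.
        return self.epoch != self.controller.epoch
```

The gap between `high` and `low` gives the controller hysteresis, so moderate contention does not make it flip modes on every sample.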

    FACILITATING TRANSACTIONAL EXECUTION THROUGH FEEDBACK ABOUT MISSPECULATION
    10.
    Patent application (legal status: in force)

    Publication No.: US20100333093A1

    Publication date: 2010-12-30

    Application No.: US12493447

    Filing date: 2009-06-29

    IPC classification: G06F9/46 G06F12/08

    Abstract: One embodiment provides a system that facilitates the execution of a transaction for a program in a hardware-supported transactional memory system. During operation, the system records a misspeculation indicator of the transaction during execution of the transaction using hardware transactional memory mechanisms. Next, the system detects a transaction failure associated with the transaction. Finally, the system provides the recorded misspeculation indicator to the program to facilitate a response to the transaction failure by the program.
