System and method for supporting multiple alternative methods for executing transactions
    1.
    Invention grant
    System and method for supporting multiple alternative methods for executing transactions (in force)

    Publication No.: US07921407B2

    Publication date: 2011-04-05

    Application No.: US11591919

    Filing date: 2006-11-02

    IPC classes: G06F9/44 G06F9/45 G06F12/00

    Abstract: Transaction code written by the programmer may be translated, replaced, or transformed into code that is configured to implement transactions according to any of various techniques. A compiler may replace programmer-written transaction code with code that allows multiple compatible transaction implementation techniques to be used in the same program at the same time. A programmer may write transaction code once, using familiar coding styles, and have the transaction effected according to one of a number of compatible alternative implementation techniques. The compiler may enable the implementation of multiple, alternative transactional memory schemes. The particular technique used for each transaction may not be decided until runtime. At runtime, any of the implemented techniques may be used to effect the transaction, and if a first technique fails or is inappropriate for a particular transaction, one or more other techniques may be attempted.

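    To make the runtime-fallback idea concrete, here is a minimal C++ sketch, not taken from the patent itself: a transaction body is attempted with one implementation technique and, if it keeps failing, is retried under a different compatible technique. The names try_optimistic_txn, run_locked_txn, and execute_txn are illustrative assumptions.

        #include <cstdio>
        #include <functional>
        #include <mutex>

        // A transaction body: returns true on commit, false on abort.
        using TxnBody = std::function<bool()>;

        // Placeholder for an optimistic (e.g. hardware) technique that can fail at runtime.
        bool try_optimistic_txn(const TxnBody& body) {
            // A real implementation would begin/commit a hardware or software
            // transaction here; this stub simply runs the body and reports its status.
            return body();
        }

        // Fallback technique: serialize the transaction under a global lock.
        bool run_locked_txn(const TxnBody& body) {
            static std::mutex global_txn_lock;
            std::lock_guard<std::mutex> guard(global_txn_lock);
            return body();
        }

        // Runtime dispatcher: try one compatible technique a few times and, if it
        // keeps failing or is unsuitable, fall back to another technique.
        bool execute_txn(const TxnBody& body) {
            for (int attempt = 0; attempt < 3; ++attempt)
                if (try_optimistic_txn(body)) return true;  // committed
            return run_locked_txn(body);                    // last-resort technique
        }

        int main() {
            int shared = 0;
            bool committed = execute_txn([&] { shared += 1; return true; });
            std::printf("committed=%d shared=%d\n", (int)committed, shared);
        }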

    System and method for scheduling instructions to maximize outstanding prefetches and loads
    2.
    Invention grant
    System and method for scheduling instructions to maximize outstanding prefetches and loads (in force)

    Publication No.: US06918111B1

    Publication date: 2005-07-12

    Application No.: US09679434

    Filing date: 2000-10-03

    IPC classes: G06F9/45

    CPC classes: G06F8/4442

    Abstract: The present invention discloses a method and device for ordering memory operation instructions in an optimizing compiler, for a processor that can potentially enter a stall state if a memory queue is full. The method uses a dependency graph coupled with one or more memory queues. The dependency graph shows the dependency relationships between instructions in a program being compiled. After the dependency graph is created, the ready nodes are identified. Dependency graph nodes that correspond to memory operations may have the effect of adding an element to the memory queue or removing one or more elements from it. The ideal situation is to keep the memory queue as full as possible without exceeding the maximum desirable number of elements, by scheduling memory operations to maximize their parallelism while avoiding stalls on the target processor.

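    A minimal sketch of the scheduling idea, assuming a greedy list scheduler and a simple counter that models the memory queue; the Node layout and the queue_depth parameter are illustrative, not the patent's data structures.

        #include <cstdio>
        #include <vector>

        // One dependency-graph node; a memory op occupies one memory-queue slot.
        struct Node {
            bool is_memory_op;
            std::vector<int> successors;  // nodes that depend on this one
            int unmet_deps;               // predecessors not yet scheduled
        };

        // Greedy list scheduler: prefer memory ops while the modeled memory queue
        // has room, so outstanding prefetches/loads stay high without exceeding
        // the queue depth that would stall the target processor.
        std::vector<int> schedule(std::vector<Node>& g, int queue_depth) {
            std::vector<int> ready, order;
            for (int i = 0; i < (int)g.size(); ++i)
                if (g[i].unmet_deps == 0) ready.push_back(i);

            int in_queue = 0;  // modeled number of outstanding memory operations
            while (!ready.empty()) {
                int pick = -1;
                for (int i = 0; i < (int)ready.size(); ++i) {
                    bool mem = g[ready[i]].is_memory_op;
                    if (mem && in_queue < queue_depth) { pick = i; break; }
                    if (!mem && pick < 0) pick = i;
                }
                if (pick < 0) { pick = 0; in_queue = 0; }  // queue full: model a drain
                int n = ready[pick];
                ready.erase(ready.begin() + pick);
                if (g[n].is_memory_op) ++in_queue;
                order.push_back(n);
                for (int s : g[n].successors)
                    if (--g[s].unmet_deps == 0) ready.push_back(s);
            }
            return order;
        }

        int main() {
            // Three independent loads feeding one use: the loads are issued first.
            std::vector<Node> g(4);
            g[0] = {true, {3}, 0};
            g[1] = {true, {3}, 0};
            g[2] = {true, {3}, 0};
            g[3] = {false, {}, 3};
            for (int n : schedule(g, 8)) std::printf("%d ", n);
            std::printf("\n");
        }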

    Method, apparatus and computer program product for processing stack related exception traps
    3.
    Invention grant
    Method, apparatus and computer program product for processing stack related exception traps (expired)

    Publication No.: US6167504A

    Publication date: 2000-12-26

    Application No.: US122172

    Filing date: 1998-07-24

    Applicant: Peter C. Damron

    Inventor: Peter C. Damron

    Abstract: Apparatus, methods, and computer program products are disclosed that improve the operation of a computer that uses a top-of-stack cache by reducing the number of overflow and underflow traps generated during the execution of a program. The invention maintains a predictor value that controls the number of stack elements that are spilled from, or filled to, the top-of-stack cache in response to an overflow trap or an underflow trap, respectively. The predictor reflects the history of overflow traps and underflow traps.

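    A minimal sketch, assuming one possible predictor policy (spill more after repeated overflow traps, move less after underflow traps); the patent does not necessarily use this exact update rule.

        #include <cstdio>

        // Trap-handler helper: the predictor decides how many top-of-stack cache
        // entries to spill or fill, and drifts with the recent trap history.
        struct SpillFillPredictor {
            int predictor = 4;              // entries to move on the next trap
            static const int kMinMove = 1;
            static const int kMaxMove = 16;

            int on_overflow() {             // cache full: spill `predictor` entries
                int n = predictor;
                if (predictor < kMaxMove) ++predictor;  // overflows dominating: spill more
                return n;
            }
            int on_underflow() {            // cache empty: fill `predictor` entries
                int n = predictor;
                if (predictor > kMinMove) --predictor;  // underflows dominating: move less
                return n;
            }
        };

        int main() {
            SpillFillPredictor p;
            std::printf("spill %d entries\n", p.on_overflow());
            std::printf("spill %d entries\n", p.on_overflow());
            std::printf("fill  %d entries\n", p.on_underflow());
        }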

    Extending a register file utilizing stack and queue techniques
    4.
    Invention grant
    Extending a register file utilizing stack and queue techniques (in force)

    Publication No.: US07203820B2

    Publication date: 2007-04-10

    Application No.: US10185200

    Filing date: 2002-06-28

    Applicant: Peter C. Damron

    Inventor: Peter C. Damron

    IPC classes: G06F12/00

    Abstract: In a set of registers, each individually addressable by register operations using a corresponding register identification, at least one register of the set is an extended register having multiple storage locations. Values stored in the multiple storage locations are accessed, for example, according to the order in which they were stored. Fewer than all of the multiple storage locations are accessible by a register operation at a given time. Older versions of software that do not recognize extended registers identify the extended register as having only one storage location. An extended register can be, for example, a stack register, a queue register, or a mixed register, and values stored in the multiple storage locations are read and stored according to the characteristics of the register.

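    A minimal sketch of an extended register modeled as a deque drained in LIFO or FIFO order; whether a read consumes or merely exposes the visible location is an assumption made for illustration.

        #include <cstdint>
        #include <cstdio>
        #include <deque>

        // One architectural register name backed by several storage locations,
        // drained in stack (LIFO) or queue (FIFO) order; only one location is
        // visible to a register operation at a time.
        struct ExtendedRegister {
            enum class Kind { Stack, Queue };
            Kind kind;
            std::deque<std::uint64_t> slots;

            void write(std::uint64_t v) { slots.push_back(v); }  // newest value at the back
            std::uint64_t read() {                               // consumes the visible location
                std::uint64_t v;
                if (kind == Kind::Stack) { v = slots.back();  slots.pop_back();  }
                else                     { v = slots.front(); slots.pop_front(); }
                return v;
            }
        };

        int main() {
            ExtendedRegister r{ExtendedRegister::Kind::Stack, {}};
            r.write(1); r.write(2); r.write(3);
            std::uint64_t first = r.read(), second = r.read();  // 3, then 2 (LIFO order)
            std::printf("%llu %llu\n",
                        (unsigned long long)first, (unsigned long long)second);
        }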

    System and method for a software controlled cache
    5.
    Invention grant
    System and method for a software controlled cache (in force)

    Publication No.: US06668307B1

    Publication date: 2003-12-23

    Application No.: US09677092

    Filing date: 2000-09-29

    Applicant: Peter C. Damron

    Inventor: Peter C. Damron

    IPC classes: G06F12/00

    CPC classes: G06F9/383 G06F9/30101

    Abstract: A system and method are provided for improved handling of data in a cache memory system (105) for caching data transferred between a processor (110) capable of executing a program and a main-memory (115). The cache memory system (105) has at least one cache (135) with several cache-lines (160) capable of caching data therein. In the method, a cache address space is provided for each cache (135), and special instructions are generated and inserted into the program to directly control caching of data in at least one of the cache-lines (160). Special instructions received in the cache memory system (105) are then executed to cache the data. The special instructions can be generated by a compiler during compiling of the program. Where the cache memory system (105) includes a set-associative cache having a number of sets, each with several cache-lines (160), the method can further include the step of determining which cache-line in a set to flush to main-memory (115) before caching new data to the set.

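    A minimal sketch of compiler-directed caching, with the "special instructions" modeled as ordinary functions; the line size, cache geometry, and function names are assumptions for illustration, not the patent's instruction set.

        #include <array>
        #include <cstddef>
        #include <cstdint>
        #include <cstdio>
        #include <vector>

        // Compiler-directed caching modeled in software: the program addresses cache
        // lines directly and issues explicit load/flush operations for them.
        constexpr int kLineBytes = 64;
        constexpr int kNumLines  = 8;

        std::vector<std::uint8_t> main_memory(4096, 0);
        std::array<std::array<std::uint8_t, kLineBytes>, kNumLines> cache{};  // software-managed cache
        std::array<std::size_t, kNumLines> line_tag{};                        // backing address per line

        // "Special instruction": copy one line of main memory into a chosen cache line.
        void cache_line_load(int line, std::size_t mem_addr) {
            line_tag[line] = mem_addr;
            for (int i = 0; i < kLineBytes; ++i) cache[line][i] = main_memory[mem_addr + i];
        }

        // "Special instruction": write a cache line back before it is reused.
        void cache_line_flush(int line) {
            for (int i = 0; i < kLineBytes; ++i) main_memory[line_tag[line] + i] = cache[line][i];
        }

        int main() {
            main_memory[128] = 42;
            cache_line_load(/*line=*/3, /*mem_addr=*/128);  // inserted by the compiler before a hot region
            cache[3][0] += 1;                               // work on the cached copy
            cache_line_flush(3);                            // inserted before the line is reused
            std::printf("%d\n", (int)main_memory[128]);     // prints 43
        }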

    Method and apparatus for performing prefetching at the critical section level
    6.
    Invention grant
    Method and apparatus for performing prefetching at the critical section level (in force)

    Publication No.: US06427235B1

    Publication date: 2002-07-30

    Application No.: US09434714

    Filing date: 1999-11-05

    IPC classes: G06F9/44

    Abstract: One embodiment of the present invention provides a system for compiling source code into executable code that performs prefetching for memory operations within critical sections of code that are subject to mutual exclusion. The system operates by compiling a source code module containing programming language instructions into an executable code module containing instructions suitable for execution by a processor. Next, the system identifies a critical section within the executable code module by identifying a region of code between a mutual exclusion lock operation and a mutual exclusion unlock operation. The system schedules explicit prefetch instructions into the critical section in advance of associated memory operations. In one embodiment, the system identifies the critical section of code by using a first macro to perform the mutual exclusion lock operation, wherein the first macro additionally activates prefetching. The system also uses a second macro to perform the mutual exclusion unlock operation, wherein the second macro additionally deactivates prefetching.

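    A minimal sketch of the macro scheme, assuming GCC/Clang's __builtin_prefetch as the explicit prefetch instruction; the macro names are illustrative, not the patent's.

        #include <cstdio>
        #include <mutex>

        static std::mutex g_lock;
        static bool g_prefetch_active = false;

        // The lock macro also enables prefetching; the unlock macro disables it, so
        // explicit prefetches are only issued inside the critical section.
        #define MUTEX_LOCK_WITH_PREFETCH()   do { g_lock.lock(); g_prefetch_active = true; } while (0)
        #define MUTEX_UNLOCK_STOP_PREFETCH() do { g_prefetch_active = false; g_lock.unlock(); } while (0)
        #define PREFETCH(addr)               do { if (g_prefetch_active) __builtin_prefetch(addr); } while (0)

        int shared_data[1024];

        int sum_critical_section() {
            int sum = 0;
            MUTEX_LOCK_WITH_PREFETCH();
            for (int i = 0; i < 1024; ++i) {
                if (i + 64 < 1024)
                    PREFETCH(&shared_data[i + 64]);  // scheduled ahead of the associated load
                sum += shared_data[i];
            }
            MUTEX_UNLOCK_STOP_PREFETCH();
            return sum;
        }

        int main() { std::printf("%d\n", sum_critical_section()); }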

    Virtual register set expanding processor internal storage
    7.
    Invention grant
    Virtual register set expanding processor internal storage (in force)

    Publication No.: US07210026B2

    Publication date: 2007-04-24

    Application No.: US10184333

    Filing date: 2002-06-28

    Applicant: Peter C. Damron

    Inventor: Peter C. Damron

    IPC classes: G06F9/44 G06F9/54 G06F12/08

    Abstract: A processor includes a set of registers, each individually addressable using a corresponding register identification, and plural virtual registers, each individually addressable using a corresponding virtual register identification. The processor transfers values between the set of registers and the plural virtual registers under control of a transfer operation. The processor can include a virtual register cache configured to store multiple sets of virtual register values, such that each of the multiple sets of virtual register values corresponds to a different context. Each of the plural virtual registers can include a valid bit that is reset on a context switch and set when a value is loaded from the virtual register cache. The processor can include a virtual register translation look-aside buffer for tracking the location of each set of virtual register values associated with each context.

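    A minimal sketch of architectural registers plus virtual registers with per-register valid bits and an explicit transfer operation; the register counts and the lazy refill behavior are assumptions made purely for illustration.

        #include <array>
        #include <cstdint>
        #include <cstdio>

        constexpr int kArchRegs = 8;   // conventional register file
        constexpr int kVirtRegs = 64;  // larger virtual register set

        struct Cpu {
            std::array<std::uint64_t, kArchRegs> reg{};
            std::array<std::uint64_t, kVirtRegs> vreg{};
            std::array<bool, kVirtRegs> vreg_valid{};  // reset on a context switch

            // Explicit transfer operations between the two register sets.
            void to_virtual(int v, int r) { vreg[v] = reg[r]; vreg_valid[v] = true; }
            void from_virtual(int r, int v) {
                if (!vreg_valid[v]) {
                    // Hardware would refill this value from the virtual-register cache
                    // for the current context; the model only notes the miss.
                    std::printf("vreg %d invalid: refill from virtual-register cache\n", v);
                    vreg_valid[v] = true;
                }
                reg[r] = vreg[v];
            }
            void context_switch() { vreg_valid.fill(false); }  // values handled lazily
        };

        int main() {
            Cpu cpu;
            cpu.reg[1] = 7;
            cpu.to_virtual(10, 1);    // move r1 into virtual register 10
            cpu.context_switch();
            cpu.from_virtual(2, 10);  // takes the modeled refill path
            std::printf("r2=%llu\n", (unsigned long long)cpu.reg[2]);
        }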

    System and method for pre-fetching for pointer linked data structures
    8.
    Invention grant
    System and method for pre-fetching for pointer linked data structures (in force)

    Publication No.: US06782454B1

    Publication date: 2004-08-24

    Application No.: US09677090

    Filing date: 2000-09-29

    Applicant: Peter C. Damron

    Inventor: Peter C. Damron

    IPC classes: G06F12/00

    Abstract: A system and method are provided for efficiently prefetching data in a pointer-linked data structure (140). In one embodiment, a data processing system (100) is provided including a processor (110) capable of executing a program, a main-memory (115), and a prefetch engine (175) configured to prefetch data from a plurality of locations in main-memory in response to a prefetch request from the processor. When the data in main-memory (115) has a linked-data-structure having a number of nodes (145) each with data (150) stored therein, the prefetch engine (175) is configured to traverse the linked-data-structure and prefetch data from the nodes. The prefetch engine (175) is configured to determine, from data contained in a prefetched first node (145A) and an offset value, a new starting address for a second node (145B) to be prefetched. In one embodiment, the prefetch engine (175) includes a number of sets of prefetch registers (180), one set of prefetch registers for each prefetch request from the processor (110) that is yet to be completed. Each set of prefetch registers (180) includes (i) a prefetch address register (190); (ii) an offset register (195); (iii) a termination register (200); (iv) a status register (205); and (v) a returned data register (210).

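    A minimal software model of such a prefetch request (start address, link-field offset, termination value); the hardware engine would perform this traversal itself, and the structure and field names here are assumptions.

        #include <cstddef>
        #include <cstdio>

        struct Node { int payload; Node* next; };

        // One prefetch request, mirroring the registers named in the abstract:
        // a starting address, the offset of the link field, and a termination value.
        struct PrefetchRequest {
            Node*       start;        // prefetch address register
            std::size_t next_offset;  // offset register: where the link lives in a node
            Node*       terminate;    // termination register: stop at this value
        };

        // Walk the structure the way the engine would, touching each node; the next
        // address is formed from data in the fetched node plus the offset.
        void run_prefetch(const PrefetchRequest& req) {
            Node* p = req.start;
            while (p != req.terminate) {
                std::printf("prefetch node at %p (payload %d)\n", (void*)p, p->payload);
                p = *(Node**)((char*)p + req.next_offset);
            }
        }

        int main() {
            Node c{3, nullptr}, b{2, &c}, a{1, &b};
            run_prefetch({&a, offsetof(Node, next), nullptr});
        }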

    Disambiguating memory references based upon user-specified programming constraints
    9.
    Invention grant
    Disambiguating memory references based upon user-specified programming constraints (in force)

    Publication No.: US06718542B1

    Publication date: 2004-04-06

    Application No.: US09549806

    Filing date: 2000-04-14

    IPC classes: G06F9/45

    CPC classes: G06F8/434 G06F8/445

    Abstract: A system that allows a programmer to specify a set of constraints that the programmer has adhered to in writing code, so that a compiler is able to assume the set of constraints in disambiguating memory references within the code. The system operates by receiving an identifier for a set of constraints on memory references that the programmer has adhered to in writing the code. The system uses the identifier to select a disambiguation technique from a set of disambiguation techniques. Note that each disambiguation technique is associated with a different set of constraints on memory references. The system uses the selected disambiguation technique to identify memory references within the code that can alias with each other.

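    A minimal sketch of the dispatch step, with made-up constraint sets and alias rules purely for illustration; the patent's actual constraint sets and disambiguation techniques are not reproduced here.

        #include <cstdio>
        #include <string>

        // Identifier for the set of constraints the programmer claims to follow.
        enum class ConstraintSet { NoAliasingAcrossTypes, RestrictLikePointers, NoConstraints };

        struct MemRef { std::string type; std::string base; };

        // The identifier selects which disambiguation rule the compiler applies;
        // returns true if the two references may alias under that rule.
        bool may_alias(ConstraintSet cs, const MemRef& a, const MemRef& b) {
            switch (cs) {
                case ConstraintSet::NoAliasingAcrossTypes:
                    return a.type == b.type;   // references of distinct types never alias
                case ConstraintSet::RestrictLikePointers:
                    return a.base == b.base;   // distinct pointer bases never alias
                default:
                    return true;               // no promises: stay conservative
            }
        }

        int main() {
            MemRef x{"float", "p"}, y{"int", "q"};
            std::printf("%d\n", (int)may_alias(ConstraintSet::NoAliasingAcrossTypes, x, y));  // 0
            std::printf("%d\n", (int)may_alias(ConstraintSet::NoConstraints, x, y));          // 1
        }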

    Method for apparatus for prefetching linked data structures
    10.
    Invention grant
    Method for apparatus for prefetching linked data structures (in force)

    Publication No.: US06687807B1

    Publication date: 2004-02-03

    Application No.: US09551292

    Filing date: 2000-04-18

    Applicant: Peter C. Damron

    Inventor: Peter C. Damron

    IPC classes: G06F9/00

    CPC classes: G06F12/0862 G06F2212/6028

    Abstract: Additional memory hardware in a computer system, distinct in function from the main memory system architecture, permits the storage and retrieval of prefetch addresses and allows the compiler to more efficiently generate prefetch instructions for execution while traversing pointer-based or recursive data structures. The additional memory hardware makes up a content addressable memory (CAM) or a hash table/array memory that is relatively close in cycle time to the CPU and relatively small compared to the main memory system. The additional CAM hardware permits the compiler to write data access loops that remember the addresses of each node visited while traversing the linked data structure, by providing storage space to hold a prefetch address or a set of prefetch addresses. Since the additional CAM is separate from the main memory system and acts as an alternate cache for holding the prefetch addresses, it prevents the overwriting of desired information in the regular cache and thus leaves the regular cache unpolluted. Furthermore, rather than storing the addresses for the entire memory system in the CAM, only the addresses of the data nodes traversed along the pointer-based data structure are stored and thus remembered, which allows the CAM to remain relatively small and access to the CAM by the CPU to remain relatively fast.

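    A minimal sketch that models the CAM/hash side memory with a bounded hash set remembering visited node addresses; the capacity and the structure names are illustrative assumptions, not the patent's hardware design.

        #include <cstddef>
        #include <cstdio>
        #include <unordered_set>

        struct Node { int value; Node* next; };

        // Small, fast side memory (modeled with a bounded hash set) that remembers
        // the addresses of nodes visited while traversing a pointer-based structure,
        // kept separate from the regular cache so it does not pollute it.
        struct PrefetchAddressCam {
            std::unordered_set<const void*> entries;
            std::size_t capacity = 64;  // kept small relative to main memory

            void remember(const void* addr) {
                if (entries.size() < capacity) entries.insert(addr);
            }
            bool contains(const void* addr) const { return entries.count(addr) != 0; }
        };

        int traverse(Node* head, PrefetchAddressCam& cam) {
            int sum = 0;
            for (Node* p = head; p; p = p->next) {
                if (cam.contains(p))
                    std::printf("issue prefetch for remembered node %p\n", (const void*)p);
                cam.remember(p);
                sum += p->value;
            }
            return sum;
        }

        int main() {
            Node c{3, nullptr}, b{2, &c}, a{1, &b};
            PrefetchAddressCam cam;
            int first  = traverse(&a, cam);  // first pass fills the side memory
            int second = traverse(&a, cam);  // second pass finds every address remembered
            std::printf("%d %d\n", first, second);
        }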