Method and structure for pipelining of SIMD conditional moves
    1.
    Granted patent (in force)

    Publication number: US07480787B1

    Publication date: 2009-01-20

    Application number: US11341001

    Filing date: 2006-01-27

    IPC classification: G06F9/00

    Abstract: A mask is first generated in a general-purpose integer register. The mask is generated by executing a single instruction multiple data (SIMD) instruction on a plurality of operands stored in a plurality of registers and by writing the result to the general-purpose integer register. Next, a conditional-move mask is generated in a register using the mask, and then the conditional-move mask is used in selecting operands from the plurality of operands to generate a result in another register.

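    The abstract describes a compare-then-select sequence: a SIMD compare writes a lane mask to an integer register, and the mask then steers a conditional move between operand registers. A minimal C sketch of that data flow is given below, assuming four 16-bit lanes packed into a 64-bit word; the helper names simd_cmpgt16 and simd_cmov are illustrative, not taken from the patent.

        #include <stdint.h>
        #include <stdio.h>

        /* Hypothetical software model of the mask-based SIMD conditional move.
         * Four 16-bit lanes packed into one 64-bit word stand in for the SIMD
         * registers described in the abstract. */

        /* Per-lane signed greater-than compare: returns a mask word whose lanes
         * are all-ones where a > b and all-zeros elsewhere. */
        static uint64_t simd_cmpgt16(uint64_t a, uint64_t b) {
            uint64_t mask = 0;
            for (int lane = 0; lane < 4; lane++) {
                int16_t av = (int16_t)(a >> (lane * 16));
                int16_t bv = (int16_t)(b >> (lane * 16));
                if (av > bv)
                    mask |= (uint64_t)0xFFFF << (lane * 16);
            }
            return mask;
        }

        /* Conditional move: lanes of 'src_true' are taken where the mask lane is
         * set, lanes of 'src_false' where it is clear. */
        static uint64_t simd_cmov(uint64_t mask, uint64_t src_true, uint64_t src_false) {
            return (src_true & mask) | (src_false & ~mask);
        }

        int main(void) {
            uint64_t a = 0x0005000300080001ULL;    /* lanes: 5, 3, 8, 1 */
            uint64_t b = 0x0004000600020009ULL;    /* lanes: 4, 6, 2, 9 */
            uint64_t mask = simd_cmpgt16(a, b);    /* mask produced by the compare */
            uint64_t max  = simd_cmov(mask, a, b); /* per-lane maximum via conditional move */
            printf("mask=%016llx max=%016llx\n",
                   (unsigned long long)mask, (unsigned long long)max);
            return 0;
        }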

    METHOD AND APPARATUS FOR DETERMINING CACHE STORAGE LOCATIONS BASED ON LATENCY REQUIREMENTS
    2.
    Published patent application (in force)

    Publication number: US20100299482A1

    Publication date: 2010-11-25

    Application number: US12470639

    Filing date: 2009-05-22

    IPC classification: G06F12/08 G06F12/00 G06F12/10

    Abstract: A method for determining whether to store binary information in a fast way or a slow way of a cache is disclosed. The method includes receiving a block of binary information to be stored in a cache memory having a plurality of ways. The plurality of ways includes a first subset of ways and a second subset of ways, wherein a cache access by a first execution core from one of the first subset of ways has a lower latency time than a cache access from one of the second subset of ways. The method further includes determining, based on a predetermined access latency and one or more parameters associated with the block of binary information, whether to store the block of binary information into one of the first subset of ways or one of the second subset of ways.

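    As a rough illustration of the placement decision in the abstract, the C sketch below chooses between a low-latency and a higher-latency subset of ways using a predetermined fast-way latency and two assumed per-block parameters; the structure fields, the constant FAST_WAY_LATENCY_CYCLES, and the decision rule are assumptions, not the claimed method.

        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        #define FAST_WAY_LATENCY_CYCLES 3   /* assumed latency of the low-latency ways */

        typedef struct {
            uint64_t address;
            unsigned required_latency_cycles;  /* assumed latency budget for this block */
            bool     requested_by_near_core;   /* assumed parameter tied to the requesting core */
        } block_info;

        typedef enum { WAY_SUBSET_FAST, WAY_SUBSET_SLOW } way_subset;

        /* Decide whether an incoming block is stored into the low-latency ("fast")
         * subset of ways or the higher-latency ("slow") subset, based on a
         * predetermined access latency and parameters associated with the block. */
        static way_subset choose_way_subset(const block_info *blk) {
            if (blk->requested_by_near_core &&
                blk->required_latency_cycles <= FAST_WAY_LATENCY_CYCLES)
                return WAY_SUBSET_FAST;
            return WAY_SUBSET_SLOW;
        }

        int main(void) {
            block_info hot  = { 0x1000, 2, true  };   /* tight budget, near core: fast ways */
            block_info cold = { 0x2000, 20, false };  /* relaxed budget: slow ways */
            printf("hot -> %s\n",  choose_way_subset(&hot)  == WAY_SUBSET_FAST ? "fast" : "slow");
            printf("cold -> %s\n", choose_way_subset(&cold) == WAY_SUBSET_FAST ? "fast" : "slow");
            return 0;
        }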

    Method and apparatus for measuring performance during speculative execution
    3.
    Granted patent (in force)

    Publication number: US07757068B2

    Publication date: 2010-07-13

    Application number: US11654270

    Filing date: 2007-01-16

    IPC classification: G06F7/38

    Abstract: One embodiment of the present invention provides a system for measuring processor performance during speculative execution. The system starts by executing instructions in normal-execution mode. The system then enters a speculative-execution episode, wherein instructions are speculatively executed without being committed to the architectural state of the processor. On entering the speculative-execution episode, the system enables a speculative-execution monitor. The system then uses the monitor to track instructions during the episode and to record data values relating to it. Upon returning to normal-execution mode, the system disables the monitor. The data values recorded by the monitor facilitate measuring processor performance during speculative execution.

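    The following C sketch models the enable/record/disable protocol described in the abstract as a small state machine; the spec_monitor structure and its fields are assumptions made for illustration.

        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        typedef struct {
            bool     enabled;
            uint64_t speculative_instructions;  /* data recorded during the episode */
            uint64_t deferred_loads;
        } spec_monitor;

        static void enter_speculative_episode(spec_monitor *m) {
            /* On entry to speculative execution, enable the monitor. */
            m->enabled = true;
            m->speculative_instructions = 0;
            m->deferred_loads = 0;
        }

        static void monitor_instruction(spec_monitor *m, bool is_deferred_load) {
            /* While speculating, record data values about executed instructions. */
            if (!m->enabled) return;
            m->speculative_instructions++;
            if (is_deferred_load) m->deferred_loads++;
        }

        static void return_to_normal_execution(spec_monitor *m) {
            /* On return to normal-execution mode, disable the monitor; the
             * recorded values remain available for performance measurement. */
            m->enabled = false;
        }

        int main(void) {
            spec_monitor m = {0};
            enter_speculative_episode(&m);
            monitor_instruction(&m, false);
            monitor_instruction(&m, true);
            return_to_normal_execution(&m);
            printf("speculative instructions: %llu, deferred loads: %llu\n",
                   (unsigned long long)m.speculative_instructions,
                   (unsigned long long)m.deferred_loads);
            return 0;
        }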

    Mechanism for hardware tracking of return address after tail call elimination of return-type instruction
    4.
    Granted patent (in force)

    Publication number: US07610474B2

    Publication date: 2009-10-27

    Application number: US11352147

    Filing date: 2006-02-10

    IPC classification: G06F9/00

    Abstract: A technique maintains return address stack (RAS) content and the alignment of the RAS top-of-stack (TOS) pointer upon detection of tail-call elimination of a return-type instruction. In at least one embodiment of the invention, an apparatus includes a processor pipeline and at least a first return address stack for maintaining a stack of return addresses associated with instruction flow at a first stage of the processor pipeline. The processor pipeline is configured to leave the first return address stack unchanged in response to detection of a tail-call elimination sequence of one or more instructions associated with a first call-type instruction encountered by the first stage; otherwise, the pipeline pushes the return address associated with the first call-type instruction onto the first return address stack.

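    A toy C model of the RAS behavior described above is sketched below: an ordinary call pushes its return address, while a call recognized as part of a tail-call-elimination sequence leaves the stack and its TOS pointer untouched. The fixed depth, the boolean detection flag, and the wrap-around indexing are assumptions made for the sketch.

        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        #define RAS_DEPTH 8

        typedef struct {
            uint64_t entries[RAS_DEPTH];
            unsigned tos;  /* top-of-stack pointer */
        } ras_t;

        static void ras_on_call(ras_t *ras, uint64_t return_addr,
                                bool tail_call_eliminated) {
            if (tail_call_eliminated) {
                /* Tail-call elimination: the callee returns directly to the
                 * caller's caller, so the RAS is left unchanged and the TOS
                 * pointer stays aligned with the architectural call depth. */
                return;
            }
            /* Ordinary call: push the return address. */
            ras->tos = (ras->tos + 1) % RAS_DEPTH;
            ras->entries[ras->tos] = return_addr;
        }

        static uint64_t ras_on_return(ras_t *ras) {
            /* Pop the predicted return target. */
            uint64_t target = ras->entries[ras->tos];
            ras->tos = (ras->tos + RAS_DEPTH - 1) % RAS_DEPTH;
            return target;
        }

        int main(void) {
            ras_t ras = { .tos = 0 };
            ras_on_call(&ras, 0x1004, false);  /* normal call: pushed */
            ras_on_call(&ras, 0x2008, true);   /* tail call eliminated: not pushed */
            printf("predicted return: 0x%llx\n",
                   (unsigned long long)ras_on_return(&ras));  /* prints 0x1004 */
            return 0;
        }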

    FACILITATING TRANSACTIONAL EXECUTION IN A PROCESSOR THAT SUPPORTS SIMULTANEOUS SPECULATIVE THREADING
    5.
    Published patent application (in force)

    Publication number: US20090254905A1

    Publication date: 2009-10-08

    Application number: US12061554

    Filing date: 2008-04-02

    IPC classification: G06F9/46

    Abstract: Embodiments of the present invention provide a system that executes a transaction on a simultaneous speculative threading (SST) processor. In these embodiments, the processor includes a primary strand and a subordinate strand. Upon encountering a transaction with the primary strand while executing instructions non-transactionally, the processor checkpoints the primary strand and executes the transaction with the primary strand while continuing to non-transactionally execute deferred instructions with the subordinate strand. When the subordinate strand non-transactionally accesses a cache line during the transaction, the processor updates a record for the cache line to indicate a first strand ID. When the primary strand transactionally accesses a cache line during the transaction, the processor updates a record for the cache line to indicate a second strand ID.

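    A minimal sketch of the per-cache-line bookkeeping mentioned in the abstract follows, assuming a simple record holding a strand ID and a transactional bit; the numeric strand IDs and the record layout are illustrative assumptions.

        #include <stdint.h>
        #include <stdio.h>

        typedef struct {
            uint8_t strand_id;       /* which strand last touched the line */
            uint8_t transactional;   /* 1 if the access was part of the transaction */
        } line_record;

        enum { SUBORDINATE_STRAND_ID = 1, PRIMARY_STRAND_ID = 2 };

        /* Subordinate strand: non-transactional access during the transaction. */
        static void subordinate_access(line_record *rec) {
            rec->strand_id = SUBORDINATE_STRAND_ID;
            rec->transactional = 0;
        }

        /* Primary strand: transactional access during the transaction. */
        static void primary_transactional_access(line_record *rec) {
            rec->strand_id = PRIMARY_STRAND_ID;
            rec->transactional = 1;
        }

        int main(void) {
            line_record rec = {0};
            subordinate_access(&rec);
            primary_transactional_access(&rec);
            printf("strand=%u transactional=%u\n", rec.strand_id, rec.transactional);
            return 0;
        }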

    Circuitry and method for accessing an associative cache with parallel determination of data and data availability
    6.
    Granted patent (in force)

    Publication number: US07461208B1

    Publication date: 2008-12-02

    Application number: US11155147

    Filing date: 2005-06-16

    IPC classification: G06F13/16

    Abstract: A circuit for accessing an associative cache is provided. The circuit includes data selection circuitry and an outcome parallel processing circuit, both in communication with the associative cache. The outcome parallel processing circuit is configured to determine whether an access to data in the associative cache is a cache hit, a cache miss, or a cache mispredict. The circuit further includes a memory in communication with the data selection circuitry and the outcome parallel processing circuit. The memory is configured to store a bank select table whose entries define a selection of one of a plurality of banks of the associative cache from which to output data. Methods for accessing the associative cache are also described.

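    The C sketch below illustrates the idea of reading data from a predicted bank while the outcome (hit, miss, or mispredict) is resolved in parallel from the tag comparisons, with a mispredict also correcting the bank select table entry. The two-way organization, the index functions, and the table sizes are assumptions made for the sketch.

        #include <stdint.h>
        #include <stdio.h>

        #define NUM_SETS  16
        #define NUM_WAYS  2

        typedef enum { CACHE_HIT, CACHE_MISS, CACHE_MISPREDICT } outcome;

        static uint64_t tags[NUM_SETS][NUM_WAYS];  /* tag array */
        static uint8_t  bank_select[NUM_SETS];     /* bank select table: predicted way */

        static unsigned set_index(uint64_t addr) { return (addr >> 6) & (NUM_SETS - 1); }
        static uint64_t tag_bits(uint64_t addr)  { return addr >> 10; }

        /* Data is read from the predicted bank while the outcome is determined
         * in parallel from the tag comparisons. */
        static outcome cache_access(uint64_t addr, unsigned *supplied_way) {
            unsigned set = set_index(addr);
            unsigned predicted = bank_select[set];
            *supplied_way = predicted;                 /* data selection path */

            /* Outcome path: compare the tags of all ways. */
            if (tags[set][predicted] == tag_bits(addr))
                return CACHE_HIT;                      /* predicted bank held the data */
            for (unsigned w = 0; w < NUM_WAYS; w++)
                if (tags[set][w] == tag_bits(addr)) {
                    bank_select[set] = (uint8_t)w;     /* correct the prediction */
                    return CACHE_MISPREDICT;           /* data present, wrong bank chosen */
                }
            return CACHE_MISS;                         /* data not in the cache */
        }

        int main(void) {
            tags[set_index(0x4000)][1] = tag_bits(0x4000);
            unsigned way;
            printf("outcome=%d\n", cache_access(0x4000, &way));  /* mispredict: data in way 1 */
            printf("outcome=%d\n", cache_access(0x4000, &way));  /* hit after table update */
            return 0;
        }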

    Method for graphically displaying hardware performance simulators
    7.
    Granted patent (in force)

    Publication number: US07331039B1

    Publication date: 2008-02-12

    Application number: US10688763

    Filing date: 2003-10-15

    IPC classification: G06F9/44 G06F3/048

    Abstract: A method for graphically tracking the progression of instructions through hardware components is provided. Instructions of a code segment are represented by graphical icons, where each graphical icon has a displayable appearance that identifies the type of instruction. The method tracks each graphical icon when simulating execution of the code segment through the hardware components, and then displays the progression of each graphical icon through the hardware components during execution of the code segment.

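    As a rough text-mode stand-in for the graphical display, the C sketch below assigns each instruction an icon character by type and prints its position among the hardware components each simulated cycle; the stage names, the icons, and the one-stage-per-cycle advance are assumptions.

        #include <stdio.h>

        enum { FETCH, DECODE, EXECUTE, RETIRE, NUM_STAGES };
        static const char *stage_names[NUM_STAGES] = { "fetch", "decode", "execute", "retire" };

        typedef struct {
            char icon;   /* displayable appearance identifying the instruction type */
            int  stage;  /* current hardware component */
        } tracked_insn;

        /* Print one line per cycle showing which icons sit in which component. */
        static void display(const tracked_insn *insns, int n, int cycle) {
            printf("cycle %d:", cycle);
            for (int s = 0; s < NUM_STAGES; s++) {
                printf("  %s[", stage_names[s]);
                for (int i = 0; i < n; i++)
                    if (insns[i].stage == s) putchar(insns[i].icon);
                printf("]");
            }
            putchar('\n');
        }

        int main(void) {
            /* 'L' = load, 'A' = ALU op, 'B' = branch (icons chosen arbitrarily). */
            tracked_insn insns[] = { { 'L', FETCH }, { 'A', FETCH }, { 'B', FETCH } };
            int n = 3;
            for (int cycle = 0; cycle < 5; cycle++) {
                display(insns, n, cycle);
                for (int i = 0; i < n; i++)          /* advance each icon one component */
                    if (insns[i].stage < RETIRE) insns[i].stage++;
            }
            return 0;
        }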

    Anti-prefetch instruction
    8.
    Granted patent (in force)

    Publication number: US08732438B2

    Publication date: 2014-05-20

    Application number: US12104159

    Filing date: 2008-04-16

    IPC classification: G06F9/30

    Abstract: Embodiments of the present invention execute an anti-prefetch instruction. These embodiments start by decoding instructions in a decode unit in a processor to prepare the instructions for execution. Upon decoding an anti-prefetch instruction, these embodiments stall the decode unit to prevent decoding subsequent instructions. These embodiments then execute the anti-prefetch instruction, wherein executing the anti-prefetch instruction involves: (1) sending a prefetch request for a cache line in an L1 cache; (2) determining if the prefetch request hits in the L1 cache; (3) if the prefetch request hits in the L1 cache, determining if the cache line contains a predetermined value; and (4) conditionally performing subsequent operations based on whether the prefetch request hits in the L1 cache or on the value of the data in the cache line.

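    The steps enumerated in the abstract are sketched below as a small C function; the cache-probe helper l1_probe is a stand-in, and the return convention (whether decode may resume) is an assumption rather than the patented mechanism.

        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        /* Stand-in for "send a prefetch request and see whether it hits". */
        static bool l1_probe(uint64_t addr, uint64_t *line_value) {
            (void)addr;
            *line_value = 0;      /* pretend the line is cached and holds zero */
            return true;
        }

        /* Returns true if the decode unit may resume issuing subsequent
         * instructions, false if the pipeline should keep waiting. */
        static bool execute_anti_prefetch(uint64_t addr, uint64_t expected_value) {
            /* 1) the decode unit is stalled while the anti-prefetch executes
             * 2) issue a prefetch request for the line in the L1 cache */
            uint64_t value;
            bool hit = l1_probe(addr, &value);

            /* 3) if it hits, check whether the line holds the predetermined value
             * 4) perform the subsequent operations conditionally on the outcome */
            if (hit && value == expected_value)
                return true;   /* condition satisfied: resume decode */
            return false;      /* miss or unexpected value: hold off */
        }

        int main(void) {
            bool resume = execute_anti_prefetch(0x1000, 0);
            printf("resume decode: %s\n", resume ? "yes" : "no");
            return 0;
        }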

    Facilitating transactional execution in a processor that supports simultaneous speculative threading
    9.
    Granted patent (in force)

    Publication number: US08316366B2

    Publication date: 2012-11-20

    Application number: US12061554

    Filing date: 2008-04-02

    IPC classification: G06F9/46

    Abstract: Embodiments of the present invention provide a system that executes a transaction on a simultaneous speculative threading (SST) processor. In these embodiments, the processor includes a primary strand and a subordinate strand. Upon encountering a transaction with the primary strand while executing instructions non-transactionally, the processor checkpoints the primary strand and executes the transaction with the primary strand while continuing to non-transactionally execute deferred instructions with the subordinate strand. When the subordinate strand non-transactionally accesses a cache line during the transaction, the processor updates a record for the cache line to indicate a first strand ID. When the primary strand transactionally accesses a cache line during the transaction, the processor updates a record for the cache line to indicate a second strand ID.


    Method and apparatus for determining cache storage locations based on latency requirements
    10.
    Granted patent (in force)

    Publication number: US08065485B2

    Publication date: 2011-11-22

    Application number: US12470639

    Filing date: 2009-05-22

    IPC classification: G06F12/00 G06F9/26

    Abstract: A method for determining whether to store binary information in a fast way or a slow way of a cache is disclosed. The method includes receiving a block of binary information to be stored in a cache memory having a plurality of ways. The plurality of ways includes a first subset of ways and a second subset of ways, wherein a cache access by a first execution core from one of the first subset of ways has a lower latency time than a cache access from one of the second subset of ways. The method further includes determining, based on a predetermined access latency and one or more parameters associated with the block of binary information, whether to store the block of binary information into one of the first subset of ways or one of the second subset of ways.
