Method, system, and computer program product for out of order instruction address stride prefetch performance verification
    31.
    Granted patent (in force)

    Publication No.: US07996203B2

    Publication date: 2011-08-09

    Application No.: US12023457

    Filing date: 2008-01-31

    IPC classes: G06F9/44 G06F13/10 G06F13/12

    Abstract: A method, system, and computer program product are provided for verifying out-of-order instruction address (IA) stride prefetch performance in a processor design having more than one level of cache hierarchy. Multiple instruction streams are generated, with the instructions looping back to their corresponding instruction addresses. The instruction streams are dispatched to a processor and to a simulation application for processing. As each instruction is dispatched, its instruction address and operand address are recorded in a queue. The processor is monitored to determine whether it executes fetch and prefetch commands in accordance with the simulation application, and a check is made that prefetch commands are issued for instructions having three or more strides.

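The verification flow described in the abstract can be sketched in simulation. The following is a minimal, hypothetical model (all names are illustrative, not from the patent): it replays a queue of (instruction address, operand address) pairs and reports the instruction addresses that exhibit three or more identical strides, i.e. the cases where the checker would expect the processor to have issued prefetch commands.

```python
from collections import defaultdict

def expected_prefetch_ias(trace, min_strides=3):
    """Replay a queue of (instruction_address, operand_address) pairs and
    return the set of instruction addresses whose operand addresses show a
    run of `min_strides` or more identical, non-zero strides -- the cases
    where the checker expects a prefetch command to have been issued."""
    operands = defaultdict(list)            # IA -> operand addresses, in order
    for ia, oa in trace:
        operands[ia].append(oa)

    expected = set()
    for ia, addrs in operands.items():
        strides = [b - a for a, b in zip(addrs, addrs[1:])]
        run = 0
        for i, s in enumerate(strides):
            if s != 0 and (i == 0 or s == strides[i - 1]):
                run += 1
            else:
                run = 1 if s != 0 else 0
            if run >= min_strides:
                expected.add(ia)
                break
    return expected
```

For example, an instruction at IA 0x10 that touches operands 0x1000, 0x1040, 0x1080, 0x10C0 shows three equal strides of 0x40 and is flagged, while an IA whose operand addresses wander is not.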

    METHOD, SYSTEM, AND COMPUTER PROGRAM PRODUCT FOR SELECTIVELY ACCELERATING EARLY INSTRUCTION PROCESSING
    32.
    Patent application (lapsed)

    Publication No.: US20090217005A1

    Publication date: 2009-08-27

    Application No.: US12037861

    Filing date: 2008-02-26

    IPC classes: G06F9/30

    CPC classes: G06F9/3826 G06F9/3836

    Abstract: A method for selectively accelerating early instruction processing includes receiving instruction data that is normally processed in an execution stage of a processor pipeline, where the configuration of the instruction data allows its processing to be accelerated from the execution stage to an address generation stage that occurs earlier in the pipeline. The method determines whether the instruction data can be dispatched to the address generation stage without being delayed by the unavailability of a processing resource needed for its processing there. If it can, the instruction data is dispatched to be processed in the address generation stage; if it cannot, it is dispatched to be processed in the execution stage. The processing of the instruction data is thereby selectively accelerated using an address generation interlock scheme. A corresponding system and computer program product are also provided.

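The dispatch decision can be illustrated with a small, hypothetical scheduler model (the field names and cycle bookkeeping below are assumptions for illustration, not the patented design): an instruction whose configuration permits early processing goes to the address generation stage only when that stage's resources are free in the current cycle; otherwise it falls back to the execution stage.

```python
def dispatch(instr, agen_free_at, exec_free_at, cycle):
    """Pick the stage that processes `instr` without an early-dispatch stall.

    `instr` is a dict with an 'agen_capable' flag (its configuration allows
    acceleration to the address generation stage).  `agen_free_at` and
    `exec_free_at` are the cycles at which each stage's resources become
    available.  Returns (stage_name, start_cycle).
    """
    if instr.get("agen_capable") and agen_free_at <= cycle:
        return "agen", cycle                     # accelerate: no delay in agen
    return "exec", max(cycle, exec_free_at)      # fall back to execution stage
```

An instruction that could be accelerated but would stall on a busy address generation resource is deliberately sent to the (later) execution stage instead, which is the selectivity the abstract describes.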

    Disowning cache entries on aging out of the entry
    33.
    Granted patent (lapsed)

    Publication No.: US07577795B2

    Publication date: 2009-08-18

    Application No.: US11339196

    Filing date: 2006-01-25

    IPC classes: G06F12/00

    Abstract: Portions of data in a processor system are stored in a slower main memory and transferred to a faster memory comprising a hierarchy of cache structures between one or more processors and the main memory. In a system with L2 cache(s) shared between the processor(s) and the main memory, an individual L1 cache of a processor must first communicate with, or check with, its associated L2 cache(s) to obtain a copy of a particular line from a given cache location prior to, or upon, modification or appropriation of data at that location. The L1 cache further includes provisions for notifying the L2 cache(s) upon determining that the data stored in a particular cache line in the L1 has been replaced. When a cache line is disowned by an L1 cache, the L2 cache is updated to change the state of that line from exclusive to a particular identified CPU to exclusive to no CPU, thereby reducing cross-interrogate delays when another processor acquires the same cache line.

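A toy model of the disown-on-age-out protocol (class and method names are illustrative, not from the patent): the L1 evicts its least recently used line and notifies the L2 directory, which downgrades the line from exclusive-to-that-CPU to exclusive-to-no-CPU, so a later acquisition by another processor needs no cross interrogate.

```python
from collections import OrderedDict

class L2Directory:
    def __init__(self):
        self.owner = {}                          # line -> owning CPU id or None

    def grant_exclusive(self, line, cpu):
        self.owner[line] = cpu

    def disown(self, line, cpu):
        """L1 aged the line out: exclusive-to-cpu becomes exclusive-to-none."""
        if self.owner.get(line) == cpu:
            self.owner[line] = None

    def acquire(self, line, cpu):
        """Grant `cpu` exclusivity; return True if a cross interrogate of a
        previous owner was required first."""
        xi_needed = self.owner.get(line) not in (None, cpu)
        self.owner[line] = cpu
        return xi_needed

class L1Cache:
    def __init__(self, capacity, cpu, l2):
        self.lines = OrderedDict()               # access order doubles as LRU order
        self.capacity, self.cpu, self.l2 = capacity, cpu, l2

    def access(self, line):
        if line in self.lines:
            self.lines.move_to_end(line)
            return
        if len(self.lines) >= self.capacity:
            victim, _ = self.lines.popitem(last=False)   # age out the LRU line
            self.l2.disown(victim, self.cpu)             # notify the L2
        self.l2.grant_exclusive(line, self.cpu)
        self.lines[line] = True
```

After CPU 0's L1 ages a line out, another CPU can acquire it without interrogating CPU 0; a line still held exclusive by CPU 0 would require the cross interrogate.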

    System and method for simultaneous access of the same line in cache storage
    34.
    Granted patent (in force)

    Publication No.: US07035986B2

    Publication date: 2006-04-25

    Application No.: US10435967

    Filing date: 2003-05-12

    IPC classes: G06F12/00

    Abstract: An embodiment of the invention is a processor that provides simultaneous access to the same data for a plurality of requests. The processor includes cache storage having an address-sliced directory lookup structure. A same-line detection unit receives a plurality of first instruction fields and a plurality of second instruction fields and generates a same-line signal in response to those fields. In response to the same-line signal, the cache storage simultaneously reads data from a single line in the cache storage.

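The same-line decision reduces to comparing the line-index portion of two requests' addresses. A minimal sketch, assuming a byte-addressed cache (the line size and the read-counting helper are illustrative assumptions, not from the patent):

```python
LINE_SIZE = 256          # bytes per cache line; illustrative, not from the patent

def same_line(addr_a, addr_b, line_size=LINE_SIZE):
    """True when both operand addresses fall within one cache line."""
    return addr_a // line_size == addr_b // line_size

def array_reads_needed(addr_a, addr_b, line_size=LINE_SIZE):
    """Two simultaneous requests to the same line are served by a single
    read of that line; otherwise each request needs its own array read."""
    return 1 if same_line(addr_a, addr_b, line_size) else 2
```

When the detection unit asserts the same-line signal, one read of the cache array satisfies both requests in the same cycle, which is the access saving the abstract describes.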

    AUTOMATIC PATTERN-BASED OPERAND PREFETCHING
    37.
    Patent application (in force)

    Publication No.: US20130339617A1

    Publication date: 2013-12-19

    Application No.: US13523922

    Filing date: 2012-06-15

    IPC classes: G06F12/12

    Abstract: Embodiments relate to automatic pattern-based operand prefetching. An aspect includes receiving, by prefetch logic in a processor, an operand cache miss from a pipeline of the processor. Another aspect includes determining, based on the instruction address of the operand cache miss, that an entry corresponding to the miss exists in a history table. Yet another aspect includes, based on determining that the entry exists in the history table, issuing a prefetch instruction for a second operand based on the determined entry and writing the determined entry into a miss buffer.

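A compact, hypothetical model of the history-table flow (the single-stride predictor and all names are assumptions for illustration; the actual design may track richer patterns): on an operand cache miss, the table is probed by instruction address; a hit yields a prefetch for the predicted next operand, and the entry is written into a miss buffer.

```python
class OperandPrefetcher:
    def __init__(self):
        self.history = {}        # instruction address -> last missing operand address
        self.miss_buffer = []    # entries written on a history-table hit

    def on_operand_miss(self, ia, operand_addr):
        """Handle an operand cache miss at instruction address `ia`.
        Returns the address to prefetch, or None on a history-table miss."""
        prefetch_addr = None
        if ia in self.history:
            stride = operand_addr - self.history[ia]
            prefetch_addr = operand_addr + stride     # predict the next operand
            self.miss_buffer.append((ia, operand_addr, stride))
        self.history[ia] = operand_addr
        return prefetch_addr
```

The first miss at an instruction address only trains the table; subsequent misses at the same address produce a prefetch one stride ahead of the current operand.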

    PROCESS IDENTIFIER-BASED CACHE DATA TRANSFER
    38.
    Patent application (in force)

    Publication No.: US20130332670A1

    Publication date: 2013-12-12

    Application No.: US13493636

    Filing date: 2012-06-11

    IPC classes: G06F12/08

    Abstract: Embodiments of the invention relate to process identifier (PID) based cache information transfer. An aspect of the invention includes sending, by a first core of a processor, a PID associated with a cache miss in a first local cache of the first core to a second cache of the processor. Another aspect includes determining that the PID associated with the cache miss is listed in a PID table of the second cache. Based on the PID being listed in the PID table, a plurality of entries in a cache directory of the second cache that are associated with the PID is determined, and the cache information associated with each of the determined entries is pushed from the second cache to the first local cache.

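The push-on-miss flow can be sketched as follows (the class names and the flat dictionaries standing in for the PID table and cache directory are illustrative assumptions): an L1 miss carries its PID to the shared cache; if the PID is listed in that cache's PID table, every directory entry tagged with the PID is pushed down to the requesting local cache.

```python
class SharedL2:
    def __init__(self):
        self.pid_table = set()       # PIDs with lines resident in this cache
        self.directory = {}          # line address -> (pid, data)

    def fill(self, line, pid, data):
        self.pid_table.add(pid)
        self.directory[line] = (pid, data)

    def entries_for_pid(self, pid):
        """Called on a local-cache miss tagged with `pid`: return every
        directory entry owned by that process, to push to the requester."""
        if pid not in self.pid_table:
            return {}
        return {line: data
                for line, (p, data) in self.directory.items() if p == pid}

class CoreL1:
    def __init__(self, l2):
        self.l2, self.lines = l2, {}

    def miss(self, line, pid):
        """Send the PID upward; accept whatever the shared cache pushes back."""
        self.lines.update(self.l2.entries_for_pid(pid))
        return line in self.lines
```

A single miss by a process thus warms the local cache with all of that process's lines held in the shared cache, while lines belonging to other PIDs stay put.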