Techniques for multi-level indirect data prefetching
    71.
    Granted Patent — In Force

    Publication No.: US08161265B2

    Publication Date: 2012-04-17

    Application No.: US12024260

    Filing Date: 2008-02-01

    IPC Class: G06F13/00

    Abstract: A technique for performing data prefetching using multi-level indirect data prefetching includes determining a first memory address of a pointer associated with a data prefetch instruction. Content that is included in a first data block (e.g., a first cache line of a memory) at the first memory address is then fetched. A second memory address is then determined based on the content at the first memory address. Content that is included in a second data block (e.g., a second cache line) at the second memory address is then fetched (e.g., from the memory or another memory). A third memory address is then determined based on the content at the second memory address. Finally, a third data block (e.g., a third cache line) that includes another pointer or data at the third memory address is fetched (e.g., from the memory or the other memory).

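The pointer-chasing walk the abstract describes can be sketched as follows, using a Python dict as a toy byte-addressable memory. The function name `prefetch_indirect`, the 64-byte line size, and the memory layout are illustrative assumptions, not details from the patent.

```python
# Minimal sketch of a multi-level indirect prefetch walk: the content fetched
# at each address becomes the next address to fetch, for a fixed number of
# indirection levels. A dict stands in for memory; a real prefetch engine
# would fetch whole cache lines from the memory hierarchy.

CACHE_LINE = 64  # assumed cache-line size in bytes

def line_of(addr):
    """Return the base address of the cache line containing `addr`."""
    return addr - (addr % CACHE_LINE)

def prefetch_indirect(memory, pointer_addr, levels=2):
    """Chase `levels` pointers starting at `pointer_addr`, recording each
    cache line a prefetch engine would fetch along the way."""
    fetched_lines = []
    addr = pointer_addr
    for _ in range(levels + 1):
        fetched_lines.append(line_of(addr))   # fetch the block holding `addr`
        if addr not in memory:
            break                             # nothing further to chase
        addr = memory[addr]                   # content becomes the next address
    return fetched_lines

# Toy memory: address 0x100 holds pointer 0x200, which holds pointer 0x300.
memory = {0x100: 0x200, 0x200: 0x300, 0x300: 42}
print(prefetch_indirect(memory, 0x100))  # three distinct lines are fetched
```

With `levels=2` this performs exactly the three fetches in the abstract: the pointer's line, the line it points to, and the line that second pointer points to.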

Techniques for indirect data prefetching
    72.
    Granted Patent — In Force

    Publication No.: US08161263B2

    Publication Date: 2012-04-17

    Application No.: US12024239

    Filing Date: 2008-02-01

    IPC Class: G06F13/00

    CPC Class: G06F12/0862 G06F2212/6028

    Abstract: A processor includes a first address translation engine, a second address translation engine, and a prefetch engine. The first address translation engine is configured to determine a first memory address of a pointer associated with a data prefetch instruction. The prefetch engine is coupled to the first translation engine and is configured to fetch content, included in a first data block (e.g., a first cache line) of a memory, at the first memory address. The second address translation engine is coupled to the prefetch engine and is configured to determine a second memory address based on the content of the memory at the first memory address. The prefetch engine is also configured to fetch (e.g., from the memory or another memory) a second data block (e.g., a second cache line) that includes data at the second memory address.

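The single-level flow — translate the pointer's address, fetch the pointer, translate its content, fetch the data — might be modeled as below. The flat page-table layout, page size, and all names are assumptions made for illustration.

```python
# Sketch of the two-translation indirect prefetch: the first "translation
# engine" resolves the pointer's effective address to a real address, the
# prefetch engine fetches the pointer, and the second "translation engine"
# resolves the fetched content into the real address of the target data.

PAGE = 4096  # assumed page size

def translate(page_table, effective_addr):
    """Toy effective-to-real address translation via one flat page table."""
    page, offset = divmod(effective_addr, PAGE)
    return page_table[page] * PAGE + offset

def indirect_prefetch(memory, page_table, pointer_ea):
    real1 = translate(page_table, pointer_ea)  # first translation engine
    content = memory[real1]                    # prefetch engine fetches pointer
    real2 = translate(page_table, content)     # second translation engine
    return memory[real2]                       # prefetch the pointed-to data

page_table = {0: 7, 1: 3}                # effective page -> real page
memory = {7 * PAGE + 16: 1 * PAGE + 8,   # pointer at EA 16 -> EA 0x1008
          3 * PAGE + 8: 99}              # the target data
print(indirect_prefetch(memory, page_table, 16))  # -> 99
```

Two separate translation engines matter in hardware because the second translation cannot start until the first fetch returns; modeling them as two `translate` calls keeps that dependency explicit.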

Instruction Set Architecture Extensions for Performing Power Versus Performance Tradeoffs
    73.
    Patent Application — Expired

    Publication No.: US20110296149A1

    Publication Date: 2011-12-01

    Application No.: US12788940

    Filing Date: 2010-05-27

    IPC Class: G06F9/318

    Abstract: Mechanisms are provided for processing an instruction in a processor of a data processing system. The mechanisms operate to receive, in a processor of the data processing system, an instruction that includes power/performance tradeoff information. Based on that information, the mechanisms determine power/performance tradeoff priorities or criteria specifying whether power conservation or performance is prioritized for execution of the instruction. The mechanisms then process the instruction in accordance with the priorities or criteria so identified.

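Decoding such a hint could look like the sketch below, where one assumed bit in the instruction word marks whether power conservation or performance is prioritized. The bit position and encoding are invented purely for illustration; the patent does not specify them here.

```python
# Toy decode of a power/performance tradeoff hint carried in the instruction
# word itself. A real core would act on the decoded mode, e.g. by clock-gating
# unused units ("power") or enabling full-width execution ("performance").

POWER_HINT_BIT = 1 << 31  # assumed position of the tradeoff hint bit

def decode_tradeoff(instruction_word):
    """Return the tradeoff priority encoded in the instruction."""
    return "power" if instruction_word & POWER_HINT_BIT else "performance"

def process(instruction_word):
    mode = decode_tradeoff(instruction_word)
    opcode = instruction_word & 0x3F      # assumed 6-bit opcode field
    return mode, opcode

print(process(POWER_HINT_BIT | 0x2A))  # -> ('power', 42)
print(process(0x2A))                   # -> ('performance', 42)
```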

Reducing Energy Consumption of Set Associative Caches by Reducing Checked Ways of the Set Association
    74.
    Patent Application — Expired

    Publication No.: US20110296112A1

    Publication Date: 2011-12-01

    Application No.: US12787122

    Filing Date: 2010-05-25

    IPC Class: G06F12/08 G06F12/00

    Abstract: Mechanisms for accessing a set associative cache of a data processing system are provided. A set of cache lines, in the set associative cache, associated with an address of a request are identified. Based on a determined mode of operation for the set, the following may be performed: determining if a cache hit occurs in a preferred cache line without accessing the other cache lines in the set; retrieving data from the preferred cache line without accessing the other cache lines, if it is determined that there is a cache hit in the preferred cache line; and accessing each of the other cache lines in the set, to determine whether any of them hits, only in response to a cache miss in the preferred cache line.

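The mode-dependent lookup can be sketched as below, counting how many ways are probed in each case — the quantity the energy saving comes from. The set representation, function name, and probe accounting are illustrative assumptions.

```python
# Toy model of reduced-way lookup in a set-associative cache: in the reduced
# mode only the preferred way is probed first, and the remaining ways are
# accessed only when the preferred way misses. In the conventional mode all
# ways are checked in parallel.

def lookup(cache_set, tag, preferred_way, reduced_mode):
    """Return (hit, ways_checked) for one lookup into a set of tags."""
    if reduced_mode:
        if cache_set[preferred_way] == tag:
            return True, 1               # hit in the preferred way: one probe
        others = [w for w in range(len(cache_set)) if w != preferred_way]
        for n, w in enumerate(others, start=2):
            if cache_set[w] == tag:
                return True, n           # late hit after extra probes
        return False, len(cache_set)     # full miss: every way was probed
    return tag in cache_set, len(cache_set)  # conventional parallel check

cache_set = ["A", "B", "C", "D"]         # tags resident in a 4-way set
print(lookup(cache_set, "A", preferred_way=0, reduced_mode=True))   # (True, 1)
print(lookup(cache_set, "C", preferred_way=0, reduced_mode=True))   # (True, 3)
print(lookup(cache_set, "A", preferred_way=0, reduced_mode=False))  # (True, 4)
```

The tradeoff is visible in the probe counts: reduced mode saves three of four probes on a preferred-way hit but adds latency when the hit lands in another way.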

CACHE DIRECTED SEQUENTIAL PREFETCH
    75.
    Patent Application — Expired

    Publication No.: US20110145509A1

    Publication Date: 2011-06-16

    Application No.: US13023615

    Filing Date: 2011-02-09

    IPC Class: G06F12/08

    CPC Class: G06F12/0862 G06F2212/6026

    Abstract: A technique for performing stream detection and prefetching within a cache memory simplifies stream detection and prefetching. A bit in a cache directory or cache entry indicates that a cache line has not been accessed since being prefetched, and another bit indicates the direction of a stream associated with the cache line. A next cache line is prefetched when a previously prefetched cache line is accessed, so that the cache always attempts to prefetch one cache line ahead of accesses, in the direction of a detected stream. Stream detection is performed in response to load misses tracked in the load miss queue (LMQ). The LMQ stores the offset of a first miss within a cache line. A subsequent miss to the same line sets a direction bit based on the difference between the first and second offsets and causes a prefetch of the next line for the stream.

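The LMQ-based detection step might be modeled as follows: the first miss to a line records its offset, and a second miss to the same line derives the stream direction from the sign of the offset difference and triggers a prefetch of the adjacent line. The dict representation of the LMQ and the 128-byte line size are assumptions.

```python
# Toy model of LMQ stream detection: two misses to the same cache line, at
# different offsets, establish a stream and its direction; the prefetch
# target is the next line in that direction.

LINE = 128  # assumed cache-line size in bytes

def record_miss(lmq, miss_addr):
    """Track a load miss; return an address to prefetch, or None."""
    line, offset = divmod(miss_addr, LINE)
    if line not in lmq:
        lmq[line] = offset               # first miss: remember its offset
        return None
    direction = 1 if offset > lmq[line] else -1  # second miss sets direction
    return (line + direction) * LINE     # prefetch the next line in the stream

lmq = {}
print(record_miss(lmq, 0x1000))          # first miss: no prefetch yet
print(hex(record_miss(lmq, 0x1008)))     # ascending stream: next line 0x1080
```

Once established, the abstract's "one line ahead" policy would keep issuing the next prefetch each time a previously prefetched line is first accessed.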

Compiler assisted victim cache bypassing
    76.
    Granted Patent — Expired

    Publication No.: US07761673B2

    Publication Date: 2010-07-20

    Application No.: US12355019

    Filing Date: 2009-01-16

    IPC Class: G06F12/00

    Abstract: A method for compiler assisted victim cache bypassing including: identifying a cache line as a candidate for victim cache bypassing; conveying the bypass-the-victim-cache information to hardware; and checking the state of the cache line to determine a modified state of the cache line. The cache line is identified for cache bypassing if it has no reuse within a loop or loop nest and no immediate across-loop reuse, or if its across-loop reuse distance is so large that it would be replaced from both the main and victim caches before being reused.

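The compile-time candidate test reads naturally as a predicate, sketched below. The function name, the `None` encoding for "never reused", and the capacity threshold are assumptions for illustration; a real compiler would derive reuse distances from dependence analysis.

```python
# Illustrative compile-time decision: a line bypasses the victim cache when
# it is not reused within its loop nest, or when its cross-loop reuse
# distance exceeds the combined main + victim capacity, so it would be
# evicted from both caches before the reuse anyway.

def should_bypass_victim_cache(reuse_in_loop, cross_loop_reuse_distance,
                               main_plus_victim_capacity):
    """Decide whether an evicted line should skip the victim cache."""
    if reuse_in_loop:
        return False                     # reused soon: keep it cached
    if cross_loop_reuse_distance is None:
        return True                      # never reused: bypassing is free
    # Reused, but only after enough intervening lines to flush both caches.
    return cross_loop_reuse_distance > main_plus_victim_capacity

print(should_bypass_victim_cache(False, None, 512))    # -> True
print(should_bypass_victim_cache(False, 10_000, 512))  # -> True
print(should_bypass_victim_cache(True, 4, 512))        # -> False
```

The payoff of such a hint is that lines with no realistic chance of a victim-cache hit stop displacing lines that do have one.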

Dynamic Adjustment of Prefetch Stream Priority
    77.
    Patent Application — In Force

    Publication No.: US20090198907A1

    Publication Date: 2009-08-06

    Application No.: US12024411

    Filing Date: 2008-02-01

    IPC Class: G06F12/08

    Abstract: A method, processor, and data processing system for dynamically adjusting a prefetch stream priority based on the rate at which the processor consumes the data. The method includes a prefetch engine issuing a prefetch request of a first prefetch stream to fetch one or more data from the memory subsystem. The first prefetch stream has a first assigned priority that determines a relative order for scheduling its prefetch requests relative to those of other prefetch streams. Based on receipt of a processor demand for the data before the data returns to the cache, or on return of the data long before the processor demand is received, logic of the prefetch engine dynamically changes the first assigned priority to a second, higher or lower, priority, which is subsequently used to schedule and issue the next prefetch request of the first prefetch stream.


DATA PROCESSING SYSTEM, PROCESSOR AND METHOD OF DATA PROCESSING HAVING IMPROVED BRANCH TARGET ADDRESS CACHE
    80.
    Patent Application — Expired

    Publication No.: US20090049286A1

    Publication Date: 2009-02-19

    Application No.: US11837893

    Filing Date: 2007-08-13

    IPC Class: G06F9/38

    CPC Class: G06F9/3804 G06F9/3844

    Abstract: A processor includes an execution unit and instruction sequencing logic that fetches instructions from a memory system for execution. The instruction sequencing logic includes branch logic that outputs predicted branch target addresses for use as instruction fetch addresses. The branch logic includes a level one branch target address cache (BTAC) and a level two BTAC, each having a respective plurality of entries, each entry associating at least a tag with a predicted branch target address. The branch logic accesses the level one and level two BTACs in parallel with a tag portion of a first instruction fetch address, obtaining a first predicted branch target address from the level one BTAC for use as a second instruction fetch address in a first processor clock cycle, and a second predicted branch target address from the level two BTAC for use as a third instruction fetch address in a later, second processor clock cycle.

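The parallel two-level lookup might be modeled as below: both BTAC levels are probed with the fetch address's tag, an L1 hit supplying the next fetch address immediately and an L2 hit supplying one for a later cycle. The dict-based tables and the tag extraction are illustrative assumptions only.

```python
# Toy two-level BTAC lookup: both levels are probed in parallel with the same
# tag. The L1 prediction is available one cycle sooner than the L2 prediction,
# which the sequencing logic would consume in a later clock cycle.

def btac_lookup(l1_btac, l2_btac, fetch_addr):
    """Return (next_fetch_addr, later_fetch_addr); None marks a miss."""
    tag = fetch_addr >> 4                # assumed tag extraction
    return l1_btac.get(tag), l2_btac.get(tag)

l1 = {0x10: 0x2000}                      # tag -> predicted branch target
l2 = {0x10: 0x3000, 0x20: 0x4000}
print(btac_lookup(l1, l2, 0x100))        # both levels hit
print(btac_lookup(l1, l2, 0x200))        # only the larger L2 hits
```

Keeping a small fast L1 BTAC beside a larger, slower L2 lets frequent branches redirect fetch without a bubble while still covering a bigger branch footprint.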