Cache updating in multiprocessor systems
    1.
    Type: Granted invention patent
    Status: Lapsed

    Publication number: US06728842B2

    Publication date: 2004-04-27

    Application number: US10061859

    Filing date: 2002-02-01

    IPC classification: G06F12/00

    CPC classification: G06F12/0831

    Abstract: Embodiments are provided in which cache update is implemented by using a counter table having a plurality of entries to keep track of different modified cache lines of a cache of a processor. If a cache line of the cache is modified by the processor and the original content of the cache line came from a cache of another processor, a counter in the counter table restarts and reaches a predetermined value (e.g., overflows), triggering the broadcast of the modified cache line so that the cache of the other processor can snarf a copy of the modified cache line. As a result, when the other processor reads from a memory address matching that of the cache line, the cache of the other processor already has the most current copy for the matching memory address to feed the processor. Therefore, a cache read miss is avoided and system performance is improved.
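
Illustrative sketch (not part of the patent record): the counter-table idea in this abstract can be modelled in a few lines of Python. The names `CounterTable`, `THRESHOLD`, and `run_demo` are hypothetical, the threshold value is arbitrary, and the invalidation of the reader's copy on a write is an assumption that the abstract does not state.

```python
from dataclasses import dataclass, field

THRESHOLD = 4  # assumed "predetermined value"; the abstract names overflow as one example


@dataclass
class CounterTable:
    """One counter per modified line whose original content came from another cache."""
    threshold: int = THRESHOLD
    counters: dict = field(default_factory=dict)  # address -> current count

    def on_modify(self, address, came_from_other_cache):
        # Modifying such a line (re)starts its counter.
        if came_from_other_cache:
            self.counters[address] = 0

    def tick(self):
        """Advance all active counters and return the addresses due for broadcast."""
        due = []
        for address in list(self.counters):
            self.counters[address] += 1
            if self.counters[address] >= self.threshold:
                due.append(address)
                del self.counters[address]
        return due


def run_demo():
    writer_cache = {0x40: "old"}  # line originally supplied by the reader's cache
    reader_cache = {0x40: "old"}
    table = CounterTable()

    # The writer modifies the line; the reader's copy is assumed to be invalidated.
    writer_cache[0x40] = "new"
    reader_cache.pop(0x40)
    table.on_modify(0x40, came_from_other_cache=True)

    broadcasts = []
    for _ in range(THRESHOLD):
        for address in table.tick():
            reader_cache[address] = writer_cache[address]  # the reader snarfs the broadcast
            broadcasts.append(address)

    read_hit = reader_cache.get(0x40) == "new"
    return broadcasts, read_hit
```

In this model the reader's later access to address 0x40 finds the current data already in its cache, which corresponds to the avoided read miss.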

    Cache line use history based done bit modification to D-cache replacement scheme
    2.
    Type: Granted invention patent
    Status: Lapsed

    Publication number: US08429350B2

    Publication date: 2013-04-23

    Application number: US13405572

    Filing date: 2012-02-27

    Applicant: David A. Luick

    Inventor: David A. Luick

    IPC classification: G06F12/08

    Abstract: A method of providing history based done logic includes receiving a cache line in an L2 cache; determining if the cache line has a history of access at least three times on a previous call into the L2 cache; providing the cache line directly to a processor if the history of access was less than the at least three times; and loading the cache line into an L1 cache if the history of access was the at least three times.
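
Illustrative sketch (not part of the patent record): the routing decision in this abstract reduces to a single threshold test. The function name `route_line`, the dictionary used as an L1 cache, and the way the access count is passed in are all hypothetical; the abstract does not say how the history is stored.

```python
ACCESS_THRESHOLD = 3  # "at least three times" in the abstract


def route_line(address, data, prior_access_count, l1_cache):
    """Decide where a line arriving from L2 goes, based on its recorded access history.

    prior_access_count is the number of accesses recorded on the line's previous
    stay in the cache (the storage of that history is not modelled here).
    """
    if prior_access_count >= ACCESS_THRESHOLD:
        l1_cache[address] = data  # reused often enough before: give it an L1 slot
        return "l1"
    return "bypass"  # hand the data straight to the processor, leaving L1 untouched
```

Bypassing L1 for rarely reused lines keeps them from displacing lines with a better reuse record.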

    D-cache line use history based done bit based on successful prefetchable counter
    3.
    Type: Granted invention patent
    Status: In force

    Publication number: US08171224B2

    Publication date: 2012-05-01

    Application number: US12473328

    Filing date: 2009-05-28

    Applicant: David A. Luick

    Inventor: David A. Luick

    IPC classification: G06F12/08 G06F12/00

    CPC classification: G06F12/127 G06F12/0862

    Abstract: A method of providing history based done logic for a D-cache includes receiving a D-cache line in an L2 cache; determining if the D-cache line is unprefetchable; aging the D-cache line without a delay if the D-cache line is prefetchable; and aging the D-cache line with a delay if the D-cache line is unprefetchable.
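
Illustrative sketch (not part of the patent record): one way to read "aging with a delay" is that an unprefetchable line skips its first few aging steps, so it looks younger to the replacement policy. The `Line` class, the `AGING_DELAY` value, and the oldest-first `victim` policy are assumptions made for this example only.

```python
from dataclasses import dataclass, field

AGING_DELAY = 2  # hypothetical number of aging steps an unprefetchable line sits out


@dataclass
class Line:
    tag: str
    prefetchable: bool
    age: int = 0
    wait: int = field(init=False, default=0)

    def __post_init__(self):
        # Prefetchable lines age immediately; unprefetchable lines wait first.
        self.wait = 0 if self.prefetchable else AGING_DELAY

    def step(self):
        if self.wait > 0:
            self.wait -= 1
        else:
            self.age += 1


def victim(lines):
    """Pick the eviction candidate: the line with the largest age."""
    return max(lines, key=lambda line: line.age)
```

Under this reading, a line that can be prefetched again cheaply becomes an eviction candidate sooner than one that cannot.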

    Vector morphing mechanism for multiple processor cores
    4.
    Type: Granted invention patent
    Status: In force

    Publication number: US08135941B2

    Publication date: 2012-03-13

    Application number: US12233729

    Filing date: 2008-09-19

    Applicant: David A. Luick

    Inventor: David A. Luick

    IPC classification: G06F15/00 G06F15/76

    Abstract: One embodiment of the invention provides a processor. The processor generally includes a first and second processor core, each having a plurality of pipelined execution units for executing an issue group of multiple instructions and scheduling logic configured to issue a first issue group of instructions to the first processor core for execution and a second issue group of instructions to the second processor core for execution when the processor is in a first mode of operation and configured to issue one or more vector instructions for concurrent execution on the first and second processor cores when the processor is in a second mode of operation.
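
Illustrative sketch (not part of the patent record): the two modes of operation can be contrasted with two small functions. Both function names are hypothetical, and splitting the vector lanes evenly between the two cores is an assumption; the abstract only says the vector instruction executes concurrently on both cores.

```python
def issue_scalar(group_a, group_b):
    """First mode: each core receives its own issue group."""
    return {"core0": list(group_a), "core1": list(group_b)}


def issue_vector(op, a, b):
    """Second mode: one vector instruction is spread across both cores.

    The lanes are split in half here: core0 takes the lower half, core1 the upper.
    """
    half = len(a) // 2
    lower = [op(x, y) for x, y in zip(a[:half], b[:half])]  # lanes handled by core0
    upper = [op(x, y) for x, y in zip(a[half:], b[half:])]  # lanes handled by core1
    return lower + upper
```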

    3-dimensional L2/L3 cache array to hide translation (TLB) delays
    5.
    Type: Granted invention patent
    Status: In force

    Publication number: US08019968B2

    Publication date: 2011-09-13

    Application number: US12031006

    Filing date: 2008-02-14

    Applicant: David A. Luick

    Inventor: David A. Luick

    IPC classification: G06F12/00

    CPC classification: G06F12/1027 G06F12/0897

    Abstract: Embodiments of the invention provide a look-aside-look-aside buffer (LLB) configured to retain a portion of the real addresses in a translation look-aside buffer (TLB) to allow prefetching of data from a cache. A subset of real address bits associated with an effective address may be retrieved relatively quickly from the LLB, thereby allowing access to the cache before the complete address translation is available and reducing cache access latency.
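
Illustrative sketch (not part of the patent record): the arithmetic below shows why a few real-address bits are enough to start a cache access. The page size, line size, and set count are assumed values chosen for the example, and the function names are hypothetical.

```python
PAGE_BITS = 12   # 4 KiB pages (assumed)
LINE_BITS = 7    # 128-byte cache lines (assumed)
INDEX_BITS = 9   # 512 sets (assumed), so the set index spans address bits 7..15
# Real-address bits the index needs above the page offset: 4 with these sizes.
EXTRA_BITS = LINE_BITS + INDEX_BITS - PAGE_BITS


def llb_entry(real_page):
    """The LLB keeps only the low EXTRA_BITS bits of the real page number."""
    return real_page & ((1 << EXTRA_BITS) - 1)


def index_from_llb(extra_bits, effective_addr):
    """Form the set index from the page offset plus the partial real-address bits."""
    offset = effective_addr & ((1 << PAGE_BITS) - 1)
    partial_real = (extra_bits << PAGE_BITS) | offset
    return (partial_real >> LINE_BITS) & ((1 << INDEX_BITS) - 1)


def index_from_full_translation(real_page, effective_addr):
    """Reference: the set index computed from the complete real address."""
    offset = effective_addr & ((1 << PAGE_BITS) - 1)
    real_addr = (real_page << PAGE_BITS) | offset
    return (real_addr >> LINE_BITS) & ((1 << INDEX_BITS) - 1)
```

Because the two index functions agree, the cache set can be selected from the small LLB entry while the full translation is still in progress; the full real address is then needed only for the tag comparison.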

    I-CACHE LINE USE HISTORY BASED DONE BIT BASED ON SUCCESSFUL PREFETCHABLE COUNTER
    6.
    Type: Invention patent application
    Status: Lapsed

    Publication number: US20100306472A1

    Publication date: 2010-12-02

    Application number: US12473337

    Filing date: 2009-05-28

    Applicant: David A. Luick

    Inventor: David A. Luick

    IPC classification: G06F12/08 G06F12/00

    CPC classification: G06F12/0862 G06F12/127

    Abstract: A method of providing history based done logic for an I-cache includes receiving an I-cache line in an L2 cache; determining if the I-cache line is unprefetchable; aging the I-cache line without a delay if the I-cache line is prefetchable; and aging the I-cache line with a delay if the I-cache line is unprefetchable.

    Simple load and store disambiguation and scheduling at predecode
    7.
    Type: Granted invention patent
    Status: Lapsed

    Publication number: US07730283B2

    Publication date: 2010-06-01

    Application number: US12174529

    Filing date: 2008-07-16

    Applicant: David A. Luick

    Inventor: David A. Luick

    IPC classification: G06F9/30

    Abstract: Embodiments of the invention provide a processor for executing instructions. In one embodiment, the processor includes circuitry to receive a load instruction and a store instruction to be executed in the processor and detect a conflict between the load instruction and the store instruction. Detecting the conflict includes determining if load-store conflict information indicates that the load instruction previously conflicted with the store instruction. The load-store conflict information is stored for both the load instruction and the store instruction. The processor further includes circuitry to schedule execution of the load instruction and the store instruction so that execution of the load instruction and the store instruction do not result in a conflict.
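
Illustrative sketch (not part of the patent record): the abstract describes remembering a past load-store conflict for both instructions and using it when scheduling. The `ConflictTable` class, keying by instruction address, and the fixed `store_latency` gap are all assumptions made for this example.

```python
class ConflictTable:
    """Remembers, per instruction address, the partner it previously conflicted with."""

    def __init__(self):
        self.partner = {}

    def record(self, load_pc, store_pc):
        # The information is stored for both the load and the store.
        self.partner[load_pc] = store_pc
        self.partner[store_pc] = load_pc

    def previously_conflicted(self, load_pc, store_pc):
        return (self.partner.get(load_pc) == store_pc
                and self.partner.get(store_pc) == load_pc)


def schedule(table, store_pc, load_pc, store_latency=2):
    """Return (store_cycle, load_cycle) for a store followed by a load.

    A pair flagged in the table is separated so the store completes first;
    an unflagged pair is issued in the same cycle.
    """
    if table.previously_conflicted(load_pc, store_pc):
        return 0, store_latency
    return 0, 0
```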

    LOCAL AND GLOBAL BRANCH PREDICTION INFORMATION STORAGE
    8.
    Type: Invention patent application
    Status: Lapsed

    Publication number: US20090138690A1

    Publication date: 2009-05-28

    Application number: US12364350

    Filing date: 2009-02-02

    Applicant: David A. Luick

    Inventor: David A. Luick

    IPC classification: G06F9/38

    Abstract: Embodiments of the invention provide an apparatus for storing branch prediction information. In one embodiment, an integrated circuit device includes a first table for storing local branch prediction information, a second table for storing global branch prediction information, and circuitry. The circuitry is configured to receive a branch instruction and store local branch prediction information for the branch instruction in the first table. The local branch prediction information includes a local predictability value for the local branch prediction information. The circuitry is further configured to store global branch prediction information for the branch instruction in the second table only if the local predictability value is below a threshold value of predictability.
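
Illustrative sketch (not part of the patent record): the storage rule in this abstract is a conditional write to the second table. The function name, the threshold value, and the use of a (branch address, global history) pair as the global-table key are hypothetical choices for this example.

```python
PREDICTABILITY_THRESHOLD = 2  # hypothetical threshold value of predictability


def store_prediction(pc, local_taken, local_predictability,
                     global_history, global_taken,
                     local_table, global_table):
    """Always record local information; record global information only when
    the branch is not locally predictable enough."""
    local_table[pc] = (local_taken, local_predictability)
    if local_predictability < PREDICTABILITY_THRESHOLD:
        global_table[(pc, global_history)] = global_taken
```

Branches that the local table already predicts well never occupy an entry in the global table, which leaves that space for the branches that need it.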

    Low Cost Persistent Instruction Predecoded Issue and Dispatcher
    9.
    Type: Invention patent application
    Status: Under examination (published)

    Publication number: US20080148020A1

    Publication date: 2008-06-19

    Application number: US11610214

    Filing date: 2006-12-13

    Applicant: David A. Luick

    Inventor: David A. Luick

    IPC classification: G06F9/30

    Abstract: Improved techniques for executing instructions in a pipelined manner that may reduce stalls that occur when executing dependent instructions are provided. Stalls may be reduced by utilizing a cascaded arrangement of pipelines with execution units that are delayed with respect to each other. This cascaded delayed arrangement allows dependent instructions to be issued within a common issue group by scheduling them for execution in different pipelines to execute at different times.
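
Illustrative sketch (not part of the patent record): if pipeline i executes i cycles later than pipeline 0, a dependent instruction can share an issue group with its producer as long as it lands in a later pipeline. The greedy assignment below, the function name `assign_pipelines`, and the one-instruction-per-pipeline limit are assumptions for this example, not the scheduling method of the application.

```python
def assign_pipelines(group, num_pipelines=4):
    """Assign each instruction of an issue group to a pipeline.

    group is a list of (name, producers) in program order, where producers is
    the set of earlier instructions in the group whose results are needed.
    Pipeline i is assumed to execute i cycles after pipeline 0, so every
    consumer must sit in a higher-numbered pipeline than all of its producers.
    Returns a name -> pipeline mapping, or None if the group does not fit.
    """
    assigned = {}
    used = set()
    for name, producers in group:
        earliest = 1 + max((assigned[p] for p in producers if p in assigned), default=-1)
        slot = next((i for i in range(earliest, num_pipelines) if i not in used), None)
        if slot is None:
            return None  # the group would have to be split across issue cycles
        assigned[name] = slot
        used.add(slot)
    return assigned
```

For example, a load, an add that consumes the load, an independent multiply, and a store that consumes the add fit into one group of four pipelines without a stall.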

    Simple Load and Store Disambiguation and Scheduling at Predecode
    10.
    Type: Invention patent application
    Status: Lapsed

    Publication number: US20070288726A1

    Publication date: 2007-12-13

    Application number: US11422647

    Filing date: 2006-06-07

    Applicant: David A. Luick

    Inventor: David A. Luick

    IPC classification: G06F9/40

    Abstract: Embodiments of the invention provide a method and processor for executing instructions. In one embodiment, the method includes receiving a load instruction and a store instruction to be executed in the processor and detecting a conflict between the load instruction and the store instruction. Detecting the conflict includes determining if load-store conflict information indicates that the load instruction previously conflicted with the store instruction. The load-store conflict information is stored for both the load instruction and the store instruction. The method further includes scheduling execution of the load instruction and the store instruction so that execution of the load instruction and the store instruction do not result in a conflict.
