Apparatus and method for pre-fetching data to cached memory using persistent historical page table data
    11.
    Invention application
    Apparatus and method for pre-fetching data to cached memory using persistent historical page table data (Lapsed)

    Publication No.: US20050071571A1

    Publication Date: 2005-03-31

    Application No.: US10675732

    Filing Date: 2003-09-30

    Applicant: David Luick

    Inventor: David Luick

    IPC Classification: G06F12/08

    Abstract: A computer system includes a main memory, at least one processor, and at least one level of cache. The system maintains reference history data for each addressable page in memory, preferably in a page table. The reference history data is preferably used to determine which cacheable sub-units of the page should be pre-fetched to the cache. The reference history data is preferably an up/down counter that is incremented if the cacheable sub-unit is loaded into cache and referenced by the processor, and decremented if the sub-unit is loaded into cache and not referenced before being cast out. The reference counter thus expresses an approximate likelihood, based on recent history, that the sub-unit will be referenced in the near future.
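
    A minimal behavioral sketch of the up/down reference counter described above, assuming a small saturating range, a fixed prefetch threshold, and that both counter updates are applied at cast-out time; the class and method names are illustrative, not taken from the patent.

        class PageHistory:
            """Per-page reference history: one saturating up/down counter per cacheable sub-unit."""

            def __init__(self, sub_units=8, max_count=3):
                self.max_count = max_count
                self.counters = [0] * sub_units          # one counter per sub-unit of the page

            def on_cast_out(self, sub_unit, was_referenced):
                """Update the history when a sub-unit leaves the cache."""
                if was_referenced:
                    # Loaded and actually used: raise the likelihood estimate.
                    self.counters[sub_unit] = min(self.counters[sub_unit] + 1, self.max_count)
                else:
                    # Loaded but never touched before cast-out: lower the estimate.
                    self.counters[sub_unit] = max(self.counters[sub_unit] - 1, 0)

            def sub_units_to_prefetch(self, threshold=2):
                """Sub-units whose recent history suggests they will be referenced again soon."""
                return [i for i, c in enumerate(self.counters) if c >= threshold]

        history = PageHistory()
        history.on_cast_out(0, was_referenced=True)
        history.on_cast_out(0, was_referenced=True)
        history.on_cast_out(3, was_referenced=False)
        print(history.sub_units_to_prefetch())   # -> [0]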

    Multiple parallel pipeline processor having self-repairing capability
    12.
    Invention application
    Multiple parallel pipeline processor having self-repairing capability (Lapsed)

    Publication No.: US20050066148A1

    Publication Date: 2005-03-24

    Application No.: US10667097

    Filing Date: 2003-09-18

    Applicant: David Luick

    Inventor: David Luick

    IPC Classification: G06F9/30 G06F9/38 G06F15/00

    Abstract: A multiple parallel pipeline digital processing apparatus can substitute a second pipeline for a first when a failure is detected in the first pipeline. Preferably, a redundant pipeline is shared by multiple primary pipelines, and the pipelines are located physically adjacent to one another in an array. A pipeline failure causes data to be shifted one position within the array of pipelines to bypass the failing pipeline, so that each pipeline has only two sources of data, a primary and an alternate. Preferably, the selection logic controlling the choice between the primary and alternate sources of pipeline data is integrated with the other pipeline operand selection logic.
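
    A rough sketch of the one-position shift described above, assuming the shared spare pipeline sits at the end of the array; under that assumption each logical lane has exactly two possible physical sources, its own pipeline (primary) or the next one over (alternate). Names are illustrative.

        def route_lanes(num_primary, failed=None):
            """Map each logical lane to a physical pipeline; pipeline num_primary is the spare."""
            routing = {}
            for lane in range(num_primary):
                if failed is None or lane < failed:
                    routing[lane] = lane        # primary source: the lane's own pipeline
                else:
                    routing[lane] = lane + 1    # alternate source: shifted one position past the failure
            return routing

        print(route_lanes(4))             # all lanes on their primary pipelines
        print(route_lanes(4, failed=1))   # lanes 1-3 shift by one; the spare absorbs lane 3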

    REGISTER FILE BIT AND METHOD FOR FAST CONTEXT SWITCH
    13.
    Invention application
    REGISTER FILE BIT AND METHOD FOR FAST CONTEXT SWITCH (Lapsed)

    Publication No.: US20070294515A1

    Publication Date: 2007-12-20

    Application No.: US11848498

    Filing Date: 2007-08-31

    Applicant: David Luick

    Inventor: David Luick

    IPC Classification: G06F9/30

    Abstract: A register file bit includes a primary latch, a secondary latch with a feedback path, and a context switch mechanism that allows a fast context switch when execution changes from one thread to the next. A bit value for a second thread of execution is stored in the primary latch and then transferred to the secondary latch. The bit value for a first thread of execution is then written to the primary latch. When a context switch is needed (when the first thread stalls and the second thread must begin execution), the register file bit can switch contexts from the first thread to the second thread in a single clock cycle. The register file bit contains a backup latch inside the register file itself, so minimal extra wiring is needed to or from the existing register file.
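
    A simple software model of the two-latch register file bit, assuming the context switch amounts to exchanging the primary and secondary values (the single-cycle swap the abstract describes); latch behaviour and timing are abstracted away, and all names are illustrative.

        class RegisterFileBit:
            def __init__(self):
                self.primary = 0      # bit value seen by the active thread
                self.secondary = 0    # backed-up bit value for the inactive thread

            def stage_backup_thread(self, value):
                """Store the second thread's bit in the primary latch, then move it to the secondary."""
                self.primary = value
                self.secondary = self.primary

            def write(self, value):
                """Normal write by the currently active thread."""
                self.primary = value

            def context_switch(self):
                """Exchange active and backed-up values (one clock cycle in hardware)."""
                self.primary, self.secondary = self.secondary, self.primary

        bit = RegisterFileBit()
        bit.stage_backup_thread(1)   # thread two's value parked in the secondary latch
        bit.write(0)                 # thread one now owns the primary latch
        bit.context_switch()         # thread one stalls; thread two takes over
        print(bit.primary)           # -> 1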

    D-cache miss prediction and scheduling
    14.
    Invention application
    D-cache miss prediction and scheduling (Granted)

    Publication No.: US20070186073A1

    Publication Date: 2007-08-09

    Application No.: US11351239

    Filing Date: 2006-02-09

    Applicant: David Luick

    Inventor: David Luick

    IPC Classification: G06F13/00

    Abstract: A method and apparatus for D-cache miss prediction and scheduling are provided. In one embodiment, execution of an instruction in a processor is scheduled. The processor may have at least one cascaded delayed execution pipeline unit with two or more execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The method includes receiving an issue group of instructions, determining whether a first instruction in the issue group resulted in a cache miss during a previous execution, and, if so, scheduling the first instruction to execute in a pipeline whose execution is delayed with respect to another pipeline in the cascaded delayed execution pipeline unit.
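
    A schematic sketch of the miss-prediction decision, assuming a two-pipeline cascade (P0 not delayed, P1 delayed) and a simple set keyed by instruction address as the miss-history store; both are assumptions for illustration, not details from the patent.

        miss_history = set()       # addresses of loads that missed the D-cache last time

        def record_execution(addr, missed):
            """Remember whether this load missed the D-cache on its most recent execution."""
            if missed:
                miss_history.add(addr)
            else:
                miss_history.discard(addr)

        def schedule(issue_group):
            """Return (pipeline, instruction) pairs; P1 executes delayed relative to P0."""
            placement = []
            for addr, opcode in issue_group:
                pipe = "P1" if addr in miss_history else "P0"   # predicted miss goes to the delayed pipe
                placement.append((pipe, opcode))
            return placement

        record_execution(0x100, missed=True)
        print(schedule([(0x100, "load r3,(r4)"), (0x104, "add r5,r3,r6")]))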

    Completion table configured to track a larger number of outstanding instructions
    15.
    Invention application
    Completion table configured to track a larger number of outstanding instructions (Granted)

    Publication No.: US20050228972A1

    Publication Date: 2005-10-13

    Application No.: US10821054

    Filing Date: 2004-04-08

    IPC Classification: G06F12/08 G06F9/30 G06F9/38

    Abstract: A method, completion table, and processor for tracking a larger number of outstanding instructions. The completion table may include a plurality of entries, where each entry tracks a consecutive run of outstanding instructions. Each entry may be configured to store an instruction address and an identification of the first of those consecutive outstanding instructions. Because each entry can track a consecutive run of outstanding instructions, such as the length of a cache line, while storing only the instruction address and identification of the first instruction in the run, the completion table may track a larger number of outstanding instructions without increasing its size.
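
    A small sketch of one such completion table entry, assuming fixed 4-byte instructions and a run of eight consecutive instructions (roughly one cache line of code); only the first instruction's address and tag are stored, and the rest are derived. Field names are illustrative.

        class CompletionEntry:
            def __init__(self, first_tag, first_addr, count):
                self.first_tag = first_tag     # identification of the first instruction in the run
                self.first_addr = first_addr   # its instruction address
                self.count = count             # consecutive outstanding instructions covered

            def address_of(self, tag):
                """Recover any covered instruction's address from the single stored pair."""
                offset = tag - self.first_tag
                if 0 <= offset < self.count:
                    return self.first_addr + 4 * offset
                return None

        entry = CompletionEntry(first_tag=32, first_addr=0x2000, count=8)
        print(hex(entry.address_of(35)))   # -> 0x200c, with no per-instruction table slot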

    Multithreaded processor and method for switching threads
    16.
    Invention application
    Multithreaded processor and method for switching threads (Lapsed)

    Publication No.: US20050114856A1

    Publication Date: 2005-05-26

    Application No.: US10717747

    Filing Date: 2003-11-20

    IPC Classification: G06F9/38 G06F9/48 G06F9/46

    CPC Classification: G06F9/4843 G06F9/3851

    Abstract: A processor includes primary threads of execution that may issue instructions simultaneously, and one or more backup threads. When a primary thread stalls, the contents of its instruction buffer may be switched with the instruction buffer of a backup thread, allowing the backup thread to begin execution. This design allows two primary threads to issue simultaneously, which overlaps instruction pipeline latencies, and it allows a fast switch to a backup thread when a primary thread stalls, significantly improving the processor's instruction throughput.
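
    A toy model of the buffer switch, assuming each thread's instruction buffer can simply be exchanged with the backup thread's buffer; thread names and buffer contents are invented for illustration.

        from collections import deque

        buffers = {
            "primary0": deque(["ld r1", "add r2", "st r3"]),
            "backup0":  deque(["mul r4", "sub r5"]),
        }

        def switch_threads(primary, backup):
            """On a primary-thread stall, exchange its instruction buffer with the backup's."""
            buffers[primary], buffers[backup] = buffers[backup], buffers[primary]

        switch_threads("primary0", "backup0")   # primary0 stalled, e.g. on a cache miss
        print(list(buffers["primary0"]))        # the backup thread's instructions now issue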

    Reduction of cache miss rates using shared private caches
    17.
    Invention application
    Reduction of cache miss rates using shared private caches (Pending, published)

    Publication No.: US20050071564A1

    Publication Date: 2005-03-31

    Application No.: US10670715

    Filing Date: 2003-09-25

    Applicant: David Luick

    Inventor: David Luick

    IPC Classification: G06F12/08 G06F12/12 G06F12/00

    Abstract: Methods and systems for reducing cache miss rates are disclosed. Embodiments may include a computer system with one or more processors, each coupled with a private cache. Embodiments selectively enable and implement a cache re-allocation scheme for cache lines of the private caches based upon a workload or an expected workload for the processors. In particular, a cache miss rate monitor may count the number of cache misses for each processor. A cache miss rate comparator compares the cache miss rates to determine whether one or more of the processors have significantly higher cache miss rates than the average cache miss rate within a processor module or overall. If so, cache requests from those processors are forwarded to private caches that have lower cache miss rates and the least recently used cache lines.
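
    A sketch of the monitor/comparator decision only, assuming "significantly higher" means more than 1.5 times the average miss rate; the choice of least recently used victim lines inside the lending cache is omitted, and the rates and names are invented.

        miss_rates = {"cpu0": 0.02, "cpu1": 0.03, "cpu2": 0.25, "cpu3": 0.04}

        def reallocation_targets(rates, factor=1.5):
            """Map each high-miss processor to the peer private cache that should lend it lines."""
            average = sum(rates.values()) / len(rates)
            lender = min(rates, key=rates.get)            # peer with the lowest miss rate
            return {cpu: lender
                    for cpu, rate in rates.items()
                    if rate > factor * average and cpu != lender}

        print(reallocation_targets(miss_rates))   # -> {'cpu2': 'cpu0'}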

    Runtime repairable processor
    18.
    Invention application
    Runtime repairable processor (Lapsed)

    Publication No.: US20050071406A1

    Publication Date: 2005-03-31

    Application No.: US10670716

    Filing Date: 2003-09-25

    Applicant: David Luick

    Inventor: David Luick

    IPC Classification: G06F7/38

    Abstract: A self-repairable processor that provides a reliable computing result without increasing the footprint of the on-chip devices. The processor has a plurality of data registers connected to two identical functional units, only one of which is enabled for computing; the two functional units are placed in a chip area defined at most by the data paths needed for one functional unit. When an error condition is detected in the active functional unit, the processor disables the faulty unit and enables the duplicate functional unit.
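
    A behavioral sketch of the failover, with error detection reduced to a flag passed in alongside the operation; the two entries in the unit list stand in for the two identical functional units, and all names are illustrative.

        class SelfRepairingUnit:
            def __init__(self):
                self.units = [self._alu, self._alu]   # two identical functional units
                self.active = 0                       # only one unit is enabled at a time

            @staticmethod
            def _alu(a, b):
                return a + b

            def execute(self, a, b, error_detected=False):
                if error_detected:
                    # Disable the faulty unit and enable the duplicate for this and later operations.
                    self.active = 1 - self.active
                return self.units[self.active](a, b)

        unit = SelfRepairingUnit()
        print(unit.execute(2, 3))                        # runs on unit 0
        print(unit.execute(2, 3, error_detected=True))   # unit 0 flagged bad; unit 1 takes over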

    Mechanism to minimize unscheduled D-cache miss pipeline stalls
    19.
    Invention application
    Mechanism to minimize unscheduled D-cache miss pipeline stalls (Granted)

    Publication No.: US20070186080A1

    Publication Date: 2007-08-09

    Application No.: US11351247

    Filing Date: 2006-02-09

    Applicant: David Luick

    Inventor: David Luick

    IPC Classification: G06F9/30

    Abstract: A method and apparatus for minimizing unscheduled D-cache miss pipeline stalls are provided. In one embodiment, execution of an instruction in a processor is scheduled. The processor may have at least one cascaded delayed execution pipeline unit with two or more execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The method includes receiving an issue group of instructions, determining whether a first instruction in the issue group is a load instruction, and, if so, scheduling the first instruction to execute in a pipeline whose execution is not delayed with respect to another pipeline in the cascaded delayed execution pipeline unit.
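
    A companion sketch to the D-cache miss scheduler above, assuming the same two-pipeline cascade and that load instructions are recognized by their opcode text; both are assumptions for illustration.

        def schedule(issue_group):
            """P0 is the non-delayed pipeline; P1 executes with a fixed delay behind it."""
            placement = []
            for opcode in issue_group:
                if opcode.startswith("load"):
                    placement.append(("P0", opcode))   # issue the load first so a miss is seen early
                else:
                    placement.append(("P1", opcode))   # dependents ride the delayed pipeline
            return placement

        print(schedule(["load r3,(r4)", "add r5,r3,r6"]))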

    Self prefetching L2 cache mechanism for data lines
    20.
    Invention application
    Self prefetching L2 cache mechanism for data lines (Pending, published)

    Publication No.: US20070186050A1

    Publication Date: 2007-08-09

    Application No.: US11347414

    Filing Date: 2006-02-03

    Applicant: David Luick

    Inventor: David Luick

    IPC Classification: G06F12/00

    CPC Classification: G06F12/0862

    Abstract: Embodiments of the present invention provide a method and apparatus for prefetching instruction lines. In one embodiment, the method includes fetching a first instruction line from a level 2 cache; extracting, from the first instruction line, an address identifying a first data line that contains data targeted by a data access instruction in the first instruction line or in a different instruction line; and prefetching the first data line from the level 2 cache using the extracted address.
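
    A sketch of the self-prefetch step, assuming the extracted data-line address is kept as an extra field stored alongside the instruction line; the dictionary stands in for the level 2 cache, and the line layout and addresses are assumptions.

        l2_cache = {
            0x1000: {"kind": "instruction",
                     "words": ["load r3,(r4)", "add r5,r3,r6"],
                     "target_data_line": 0x8040},      # address extracted from the instruction line
            0x8040: {"kind": "data", "words": [7, 11, 13, 17]},
        }

        def fetch_instruction_line(addr):
            """Fetch an instruction line from L2 and prefetch the data line it points at."""
            line = l2_cache[addr]
            target = line.get("target_data_line")
            prefetched = l2_cache.get(target) if target is not None else None
            return line, prefetched

        iline, dline = fetch_instruction_line(0x1000)
        print(hex(iline["target_data_line"]), dline["words"])   # the data line arrives early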
