    1. Method and apparatus for processing multiple cache misses using reload folding and store merging
    Type: Granted invention patent
    Status: Expired

    Publication number: US5809530A

    Publication date: 1998-09-15

    Application number: US558071

    Filing date: 1995-11-13

    IPC classification: G06F12/08 G06F13/00

    CPC classification: G06F12/0897 G06F12/0859

    Abstract: A data processor (40) keeps track of misses to a cache (71) so that multiple misses within the same cache line can be merged or folded at reload time. A load/store unit (60) includes a completed store queue (61) for presenting store requests to the cache (71) in order. If a store request misses in the cache (71), the completed store queue (61) requests the cache line from a lower-level memory system (90) and thereafter inactivates the store request. When a reload cache line is received, the completed store queue (61) compares the reload address to all entries. If at least one address matches the reload address, one entry's data is merged with the cache line prior to storage in the cache (71). Other matching entries become active and are allowed to reaccess the cache (71). A miss queue (80) coupled between the load/store unit (60) and the lower-level memory system (90) implements reload folding to improve efficiency.

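The store-merging behavior described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the 32-byte line size, the class and method names, the dictionary-based cache and memory, and the rule that the oldest matching entry is the one merged are all hypothetical choices made for illustration. The `pending` set is a simple stand-in that avoids requesting the same line twice; it is not a model of the miss queue (80).

```python
from dataclasses import dataclass

LINE_SIZE = 32  # bytes per cache line (assumed; the abstract gives no size)


def line_base(addr: int) -> int:
    """Return the address of the cache line containing addr."""
    return addr & ~(LINE_SIZE - 1)


@dataclass
class StoreEntry:
    addr: int
    data: bytes          # assumed never to cross a line boundary
    active: bool = True  # inactive entries are waiting for a reload


class CompletedStoreQueue:
    """Hypothetical model of a completed store queue with reload merging."""

    def __init__(self, cache: dict, memory: dict):
        self.cache = cache      # line base -> bytearray (the cache)
        self.memory = memory    # line base -> bytes (lower-level memory)
        self.entries = []       # oldest entry first
        self.pending = set()    # line bases already requested

    def push(self, addr: int, data: bytes) -> None:
        self.entries.append(StoreEntry(addr, data))

    def present_oldest(self):
        """Present the oldest active store to the cache.

        Returns the line base to request on a new miss, otherwise None.
        """
        for e in self.entries:
            if not e.active:
                continue
            base = line_base(e.addr)
            if base in self.cache:          # hit: write and retire the store
                off = e.addr - base
                self.cache[base][off:off + len(e.data)] = e.data
                self.entries.remove(e)
                return None
            e.active = False                # miss: inactivate the request
            if base not in self.pending:
                self.pending.add(base)
                return base                 # request the line once
            return None
        return None

    def reload(self, base: int) -> None:
        """Handle a reloaded line: merge one matching store, wake the rest."""
        line = bytearray(self.memory.get(base, bytes(LINE_SIZE)))
        matches = [e for e in self.entries if line_base(e.addr) == base]
        if matches:
            first = matches[0]              # assumption: merge the oldest match
            off = first.addr - base
            line[off:off + len(first.data)] = first.data
            self.entries.remove(first)
            for e in matches[1:]:
                e.active = True             # may now reaccess the cache
        self.cache[base] = line             # store the merged line
        self.pending.discard(base)
```

In this model, two stores that miss in the same line trigger a single line request. When the line returns, the first store's data is merged before the line is written to the cache, and the second store becomes active again and hits on its next presentation.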

    2. Method and apparatus that enforces a regional memory model in hierarchical memory systems
    Type: Granted invention patent
    Status: In force

    Publication number: US06370632B1

    Publication date: 2002-04-09

    Application number: US09195758

    Filing date: 1998-11-18

    IPC classification: G06F12/00

    Abstract: The present invention discloses a method and apparatus that uses extensions to the TLB entry to dynamically identify pages of memory that can be weakly ordered or must be strongly ordered and enforces the appropriate memory model on those pages of memory. Such identification and memory model enforcement allows for more efficient execution of memory instructions in a hierarchical memory design in cases where memory instructions can be executed out of order. From the page table, the memory manager constructs TLB entries that associate page frame numbers of memory operands with page-granular client usage data and a memory order tag. The memory order tag identifies the memory model that is currently being enforced for the associated page of memory. The memory manager updates the memory order tag of the TLB entry in accordance with changes in the client usage information. In the preferred embodiment, the TLB structure is a global TLB shared by all processors. In alternative embodiments, the TLB structure may comprise either multiple distributed TLBs with shared knowledge, each assigned to a different processor, or a combination of multiple local TLBs, each assigned to a different processor, that exchange information with a global TLB, which in turn provides data to the memory manager to access the hierarchical memory system.

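The extended TLB entry described in this abstract can be sketched as a small data structure. The abstract does not say which usage patterns map to which memory model, so the policy below (a page touched by more than one client must be strongly ordered, otherwise it may be weakly ordered) is purely a hypothetical placeholder, as are all class, method, and field names.

```python
from dataclasses import dataclass, field
from enum import Enum


class MemoryOrder(Enum):
    WEAK = "weak"
    STRONG = "strong"


@dataclass
class TLBEntry:
    page_frame: int
    clients: set = field(default_factory=set)  # page-granular client usage data
    order: MemoryOrder = MemoryOrder.WEAK      # memory order tag


class GlobalTLB:
    """Hypothetical model of a single TLB shared by all processors."""

    def __init__(self, page_table: dict):
        self.page_table = page_table  # virtual page number -> page frame number
        self.entries = {}             # virtual page number -> TLBEntry

    @staticmethod
    def _policy(clients: set) -> MemoryOrder:
        # HYPOTHETICAL policy, not taken from the patent:
        # pages shared by several clients are treated as strongly ordered.
        return MemoryOrder.STRONG if len(clients) > 1 else MemoryOrder.WEAK

    def access(self, client_id: str, vpn: int) -> TLBEntry:
        """Record a client's use of a page and refresh its memory order tag."""
        entry = self.entries.get(vpn)
        if entry is None:
            # Construct the TLB entry from the page table on first use.
            entry = TLBEntry(self.page_table[vpn])
            self.entries[vpn] = entry
        entry.clients.add(client_id)
        entry.order = self._policy(entry.clients)
        return entry

    def release(self, client_id: str, vpn: int) -> None:
        """Remove a client from a page's usage data and refresh the tag."""
        entry = self.entries[vpn]
        entry.clients.discard(client_id)
        entry.order = self._policy(entry.clients)
```

The point of the sketch is only that the tag is derived from the usage data and recomputed whenever that data changes; a real policy would depend on details given in the patent's full description.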

    3. Method and apparatus for TLB memory ordering
    Type: Granted invention patent
    Status: In force

    Publication number: US06260131B1

    Publication date: 2001-07-10

    Application number: US09195779

    Filing date: 1998-11-18

    IPC classification: G06F12/02

    CPC classification: G06F12/1027 G06F12/0837

    Abstract: The present invention discloses a method and apparatus that uses extensions to the TLB entry to dynamically identify pages of memory that can be weakly ordered or must be strongly ordered and enforces the appropriate memory model on those pages of memory. Such identification and memory model enforcement allows for more efficient execution of memory instructions in a hierarchical memory design in cases where memory instructions can be executed out of order. From the page table, the memory manager constructs TLB entries that associate page frame numbers of memory operands with page-granular client usage data and a memory order tag. The memory order tag identifies the memory model that is currently being enforced for the associated page of memory. The memory manager updates the memory order tag of the TLB entry in accordance with changes in the client usage information. In the preferred embodiment, the TLB structure is a global TLB shared by all processors. In alternative embodiments, the TLB structure may comprise either multiple distributed TLBs with shared knowledge, each assigned to a different processor, or a combination of multiple local TLBs, each assigned to a different processor, that exchange information with a global TLB, which in turn provides data to the memory manager to access the hierarchical memory system.


    4. Data processor with unified store queue permitting hit under miss memory accesses
    Type: Granted invention patent
    Status: Expired

    Publication number: US5621896A

    Publication date: 1997-04-15

    Application number: US523313

    Filing date: 1995-09-05

    IPC classification: G06F9/38 G06F13/00

    Abstract: A store queue for use in a data processor (10) with a memory storage system has a first-in-first-out ("FIFO") queue (48) and control circuitry (52). The control circuitry maintains three pointers which index the entries in the FIFO queue: a dispatch pointer (D), a completion pointer (C), and an oldest miss pointer (OM). The control circuitry stores each store instruction in the entry designated by the dispatch pointer and then increments the dispatch pointer. The control circuitry increments the completion pointer when the data processor indicates that the previously designated store instruction is the oldest instruction in the data processor, that is, when the instruction is "completed." The control circuitry increments the oldest miss pointer after it presents the previously designated store instruction to the memory system.

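The three-pointer bookkeeping in this abstract can be modeled with a circular buffer. The sketch below is an illustration under assumptions rather than the patented design: the class and method names are hypothetical, the memory system is a plain dictionary, the pointers are unbounded counters reduced modulo the queue size (hardware would use wrapping pointers), and a store is assumed to be presented to memory only after it has completed, so that OM <= C <= D always holds.

```python
class StoreQueue:
    """Hypothetical FIFO store queue indexed by D, C, and OM pointers."""

    def __init__(self, size: int, memory: dict):
        self.size = size
        self.slots = [None] * size
        self.memory = memory   # address -> value (stand-in memory system)
        self.d = 0             # dispatch pointer: next free entry
        self.c = 0             # completion pointer: next store to complete
        self.om = 0            # oldest miss pointer: next store to present

    def dispatch(self, addr: int, value: int) -> None:
        """Store the instruction at D, then increment D."""
        if self.d - self.om == self.size:
            raise OverflowError("store queue full")
        self.slots[self.d % self.size] = (addr, value)
        self.d += 1

    def complete(self) -> None:
        """Increment C when the store at C is the oldest instruction."""
        if self.c == self.d:
            raise IndexError("no dispatched store awaiting completion")
        self.c += 1

    def present(self) -> bool:
        """Present the store at OM to memory, then increment OM.

        Returns False when no completed store is waiting (assumed rule).
        """
        if self.om == self.c:
            return False
        addr, value = self.slots[self.om % self.size]
        self.memory[addr] = value
        self.om += 1
        return True
```

A short usage example: after two dispatches and one completion, only the first store can be presented to memory.

```python
mem = {}
q = StoreQueue(4, mem)
q.dispatch(0x10, 1)
q.dispatch(0x14, 2)
q.complete()
q.present()   # writes the first store; the second is not yet completed
```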