Method and apparatus for implementing a single clock cycle line
replacement in a data cache unit
    93.
    Granted invention patent
    Status: Expired

    Publication No.: US5526510A

    Publication date: 1996-06-11

    Application No.: US315889

    Filing date: 1994-09-30

    IPC classification: G06F12/08

    CPC classification: G06F12/0831 G06F12/0859

    Abstract: The data cache unit includes a separate fill buffer and a separate write-back buffer. The fill buffer stores one or more cache lines for transference into data cache banks of the data cache unit. The write-back buffer stores a single cache line evicted from the data cache banks prior to write-back to main memory. Circuitry is provided for transferring a cache line from the fill buffer into the data cache banks while simultaneously transferring a victim cache line from the data cache banks into the write-back buffer. This allows the overall replace operation to be performed in only a single clock cycle. In a particular implementation, the data cache unit is employed within a microprocessor capable of speculative and out-of-order processing of memory instructions. Moreover, the microprocessor is incorporated within a multiprocessor computer system wherein each microprocessor is capable of snooping the cache lines of data cache units of each other microprocessor. The data cache unit is also a non-blocking cache.
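
The replace operation described in this abstract can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the class `DataCacheUnit`, its fields, and the direct-mapped layout are hypothetical and not taken from the patent, and a "cycle" is simply a counter. The point it shows is that the fill and the victim eviction happen in the same modeled step because each has its own buffer.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Line = Tuple[int, str]  # (tag, data); hypothetical representation of a cache line


@dataclass
class DataCacheUnit:
    """Toy direct-mapped cache with a separate fill buffer and write-back buffer."""
    banks: Dict[int, Line] = field(default_factory=dict)                  # set index -> line
    fill_buffer: List[Tuple[int, int, str]] = field(default_factory=list)  # (index, tag, data)
    writeback_buffer: Optional[Tuple[int, int, str]] = None               # one evicted line
    cycles: int = 0

    def replace_line(self) -> None:
        """Install one line from the fill buffer and evict the victim in one modeled cycle."""
        if not self.fill_buffer:
            raise RuntimeError("fill buffer is empty")
        if self.writeback_buffer is not None:
            raise RuntimeError("write-back buffer is still occupied")
        index, tag, data = self.fill_buffer.pop(0)
        victim = self.banks.get(index)
        # Both transfers are counted as one step: the new line goes into the
        # banks while the victim moves into the write-back buffer.
        self.banks[index] = (tag, data)
        if victim is not None:
            self.writeback_buffer = (index, victim[0], victim[1])
        self.cycles += 1

    def drain_writeback(self, memory: Dict[Tuple[int, int], str]) -> None:
        """Later, write the evicted line back to (modeled) main memory."""
        if self.writeback_buffer is not None:
            index, tag, data = self.writeback_buffer
            memory[(tag, index)] = data
            self.writeback_buffer = None


dcu = DataCacheUnit()
dcu.banks[3] = (0xA, "old line")
dcu.fill_buffer.append((3, 0xB, "new line"))
dcu.replace_line()
main_memory: Dict[Tuple[int, int], str] = {}
victim_in_buffer = dcu.writeback_buffer
dcu.drain_writeback(main_memory)
```

Without the write-back buffer, the victim would have to be written to memory before the new line could be installed, which is what the separate buffers avoid.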

    Apparatus and method for entry allocation for a resource buffer
    94.
    Granted invention patent
    Status: Expired

    Publication No.: US5490280A

    Publication date: 1996-02-06

    Application No.: US267776

    Filing date: 1994-06-28

    IPC classification: G06F9/38 G06F3/00

    Abstract: A method and apparatus for allocating a number of vacant entries of a buffer resource and generating a set of enable vectors based thereon for a set of issued instructions. A deallocation vector of a reservation station is searched in order to locate, within one clock cycle, the vacancies within the reservation station for storage of instruction information associated with several issued operations. Vacancies are indicated by bits of the deallocation vector. General static and dynamic approaches are disclosed for performing the vacant entry identification at high speed within a single clock cycle. Alternate embodiments are disclosed, based on the general approach, that divide the deallocation vector into separate portions (consecutive bits or interleaved) and process each portion based on the general approaches. Rotating priority reference points within the deallocation vector may be used to vary the starting point for vacancy location. Further, the vacancy search can be limited to finding only consecutive vacancies. A superscalar microprocessor using the above may, within one clock cycle, schedule a group of instructions from the instruction decoder to the reservation station for subsequent execution.
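
The vacancy search can be sketched functionally as follows. This is a sequential software model, not the single-cycle circuit the patent claims; the function name `allocate_entries`, the list-of-bits representation, and the `start` parameter (standing in for a rotating priority reference point) are assumptions made for illustration.

```python
from typing import List


def allocate_entries(dealloc: List[int], num_ops: int, start: int = 0) -> List[List[int]]:
    """Return one one-hot enable vector per issued operation.

    dealloc[i] == 1 marks reservation-station entry i as vacant.
    The search begins at `start` and wraps around the vector.
    Fewer vectors than `num_ops` are returned if vacancies run out.
    """
    size = len(dealloc)
    enables: List[List[int]] = []
    for offset in range(size):
        if len(enables) == num_ops:
            break
        entry = (start + offset) % size
        if dealloc[entry]:
            vector = [0] * size
            vector[entry] = 1  # this operation is steered into `entry`
            enables.append(vector)
    return enables


dealloc_vector = [0, 1, 1, 0, 1, 0, 0, 1]
from_zero = [v.index(1) for v in allocate_entries(dealloc_vector, 3)]
from_three = [v.index(1) for v in allocate_entries(dealloc_vector, 3, start=3)]
print(from_zero, from_three)
```

Changing `start` between calls spreads allocations across the buffer instead of always favoring the lowest-numbered vacant entries.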

    Mechanism to protect data saved on a local register cache during
inter-subsystem calls and returns
    95.
    Granted invention patent
    Status: Expired

    Publication No.: US5448707A

    Publication date: 1995-09-05

    Application No.: US219408

    Filing date: 1994-03-29

    Abstract: An apparatus for protecting the data in a local register cache during calls and returns that cross subsystem boundaries. The leftmost bit of a shift register is set to a 1 if a "set boundary bit" instruction is detected. A subsequent PUSH instruction shifts the shift register right one bit. A POPTOS1 instruction in the instruction flow signifies an intra-subsystem return and causes the leftmost bit of the shift register to be checked to see if it is a zero. A protection fault is signalled upon the condition that the leftmost bit is not equal to zero. The shift register is shifted left one bit upon the condition that a POPSTOS 2 instruction is detected. A POPSUB1 instruction detected in the instruction flow signifies an inter-subsystem return and causes the leftmost bit of the shift register to be checked to see if it is a one. A protection fault is signalled upon the condition that the leftmost bit is not equal to zero. The register is shifted left one bit upon the condition that a POPSTOS 2 instruction is detected in the instruction flow.

    Method and apparatus for preventing incorrect fetching of an instruction
of a self-modifying code sequence with dependency on a bufered store
    96.
    Granted invention patent
    Status: Expired

    Publication No.: US5434987A

    Publication date: 1995-07-18

    Application No.: US350379

    Filing date: 1994-12-05

    IPC classification: G06F9/38 G06F9/30

    CPC classification: G06F9/3812

    Abstract: A number of identical matching circuits are integrated into the store address buffer, one matching circuit to each buffer slot, for generating a number of match signals, one for each detected match, using at most the entire source address of an instruction being fetched and the corresponding portions of the store destination addresses of the buffered store instructions. Additionally, a stall signal generator complementary to the store address buffer is provided for generating a single stall signal for the bus controller, using the match signals, thereby stalling an instruction fetch from a source address that is potentially a store destination of one of the buffered store instructions with minimal performance cost.
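
The per-slot matching and the single combined stall signal can be modeled in a few lines. The names `match_signals` and `stall_fetch`, the use of `None` for an empty slot, and the `compare_bits` parameter are assumptions for this sketch. Comparing only the low-order bits is one possible reading of "at most the entire source address"; it can only produce extra stalls, never miss a real conflict.

```python
from typing import List, Optional


def match_signals(fetch_addr: int, store_addrs: List[Optional[int]],
                  compare_bits: int = 32) -> List[bool]:
    """One match signal per store-address-buffer slot (None means the slot is empty)."""
    mask = (1 << compare_bits) - 1
    return [
        slot is not None and (slot & mask) == (fetch_addr & mask)
        for slot in store_addrs
    ]


def stall_fetch(fetch_addr: int, store_addrs: List[Optional[int]],
                compare_bits: int = 32) -> bool:
    """Single stall signal: asserted if any slot reports a match."""
    return any(match_signals(fetch_addr, store_addrs, compare_bits))


store_buffer = [0x1000, None, 0x2F40]
full_compare = match_signals(0x2F40, store_buffer)
no_conflict = stall_fetch(0x3000, store_buffer)
# With only 12 bits compared, 0x3000 aliases with the buffered store to 0x1000.
partial_compare = stall_fetch(0x3000, store_buffer, compare_bits=12)
```

The partial comparison trades occasional unnecessary stalls for narrower comparators, which is a design choice this sketch assumes rather than one the abstract spells out.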

    Instruction fetch unit with early instruction fetch mechanism
    97.
    Granted invention patent
    Status: Expired

    Publication No.: US5423014A

    Publication date: 1995-06-06

    Application No.: US202710

    Filing date: 1994-02-24

    IPC classification: G06F9/38 G06F12/10 G06F12/00

    Abstract: An instruction fetch unit in which an early instruction fetch is initiated to access a main memory simultaneously with checking a cache for the desired instruction. On a slow path to main memory is a large main translation lookaside buffer (TLB) that holds address translations. On a fast path is a smaller translation write buffer (TWB), a mini-TLB, that holds recently used address translations. A guess fetch access is initiated by presenting an address to the main memory in parallel with presenting the address to the cache. The address is compared with the contents of the TWB for a hit and with the contents of the cache for a hit. The guess access is allowed to proceed upon the condition that there is a hit in the TWB (the TWB is able to translate the logical address into a physical address) and a miss in the I-cache (the data are not available in the I-cache and hence the guess access of main memory is necessary to get the data). The guess access is canceled upon the condition that there is either a miss in the TWB (the TWB is unable to translate the logical address into a physical address) or a hit in the I-cache (the data are available in the I-cache and hence the guess access of main memory is not necessary). In this case a fetch access is reissued on the "slow" path that goes through the large main TLB.
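
The proceed-or-cancel rule for the guess access reduces to a two-input decision. The sketch below assumes a 4 KiB page size, a dictionary for the TWB, and a set of cached addresses for the I-cache; the function name `early_fetch` and the returned labels are hypothetical. What happens after a cancel (serving from the cache or reissuing on the slow path) is left out.

```python
from typing import Dict, Optional, Set, Tuple

PAGE_SIZE = 4096  # assumed page size, not given in the abstract


def early_fetch(logical_addr: int, twb: Dict[int, int],
                icache: Set[int]) -> Tuple[str, Optional[int]]:
    """Decide whether the guess access to main memory proceeds.

    twb maps logical page numbers to physical page numbers (the mini-TLB).
    icache holds the logical addresses currently present in the I-cache.
    """
    page, offset = divmod(logical_addr, PAGE_SIZE)
    twb_hit = page in twb
    icache_hit = logical_addr in icache
    if twb_hit and not icache_hit:
        # Translation is available and the data is not cached: the guess pays off.
        return ("guess_proceeds", twb[page] * PAGE_SIZE + offset)
    # Either no fast translation exists or the cache already has the data.
    return ("guess_canceled", None)


twb = {2: 7}
icache = {0x2010}
proceeds = early_fetch(0x2004, twb, icache)
canceled_cache_hit = early_fetch(0x2010, twb, icache)
canceled_twb_miss = early_fetch(0x5000, twb, icache)
```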

    System for executing different cycle instructions by selectively
bypassing scoreboard register and canceling the execution of
conditionally issued instruction if needed resources are busy
    98.
    Granted invention patent
    Status: Expired

    Publication No.: US5185872A

    Publication date: 1993-02-09

    Application No.: US486407

    Filing date: 1990-02-28

    IPC classification: G06F9/22 G06F9/32 G06F9/38

    Abstract: A scbok line is connected to a register file and other units, such as an execution unit and a multiply/divide unit, in a data processing system. A mem scbok line is connected to the register file and other units, such as an instruction unit and a memory interface unit. Each unit connected to the scbok line can pull the line to indicate that it is busy. Each unit connected to the mem scbok line can pull the line to indicate that it is busy. The scbok line indicates, when asserted, that a unit or a register in the register file that is busy with a previous instruction is not available to an instruction for a register file operation. The mem scbok line indicates, when asserted, that a unit or a register in the register file that is busy with a previous instruction is not available to an instruction for a memory operation. Registers are checked concurrently with the issuing of an instruction. An instruction lacking any needed unit or register is stopped in response to the asserted scbok line and reissued in the next cycle. Registers to be used by a multi-cycle instruction are marked busy for an instruction that is able to be executed. When a result for the multi-cycle instruction returns, the registers previously marked busy are marked as not busy.
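
The issue-check-cancel cycle can be sketched with a minimal scoreboard. This model is an assumption-laden simplification: the class `Scoreboard` and its methods are hypothetical, busy functional units and the separate mem scbok line are not modeled, and only the destination register of a multi-cycle instruction is marked busy.

```python
from typing import Iterable, Set


class Scoreboard:
    """Tracks which registers are still owed a result by a multi-cycle instruction."""

    def __init__(self) -> None:
        self.busy: Set[int] = set()

    def try_issue(self, sources: Iterable[int], dest: int, multi_cycle: bool) -> bool:
        """Check registers at issue time; return False if the instruction must be reissued."""
        needed = set(sources) | {dest}
        if needed & self.busy:
            # Models the asserted busy line: the instruction is canceled
            # this cycle and the caller reissues it in the next one.
            return False
        if multi_cycle:
            self.busy.add(dest)  # assumption: only the destination is marked
        return True

    def result_returned(self, dest: int) -> None:
        """The multi-cycle result arrived; the register is no longer busy."""
        self.busy.discard(dest)


sb = Scoreboard()
issued_mul = sb.try_issue(sources=[1, 2], dest=3, multi_cycle=True)
issued_add_early = sb.try_issue(sources=[3, 4], dest=5, multi_cycle=False)
sb.result_returned(3)
issued_add_retry = sb.try_issue(sources=[3, 4], dest=5, multi_cycle=False)
```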

    Two-level system main memory
    100.
    Granted invention patent
    Status: In force

    Publication No.: US08612676B2

    Publication date: 2013-12-17

    Application No.: US12976545

    Filing date: 2010-12-22

    IPC classification: G06F12/00

    Abstract: Embodiments of the invention describe a system main memory comprising two levels of memory that include cached subsets of system disk level storage. This main memory includes “near memory” comprising memory made of volatile memory, and “far memory” comprising volatile or nonvolatile memory storage that is larger and slower than the near memory. The far memory is presented as “main memory” to the host OS while the near memory is a cache for the far memory that is transparent to the OS, thus appearing to the OS the same as prior art main memory solutions. The management of the two-level memory may be done by a combination of logic and modules executed via the host CPU. Near memory may be coupled to the host system CPU via high bandwidth, low latency means for efficient processing. Far memory may be coupled to the CPU via low bandwidth, high latency means.
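
The near/far arrangement can be illustrated with a toy model in which callers see a single flat address space. Everything specific here is an assumption for the sketch: the class `TwoLevelMemory`, the LRU replacement policy, the write-through behavior, and word-granular caching are not taken from the patent.

```python
from collections import OrderedDict


class TwoLevelMemory:
    """Far memory is the visible main memory; near memory transparently caches it."""

    def __init__(self, far_size: int, near_lines: int) -> None:
        self.far = [0] * far_size
        self.near: "OrderedDict[int, int]" = OrderedDict()  # addr -> value, in LRU order
        self.near_lines = near_lines
        self.near_hits = 0
        self.far_accesses = 0

    def read(self, addr: int) -> int:
        if addr in self.near:
            self.near.move_to_end(addr)  # refresh LRU position
            self.near_hits += 1
            return self.near[addr]
        value = self.far[addr]
        self.far_accesses += 1
        self._install(addr, value)
        return value

    def write(self, addr: int, value: int) -> None:
        # Write-through keeps the sketch simple: far memory is always current.
        self.far[addr] = value
        self.far_accesses += 1
        self._install(addr, value)

    def _install(self, addr: int, value: int) -> None:
        self.near[addr] = value
        self.near.move_to_end(addr)
        if len(self.near) > self.near_lines:
            self.near.popitem(last=False)  # evict the least recently used entry


mem = TwoLevelMemory(far_size=16, near_lines=2)
mem.write(3, 42)
first_read = mem.read(3)   # served from near memory
mem.read(4)
mem.read(5)                # pushes address 3 out of near memory
second_read = mem.read(3)  # served from far memory, same value
```

The caller never names near or far memory, which mirrors the abstract's point that the near-memory cache is transparent to the OS.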
