Cache remapping using synonym classes
    2.
    Granted invention patent
    Status: Expired

    Publication No.: US5584002A

    Publication date: 1996-12-10

    Application No.: US21010

    Filing date: 1993-02-22

    IPC classification: G06F12/08 G11C29/00 G06F11/20

    CPC classification: G11C29/88 G06F12/0864

    Abstract: A method for addressing data in a cache unit which has a plurality of congruence classes, following a failure which disables one or more of the congruence classes in the cache unit. A plurality of synonym classes are established. A subset of the congruence classes is assigned to each of the synonym classes. Any disabled congruence classes are identified. The synonym class to which the disabled congruence class belongs is identified. An alternate congruence class is selected which belongs to the same synonym class as the disabled congruence class. When a request is received by the cache to store a line of data into the disabled congruence class, the line is stored into the alternate congruence class in response to the request.
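
The remapping steps in this abstract can be sketched in Python. This is a minimal illustration, not the patented implementation: the class name `RemappingCache`, the modulo rule that groups congruence classes into synonym classes, and the first-fit choice of alternate are all assumptions made for the example.

```python
class RemappingCache:
    """Toy index remapper; every name here is hypothetical."""

    def __init__(self, num_classes, num_synonym_classes):
        self.num_classes = num_classes
        self.num_synonym_classes = num_synonym_classes
        self.disabled = set()  # congruence classes lost to a failure

    def synonym_class(self, congruence_class):
        # Assumption: classes with the same index modulo the number of
        # synonym classes belong to the same synonym class.
        return congruence_class % self.num_synonym_classes

    def disable(self, congruence_class):
        self.disabled.add(congruence_class)

    def resolve(self, congruence_class):
        """Return the congruence class that actually receives the line."""
        if congruence_class not in self.disabled:
            return congruence_class
        wanted = self.synonym_class(congruence_class)
        # Assumption: pick the first working class in the same synonym class.
        for candidate in range(self.num_classes):
            if candidate in self.disabled:
                continue
            if self.synonym_class(candidate) == wanted:
                return candidate
        raise RuntimeError("no working congruence class in this synonym class")


cache = RemappingCache(num_classes=8, num_synonym_classes=2)
cache.disable(5)
print(cache.resolve(4))  # 4, still enabled
print(cache.resolve(5))  # 1, same synonym class as 5
```

A store aimed at a disabled class is redirected to a working class in the same synonym class. How the hardware picks among several candidates is not stated in the abstract.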

    Cache miss facility with stored sequences for data fetching
    3.
    Granted invention patent
    Status: Expired

    Publication No.: US5233702A

    Publication date: 1993-08-03

    Application No.: US390587

    Filing date: 1989-08-07

    IPC classification: G06F12/08

    CPC classification: G06F12/0862 G06F2212/6024

    Abstract: A cache memory system develops an optimum sequence for transferring data values between a main memory and a line buffer internal to the cache. At the end of a line transfer, the data in the line buffer is written into the cache memory as a block. Following an initial cache miss, the cache memory system monitors the sequence of data requests received for data in the line that is being read in from main memory. If the sequence being used to read in the data causes the processor to wait for a specific data value in the line, a new sequence is generated in which the specific data value is read at an earlier time in the transfer cycle. This sequence is associated with the instruction that caused the first miss and is used for subsequent misses caused by the instruction. If, in the process of handling a first miss related to a specific instruction, a second miss occurs which is caused by the same instruction but which is for data in a different line of memory, the sequence associated with the instruction is marked as an ephemeral miss. Data transferred to the line buffer in response to an ephemeral miss is not stored in the cache memory and is limited to that portion of the line accessed within the line buffer.
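
The idea of keeping a fetch sequence per missing instruction can be sketched as follows. This is only an illustration under stated assumptions: the `FetchSequenceTable` name, the sequential default order, and the move-to-front update are hypothetical (the abstract says only that the awaited value is read "at an earlier time"), and ephemeral-miss handling is left out.

```python
class FetchSequenceTable:
    """Per-instruction fetch orders for the words of a missed line.

    Hypothetical sketch; names and policies are assumptions.
    """

    def __init__(self, words_per_line):
        self.words_per_line = words_per_line
        self.sequences = {}  # instruction address -> list of word indices

    def sequence_for(self, instr_addr):
        # Assumption: an instruction with no history fetches words in order.
        if instr_addr not in self.sequences:
            self.sequences[instr_addr] = list(range(self.words_per_line))
        return self.sequences[instr_addr]

    def record_stall(self, instr_addr, word):
        # The processor waited for `word`, so fetch it first on the next
        # miss caused by this instruction (move-to-front is an assumption).
        seq = self.sequence_for(instr_addr)
        seq.remove(word)
        seq.insert(0, word)


table = FetchSequenceTable(words_per_line=4)
table.record_stall(0x100, 2)
print(table.sequence_for(0x100))  # [2, 0, 1, 3]
```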

    Simultaneous prediction of multiple branches for superscalar processing
    4.
    Granted invention patent
    Status: Expired

    Publication No.: US5434985A

    Publication date: 1995-07-18

    Application No.: US928851

    Filing date: 1992-08-11

    IPC classification: G06F9/38

    CPC classification: G06F9/3806 G06F9/3844

    Abstract: System and method for predicting a multiplicity of future branches simultaneously (in parallel) from an executing program, to enable the simultaneous fetching of multiple disjoint program segments. Additionally, the present invention detects divergence of incorrect branch predictions and provides correction for such divergence without penalty. By predicting an entire sequence of branches in parallel, the present invention removes restrictions that decoding of multiple instructions in a superscalar environment must be limited to a single branch group. As a result, the speed of today's superscalar processors can be significantly increased. The present invention includes three main embodiments: (1) the first embodiment is directed to a simplex multibranch prediction device that can predict a plurality of branch groups in one cycle and provide early detection of wrong predictions; (2) the second embodiment is directed to a duplex multibranch prediction device that can detect divergence in a predicted stream and provide redirection (correction) within the stream; and (3) the third embodiment is directed to an n-plex multibranch prediction device that can predict n multiplicity of branch predictions simultaneously and provide early detection of wrong predictions as well as correction of wrong predictions.

Computer processing unit employing a separate millicode branch history table
    6.
    Granted invention patent
    Status: Expired

    Publication No.: US5634119A

    Publication date: 1997-05-27

    Application No.: US369441

    Filing date: 1995-01-06

    IPC classification: G06F9/318 G06F9/38 G06F9/32

    CPC classification: G06F9/3806 G06F9/3017

    Abstract: A computer processing system includes a first memory that stores instructions belonging to a first instruction set architecture and a second memory that stores instructions belonging to a second instruction set architecture. An instruction buffer is coupled to the first and second memories, for storing instructions that are executed by a processor unit. The system operates in one of two modes. In a first mode, instructions are fetched from the first memory into the instruction buffer according to data stored in a first branch history table. In the second mode, instructions are fetched from the second memory into the instruction buffer according to data stored in a second branch history table. The first instruction set architecture may be system level instructions and the second instruction set architecture may be millicode instructions that, for example, define a complex system level instruction and/or emulate a third instruction set architecture.
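
The two-mode arrangement can be pictured as two independent history tables with the current mode selecting which one is consulted. The sketch below is a software analogy only: the `DualBranchHistory` class, the mode names, and the address-to-target dictionaries are assumptions, not details taken from the patent.

```python
class DualBranchHistory:
    """Two separate branch history tables, selected by operating mode.

    Hypothetical sketch; a real table would be a hardware structure.
    """

    def __init__(self):
        # One table per instruction set, so entries never collide.
        self.tables = {"system": {}, "millicode": {}}
        self.mode = "system"

    def set_mode(self, mode):
        if mode not in self.tables:
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode

    def record(self, branch_addr, target):
        # Remember a taken branch in the table for the current mode.
        self.tables[self.mode][branch_addr] = target

    def predict(self, branch_addr):
        # Returns the predicted target, or None if no history exists.
        return self.tables[self.mode].get(branch_addr)


bht = DualBranchHistory()
bht.record(0x40, 0x80)        # system-level branch
bht.set_mode("millicode")
print(bht.predict(0x40))      # None: millicode table has no such entry
```

Keeping the tables separate means a millicode routine cannot displace history gathered for system-level code at the same address, and the reverse also holds.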

    Method for enabling concurrent misses in a cache memory
    7.
    Granted invention patent
    Status: Expired

    Publication No.: US5636364A

    Publication date: 1997-06-03

    Application No.: US347972

    Filing date: 1994-12-01

    IPC classification: G06F12/08 G06F13/14

    CPC classification: G06F12/0806 G06F12/0859

    Abstract: In a cache-to-memory interface, a means and method for timesharing a single bus to allow the concurrent processing of multiple misses. The multiplicity of misses can arise from a single processor if that processor has a nonblocking cache and/or does speculative prefetching, or it can arise from a multiplicity of processors in a shared-bus configuration.

Apparatus and method for prefetching subblocks from a low speed memory to a high speed memory of a memory hierarchy depending upon state of replacing bit in the low speed memory
    8.
    Granted invention patent
    Status: Expired

    Publication No.: US4774654A

    Publication date: 1988-09-27

    Application No.: US685527

    Filing date: 1984-12-24

    IPC classification: G06F12/08 G06F12/12

    Abstract: A prefetching mechanism for a memory hierarchy which includes at least two levels of storage, with L1 being a high-speed low-capacity memory, and L2 being a low-speed high-capacity memory, with the units of L2 and L1 being blocks and sub-blocks respectively, with each block containing several sub-blocks in consecutive addresses. Each sub-block is provided an additional bit, called an r-bit, which indicates that the sub-block has been previously stored in L1 when the bit is 1, and has not been previously stored in L1 when the bit is 0. Initially when a block is loaded into L2 each of the r-bits in the sub-block is set to 0. When a sub-block is transferred from L1 to L2, its r-bit is then set to 1 in the L2 block, to indicate its previous storage in L1. When the CPU references a given sub-block which is not present in L1, and has to be fetched from L2 to L1, the remaining sub-blocks in this block having r-bits set to 1 are prefetched to L1. This prefetching of the other sub-blocks having r-bits set to 1 results in a more efficient utilization of the L1 storage capacity and results in a higher hit ratio.
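
The r-bit rule described in this abstract can be traced with a small Python model. It is a sketch under assumptions: the `Hierarchy` class and its method names are hypothetical, L1 is modeled as an unbounded set with explicit evictions, and replacement policy is ignored.

```python
class Hierarchy:
    """Toy two-level hierarchy with one r-bit per sub-block (hypothetical)."""

    def __init__(self, subblocks_per_block):
        self.subblocks_per_block = subblocks_per_block
        self.r_bits = {}   # block id -> list of r-bits held with the L2 block
        self.l1 = set()    # (block id, sub-block index) pairs resident in L1

    def load_block(self, block_id):
        # A block newly loaded into L2 starts with all r-bits cleared.
        self.r_bits[block_id] = [0] * self.subblocks_per_block

    def evict_from_l1(self, block_id, sub):
        # Moving a sub-block from L1 back to L2 sets its r-bit to 1.
        self.l1.discard((block_id, sub))
        self.r_bits[block_id][sub] = 1

    def reference(self, block_id, sub):
        """Return the sub-blocks moved from L2 to L1 by this reference."""
        if (block_id, sub) in self.l1:
            return []  # L1 hit, nothing is fetched
        fetched = [sub]
        # Prefetch the other sub-blocks of this block whose r-bit is 1.
        for i, bit in enumerate(self.r_bits[block_id]):
            if bit == 1 and i != sub and (block_id, i) not in self.l1:
                fetched.append(i)
        for i in fetched:
            self.l1.add((block_id, i))
        return fetched


h = Hierarchy(subblocks_per_block=4)
h.load_block(7)
h.reference(7, 0)
h.reference(7, 2)
h.evict_from_l1(7, 0)
h.evict_from_l1(7, 2)
print(h.reference(7, 1))  # [1, 0, 2]: the miss on 1 prefetches 0 and 2
```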

    Pageable branch history table
    9.
    Granted invention patent
    Status: Expired

    Publication No.: US4679141A

    Publication date: 1987-07-07

    Application No.: US728424

    Filing date: 1985-04-29

    IPC classification: G06F9/38 G06F9/00

    CPC classification: G06F9/3806 G06F9/3844

    Abstract: A branch history table (BHT) is substantially improved by dividing it into two parts: an active area, and a backup area. The active area contains entries for a small number of branches which the processor can encounter in the near future and the backup area contains all other branch entries. Means are provided to bring entries from the backup area into the active area ahead of when the processor will use those entries. When entries are no longer needed they are removed from the active area and put into the backup area if not already there. New entries for the near future are brought in, so that the active area, though small, will almost always contain the branch information needed by the processor. The small size of the active area allows it to be fast and to be optimally located in the processor layout. The backup area can be located outside the critical part of the layout and can therefore be made larger than would be practicable for a standard BHT.
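
The active/backup split can be illustrated with a small Python model. This is a sketch, not the patented mechanism: the `PageableBHT` name, the oldest-first eviction from the active area, and the caller-supplied list of addresses to page in are all assumptions, since the abstract does not say how "no longer needed" or "near future" are determined.

```python
from collections import OrderedDict


class PageableBHT:
    """Small active area backed by a larger backup area (hypothetical)."""

    def __init__(self, active_capacity):
        self.active_capacity = active_capacity
        self.active = OrderedDict()  # branch address -> target, oldest first
        self.backup = {}             # everything paged out of the active area

    def _install(self, addr, target):
        self.active[addr] = target
        self.active.move_to_end(addr)
        # Assumption: the oldest active entry is the one no longer needed.
        while len(self.active) > self.active_capacity:
            old_addr, old_target = self.active.popitem(last=False)
            # Write the entry back; overwriting also covers the case where
            # a copy already sits in the backup area.
            self.backup[old_addr] = old_target

    def record(self, addr, target):
        self._install(addr, target)

    def page_in(self, addrs):
        # Bring entries for upcoming branches in ahead of their use.
        for addr in addrs:
            if addr in self.backup and addr not in self.active:
                self._install(addr, self.backup[addr])

    def predict(self, addr):
        # Only the fast active area is consulted at prediction time.
        return self.active.get(addr)


bht = PageableBHT(active_capacity=2)
bht.record(0x10, 0x100)
bht.record(0x20, 0x200)
bht.record(0x30, 0x300)      # pushes the 0x10 entry into the backup area
bht.page_in([0x10])          # brought back before it is needed again
print(hex(bht.predict(0x10)))  # 0x100
```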

Method and apparatus for guaranteeing the logical integrity of data in the general purpose registers of a complex multi-execution unit uniprocessor
    10.
    Granted invention patent
    Status: Expired

    Publication No.: US4903196A

    Publication date: 1990-02-20

    Application No.: US859156

    Filing date: 1986-05-02

    IPC classification: G06F9/38

    Abstract: A method and apparatus for controlling access to its general purpose registers (GPRs) by a high end machine configuration including a plurality of execution units within a single CPU. The invention allows up to "N" execution units to be concurrently executing up to "N" instructions using the GPR sequentially or different GPR's concurrently as either SINK or SOURCE while at the same time preserving the logical integrity of the data supplied to the execution units. The use of the invention allows a higher degree of parallelism in the execution of the instructions than would otherwise be possible if only sequential operations were performed. A series of special purpose tags are associated with each GPR and execution unit. These tags are used together with control circuitry both within the GPR's, within the individual execution units and within the instruction decode unit, which permit the multiple use of the registers to be accomplished while maintaining the requisite logical integrity.
