Apparatus and method for prefetching subblocks from a low speed memory
to a high speed memory of a memory hierarchy depending upon state of
replacing bit in the low speed memory
    1.
    Invention Grant
    Apparatus and method for prefetching subblocks from a low speed memory to a high speed memory of a memory hierarchy depending upon state of replacing bit in the low speed memory (Expired)

    Publication number: US4774654A

    Publication date: 1988-09-27

    Application number: US685527

    Filing date: 1984-12-24

    IPC classes: G06F12/08 G06F12/12

    Abstract: A prefetching mechanism for a memory hierarchy which includes at least two levels of storage, with L1 being a high-speed low-capacity memory and L2 being a low-speed high-capacity memory; the units of L2 and L1 are blocks and sub-blocks respectively, with each block containing several sub-blocks at consecutive addresses. Each sub-block is provided an additional bit, called an r-bit, which indicates that the sub-block has previously been stored in L1 when the bit is 1, and has not when the bit is 0. Initially, when a block is loaded into L2, each of the r-bits in its sub-blocks is set to 0. When a sub-block is transferred from L1 to L2, its r-bit is set to 1 in the L2 block to indicate its previous storage in L1. When the CPU references a given sub-block which is not present in L1 and has to be fetched from L2 to L1, the remaining sub-blocks in the block having r-bits set to 1 are prefetched to L1. Prefetching the other sub-blocks having r-bits set to 1 results in more efficient utilization of the L1 storage capacity and a higher hit ratio.

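    The r-bit policy described in the abstract can be sketched as follows. This is a minimal illustrative model, not the patented implementation; the class and method names (`Hierarchy`, `reference`, etc.) are invented for the example.

```python
class Block:
    """An L2 block: a list of sub-block payloads plus one r-bit per sub-block."""
    def __init__(self, subblocks):
        self.subblocks = list(subblocks)
        # All r-bits start at 0 when the block is loaded into L2.
        self.r_bits = [0] * len(subblocks)

class Hierarchy:
    def __init__(self):
        self.l1 = {}   # (block_id, sub_idx) -> payload
        self.l2 = {}   # block_id -> Block

    def load_block_into_l2(self, block_id, subblocks):
        self.l2[block_id] = Block(subblocks)

    def evict_from_l1(self, block_id, sub_idx):
        # On transfer from L1 back to L2, set the sub-block's r-bit to 1.
        payload = self.l1.pop((block_id, sub_idx))
        blk = self.l2[block_id]
        blk.subblocks[sub_idx] = payload
        blk.r_bits[sub_idx] = 1

    def reference(self, block_id, sub_idx):
        if (block_id, sub_idx) in self.l1:
            return self.l1[(block_id, sub_idx)], False  # L1 hit
        # L1 miss: fetch the referenced sub-block, then prefetch every
        # sibling whose r-bit is 1 (i.e. it was in L1 before).
        blk = self.l2[block_id]
        self.l1[(block_id, sub_idx)] = blk.subblocks[sub_idx]
        for i, r in enumerate(blk.r_bits):
            if r == 1 and (block_id, i) not in self.l1:
                self.l1[(block_id, i)] = blk.subblocks[i]
        return self.l1[(block_id, sub_idx)], True  # was a miss
```

    Only sub-blocks whose r-bits are 1 ride along on a miss, so sub-blocks the program never touched do not waste L1 capacity.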

    Cache miss facility with stored sequences for data fetching
    2.
    Invention Grant
    Cache miss facility with stored sequences for data fetching (Expired)

    Publication number: US5233702A

    Publication date: 1993-08-03

    Application number: US390587

    Filing date: 1989-08-07

    IPC classes: G06F12/08

    CPC classes: G06F12/0862 G06F2212/6024

    Abstract: A cache memory system develops an optimum sequence for transferring data values between a main memory and a line buffer internal to the cache. At the end of a line transfer, the data in the line buffer is written into the cache memory as a block. Following an initial cache miss, the cache memory system monitors the sequence of data requests received for data in the line that is being read in from main memory. If the sequence being used to read in the data causes the processor to wait for a specific data value in the line, a new sequence is generated in which the specific data value is read at an earlier time in the transfer cycle. This sequence is associated with the instruction that caused the first miss and is used for subsequent misses caused by that instruction. If, in the process of handling a first miss related to a specific instruction, a second miss occurs which is caused by the same instruction but which is for data in a different line of memory, the sequence associated with the instruction is marked as an ephemeral miss. Data transferred to the line buffer in response to an ephemeral miss is not stored in the cache memory and is limited to that portion of the line accessed within the line buffer.
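    The per-instruction fill-order idea can be sketched roughly as below. This is an assumed simplification (names such as `MissFacility` and the 4-word line are mine): the word the processor stalled on is moved to the front of the transfer sequence for later misses by the same instruction.

```python
LINE_WORDS = 4  # illustrative line size, in words

class MissFacility:
    def __init__(self):
        self.sequences = {}  # instruction address -> preferred fill order

    def fill_order(self, instr, demanded_word):
        # Stored sequence if one exists; otherwise start at the demanded
        # word and wrap around the line.
        if instr in self.sequences:
            return self.sequences[instr]
        return [(demanded_word + i) % LINE_WORDS for i in range(LINE_WORDS)]

    def record_wait(self, instr, waited_word):
        # The processor stalled on `waited_word`: generate a new sequence
        # that transfers it first, and associate it with the instruction.
        rest = [w for w in range(LINE_WORDS) if w != waited_word]
        self.sequences[instr] = [waited_word] + rest
```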

    Method and apparatus for guaranteeing the logical integrity of data in
the general purpose registers of a complex multi-execution unit
uniprocessor
    3.
    Invention Grant
    Method and apparatus for guaranteeing the logical integrity of data in the general purpose registers of a complex multi-execution unit uniprocessor (Expired)

    Publication number: US4903196A

    Publication date: 1990-02-20

    Application number: US859156

    Filing date: 1986-05-02

    IPC classes: G06F9/38

    Abstract: A method and apparatus for controlling access to the general purpose registers (GPRs) of a high end machine configuration including a plurality of execution units within a single CPU. The invention allows up to "N" execution units to be concurrently executing up to "N" instructions, using the same GPR sequentially or different GPRs concurrently as either SINK or SOURCE, while preserving the logical integrity of the data supplied to the execution units. The use of the invention allows a higher degree of parallelism in the execution of the instructions than would otherwise be possible if only sequential operations were performed. A series of special purpose tags is associated with each GPR and execution unit. These tags are used together with control circuitry within the GPRs, the individual execution units, and the instruction decode unit, permitting the multiple use of the registers while maintaining the requisite logical integrity.

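    A hedged sketch of the tag bookkeeping: one busy-tag per GPR naming the execution unit that will write it. The patent implements this with hardware tags and control circuitry; this toy `Scoreboard` class (my name) models only the issue/complete logic that preserves integrity.

```python
class Scoreboard:
    def __init__(self, num_gprs):
        self.writer = [None] * num_gprs  # tag: which unit will write each GPR

    def can_issue(self, sources, sink):
        # A SOURCE still pending a write, or a SINK already claimed,
        # would violate logical integrity.
        return all(self.writer[r] is None for r in sources + [sink])

    def issue(self, unit, sources, sink):
        if not self.can_issue(sources, sink):
            return False
        self.writer[sink] = unit  # claim the SINK register for this unit
        return True

    def complete(self, unit):
        # The unit's result is written back; clear its tags.
        for r, w in enumerate(self.writer):
            if w == unit:
                self.writer[r] = None
```

    Several units can run concurrently as long as their tags do not conflict, which is the parallelism the abstract describes.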

    High speed buffer store arrangement for quick wide transfer of data
    4.
    Invention Grant
    High speed buffer store arrangement for quick wide transfer of data (Expired)

    Publication number: US4823259A

    Publication date: 1989-04-18

    Application number: US213506

    Filing date: 1988-06-23

    IPC classes: G06F12/08 G06F12/00

    CPC classes: G06F12/0897

    Abstract: A high speed buffer store arrangement for use in a data processing system having multiple cache buffer storage units in a hierarchical arrangement permits fast transfer of wide data blocks. On each cache chip, input and output latches are integrated, thus avoiding separate intermediate buffering. Input and output latches are interconnected by 64-byte wide data buses so that data blocks can be shifted rapidly from one cache hierarchy level to another and back. Chip-internal feedback connections from output to input latches allow data blocks to be selectively reentered into a cache after reading. An additional register array is provided so that data blocks can be furnished again after transfer from cache to main memory or CPU without accessing the respective cache. Wide data blocks can be transferred within one cycle, tying up the caches much less in transfer operations and thus increasing their availability.

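    The data paths described above (integrated latches, chip-internal feedback, and the side register array) can be modeled very roughly in software. All names here are invented; the point is only to show the two reuse paths: feedback re-entry after a read, and re-supply without touching the cache array.

```python
class CacheChip:
    def __init__(self):
        self.array = {}          # cache storage: addr -> 64-byte-wide block
        self.in_latch = None     # integrated input latch
        self.out_latch = None    # integrated output latch
        self.side_regs = {}      # additional register array

    def read(self, addr, keep=False):
        self.out_latch = self.array[addr]
        self.side_regs[addr] = self.out_latch  # retained for later re-supply
        if keep:
            # Chip-internal feedback: output latch -> input latch -> array,
            # selectively re-entering the block after reading.
            self.in_latch = self.out_latch
            self.array[addr] = self.in_latch
        return self.out_latch

    def resupply(self, addr):
        # Furnish the block again without accessing the cache array itself.
        return self.side_regs[addr]
```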

    Prefetching system for a cache having a second directory for
sequentially accessed blocks
    5.
    Invention Grant
    Prefetching system for a cache having a second directory for sequentially accessed blocks (Expired)

    Publication number: US4807110A

    Publication date: 1989-02-21

    Application number: US597801

    Filing date: 1984-04-06

    IPC classes: G06F12/08 G06F9/38 G06F12/12

    CPC classes: G06F12/0862 G06F2212/6024

    Abstract: A prefetching mechanism for a system having a cache has, in addition to the normal cache directory, a two-level shadow directory. When an information block is accessed, a parent identifier derived from the block address is stored in a first level of the shadow directory. The address of a subsequently accessed block is stored in the second level of the shadow directory, in a position associated with the first-level position of the respective parent identifier. With each access to an information block, a check is made whether the respective parent identifier is already stored in the first level of the shadow directory. If it is found, a descendant address from the associated second-level position is used to prefetch an information block to the cache if it is not already resident there. This mechanism avoids, with high probability, the occurrence of cache misses.

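    A simplified model of the shadow-directory lookup, with invented names and an arbitrary parent-identifier function: the first level maps a parent identifier to the address accessed next after it, and that descendant is prefetched on a re-access.

```python
class ShadowDirectory:
    def __init__(self):
        self.descendant = {}   # parent id -> address of the next-accessed block
        self.cache = set()     # resident block addresses
        self.prev_parent = None

    @staticmethod
    def parent_id(addr):
        # Illustrative only: derive the parent identifier from the address
        # by dropping the low bits.
        return addr >> 4

    def access(self, addr):
        prefetched = None
        pid = self.parent_id(addr)
        # Check the first level for this parent identifier; if found,
        # prefetch the recorded descendant unless already resident.
        if pid in self.descendant:
            cand = self.descendant[pid]
            if cand not in self.cache:
                self.cache.add(cand)
                prefetched = cand
        # Record this block as the descendant of the previous access.
        if self.prev_parent is not None:
            self.descendant[self.prev_parent] = addr
        self.prev_parent = pid
        self.cache.add(addr)
        return prefetched
```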

    Pageable branch history table
    7.
    Invention Grant
    Pageable branch history table (Expired)

    Publication number: US4679141A

    Publication date: 1987-07-07

    Application number: US728424

    Filing date: 1985-04-29

    IPC classes: G06F9/38 G06F9/00

    CPC classes: G06F9/3806 G06F9/3844

    Abstract: A branch history table (BHT) is substantially improved by dividing it into two parts: an active area and a backup area. The active area contains entries for a small number of branches which the processor can encounter in the near future, and the backup area contains all other branch entries. Means are provided to bring entries from the backup area into the active area ahead of when the processor will use those entries. When entries are no longer needed they are removed from the active area and put into the backup area if not already there. New entries for the near future are brought in, so that the active area, though small, will almost always contain the branch information needed by the processor. The small size of the active area allows it to be fast and to be optimally located in the processor layout. The backup area can be located outside the critical part of the layout and can therefore be made larger than would be practicable for a standard BHT.

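    A rough model of the two-part table, with invented names: only the small active area is consulted for predictions, and entries are staged from the backup area ahead of use, evicting old entries back to the backup area.

```python
class PageableBHT:
    def __init__(self, active_size):
        self.active_size = active_size
        self.active = {}   # branch addr -> target (small, fast area)
        self.backup = {}   # all other entries (large, slower area)

    def predict(self, branch_addr):
        # Only the active area is consulted at prediction time.
        return self.active.get(branch_addr)

    def stage(self, branch_addr):
        # Bring an entry the processor will need soon into the active area,
        # evicting an old entry to the backup area if the active area is full.
        if branch_addr in self.backup:
            if len(self.active) >= self.active_size:
                old, tgt = next(iter(self.active.items()))
                del self.active[old]
                self.backup[old] = tgt
            self.active[branch_addr] = self.backup.pop(branch_addr)

    def record(self, branch_addr, target):
        self.backup[branch_addr] = target
```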

    Data processing system with fast queue store interposed between
store-through caches and a main memory
    9.
    Invention Grant
    Data processing system with fast queue store interposed between store-through caches and a main memory (Expired)

    Publication number: US5155831A

    Publication date: 1992-10-13

    Application number: US342493

    Filing date: 1989-04-24

    Abstract: A fast queue mechanism is provided which keeps a queue of changes (i.e., store actions) issued by each processor; the queue is accessible by all processors. When any processor issues a store action to a line of memory in the queue, the old data is overwritten with the new data. If the queue does not currently have a corresponding entry, a new entry is activated. Room for the new entry is made by selecting some existing entry, either the oldest or the least recently used, to be removed. An entry that is to be removed is first used to update the line corresponding to it in main memory. After the changes held in the entry to be removed are applied to the old value of the line (from main memory) and the updated value is put back into main memory, the entry in the queue is removed by marking it "empty". When a processor accesses a line of data not in its cache, a cache miss occurs and it is necessary to fetch the line from main memory. Such fetches are monitored by the queue mechanism to see if it is holding changes to the line being fetched. If so, the changes are applied to the line coming from main memory before the line is sent to the requesting processor. After a new entry is made in the queue mechanism, other store actions to the same entry by any processor may occur, and usually a number of store actions will occur to the entry before it is removed to make room for another.
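    The queue's behavior can be sketched with a small model (names and the whole-line granularity are my simplifications): stores merge into queue entries, an evicted entry updates main memory, and miss fetches see queued changes before the possibly stale memory copy.

```python
from collections import OrderedDict

class FastQueue:
    def __init__(self, capacity, memory):
        self.capacity = capacity
        self.memory = memory          # main memory: line addr -> data
        self.queue = OrderedDict()    # line addr -> newest data (oldest first)

    def store(self, addr, data):
        if addr in self.queue:
            self.queue[addr] = data   # overwrite old data in the entry
            return
        if len(self.queue) >= self.capacity:
            # Make room: remove the oldest entry, updating main memory
            # first, then marking the entry "empty".
            old_addr, old_data = self.queue.popitem(last=False)
            self.memory[old_addr] = old_data
        self.queue[addr] = data

    def fetch(self, addr):
        # A cache-miss fetch is monitored by the queue: queued changes
        # take precedence over the main-memory copy.
        if addr in self.queue:
            return self.queue[addr]
        return self.memory[addr]
```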

    Methods and apparatus for insulating a branch prediction mechanism from
data dependent branch table updates that result from variable test
operand locations
    10.
    Invention Grant
    Methods and apparatus for insulating a branch prediction mechanism from data dependent branch table updates that result from variable test operand locations (Expired)

    Publication number: US5210831A

    Publication date: 1993-05-11

    Application number: US429922

    Filing date: 1989-10-30

    IPC classes: G06F9/38

    CPC classes: G06F9/3844

    Abstract: Methods and apparatus are described for processing branch instructions using a history based branch prediction mechanism (such as a branch history table) in combination with a data dependent branch table (DDBT), where the branch instructions can vary in both outcome and test operand location. The novel methods and apparatus are sensitive to branch mispredictions and to operand addresses used by the DDBT, to identify irrelevant DDBT entries. Irrelevant DDBT entries are identified within the prediction mechanism using state bits which, when set, indicate that: (1) a given entry in the prediction mechanism was updated by the DDBT and (2) subsequent to such update a misprediction occurred, making further DDBT updates irrelevant. Once a DDBT entry is determined to be irrelevant, it is prevented from updating the prediction mechanism. The invention also provides methods and apparatus for locating and removing irrelevant entries from the DDBT. The update packet, sent by the DDBT to the history based prediction mechanism, is expanded to include the test operand address actually used by the DDBT. If the state bits indicate the update is irrelevant, then the operand address can be used to locate and delete the offending DDBT entry, since the DDBT is organized based on operand addresses. Additionally, the invention provides for inhibiting creation of further DDBT entries when a Branch Wrong Guess event occurs subsequent to a DDBT update to a given prediction mechanism entry.
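    A coarse sketch of the state-bit bookkeeping (all names mine, and the DDBT is reduced to a simple operand-address map): a prediction entry records that it was updated by the DDBT; a subsequent misprediction marks it irrelevant, after which further DDBT updates are ignored and the offending DDBT entry, located by its operand address, is deleted.

```python
class Predictor:
    def __init__(self):
        self.entries = {}  # branch addr -> {"pred", "from_ddbt", "irrelevant"}
        self.ddbt = {}     # test-operand addr -> branch addr

    def ddbt_update(self, branch, operand_addr, prediction):
        e = self.entries.setdefault(
            branch, {"pred": None, "from_ddbt": False, "irrelevant": False})
        if e["irrelevant"]:
            # State bits say DDBT updates no longer help: ignore the update
            # and delete the offending DDBT entry via its operand address.
            self.ddbt.pop(operand_addr, None)
            return
        self.ddbt[operand_addr] = branch
        e["pred"] = prediction
        e["from_ddbt"] = True   # state bit (1): entry was updated by the DDBT

    def mispredict(self, branch):
        e = self.entries.get(branch)
        if e and e["from_ddbt"]:
            e["irrelevant"] = True  # state bit (2): mispredicted after update
```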