Method and apparatus for balancing load vs. store access to a primary data cache
    1.
    Invention Grant (In Force)

    Publication No.: US6163821A

    Publication Date: 2000-12-19

    Application No.: US215354

    Filing Date: 1998-12-18

    Abstract: A computer method and apparatus causes the load-store instruction grouping in a microprocessor instruction pipeline to be disrupted at appropriate times. The method and apparatus employ a memory access member which periodically stalls the issuance of store instructions when prior store instructions are pending in the store queue. The periodic stalls bias the issue stage toward issuing load instruction groups and store instruction groups; in the latter case, the store queue is free to update the data cache with the data from previous store instructions. Thus, the invention's memory access member biases the issuance of store instructions in a manner that prevents the store queue from becoming full, enabling the store queue to write to the data cache before it fills.
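
    As a rough illustration of the mechanism, the C sketch below models an issue stage that periodically refuses to issue a store while older stores are still pending in the store queue, giving the queue a window to drain into the data cache. The interval, the type, and the function names are illustrative assumptions, not details from the patent.

        #include <stdbool.h>

        #define STALL_INTERVAL 4      /* assumed: consider stalling stores every 4th cycle */

        typedef struct {
            int           pending_stores;  /* stores waiting to update the data cache */
            unsigned long cycle;           /* current clock cycle */
        } issue_stage_t;

        /* Returns true if a store instruction may issue this cycle. */
        bool may_issue_store(const issue_stage_t *s)
        {
            bool periodic_stall = (s->cycle % STALL_INTERVAL) == 0;
            /* Stall only when prior stores are still pending in the store queue. */
            return !(periodic_stall && s->pending_stores > 0);
        }

        /* During a stall cycle the store queue is free to retire one
         * pending store into the data cache. */
        void drain_one_store(issue_stage_t *s)
        {
            if (s->pending_stores > 0)
                s->pending_stores--;
        }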

Fast lane prefetching
    2.
    Invention Grant (In Force)

    Publication No.: US06681295B1

    Publication Date: 2004-01-20

    Application No.: US09652451

    Filing Date: 2000-08-31

    IPC Classification: G06F 12/00

    Abstract: A computer system has a set-associative, multi-way cache system in which at least one way is designated as a fast lane and the remaining way(s) are designated slow lanes. Any data that needs to be loaded into the cache, but is not likely to be needed again in the future, preferably is loaded into the fast lane. Data loaded into the fast lane is earmarked for immediate replacement. Data loaded into the slow lanes preferably is data that may be needed again in the near future; slow-lane data is kept in the cache so that it can be reused if necessary. The high-performance mechanism for data access in a modern microprocessor is the prefetch: data is moved into the cache with a special prefetch instruction prior to its intended use. A prefetch instruction requires fewer machine resources than carrying out the same intent with an ordinary load instruction. The slow-lane versus fast-lane decision is therefore made by providing a multiplicity of prefetch instructions. By loading "not likely to be needed again" data into the fast lane and designating such data for immediate replacement, data in other cache blocks, in the other ways, need not be evicted, and overall system performance is increased.
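
    A small C sketch of the fill-placement policy described above, under illustrative assumptions (a 4-way set, way 0 as the fast lane, and the field and function names are mine): a prefetch hint chooses whether an incoming block lands in the fast lane, where it is immediately marked as the next replacement victim, or in a slow lane chosen by a simple rotating victim pointer.

        #include <stdbool.h>

        #define WAYS      4   /* assumed associativity */
        #define FAST_WAY  0   /* way designated as the fast lane */

        typedef struct {
            unsigned tag[WAYS];
            bool     valid[WAYS];
            int      lru_victim;   /* next slow-lane victim, in 1..WAYS-1 */
        } cache_set_t;

        /* Place a filled block; use_fast_lane comes from which variant of the
         * prefetch instruction requested the data. Returns the chosen way. */
        int place_fill(cache_set_t *set, unsigned tag, bool use_fast_lane)
        {
            int way = use_fast_lane ? FAST_WAY : set->lru_victim;
            set->tag[way]   = tag;
            set->valid[way] = true;
            if (!use_fast_lane) {
                /* Rotate the slow-lane victim among ways 1..WAYS-1. */
                set->lru_victim = (set->lru_victim % (WAYS - 1)) + 1;
            }
            /* Fast-lane fills are earmarked for immediate replacement:
             * the next fast-lane fill simply overwrites FAST_WAY, so
             * slow-lane blocks in the other ways are left alone. */
            return way;
        }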

Method and apparatus for minimizing dcache index match aliasing using hashing in synonym/subset processing
    3.
    Invention Grant (Expired)

    Publication No.: US06253285B1

    Publication Date: 2001-06-26

    Application No.: US09116039

    Filing Date: 1998-07-15

    IPC Classification: G06F 12/00

    Abstract: A data caching system comprises a hashing function, a data store, a tag array, a page translator, a comparator and a duplicate tag array. The hashing function combines an index portion of a virtual address with a virtual page portion of the virtual address to form a cache index. The data store comprises a plurality of data blocks for holding data. The tag array comprises a plurality of tag entries corresponding to the data blocks, and both the data store and tag array are addressed with the cache index. The tag array provides a plurality of physical address tags corresponding to physical addresses of data resident within corresponding data blocks in the data store addressed by the cache index. The page translator translates a tag portion of the virtual address to a corresponding physical address tag. The comparator verifies a match between the physical address tag from the page translator and the plurality of physical address tags from the tag array, a match indicating that data addressed by the virtual address is resident within the data store. Finally, the duplicate tag array resolves synonym issues caused by hashing. The hashing function is such that addresses which are equivalent mod 2^13 are pseudo-randomly displaced within the cache. The preferred hashing function maps VA<14, 15 XOR 13, 12:6> to bits <14:6> of the cache index.
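
    The hash is simple enough to show directly. The C sketch below builds the 9-bit cache index <14:6> from virtual-address bits 14, (15 XOR 13), and 12:6, following the mapping quoted in the abstract; the helper names are assumptions made for illustration. Because addresses that are equal mod 2^13 agree in VA<12:6> but generally differ in VA<15:13>, their upper index bits vary, so such addresses are spread pseudo-randomly across cache sets rather than all landing in the same set.

        #include <stdint.h>

        /* Extract bit n of a virtual address. */
        static inline uint64_t bit(uint64_t va, int n) { return (va >> n) & 1u; }

        /* Hashed cache index <14:6> (9 bits). */
        static inline uint32_t hashed_index(uint64_t va)
        {
            uint64_t b14 = bit(va, 14);               /* index bit 14 = VA14          */
            uint64_t b13 = bit(va, 15) ^ bit(va, 13); /* index bit 13 = VA15 XOR VA13 */
            uint64_t low = (va >> 6) & 0x7Fu;         /* index bits 12:6 = VA<12:6>   */
            return (uint32_t)((b14 << 8) | (b13 << 7) | low);
        }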

Method and apparatus for performing speculative memory fills into a microprocessor
    4.
    Invention Grant (Expired)

    Publication No.: US06493802B1

    Publication Date: 2002-12-10

    Application No.: US09099396

    Filing Date: 1998-06-18

    IPC Classification: G06F 12/12

    Abstract: According to the present invention, a cache within a multiprocessor system is speculatively filled. To speculatively fill a designated cache, the invention first determines an address which identifies information located in a main memory. The address may also identify one or more other versions of the information located in one or more caches. The process of filling the designated cache with the information is started by locating the information in the main memory and locating the other versions of the information identified by the address in the caches. The validity of the information located in the main memory is determined after locating the other versions of the information, but the fill of the designated cache from the main memory is initiated before that validity is determined. Thus, the memory reference is speculative.
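
    A minimal C sketch of the ordering the abstract describes, with all type and function names assumed for illustration: the fill from main memory is started first, the checks against the other cached copies resolve later, and the speculatively filled block is then either confirmed or discarded.

        #include <stdbool.h>

        typedef struct {                 /* one in-flight speculative fill */
            unsigned long addr;
            bool          fill_started;  /* memory read already issued        */
            bool          valid_known;   /* other cached copies resolved      */
        } spec_fill_t;

        /* Step 1: begin filling the designated cache from main memory
         * before knowing whether the memory copy is the valid version. */
        void start_speculative_fill(spec_fill_t *f, unsigned long addr)
        {
            f->addr         = addr;
            f->fill_started = true;      /* memory read for addr goes out now */
            f->valid_known  = false;
        }

        /* Step 2: results for the other copies arrive later; if some cache
         * held a newer version, the speculative data must be dropped. */
        void resolve_speculative_fill(spec_fill_t *f, bool memory_copy_valid)
        {
            f->valid_known = true;
            if (!memory_copy_valid) {
                /* discard the speculatively filled block and use the cached version */
            }
        }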

Method and apparatus for developing multiprocessor cache control protocols using a memory management system generating atomic probe commands and system data control response commands
    5.
    Invention Grant (Expired)

    Publication No.: US06349366B1

    Publication Date: 2002-02-19

    Application No.: US09099385

    Filing Date: 1998-06-18

    IPC Classification: G06F 12/00

    CPC Classification: G06F 12/0815

    Abstract: A memory management system couples processors to each other and to a main memory. Each processor may have one or more associated caches local to that processor. A system port of the memory management system receives a request from a source processor to access a block of data from the main memory. A memory manager of the memory management system then converts the request into a probe command having a data movement part, which identifies a condition for moving the block out of a cache of a target processor, and a next coherence state part, which indicates the next state of the block in that cache.
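
    The two-part probe command is easy to picture as a message format. A hypothetical C encoding, with field and enum names that are mine rather than the patent's:

        /* Condition under which the target must move the block out of its cache. */
        typedef enum { MOVE_NEVER, MOVE_IF_DIRTY, MOVE_ALWAYS } data_movement_t;

        /* Next coherence state the block takes in the target's cache. */
        typedef enum { NEXT_INVALID, NEXT_SHARED, NEXT_EXCLUSIVE, NEXT_DIRTY } next_state_t;

        typedef struct {
            unsigned long   block_addr;   /* block the source processor requested */
            data_movement_t movement;     /* data movement part                   */
            next_state_t    next_state;   /* next coherence state part            */
        } probe_command_t;

        /* Example: a read request becomes a probe that forwards the block only
         * if the target holds it dirty, then leaves the target's copy shared. */
        probe_command_t probe = { 0x2000, MOVE_IF_DIRTY, NEXT_SHARED };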

Method and apparatus for developing multiprocessor cache control protocols using atomic probe commands and system data control response commands
    6.
    Invention Grant (Expired)

    Publication No.: US06314496B1

    Publication Date: 2001-11-06

    Application No.: US09099398

    Filing Date: 1998-06-18

    IPC Classification: G06F 12/00

    CPC Classification: G06F 12/0815; G06F 12/0811

    Abstract: A computing apparatus connectable to a cache and a memory includes a system port configured to receive an atomic probe command or a system data control response command having an address part, which identifies data stored in the cache that is associated with data stored in the memory, and a next coherence state part, which indicates a next state of that data in the cache. The computing apparatus further includes an execution unit configured to execute the command, changing the state of the data stored in the cache according to the next coherence state part of the command.
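
    Continuing the sketch from the previous entry, the execution unit's job reduces to looking up the line named by the address part and overwriting its coherence state with the next-state part. A self-contained, hypothetical C version (names assumed):

        typedef enum { INVALID, SHARED, EXCLUSIVE, DIRTY } coherence_state_t;

        typedef struct {
            unsigned long     addr;
            coherence_state_t state;
        } cache_line_t;

        /* Execute an atomic probe / system data control response command:
         * find the addressed line and set it to the commanded next state. */
        void execute_next_state(cache_line_t *lines, int nlines,
                                unsigned long addr_part,
                                coherence_state_t next_state_part)
        {
            for (int i = 0; i < nlines; i++) {
                if (lines[i].state != INVALID && lines[i].addr == addr_part) {
                    lines[i].state = next_state_part;
                    return;
                }
            }
        }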

Method and apparatus for developing multiprocessor cache control protocols using an external acknowledgement signal to set a cache to a dirty state
    7.
    Invention Grant (Expired)

    Publication No.: US06651144B1

    Publication Date: 2003-11-18

    Application No.: US09099384

    Filing Date: 1998-06-18

    IPC Classification: G06F 12/00

    CPC Classification: G06F 12/0815; G06F 12/0817

    Abstract: A computer system includes an external unit governing a cache, which generates a set-dirty request as a function of the coherence state of a block in the cache that is to be modified. The external unit modifies the block of the cache only if an acknowledgment granting permission is received from a memory management system responsive to the set-dirty request. The memory management system receives the set-dirty request, determines the acknowledgment according to a cache protocol based on the contents of the system's caches and the main memory, and sends the acknowledgment to the external unit in response to the set-dirty request. The acknowledgment either grants or denies permission to set the block to the dirty state.
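
    The handshake is a simple request/grant exchange. A hypothetical C sketch follows; the states and the memory-side decision rule are illustrative placeholders, not the patent's protocol table.

        #include <stdbool.h>

        typedef enum { CLEAN, CLEAN_SHARED, DIRTY } block_state_t;

        /* Cache side: before modifying a block, ask permission to mark it dirty. */
        bool request_set_dirty(block_state_t state, unsigned long addr,
                               bool (*ask_memory_system)(unsigned long))
        {
            if (state == DIRTY)
                return true;                 /* already writable               */
            return ask_memory_system(addr);  /* set-dirty request; await ack   */
        }

        /* Memory-system side: grant only if no other cache holds a copy
         * (illustrative rule; the real decision follows the cache protocol). */
        bool handle_set_dirty(bool other_caches_hold_copy)
        {
            return !other_caches_hold_copy;
        }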

Method and apparatus for developing multiprocessor cache control protocols by presenting a clean victim signal to an external system
    8.
    Invention Grant (Expired)

    Publication No.: US06397302B1

    Publication Date: 2002-05-28

    Application No.: US09099304

    Filing Date: 1998-06-18

    IPC Classification: G06F 12/12

    CPC Classification: G06F 12/0822

    Abstract: A multiprocessor system includes a plurality of processors, each processor having one or more caches local to the processor, and a memory controller connectable to the plurality of processors and a main memory. The memory controller manages the caches and the main memory of the multiprocessor system. A processor of the multiprocessor system is configurable to evict a block of data from its cache. The selected block may be in a clean coherence state or a dirty coherence state. The processor communicates a notify signal to the memory controller indicating eviction of the selected block: in addition to sending a write victim notify signal when the selected block has a dirty coherence state, the processor sends a clean victim notify signal when the selected block has a clean coherence state.
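
    Sketching the eviction path in C (names assumed): the addition over a conventional write-back protocol is that a clean eviction also produces a notify message, so the memory controller learns the block is gone even though no data needs to be written back.

        typedef enum { STATE_CLEAN, STATE_DIRTY } victim_state_t;
        typedef enum { CLEAN_VICTIM_NOTIFY, WRITE_VICTIM_NOTIFY } notify_t;

        /* Called when a block is chosen for eviction. Dirty blocks are
         * written back; clean blocks merely announce their departure. */
        notify_t evict_block(victim_state_t state, unsigned long addr,
                             void (*send_notify)(notify_t, unsigned long))
        {
            notify_t n = (state == STATE_DIRTY) ? WRITE_VICTIM_NOTIFY
                                                : CLEAN_VICTIM_NOTIFY;
            send_notify(n, addr);   /* tell the memory controller either way */
            return n;
        }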

Method and apparatus for resolving probes in multi-processor systems which do not use external duplicate tags for probe filtering
    9.
    Invention Grant (Expired)

    Publication No.: US06295583B1

    Publication Date: 2001-09-25

    Application No.: US09099400

    Filing Date: 1998-06-18

    IPC Classification: G06F 12/00

    CPC Classification: G06F 12/0855; G06F 12/0831

    Abstract: A processor of a multiprocessor system is configured to transmit a full probe to a cache associated with the processor to transfer data from the data stored in the cache. The data corresponding to the full probe is transferred during a time period. A first tag-only probe is also transmitted to the cache during the same time period to determine whether the data corresponding to the tag-only probe is part of the data stored in the cache. A stream of probes thus accesses the cache in two stages. The cache is composed of a tag structure and a data structure. In the first stage, a probe is designated a tag-only probe and accesses the tag structure, but not the data structure, to obtain tag information indicating a hit or a miss. In the second stage, if the probe returned tag information indicating a cache hit, it is designated a full probe and accesses the data structure of the cache. If the probe returned tag information indicating a cache miss, it does not proceed to the second stage.
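
    A compact C sketch of the two-stage probe pipeline (all names and the simple direct-mapped tag check are assumptions for illustration): a probe first runs as a tag-only access, and only probes that hit are promoted to full probes that touch the data structure.

        #include <stdbool.h>

        typedef struct { unsigned long tag; bool valid; } tag_entry_t;

        /* Stage 1: tag-only probe; consults only the tag structure. */
        bool tag_only_probe(const tag_entry_t *tags, int set_index, unsigned long tag)
        {
            return tags[set_index].valid && tags[set_index].tag == tag;
        }

        /* Stage 2: only probes that hit are promoted to full probes and read
         * the data structure; misses never occupy the data array, which stays
         * free for the processor and for other full probes. */
        void process_probe(const tag_entry_t *tags, int set_index, unsigned long tag,
                           void (*full_probe_read)(int set_index))
        {
            if (tag_only_probe(tags, set_index, tag))
                full_probe_read(set_index);   /* designated a full probe */
        }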

Method and apparatus for minimizing pincount needed by external memory control chip for multiprocessors with limited memory size requirements
    10.
    Invention Grant (Expired)

    Publication No.: US06199153B1

    Publication Date: 2001-03-06

    Application No.: US09099383

    Filing Date: 1998-06-18

    IPC Classification: G06F 12/00

    CPC Classification: G11C 5/066; G11C 8/00

    Abstract: A computing apparatus has a mode selector configured to select one of a long-bus mode corresponding to a first memory size and a short-bus mode corresponding to a second memory size which is less than the first memory size. An address bus of the computing apparatus is configured to transmit an address consisting of address bits defining the first memory size and a subset of those address bits defining the second memory size. The address bus has N communication lines, each configured to transmit one of the address bits defining the first memory size in the long-bus mode; M of the N communication lines, where M is less than N, are each configured to transmit one of the subset of address bits defining the second memory size in the short-bus mode.
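
    A toy C illustration of the two modes; the widths below are assumptions chosen only to make the arithmetic concrete, not values from the patent. With 38 address lines the chip can address 2^38 bytes in long-bus mode, while short-bus mode drives only the low 34 of those lines, so the remaining pins can be dropped for systems with smaller memory size requirements, which is the pincount saving the title refers to.

        #include <stdint.h>

        #define N_LINES 38   /* assumed long-bus address width  */
        #define M_LINES 34   /* assumed short-bus address width */

        typedef enum { LONG_BUS, SHORT_BUS } bus_mode_t;

        /* Return the address bits actually driven onto the bus in the given mode. */
        uint64_t drive_address(bus_mode_t mode, uint64_t addr)
        {
            int lines = (mode == LONG_BUS) ? N_LINES : M_LINES;
            uint64_t mask = (1ull << lines) - 1;
            return addr & mask;   /* only the low `lines` bits leave the chip */
        }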
