Method and system for managing speculative requests in a multi-level memory hierarchy
    51.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06418516B1

    Publication date: 2002-07-09

    Application No.: US09364409

    Filing date: 1999-07-30

    IPC classification: G06F12/08

    Abstract: A method of operating a multi-level memory hierarchy of a computer system and apparatus embodying the method, wherein instructions having an explicit prefetch request issue directly from an instruction sequence unit to a prefetch unit of the processing unit. The invention applies to values that are either operand data or instructions and treats instructions in a different manner when they are loaded speculatively. These prefetch requests can be demand load requests, where the processing unit will need the operand data or instructions, or speculative load requests, where the processing unit may or may not need the operand data or instructions, but a branch prediction or stream association predicts that they might be needed. The load requests are sent to the lower level cache when the upper level cache does not contain the value required by the load. If a speculative request is for an instruction which is likewise not present in the lower level cache, that request is ignored, keeping both the lower level and upper level caches free of speculative values that are infrequently used. If the value is present in the lower level cache, it is loaded into the upper level cache. If a speculative request is for operand data, the value is loaded only into the lower level cache if it is not already present, keeping the upper level cache free of speculative operand data.
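
The handling rules in this abstract can be sketched as a small two-level cache model. This is only an illustration under stated assumptions: the class and method names (`TwoLevelCache`, `load`, `Kind`) are hypothetical, caches are plain dictionaries with no capacity limit, and the demand-load path (fill both levels) is an assumed policy that the abstract does not spell out.

```python
from enum import Enum


class Kind(Enum):
    INSTRUCTION = "instruction"
    DATA = "operand data"


class TwoLevelCache:
    """Toy model: l1 is the upper level cache, l2 the lower level cache."""

    def __init__(self, memory):
        self.memory = memory  # backing store: address -> value
        self.l1 = {}
        self.l2 = {}

    def load(self, addr, kind, speculative):
        """Return the value placed in (or found in) L1, or None if L1 is left untouched."""
        if addr in self.l1:
            return self.l1[addr]
        # Upper level miss: the request goes to the lower level cache.
        if not speculative:
            # Demand load (assumed policy): fill both levels.
            if addr not in self.l2:
                self.l2[addr] = self.memory[addr]
            self.l1[addr] = self.l2[addr]
            return self.l1[addr]
        if kind is Kind.INSTRUCTION:
            if addr not in self.l2:
                return None  # speculative instruction miss in L2: request ignored
            self.l1[addr] = self.l2[addr]  # present in L2: promote to L1
            return self.l1[addr]
        # Speculative operand data: fill L2 only, never L1.
        if addr not in self.l2:
            self.l2[addr] = self.memory[addr]
        return None
```

In this sketch a speculative instruction request that misses both levels changes no cache state at all, while a speculative operand-data request leaves its value in `l2` only.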

    Protocol for transferring modified-unsolicited state during data intervention
    52.
    Granted invention patent
    Status: In force
    Cache coherency protocol providing a flag from an intermediate cache to indicate release of a modified cache line

    Publication No.: US06349369B1

    Publication date: 2002-02-19

    Application No.: US09437180

    Filing date: 1999-11-09

    IPC classification: G06F12/00

    CPC classification: G06F12/0831

    Abstract: A novel cache coherency protocol provides a modified-unsolicited (MU) cache state to indicate that a value held in a cache line has been modified (i.e., is not currently consistent with system memory), but was modified by another processing unit, not by the processing unit associated with the cache that currently contains the value in the MU state, and that the value is held exclusive of any other horizontally adjacent caches. Because the value is exclusively held, it may be modified in that cache without the necessity of issuing a bus transaction to other horizontal caches in the memory hierarchy. The MU state may be applied as a result of a snoop response to a read request. The read request can include a flag to indicate that the requesting cache is capable of utilizing the MU state. Alternatively, a flag may be provided with intervention data to indicate that the requesting cache should utilize the modified-unsolicited state.
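
A minimal sketch of the MU transition described in this abstract, for a single cache line and two horizontally adjacent caches. The names `Cache`, `read_with_intervention` and `store` are hypothetical. Three behaviours are assumptions rather than statements from the abstract: the former owner invalidates its copy, a requester without MU support falls back to sharing, and an MU line becomes M on a local store.

```python
from enum import Enum


class State(Enum):
    I = "invalid"
    S = "shared"
    M = "modified"
    MU = "modified-unsolicited"


class Cache:
    def __init__(self, supports_mu=True):
        self.supports_mu = supports_mu  # models the flag carried with the read request
        self.state = State.I
        self.value = None


def read_with_intervention(requester, owner):
    """Requester reads a line that `owner` currently holds in M or MU."""
    assert owner.state in (State.M, State.MU)
    requester.value = owner.value
    if requester.supports_mu:
        # The modified value moves to the requester, which now holds it
        # exclusively although its own processor did not modify it.
        requester.state = State.MU
        owner.state, owner.value = State.I, None  # assumed: owner gives up the line
    else:
        # Assumed fallback: conventional sharing of the line.
        requester.state = owner.state = State.S


def store(cache, value):
    """Write to a held line; return True if a bus transaction would be needed.

    Invalidation of other sharers is not modelled here.
    """
    needs_bus = cache.state not in (State.M, State.MU)
    cache.state, cache.value = State.M, value  # assumed: MU becomes M on a local store
    return needs_bus
```

The point of the sketch is the return value of `store`: a line received in MU can be written without any bus transaction, whereas a line received in the shared fallback cannot.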

    Method and system for communication in which a castout operation is cancelled in response to snoop responses
    53.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06349367B1

    Publication date: 2002-02-19

    Application No.: US09368228

    Filing date: 1999-08-04

    IPC classification: G06F13/00

    CPC classification: G06F12/0831 G06F12/0804

    Abstract: An effectively “conditional” cast out operation, or the cast out portion of a combined operation including a related data access, may be cancelled by the combined response to the operation. The combined response logic receives coherency state and/or LRU position information for cache lines corresponding to the cast out victim within snoopers and vertically in-line storage. The combined response logic may also receive information regarding the presence of shared or invalid cache lines in snoopers or lower level storage within the congruence class for the victim, or information regarding the read-once nature of the data access target. Based on these responses, the combined response logic determines whether the cast out should be cancelled and, if so, selects and drives the appropriate combined response code.

    Multiprocessor system bus transaction for transferring exclusive-deallocate cache state to lower lever cache
    54.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06314498B1

    Publication date: 2001-11-06

    Application No.: US09437197

    Filing date: 1999-11-09

    IPC classification: G06F12/08

    CPC classification: G06F12/0831 G06F12/0811

    Abstract: A cache coherency protocol uses an “Exclusive-Deallocate” (ED) coherency state to indicate that a particular value is currently held in an upper level cache in an exclusive, unmodified form (not shared with any other caches of the computer system, including caches associated with the same processing unit), so that the value can conveniently be modified without any lower level bus transactions since no lower level caches have allocated a line for the value. If the value is subsequently modified in the upper level cache, its coherency state is simply switched to “modified” without the need for any bus transactions. Conversely, if the value is evicted from the upper level cache without ever having been modified, it can be loaded into the lower level cache with a coherency state indicating that the lower level cache contains the unmodified value exclusive of all other caches in other processing units of the computer system. If the value is initially loaded into the upper level cache from a cache of another processing unit, or from a lower level cache of the same processing unit, then the upper level cache may be selectively programmed to mark the cache line with the ED state.
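
The two ED transitions in this abstract (store, and eviction without modification) can be sketched as follows. The class `CachePair` and its method names are hypothetical, the model tracks only the ED, E and M states, and writing a modified victim into L2 in the M state is an assumption added so the eviction path is complete.

```python
from enum import Enum


class State(Enum):
    E = "exclusive"
    ED = "exclusive-deallocate"
    M = "modified"


class CachePair:
    """Toy model of one upper level (l1) and one lower level (l2) cache."""

    def __init__(self):
        self.l1 = {}  # addr -> (State, value)
        self.l2 = {}  # addr -> (State, value)

    def fill_exclusive_deallocate(self, addr, value):
        # L1 takes the line in ED; L2 deliberately allocates no line for it.
        self.l1[addr] = (State.ED, value)

    def store(self, addr, value):
        # ED (or M) simply becomes M; nothing is sent to the lower level.
        self.l1[addr] = (State.M, value)

    def evict_from_l1(self, addr):
        state, value = self.l1.pop(addr)
        if state is State.ED:
            # Never modified: L2 receives the line as exclusive and unmodified.
            self.l2[addr] = (State.E, value)
        else:
            # Assumed: a modified victim is written into L2 as modified.
            self.l2[addr] = (State.M, value)
```

While a line sits in ED, `l2` holds no entry for it, which is why the store path has nothing to update below L1.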

    Multiprocessor system bus with combined snoop responses implicitly updating snooper LRU position
    55.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06279086B1

    Publication date: 2001-08-21

    Application No.: US09368227

    Filing date: 1999-08-04

    IPC classification: G06F12/08

    Abstract: Upon snooping a combined data access and cast out/deallocate operation initiated by a horizontal storage device, snoop logic determines, from LRU position information appended to the combined response to the combined operation, whether the coherency state and/or LRU position of the victim may be upgraded within the subject storage device. If so, the coherency state or LRU position is upgraded to improve global data storage management. For instance, a cache line within a snooping storage device may be altered to assume the coherency state of the victim within the storage device initiating the combined operation to improve data storage management under a given replacement policy.

    Multiprocessor system bus with system controller explicitly updating snooper cache state information
    56.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06275909B1

    Publication date: 2001-08-14

    Application No.: US09368226

    Filing date: 1999-08-04

    IPC classification: G06F13/00

    CPC classification: G06F12/0831 G06F12/0811

    Abstract: Combined response logic for a bus receives a combined data access and cast out/deallocate operation initiated by a storage device within a specific level of a storage hierarchy with a coherency state of the cast out/deallocate victim appended. Snoopers on the bus drive snoop responses to the combined operation with the coherency state and/or LRU position of locally-stored cache lines corresponding to the victim appended. The combined response logic determines, from the coherency state information appended to the combined operation and the snoop responses, whether a coherency upgrade is possible. If so, the combined response logic selects a snooper storage device to upgrade the coherency state of a respective cache line corresponding to the victim, and appends an upgrade directive to the combined response. The snooper selected to upgrade the coherency state of a cache line corresponding to the victim may be randomly chosen or, as an optimization, be chosen for having the highest LRU position for the respective cache line.
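
The snooper selection step at the end of this abstract (random choice, or highest LRU position as an optimization) can be sketched as below. Everything beyond that selection rule is assumed for illustration: the names `SnoopResponse` and `select_upgrade`, the state letters and their strength ordering in `RANK`, the convention that a larger `lru_position` is "higher", and the test for whether an upgrade is possible.

```python
import random
from dataclasses import dataclass

# Assumed strength ordering of coherency states (weakest to strongest).
RANK = {"I": 0, "S": 1, "R": 2, "T": 3}


@dataclass
class SnoopResponse:
    snooper_id: int
    state: str         # state of the snooper's copy of the victim line
    lru_position: int  # assumed convention: larger means a higher LRU position


def select_upgrade(victim_state, responses, use_lru=True, rng=random):
    """Return an upgrade directive (snooper_id, new_state), or None if no upgrade is possible."""
    # Assumed test: a snooper qualifies if it holds a valid copy of the
    # victim in a weaker state than the one being cast out.
    candidates = [r for r in responses if 0 < RANK[r.state] < RANK[victim_state]]
    if not candidates:
        return None
    if use_lru:
        chosen = max(candidates, key=lambda r: r.lru_position)
    else:
        chosen = rng.choice(candidates)
    return (chosen.snooper_id, victim_state)
```

Preferring the highest LRU position places the upgraded state in the copy that, under the assumed convention, is least likely to be replaced soon.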

    Imprecise snooping based invalidation mechanism
    57.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06801984B2

    Publication date: 2004-10-05

    Application No.: US09895119

    Filing date: 2001-06-29

    IPC classification: G06F12/08

    CPC classification: G06F12/0831

    Abstract: A method, system, and processor cache configuration that enables efficient retrieval of valid data in response to an invalidate cache miss at a local processor cache. A cache directory is provided with a set of directional bits in addition to the coherency state bits and the address tag. The directional bits provide information that includes a processor cache identification (ID) and routing method. The processor cache ID indicates which processor's operation resulted in the cache line of the local processor changing to the invalidate (I) coherency state. The routing method indicates what transmission method to utilize to forward the cache line, from among a local system bus or a switch or broadcast mechanism. Processor/cache directory logic provides responses to requests depending on the values of the directional bits.
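
A directory entry carrying the directional bits described in this abstract might look like the sketch below. The names `DirectoryEntry`, `Route`, `record_invalidation` and `plan_miss` are hypothetical, the bit fields are modelled as ordinary attributes, and falling back to a broadcast when no hint is available is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Route(Enum):
    LOCAL_BUS = "local system bus"
    SWITCH = "switch"
    BROADCAST = "broadcast"


@dataclass
class DirectoryEntry:
    tag: int
    state: str                           # coherency state, e.g. "S", "M", "I"
    invalidator_id: Optional[int] = None  # directional bits: processor cache ID
    route: Optional[Route] = None         # directional bits: routing method


def record_invalidation(entry, invalidator_id, route):
    """A snooped operation from another processor invalidates the local line."""
    entry.state = "I"
    entry.invalidator_id = invalidator_id
    entry.route = route


def plan_miss(entry, tag):
    """Decide where to send a request that misses on an invalidated line.

    Returns (target cache ID or None, routing method).
    """
    if entry.tag == tag and entry.state == "I" and entry.invalidator_id is not None:
        # The hint may be stale (the mechanism is imprecise), but it names
        # the cache most likely to hold the valid data.
        return (entry.invalidator_id, entry.route)
    return (None, Route.BROADCAST)  # assumed fallback when no hint exists
```

For example, after `record_invalidation(entry, 3, Route.SWITCH)` a later miss on the same tag is planned as a request to cache 3 over the switch.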

    Layered speculative request unit with instruction optimized and storage hierarchy optimized partitions
    58.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06496921B1

    Publication date: 2002-12-17

    Application No.: US09345643

    Filing date: 1999-06-30

    IPC classification: G06F9/30

    Abstract: A method of operating a processing unit of a computer system, by issuing an instruction having an explicit prefetch request directly from an instruction sequence unit to a prefetch unit of the processing unit. The invention applies to values that are either operand data or instructions. In a preferred embodiment, two prefetch units are used, the first prefetch unit being hardware independent and dynamically monitoring one or more active streams associated with operations carried out by a core of the processing unit, and the second prefetch unit being aware of the lower level storage subsystem and sending with the prefetch request an indication that a prefetch value is to be loaded into a lower level cache of the processing unit. The invention may advantageously associate each prefetch request with a stream ID of an associated processor stream, or a processor ID of the requesting processing unit (the latter feature is particularly useful for caches which are shared by a processing unit cluster).

    Multiprocessor computer system with sectored cache line system bus protocol mechanism
    59.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06484241B2

    Publication date: 2002-11-19

    Application No.: US09752862

    Filing date: 2000-12-28

    IPC classification: G06F12/00

    CPC classification: G06F12/0831

    Abstract: A method of maintaining coherency in a multiprocessor computer system wherein each processing unit's cache has sectored cache lines. A first cache coherency state is assigned to one of the sectors of a particular cache line, and a second cache coherency state, different from the first cache coherency state, is assigned to the overall cache line while maintaining the first cache coherency state for the first sector. The first cache coherency state may provide an indication that the first sector contains a valid value which is not shared with any other cache (i.e., an exclusive or modified state), and the second cache coherency state may provide an indication that at least one of the sectors in the cache line contains a valid value which is shared with at least one other cache (a shared, recently-read, or tagged state). Other coherency states may be applied to other sectors in the same cache line. Partial intervention may be achieved by issuing a request to retrieve an entire cache line, and sourcing only a first sector of the cache line in response to the request. A second sector of the same cache line may be sourced from a third cache. Other sectors may also be sourced from a system memory device of the computer system. Appropriate system bus codes are utilized to transmit cache operations to the system bus and indicate which sectors of the cache line are targets of the cache operation.
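
Partial intervention, as described in this abstract, assembles one requested cache line from several sources, sector by sector. A minimal sketch follows; the function name `gather_line`, the per-sector state letters, and the rule that only sectors held in M or E are sourced by a cache are assumptions made for illustration.

```python
# Assumed: a cache sources a sector only if it holds it exclusively or modified.
SOURCEABLE = {"M", "E"}


def gather_line(n_sectors, caches, memory_line):
    """Assemble a full cache line sector by sector.

    caches: dict mapping cache name -> {sector index: (state, value)}
    memory_line: list of sector values held in system memory
    Returns a list of (source, value) pairs, one per sector.
    """
    line = []
    for i in range(n_sectors):
        source, value = "memory", memory_line[i]  # default: system memory
        for name, sectors in caches.items():
            state, cached = sectors.get(i, ("I", None))
            if state in SOURCEABLE:
                source, value = name, cached  # this cache intervenes for sector i
                break
        line.append((source, value))
    return line
```

With one cache holding sector 0 modified and another holding sector 1 exclusive, a single line request is answered by two different caches plus system memory for the remaining sectors.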

    Method for upper level cache victim selection management by a lower level cache
    60.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06446166B1

    Publication date: 2002-09-03

    Application No.: US09340073

    Filing date: 1999-06-25

    IPC classification: G06F12/14

    Abstract: A method of improving memory access for a computer system, by sending load requests to a lower level storage subsystem along with associated information pertaining to intended use of the requested information by the requesting processor, without using a high level load queue. Returning the requested information to the processor along with the associated use information allows the information to be placed immediately without using reload buffers. A register load bus separate from the cache load bus (and having a smaller granularity) is used to return the information. An upper level (L1) cache may then be imprecisely reloaded (the upper level cache can also be imprecisely reloaded with store instructions). The lower level (L2) cache can monitor L1 and L2 cache activity, which can be used to select a victim cache block in the L1 cache (based on the additional L2 information), or to select a victim cache block in the L2 cache (based on the additional L1 information). L2 control of the L1 directory also allows certain snoop requests to be resolved without waiting for L1 acknowledgement. The invention can be applied to, e.g., instruction, operand data and translation caches.
