Method for instruction extensions for a tightly coupled speculative request unit
    53.
    Granted patent
    Method for instruction extensions for a tightly coupled speculative request unit (Active)

    Publication No.: US06421763B1

    Publication Date: 2002-07-16

    Application No.: US09345642

    Filing Date: 1999-06-30

    IPC Class: G06F 12/08

    Abstract: A method of operating a processing unit of a computer system, by issuing an instruction having an explicit prefetch request directly from an instruction sequence unit to a prefetch unit of the processing unit. The invention applies to values that are either operand data or instructions. In a preferred embodiment, two prefetch units are used, the first prefetch unit being hardware independent and dynamically monitoring one or more active streams associated with operations carried out by a core of the processing unit, and the second prefetch unit being aware of the lower level storage subsystem and sending with the prefetch request an indication that a prefetch value is to be loaded into a lower level cache of the processing unit. The invention may advantageously associate each prefetch request with a stream ID of an associated processor stream, or a processor ID of the requesting processing unit (the latter feature is particularly useful for caches which are shared by a processing unit cluster). If another prefetch value is requested from the memory hierarchy, and it is determined that the cache has reached its prefetch limit of cache usage, then a cache line in the cache containing one of the earlier prefetch values is allocated for receiving the new prefetch value. The prefetch limit of cache usage may be established as a maximum number of sets in a congruence class usable by the requesting processing unit. A flag in a directory of the cache may be set to indicate that the prefetch value was retrieved as the result of a prefetch operation. In the implementation wherein the cache is a multi-level cache, a second flag in the cache directory may be set to indicate that the prefetch value has been sourced to an upstream cache. A cache line containing prefetch data can be automatically invalidated after a preset amount of time has passed since the prefetch value was requested.
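The prefetch-limit policy described above can be sketched as a small model. This is a minimal illustration under assumed names (`CongruenceClass`, `install_prefetch`), not the patent's implementation: at most `prefetch_limit` ways of a congruence class may hold prefetched lines, and a new prefetch reuses the line holding the oldest earlier prefetch value once the limit is reached.

```python
# Hypothetical sketch of a prefetch limit on congruence-class usage.
# All class and method names are illustrative, not from the patent.

class CongruenceClass:
    def __init__(self, ways, prefetch_limit):
        self.ways = ways                      # total associativity
        self.prefetch_limit = prefetch_limit  # max ways usable for prefetches
        self.lines = []  # (tag, is_prefetch) pairs in LRU order, oldest first

    def install_prefetch(self, tag):
        prefetched = [i for i, (_, p) in enumerate(self.lines) if p]
        if len(prefetched) >= self.prefetch_limit:
            # Limit reached: reuse the line holding the oldest prefetch value.
            self.lines.pop(prefetched[0])
        elif len(self.lines) >= self.ways:
            self.lines.pop(0)  # ordinary LRU eviction
        self.lines.append((tag, True))
```

With `ways=4, prefetch_limit=2`, installing prefetches A, B, C leaves only B and C resident: the third prefetch displaces A even though two ways are still free, because those ways are reserved for demand-fetched lines.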


    High performance multichannel DMA controller for a PCI host bridge with a built-in cache
    54.
    Granted patent
    High performance multichannel DMA controller for a PCI host bridge with a built-in cache (Expired)

    Publication No.: US06230219B1

    Publication Date: 2001-05-08

    Application No.: US08966873

    Filing Date: 1997-11-10

    IPC Class: G06F 13/28

    CPC Class: G06F 13/30

    Abstract: A host bridge having a dataflow controller is provided. In a preferred embodiment, the host bridge contains a read command path which has a mechanism for requesting and receiving data from an upstream device. The host bridge also contains a write command path that has means for receiving data from a downstream device and for transmitting the received data to an upstream device. A target controller is used to receive the read and write commands from the downstream device and to steer the read command toward the read command path and the write command toward the write command path. A bus controller is also used to request control of an upstream bus before transmitting the request for data of the read command and transmitting the data of the write command.
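The target controller's steering role can be sketched as follows. This is an illustrative model with assumed names (`HostBridge`, `target_controller`); the actual bridge is hardware, and the paths here are reduced to queues:

```python
# Sketch of command steering in a host bridge (names assumed): reads are
# routed to the read command path, writes to the write command path.

class HostBridge:
    def __init__(self):
        self.read_path = []   # read commands queued toward the upstream bus
        self.write_path = []  # write commands (with data) queued upstream

    def target_controller(self, command):
        """Receive a downstream command and steer it to the proper path."""
        kind, addr, payload = command
        if kind == "read":
            self.read_path.append(addr)
        elif kind == "write":
            self.write_path.append((addr, payload))
        else:
            raise ValueError(f"unknown command kind: {kind}")
```

A bus controller (not modeled here) would then arbitrate for the upstream bus before either path transmits.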


    High performance cache intervention mechanism for symmetric multiprocessor systems
    55.
    Granted patent
    High performance cache intervention mechanism for symmetric multiprocessor systems (Expired)

    Publication No.: US06763433B1

    Publication Date: 2004-07-13

    Application No.: US09696910

    Filing Date: 2000-10-26

    IPC Class: G06F 12/08

    CPC Class: G06F 12/0831, G06F 12/0888

    Abstract: Upon snooping an operation in which an intervention is permitted or required, an intervening cache may elect to source only that portion of a requested cache line which is actually required, rather than the entire cache line. For example, if the intervening cache determines that the requesting cache would likely be required to invalidate the cache line soon after receipt, less than the full cache line may be sourced to the requesting cache. The requesting cache will not cache less than a full cache line, but may forward the received data to the processor supported by the requesting cache. Data bus bandwidth utilization may therefore be reduced. Additionally, the need to subsequently invalidate the cache line within the requesting cache is avoided, together with the possibility that the requesting cache will retry an operation requiring invalidation of the cache line.
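The partial-intervention decision can be sketched in a few lines. This is a minimal model under assumed names and sector sizes (`intervention_data`, 128-byte lines, 32-byte sectors), not the patent's hardware logic:

```python
# Sketch of partial cache-line intervention (assumed names and sizes):
# when the requester is predicted to invalidate the line soon, source only
# the sector containing the requested offset instead of the whole line.

CACHE_LINE_BYTES = 128
SECTOR_BYTES = 32

def intervention_data(line, offset, will_invalidate_soon):
    """Return the bytes to drive onto the data bus for an intervention."""
    if will_invalidate_soon:
        sector = offset - (offset % SECTOR_BYTES)
        return line[sector:sector + SECTOR_BYTES]   # partial line only
    return line                                      # full cache line
```

The requester would forward such a partial response straight to its processor rather than caching it, which is what saves both bus bandwidth and a later invalidation.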


    Enhanced cache management mechanism via an intelligent system bus monitor
    56.
    Granted patent
    Enhanced cache management mechanism via an intelligent system bus monitor (Expired)

    Publication No.: US06721856B1

    Publication Date: 2004-04-13

    Application No.: US09696887

    Filing Date: 2000-10-26

    IPC Class: G06F 12/00

    CPC Class: G06F 12/0831

    Abstract: In addition to an address tag, a coherency state and an LRU position, each cache directory entry includes historical processor access, snoop operation, and system controller hint information for the corresponding cache line. Each entry includes different subentries for the different processors which have accessed the corresponding cache line, with each subentry containing a processor access sequence segment, a snoop operation sequence segment, and a system controller hint history segment. In addition to an address tag, each system controller bus transaction sequence log directory entry contains one or more opcodes identifying bus operations addressing the corresponding cache line, a processor identifier associated with each opcode, and a timestamp associated with each opcode. Along with each system bus transaction's opcode, the individual snoop responses that were received from one or more snoopers and the hint information that was provided to the requester and the snoopers may also be included. This information may then be utilized by the system controller to append hints to the combined snoop responses in order to influence how cache controllers (the requestor(s), snoopers, or both) handle victim selection, coherency state transitions, LRU state transitions, deallocation timing, and other cache management functions.
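The extended directory entry described above can be sketched as a data structure. Field and class names here are illustrative assumptions, not the patent's terminology:

```python
# Sketch of a cache directory entry extended with per-processor history
# subentries (all names assumed for illustration).

from dataclasses import dataclass, field

@dataclass
class Subentry:
    processor_id: int
    access_history: list = field(default_factory=list)  # (opcode, timestamp)
    snoop_history: list = field(default_factory=list)   # (opcode, timestamp)
    hint_history: list = field(default_factory=list)    # controller hints

@dataclass
class DirectoryEntry:
    tag: int                      # address tag
    coherency_state: str = "I"    # e.g. a MESI-style state
    lru_position: int = 0
    subentries: list = field(default_factory=list)  # one Subentry per processor
```

A system controller consulting such entries could append hints to a combined snoop response, e.g. suggesting a victim whose history shows no recent accesses.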


    Dynamic cache management in a symmetric multiprocessor system via snoop operation sequence analysis
    57.
    Granted patent
    Dynamic cache management in a symmetric multiprocessor system via snoop operation sequence analysis (Expired)

    Publication No.: US06601144B1

    Publication Date: 2003-07-29

    Application No.: US09696912

    Filing Date: 2000-10-26

    IPC Class: G06F 12/08

    Abstract: In addition to an address tag, a coherency state and an LRU position, each cache directory entry includes historical processor access and snoop operation information for the corresponding cache line. The historical processor access and snoop operation information includes a different subentry for each different processor which has accessed the corresponding cache line, with subentries being "pushed" along the stack when a new processor accesses the subject cache line. Each subentry contains the processor identifier for the corresponding processor which accessed the cache line, a processor access history segment, and a snoop operation history segment. The processor access history segment contains one or more opcodes identifying the operations which were performed by the processor, and timestamps associated with each opcode. The snoop operation history segment contains, for each operation snooped by the respective processor, a processor identifier for the processor originating the snooped operation, an opcode identifying the snooped operation, and a timestamp identifying when the operation was snooped. This historical processor access and snoop operation information may then be utilized by the cache controller to influence victim selection, coherency state transitions, LRU state transitions, deallocation timing, and other cache management functions, so that smaller caches achieve the effectiveness of very large caches through more intelligent cache management.
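The "push" behavior of the subentry stack can be sketched as follows. Names and the capacity constant are assumptions for illustration; the patent does not specify them:

```python
# Sketch of stack-like subentry management (assumed names): the subentry of
# the most recent accessor sits on top; older subentries shift down and the
# oldest is dropped once capacity is reached.

MAX_SUBENTRIES = 4  # assumed capacity per directory entry

def record_access(subentries, processor_id, opcode, timestamp):
    # Reuse an existing subentry for this processor, or create a new one.
    for i, sub in enumerate(subentries):
        if sub["pid"] == processor_id:
            entry = subentries.pop(i)
            break
    else:
        entry = {"pid": processor_id, "accesses": [], "snoops": []}
    entry["accesses"].append((opcode, timestamp))
    subentries.insert(0, entry)        # push onto the top of the stack
    del subentries[MAX_SUBENTRIES:]    # drop the oldest beyond capacity
    return subentries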


    Multi-node data processing system and method of queue management in which a queued operation is speculatively cancelled in response to a partial combined response
    58.
    Granted patent
    Multi-node data processing system and method of queue management in which a queued operation is speculatively cancelled in response to a partial combined response (Expired)

    Publication No.: US06591307B1

    Publication Date: 2003-07-08

    Application No.: US09436897

    Filing Date: 1999-11-09

    IPC Class: G06F 1/12

    Abstract: A data processing system includes an interconnect, a plurality of nodes coupled to the interconnect that each include at least one agent, response logic within each node, and a queue. In response to snooping a transaction on the interconnect, each agent outputs a snoop response. In addition, the queue, which has an associated agent, allocates an entry to service the transaction. The response logic within each node accumulates a partial combined response of its node and any preceding node until a complete combined response for all of the plurality of nodes is obtained. However, prior to the associated agent receiving the complete combined response, the queue speculatively deallocates the entry if the partial combined response indicates that an agent other than the associated agent will service the transaction.
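The speculative-deallocation rule can be sketched in a few lines. This is a hedged model with assumed names (`update_entry`, a dict-shaped partial combined response); the real mechanism operates on hardware queue entries:

```python
# Sketch of speculative queue deallocation (names assumed): free the entry
# early if the partial combined response already identifies another agent
# as the one that will service the transaction.

def update_entry(entry, partial_combined_response, my_agent_id):
    """Return True if the queue entry can be speculatively deallocated."""
    servicer = partial_combined_response.get("servicer")
    if servicer is not None and servicer != my_agent_id:
        entry["allocated"] = False   # another agent will service it
        return True
    return False
```

Deallocating before the complete combined response arrives frees queue capacity earlier, at the cost of being speculative: the partial response must already be definitive about the servicer for this to be safe.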


    Multiprocessor system snoop scheduling mechanism for limited bandwidth snoopers performing directory update
    59.
    Granted patent
    Multiprocessor system snoop scheduling mechanism for limited bandwidth snoopers performing directory update

    Publication No.: US06546468B2

    Publication Date: 2003-04-08

    Application No.: US09749054

    Filing Date: 2000-12-27

    IPC Class: G06F 12/00

    CPC Class: G06F 12/0831

    Abstract: A multiprocessor computer system in which snoop operations of the caches are synchronized, allowing a cache operation to be issued during a cycle selected according to the particular manner in which the caches have been synchronized. Each cache controller is aware of when these synchronized snoop tenures occur, and can target these cycles for certain types of requests that are sensitive to snooper retries, such as kill-type operations. The synchronization may set up a priority scheme for systems with multiple interconnect buses, or may synchronize the refresh cycles of the DRAM memory of the snooper's directory. In another aspect of the invention, windows are created during which a directory will not receive write operations (i.e., the directory is reserved for read-type operations only). The invention may be implemented in a cache hierarchy which provides memory arranged in banks, the banks being similarly synchronized. The invention is not limited to any particular type of instruction, and the synchronization functionality may be hardware or software programmable.
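Targeting a synchronized snoop tenure amounts to aligning an issue cycle with a known period and phase. The sketch below is an illustration under assumed constants (`SYNC_PERIOD`, `SYNC_OFFSET`); the patent leaves the synchronization scheme programmable:

```python
# Sketch of scheduling a retry-sensitive (e.g. kill-type) operation onto a
# synchronized snoop tenure. Period and phase values are assumed.

SYNC_PERIOD = 8   # snoopers accept retry-sensitive ops every 8th cycle
SYNC_OFFSET = 3   # phase of the synchronized tenure within the period

def next_kill_cycle(current_cycle):
    """First cycle >= current_cycle that falls in a synchronized tenure."""
    phase = (current_cycle - SYNC_OFFSET) % SYNC_PERIOD
    if phase == 0:
        return current_cycle
    return current_cycle + (SYNC_PERIOD - phase)
```

Issuing kill-type operations only on such cycles avoids retries from limited-bandwidth snoopers that would otherwise be busy with directory updates.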

    Layered speculative request unit with instruction optimized and storage hierarchy optimized partitions
    60.
    Granted patent
    Layered speculative request unit with instruction optimized and storage hierarchy optimized partitions (Expired)

    Publication No.: US06496921B1

    Publication Date: 2002-12-17

    Application No.: US09345643

    Filing Date: 1999-06-30

    IPC Class: G06F 9/30

    Abstract: A method of operating a processing unit of a computer system, by issuing an instruction having an explicit prefetch request directly from an instruction sequence unit to a prefetch unit of the processing unit. The invention applies to values that are either operand data or instructions. In a preferred embodiment, two prefetch units are used, the first prefetch unit being hardware independent and dynamically monitoring one or more active streams associated with operations carried out by a core of the processing unit, and the second prefetch unit being aware of the lower level storage subsystem and sending with the prefetch request an indication that a prefetch value is to be loaded into a lower level cache of the processing unit. The invention may advantageously associate each prefetch request with a stream ID of an associated processor stream, or a processor ID of the requesting processing unit (the latter feature is particularly useful for caches which are shared by a processing unit cluster).
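The two-layer partition described above can be sketched as a pipeline of two steps. All names and the request format are assumptions for illustration:

```python
# Sketch of the layered prefetch unit (names assumed): the first,
# hardware-independent layer tracks the stream and produces (stream, addr);
# the second, storage-hierarchy-aware layer attaches the target cache level.

def build_prefetch_request(stream_id, address, target_level="L2"):
    # Layer 1 output: which stream the address belongs to.
    request = {"stream": stream_id, "addr": address}
    # Layer 2: annotate with the cache level the value should be loaded into.
    request["fill_level"] = target_level
    return request
```

Keeping layer 1 ignorant of the storage hierarchy is what makes it hardware independent; only layer 2 needs to change when the cache hierarchy does.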
