Multi-node data processing system and communication protocol having a partial combined response
    41.
    Granted invention patent
    Multi-node data processing system and communication protocol having a partial combined response (lapsed)

    Publication number: US06519649B1

    Publication date: 2003-02-11

    Application number: US09436899

    Filing date: 1999-11-09

    IPC classification: G06F15/16

    CPC classification: G06F12/0813

    Abstract: A data processing system includes an interconnect and first and second nodes, coupled to the interconnect, that each include at least one agent. Each agent within the first and second nodes outputs a snoop response in response to snooping a transaction on the interconnect. Utilizing the snoop response of each agent within the first node, first response logic within the first node produces a first cumulative combined response. This first cumulative combined response is then combined by second response logic in the second node with the snoop response of each agent in the second node to produce a second cumulative combined response. After a complete combined response is obtained in this manner, the complete combined response is distributed to all nodes so that each agent can determine its response, if any, to the transaction.
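
    Illustration (not part of the patent record): a minimal Python sketch of how a cumulative combined response could be accumulated node by node. The response codes, their priority ordering, and the "highest priority wins" merge rule are assumptions made for this example; the abstract does not specify them.

```python
from enum import IntEnum
from functools import reduce


class Resp(IntEnum):
    # Hypothetical snoop response codes; a higher value means higher priority.
    NULL = 0
    SHARED = 1
    MODIFIED = 2
    RETRY = 3


def combine(a, b):
    """Merge two responses by keeping the higher-priority one (assumed rule)."""
    return max(a, b)


def node_response(incoming, local_snoop_responses):
    """Response logic of one node: fold the local agents' snoop responses
    into the cumulative combined response received from the previous node."""
    return reduce(combine, local_snoop_responses, incoming)


def complete_combined_response(nodes):
    """Pass the cumulative response from node to node and return the
    complete combined response that would be distributed to all nodes."""
    cumulative = Resp.NULL
    for local in nodes:
        cumulative = node_response(cumulative, local)
    return cumulative


# Two nodes with two agents each.
nodes = [[Resp.NULL, Resp.SHARED], [Resp.NULL, Resp.RETRY]]
first = node_response(Resp.NULL, nodes[0])   # first cumulative combined response
final = complete_combined_response(nodes)    # complete combined response
print(first.name, final.name)                # prints "SHARED RETRY"
```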

    Bus master and bus snooper for execution of global operations utilizing a single token for multiple operations with explicit release
    42.
    Granted invention patent
    Bus master and bus snooper for execution of global operations utilizing a single token for multiple operations with explicit release (lapsed)

    Publication number: US06516368B1

    Publication date: 2003-02-04

    Application number: US09435928

    Filing date: 1999-11-09

    IPC classification: G06F13/14

    CPC classification: G06F12/0831

    Abstract: In response to a need to initiate one or more global operations, a bus master within a multiprocessor system issues a combined token and operation request in a single bus transaction on a bus coupled to the bus master. The combined token and operation request solicits a single existing token required to complete the global operations within the multiprocessor system and identifies the first of the global operations to be processed with the token, if granted. Once a bus master is granted the token, no other bus master will be granted the token until the current token owner explicitly requests release. The current token owner repeats the combined token and operation request for each global operation which needs to be initiated and, on the last global operation, issues a combined request with an explicit release. Acknowledgement of the combined request with release implies release of the token for use by other bus masters.
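
    Illustration (not part of the patent record): a toy Python model of a single token that is requested together with each operation and released explicitly on the last one. The class name `TokenArbiter`, the method `combined_request`, and the True/False acknowledge-or-retry convention are hypothetical; the real mechanism is a bus protocol, not a software object.

```python
class TokenArbiter:
    """Toy model of the single token for global operations (hypothetical names)."""

    def __init__(self):
        self.owner = None  # bus master currently holding the token, if any

    def combined_request(self, master, operation, release=False):
        """Handle a combined token-and-operation request.

        Returns True if the request is acknowledged, False if the master
        must retry because another master holds the token.
        """
        if self.owner not in (None, master):
            return False          # token held by another master: retry
        self.owner = master       # grant the token, or keep it with its owner
        # The global operation named by `operation` would be processed here.
        if release:
            self.owner = None     # explicit release with the last operation
        return True


arb = TokenArbiter()
log = [
    arb.combined_request("A", "op1"),                # A is granted the token
    arb.combined_request("B", "opX"),                # B must retry
    arb.combined_request("A", "op2", release=True),  # A's last operation, with release
    arb.combined_request("B", "opX"),                # B now obtains the token
]
print(log)  # prints [True, False, True, True]
```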

    Bus snooper for SMP execution of global operations utilizing a single token with implied release
    43.
    Granted invention patent
    Bus snooper for SMP execution of global operations utilizing a single token with implied release (lapsed)

    Publication number: US06460100B1

    Publication date: 2002-10-01

    Application number: US09435929

    Filing date: 1999-11-09

    IPC classification: G06F13/14

    CPC classification: G06F13/37

    Abstract: Only a single snooper queue for global operations within a multiprocessor system is implemented within each bus snooper, controlled by a single token allowing completion of one operation. A bus snooper, upon detecting a combined token and operation request, begins speculatively processing the operation if the snooper is not already busy. The snooper then watches for a combined response acknowledging the combined request or a subsequent token request from the same processor, which indicates that the originating processor has been granted the sole token for completing global operations, before completing the operation. When processing an operation from a combined request and detecting an operation request (only) from a different processor, which indicates that another processor has been granted the token, the snooper suspends processing of the current operation and begins processing the new operation. If the snooper is busy when a combined request is received, the snooper retries the operation portion of the combined request and, upon detecting a subsequent operation request (only) for the operation, begins processing the operation at that time if not busy. Snoop logic for large multiprocessor systems is thus simplified, with conflict reduced to situations in which multiple processors are competing for the token.

    Method for alternate preferred time delivery of load data
    44.
    Granted invention patent
    Method for alternate preferred time delivery of load data (lapsed)

    Publication number: US06389529B1

    Publication date: 2002-05-14

    Application number: US09344059

    Filing date: 1999-06-25

    IPC classification: G06F9/312

    Abstract: A system for time-ordered execution of load instructions. More specifically, the system enables just-in-time delivery of data requested by a load instruction. The system consists of a processor, an L1 data cache with a corresponding L1 cache controller, and an instruction processor. The instruction processor manipulates a plurality of architected time dependency fields of a load instruction to create a plurality of dependency fields. The dependency fields hold relative dependency values that are used to order the load instruction in a Relative Time-Ordered Queue (RTOQ) of the L1 cache controller. The load instruction is sent from the RTOQ to the L1 data cache at a particular time so that the requested data is loaded from the L1 data cache at the time specified by one of the dependency fields. The dependency fields are prioritized so that the cycle corresponding to the highest-priority field that is available is utilized.
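
    Illustration (not part of the patent record): a rough Python sketch of a queue that schedules each load at the highest-priority preferred cycle that is still free. The class and method names are hypothetical, and two simplifications are assumed: each cycle can serve only one load, and the dependency fields are already expressed as cycle numbers listed from highest to lowest priority.

```python
import heapq


class RelativeTimeOrderedQueue:
    """Toy stand-in for the RTOQ described in the abstract (hypothetical API)."""

    def __init__(self):
        self._heap = []      # (cycle, load name) pairs, ordered by cycle
        self._taken = set()  # cycles that are already reserved

    def enqueue(self, load, preferred_cycles):
        """Reserve the first free cycle in `preferred_cycles`.

        `preferred_cycles` is listed from highest to lowest priority.
        Returns the reserved cycle, or None if every preference is taken.
        """
        for cycle in preferred_cycles:
            if cycle not in self._taken:
                self._taken.add(cycle)
                heapq.heappush(self._heap, (cycle, load))
                return cycle
        return None

    def dispatch_order(self):
        """Drain the queue, returning (cycle, load) pairs in time order."""
        return [heapq.heappop(self._heap) for _ in range(len(self._heap))]


q = RelativeTimeOrderedQueue()
c1 = q.enqueue("ld1", [5, 7])  # gets its first choice, cycle 5
c2 = q.enqueue("ld2", [5, 3])  # cycle 5 is taken, falls back to cycle 3
c3 = q.enqueue("ld3", [5, 3])  # no preferred cycle is free
order = q.dispatch_order()
print(c1, c2, c3, order)       # prints 5 3 None [(3, 'ld2'), (5, 'ld1')]
```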

    Merged vertical cache controller mechanism with combined cache controller and snoop queries for in-line caches
    45.
    Granted invention patent
    Merged vertical cache controller mechanism with combined cache controller and snoop queries for in-line caches (lapsed)

    Publication number: US06347363B1

    Publication date: 2002-02-12

    Application number: US09024316

    Filing date: 1998-02-17

    IPC classification: G06F12/08

    Abstract: Logically in-line caches within a multilevel cache hierarchy are jointly controlled by a single cache controller. By combining the cache controller and snoop logic for different levels within the cache hierarchy, separate queues are not required for each level. During a cache access, cache directories are looked up in parallel. Data is retrieved from an upper cache if hit, or from the lower cache if the upper cache misses and the lower cache hits. LRU units may be updated in parallel based on cache directory hits. Alternatively, the lower cache LRU unit may be updated based on cache memory accesses rather than cache directory hits, or the cache hierarchy may be provided with user selectable modes of operation for both LRU unit update schemes. The merged vertical cache controller mechanism does not require the lower cache memory to be inclusive of the upper cache memory. A novel deallocation scheme and update protocol may be implemented in conjunction with the merged vertical cache controller mechanism.
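
    Illustration (not part of the patent record): a small Python sketch of the lookup order the abstract describes, with both directories consulted for one access and data taken from the upper cache on a hit, otherwise from the lower cache. Plain dictionaries stand in for each cache's directory and data array, and the function name `merged_lookup` is hypothetical.

```python
def merged_lookup(address, upper, lower):
    """Look up `address` in two in-line caches under one controller.

    `upper` and `lower` map address -> data. In hardware both directory
    lookups would happen in parallel; here they are simply both evaluated
    before a data source is chosen.
    """
    upper_hit = address in upper
    lower_hit = address in lower
    if upper_hit:
        return "upper", upper[address]
    if lower_hit:
        return "lower", lower[address]
    return "miss", None


# The lower cache is not required to include the upper cache's contents.
upper = {0x10: "a"}
lower = {0x20: "b"}
print(merged_lookup(0x10, upper, lower))  # prints ('upper', 'a')
print(merged_lookup(0x20, upper, lower))  # prints ('lower', 'b')
print(merged_lookup(0x30, upper, lower))  # prints ('miss', None)
```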

    Multiprocessor system bus with system controller explicitly updating snooper LRU information
    46.
    Granted invention patent
    Multiprocessor system bus with system controller explicitly updating snooper LRU information (lapsed)

    Publication number: US06338124B1

    Publication date: 2002-01-08

    Application number: US09368229

    Filing date: 1999-08-04

    IPC classification: G06F12/08

    CPC classification: G06F12/123 G06F12/0831

    Abstract: Combined response logic for a bus receives a combined data access and cast out/deallocate operation initiated by a storage device within a specific level of a storage hierarchy, with a coherency state and LRU position of the cast out/deallocate victim appended. Snoopers on the bus drive snoop responses to the combined operation with the coherency state and/or LRU position of locally-stored cache lines corresponding to the victim appended. The combined response logic determines, from the coherency state and LRU position information appended to the combined operation and the snoop responses, whether an update of the LRU position and/or coherency state of a cache line corresponding to the victim within one of the snoopers is required. If so, the combined response logic selects a snooper storage device to have at least the LRU position of a respective cache line corresponding to the victim updated, and appends an update command identifying the selected snooper to the combined response. The snooper selected to be updated may be randomly chosen, selected based on LRU position of the cache line corresponding to the victim within respective storage, or selected based on other criteria.

    Multiprocessor system bus with combined snoop responses explicitly cancelling master allocation of read data
    47.
    Granted invention patent
    Multiprocessor system bus with combined snoop responses explicitly cancelling master allocation of read data (lapsed)

    Publication number: US06321305B1

    Publication date: 2001-11-20

    Application number: US09368230

    Filing date: 1999-08-04

    IPC classification: G06F12/00

    Abstract: In cancelling the cast out portion of a combined operation including a data access related to the cast out, the combined response logic explicitly directs the storage device initiating the combined operation not to allocate storage for the target of the data access. Instead, the target of the data access may be passed directly to an in-line processor core without storage, may be stored in a horizontal storage device, or may be stored in an in-line, noninclusive, lower level storage device. Cancellation of the cast out thus defers any latency associated with writing the cast out victim to system memory while maximizing utilization of available storage with acceptable tradeoffs in data access latency.

    Cache coherency protocol having hovering (H) and recent (R) states
    48.
    Granted invention patent
    Cache coherency protocol having hovering (H) and recent (R) states (lapsed)

    Publication number: US06292872B1

    Publication date: 2001-09-18

    Application number: US09024609

    Filing date: 1998-02-17

    IPC classification: G06F12/00

    CPC classification: G06F12/0833

    Abstract: A cache and method of maintaining cache coherency in a data processing system are described. The data processing system includes a plurality of processors and a plurality of caches coupled to an interconnect. According to the method, a first data item is stored in a first of the caches in association with an address tag indicating an address of the first data item. A coherency indicator in the first cache is set to a first state that indicates that the tag is valid and that the first data item is invalid. Thereafter, the interconnect is snooped to detect a data transfer initiated by another of the plurality of caches, where the data transfer is associated with the address indicated by the address tag and contains a valid second data item. In response to detection of such a data transfer while the coherency indicator is set to the first state, the first data item is replaced by storing the second data item in the first cache in association with the address tag. In addition, the coherency indicator is updated to a second state indicating that the second data item is valid and that the first cache can supply said second data item in response to a request.
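
    Illustration (not part of the patent record): a minimal Python sketch of the state transition in the abstract. Mapping the abstract's "first state" to hovering (H) and its "second state" to recent (R) is an assumption based on the title; the class and method names are hypothetical.

```python
from enum import Enum


class State(Enum):
    HOVERING = "H"  # address tag valid, data invalid
    RECENT = "R"    # data valid; this cache may supply it on request


class CacheLine:
    """One cache line that starts with a valid tag and invalid data."""

    def __init__(self, tag):
        self.tag = tag
        self.data = None
        self.state = State.HOVERING

    def snoop_transfer(self, address, data):
        """Observe one data transfer on the interconnect.

        If the line is hovering and the transfer carries valid data for
        the line's address, capture the data and move to the recent state.
        Returns True when the line was refreshed.
        """
        if self.state is State.HOVERING and address == self.tag and data is not None:
            self.data = data
            self.state = State.RECENT
            return True
        return False


line = CacheLine(tag=0x40)
missed = line.snoop_transfer(0x80, b"other")  # different address: ignored
caught = line.snoop_transfer(0x40, b"new")    # matching address: data captured
print(missed, caught, line.state.value)       # prints False True R
```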

    Demand based sync bus operation
    49.
    Granted invention patent
    Demand based sync bus operation (lapsed)

    Publication number: US06175930B1

    Publication date: 2001-01-16

    Application number: US09024586

    Filing date: 1998-02-17

    IPC classification: H02H3/05

    CPC classification: G06F12/0831

    Abstract: A register associated with the architected logic queue of a memory-coherent device within a multiprocessor system contains a flag set whenever an architected operation (one which might affect the storage hierarchy as perceived by other devices within the system) is posted in the snoop queue of a remote snooping device. The flag remains set and is reset only when a synchronization instruction (such as the “sync” instruction supported by the PowerPC™ family of devices) is received from a local processor. The state of the flag thus provides historical information regarding architected operations which may be pending in other devices within the system after being snooped from the system bus. This historical information is utilized to determine whether a synchronization operation should be presented on the system bus, allowing unnecessary synchronization operations to be filtered and additional system bus cycles made available for other purposes. When a local processor issues a synchronization instruction to the device managing the architected logic queue, the instruction is generally accepted when the architected logic queue is empty. Otherwise the architected operation is retried back to the local processor until the architected logic queue becomes empty. If the flag is set when the synchronization instruction is accepted from the local processor, it is presented on the system bus. If the flag is not set when the synchronization instruction is received from the local processor, the synchronization operation is unnecessary and is not presented on the system bus.
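
    Illustration (not part of the patent record): a short Python sketch of the flag-based filtering in the abstract. The class `SyncFilter`, its method names, and the string results are hypothetical, and the architected logic queue is reduced to a single `queue_empty` argument.

```python
class SyncFilter:
    """Toy model of the flag that decides whether a sync reaches the bus."""

    def __init__(self):
        self.flag = False   # set when an architected op was posted remotely
        self.bus_syncs = 0  # number of syncs actually presented on the bus

    def architected_op_snooped_remotely(self):
        """An architected operation was posted in a remote snoop queue."""
        self.flag = True

    def sync_from_local_processor(self, queue_empty=True):
        """Handle a sync instruction; returns 'retry', 'bus' or 'filtered'."""
        if not queue_empty:
            return "retry"       # not accepted until the queue drains
        if self.flag:
            self.flag = False    # flag is reset by the accepted sync
            self.bus_syncs += 1
            return "bus"         # sync must be presented on the system bus
        return "filtered"        # nothing pending remotely: no bus cycle used


f = SyncFilter()
r1 = f.sync_from_local_processor()                   # no remote ops yet
f.architected_op_snooped_remotely()
r2 = f.sync_from_local_processor(queue_empty=False)  # queue still busy
r3 = f.sync_from_local_processor()                   # flag set: goes to the bus
r4 = f.sync_from_local_processor()                   # flag was reset
print(r1, r2, r3, r4)  # prints filtered retry bus filtered
```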

    Method and system for increasing system memory bandwidth within a symmetric multiprocessor data-processing system
    50.
    Granted invention patent
    Method and system for increasing system memory bandwidth within a symmetric multiprocessor data-processing system (lapsed)

    Publication number: US6094710A

    Publication date: 2000-07-25

    Application number: US992786

    Filing date: 1997-12-17

    IPC classification: G06F12/08 G06F12/00

    CPC classification: G06F12/0813

    Abstract: A method and system for increasing system memory bandwidth within a symmetric multiprocessor data-processing system are disclosed. The symmetric multiprocessor data-processing system includes several processing units. With conventional systems, all these processing units are typically coupled to a system memory via an interconnect. In order to increase the bandwidth of the system memory, the system memory is first divided into multiple partial system memories, wherein an aggregate of contents within all of these partial system memories equals the contents of the system memory. Then, each of the processing units is individually associated with one of the partial system memories, such that the bandwidth of the system memory within the symmetric multiprocessor data-processing system is increased.
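
    Illustration (not part of the patent record): a brief Python sketch of dividing one system memory into partial memories, one per processing unit, whose aggregate equals the original contents. Splitting into contiguous, equally sized ranges is an assumption for this example; the abstract does not say how the division is made. The function names are hypothetical.

```python
def partition(memory, n_units):
    """Split a flat memory (a list of words) into `n_units` contiguous
    partial memories. Returns (partial_memories, words_per_partition)."""
    size = -(-len(memory) // n_units)  # ceiling division
    parts = [memory[i * size:(i + 1) * size] for i in range(n_units)]
    return parts, size


def locate(address, size):
    """Map a global address to (partial memory index, local offset)."""
    return divmod(address, size)


memory = list(range(10))            # ten words of system memory
parts, size = partition(memory, 4)  # one partial memory per processing unit
unit, offset = locate(7, size)
print(parts)                        # prints [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
print(unit, offset, parts[unit][offset])  # prints 2 1 7
```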
