Super-coherent data mechanisms for shared caches in a multiprocessing system
    61.
    Granted patent (In force)

    Publication No.: US06658539B2

    Publication Date: 2003-12-02

    Application No.: US09978353

    Filing Date: 2001-10-16

    IPC Class: G06F12/00

    CPC Class: G06F12/0831 G06F12/084

    Abstract: A method for improving the performance of a multiprocessor data processing system having processor groups with shared caches. When a processor within a processor group that shares a cache snoops a modification to a shared cache line in the cache of another processor outside the group, the coherency state of the shared cache line within the first cache is set to a first coherency state indicating that the line has been modified by a processor outside the group and has not yet been updated within the group's cache. When a request for the cache line is later issued by a processor, the request is issued to the system bus or interconnect. If the response to the request indicates that the processor should utilize super-coherent data, the coherency state of the cache line is set to a processor-specific super-coherency state. This state indicates that subsequent requests for the cache line by the first processor should be satisfied with the super-coherent data, while a subsequent request by another processor in the group that has not yet issued a request on the system bus may still go to the system bus to request the line. The processor-specific super-coherency states are set individually but are usually changed to another coherency state (e.g., Modified or Invalid) as a group.
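
    The per-processor flag discipline described in this abstract can be sketched as a small state machine. This is a minimal, hypothetical Python sketch; the state names, method names, and the simplified request handling are illustrative assumptions, not the patent's actual protocol tables:

```python
from enum import Enum, auto

class CoherencyState(Enum):
    SHARED = auto()             # line valid and shared within the group's cache
    MODIFIED_EXTERNAL = auto()  # modified outside the group; not yet updated here

class SharedCacheLine:
    """Tracks one shared cache line plus a per-processor super-coherent flag."""
    def __init__(self, group_size):
        self.state = CoherencyState.SHARED
        # Super-coherent flags are set per processor, cleared as a group.
        self.super_coherent = [False] * group_size

    def snoop_external_modification(self):
        # A processor outside the group modified the line.
        self.state = CoherencyState.MODIFIED_EXTERNAL

    def request(self, proc_id, response_says_use_super_coherent):
        """A group processor requests the line; the request goes to the system
        bus unless this processor already holds super-coherent state."""
        if self.super_coherent[proc_id]:
            return "use local super-coherent data"
        if response_says_use_super_coherent:
            self.super_coherent[proc_id] = True   # set individually
            return "use local super-coherent data"
        return "issued on system bus"

    def group_transition(self, new_state):
        # e.g., to Modified or Invalid: flags are cleared together.
        self.state = new_state
        self.super_coherent = [False] * len(self.super_coherent)
```

    Note how a second processor in the group still reaches the system bus even after the first processor has gone super-coherent, matching the per-processor behavior the abstract describes.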


Multiprocessor speculation mechanism for efficiently managing multiple barrier operations
    62.
    Granted patent (In force)

    Publication No.: US06625660B1

    Publication Date: 2003-09-23

    Application No.: US09588605

    Filing Date: 2000-06-06

    IPC Class: G06F15/16

    Abstract: Disclosed is a method of operation within a processor that permits load instructions to be issued speculatively. An instruction sequence is received that includes multiple barrier instructions followed by a load instruction. In response to the barrier instructions, barrier operations are issued on an interconnect coupled to the processor. While the barrier operations are pending, a load request associated with the load instruction is issued speculatively, and a flag is set to indicate the speculative issue. The flag is reset when acknowledgments of all the barrier operations are received. Data returned before the acknowledgments arrive is temporarily held, and is forwarded to the processor's register and/or execution unit only after the acknowledgments are received. If a snoop invalidate is detected for the speculatively issued load request before the barrier operations complete, the data is discarded and the load request is re-issued.
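
    The flag-and-hold discipline in this abstract can be sketched in a few lines. A hypothetical Python sketch (class and method names are illustrative assumptions; real hardware tracks this per load-queue entry):

```python
class SpeculativeLoadUnit:
    """Sketch: a load issued past pending barriers is flagged, and its data
    is held until every barrier acknowledgment arrives."""
    def __init__(self):
        self.pending_barriers = 0
        self.speculative = False   # set when the load issues before barrier acks
        self.held_data = None      # data returned early is held, not forwarded

    def issue_barrier(self):
        self.pending_barriers += 1

    def issue_load(self):
        # The load may issue while barrier operations are still pending.
        self.speculative = self.pending_barriers > 0

    def data_returned(self, data):
        if self.speculative:
            self.held_data = data  # hold until all barrier acks arrive
            return None
        return data                # non-speculative: forward immediately

    def barrier_ack(self):
        self.pending_barriers -= 1
        if self.pending_barriers == 0:
            self.speculative = False           # flag reset on the last ack
            data, self.held_data = self.held_data, None
            return data                        # now safe to forward held data
        return None

    def snoop_invalidate(self):
        # Invalidate hits the speculative load: discard data, re-issue the load.
        if self.speculative:
            self.held_data = None
            return "re-issue load"
        return None
```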


Method and apparatus for executing singly-initiated, singly-sourced variable delay system bus operations of differing character
    63.
    Granted patent (Expired)

    Publication No.: US06178485B1

    Publication Date: 2001-01-23

    Application No.: US09114187

    Filing Date: 1998-07-13

    IPC Class: G06F12/00

    CPC Class: G06F12/0831

    Abstract: The present invention is a method and apparatus for preventing deadlocks arising from the execution of singly-initiated, singly-sourced variable delay system bus operations. In general, each snooper accepts a given operation at the same time according to an agreed-upon condition. In other words, the snooper in a given cache can accept an operation and begin working on it even while retrying the operation. Furthermore, no active snooper releases an operation until all active snoopers are done with it. In other words, execution of a given operation is started by the snoopers at the same time and finished by each of them at the same time. This prevents the ping-pong deadlock by keeping any one cache from finishing the operation before the others.
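
    The accept-together/release-together rule can be illustrated with a toy model. A hypothetical Python sketch (names and return strings are illustrative; real snoopers coordinate through bus response signaling, not shared memory):

```python
class SnooperGroup:
    """Sketch: snoopers accept an operation simultaneously and no one
    releases it until all are done, avoiding the ping-pong deadlock."""
    def __init__(self, n):
        self.working = [False] * n   # accepted and working (possibly while retrying)
        self.done = [False] * n

    def accept_all(self):
        # Agreed-upon condition reached: every snooper accepts at once.
        self.working = [True] * len(self.working)
        self.done = [False] * len(self.done)

    def finish(self, i):
        self.done[i] = True
        # No snooper releases the operation until ALL are done.
        if all(self.done):
            self.working = [False] * len(self.working)
            return "released"
        return "held"
```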


Latch-and-hold circuit that permits subcircuits of an integrated circuit to operate at different frequencies
    64.
    Granted patent (Expired)

    Publication No.: US6161189A

    Publication Date: 2000-12-12

    Application No.: US992132

    Filing Date: 1997-12-17

    IPC Class: G06F1/04 G06F13/42 H04L12/00

    CPC Class: G06F13/423 G06F1/12

    Abstract: An integrated circuit comprises a semiconductor substrate having integrated circuitry formed therein. According to the present invention, the integrated circuitry includes a plurality of subcircuits, including first and second subcircuits that concurrently operate at distinct first and second frequencies, respectively. According to one embodiment, the integrated circuit has a clock signal that alternates between an active state and an inactive state at a third frequency and is broadcast to all of the subcircuits. In this embodiment, at least one subcircuit among the plurality of subcircuits, for example a processor, operates in response to the clock signal at the third frequency, which is higher than the first frequency. According to another embodiment, the subcircuits each communicate with at least one other subcircuit via a latch-to-latch interface.


Dynamic folding of cache operations for multiple coherency-size systems
    65.
    Granted patent (Expired)

    Publication No.: US6105112A

    Publication Date: 2000-08-15

    Application No.: US834120

    Filing Date: 1997-04-14

    IPC Class: G06F9/30 G06F12/08 G06F12/00

    CPC Class: G06F9/30047 G06F12/0831

    Abstract: A method is disclosed of managing architectural operations in a computer system whose architecture includes components having varying coherency granule sizes. A queue is provided for receiving a plurality of the architectural operations as entries; the entries of the queue are compared with each new architectural operation to determine whether the new operation is redundant with any of them. If the new architectural operation is not redundant with any entry, it is loaded into the queue. The computer system may include a cache having a processor granularity size and a larger system bus granularity size, and the architectural operations are cache instructions. The comparison may be performed associatively based on the varying coherency granule sizes.
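
    The redundancy check ("folding") amounts to comparing granule-aligned addresses. A minimal, hypothetical Python sketch (class name, operation strings, and the single fixed granule size are illustrative assumptions):

```python
class ArchitecturalOpQueue:
    """Sketch of dynamic folding: a new cache operation is dropped when an
    already-queued entry covers it at the relevant coherency granule size."""
    def __init__(self, granule_size):
        self.granule_size = granule_size   # e.g., the system-bus coherency granule
        self.entries = []                  # queued (op, granule-aligned address) pairs

    def _granule(self, address):
        # Align the address down to its coherency granule boundary.
        return address - (address % self.granule_size)

    def enqueue(self, op, address):
        key = (op, self._granule(address))
        if key in self.entries:            # redundant with an existing entry: fold
            return "folded"
        self.entries.append(key)
        return "queued"
```

    For example, two flushes that land in the same 128-byte granule fold into one queue entry, while a flush to the next granule is queued separately.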


Demand based sync bus operation
    66.
    Granted patent (Expired)

    Publication No.: US6065086A

    Publication Date: 2000-05-16

    Application No.: US24615

    Filing Date: 1998-02-17

    CPC Class: G06F13/4243

    Abstract: A register associated with the architected logic queue of a memory-coherent device within a multiprocessor system contains a flag that is set whenever an architected operation enters the initiating device's architected logic queue to be issued on the system bus. The flag remains set even after the architected logic queue is drained, and is reset only when a synchronization instruction is received from a local processor, providing historical information regarding architected operations that may be pending in other devices. This historical information is used to determine whether a synchronization operation should be presented on the system bus, allowing unnecessary synchronization operations to be filtered out. When a local processor issues a synchronization instruction to the device managing the architected logic queue, the instruction is generally accepted when the queue is empty; otherwise the synchronization instruction is retried back to the local processor until the queue becomes empty. If the flag is set when the synchronization instruction is accepted from the local processor, the synchronization operation is presented on the system bus. If the flag is not set, the synchronization operation is unnecessary and is not presented on the system bus.
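
    The filter logic reduces to one sticky flag plus a queue-empty check. A hypothetical Python sketch (names and return strings are illustrative assumptions):

```python
class SyncFilter:
    """Sketch of the historical flag that filters unnecessary sync bus operations."""
    def __init__(self):
        self.queue = []        # architected logic queue
        self.flag = False      # set when any architected op has entered the queue

    def enqueue_architected_op(self, op):
        self.queue.append(op)
        self.flag = True       # remembered even after the queue drains

    def drain(self):
        self.queue.clear()     # the flag deliberately stays set

    def sync(self):
        if self.queue:         # queue not empty: retry the sync to the processor
            return "retry"
        if self.flag:
            self.flag = False  # reset only when the sync is accepted
            return "issue sync on system bus"
        return "filtered"      # no history: the sync is unnecessary
```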


Method and system for controlling access to a shared resource in a data processing system utilizing dynamically-determined weighted pseudo-random priorities
    67.
    Granted patent (Expired)

    Publication No.: US5896539A

    Publication Date: 1999-04-20

    Application No.: US839438

    Filing Date: 1997-04-14

    CPC Class: G06F13/364

    Abstract: A method and system for controlling access to a shared resource in a data processing system are described. According to the method, a number of requests for access to the resource are generated by a number of requestors that share the resource. Each requestor is dynamically associated with a priority weight in response to events in the data processing system. The priority weight indicates the probability that the associated requestor will be assigned the highest current priority. Each requestor is then assigned a current priority that is determined substantially randomly with respect to the requestors' previous priorities. In response to the current priorities of the requestors, a request for access to the resource is granted.
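
    The arbitration scheme maps naturally onto a weighted random draw. A minimal, hypothetical Python sketch (the class, its seed handling, and the `adjust` hook are illustrative assumptions, not the patent's hardware arbiter):

```python
import random

class WeightedArbiter:
    """Sketch of weighted pseudo-random arbitration for a shared resource.
    Weights are adjusted dynamically in response to system events."""
    def __init__(self, weights, seed=0):
        self.weights = dict(weights)   # requestor -> priority weight
        self.rng = random.Random(seed) # pseudo-random source

    def adjust(self, requestor, weight):
        # e.g., raise a requestor's weight after it loses repeatedly
        self.weights[requestor] = weight

    def grant(self, requesting):
        """Among the currently requesting requestors, draw the winner with
        probability proportional to its weight, independent of past grants."""
        w = [self.weights[r] for r in requesting]
        return self.rng.choices(requesting, weights=w, k=1)[0]
```

    The key property is that each grant is drawn independently of previous priorities, so a heavily weighted requestor wins often but no requestor is ever starved outright.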


Integrated purge store mechanism to flush L2/L3 cache structure for improved reliability and serviceability
    68.
    Granted patent (In force)

    Publication No.: US07055002B2

    Publication Date: 2006-05-30

    Application No.: US10424486

    Filing Date: 2003-04-25

    IPC Class: G06F13/00

    CPC Class: G06F12/0804 G06F12/0897

    Abstract: A method of reducing errors in a cache memory of a computer system (e.g., an L2 cache) by periodically issuing a series of purge commands to the L2 cache, sequentially flushing cache lines from the L2 cache to an L3 cache in response to the purge commands, and correcting single-bit errors in the cache lines as they are flushed to the L3 cache. Purge commands are issued only when the processor cores associated with the L2 cache have an idle cycle available in a store pipe to the cache. The flush rate of the purge commands can be set programmably, and the purge mechanism can be implemented either in software running on the computer system or in hardware integrated with the L2 cache. In the software case, the purge mechanism can be incorporated into the operating system. In the hardware case, a purge engine can be provided that advantageously utilizes the store pipe between the L1 and L2 caches. The L2 cache can be forced to victimize cache lines by setting the tag bits for those lines to a value that misses in the L2 cache (e.g., cache-inhibited space). With the cache's eviction mechanism placed in a direct-mapped mode, the address misses result in eviction of the cache lines, thereby flushing them to the L3 cache.
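
    The scrub loop can be modeled as a cursor that advances only on idle store-pipe cycles. A hypothetical Python sketch (class name, the boolean error markers, and the string-based "correction" are illustrative stand-ins for real ECC hardware):

```python
class PurgeEngine:
    """Sketch of the periodic purge scrub: flush L2 lines to L3 during idle
    store-pipe cycles, correcting single-bit errors on the way out."""
    def __init__(self, l2_lines, l3):
        self.l2 = list(l2_lines)   # (data, has_single_bit_error) pairs
        self.l3 = l3               # destination cache (here, a plain list)
        self.cursor = 0            # sequential flush position

    def tick(self, store_pipe_idle):
        # Purge commands issue only when the store pipe has an idle cycle.
        if not store_pipe_idle or self.cursor >= len(self.l2):
            return False
        data, bad = self.l2[self.cursor]
        if bad:
            data = f"corrected({data})"   # ECC fixes single-bit errors on flush
        self.l3.append(data)
        self.cursor += 1
        return True
```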


Chained cache coherency states for sequential non-homogeneous access to a cache line with outstanding data response
    69.
    Granted patent (In force)

    Publication No.: US07409504B2

    Publication Date: 2008-08-05

    Application No.: US11245312

    Filing Date: 2005-10-06

    IPC Class: G06F12/00

    CPC Class: G06F12/0831

    Abstract: A method for sequentially coupling successive processor requests for a cache line before the data is received in the cache of the first coupled processor. Both homogeneous and non-homogeneous operations are chained to each other, and the coherency protocol includes several new intermediate coherency responses associated with the chained states. Chained coherency states are assigned to track the chain of processor requests and the grant of access permission prior to receipt of the data at the first processor. The chained coherency states also identify the receiving processor. When the data is received at the cache of the first processor in the chain, that processor completes its operation on (or with) the data and then forwards the data to the next processor in the chain. The chained coherency protocol frees up address bus bandwidth by reducing the number of retries.
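
    The chaining idea, as described in this and the following abstract, can be modeled as a queue of granted requestors attached to the line while its data is outstanding. A hypothetical Python sketch (names are illustrative; real chained states live in the cache directory, not a software queue):

```python
from collections import deque

class ChainedLine:
    """Sketch of chained coherency: later requestors are linked to the line
    before the data arrives, instead of being retried on the address bus."""
    def __init__(self, first_requestor):
        self.chain = deque([first_requestor])  # requests granted in order

    def snoop_request(self, requestor):
        # Instead of issuing a retry, record who receives the line next
        # (the chained state remembers the receiving processor's identity).
        self.chain.append(requestor)
        return "chained"

    def data_arrives(self):
        """Deliver the data along the chain: each processor completes its
        operation, then forwards the line to the next processor."""
        order = []
        while self.chain:
            p = self.chain.popleft()
            order.append(p)   # p operates on the data, then forwards it
        return order
```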


Chained cache coherency states for sequential homogeneous access to a cache line with outstanding data response
    70.
    Granted patent (Expired)

    Publication No.: US07370155B2

    Publication Date: 2008-05-06

    Application No.: US11245313

    Filing Date: 2005-10-06

    IPC Class: G06F12/00

    CPC Class: G06F12/0831 G06F12/0822

    Abstract: A method and data processing system for sequentially coupling successive, homogeneous processor requests for a cache line in a chain before the data is received in the cache of the first processor in the chain. Chained intermediate coherency states are assigned to track the chain of processor requests and the subsequent access permission provided, prior to receipt of the data at the first processor starting the chain. The assigned chained intermediate coherency state identifies the processor operation, and a directional identifier identifies the processor to which the cache line is to be forwarded. When the data is received at the cache of the first processor in the chain, the first processor completes its operation on (or with) the data and then forwards the data to the next processor in the chain. The chain is immediately stopped when a non-homogeneous operation is snooped by the last-in-chain processor.

    摘要翻译: 一种方法和数据处理系统,用于在数据在链中的第一处理器的高速缓存中接收之前,将链接中的高速缓存行的连续的均匀处理器请求顺序耦合。 分配链接的中间一致性状态,以便在启动链路的第一个处理器接收到数据之前跟踪处理器请求链和后续访问权限。 所分配的链接中间一致性状态标识处理器操作,并且方向标识符标识要向其转发高速缓存行的处理器。 当在链中的第一处理器的高速缓存处接收数据时,第一处理器完成其数据处理(或与数据)的操作,然后将数据转发到链中的下一个处理器。 当最后一个链接处理器窥探非均匀操作时,链条立即停止。