Data processing system and method for resolving a conflict between requests to modify a shared cache line
    31.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US06763434B2

    Publication date: 2004-07-13

    Application No.: US09752947

    Filing date: 2000-12-30

    IPC classification: G06F12/00

    CPC classification: G06F12/0831

    Abstract: Disclosed herein are a data processing system and method of operating a data processing system that arbitrate between conflicting requests to modify data cached in a shared state and that protect ownership of the cache line granted during such arbitration until modification of the data is complete. The data processing system includes a plurality of agents coupled to an interconnect that supports pipelined transactions. While data associated with a target address are cached at a first agent among the plurality of agents in a shared state, the first agent issues a transaction on the interconnect. In response to snooping the transaction, a second agent provides a snoop response indicating that the second agent has a pending conflicting request and a coherency decision point provides a snoop response granting the first agent ownership of the data. In response to the snoop responses, the first agent is provided with a combined response representing a collective response to the transaction of all of the agents that grants the first agent ownership of the data. In response to the combined response, the first agent is permitted to modify the data.
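
    The arbitration described in this abstract can be illustrated with a minimal sketch. The response names, the `combined_response` function, and the retry fallback are hypothetical and not taken from the patent; the only behavior drawn from the abstract is that a grant from the coherency decision point yields a combined response granting ownership even when another agent reports a pending conflicting request.

```python
from enum import Enum


class Snoop(Enum):
    """Hypothetical snoop response types (names are assumptions)."""
    NULL = 0              # agent has no stake in the cache line
    PENDING_CONFLICT = 1  # agent has its own pending request to modify the line
    GRANT_OWNERSHIP = 2   # coherency decision point grants the requester ownership


def combined_response(snoop_responses):
    """Collapse the snoop responses of all agents into one combined response.

    Drawn from the abstract: a grant from the decision point wins over a
    reported conflict. Assumption: without a grant, a conflict forces a retry.
    """
    if Snoop.GRANT_OWNERSHIP in snoop_responses:
        return "OWNERSHIP_GRANTED"
    if Snoop.PENDING_CONFLICT in snoop_responses:
        return "RETRY"
    return "NULL"


def may_modify(snoop_responses):
    """The requester may modify the line only after ownership is granted."""
    return combined_response(snoop_responses) == "OWNERSHIP_GRANTED"
```

    In this sketch the first agent would call `may_modify` with the responses gathered for its transaction and modify the shared line only when the result is true.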

    Multiprocessor speculation mechanism via a barrier speculation flag
    32.
    Granted invention patent
    Legal status: in force

    Publication No.: US06691220B1

    Publication date: 2004-02-10

    Application No.: US09588608

    Filing date: 2000-06-06

    IPC classification: G06F9/00

    Abstract: A method of operation within a processor that permits load instructions following barrier instructions in an instruction sequence to be issued speculatively. The barrier instruction is executed and while the barrier operation is pending, a load request associated with the load instruction is speculatively issued. A speculation flag is set to indicate the load instruction was speculatively issued. The flag is reset when an acknowledgment of the barrier operation is received. Data that is returned before the acknowledgment is received is temporarily held, and the data is forwarded to the register and/or execution unit of the processor only after the acknowledgment is received. If a snoop invalidate is detected for the speculatively issued load request before the barrier operation completes, the data is discarded and the load request is re-issued.
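
    The flag handling in this abstract can be sketched as a small state machine. The class `SpeculativeLoad`, its method names, and the dictionary standing in for memory are hypothetical; the sketch models only the ordering rules stated in the abstract (hold early data, forward after the acknowledgment, discard and re-issue on a snoop invalidate), not any real pipeline.

```python
class SpeculativeLoad:
    """Toy model of one load issued speculatively behind a pending barrier."""

    def __init__(self, memory):
        self.memory = memory      # dict address -> value, stands in for the memory system
        self.spec_flag = False    # set while the load is speculative
        self.pending = None       # (register, address) of the speculative load
        self.held_data = None     # data returned before the barrier acknowledgment
        self.registers = {}       # architected registers
        self.reissues = 0         # number of times the load was re-issued

    def issue(self, reg, addr):
        """Issue the load while the barrier operation is still pending."""
        self.pending = (reg, addr)
        self.spec_flag = True
        self.held_data = self.memory[addr]  # data arrives early and is only held

    def snoop_invalidate(self, addr):
        """A snooped invalidate hits the speculatively loaded address."""
        if self.spec_flag and self.pending[1] == addr:
            self.reissues += 1
            self.held_data = self.memory[addr]  # discard old data, re-issue the load

    def barrier_ack(self):
        """Barrier acknowledged: reset the flag and forward the held data."""
        self.spec_flag = False
        if self.pending is not None:
            reg, _ = self.pending
            self.registers[reg] = self.held_data
            self.pending = None
            self.held_data = None
```

    A caller would invoke `issue`, optionally `snoop_invalidate`, and finally `barrier_ack`; the register is written only in the last step.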

    Multiprocessor speculation mechanism for efficiently managing multiple barrier operations
    33.
    Granted invention patent
    Legal status: in force

    Publication No.: US06625660B1

    Publication date: 2003-09-23

    Application No.: US09588605

    Filing date: 2000-06-06

    IPC classification: G06F15/16

    Abstract: Disclosed is a method of operation within a processor that permits load instructions to be issued speculatively. An instruction sequence is received that includes multiple barrier instructions and a load instruction that follows the barrier instructions in the instruction sequence. In response to the multiple barrier instructions, barrier operations are issued on an interconnect coupled to the processor. Also, while the barrier operations are pending, a load request associated with the load instruction is speculatively issued. When the load request is issued, a flag is set to indicate that it was speculatively issued. The flag is reset when acknowledgments of all the barrier operations are received. Data that is returned before the acknowledgments are received is temporarily held and forwarded to the register and/or execution unit of the processor only after the acknowledgments are received. If a snoop invalidate is detected for the speculatively issued load request before completion of the barrier operations, the data is discarded and the load request is re-issued.

    Mechanism for collapsing store misses in an SMP computer system

    Publication No.: US06615321B2

    Publication date: 2003-09-02

    Application No.: US09782581

    Filing date: 2001-02-12

    IPC classification: G06F12/08

    CPC classification: G06F12/0855 G06F12/0831

    Abstract: A method of handling a write operation in a multiprocessor computer system wherein each processing unit has a respective cache, by determining that a new value for a store instruction is the same as a current value already contained in the memory hierarchy, and discarding the store instruction without issuing any associated cache operation in response to this determination. When a store hit occurs, the current value is retrieved from the local cache. When a store miss occurs, the current value is retrieved from a remote cache by issuing a read request. The comparison may be performed using a portion of the cache line which is less than a granule size of the cache line. A store gathering queue can be used to collect pending store instructions that are directed to different portions of the same cache line.
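
    The store-collapsing check described in this abstract can be sketched as follows. The function `collapse_store` and the two dictionaries standing in for the local and remote caches are hypothetical; the sketch compares whole values and does not model the sub-granule comparison or the store gathering queue.

```python
def collapse_store(local_cache, remote_cache, addr, new_value):
    """Perform a store unless it would write the value already present.

    Returns True if the store was performed, False if it was discarded.
    local_cache and remote_cache are plain dicts (address -> value) used
    as stand-ins for real caches; this is an illustrative assumption.
    """
    if addr in local_cache:
        current = local_cache[addr]        # store hit: read the local copy
    else:
        current = remote_cache.get(addr)   # store miss: read request to a remote cache

    if current is not None and current == new_value:
        return False                       # same value: discard, no cache operation

    local_cache[addr] = new_value          # different value: the store proceeds
    return True
```

    A store that repeats the current value leaves both dictionaries untouched, which corresponds to discarding the store instruction without an associated cache operation.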

    Store collapsing mechanism for SMP computer system

    Publication No.: US06615320B2

    Publication date: 2003-09-02

    Application No.: US09782580

    Filing date: 2001-02-12

    IPC classification: G06F12/08

    CPC classification: G06F12/0815 G06F12/0831

    Abstract: A method of handling a write operation in a multiprocessor computer system wherein each processing unit has a respective cache, by determining that a new value for a store instruction is the same as a current value already contained in the memory hierarchy, and discarding the store instruction without issuing any associated cache operation in response to this determination. When a store hit occurs, the current value is retrieved from the local cache. When a store miss occurs, the current value is retrieved from a remote cache by issuing a read request. The comparison may be performed using a portion of the cache line which is less than a granule size of the cache line. A store gathering queue can be used to collect pending store instructions that are directed to different portions of the same cache line.

    Efficient instruction cache coherency maintenance mechanism for scalable multiprocessor computer system with write-back data cache

    Publication No.: US06574714B2

    Publication date: 2003-06-03

    Application No.: US09782579

    Filing date: 2001-02-12

    IPC classification: G06F12/08

    Abstract: A method of maintaining coherency in a cache hierarchy of a processing unit of a computer system, wherein the upper level (L1) cache includes a split instruction/data cache. In one implementation, the L1 data cache is store-through, and each processing unit has a lower level (L2) cache. When the lower level cache receives a cache operation requiring invalidation of a program instruction in the L1 instruction cache (i.e., a store operation or a snooped kill), the L2 cache sends an invalidation transaction (e.g., icbi) to the instruction cache. The L2 cache is fully inclusive of both instructions and data. In another implementation, the L1 data cache is write-back, and a store address queue in the processor core is used to continually propagate pipelined address sequences to the lower levels of the memory hierarchy, i.e., to an L2 cache or, if there is no L2 cache, then to the system bus. If there is no L2 cache, then the cache operations may be snooped directly against the L1 instruction cache.

    Cache index based system address bus
    37.
    Granted invention patent
    Legal status: in force

    Publication No.: US06477613B1

    Publication date: 2002-11-05

    Application No.: US09345302

    Filing date: 1999-06-30

    IPC classification: G06F12/00

    Abstract: Following a cache miss by an operation, the address for the operation is transmitted on the bus coupling the cache to lower levels of the storage hierarchy. A portion of the address including the index field is transmitted during a first bus cycle, and may be employed to begin directory lookups in lower level storage devices before the address tag is received. The remainder of the address is transmitted during subsequent bus cycles, which should be in time for address tag comparisons with the congruence class elements. To allow multiple directory lookups to occur concurrently in a pipelined directory, a portion of multiple addresses for several data access operations, each portion including the index field for the respective address, may be transmitted during the first bus cycle or staged in consecutive bus cycles, with the remainders of each address, including the cache tags, transmitted during the subsequent bus cycles. This allows directory lookups utilizing the index fields to be processed concurrently within a lower level storage device for multiple operations, with the address tags being provided later, but still timely for tag comparisons at the end of the directory lookup. Where the lower level storage device operates at a higher frequency than the bus, overall latency is reduced and directory bandwidth is more efficiently utilized.
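
    The index-first transfer in this abstract can be illustrated by splitting an address into its index and tag fields and using them in two steps. The field widths (`OFFSET_BITS`, `INDEX_BITS`), the function names, and the dictionary used as a directory are assumptions chosen for illustration; the patent abstract does not specify them, and the sketch does not model bus cycles or pipelining.

```python
OFFSET_BITS = 6    # assumed 64-byte cache line
INDEX_BITS = 10    # assumed 1024 congruence classes


def split_address(addr):
    """Split an address into (index, tag); the line offset is dropped."""
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return index, tag


def lookup(directory, addr):
    """Two-step directory lookup mirroring the index-first transfer.

    directory maps an index to the list of tags in that congruence class.
    """
    index, tag = split_address(addr)
    # Step 1 (first bus cycle): only the index is needed to read the class.
    congruence_class = directory.get(index, [])
    # Step 2 (subsequent bus cycles): the tag arrives in time for the compare.
    return tag in congruence_class
```

    For example, with `directory = {3: [5, 9]}`, an address whose index is 3 and tag is 5 hits, while the same index with tag 7 misses.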

    Method and apparatus for efficiently managing caches with non-power-of-two congruence classes
    38.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US06434670B1

    Publication date: 2002-08-13

    Application No.: US09435948

    Filing date: 1999-11-09

    IPC classification: G06F12/00

    CPC classification: G06F12/0864 G06F12/123

    Abstract: A method and apparatus for efficiently managing caches with non-power-of-two congruence classes allows for increasing the number of congruence classes in a cache when not enough area is available to double the cache size. One or more congruence classes within the cache have their associative sets split so that a number of congruence classes are created with reduced associativity. The management method and apparatus allow access to the congruence classes without introducing any additional cycles of delay or complex logic.

    Layered local cache with imprecise reload mechanism
    39.
    Granted invention patent
    Legal status: in force

    Publication No.: US06434667B1

    Publication date: 2002-08-13

    Application No.: US09340075

    Filing date: 1999-06-25

    IPC classification: G06F12/00

    Abstract: A method of improving memory access for a computer system, by sending load requests to a lower level storage subsystem along with associated information pertaining to intended use of the requested information by the requesting processor, without using a high level load queue. Returning the requested information to the processor along with the associated use information allows the information to be placed immediately without using reload buffers. A register load bus separate from the cache load bus (and having a smaller granularity) is used to return the information. An upper level (L1) cache may then be imprecisely reloaded (the upper level cache can also be imprecisely reloaded with store instructions). The lower level (L2) cache can monitor L1 and L2 cache activity, which can be used to select a victim cache block in the L1 cache (based on the additional L2 information), or to select a victim cache block in the L2 cache (based on the additional L1 information). L2 control of the L1 directory also allows certain snoop requests to be resolved without waiting for L1 acknowledgement. The invention can be applied to, e.g., instruction, operand data and translation caches.

    Queue-less and state-less layered local data cache mechanism
    40.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US06418513B1

    Publication date: 2002-07-09

    Application No.: US09340077

    Filing date: 1999-06-25

    IPC classification: G06F12/00

    CPC classification: G06F12/0897

    Abstract: A method of improving memory access for a computer system, by sending load requests to a lower level storage subsystem along with associated information pertaining to intended use of the requested information by the requesting processor, without using a high level load queue. Returning the requested information to the processor along with the associated use information allows the information to be placed immediately without using reload buffers. A register load bus separate from the cache load bus (and having a smaller granularity) is used to return the information. An upper level (L1) cache may then be imprecisely reloaded (the upper level cache can also be imprecisely reloaded with store instructions). The lower level (L2) cache can monitor L1 and L2 cache activity, which can be used to select a victim cache block in the L1 cache (based on the additional L2 information), or to select a victim cache block in the L2 cache (based on the additional L1 information). L2 control of the L1 directory also allows certain snoop requests to be resolved without waiting for L1 acknowledgement. The invention can be applied to, e.g., instruction, operand data and translation caches.
