21. SELECTIVE CACHE-TO-CACHE LATERAL CASTOUTS
    Invention application (In force)

    Publication No.: US20110161589A1

    Publication date: 2011-06-30

    Application No.: US12650018

    Filing date: 2009-12-30

    IPC classes: G06F12/08 G06F12/00

    CPC classes: G06F12/0811 G06F12/12

    Abstract: A data processing system includes first and second processing units and a system memory. The first processing unit has first upper and first lower level caches, and the second processing unit has second upper and lower level caches. In response to a data request, a victim cache line to be castout from the first lower level cache is selected, and the first lower level cache selects between performing a lateral castout (LCO) of the victim cache line to the second lower level cache and a castout of the victim cache line to the system memory based upon a confidence indicator associated with the victim cache line. In response to selecting an LCO, the first processing unit issues an LCO command on the interconnect fabric and removes the victim cache line from the first lower level cache, and the second lower level cache holds the victim cache line.

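    The selection logic described in the abstract can be pictured with a small model. The following Python sketch is only an illustration, not the patented hardware: the LowerLevelCache class, the saturating-counter encoding of the confidence indicator, and the LCO_CONFIDENCE_THRESHOLD cutoff are assumptions introduced here.

```python
from dataclasses import dataclass, field

@dataclass
class CacheLine:
    tag: int
    data: bytes
    confidence: int = 0   # assumed per-line saturating counter (not from the patent)

@dataclass
class LowerLevelCache:
    lines: dict = field(default_factory=dict)   # tag -> CacheLine

    def install(self, line: "CacheLine") -> None:
        self.lines[line.tag] = line

    def evict(self, tag: int) -> "CacheLine":
        return self.lines.pop(tag)

LCO_CONFIDENCE_THRESHOLD = 2   # assumption: cutoff for likely reuse

def castout(victim: CacheLine, source: LowerLevelCache,
            peer: LowerLevelCache, system_memory: dict) -> str:
    """Select between a lateral castout (LCO) to a peer lower level cache
    and a castout to system memory, using the victim's confidence indicator."""
    source.evict(victim.tag)
    if victim.confidence >= LCO_CONFIDENCE_THRESHOLD:
        peer.install(victim)                      # peer lower level cache holds the line
        return "LCO"
    system_memory[victim.tag] = victim.data       # low confidence: write back to memory
    return "memory"

if __name__ == "__main__":
    l3_a, l3_b, mem = LowerLevelCache(), LowerLevelCache(), {}
    hot = CacheLine(tag=0x40, data=b"\x00" * 128, confidence=3)
    l3_a.install(hot)
    print(castout(hot, l3_a, l3_b, mem))          # prints "LCO"
```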

22. Data processing system, cache system and method for updating an invalid coherency state in response to snooping an operation
    Invention grant (Expired)

    Publication No.: US07451277B2

    Publication date: 2008-11-11

    Application No.: US11388017

    Filing date: 2006-03-23

    IPC classes: G06F12/00

    CPC classes: G06F12/0831 G06F2212/507

    Abstract: A cache coherent data processing system includes at least first and second coherency domains. In a first cache memory within the first coherency domain of the data processing system, a coherency state field associated with a storage location and an address tag is set to a first data-invalid coherency state that indicates that the address tag is valid and that the storage location does not contain valid data. In response to snooping an exclusive access operation that specifies a target address matching the address tag and that indicates the relative domain location of the requestor that initiated the exclusive access operation, the first cache memory updates the coherency state field from the first data-invalid coherency state to a second data-invalid coherency state. Based upon the relative location of the requestor, the second data-invalid coherency state indicates that the address tag is valid, that the storage location does not contain valid data, and whether a target memory block associated with the address tag is cached within the first coherency domain upon successful completion of the exclusive access operation.

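    As a rough picture of the snoop-driven update, the sketch below models a directory entry moving from one data-invalid state to another when an exclusive access is snooped. The state names (DATA_INVALID, DATA_INVALID_LOCAL, DATA_INVALID_GLOBAL) and the function signature are placeholders, not the states defined in the patent.

```python
from enum import Enum, auto

class CohState(Enum):
    # Placeholder names for the data-invalid states in the abstract: in each,
    # the address tag is valid but the storage location holds no valid data.
    DATA_INVALID = auto()          # first data-invalid state
    DATA_INVALID_LOCAL = auto()    # target block expected to be cached in this domain
    DATA_INVALID_GLOBAL = auto()   # target block expected to leave this domain

def snoop_exclusive_access(entry_state: CohState,
                           entry_tag: int,
                           snooped_tag: int,
                           requestor_in_same_domain: bool) -> CohState:
    """Update a data-invalid directory entry when an exclusive access
    operation is snooped for a matching address tag."""
    if entry_state is not CohState.DATA_INVALID or entry_tag != snooped_tag:
        return entry_state
    # The second data-invalid state additionally records whether the target
    # block is cached within this coherency domain once the access completes.
    if requestor_in_same_domain:
        return CohState.DATA_INVALID_LOCAL
    return CohState.DATA_INVALID_GLOBAL

if __name__ == "__main__":
    print(snoop_exclusive_access(CohState.DATA_INVALID, 0x80, 0x80, False))
```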

23. Method and apparatus for performing data prefetch in a multiprocessor system
    Invention grant (Expired)

    Publication No.: US08161245B2

    Publication date: 2012-04-17

    Application No.: US11054173

    Filing date: 2005-02-09

    IPC classes: G06F13/00 G06F13/28 G06F15/00

    Abstract: A method and apparatus for performing data prefetch in a multiprocessor system are disclosed. The multiprocessor system includes multiple processors, each having a cache memory. The cache memory is subdivided into multiple slices. A group of prefetch requests is initially issued by a requesting processor in the multiprocessor system. Each prefetch request is intended for one of the respective slices of the cache memory of the requesting processor. In response to the prefetch requests missing in the cache memory of the requesting processor, the prefetch requests are merged into one combined prefetch request. The combined prefetch request is then sent to the cache memories of all the non-requesting processors within the multiprocessor system. In response to a combined clean response from the cache memories of all the non-requesting processors, data are then obtained for the combined prefetch request from a system memory.

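    A minimal sketch of the merge-and-forward flow is given below. Dictionaries stand in for cache contents and a list of addresses stands in for the combined prefetch request; none of these names or data structures come from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class Prefetch:
    address: int
    slice_id: int    # slice of the requesting processor's cache it targets

def merge_missed_prefetches(requests: List[Prefetch],
                            local_cache: Dict[int, bytes]) -> List[int]:
    """Merge the prefetches that miss in the requesting processor's cache
    into one combined prefetch request (modeled as a list of addresses)."""
    return [r.address for r in requests if r.address not in local_cache]

def service_combined_request(combined: List[int],
                             peer_caches: List[Dict[int, bytes]],
                             system_memory: Dict[int, bytes]) -> Dict[int, bytes]:
    """If every non-requesting cache replies clean (holds none of the lines),
    obtain the data for the combined request from system memory."""
    all_clean = all(addr not in peer for addr in combined for peer in peer_caches)
    if all_clean:
        return {addr: system_memory[addr] for addr in combined}
    return {}   # a peer holds a copy; intervention is not modeled in this sketch

if __name__ == "__main__":
    reqs = [Prefetch(0x100, 0), Prefetch(0x140, 1)]
    combined = merge_missed_prefetches(reqs, local_cache={})
    print(service_combined_request(combined, peer_caches=[{}],
                                   system_memory={0x100: b"a", 0x140: b"b"}))
```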

24. Data processing system, method and interconnect fabric supporting multiple planes of processing nodes
    Invention grant (In force)

    Publication No.: US07818388B2

    Publication date: 2010-10-19

    Application No.: US11245887

    Filing date: 2005-10-07

    IPC classes: G06F15/16

    CPC classes: G06F15/16

    Abstract: A data processing system includes a first plane including a first plurality of processing nodes, each including multiple processing units, and a second plane including a second plurality of processing nodes, each including multiple processing units. The data processing system also includes a plurality of point-to-point first tier links. Each of the first plurality and second plurality of processing nodes includes one or more first tier links among the plurality of first tier links, where the first tier link(s) within each processing node connect a pair of processing units in the same processing node for communication. The data processing system further includes a plurality of point-to-point second tier links. At least a first of the plurality of second tier links connects processing units in different ones of the first plurality of processing nodes, at least a second of the plurality of second tier links connects processing units in different ones of the second plurality of processing nodes, and at least a third of the plurality of second tier links connects a processing unit in the first plane to a processing unit in the second plane.

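    To make the two-tier wiring concrete, the sketch below enumerates first tier links (pairs of units inside one node) and a handful of second tier links (cross-node and cross-plane). The specific second tier wiring chosen here is an arbitrary example for illustration, not the routing defined by the patent.

```python
from itertools import combinations
from typing import Set, Tuple

# (plane, node, unit) identifies one processing unit in the topology.
Unit = Tuple[int, int, int]
Link = Tuple[Unit, Unit]

def build_links(planes: int, nodes_per_plane: int, units_per_node: int):
    first_tier: Set[Link] = set()
    second_tier: Set[Link] = set()
    for p in range(planes):
        for n in range(nodes_per_plane):
            units = [(p, n, u) for u in range(units_per_node)]
            # First tier links: pairs of processing units in the same node.
            first_tier.update(combinations(units, 2))
    # Second tier links (arbitrary illustrative wiring, assumes two planes):
    for p in range(planes):
        for n1, n2 in combinations(range(nodes_per_plane), 2):
            second_tier.add(((p, n1, 0), (p, n2, 0)))   # same plane, different nodes
    for n in range(nodes_per_plane):
        second_tier.add(((0, n, 1), (1, n, 1)))         # first plane to second plane
    return first_tier, second_tier

if __name__ == "__main__":
    ft, st = build_links(planes=2, nodes_per_plane=2, units_per_node=4)
    print(len(ft), "first tier links,", len(st), "second tier links")
```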

25. DATA PROCESSING SYSTEM, METHOD AND INTERCONNECT FABRIC SUPPORTING MULTIPLE PLANES OF PROCESSING NODES
    Invention application (Pending, published)

    Publication No.: US20080225863A1

    Publication date: 2008-09-18

    Application No.: US12124639

    Filing date: 2008-05-21

    IPC classes: H04L12/56

    CPC classes: G06F15/16

    Abstract: A data processing system includes a first plane including a first plurality of processing nodes, each including multiple processing units, and a second plane including a second plurality of processing nodes, each including multiple processing units. The data processing system also includes a plurality of point-to-point first tier links. Each of the first plurality and second plurality of processing nodes includes one or more first tier links among the plurality of first tier links, where the first tier link(s) within each processing node connect a pair of processing units in the same processing node for communication. The data processing system further includes a plurality of point-to-point second tier links. At least a first of the plurality of second tier links connects processing units in different ones of the first plurality of processing nodes, at least a second of the plurality of second tier links connects processing units in different ones of the second plurality of processing nodes, and at least a third of the plurality of second tier links connects a processing unit in the first plane to a processing unit in the second plane.


26. Mode-based castout destination selection
    Invention grant (Expired)

    Publication No.: US08312220B2

    Publication date: 2012-11-13

    Application No.: US12420933

    Filing date: 2009-04-09

    IPC classes: G06F12/08

    CPC classes: G06F12/0811 G06F12/12

    Abstract: In response to a data request of a first of a plurality of processing units, the first processing unit selects a victim cache line to be castout from the lower level cache of the first processing unit and determines whether a mode is set. If not, the first processing unit issues on the interconnect fabric an LCO command identifying the victim cache line and indicating that a lower level cache is the intended destination. If the mode is set, the first processing unit issues a castout command with an alternative intended destination. In response to a coherence response to the LCO command indicating success of the LCO command, the first processing unit removes the victim cache line from its lower level cache, and the victim cache line is held elsewhere in the data processing system. The mode can be set to inhibit castouts to system memory, for example, for testing.

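    The mode check can be pictured as below. Because the abstract leaves the alternative destination open, the sketch keeps it as a parameter; the function names and the set-of-tags model of the lower level cache are assumptions made for illustration.

```python
from typing import Set

def castout_destination(mode_set: bool,
                        alternative: str = "alternative_destination") -> str:
    """Mode clear: issue an LCO naming a peer lower level cache as the
    intended destination.  Mode set: issue a castout command with an
    alternative intended destination (kept abstract in this sketch)."""
    return alternative if mode_set else "peer_lower_level_cache"

def on_lco_coherence_response(success: bool,
                              source_l3: Set[int],
                              victim_tag: int) -> None:
    """On a coherence response indicating the LCO succeeded, remove the
    victim line from the issuing lower level cache; the line is then held
    elsewhere in the data processing system."""
    if success:
        source_l3.discard(victim_tag)

if __name__ == "__main__":
    print(castout_destination(mode_set=False))    # peer_lower_level_cache
    l3 = {0x200}
    on_lco_coherence_response(True, l3, 0x200)
    print(l3)                                     # set()
```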

27. Victim cache prefetching
    Invention grant (Expired)

    Publication No.: US08209489B2

    Publication date: 2012-06-26

    Application No.: US12256064

    Filing date: 2008-10-22

    IPC classes: G06F12/08

    Abstract: A processing unit for a multiprocessor data processing system includes a processor core and a cache hierarchy coupled to the processor core to provide low latency data access. The cache hierarchy includes an upper level cache coupled to the processor core and a lower level victim cache coupled to the upper level cache. In response to a prefetch request of the processor core that misses in the upper level cache, the lower level victim cache determines whether the prefetch request misses in the directory of the lower level victim cache and, if so, allocates a state machine in the lower level victim cache that services the prefetch request by issuing the prefetch request to at least one other processing unit of the multiprocessor data processing system.

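    A small model of the miss path is sketched below. The VictimCache class, the set-based directory, and the lists standing in for prefetch state machines and the interconnect fabric are all illustrative assumptions rather than structures named by the patent.

```python
from dataclasses import dataclass, field
from typing import List, Set

@dataclass
class VictimCache:
    directory: Set[int] = field(default_factory=set)            # tags held in the L3
    prefetch_machines: List[int] = field(default_factory=list)  # modeled state machines

    def prefetch_missed_upper_level(self, tag: int, fabric: List[str]) -> bool:
        """Handle a core prefetch that has already missed in the upper level
        cache.  Returns True if the victim cache can supply the line itself."""
        if tag in self.directory:
            return True    # hit in the victim cache directory
        # Miss: allocate a state machine and issue the prefetch request to
        # at least one other processing unit over the interconnect fabric.
        self.prefetch_machines.append(tag)
        fabric.append(f"PREFETCH {tag:#x}")
        return False

if __name__ == "__main__":
    l3, fabric = VictimCache(), []
    print(l3.prefetch_missed_upper_level(0x300, fabric), fabric)
```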

28. Empirically Based Dynamic Control of Transmission of Victim Cache Lateral Castouts
    Invention application (In force)

    Publication No.: US20100262778A1

    Publication date: 2010-10-14

    Application No.: US12421180

    Filing date: 2009-04-09

    IPC classes: G06F12/08

    Abstract: In response to a data request, a victim cache line is selected for castout from a lower level cache, and a target lower level cache of one of the plurality of processing units is selected. A determination is made whether the selected target lower level cache has provided more than a threshold number of retry responses to lateral castout (LCO) commands of the first lower level cache, and if so, a different target lower level cache is selected. The first processing unit thereafter issues an LCO command on the interconnect fabric. The LCO command identifies the victim cache line to be castout and indicates that the target lower level cache is an intended destination of the victim cache line. In response to a successful coherence response to the LCO command, the victim cache line is removed from the first lower level cache and held in the second lower level cache.

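    The retry-threshold steering can be modeled as follows. The LcoTargetSelector class, the RETRY_THRESHOLD value, and the fallback policy of picking the least-retried target are assumptions made for the sketch; the abstract only requires that a different target be selected once the threshold is exceeded.

```python
from collections import defaultdict
from typing import Dict, List

RETRY_THRESHOLD = 4    # assumption: retries tolerated before re-targeting

class LcoTargetSelector:
    """Track retry responses per target lower level cache and steer LCO
    commands away from targets that have retried too often."""

    def __init__(self, targets: List[str]) -> None:
        self.targets = targets
        self.retries: Dict[str, int] = defaultdict(int)

    def record_retry(self, target: str) -> None:
        self.retries[target] += 1

    def pick_target(self, preferred: str) -> str:
        if self.retries[preferred] <= RETRY_THRESHOLD:
            return preferred
        # Too many retries from the preferred cache: choose a different target
        # (here, the one that has retried least).
        return min(self.targets, key=lambda t: self.retries[t])

if __name__ == "__main__":
    sel = LcoTargetSelector(["L3_B", "L3_C", "L3_D"])
    for _ in range(5):
        sel.record_retry("L3_B")
    print(sel.pick_target("L3_B"))    # re-targets to a less-retried cache
```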

29. Data processing system, method and interconnect fabric supporting destination data tagging
    Invention grant (Expired)

    Publication No.: US07761631B2

    Publication date: 2010-07-20

    Application No.: US12117539

    Filing date: 2008-05-08

    IPC classes: G06F13/42 G06F13/00

    CPC classes: G06F15/16

    Abstract: A data processing system includes a plurality of communication links and a plurality of processing units including a local master processing unit. The local master processing unit includes interconnect logic that couples the processing unit to one or more of the plurality of communication links and an originating master coupled to the interconnect logic. The originating master originates an operation by issuing a write-type request on at least one of the one or more communication links, receives from a snooper in the data processing system a destination tag identifying a route to the snooper, and, responsive to receipt of a combined response and the destination tag, initiates a data transfer including a data payload and a data tag identifying the route provided within the destination tag.

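    The tag-echo behavior can be pictured with the sketch below: the originating master copies the route from the snooper-supplied destination tag into the data tag of the transfer. The DestinationTag and DataTransfer classes and the boolean standing in for the combined response are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DestinationTag:
    route: str    # route back to the snooper, e.g. "node1/unit2" (illustrative)

@dataclass
class DataTransfer:
    payload: bytes
    data_tag: str    # identifies the route taken from the destination tag

def originate_write(payload: bytes,
                    snooper_tag: DestinationTag,
                    combined_response_ok: bool) -> Optional[DataTransfer]:
    """After issuing a write-type request, the originating master uses the
    destination tag received from the snooper to route the data payload."""
    if not combined_response_ok:
        return None
    return DataTransfer(payload=payload, data_tag=snooper_tag.route)

if __name__ == "__main__":
    print(originate_write(b"\xff" * 64, DestinationTag("node1/unit2"), True))
```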

30. Victim Cache Replacement
    Invention application (In force)

    Publication No.: US20100100682A1

    Publication date: 2010-04-22

    Application No.: US12256002

    Filing date: 2008-10-22

    IPC classes: G06F12/08

    Abstract: A data processing system includes a processor core having an associated upper level cache and a lower level victim cache. In response to a memory access request of the processor core that specifies a non-modifying access to a target coherency granule, a determination is made whether the memory access request hits or misses in a directory of the lower level victim cache. In response to determining that the memory access request hits in the lower level victim cache in a data-valid coherence state, the lower level victim cache provides the target coherency granule of the memory access request to the upper level cache. The lower level victim cache preserves the target coherency granule in the lower level victim cache in a shared coherence state if the memory access request is of a first type and invalidates the target coherency granule if the memory access request is of a second type.

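    A minimal model of the hit handling is sketched below. The ReadKind enum (READ versus READ_ONCE) is an assumed interpretation of the "first type" and "second type" of non-modifying access in the abstract, and the tuple-valued dictionary standing in for the victim cache directory is likewise only an illustration.

```python
from enum import Enum, auto
from typing import Dict, Optional, Tuple

class State(Enum):
    SHARED = auto()
    INVALID = auto()

class ReadKind(Enum):
    READ = auto()         # assumed "first type": likely to be re-read
    READ_ONCE = auto()    # assumed "second type": no expected reuse

def lookup_nonmodifying(victim_cache: Dict[int, Tuple[State, bytes]],
                        tag: int, kind: ReadKind) -> Optional[bytes]:
    """On a non-modifying access that hits the lower level victim cache in a
    data-valid state, supply the granule to the upper level cache and either
    keep it SHARED or invalidate it, depending on the request type."""
    entry = victim_cache.get(tag)
    if entry is None or entry[0] is State.INVALID:
        return None    # miss: not modeled in this sketch
    state, data = entry
    if kind is ReadKind.READ:
        victim_cache[tag] = (State.SHARED, data)     # preserve in shared state
    else:
        victim_cache[tag] = (State.INVALID, data)    # second type: invalidate
    return data

if __name__ == "__main__":
    cache = {0x500: (State.SHARED, b"data")}
    print(lookup_nonmodifying(cache, 0x500, ReadKind.READ_ONCE), cache[0x500][0])
```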