Reducing Wiring Congestion in a Cache Subsystem Utilizing Sectored Caches with Discontiguous Addressing
    1.
    发明申请
    Reducing Wiring Congestion in a Cache Subsystem Utilizing Sectored Caches with Discontiguous Addressing 失效
    减少使用不连续寻址的扇区高速缓存子系统中的接线拥塞

    公开(公告)号:US20090049248A1

    公开(公告)日:2009-02-19

    申请号:US11839663

    申请日:2007-08-16

    IPC分类号: G06F12/08

    摘要: A method and computer system for reducing the wiring congestion, required real estate, and access latency in a cache subsystem with a sectored and sliced lower cache by re-configuring sector-to-slice allocation and the lower cache addressing scheme. With this allocation, sectors having discontiguous addresses are placed within the same slice, and a reduced-wiring scheme is possible between two levels of lower caches based on this re-assignment of the addressable sectors within the cache slices. Additionally, the lower cache effective address tag is re-configured such that the address fields previously allocated to identifying the sector and the slice are switched relative to each other's location within the address tag. This re-allocation of the address bits enables direct slice addressing based on the indicated sector.

    摘要翻译: 一种方法和计算机系统,用于通过重新配置扇区到片分配和较低的高速缓存寻址方案来减少具有扇区和分片的低级高速缓存的高速缓存子系统中的布线拥塞,所需的房地产和访问延迟。 通过这种分配,具有不连续地址的扇区被放置在相同的片内,并且基于对高速缓存片内的可寻址扇区的这种重新分配,可以在两级低级高速缓存之间进行简化布线方案。 此外,低速缓存有效地址标签被重新配置,使得先前分配用于识别扇区和片的地址字段相对于地址标签内的彼此的位置被切换。 地址位的这种重新分配使得能够基于指示的扇区进行直接片寻址。

    Reducing wiring congestion in a cache subsystem utilizing sectored caches with discontiguous addressing
    2.
    发明授权
    Reducing wiring congestion in a cache subsystem utilizing sectored caches with discontiguous addressing 失效
    减少缓存子系统中的布线拥塞,利用具有不连续寻址的扇区缓存

    公开(公告)号:US08433851B2

    公开(公告)日:2013-04-30

    申请号:US11839663

    申请日:2007-08-16

    IPC分类号: G06F12/08

    摘要: A method and computer system for reducing the wiring congestion, required real estate, and access latency in a cache subsystem with a sectored and sliced lower cache by re-configuring sector-to-slice allocation and the lower cache addressing scheme. With this allocation, sectors having discontiguous addresses are placed within the same slice, and a reduced-wiring scheme is possible between two levels of lower caches based on this re-assignment of the addressable sectors within the cache slices. Additionally, the lower cache effective address tag is re-configured such that the address fields previously allocated to identifying the sector and the slice are switched relative to each other's location within the address tag. This re-allocation of the address bits enables direct slice addressing based on the indicated sector.

    摘要翻译: 一种方法和计算机系统,用于通过重新配置扇区到片分配和较低的高速缓存寻址方案来减少具有扇区和分片的低级高速缓存的高速缓存子系统中的布线拥塞,所需的房地产和访问延迟。 通过这种分配,具有不连续地址的扇区被放置在相同的片内,并且基于对高速缓存片内的可寻址扇区的这种重新分配,可以在两级低级高速缓存之间进行简化布线方案。 此外,低速缓存有效地址标签被重新配置,使得先前分配用于识别扇区和片的地址字段相对于地址标签内的彼此的位置被切换。 地址位的这种重新分配使得能够基于指示的扇区进行直接片寻址。

    L2 CACHE CONTROLLER WITH SLICE DIRECTORY AND UNIFIED CACHE STRUCTURE
    4.
    发明申请
    L2 CACHE CONTROLLER WITH SLICE DIRECTORY AND UNIFIED CACHE STRUCTURE 失效
    L2缓存控制器,具有SLICE DIRECTORY和统一的高速缓存结构

    公开(公告)号:US20090083489A1

    公开(公告)日:2009-03-26

    申请号:US12325266

    申请日:2008-12-01

    IPC分类号: G06F12/08

    CPC分类号: G06F12/0851 G06F12/0811

    摘要: A cache memory logically partitions a cache array having a single access/command port into at least two slices, and uses a first directory to access the first array slice while using a second directory to access the second array slice, but accesses from the cache directories are managed using a single cache arbiter which controls the single access/command port. In one embodiment, each cache directory has its own directory arbiter to handle conflicting internal requests, and the directory arbiters communicate with the cache arbiter. The cache array is arranged with rows and columns of cache sectors wherein a cache line is spread across sectors in different rows and columns, with a portion of the given cache line being located in a first column having a first latency and another portion of the given cache line being located in a second column having a second latency greater than the first latency.

    摘要翻译: 缓存存储器将具有单个访问/命令端口的高速缓存阵列逻辑地分区成至少两个切片,并且使用第一目录来访问第一阵列片,同时使用第二目录来访问第二阵列片,但是从高速缓存目录 使用控制单个访问/命令端口的单个缓存仲裁器进行管理。 在一个实施例中,每个高速缓存目录具有其自己的目录仲裁器来处理冲突的内部请求,并且目录仲裁器与高速缓存仲裁器通信。 高速缓存阵列布置有高速缓存扇区的行和列,其中高速缓存行分布在不同行和列中的扇区之间,其中一部分给定高速缓存行位于具有第一延迟的第一列中,并且给定的另一部分 高速缓存线位于具有大于第一等待时间的第二等待时间的第二列中。

    L2 cache controller with slice directory and unified cache structure
    6.
    发明授权
    L2 cache controller with slice directory and unified cache structure 失效
    L2缓存控制器具有片目录和统一缓存结构

    公开(公告)号:US08001330B2

    公开(公告)日:2011-08-16

    申请号:US12325266

    申请日:2008-12-01

    IPC分类号: G06F13/00

    CPC分类号: G06F12/0851 G06F12/0811

    摘要: A cache memory logically partitions a cache array having a single access/command port into at least two slices, and uses a first directory to access the first array slice while using a second directory to access the second array slice, but accesses from the cache directories are managed using a single cache arbiter which controls the single access/command port. In one embodiment, each cache directory has its own directory arbiter to handle conflicting internal requests, and the directory arbiters communicate with the cache arbiter. The cache array is arranged with rows and columns of cache sectors wherein a cache line is spread across sectors in different rows and columns, with a portion of the given cache line being located in a first column having a first latency and another portion of the given cache line being located in a second column having a second latency greater than the first latency.

    摘要翻译: 缓存存储器将具有单个访问/命令端口的高速缓存阵列逻辑地分区成至少两个切片,并且使用第一目录来访问第一阵列片,同时使用第二目录来访问第二阵列片,但是从高速缓存目录 使用控制单个访问/命令端口的单个缓存仲裁器进行管理。 在一个实施例中,每个高速缓存目录具有其自己的目录仲裁器来处理冲突的内部请求,并且目录仲裁器与高速缓存仲裁器通信。 高速缓存阵列布置有高速缓存扇区的行和列,其中高速缓存行分布在不同行和列中的扇区之间,其中一部分给定高速缓存行位于具有第一延迟的第一列中,并且给定的另一部分 高速缓存线位于具有大于第一等待时间的第二等待时间的第二列中。

    L2 cache controller with slice directory and unified cache structure
    7.
    发明授权
    L2 cache controller with slice directory and unified cache structure 失效
    L2缓存控制器具有片目录和统一缓存结构

    公开(公告)号:US07490200B2

    公开(公告)日:2009-02-10

    申请号:US11054924

    申请日:2005-02-10

    IPC分类号: G06F12/08

    CPC分类号: G06F12/0851 G06F12/0811

    摘要: A cache memory logically partitions a cache array having a single access/command port into at least two slices, and uses a first cache directory to access the first cache array slice while using a second cache directory to access the second cache array slice, but accesses from the cache directories are managed using a single cache arbiter which controls the single access/command port. In the illustrative embodiment, each cache directory has its own directory arbiter to handle conflicting internal requests, and the directory arbiters communicate with the cache arbiter. An address tag associated with a load request is transmitted from the processor core with a designated bit that associates the address tag with only one of the cache array slices whose corresponding directory determines whether the address tag matches a currently valid cache entry. The cache array may be arranged with rows and columns of cache sectors wherein a given cache line is spread across sectors in different rows and columns, with at least one portion of the given cache line being located in a first column having a first latency and another portion of the given cache line being located in a second column having a second latency greater than the first latency. The cache array outputs different sectors of the given cache line in successive clock cycles based on the latency of a given sector.

    摘要翻译: 高速缓存存储器将具有单个访问/命令端口的高速缓存阵列逻辑地分割成至少两个切片,并且使用第一高速缓存目录访问第一高速缓存阵列切片,同时使用第二高速缓存目录来访问第二高速缓存阵列切片,但是访问 从缓存目录中使用单个缓存仲裁器来管理单个访问/命令端口。 在说明性实施例中,每个高速缓存目录具有其自己的目录仲裁器来处理冲突的内部请求,并且目录仲裁器与缓存仲裁器通信。 与处理器核心相关联的地址标签被从处理器核心以指定的位发送,指定的位将地址标签与只有一个高速缓存阵列片相关联,其相应的目录确定地址标签是否与当前有效的高速缓存条目匹配。 高速缓存阵列可以布置有高速缓存扇区的行和列,其中给定的高速缓存行分布在不同行和列中的扇区之间,其中给定高速缓存行的至少一部分位于具有第一等待时间的第一列和另一个 给定高速缓存行的一部分位于具有大于第一等待时间的第二等待时间的第二列中。 缓存阵列基于给定扇区的等待时间在连续的时钟周期中输出给定高速缓存行的不同扇区。

    Multiprocessor system in which a cache serving as a highest point of coherency is indicated by a snoop response
    8.
    发明授权
    Multiprocessor system in which a cache serving as a highest point of coherency is indicated by a snoop response 失效
    多处理器系统,其中作为最高点的一致性的缓存由窥探响应指示

    公开(公告)号:US06405289B1

    公开(公告)日:2002-06-11

    申请号:US09437196

    申请日:1999-11-09

    IPC分类号: G06F1200

    摘要: A method of maintaining cache coherency, by designating one cache that owns a line as a highest point of coherency (HPC) for a particular memory block, and sending a snoop response from the cache indicating that it is currently the HPC for the memory block and can service a request. The designation may be performed in response to a particular coherency state assigned to the cache line, or based on the setting of a coherency token bit for the cache line. The processing units may be grouped into clusters, while the memory is distributed using memory arrays associated with respective clusters. One memory array is designated as the lowest point of coherency (LPC) for the memory block (i.e., a fixed assignment) while the cache designated as the HPC is dynamic (i.e., changes as different caches gain ownership of the line). An acknowledgement snoop response is sent from the LPC memory array, and a combined response is returned to the requesting device which gives priority to the HPC snoop response over the LPC snoop response.

    摘要翻译: 通过将一个具有一行的高速缓存指定为特定存储器块的最高一致性(HPC),以及从高速缓存指示其当前是存储器块的HPC的高速缓存发送侦听响应的方法来维持高速缓存一致性的方法,以及 可以服务请求。 可以响应于分配给高速缓存行的特定一致性状态,或者基于高速缓存行的相关性令牌位的设置来执行指定。 处理单元可以被分组成群集,而存储器是使用与相应簇相关联的存储器阵列分布的。 一个存储器阵列被指定为存储器块的一致性(LPC)的最低点(即,固定分配),而指定为HPC的缓存是动态的(即,随着不同的高速缓存获得线的所有权而改变)。 从LPC存储器阵列发送确认窥探响应,并且将组合的响应返回给请求设备,该请求设备通过LPC窥探响应优先考虑HPC侦听响应。

    Optimized cache allocation algorithm for multiple speculative requests
    9.
    发明授权
    Optimized cache allocation algorithm for multiple speculative requests 失效
    针对多个推测请求的优化缓存分配算法

    公开(公告)号:US06393528B1

    公开(公告)日:2002-05-21

    申请号:US09345714

    申请日:1999-06-30

    IPC分类号: G06F1200

    CPC分类号: G06F12/0862 G06F12/127

    摘要: A method of operating a computer system is disclosed in which an instruction having an explicit prefetch request is issued directly from an instruction sequence unit to a prefetch unit of a processing unit. In a preferred embodiment, two prefetch units are used, the first prefetch unit being hardware independent and dynamically monitoring one or more active streams associated with operations carried out by a core of the processing unit, and the second prefetch unit being aware of the lower level storage subsystem and sending with the prefetch request an indication that a prefetch value is to be loaded into a lower level cache of the processing unit. The invention may advantageously associate each prefetch request with a stream ID of an associated processor stream, or a processor ID of the requesting processing unit (the latter feature is particularly useful for caches which are shared by a processing unit cluster). If another prefetch value is requested from the memory hiearchy and it is determined that a prefetch limit of cache usage has been met by the cache, then a cache line in the cache containing one of the earlier prefetch values is allocated for receiving the other prefetch value.

    摘要翻译: 公开了一种操作计算机系统的方法,其中具有显式预取请求的指令直接从指令序列单元发送到处理单元的预取单元。 在优选实施例中,使用两个预取单元,第一预取单元是硬件独立的,并且动态地监视与由处理单元的核心执行的操作相关联的一个或多个活动流,并且第二预取单元知道较低级别 存储子系统,并用预取请求发送将预取值加载到处理单元的较低级缓存中的指示。 本发明可以有利地将每个预取请求与相关联的处理器流的流ID或请求处理单元的处理器ID相关联(后一特征对于由处理单元簇共享的高速缓存特别有用)。 如果从存储器hiearchy请求另一个预取值,并且确定高速缓存的高速缓存使用的预取限制已被满足,则包含先前预取值中的一个的高速缓存行中的高速缓存行被分配用于接收另一个预取值 。

    Time based mechanism for cached speculative data deallocation
    10.
    发明授权
    Time based mechanism for cached speculative data deallocation 失效
    缓存的推测数据释放的基于时间的机制

    公开(公告)号:US06510494B1

    公开(公告)日:2003-01-21

    申请号:US09345716

    申请日:1999-06-30

    IPC分类号: G06F1208

    摘要: A method of operating a processing unit of a computer system, by issuing an instruction having an explicit prefetch request directly from an instruction sequence unit to a prefetch unit of the processing unit. The invention applies to values that are either operand data or instructions. In a preferred embodiment, two prefetch units are used, the first prefetch unit being hardware independent and dynamically monitoring one or more active streams associated with operations carried out by a core of the processing unit, and the second prefetch unit being aware of the lower level storage subsystem and sending with the pref etch request an indication that a prefetch value is to be loaded into a lower level cache of the processing unit. The invention may advantageously associate each prefetch request with a stream ID of an associated processor stream, or a processor ID of the requesting processing unit.

    摘要翻译: 一种操作计算机系统的处理单元的方法,通过从指令序列单元向处理单元的预取单元发出具有显式预取请求的指令。 本发明适用于作为操作数数据或指令的值。 在优选实施例中,使用两个预取单元,第一预取单元是硬件独立的,并且动态地监视与由处理单元的核心执行的操作相关联的一个或多个活动流,并且第二预取单元知道较低级别 存储子系统,并用pref蚀刻请求发送将预取值加载到处理单元的较低级缓存中的指示。 本发明可以有利地将每个预取请求与相关联的处理器流的流ID或请求处理单元的处理器ID相关联。