Delaying cache data array updates
    11.
    Granted patent
    Delaying cache data array updates (In force)

    Publication number: US09229866B2

    Publication date: 2016-01-05

    Application number: US14089014

    Filing date: 2013-11-25

    Applicant: Apple Inc.

    CPC classification number: G06F12/0811 G06F12/0842 G06F12/0857 G06F12/0888

    Abstract: Systems, methods, and apparatuses for reducing writes to the data array of a cache. A cache hierarchy includes one or more L1 caches and a L2 cache inclusive of the L2 cache(s). When a request from the L1 cache misses in the L2 cache, the L2 cache sends a fill request to memory. When the fill data returns from memory, the L2 cache delays writing the fill data to its data array. Instead, this cache line is written to the L1 cache and a clean-evict bit corresponding to the cache line is set in the L1 cache. When the L1 cache evicts this cache line, the L1 cache will write back the cache line to the L2 cache even if the cache line has not been modified.
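The delayed-fill path the abstract describes can be sketched as a small software model. This is an illustrative sketch only; the class names, dictionary-based arrays, and write counter are invented here, not taken from the patent:

```python
# Sketch of the delayed L2 data-array update: fill data returning from
# memory is written only to L1, with a clean-evict bit forcing a later
# writeback so L2 eventually receives its copy.

class L1Line:
    def __init__(self, data, clean_evict=False, dirty=False):
        self.data = data
        self.clean_evict = clean_evict  # set when L2 skipped its data-array write
        self.dirty = dirty

class TwoLevelCache:
    def __init__(self, memory):
        self.memory = memory   # backing store: addr -> data
        self.l1 = {}           # addr -> L1Line
        self.l2 = {}           # addr -> data (the L2 data array)
        self.l2_writes = 0     # counts L2 data-array writes

    def load(self, addr):
        if addr in self.l1:
            return self.l1[addr].data
        if addr in self.l2:
            # L2 hit: ordinary fill into L1, no clean-evict bit needed.
            self.l1[addr] = L1Line(self.l2[addr])
            return self.l2[addr]
        # L2 miss: fill from memory straight into L1, delaying the
        # L2 data-array write; mark the line for a forced writeback.
        data = self.memory[addr]
        self.l1[addr] = L1Line(data, clean_evict=True)
        return data

    def evict_l1(self, addr):
        line = self.l1.pop(addr)
        if line.dirty or line.clean_evict:
            # Write back even unmodified data when clean-evict is set,
            # so the L2 data array finally gets its copy.
            self.l2[addr] = line.data
            self.l2_writes += 1
```

In this model the L2 data array is written once per line (at eviction) instead of once at fill time and again at writeback.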


    Flush engine
    12.
    Granted patent
    Flush engine (In force)

    Publication number: US09128857B2

    Publication date: 2015-09-08

    Application number: US13734444

    Filing date: 2013-01-04

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed related to flushing one or more data caches. In one embodiment an apparatus includes a processing element, a first cache associated with the processing element, and a circuit configured to copy modified data from the first cache to a second cache in response to determining an activity level of the processing element. In this embodiment, the apparatus is configured to alter a power state of the first cache after the circuit copies the modified data. The first cache may be at a lower level in a memory hierarchy relative to the second cache. In one embodiment, the circuit is also configured to copy data from the second cache to a third cache or a memory after a particular time interval. In some embodiments, the circuit is configured to copy data while one or more pipeline elements of the apparatus are in a low-power state.
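A minimal sketch of the flush-engine behavior in the abstract: when the processing element's activity is low enough, modified lines are copied down a level and the flushed cache's power state is altered. The threshold value, power-state labels, and class layout are assumptions made for this example, not details from the patent:

```python
# Sketch of a flush engine: copy modified data from a first cache to a
# second cache in response to a low activity level, then lower the first
# cache's power state.

class FlushEngine:
    ACTIVITY_THRESHOLD = 10  # assumed cutoff, not from the patent

    def __init__(self):
        self.l1 = {}            # addr -> (data, dirty)
        self.l2 = {}            # addr -> data
        self.l1_power = "on"

    def maybe_flush(self, activity_level):
        if activity_level >= self.ACTIVITY_THRESHOLD:
            return False        # processing element still busy; do nothing
        # Copy modified data down one level...
        for addr, (data, dirty) in self.l1.items():
            if dirty:
                self.l2[addr] = data
        self.l1.clear()
        # ...then alter the power state of the flushed cache.
        self.l1_power = "retention"
        return True
```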


    SELECTIVE VICTIMIZATION IN A MULTI-LEVEL CACHE HIERARCHY
    13.
    Patent application
    SELECTIVE VICTIMIZATION IN A MULTI-LEVEL CACHE HIERARCHY (In force)

    Publication number: US20150149721A1

    Publication date: 2015-05-28

    Application number: US14088980

    Filing date: 2013-11-25

    Applicant: Apple Inc.

    Abstract: Systems, methods, and apparatuses for implementing selective victimization to reduce power and utilized bandwidth in a multi-level cache hierarchy. Each set of an upper-level cache includes a counter that keeps track of the number of times the set was accessed. These counters are periodically decremented by another counter that tracks the total number of accesses to the cache. If a given set counter is below a certain threshold value, clean victims are dropped from this given set instead of being sent to a lower-level cache. Also, a separate counter is used to track the total number of outstanding requests for the cache as a proxy for bus-bandwidth in order to gauge the total amount of traffic in the system. The cache will implement selective victimization whenever there is a large amount of traffic in the system.
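The counter scheme above can be sketched as follows. The specific threshold values and decay period are placeholders chosen for the example; the patent does not specify them:

```python
# Sketch of selective victimization: per-set access counters decayed by a
# global access counter, plus an outstanding-request counter as a proxy
# for bus bandwidth. Clean victims from cold sets are dropped when the
# system is busy, instead of being sent to the lower-level cache.

class SelectiveVictimizer:
    SET_THRESHOLD = 4        # assumed: below this, a set counts as "cold"
    TRAFFIC_THRESHOLD = 16   # assumed: above this, the system is "busy"
    DECAY_PERIOD = 64        # assumed: decay counters every N total accesses

    def __init__(self, num_sets):
        self.set_counters = [0] * num_sets   # per-set access counts
        self.total_accesses = 0              # drives the periodic decay
        self.outstanding = 0                 # outstanding requests (bandwidth proxy)

    def record_access(self, set_idx):
        self.set_counters[set_idx] += 1
        self.total_accesses += 1
        if self.total_accesses % self.DECAY_PERIOD == 0:
            # Periodically decrement every set counter.
            self.set_counters = [max(0, c - 1) for c in self.set_counters]

    def should_drop_clean_victim(self, set_idx):
        # Drop the clean victim (rather than victimize it to the lower
        # level) only when traffic is heavy and the set is cold.
        busy = self.outstanding >= self.TRAFFIC_THRESHOLD
        cold = self.set_counters[set_idx] < self.SET_THRESHOLD
        return busy and cold
```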


    CACHE POLICIES FOR UNCACHEABLE MEMORY REQUESTS
    14.
    Patent application
    CACHE POLICIES FOR UNCACHEABLE MEMORY REQUESTS (In force)

    Publication number: US20140181403A1

    Publication date: 2014-06-26

    Application number: US13725066

    Filing date: 2012-12-21

    Applicant: APPLE INC.

    CPC classification number: G06F12/0811 G06F12/0815 G06F12/0888

    Abstract: Systems, processors, and methods for keeping uncacheable data coherent. A processor includes a multi-level cache hierarchy, and uncacheable load memory operations can be cached at any level of the cache hierarchy. If an uncacheable load misses in the L2 cache, then allocation of the uncacheable load will be restricted to a subset of the ways of the L2 cache. If an uncacheable store memory operation hits in the L1 cache, then the hit cache line can be updated with the data from the memory operation. If the uncacheable store misses in the L1 cache, then the uncacheable store is sent to a core interface unit. Multiple contiguous store misses are merged into larger blocks of data in the core interface unit before being sent to the L2 cache.
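The store-merging step at the end of the abstract can be sketched as a buffer that coalesces byte writes by block address. The class name, block size, and drain interface are assumptions made for this illustration:

```python
# Sketch of the core interface unit's merge behavior: contiguous
# uncacheable store misses landing in the same block are combined into
# one larger write before being sent to the L2 cache.

class CoreInterfaceUnit:
    BLOCK = 64  # assumed block size in bytes

    def __init__(self):
        self.pending = {}  # block_addr -> {offset_in_block: byte}

    def store(self, addr, data: bytes):
        # Merge each byte into the pending entry for its block; stores
        # that cross a block boundary are split across two entries.
        for i, b in enumerate(data):
            a = addr + i
            block = a - a % self.BLOCK
            self.pending.setdefault(block, {})[a - block] = b

    def drain(self):
        # Send each merged block downstream as a single larger write.
        sent = [(block, dict(offsets)) for block, offsets in self.pending.items()]
        self.pending.clear()
        return sent
```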


    Combining write buffer with dynamically adjustable flush metrics
    15.
    Granted patent
    Combining write buffer with dynamically adjustable flush metrics (In force)

    Publication number: US08566528B2

    Publication date: 2013-10-22

    Application number: US13709649

    Filing date: 2012-12-10

    Applicant: Apple Inc.

    CPC classification number: G06F12/0891 G06F12/0804

    Abstract: In an embodiment, a combining write buffer is configured to maintain one or more flush metrics to determine when to transmit write operations from buffer entries. The combining write buffer may be configured to dynamically modify the flush metrics in response to activity in the write buffer, modifying the conditions under which write operations are transmitted from the write buffer to the next lower level of memory. For example, in one implementation, the flush metrics may include categorizing write buffer entries as “collapsed.” A collapsed write buffer entry, and the collapsed write operations therein, may include at least one write operation that has overwritten data that was written by a previous write operation in the buffer entry. In another implementation, the combining write buffer may maintain the threshold of buffer fullness as a flush metric and may adjust it over time based on the actual buffer fullness.
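The two flush metrics mentioned in the abstract (collapsed entries, and a fullness threshold adjusted over time) can be sketched together in a small model. The capacity, thresholds, and the specific adjustment policy here are invented for the example; the patent describes the metrics, not these numbers:

```python
# Sketch of a combining write buffer: entries whose data is overwritten
# are marked "collapsed", and the fullness threshold that triggers a
# flush is adjusted dynamically as the buffer is used.

class CombiningWriteBuffer:
    def __init__(self, capacity=8, flush_threshold=4):
        self.capacity = capacity
        self.flush_threshold = flush_threshold  # dynamically adjusted
        self.entries = {}       # addr -> data
        self.collapsed = set()  # entries where a later write overwrote earlier data

    def write(self, addr, data):
        if addr in self.entries:
            self.collapsed.add(addr)  # overwrite: categorize entry as collapsed
        self.entries[addr] = data
        return self.maybe_flush()

    def maybe_flush(self):
        # Transmit buffered writes once fullness crosses the threshold.
        if len(self.entries) >= self.flush_threshold:
            drained = list(self.entries.items())
            self.entries.clear()
            self.collapsed.clear()
            # Example adjustment policy (invented): filling quickly
            # suggests more combining is possible, so raise the
            # threshold toward capacity.
            self.flush_threshold = min(self.capacity, self.flush_threshold + 1)
            return drained
        return None
```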


    Scalable Cache Coherency Protocol
    16.
    Published application

    Publication number: US20240273024A1

    Publication date: 2024-08-15

    Application number: US18582333

    Filing date: 2024-02-20

    Applicant: Apple Inc.

    CPC classification number: G06F12/0815 G06F12/0831 G06F2212/1032

    Abstract: A scalable cache coherency protocol for system including a plurality of coherent agents coupled to one or more memory controllers is described. The memory controller may implement a precise directory for cache blocks from the memory to which the memory controller is coupled. Multiple requests to a cache block may be outstanding, and snoops and completions for requests may include an expected cache state at the receiving agent, as indicated by a directory in the memory controller when the request was processed, to allow the receiving agent to detect race conditions. In an embodiment, the cache states may include a primary shared and a secondary shared state. The primary shared state may apply to a coherent agent that bears responsibility for transmitting a copy of the cache block to a requesting agent. In an embodiment, at least two types of snoops may be supported: snoop forward and snoop back.
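The race-detection idea in this abstract (snoops carry the cache state the directory expected the receiving agent to hold) can be sketched briefly. The state names follow the abstract's primary/secondary shared distinction; the class shapes and method names are invented for illustration:

```python
# Sketch of expected-state race detection: a snoop carries the state the
# memory controller's directory believed the receiving agent held when
# the request was processed. A mismatch at the receiver indicates that
# another transaction raced ahead of this one.

from enum import Enum

class State(Enum):
    INVALID = 0
    SECONDARY_SHARED = 1
    PRIMARY_SHARED = 2   # bears responsibility for forwarding the block
    EXCLUSIVE = 3

class Agent:
    def __init__(self, state=State.INVALID):
        self.state = state

    def handle_snoop(self, expected: State) -> bool:
        # Compare the directory's expected state against the state this
        # agent actually holds; return True when a race is detected.
        return self.state != expected
```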

    Scalable cache coherency protocol
    17.
    Granted patent

    Publication number: US11868258B2

    Publication date: 2024-01-09

    Application number: US18160575

    Filing date: 2023-01-27

    Applicant: Apple Inc.

    CPC classification number: G06F12/0815 G06F12/0831 G06F2212/1032

    Abstract: A scalable cache coherency protocol for system including a plurality of coherent agents coupled to one or more memory controllers is described. The memory controller may implement a precise directory for cache blocks from the memory to which the memory controller is coupled. Multiple requests to a cache block may be outstanding, and snoops and completions for requests may include an expected cache state at the receiving agent, as indicated by a directory in the memory controller when the request was processed, to allow the receiving agent to detect race conditions. In an embodiment, the cache states may include a primary shared and a secondary shared state. The primary shared state may apply to a coherent agent that bears responsibility for transmitting a copy of the cache block to a requesting agent. In an embodiment, at least two types of snoops may be supported: snoop forward and snoop back.

    Scalable Cache Coherency Protocol
    18.
    Patent application

    Publication number: US20230083397A1

    Publication date: 2023-03-16

    Application number: US18058105

    Filing date: 2022-11-22

    Applicant: Apple Inc.

    Abstract: A scalable cache coherency protocol for system including a plurality of coherent agents coupled to one or more memory controllers is described. The memory controller may implement a precise directory for cache blocks from the memory to which the memory controller is coupled. Multiple requests to a cache block may be outstanding, and snoops and completions for requests may include an expected cache state at the receiving agent, as indicated by a directory in the memory controller when the request was processed, to allow the receiving agent to detect race conditions. In an embodiment, the cache states may include a primary shared and a secondary shared state. The primary shared state may apply to a coherent agent that bears responsibility for transmitting a copy of the cache block to a requesting agent. In an embodiment, at least two types of snoops may be supported: snoop forward and snoop back.

    Scalable cache coherency protocol
    19.
    Granted patent

    Publication number: US11544193B2

    Publication date: 2023-01-03

    Application number: US17315725

    Filing date: 2021-05-10

    Applicant: Apple Inc.

    Abstract: A scalable cache coherency protocol for system including a plurality of coherent agents coupled to one or more memory controllers is described. The memory controller may implement a precise directory for cache blocks from the memory to which the memory controller is coupled. Multiple requests to a cache block may be outstanding, and snoops and completions for requests may include an expected cache state at the receiving agent, as indicated by a directory in the memory controller when the request was processed, to allow the receiving agent to detect race conditions. In an embodiment, the cache states may include a primary shared and a secondary shared state. The primary shared state may apply to a coherent agent that bears responsibility for transmitting a copy of the cache block to a requesting agent. In an embodiment, at least two types of snoops may be supported: snoop forward and snoop back.
