ADAPTIVE CACHE RECONFIGURATION VIA CLUSTERING

Publication number: US20200293445A1

Publication date: 2020-09-17

Application number: US16355168

Filing date: 2019-03-15

    Abstract: A method of dynamic cache configuration includes determining, for a first clustering configuration, whether a current cache miss rate exceeds a miss rate threshold. The first clustering configuration includes a plurality of graphics processing unit (GPU) compute units clustered into a first plurality of compute unit clusters. The method further includes clustering, based on the current cache miss rate exceeding the miss rate threshold, the plurality of GPU compute units into a second clustering configuration having a second plurality of compute unit clusters fewer than the first plurality of compute unit clusters.
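The reconfiguration decision described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the halving policy and the specific threshold are assumptions chosen only to show the miss-rate-triggered transition to fewer, larger clusters.

```python
def reconfigure_clusters(current_clusters, miss_rate, miss_rate_threshold):
    """Return the cluster count for the next configuration.

    Assumed policy: when the cache miss rate exceeds the threshold,
    merge GPU compute units into fewer (and therefore larger,
    more cache-sharing) clusters by halving the cluster count;
    otherwise keep the current clustering configuration.
    """
    if miss_rate > miss_rate_threshold:
        return max(1, current_clusters // 2)
    return current_clusters
```

A controller would apply this check periodically, re-clustering only when the measured miss rate crosses the threshold.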

    DISTRIBUTED COHERENCE DIRECTORY SUBSYSTEM WITH EXCLUSIVE DATA REGIONS

Publication number: US20200278930A1

Publication date: 2020-09-03

Application number: US16821632

Filing date: 2020-03-17

    Abstract: A processing system includes a first set of one or more processing units including a first processing unit, a second set of one or more processing units including a second processing unit, and a memory having an address space shared by the first and second sets. The processing system further includes a distributed coherence directory subsystem having a first coherence directory to support a first subset of one or more address regions of the address space and a second coherence directory to support a second subset of one or more address regions of the address space. In some implementations, the first coherence directory is implemented in the system so as to have a lower access latency for the first set, whereas the second coherence directory is implemented in the system so as to have a lower access latency for the second set.
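The region-to-directory routing the abstract describes can be sketched as a simple lookup; the region size and the owner mapping below are illustrative assumptions, standing in for whatever assignment gives each set of processing units low latency to its own directory.

```python
def home_directory(addr, region_size, region_owner):
    """Map a physical address to the coherence directory that tracks it.

    region_owner maps a region index to a directory id. Regions accessed
    mostly by one set of processing units would be assigned to the
    directory placed near that set (assumed mapping, for illustration).
    """
    return region_owner[addr // region_size]
```

With exclusive data regions, all coherence traffic for a region goes to one directory, so lookups never have to consult both.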

    REDUCING CACHE FOOTPRINT IN CACHE COHERENCE DIRECTORY

Publication number: US20190163632A1

Publication date: 2019-05-30

Application number: US15825880

Filing date: 2017-11-29

Abstract: A method includes monitoring, at a cache coherence directory, states of cachelines stored in a cache hierarchy of a data processing system using a plurality of entries of the cache coherence directory. Each entry of the cache coherence directory is associated with a corresponding cache page of a plurality of cache pages, with each cache page representing a corresponding set of contiguous cachelines. The method further includes selectively evicting cachelines from a first cache of the cache hierarchy based on cacheline utilization densities of cache pages represented by the corresponding entries of the plurality of entries of the cache coherence directory.
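The density-based victim selection can be sketched as below. The data layout (a page-to-cached-lines map) and the density metric (cached lines divided by page capacity) are assumptions made for illustration.

```python
def victim_page(directory, lines_per_page):
    """Pick the cache page with the lowest cacheline utilization density.

    directory maps page_id -> set of cached line indices in that page.
    Evicting lines from sparsely used pages frees a whole directory
    entry while displacing few live cachelines (assumed heuristic).
    """
    return min(directory, key=lambda page: len(directory[page]) / lines_per_page)
```

The chosen page's remaining cachelines would then be evicted from the first cache, shrinking the directory's tracked footprint.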

    PREEMPTIVE CACHE MANAGEMENT POLICIES FOR PROCESSING UNITS

Publication number: US20180285264A1

Publication date: 2018-10-04

Application number: US15475435

Filing date: 2017-03-31

    CPC classification number: G06F12/0806 G06F2212/621

    Abstract: A processing system includes at least one central processing unit (CPU) core, at least one graphics processing unit (GPU) core, a main memory, and a coherence directory for maintaining cache coherence. The at least one CPU core receives a CPU cache flush command to flush cache lines stored in cache memory of the at least one CPU core prior to launching a GPU kernel. The coherence directory transfers data associated with a memory access request by the at least one GPU core from the main memory without issuing coherence probes to caches of the at least one CPU core.
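The interaction between the preemptive flush and the probe-free GPU access can be modeled with a toy directory; the class, method names, and string return values are hypothetical, chosen only to make the state transition explicit.

```python
class CoherenceDirectory:
    """Toy model: after a preemptive CPU cache flush, GPU memory
    requests can be served from main memory without coherence probes."""

    def __init__(self):
        self.cpu_dirty = set()  # cachelines possibly modified in CPU caches

    def cpu_write(self, line):
        self.cpu_dirty.add(line)

    def flush_cpu_caches(self):
        # Issued preemptively, before launching the GPU kernel.
        self.cpu_dirty.clear()

    def gpu_read(self, line):
        # Probe only if a CPU cache may still hold a dirty copy.
        return "probe_cpu_cache" if line in self.cpu_dirty else "serve_from_memory"
```

The saving is that every GPU read after the flush skips the probe path entirely.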

BATCHING MODIFIED BLOCKS TO THE SAME DRAM PAGE (Granted)

Publication number: US20160170887A1

Publication date: 2016-06-16

Application number: US14569175

Filing date: 2014-12-12

Abstract: To efficiently transfer data from a cache to a memory, it is desirable that more data corresponding to the same page in the memory be loaded in a row buffer. Writing data to a memory page that is not currently loaded in a row buffer requires closing an old page and opening a new page. Both operations consume energy and clock cycles and potentially delay more critical memory read requests. Hence it is desirable to have more than one write going to the same DRAM page to amortize the cost of opening and closing DRAM pages. A desirable approach is to batch write-backs to the same DRAM page by retaining modified blocks in the cache until a sufficient number of modified blocks belonging to the same memory page are ready for write-back.
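The batching criterion can be sketched as a grouping pass over dirty block addresses. The blocks-per-page figure and the batch threshold are assumed tunables, not values from the patent.

```python
from collections import defaultdict

def batched_writebacks(dirty_block_addrs, blocks_per_page, batch_threshold):
    """Group dirty cache blocks by DRAM page and return only the pages
    holding at least batch_threshold modified blocks, so that a single
    row activation serves several writes (threshold is an assumed knob)."""
    by_page = defaultdict(list)
    for addr in dirty_block_addrs:
        by_page[addr // blocks_per_page].append(addr)
    return {page: sorted(blocks)
            for page, blocks in by_page.items()
            if len(blocks) >= batch_threshold}
```

Blocks on pages below the threshold stay in the cache, waiting for more writes to the same page to accumulate.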


Dynamic and Adaptive Sleep State Management (Granted)

Publication number: US20150121106A1

Publication date: 2015-04-30

Application number: US14068207

Filing date: 2013-10-31

Abstract: A method for power management of a device is described herein. In one example, the method includes sampling duration characteristics for a plurality of past idle events over a predetermined interval of time and determining whether to transition a device to a powered-down state based on the sampled duration characteristics. In another example, the method includes determining whether an average idle time for a plurality of past idle events exceeds an energy break-even point threshold. If the average idle time for the plurality of past idle events exceeds the energy break-even point threshold, the device is immediately transitioned to a powered-down state upon receipt of the next idle event. If the average idle time does not exceed the energy break-even point threshold, transition of the device to the powered-down state is delayed.
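The break-even decision in the second example reduces to a single comparison; the function below is a sketch under the assumption that idle durations and the break-even point are in the same time units, with the string results standing in for the two policy outcomes.

```python
def sleep_decision(past_idle_times, break_even_time):
    """Return 'power_down_now' if the average of the sampled past idle
    durations exceeds the energy break-even point, else 'delay' the
    transition (two-way policy per the abstract; values are examples)."""
    average_idle = sum(past_idle_times) / len(past_idle_times)
    return "power_down_now" if average_idle > break_even_time else "delay"
```

The break-even point is the idle duration at which the energy saved while powered down equals the energy spent entering and leaving the sleep state, which is why shorter average idle periods favor delaying.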

