Zero latency prefetching in caches
    31.
    Granted Patent

    Publication Number: US10929296B2

    Publication Date: 2021-02-23

    Application Number: US15730874

    Application Date: 2017-10-12

    Abstract: This invention involves a cache system in a digital data processing apparatus including: a central processing unit core; a level one instruction cache; and a level two cache. The cache lines in the level two cache are twice the size of the cache lines in the level one instruction cache. The central processing unit core requests additional program instructions when needed via a request address. Upon a miss in the level one instruction cache that causes a hit in the upper half of a level two cache line, the level two cache supplies the upper half of the level two cache line to the level one instruction cache. On a following level two cache memory cycle, the level two cache supplies the lower half of the cache line to the level one instruction cache. This cache technique thus prefetches the lower half of the level two cache line using fewer resources than an ordinary prefetch.
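
    A minimal C sketch of this half-line supply sequence, assuming 64-byte L1I lines (hence 128-byte L2 lines) and treating the abstract's "upper half" as the lower-addressed half of the L2 line; the names, sizes, and fill structure here are illustrative assumptions, not taken from the patent:

        #include <stdbool.h>
        #include <stdint.h>

        #define L1I_LINE_BYTES 64u                    /* assumed L1I line size */
        #define L2_LINE_BYTES  (2u * L1I_LINE_BYTES)  /* L2 lines are twice as wide */

        /* One half-line transfer from the L2 to the L1I cache. */
        struct l1i_fill {
            uint64_t addr;    /* base address of the half-line supplied */
            bool     demand;  /* true: the half the CPU asked for; false: prefetch */
        };

        /* Service an L1I miss that hits in L2. When the request falls in the
         * upper half of the L2 line, supply that half immediately, then send
         * the lower half on the following L2 memory cycle, so the second half
         * arrives without a separate prefetch request competing for resources.
         * Returns the number of fills produced. */
        int l2_service_l1i_miss(uint64_t req_addr, struct l1i_fill out[2])
        {
            uint64_t l2_base    = req_addr & ~(uint64_t)(L2_LINE_BYTES - 1);
            bool     upper_half = (req_addr & L1I_LINE_BYTES) == 0;

            out[0].addr   = l2_base + (upper_half ? 0 : L1I_LINE_BYTES);
            out[0].demand = true;
            if (upper_half) {
                /* The lower half rides along on the next L2 cycle. */
                out[1].addr   = l2_base + L1I_LINE_BYTES;
                out[1].demand = false;
                return 2;
            }
            return 1;
        }

    The point of the second fill is that it reuses the tag lookup and line read already performed for the demand access, which is why it costs less than an ordinary prefetch.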

    STATIC POWER REDUCTION IN CACHES USING DETERMINISTIC NAPS

    Publication Number: US20200348747A1

    Publication Date: 2020-11-05

    Application Number: US16933407

    Application Date: 2020-07-20

    Abstract: Disclosed embodiments relate to a dNap architecture that accurately transitions cache lines to the full power state before they are accessed. This ensures that there are no additional delays due to waking up drowsy lines. Only cache lines that are determined by the DMC to be accessed in the immediate future are fully powered, while others are put in drowsy mode. As a result, leakage power is significantly reduced with no cache performance degradation and minimal hardware overhead, especially at higher associativities. Up to 92% static/leakage power savings are accomplished with minimal hardware overhead and no performance tradeoff.
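
    As a rough illustration, the nap control can be thought of as a per-line power bit driven by a predictor. In the C sketch below, dmc_next_lines() stands in for the DMC's determination of imminent accesses; it, the line count, and the state encoding are assumptions for the sketch, not interfaces from the patent:

        #include <stddef.h>
        #include <stdint.h>

        #define NUM_LINES 256u  /* assumed number of cache lines */

        enum power_state { DROWSY, FULL };

        struct cache_line {
            uint64_t         tag;
            enum power_state pstate;  /* DROWSY retains state at low leakage */
        };

        static struct cache_line lines[NUM_LINES];

        /* Assumed hook: the DMC writes the indices of lines it has determined
         * will be accessed in the immediate future and returns how many. */
        extern size_t dmc_next_lines(size_t out[], size_t max);

        /* One nap-management cycle: only lines about to be accessed are fully
         * powered; all others nap. Because lines are woken before the access
         * arrives, the access itself sees no wake-up delay. */
        void dnap_cycle(void)
        {
            size_t soon[NUM_LINES];
            size_t n = dmc_next_lines(soon, NUM_LINES);

            for (size_t i = 0; i < NUM_LINES; i++)
                lines[i].pstate = DROWSY;
            for (size_t i = 0; i < n; i++)
                lines[soon[i]].pstate = FULL;  /* deterministic pre-wake */
        }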

    ZERO LATENCY PREFETCHING IN CACHES
    34.
    Patent Application

    Publication Number: US20190114263A1

    Publication Date: 2019-04-18

    Application Number: US15730874

    Application Date: 2017-10-12

    Abstract: This invention involves a cache system in a digital data processing apparatus including: a central processing unit core; a level one instruction cache; and a level two cache. The cache lines in the level two cache are twice the size of the cache lines in the level one instruction cache. The central processing unit core requests additional program instructions when needed via a request address. Upon a miss in the level one instruction cache that causes a hit in the upper half of a level two cache line, the level two cache supplies the upper half of the level two cache line to the level one instruction cache. On a following level two cache memory cycle, the level two cache supplies the lower half of the cache line to the level one instruction cache. This cache technique thus prefetches the lower half of the level two cache line using fewer resources than an ordinary prefetch.

    Hiding page translation miss latency in program memory controller by selective page miss translation prefetch
    37.
    Granted Patent
    Status: In force

    Publication Number: US09514059B2

    Publication Date: 2016-12-06

    Application Number: US14579654

    Application Date: 2014-12-22

    Abstract: This invention hides the page miss translation latency for program fetches. In this invention, whenever an access is requested by the CPU, the L1I cache controller performs an a priori lookup of whether the virtual address plus the fetch packet count of expected program fetches crosses a page boundary. If the access crosses a page boundary, the L1I cache controller requests a second page translation along with the first page. This pipelines requests to the μTLB without waiting for the L1I cache controller to begin processing the second page requests. This becomes a deterministic prefetch of the second page translation request. The translation information for the second page is stored locally in the L1I cache controller and used when the access crosses the page boundary.
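
    A minimal C sketch of the a priori boundary check, assuming a 4 KB page and a 32-byte fetch packet; utlb_request() is an assumed stand-in for issuing a translation request to the μTLB:

        #include <stdint.h>

        #define PAGE_SIZE   4096u  /* assumed page size */
        #define FETCH_BYTES   32u  /* assumed fetch-packet size */

        /* Assumed hook: issue a translation request for the page holding vaddr. */
        extern void utlb_request(uint64_t vaddr);

        /* On each CPU fetch request, check up front whether the requested
         * address plus the expected run of fetch packets (assumed >= 1)
         * crosses a page boundary; if so, pipeline the second page's
         * translation request together with the first, rather than
         * discovering the miss when the fetch stream reaches the boundary. */
        void l1i_fetch_request(uint64_t vaddr, unsigned fetch_packets)
        {
            uint64_t last_byte = vaddr + (uint64_t)fetch_packets * FETCH_BYTES - 1;

            utlb_request(vaddr);  /* translation for the first page */
            if (last_byte / PAGE_SIZE != vaddr / PAGE_SIZE) {
                /* Deterministic prefetch of the second page's translation. */
                utlb_request((vaddr / PAGE_SIZE + 1) * PAGE_SIZE);
            }
        }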

    HIDING PAGE TRANSLATION MISS LATENCY IN PROGRAM MEMORY CONTROLLER BY NEXT PAGE PREFETCH ON CROSSING PAGE BOUNDARY
    39.
    Patent Application
    Status: In force

    Publication Number: US20160179699A1

    Publication Date: 2016-06-23

    Application Number: US14581487

    Application Date: 2014-12-23

    Abstract: This invention hides the page miss translation latency for program fetches. In this invention, whenever the CPU requests an access that crosses a memory page boundary, the L1I cache controller requests the next page translation along with the current page. This pipelines requests to the μTLB without waiting for the L1I cache controller to begin processing the next page requests. This becomes a deterministic prefetch of the second page translation request. The translation information for the second page is stored locally in the L1I cache controller and used when the access crosses the next page boundary.
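
    The trigger here differs from the preceding entry: instead of an up-front fetch-packet-count check, the next page translation is requested when an access itself spans the boundary. A short C sketch under the same assumed page size and utlb_request() hook:

        #include <stdint.h>

        #define PAGE_SIZE 4096u  /* assumed page size */

        extern void utlb_request(uint64_t vaddr);  /* assumed μTLB hook */

        /* When a single access (assumed bytes >= 1) spans a page boundary,
         * request the next page's translation along with the current page's,
         * hiding the next page's translation miss behind the in-flight
         * request. */
        void l1i_access(uint64_t vaddr, uint32_t bytes)
        {
            utlb_request(vaddr);  /* current page */
            if ((vaddr + bytes - 1) / PAGE_SIZE != vaddr / PAGE_SIZE)
                utlb_request((vaddr / PAGE_SIZE + 1) * PAGE_SIZE);  /* next page */
        }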

    Static Power Reduction in Caches Using Deterministic Naps
    40.
    Patent Application
    Status: Pending (published)

    Publication Number: US20150310902A1

    Publication Date: 2015-10-29

    Application Number: US14694285

    Application Date: 2015-04-23

    Abstract: The dNap architecture is able to accurately transition cache lines to the full power state before they are accessed. This ensures that there are no additional delays due to waking up drowsy lines. Only cache lines that are determined by the DMC to be accessed in the immediate future are fully powered, while others are put in drowsy mode. As a result, leakage power is significantly reduced with no cache performance degradation and minimal hardware overhead, especially at higher associativities. Up to 92% static/leakage power savings are accomplished with minimal hardware overhead and no performance tradeoff.
