-
Publication number: US11099990B2
Publication date: 2021-08-24
Application number: US16545521
Filing date: 2019-08-20
Applicant: Apple Inc.
Inventor: Gideon N. Levinsky, Brian R. Mestan, Deepak Limaye, Mridul Agarwal
IPC: G06F12/0811
Abstract: A system and method for efficiently forwarding cache misses to another level of the cache hierarchy. Logic in a cache controller receives a first non-cacheable load miss request and stores it in a miss queue. When the logic determines the target address of the first load miss request is within a target address range of an older pending second load miss request stored in the miss queue with an open merge window, the logic merges the two requests into a single merged miss request. Additional requests may be similarly merged. The logic issues the merged miss requests based on determining the merge window has closed. The logic further prevents any other load miss requests, which were not previously merged in the merged miss request before it was issued, from obtaining a copy of data from the returned fill data. Such prevention in a non-coherent memory computing system supports memory ordering.
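The abstract describes merging load misses to the same target range while a merge window is open, then issuing the merged request once the window closes. A minimal Python sketch of that queue behavior follows; the class names, the 64-byte line size, and the same-cache-line merge criterion are illustrative assumptions, not details taken from the patent.

```python
# Hypothetical sketch of the merge-window behavior described in the
# abstract. MissQueue/MissRequest and the 64-byte line are assumptions.

CACHE_LINE = 64  # assumed target-range granularity in bytes


class MissRequest:
    def __init__(self, addr):
        self.addr = addr
        self.window_open = True      # merge window starts open
        self.merged_addrs = {addr}   # addresses folded into this request


class MissQueue:
    def __init__(self):
        self.pending = []

    def enqueue(self, addr):
        # Merge into an older pending miss whose target range covers this
        # address and whose merge window is still open; otherwise allocate
        # a new miss-queue entry.
        for req in self.pending:
            same_range = req.addr // CACHE_LINE == addr // CACHE_LINE
            if same_range and req.window_open:
                req.merged_addrs.add(addr)
                return req
        req = MissRequest(addr)
        self.pending.append(req)
        return req

    def close_window_and_issue(self, req):
        # Once the merge window closes the merged request is issued;
        # later misses to the same range can no longer join it, which is
        # what preserves memory ordering per the abstract.
        req.window_open = False
        self.pending.remove(req)
        return sorted(req.merged_addrs)
```

A later miss to the same line after `close_window_and_issue` allocates a fresh entry rather than piggybacking on the issued request's fill data.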
-
Publication number: US20210056024A1
Publication date: 2021-02-25
Application number: US16545521
Filing date: 2019-08-20
Applicant: Apple Inc.
Inventor: Gideon N. Levinsky, Brian R. Mestan, Deepak Limaye, Mridul Agarwal
IPC: G06F12/0811
Abstract: A system and method for efficiently forwarding cache misses to another level of the cache hierarchy. Logic in a cache controller receives a first non-cacheable load miss request and stores it in a miss queue. When the logic determines the target address of the first load miss request is within a target address range of an older pending second load miss request stored in the miss queue with an open merge window, the logic merges the two requests into a single merged miss request. Additional requests may be similarly merged. The logic issues the merged miss requests based on determining the merge window has closed. The logic further prevents any other load miss requests, which were not previously merged in the merged miss request before it was issued, from obtaining a copy of data from the returned fill data. Such prevention in a non-coherent memory computing system supports memory ordering.
-
Publication number: US10402334B1
Publication date: 2019-09-03
Application number: US15948072
Filing date: 2018-04-09
Applicant: Apple Inc.
Inventor: Stephan G. Meier, Mridul Agarwal
IPC: G06F12/00, G06F12/0862, G06F9/30
Abstract: In an embodiment, a processor may implement an access map-pattern match (AMPM)-based prefetch circuit with features designed to improve prefetching accuracy and/or reduce power consumption. In an embodiment, the prefetch circuit may be configured to detect that pointer reads are occurring (e.g., “pointer chasing”). The prefetch circuit may be configured to increase the frequency at which prefetch requests are generated for an access map in which pointer read activity is detected, compared to the frequency at which the prefetch requests would be generated if pointer read activity is not detected. In an embodiment, the prefetch circuit may also detect access maps that are store-only, and may reduce the frequency of prefetches for store-only access maps as compared to the frequency for load-only or load/store maps.
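The frequency adjustment the abstract describes, prefetching more aggressively for pointer-chasing maps and throttling store-only maps, can be sketched as a simple degree policy. The base degree, scaling factors, and function name below are invented for illustration and are not taken from the patent.

```python
# Illustrative sketch of per-access-map prefetch throttling. The base
# degree and the 2x / halving factors are assumptions for the example.

BASE_PREFETCHES_PER_MATCH = 2


def prefetch_degree(pointer_chasing: bool, store_only: bool) -> int:
    """Return how many prefetch requests to generate when an access map
    matches a pattern, scaled by the map's observed behavior."""
    degree = BASE_PREFETCHES_PER_MATCH
    if pointer_chasing:
        # Pointer reads detected: prefetch more often so dependent
        # loads find their lines already resident.
        degree *= 2
    if store_only:
        # Store-only maps benefit less from prefetching; throttle to
        # save power and memory bandwidth.
        degree = max(1, degree // 2)
    return degree
```

A real AMPM prefetcher would derive these decisions from pattern-match state per access map; the point here is only the relative ordering: pointer-chasing maps above the baseline, store-only maps below it.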
-
Publication number: US20150199272A1
Publication date: 2015-07-16
Application number: US14154122
Filing date: 2014-01-13
Applicant: Apple Inc.
Inventor: Rajat Goel, Mridul Agarwal
IPC: G06F12/08
CPC classification number: G06F12/0815, G06F12/0844, G06F12/0891
Abstract: Systems, processors, and methods for efficiently handling concurrent store and load operations within a processor. A processor comprises a load-store unit (LSU) with a banked level-one (L1) data cache. When a store operation is ready to write data to the L1 data cache, the store operation will skip the write to any banks that have a conflict with a concurrent load operation. A partial write of the store operation will be performed to those banks of the L1 data cache that do not have a conflict with a concurrent load operation. For every attempt to write the store operation, a corresponding store mask will be updated to indicate which portions of the store operation were successfully written to the L1 data cache.
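The partial-write scheme in the abstract, skipping conflicting banks and tracking progress in a store mask, can be sketched with a bitmask per store. The bank count and helper names are assumptions for illustration, not details from the patent.

```python
# Hedged sketch of the banked partial-write idea: a store writes only the
# banks a concurrent load is not using, and a store mask records which
# portions have reached the L1 data cache. NUM_BANKS is an assumption.

NUM_BANKS = 8


def attempt_store(store_mask: int, load_busy_banks: set) -> int:
    """Write the not-yet-written portions of a store, skipping banks that
    conflict with a concurrent load. Bit i of the returned mask is set
    once bank i has been successfully written."""
    for bank in range(NUM_BANKS):
        already_written = store_mask & (1 << bank)
        if not already_written and bank not in load_busy_banks:
            store_mask |= 1 << bank  # partial write to this bank lands
    return store_mask


def store_complete(store_mask: int) -> bool:
    # The store retires only after every bank's portion has been written.
    return store_mask == (1 << NUM_BANKS) - 1
```

On each retry only the still-unwritten banks are attempted, so a store that initially conflicted with a load on two banks finishes on a later attempt once those banks are free.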