Managing serial miss requests for load operations in a non-coherent memory system

    Publication number: US11099990B2

    Publication date: 2021-08-24

    Application number: US16545521

    Filing date: 2019-08-20

    Applicant: Apple Inc.

    Abstract: A system and method for efficiently forwarding cache misses to another level of the cache hierarchy. Logic in a cache controller receives a first non-cacheable load miss request and stores it in a miss queue. When the logic determines the target address of the first load miss request is within a target address range of an older pending second load miss request stored in the miss queue with an open merge window, the logic merges the two requests into a single merged miss request. Additional requests may be similarly merged. The logic issues the merged miss requests based on determining the merge window has closed. The logic further prevents any other load miss requests, which were not previously merged in the merged miss request before it was issued, from obtaining a copy of data from the returned fill data. Such prevention in a non-coherent memory computing system supports memory ordering.

    MANAGING SERIAL MISS REQUESTS FOR LOAD OPERATIONS IN A NON-COHERENT MEMORY SYSTEM

    Publication number: US20210056024A1

    Publication date: 2021-02-25

    Application number: US16545521

    Filing date: 2019-08-20

    Applicant: Apple Inc.

    Abstract: A system and method for efficiently forwarding cache misses to another level of the cache hierarchy. Logic in a cache controller receives a first non-cacheable load miss request and stores it in a miss queue. When the logic determines the target address of the first load miss request is within a target address range of an older pending second load miss request stored in the miss queue with an open merge window, the logic merges the two requests into a single merged miss request. Additional requests may be similarly merged. The logic issues the merged miss requests based on determining the merge window has closed. The logic further prevents any other load miss requests, which were not previously merged in the merged miss request before it was issued, from obtaining a copy of data from the returned fill data. Such prevention in a non-coherent memory computing system supports memory ordering.
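The merge-window behavior described in the two abstracts above can be sketched as follows. This is a minimal illustrative model, not the patented implementation: the class and method names, the 64-byte fill granularity, and the explicit `issue` step are all assumptions made for the example.

```python
# Illustrative model of merge-window miss handling; all names are hypothetical.
CACHE_LINE = 64  # assumed fill granularity in bytes

class MergedMiss:
    """A pending merged miss request covering one aligned address range."""
    def __init__(self, addr):
        self.base = addr - (addr % CACHE_LINE)
        self.merged_loads = [addr]
        self.window_open = True   # merge window: open until the request issues

    def covers(self, addr):
        return self.base <= addr < self.base + CACHE_LINE

class MissQueue:
    def __init__(self):
        self.pending = []

    def handle_load_miss(self, addr):
        # Merge into an older pending miss only while its window is open.
        for m in self.pending:
            if m.window_open and m.covers(addr):
                m.merged_loads.append(addr)
                return m
        m = MergedMiss(addr)
        self.pending.append(m)
        return m

    def issue(self, m):
        # Closing the window before issue means later misses to the same
        # range cannot consume the returned fill data, which preserves
        # memory ordering in a non-coherent system.
        m.window_open = False
        return list(m.merged_loads)

q = MissQueue()
a = q.handle_load_miss(0x1000)
b = q.handle_load_miss(0x1008)   # same line, window open: merged with a
q.issue(a)
c = q.handle_load_miss(0x1010)   # same line, but window closed: new request
```

Note that `c` lands in a fresh `MergedMiss` even though its address falls inside the range already fetched for `a`; this models the abstract's rule that un-merged misses may not obtain a copy of the returned fill data.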

    Prefetch circuit for a processor with pointer optimization

    Publication number: US10402334B1

    Publication date: 2019-09-03

    Application number: US15948072

    Filing date: 2018-04-09

    Applicant: Apple Inc.

    Abstract: In an embodiment, a processor may implement an access map-pattern match (AMPM)-based prefetch circuit with features designed to improve prefetching accuracy and/or reduce power consumption. In an embodiment, the prefetch circuit may be configured to detect that pointer reads are occurring (e.g. "pointer chasing"). The prefetch circuit may be configured to increase the frequency at which prefetch requests are generated for an access map in which pointer read activity is detected, compared to the frequency at which the prefetch requests would be generated if pointer read activity is not detected. In an embodiment, the prefetch circuit may also detect access maps that are store-only, and may reduce the frequency of prefetches for store-only access maps as compared to the frequency for load-only or load/store maps.
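The frequency adjustment the abstract describes can be modeled as a simple policy function. This is only an illustrative sketch: the real AMPM prefetcher tracks access maps in hardware, and the doubling/halving factors here are assumptions, not values from the patent.

```python
# Hypothetical policy model of per-access-map prefetch degree adjustment.
def prefetch_degree(base_degree, pointer_chasing, store_only):
    """Return how many prefetch requests to generate for one access map."""
    if pointer_chasing:
        # Pointer reads detected: prefetch more eagerly to hide the
        # serialized latency of chasing a pointer chain.
        return base_degree * 2
    if store_only:
        # Store-only map: prefetched data is less likely to be consumed,
        # so back off to save power and bandwidth.
        return max(1, base_degree // 2)
    return base_degree
```

A load/store map keeps the base degree; only the two special-cased map types deviate from it.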

    CONCURRENT STORE AND LOAD OPERATIONS
    Invention application (in force)

    Publication number: US20150199272A1

    Publication date: 2015-07-16

    Application number: US14154122

    Filing date: 2014-01-13

    Applicant: Apple Inc.

    CPC classification number: G06F12/0815 G06F12/0844 G06F12/0891

    Abstract: Systems, processors, and methods for efficiently handling concurrent store and load operations within a processor. A processor comprises a load-store unit (LSU) with a banked level-one (L1) data cache. When a store operation is ready to write data to the L1 data cache, the store operation will skip the write to any banks that have a conflict with a concurrent load operation. A partial write of the store operation will be performed to those banks of the L1 data cache that do not have a conflict with a concurrent load operation. For every attempt to write the store operation, a corresponding store mask will be updated to indicate which portions of the store operation were successfully written to the L1 data cache.
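The partial-write and store-mask scheme in the abstract above can be sketched as a small model. This is an assumption-laden illustration, not the patented design: the bank count, the dict-based cache, and the function name are all invented for the example.

```python
# Toy model of partial stores against a banked L1 data cache.
NUM_BANKS = 8  # assumed bank count

def attempt_store(data_by_bank, load_banks, store_mask, cache):
    """Write the store to every bank not in conflict with a concurrent load.

    Banks already recorded in store_mask were written on an earlier attempt
    and are skipped, so a retry only touches the remaining banks.
    """
    for bank, value in data_by_bank.items():
        if bank in load_banks:
            continue                      # conflict with a concurrent load: skip
        if not store_mask[bank]:
            cache[bank] = value           # partial write to this bank
            store_mask[bank] = True       # record success in the store mask
    return store_mask

cache = {}
mask = [False] * NUM_BANKS
store = {0: 'A', 1: 'B', 2: 'C'}

# First attempt: a concurrent load occupies banks 1 and 2, so only bank 0 lands.
attempt_store(store, {1, 2}, mask, cache)
# Retry with no conflict: only the banks still clear in the mask are written.
attempt_store(store, set(), mask, cache)
```

The store mask is what makes the retry idempotent: completed banks are never rewritten, matching the abstract's per-attempt mask update.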
