-
Publication No.: US11347514B2
Publication date: 2022-05-31
Application No.: US16277764
Filing date: 2019-02-15
Applicant: Apple Inc.
Inventor: Deepak Limaye, Brian R. Mestan, Gideon N. Levinsky
IPC: G06F12/00, G06F9/38, G06F12/0815, G06F12/0891, G06F12/0864
Abstract: Techniques are disclosed relating to filtering access to a content-addressable memory (CAM). In some embodiments, a processor monitors for certain microarchitectural states and filters access to the CAM in states where there cannot be a match in the CAM or where matching entries will not be used even if there is a match. In some embodiments, toggle control circuitry prevents toggling of input lines when filtering CAM access, which may reduce dynamic power consumption. In some example embodiments, the CAM is used to access a load queue to validate that out-of-order execution for a set of instructions matches in-order execution, and situations where ordering should be checked are relatively rare.
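The filtering idea in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `FilteredCam`, the flag `ordering_check_needed`, and the counter `searches_performed` are hypothetical names, and the single boolean stands in for whatever microarchitectural states the processor actually monitors.

```python
from dataclasses import dataclass, field


@dataclass
class FilteredCam:
    """Toy CAM whose search is skipped when a match could not exist or be used."""
    entries: list = field(default_factory=list)  # stored tags
    searches_performed: int = 0                  # counts unfiltered lookups only

    def lookup(self, tag, ordering_check_needed):
        # Filter: skip the CAM when it is empty (no match possible) or when
        # the hypothetical state flag says a match would not be used.
        if not self.entries or not ordering_check_needed:
            return []  # models input lines that are not toggled
        self.searches_performed += 1
        # Associative search: compare the tag against every stored entry.
        return [i for i, stored in enumerate(self.entries) if stored == tag]
```

In this model a filtered lookup leaves `searches_performed` unchanged, which stands in for the dynamic power saved by not toggling the CAM input lines.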
-
Publication No.: US20220066947A1
Publication date: 2022-03-03
Application No.: US17008457
Filing date: 2020-08-31
Applicant: Apple Inc.
Inventor: John D. Pape, Brian R. Mestan, Peter G. Soderquist
IPC: G06F12/1027
Abstract: Systems, apparatuses, and methods for implementing translation lookaside buffer (TLB) striping to enable efficient invalidation operations are described. TLB sizes are growing in width (more features in a given page table entry) and depth (to cover larger memory footprints). A striping scheme is proposed to enable an efficient and high performance method for performing TLB maintenance operations in the face of this growth. Accordingly, a TLB stores first attribute data in a striped manner across a plurality of arrays. The striped manner allows different entries to be searched simultaneously in response to receiving an invalidation request which identifies a particular attribute of a group to be invalidated. Upon receiving an invalidation request, the TLB generates a plurality of indices with an offset between each index and walks through the plurality of arrays by incrementing each index and simultaneously checking the first attribute data in corresponding entries.
-
Publication No.: US20200320004A1
Publication date: 2020-10-08
Application No.: US16374667
Filing date: 2019-04-03
Applicant: Apple Inc.
Inventor: Brian R. Mestan
IPC: G06F12/0811, G06F12/0837, G06F12/0831
Abstract: A system and method for efficiently supporting a cache memory hierarchy potentially using a zero size cache in a level of the hierarchy. In various embodiments, logic in a lower-level cache controller or elsewhere receives a miss request from an upper-level cache controller. When the requested data is non-cacheable, the logic sends a snoop request with an address of the memory access operation to the upper-level cache controller to determine whether the requested data is in the upper-level data cache. When the snoop response indicates a miss or the requested data is cacheable, the logic retrieves the requested data from memory. When the snoop response indicates a hit, the logic retrieves the requested data from the upper-level cache. The logic completes servicing the memory access operation while preventing cache storage of the received requested data in a cache at a same level of the cache memory hierarchy as the logic.
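The decision flow in this abstract can be sketched as a single function. The names `service_miss`, `upper_cache`, and `memory` are hypothetical, the two dictionaries are toy stand-ins for the upper-level data cache and main memory, and the membership test models the snoop request and its response.

```python
def service_miss(address, cacheable, upper_cache, memory):
    """Return (data, source) without allocating at this cache level.

    upper_cache and memory are plain dicts mapping address -> data.
    """
    # Only non-cacheable requests trigger a snoop of the upper-level cache.
    if not cacheable and address in upper_cache:
        # Snoop hit: take the data from the upper-level cache.
        return upper_cache[address], "upper-level cache"
    # Snoop miss, or a cacheable request: fetch from memory.
    return memory[address], "memory"
```

The function never writes to any structure of its own, which mirrors the zero-size level: the request is serviced without storing the data at the level of the logic.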
-
Publication No.: US10725928B1
Publication date: 2020-07-28
Application No.: US16243901
Filing date: 2019-01-09
Applicant: Apple Inc.
Inventor: Brian R. Mestan, Pradeep Kanapathipillai, Joshua William Smith
IPC: G06F12/08, G06F12/0891, G06F12/1045
Abstract: A system and method for efficiently performing maintenance on a cache. In various embodiments, control logic in a cache controller or elsewhere receives an indication for invalidating a range of virtual-to-physical mappings in a given translation lookaside buffer (TLB). The logic determines a first latency to invalidate entries of the TLB based on a number of addresses in the range and a number of supported page sizes simultaneously stored in the TLB. The logic determines a second latency based on a number of entries in the TLB. If the first latency is greater, then the logic traverses through each TLB entry and invalidates TLB entries storing a virtual address within the range. If the first latency is smaller, then the logic traverses through each address in the range and invalidates TLB entries storing a virtual address within the range.
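The latency comparison described here can be sketched as follows. The cost model is an assumption, not taken from the patent: the first latency is taken as one lookup per address per supported page size, and the second as one step per TLB entry. The function name `invalidate_range` is hypothetical, the TLB is a toy list of virtual page base addresses (`None` marks an invalid entry), and `start` is assumed to be aligned to the smallest page size.

```python
def invalidate_range(tlb, start, end, page_sizes):
    """Invalidate entries whose virtual address lies in [start, end).

    Returns the name of the strategy that was chosen.
    """
    smallest = min(page_sizes)
    num_addresses = -(-(end - start) // smallest)  # ceiling division
    # Assumed cost model (see lead-in).
    first_latency = num_addresses * len(page_sizes)
    second_latency = len(tlb)

    if first_latency > second_latency:
        # Probing the range is the slower option: walk every TLB entry once.
        for i, va in enumerate(tlb):
            if va is not None and start <= va < end:
                tlb[i] = None
        return "walk-entries"

    # Probing the range is cheaper: look up each address in the range.
    for va in range(start, end, smallest):
        for i, stored in enumerate(tlb):  # stands in for an associative lookup
            if stored == va:
                tlb[i] = None
    return "walk-addresses"
```

For a short range against a large TLB the address walk wins; for a long range, or many supported page sizes, walking the entries is cheaper.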
-
Publication No.: US20200218663A1
Publication date: 2020-07-09
Application No.: US16243901
Filing date: 2019-01-09
Applicant: Apple Inc.
Inventor: Brian R. Mestan, Pradeep Kanapathipillai, Joshua William Smith
IPC: G06F12/0891, G06F12/1045
Abstract: A system and method for efficiently performing maintenance on a cache. In various embodiments, control logic in a cache controller or elsewhere receives an indication for invalidating a range of virtual-to-physical mappings in a given translation lookaside buffer (TLB). The logic determines a first latency to invalidate entries of the TLB based on a number of addresses in the range and a number of supported page sizes simultaneously stored in the TLB. The logic determines a second latency based on a number of entries in the TLB. If the first latency is greater, then the logic traverses through each TLB entry and invalidates TLB entries storing a virtual address within the range. If the first latency is smaller, then the logic traverses through each address in the range and invalidates TLB entries storing a virtual address within the range.
-
Publication No.: US12079140B2
Publication date: 2024-09-03
Application No.: US18189982
Filing date: 2023-03-24
Applicant: Apple Inc.
Inventor: John D. Pape, Brian R. Mestan, Peter G. Soderquist
IPC: G06F12/1027, G06F9/455
CPC classification number: G06F12/1027, G06F9/45558, G06F2009/45583, G06F2212/683
Abstract: Systems, apparatuses, and methods for performing efficient translation lookaside buffer (TLB) invalidation operations for splintered pages are described. When a TLB receives an invalidation request for a specified translation context, and the invalidation request maps to an entry with a relatively large page size, the TLB does not know if there are multiple translation entries stored in the TLB for smaller splintered pages of the relatively large page. The TLB tracks whether or not splintered pages for each translation context have been installed. If a TLB invalidate (TLBI) request is received, and splintered pages have not been installed, no searches are needed for splintered pages. To refresh the sticky bits, whenever a full TLB search is performed, the TLB rescans for splintered pages for other translation contexts. If no splintered pages are found, the sticky bit can be cleared and the number of full TLBI searches is reduced.
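A rough software model of the sticky-bit tracking is shown below. Everything here is a sketch: the class `SplinterTrackingTlb`, its fields, and the entry layout `(context, virtual_address, splintered)` are hypothetical, and the "fast path" list filter stands in for a single targeted lookup in hardware.

```python
class SplinterTrackingTlb:
    """Toy TLB that tracks, per translation context, whether splintered pages exist."""

    def __init__(self):
        self.entries = []       # tuples: (context, virtual_address, splintered)
        self.sticky = set()     # contexts that may hold splintered pages
        self.full_searches = 0  # counts expensive full walks

    def install(self, context, va, splintered=False):
        self.entries.append((context, va, splintered))
        if splintered:
            self.sticky.add(context)  # set the sticky bit for this context

    def invalidate(self, context, va_start, va_end):
        if context not in self.sticky:
            # No splintered pages installed: only the large-page entry can match.
            self.entries = [e for e in self.entries
                            if not (e[0] == context and e[1] == va_start)]
            return
        # Sticky bit set: search the whole TLB for smaller pages in the range.
        self.full_searches += 1
        self.entries = [e for e in self.entries
                        if not (e[0] == context and va_start <= e[1] < va_end)]
        # While walking everything anyway, refresh the sticky bits of all contexts.
        self.sticky = {c for c, _, splintered in self.entries if splintered}
```

Once a full search finds no remaining splintered pages for a context, its sticky bit is cleared and later invalidations for that context take the fast path again.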
-
Publication No.: US11720501B2
Publication date: 2023-08-08
Application No.: US17817748
Filing date: 2022-08-05
Applicant: Apple Inc.
Inventor: Brian R. Mestan, Peter G. Soderquist
IPC: G06F12/1009, G06F12/1027, G06F12/02, G06F12/0817, G06F12/128, G06F12/0811, G06F12/0802
CPC classification number: G06F12/1009, G06F12/0238, G06F12/0802, G06F12/0811, G06F12/0824, G06F12/1027, G06F12/128, G06F2212/60
Abstract: Techniques are disclosed relating to controlling cache replacement. In some embodiments, a computing system performs multiple searches of a data structure, where one or more of the searches traverse multiple links between elements of the data structure. The system may cache, in a traversal cache, traversal information that is usable by searches to skip one or more links traversed by one or more prior searches. The system may store tracking information that indicates a location in the traversal cache at which prior traversal information for a first search is stored. The system may select, based on the tracking information, an entry in the traversal cache for new traversal information generated by the first search. The selection may override a default replacement policy for the traversal cache, e.g., to select the location in the traversal cache to replace the prior traversal information with the new traversal information.
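The replacement override can be sketched with a small model. The default policy is assumed to be round-robin, which the abstract does not specify, and the names `TraversalCache`, `tracking`, and `next_victim` are hypothetical.

```python
class TraversalCache:
    """Toy traversal cache with a tracking table that overrides the default victim."""

    def __init__(self, size):
        self.slots = [None] * size  # each slot holds (search_id, traversal_info)
        self.next_victim = 0        # default replacement: round-robin (assumed)
        self.tracking = {}          # search_id -> slot holding its prior info

    def store(self, search_id, info):
        slot = self.tracking.get(search_id)
        still_ours = (slot is not None
                      and self.slots[slot] is not None
                      and self.slots[slot][0] == search_id)
        if not still_ours:
            # No usable prior entry: fall back to the default policy.
            slot = self.next_victim
            self.next_victim = (self.next_victim + 1) % len(self.slots)
        # Otherwise overwrite this search's own prior traversal information.
        self.slots[slot] = (search_id, info)
        self.tracking[search_id] = slot
        return slot
```

Overwriting a search's own stale entry keeps one search from occupying several slots as it walks deeper into the data structure.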
-
Publication No.: US11119767B1
Publication date: 2021-09-14
Application No.: US16906396
Filing date: 2020-06-19
Applicant: Apple Inc.
Inventor: Brian R. Mestan, Gideon N. Levinsky, Michael L. Karm
Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.
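One way to picture the predictor is a small saturating counter per atomic operation. The abstract does not say how the prediction is formed, so the 2-bit counter, the key used to identify an atomic, and the names `AtomicPredictor` and `may_forward` are all assumptions made for this sketch.

```python
class AtomicPredictor:
    """Toy predictor: one 2-bit saturating counter per atomic (an assumption)."""

    def __init__(self):
        self.counters = {}  # keyed by a hypothetical atomic-op identifier

    def predict_success(self, key):
        # Counters start at 2, so an unseen atomic is predicted to succeed.
        return self.counters.get(key, 2) >= 2

    def update(self, key, succeeded):
        count = self.counters.get(key, 2)
        # Saturate in the range 0..3.
        self.counters[key] = min(3, count + 1) if succeeded else max(0, count - 1)


def may_forward(predictor, key):
    """Forward store data to a younger same-address load only if success is predicted."""
    return predictor.predict_success(key)
```

After repeated failures the counter drops below the threshold, forwarding is suppressed, and the load is not exposed to a later flush caused by the failed store.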
-
Publication No.: US12229557B2
Publication date: 2025-02-18
Application No.: US18601640
Filing date: 2024-03-11
Applicant: Apple Inc.
Inventor: Brian R. Mestan, Gideon N. Levinsky, Michael L. Karm
Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.
-
Publication No.: US20240248717A1
Publication date: 2024-07-25
Application No.: US18601640
Filing date: 2024-03-11
Applicant: Apple Inc.
Inventor: Brian R. Mestan, Gideon N. Levinsky, Michael L. Karm
CPC classification number: G06F9/30043, G06F9/3004, G06F9/30087, G06F9/321, G06F9/3826, G06F9/3834, G06F9/3842, G06F9/528, G06F2209/521
Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.