-
Publication No.: US11099990B2
Publication Date: 2021-08-24
Application No.: US16545521
Filing Date: 2019-08-20
Applicant: Apple Inc.
Inventor: Gideon N. Levinsky, Brian R. Mestan, Deepak Limaye, Mridul Agarwal
IPC: G06F12/0811
Abstract: A system and method for efficiently forwarding cache misses to another level of the cache hierarchy. Logic in a cache controller receives a first non-cacheable load miss request and stores it in a miss queue. When the logic determines the target address of the first load miss request is within a target address range of an older pending second load miss request stored in the miss queue with an open merge window, the logic merges the two requests into a single merged miss request. Additional requests may be similarly merged. The logic issues the merged miss requests based on determining the merge window has closed. The logic further prevents any other load miss requests, which were not previously merged in the merged miss request before it was issued, from obtaining a copy of data from the returned fill data. Such prevention in a non-coherent memory computing system supports memory ordering.
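The merge-window behavior in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented logic: the 64-byte merge range, the 4-cycle window, and the names `MissQueue`, `add_load_miss`, `issue_ready`, and `fill` are all hypothetical.

```python
from dataclasses import dataclass, field

LINE_BYTES = 64           # assumed size of the mergeable address range
MERGE_WINDOW_CYCLES = 4   # assumed length of the merge window


@dataclass(eq=False)
class MissEntry:
    base: int                     # aligned base of the target address range
    opened_at: int                # cycle at which the merge window opened
    merged_ids: list = field(default_factory=list)
    issued: bool = False


class MissQueue:
    """Toy model of a miss queue that merges non-cacheable load misses."""

    def __init__(self):
        self.entries = []

    def add_load_miss(self, req_id, addr, now):
        base = addr - (addr % LINE_BYTES)
        for e in self.entries:
            window_open = (not e.issued
                           and now - e.opened_at < MERGE_WINDOW_CYCLES)
            if e.base == base and window_open:
                e.merged_ids.append(req_id)   # merge into the older request
                return e
        # No older request with an open window: allocate a new entry.
        e = MissEntry(base=base, opened_at=now, merged_ids=[req_id])
        self.entries.append(e)
        return e

    def issue_ready(self, now):
        """Issue every entry whose merge window has closed."""
        ready = [e for e in self.entries
                 if not e.issued and now - e.opened_at >= MERGE_WINDOW_CYCLES]
        for e in ready:
            e.issued = True
        return ready

    def fill(self, entry):
        """Fill data serves only the requests merged before issue."""
        self.entries.remove(entry)
        return list(entry.merged_ids)
```

In this model a load that arrives after the merged request has issued gets its own entry and its own fill, which corresponds to the abstract's rule that late requests do not obtain a copy of the returned fill data.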
-
Publication No.: US20210056024A1
Publication Date: 2021-02-25
Application No.: US16545521
Filing Date: 2019-08-20
Applicant: Apple Inc.
Inventor: Gideon N. Levinsky, Brian R. Mestan, Deepak Limaye, Mridul Agarwal
IPC: G06F12/0811
Abstract: A system and method for efficiently forwarding cache misses to another level of the cache hierarchy. Logic in a cache controller receives a first non-cacheable load miss request and stores it in a miss queue. When the logic determines the target address of the first load miss request is within a target address range of an older pending second load miss request stored in the miss queue with an open merge window, the logic merges the two requests into a single merged miss request. Additional requests may be similarly merged. The logic issues the merged miss requests based on determining the merge window has closed. The logic further prevents any other load miss requests, which were not previously merged in the merged miss request before it was issued, from obtaining a copy of data from the returned fill data. Such prevention in a non-coherent memory computing system supports memory ordering.
-
Publication No.: US20230236988A1
Publication Date: 2023-07-27
Application No.: US18189982
Filing Date: 2023-03-24
Applicant: Apple Inc.
Inventor: John D. Pape, Brian R. Mestan, Peter G. Soderquist
IPC: G06F12/1027, G06F9/455
CPC classification number: G06F12/1027, G06F9/45558, G06F2212/683, G06F2009/45583
Abstract: Systems, apparatuses, and methods for performing efficient translation lookaside buffer (TLB) invalidation operations for splintered pages are described. When a TLB receives an invalidation request for a specified translation context, and the invalidation request maps to an entry with a relatively large page size, the TLB does not know if there are multiple translation entries stored in the TLB for smaller splintered pages of the relatively large page. The TLB tracks whether or not splintered pages for each translation context have been installed. If a TLB invalidate (TLBI) request is received, and splintered pages have not been installed, no searches are needed for splintered pages. To refresh the sticky bits, whenever a full TLB search is performed, the TLB rescans for splintered pages for other translation contexts. If no splintered pages are found, the sticky bit can be cleared and the number of full TLBI searches is reduced.
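The per-context tracking described in this abstract can be sketched as follows. It is a simplified illustration, not the claimed hardware: the `SplinterTrackingTLB` class, its flat entry list, and the policy of recomputing every sticky bit during a full search are assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical entry layout: translation context, virtual address,
# page size in bytes, and whether the entry is a splintered page.
Entry = namedtuple("Entry", "context va size splintered")


class SplinterTrackingTLB:
    def __init__(self):
        self.entries = []
        self.sticky = set()       # contexts that may hold splintered pages
        self.full_searches = 0    # counts the expensive full TLB searches

    def install(self, context, va, size, splintered=False):
        self.entries.append(Entry(context, va, size, splintered))
        if splintered:
            self.sticky.add(context)   # set the sticky bit for this context

    def invalidate(self, context, va, size):
        if context not in self.sticky:
            # No splintered pages were installed: one exact lookup suffices.
            self.entries = [e for e in self.entries
                            if not (e.context == context
                                    and e.va == va and e.size == size)]
            return
        # Sticky bit set: search the whole TLB for pages inside the range.
        self.full_searches += 1
        self.entries = [e for e in self.entries
                        if not (e.context == context
                                and va <= e.va < va + size)]
        # Refresh: the full scan also reveals which contexts still hold
        # splintered pages, so stale sticky bits can be cleared.
        self.sticky = {e.context for e in self.entries if e.splintered}
```

The saving comes from the first branch: as long as a context has never installed a splintered page, a TLBI for a large page costs a single lookup.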
-
Publication No.: US11675710B2
Publication Date: 2023-06-13
Application No.: US17016179
Filing Date: 2020-09-09
Applicant: Apple Inc.
Inventor: John D. Pape, Brian R. Mestan, Peter G. Soderquist
IPC: G06F12/1045, G06F12/0882, G06F9/455, G06F12/1027
CPC classification number: G06F12/1063, G06F9/45558, G06F12/0882, G06F12/1027, G06F2009/45583, G06F2212/7201
Abstract: Systems, apparatuses, and methods for limiting translation lookaside buffer (TLB) searches using active page size are described. A TLB stores virtual-to-physical address translations for a plurality of different page sizes. When the TLB receives a command to invalidate a TLB entry corresponding to a specified virtual address, the TLB performs, for the plurality of different page sizes, multiple different lookups of the indices corresponding to the specified virtual address. In order to reduce the number of lookups that are performed, the TLB relies on a page size presence vector and an age matrix to determine which page sizes to search for and in which order. The page size presence vector indicates which page sizes may be stored for the specified virtual address. The age matrix stores a preferred search order with the most probable page size first and the least probable page size last.
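A presence vector combined with an age matrix can be modeled in a few lines. The sketch below makes two assumptions that the abstract does not state: recency of use stands in for "most probable", and a single global presence vector is kept rather than one selected by virtual address. The class name `PageSizeTracker` and its methods are hypothetical.

```python
class PageSizeTracker:
    """Toy presence vector plus age matrix over n page-size classes."""

    def __init__(self, n):
        self.n = n
        self.present = [False] * n
        # newer[i][j] is True when size i was used more recently than size j.
        self.newer = [[False] * n for _ in range(n)]

    def touch(self, i):
        """Record that a translation of page-size class i was installed."""
        self.present[i] = True
        for j in range(self.n):
            if j != i:
                self.newer[i][j] = True    # i is now newer than every other
                self.newer[j][i] = False   # and nothing is newer than i

    def search_order(self):
        """Page sizes worth searching, most recently used first."""
        candidates = [i for i in range(self.n) if self.present[i]]
        # A size that is newer than more of its peers ranks earlier.
        return sorted(candidates, key=lambda i: -sum(self.newer[i]))
```

Sizes whose presence bit is clear never appear in the search order, so an invalidation skips those lookups entirely.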
-
Publication No.: US11615033B2
Publication Date: 2023-03-28
Application No.: US17016229
Filing Date: 2020-09-09
Applicant: Apple Inc.
Inventor: John D. Pape, Brian R. Mestan, Peter G. Soderquist
IPC: G06F12/1027, G06F9/455
Abstract: Systems, apparatuses, and methods for performing efficient translation lookaside buffer (TLB) invalidation operations for splintered pages are described. When a TLB receives an invalidation request for a specified translation context, and the invalidation request maps to an entry with a relatively large page size, the TLB does not know if there are multiple translation entries stored in the TLB for smaller splintered pages of the relatively large page. The TLB tracks whether or not splintered pages for each translation context have been installed. If a TLB invalidate (TLBI) request is received, and splintered pages have not been installed, no searches are needed for splintered pages. To refresh the sticky bits, whenever a full TLB search is performed, the TLB rescans for splintered pages for other translation contexts. If no splintered pages are found, the sticky bit can be cleared and the number of full TLBI searches is reduced.
-
Publication No.: US20230012199A1
Publication Date: 2023-01-12
Application No.: US17817748
Filing Date: 2022-08-05
Applicant: Apple Inc.
Inventor: Brian R. Mestan, Peter G. Soderquist
IPC: G06F12/0802
Abstract: Techniques are disclosed relating to controlling cache replacement. In some embodiments, a computing system performs multiple searches of a data structure, where one or more of the searches traverse multiple links between elements of the data structure. The system may cache, in a traversal cache, traversal information that is usable by searches to skip one or more links traversed by one or more prior searches. The system may store tracking information that indicates a location in the traversal cache at which prior traversal information for a first search is stored. The system may select, based on the tracking information, an entry in the traversal cache for new traversal information generated by the first search. The selection may override a default replacement policy for the traversal cache, e.g., to select the location in the traversal cache to replace the prior traversal information with the new traversal information.
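The replacement override described here can be sketched with a small model. The round-robin default policy, the `TraversalCache` class, and the use of a per-search identifier as the tracking key are assumptions for illustration; the abstract does not specify them.

```python
class TraversalCache:
    """Toy traversal cache whose tracking info can override replacement."""

    def __init__(self, num_entries):
        self.slots = [None] * num_entries  # each slot: (search_id, info)
        self.next_victim = 0               # assumed default: round-robin
        self.location = {}                 # tracking info: search_id -> slot

    def insert(self, search_id, info):
        idx = self.location.get(search_id)
        if idx is None:
            # No prior traversal info for this search: use the default policy.
            idx = self.next_victim
            self.next_victim = (self.next_victim + 1) % len(self.slots)
            evicted = self.slots[idx]
            if evicted is not None:
                self.location.pop(evicted[0], None)
        # Otherwise the tracked slot is reused, overriding the default
        # policy so stale info for the same search is replaced in place.
        self.slots[idx] = (search_id, info)
        self.location[search_id] = idx
        return idx
```

Replacing a search's own earlier entry keeps one search from occupying several slots with partially redundant traversal information.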
-
Publication No.: US11422946B2
Publication Date: 2022-08-23
Application No.: US17008457
Filing Date: 2020-08-31
Applicant: Apple Inc.
Inventor: John D. Pape, Brian R. Mestan, Peter G. Soderquist
IPC: G06F12/10, G06F12/1027
Abstract: Systems, apparatuses, and methods for implementing translation lookaside buffer (TLB) striping to enable efficient invalidation operations are described. TLB sizes are growing in width (more features in a given page table entry) and depth (to cover larger memory footprints). A striping scheme is proposed to enable an efficient and high performance method for performing TLB maintenance operations in the face of this growth. Accordingly, a TLB stores first attribute data in a striped manner across a plurality of arrays. The striped manner allows different entries to be searched simultaneously in response to receiving an invalidation request which identifies a particular attribute of a group to be invalidated. Upon receiving an invalidation request, the TLB generates a plurality of indices with an offset between each index and walks through the plurality of arrays by incrementing each index and simultaneously checking the first attribute data in corresponding entries.
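One possible reading of the striped walk is sketched below. The layout is an assumption: array `k` is taken to hold the attribute of entries `k*rows` through `k*rows + rows - 1`, so the starting indices are offset by `rows`. The function name `striped_invalidate` and the inner loop standing in for parallel hardware comparators are likewise illustrative only.

```python
def striped_invalidate(attr_arrays, valid, target):
    """Invalidate every entry whose attribute equals target.

    attr_arrays[k][r] holds the attribute of TLB entry k * rows + r
    (assumed layout). Returns the number of walk steps taken.
    """
    num_arrays = len(attr_arrays)
    rows = len(attr_arrays[0])
    # One index per array, with a fixed offset between consecutive indices.
    indices = [k * rows for k in range(num_arrays)]
    steps = 0
    for _ in range(rows):
        # In hardware these comparisons would happen simultaneously.
        for k, idx in enumerate(indices):
            if valid[idx] and attr_arrays[k][idx - k * rows] == target:
                valid[idx] = False
        indices = [idx + 1 for idx in indices]   # advance every index
        steps += 1
    return steps
```

With four arrays of two rows each, eight entries are checked in two steps rather than eight, which is the benefit the abstract attributes to striping.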
-
Publication No.: US20220075735A1
Publication Date: 2022-03-10
Application No.: US17016179
Filing Date: 2020-09-09
Applicant: Apple Inc.
Inventor: John D. Pape, Brian R. Mestan, Peter G. Soderquist
IPC: G06F12/1045, G06F12/0882, G06F9/455
Abstract: Systems, apparatuses, and methods for limiting translation lookaside buffer (TLB) searches using active page size are described. A TLB stores virtual-to-physical address translations for a plurality of different page sizes. When the TLB receives a command to invalidate a TLB entry corresponding to a specified virtual address, the TLB performs, for the plurality of different page sizes, multiple different lookups of the indices corresponding to the specified virtual address. In order to reduce the number of lookups that are performed, the TLB relies on a page size presence vector and an age matrix to determine which page sizes to search for and in which order. The page size presence vector indicates which page sizes may be stored for the specified virtual address. The age matrix stores a preferred search order with the most probable page size first and the least probable page size last.
-
Publication No.: US20220075734A1
Publication Date: 2022-03-10
Application No.: US17016229
Filing Date: 2020-09-09
Applicant: Apple Inc.
Inventor: John D. Pape, Brian R. Mestan, Peter G. Soderquist
IPC: G06F12/1027, G06F9/455
Abstract: Systems, apparatuses, and methods for performing efficient translation lookaside buffer (TLB) invalidation operations for splintered pages are described. When a TLB receives an invalidation request for a specified translation context, and the invalidation request maps to an entry with a relatively large page size, the TLB does not know if there are multiple translation entries stored in the TLB for smaller splintered pages of the relatively large page. The TLB tracks whether or not splintered pages for each translation context have been installed. If a TLB invalidate (TLBI) request is received, and splintered pages have not been installed, no searches are needed for splintered pages. To refresh the sticky bits, whenever a full TLB search is performed, the TLB rescans for splintered pages for other translation contexts. If no splintered pages are found, the sticky bit can be cleared and the number of full TLBI searches is reduced.
-
Publication No.: US11928467B2
Publication Date: 2024-03-12
Application No.: US17473076
Filing Date: 2021-09-13
Applicant: Apple Inc.
Inventor: Brian R. Mestan, Gideon N. Levinsky, Michael L. Karm
CPC classification number: G06F9/30043, G06F9/3004, G06F9/30087, G06F9/321, G06F9/3826, G06F9/3834, G06F9/3842, G06F9/528, G06F2209/521
Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.
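The forwarding decision in this abstract can be illustrated with a saturating-counter predictor. This is a sketch under assumptions: indexing by the atomic operation's program counter, the 2-bit counter width, and the names `AtomicPredictor` and `may_forward` are hypothetical and not taken from the patent.

```python
class AtomicPredictor:
    """Toy predictor of whether an atomic operation's store will succeed."""

    def __init__(self, bits=2):
        self.max = (1 << bits) - 1
        self.counters = {}   # keyed by the atomic's PC (assumption)

    def predict_success(self, pc):
        # Unseen atomics start at the maximum, i.e. predicted to succeed.
        return self.counters.get(pc, self.max) > self.max // 2

    def update(self, pc, succeeded):
        c = self.counters.get(pc, self.max)
        if succeeded:
            self.counters[pc] = min(c + 1, self.max)
        else:
            self.counters[pc] = max(c - 1, 0)


def may_forward(predictor, atomic_pc, atomic_addr, load_addr):
    """Forward store data only to a same-address load and only when the
    atomic is predicted to complete successfully."""
    return atomic_addr == load_addr and predictor.predict_success(atomic_pc)
```

After an atomic has failed repeatedly, the counter falls below the threshold and forwarding stops, so a younger load no longer consumes store data that would later force it to be flushed.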