-
公开(公告)号:US20230017473A1
公开(公告)日:2023-01-19
申请号:US17373814
申请日:2021-07-13
Applicant: APPLE INC.
Inventor: John D. Pape , Mahesh K. Reddy , Prasanna Utchani Varadharajan , Pruthivi Vuyyuru
IPC: G06F12/0802
Abstract: An apparatus includes multiple processors including respective cache memories, the cache memories configured to cache cache-entries for use by the processors. At least a processor among the processors includes cache management logic that is configured to (i) receive, from one or more of the other processors, cache-invalidation commands that request invalidation of specified cache-entries in the cache memory of the processor (ii) mark the specified cache-entries as intended for invalidation but defer actual invalidation of the specified cache-entries, and (iii) upon detecting a synchronization event associated with the cache-invalidation commands, invalidate the cache-entries that were marked as intended for invalidation.
-
公开(公告)号:US12008369B1
公开(公告)日:2024-06-11
申请号:US17652501
申请日:2022-02-25
Applicant: Apple Inc.
Inventor: John D. Pape , Skanda K. Srinivasa , Francesco Spadini , Brian T. Mokrzycki
CPC classification number: G06F9/30043 , G06F9/3001 , G06F9/30058 , G06F9/3016 , G06F9/30185 , G06F9/3838 , G06F9/3858 , G06F9/3861
Abstract: Techniques are disclosed that relate to executing fused instructions. A processor may include a decoder circuit and a load/store circuit. The decoder circuit may detect a load/store instruction to load a value from a memory and detect a non-load/store instruction that depends on the value to be loaded. The decoder circuit may fuse the load/store instruction and the non-load/store instruction such that one or more operations that the non-load/store instruction is defined to perform are to be executed within the load/store circuit. The load/store circuit may receive an indication of the fused load/store and non-load/store instructions and then execute one or more operations of the load/store instruction and the one or more operations of the non-load/store instruction using a circuit included in the load/store circuit.
-
公开(公告)号:US20220066947A1
公开(公告)日:2022-03-03
申请号:US17008457
申请日:2020-08-31
Applicant: Apple Inc.
Inventor: John D. Pape , Brian R. Mestan , Peter G. Soderquist
IPC: G06F12/1027
Abstract: Systems, apparatuses, and methods for implementing translation lookaside buffer (TLB) striping to enable efficient invalidation operations are described. TLB sizes are growing in width (more features in a given page table entry) and depth (to cover larger memory footprints). A striping scheme is proposed to enable an efficient and high performance method for performing TLB maintenance operations in the face of this growth. Accordingly, a TLB stores first attribute data in a striped manner across a plurality of arrays. The striped manner allows different entries to be searched simultaneously in response to receiving an invalidation request which identifies a particular attribute of a group to be invalidated. Upon receiving an invalidation request, the TLB generates a plurality of indices with an offset between each index and walks through the plurality of arrays by incrementing each index and simultaneously checking the first attribute data in corresponding entries.
-
公开(公告)号:US20250094174A1
公开(公告)日:2025-03-20
申请号:US18783937
申请日:2024-07-25
Applicant: Apple Inc.
Inventor: Brandon H. Dwiel , Andrew J. Beaumont-Smith , Eric J. Furbish , John D. Pape , Stephen G. Meier , Tyler J. Huberty
IPC: G06F9/38
Abstract: A prefetcher for a coprocessor is disclosed. An apparatus includes a processor and a coprocessor that are configured to execute processor and coprocessor instructions, respectively. The processor and coprocessor instructions appear together in code sequences fetched by the processor, with the coprocessor instructions being provided to the coprocessor by the processor. The apparatus further includes a coprocessor prefetcher configured to monitor a code sequence fetched by the processor and, in response to identifying a presence of coprocessor instructions in the code sequence, capture the memory addresses, generated by the processor, of operand data for coprocessor instructions. The coprocessor is further configured to issue, for a cache memory accessible to the coprocessor, prefetches for data associated with the memory addresses prior to execution of the coprocessor instructions by the coprocessor.
-
公开(公告)号:US12079140B2
公开(公告)日:2024-09-03
申请号:US18189982
申请日:2023-03-24
Applicant: Apple Inc.
Inventor: John D. Pape , Brian R. Mestan , Peter G. Soderquist
IPC: G06F12/1027 , G06F9/455
CPC classification number: G06F12/1027 , G06F9/45558 , G06F2009/45583 , G06F2212/683
Abstract: Systems, apparatuses, and methods for performing efficient translation lookaside buffer (TLB) invalidation operations for splintered pages are described. When a TLB receives an invalidation request for a specified translation context, and the invalidation request maps to an entry with a relatively large page size, the TLB does not know if there are multiple translation entries stored in the TLB for smaller splintered pages of the relatively large page. The TLB tracks whether or not splintered pages for each translation context have been installed. If a TLB invalidate (TLBI) request is received, and splintered pages have not been installed, no searches are needed for splintered pages. To refresh the sticky bits, whenever a full TLB search is performed, the TLB rescans for splintered pages for other translation contexts. If no splintered pages are found, the sticky bit can be cleared and the number of full TLBI searches is reduced.
-
公开(公告)号:US11900118B1
公开(公告)日:2024-02-13
申请号:US17817866
申请日:2022-08-05
Applicant: Apple Inc.
Inventor: John D. Pape , Francesco Spadini , Zhaoxiang Jin
CPC classification number: G06F9/3814 , G06F9/30134 , G06F9/3826 , G06F9/3838
Abstract: An apparatus includes a rescue buffer circuit, a store queue circuit, and a control circuit. The rescue buffer circuit may be configured to retain address information related to store instructions. The store queue circuit may be configured to buffer dependency information related to a particular store instruction until the particular store instruction is released to be executed. The control circuit may be configured to cause a subset of the dependency information for the particular store instruction to be written to the rescue buffer circuit. The rescue buffer circuit may be configured to retain the subset after the dependency information has been released from the store queue circuit, and to perform a subsequent load instruction corresponding to a memory location associated with the particular store instruction using the subset of the dependency information from the rescue buffer circuit.
-
公开(公告)号:US20230092898A1
公开(公告)日:2023-03-23
申请号:US17643765
申请日:2021-12-10
Applicant: Apple Inc.
Inventor: Brandon H. Dwiel , Andrew J. Beaumont-Smith , Eric J. Furbish , John D. Pape , Stephen G. Meier , Tyler J. Huberty
Abstract: A prefetcher for a coprocessor is disclosed. An apparatus includes a processor and a coprocessor that are configured to execute processor and coprocessor instructions, respectively. The processor and coprocessor instructions appear together in code sequences fetched by the processor, with the coprocessor instructions being provided to the coprocessor by the processor. The apparatus further includes a coprocessor prefetcher configured to monitor a code sequence fetched by the processor and, in response to identifying a presence of coprocessor instructions in the code sequence, capture the memory addresses, generated by the processor, of operand data for coprocessor instructions. The coprocessor is further configured to issue, for a cache memory accessible to the coprocessor, prefetches for data associated with the memory addresses prior to execution of the coprocessor instructions by the coprocessor.
-
公开(公告)号:US20240329988A1
公开(公告)日:2024-10-03
申请号:US18739070
申请日:2024-06-10
Applicant: Apple Inc.
Inventor: John D. Pape , Skanda K. Srinivasa , Francesco Spadini , Brian T. Mokrzycki
CPC classification number: G06F9/30043 , G06F9/3001 , G06F9/30058 , G06F9/3016 , G06F9/30185 , G06F9/3838 , G06F9/3858 , G06F9/3861
Abstract: Techniques are disclosed that relate to executing fused instructions. A processor may include a decoder circuit and a load/store circuit. The decoder circuit may detect a load/store instruction to load a value from a memory and detect a non-load/store instruction that depends on the value to be loaded. The decoder circuit may fuse the load/store instruction and the non-load/store instruction such that one or more operations that the non-load/store instruction is defined to perform are to be executed within the load/store circuit. The load/store circuit may receive an indication of the fused load/store and non-load/store instructions and then execute one or more operations of the load/store instruction and the one or more operations of the non-load/store instruction using a circuit included in the load/store circuit.
-
公开(公告)号:US12050918B2
公开(公告)日:2024-07-30
申请号:US18361244
申请日:2023-07-28
Applicant: Apple Inc.
Inventor: Brandon H. Dwiel , Andrew J. Beaumont-Smith , Eric J. Furbish , John D. Pape , Stephen G. Meier , Tyler J. Huberty
IPC: G06F9/38
CPC classification number: G06F9/3881 , G06F9/382 , G06F9/383 , G06F9/3877
Abstract: A prefetcher for a coprocessor is disclosed. An apparatus includes a processor and a coprocessor that are configured to execute processor and coprocessor instructions, respectively. The processor and coprocessor instructions appear together in code sequences fetched by the processor, with the coprocessor instructions being provided to the coprocessor by the processor. The apparatus further includes a coprocessor prefetcher configured to monitor a code sequence fetched by the processor and, in response to identifying a presence of coprocessor instructions in the code sequence, capture the memory addresses, generated by the processor, of operand data for coprocessor instructions. The coprocessor is further configured to issue, for a cache memory accessible to the coprocessor, prefetches for data associated with the memory addresses prior to execution of the coprocessor instructions by the coprocessor.
-
公开(公告)号:US20240095037A1
公开(公告)日:2024-03-21
申请号:US18361244
申请日:2023-07-28
Applicant: Apple Inc.
Inventor: Brandon H. Dwiel , Andrew J. Beaumont-Smith , Eric J. Furbish , John D. Pape , Stephen G. Meier , Tyler J. Huberty
IPC: G06F9/38
CPC classification number: G06F9/3881 , G06F9/382 , G06F9/383 , G06F9/3877
Abstract: A prefetcher for a coprocessor is disclosed. An apparatus includes a processor and a coprocessor that are configured to execute processor and coprocessor instructions, respectively. The processor and coprocessor instructions appear together in code sequences fetched by the processor, with the coprocessor instructions being provided to the coprocessor by the processor. The apparatus further includes a coprocessor prefetcher configured to monitor a code sequence fetched by the processor and, in response to identifying a presence of coprocessor instructions in the code sequence, capture the memory addresses, generated by the processor, of operand data for coprocessor instructions. The coprocessor is further configured to issue, for a cache memory accessible to the coprocessor, prefetches for data associated with the memory addresses prior to execution of the coprocessor instructions by the coprocessor.
-
-
-
-
-
-
-
-
-