-
公开(公告)号:US12159142B1
公开(公告)日:2024-12-03
申请号:US18310919
申请日:2023-05-02
Applicant: Apple Inc.
Inventor: Yuan C. Chou , Chang Xu , Deepankar Duggal , Debasish Chandra
Abstract: Techniques are disclosed relating to predicting values for load operations. In some embodiments, front-end circuitry is configured to predict values of load operations based on multiple value tagged geometric length predictor (VTAGE) prediction tables (based on program counter information and branch history information). Training circuitry may adjust multiple VTAGE learning tables based on completed load operations. Control circuitry may pre-compute access information (e.g., an index) for a VTAGE learning table for a load based on branch history information that is available to the front-end circuitry but that is unavailable to the training circuitry, store the pre-computed access information, and provide the pre-computed access information from the first storage circuitry to the training circuitry to access the VTAGE learning table based on completion of the load. This may facilitate VTAGE training without pipelining the branch history information.
-
公开(公告)号:US20250094567A1
公开(公告)日:2025-03-20
申请号:US18510540
申请日:2023-11-15
Applicant: Apple Inc.
Inventor: John D Pape , Deepankar Duggal , Christopher M Tsay , Andrew H Lin , Corey C Stappenbeck
Abstract: In an embodiment, a processor includes hardware circuitry which may be used to authenticate instruction operands. The processor may execute instructions that perform operand authentication both speculatively and non-speculatively. During speculative execution of such instructions, the processor may execute authentication such that no differences in observable state of the processor, relative to authentication result, are detectable via a side channel. During speculative execution, a result of authentication may be deferred until speculative execution of the instruction, and additional instructions, may be completed. Upon resolution of a condition that indicates acceptance of the speculative execution, a speculative execution result may cause a processor exception and stalling of execution at the instruction to be performed.
-
公开(公告)号:US20240329990A1
公开(公告)日:2024-10-03
申请号:US18740430
申请日:2024-06-11
Applicant: Apple Inc.
Inventor: Deepankar Duggal , Kulin N Kothari , Mridul Agarwal , Chang Xu , Yanran Yang , Richard F Russo , Yuan C Chou , Douglas C Holman
CPC classification number: G06F9/30087 , G06F9/3802 , G06F9/522
Abstract: A system, e.g., a system on a chip (SOC), may include one or more processors. A processor may execute an instruction synchronization barrier (ISB) instruction to enforce an ordering constraint on instructions. To execute the ISB instruction, the processor may determine whether contexts of the processor required for execution of instructions older than the ISB instruction are consumed for the older instructions. Responsive to determining that the contexts are consumed for the older instructions, the processor may initiate fetching of an instruction younger than the ISB instruction, without waiting for the older instructions to retire.
-
公开(公告)号:US12175248B2
公开(公告)日:2024-12-24
申请号:US18305151
申请日:2023-04-21
Applicant: Apple Inc.
Inventor: Yuan C. Chou , Deepankar Duggal , Debasish Chandra , Niket K Choudhary , Richard F. Russo
Abstract: Disclosed techniques relate to re-use of speculative results from an incorrect execution path. In some embodiments, when a control transfer instruction is mispredicted, a load instruction may have been executed on the wrong path. In disclosed embodiments, result storage circuitry records information that indicates destination registers of speculatively-executed load instructions including a first load instruction. Control flow tracker circuitry may store information indicating a reconvergence point for the control transfer instruction. Re-use control circuitry may track registers written by instructions prior to the reconvergence point, determine that the first load instruction does not depend on data from any instruction between the control transfer instruction and the reconvergence point, and use, as a result of the first load instruction, a value from a recorded destination register that was written based on speculative execution of the first load, notwithstanding the misprediction of the control transfer instruction.
-
公开(公告)号:US20240354111A1
公开(公告)日:2024-10-24
申请号:US18305173
申请日:2023-04-21
Applicant: Apple Inc.
Inventor: Yuan C. Chou , Deepankar Duggal , Debasish Chandra , Niket K. Choudhary , Richard F. Russo
CPC classification number: G06F9/3842 , G06F9/3005 , G06F9/3016
Abstract: Disclosed techniques relate to re-use of speculative results from an incorrect execution path. In some embodiments, when a first control transfer instruction is mispredicted, a second control transfer instruction may have been executed on the wrong path because of the misprediction. Result storage circuitry may record information indicating a determined direction for the second control transfer instruction. Control flow tracker circuitry may store, for the first control transfer instruction, information indicating a reconvergence point. Re-use control circuitry may track registers written by instructions prior to the reconvergence point, determine, based on the tracked registers, that the second control transfer instruction does not depend on data from any instruction between the first control transfer instruction and the reconvergence point, and use the recorded determined direction for the second control transfer instruction, notwithstanding the misprediction of the first control transfer instruction.
-
公开(公告)号:US20210064376A1
公开(公告)日:2021-03-04
申请号:US16551208
申请日:2019-08-26
Applicant: Apple Inc.
Inventor: Deepankar Duggal , Conrado Blasco , Muawya M. Al-Otoom , Richard F. Russo
IPC: G06F9/38
Abstract: Systems, apparatuses, and methods for implementing a physical register last reference scheme are described. A system includes a processor with a mapper, history file, and freelist. When an entry in the mapper is updated with a new architectural register-to-physical register mapping, the processor creates a new history file entry for the given instruction that caused the update. The processor also searches the mapper to determine if the old physical register that was previously stored in the mapper entry is referenced by any other mapper entries. If there are no other mapper entries that reference this old physical register, then a last reference indicator is stored in the new history file entry. When the given instruction retires, the processor checks the last reference indicator in the history file entry to determine whether the old physical register can be returned to the freelist of available physical registers.
-
公开(公告)号:US20250036415A1
公开(公告)日:2025-01-30
申请号:US18358890
申请日:2023-07-25
Applicant: Apple Inc.
Inventor: Deepankar Duggal , Pruthivi Vuyyuru , Ian D. Kountanis
Abstract: A processor may include a conditional instruction prediction tracking circuit. During fetch of a conditional instruction from memory to an instruction cache of the processor, the conditional instruction prediction tracking circuit may predict whether the conditional instruction is biased. Responsive to a prediction that the conditional instruction is biased, the conditional instruction prediction tracking circuit may cause the conditional instruction to be executed according to the predicted bias. Sometimes the conditional prediction tracking circuit may cause the conditional instruction to be re-coded such that it may be executed as an unconditional instruction.
-
公开(公告)号:US11200062B2
公开(公告)日:2021-12-14
申请号:US16551208
申请日:2019-08-26
Applicant: Apple Inc.
Inventor: Deepankar Duggal , Conrado Blasco , Muawya M. Al-Otoom , Richard F. Russo
IPC: G06F9/38
Abstract: Systems, apparatuses, and methods for implementing a physical register last reference scheme are described. A system includes a processor with a mapper, history file, and freelist. When an entry in the mapper is updated with a new architectural register-to-physical register mapping, the processor creates a new history file entry for the given instruction that caused the update. The processor also searches the mapper to determine if the old physical register that was previously stored in the mapper entry is referenced by any other mapper entries. If there are no other mapper entries that reference this old physical register, then a last reference indicator is stored in the new history file entry. When the given instruction retires, the processor checks the last reference indicator in the history file entry to determine whether the old physical register can be returned to the freelist of available physical registers.
-
公开(公告)号:US20210173654A1
公开(公告)日:2021-06-10
申请号:US16705023
申请日:2019-12-05
Applicant: Apple Inc.
Inventor: Deepankar Duggal , Kulin N. Kothari , Conrado Blasco , Muawya M. Al-Otoom
Abstract: Systems, apparatuses, and methods for implementing zero cycle load bypass operations are described. A system includes a processor with at least a decode unit, control logic, mapper, and free list. When a load operation is detected, the control logic determines if the load operation qualifies to be converted to a zero cycle load bypass operation. Conditions for qualifying include the load operation being in the same decode group as an older store operation to the same address. Qualifying load operations are converted to zero cycle load bypass operations. A lookup of the free list is prevented for a zero cycle load bypass operation and a destination operand of the load is renamed with a same physical register identifier used for a source operand of the store. Also, the data of the store is bypassed to the load.
-
公开(公告)号:US20200272467A1
公开(公告)日:2020-08-27
申请号:US16286213
申请日:2019-02-26
Applicant: Apple Inc.
Inventor: Aditya Kesiraju , Andrew J. Beaumont-Smith , Deepankar Duggal , Ran A. Chachick
Abstract: In an embodiment, a coprocessor includes multiple processing elements arranged in a grid of one or more rows and one or more columns. A given processing element includes an arithmetic/logic unit (ALU) circuit configured to perform an ALU operation specified by an instruction executable by the coprocessor, wherein the ALU circuit is configured to produce a result. The given processing element further comprises a first memory coupled to the execute circuit. The first memory is configured to store results generated by the given processing element. The first memory includes a portion of a result memory implemented by the coprocessor, wherein locations in the result memory are specifiable as destination operands of instructions executable by the coprocessor. The portion of the result memory implemented by the first memory is the portion of the result memory that the given processing element is capable of updating.
-
-
-
-
-
-
-
-
-