Trace Cache Techniques Based on Biased Control Transfer Instructions

    Publication No.: US20250021332A1

    Publication date: 2025-01-16

    Application No.: US18352309

    Filing date: 2023-07-14

    Applicant: Apple Inc.

    Abstract: Disclosed techniques relate to trace cache circuitry configured to identify and cache traces that satisfy certain criteria. Prediction circuitry may track directions of executed control transfer instructions, including a first category of control transfer instructions that meet a first threshold bias level toward a given direction (which may be referred to as “stable”) and a second category of control transfer instructions that do not meet the first threshold bias level (which may be referred to as “unstable”). Trace cache circuitry may identify traces of instructions that satisfy a set of criteria, including: only control transfer instructions of the first category are allowed as internal control transfer instructions and a control transfer instruction in the second category is allowed only at an end of a given trace. Disclosed techniques may advantageously provide performance and power advantages of trace caching with reduced complexity, relative to certain traditional trace caches.
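
    Illustrative sketch: the trace selection criteria in the abstract can be modeled in a few lines of Python. This is only a software approximation, not the claimed circuitry. The names `BIAS_THRESHOLD`, `MIN_SAMPLES`, `BranchStats`, and `build_trace`, as well as the numeric values, are assumptions chosen for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass

# Hypothetical parameters; the abstract gives no concrete values.
BIAS_THRESHOLD = 0.95  # fraction of outcomes in one direction needed to be "stable"
MIN_SAMPLES = 8        # assumed warm-up count before a branch can be called stable


@dataclass
class BranchStats:
    """Direction counts for one control transfer instruction."""
    taken: int = 0
    not_taken: int = 0

    def record(self, was_taken: bool) -> None:
        if was_taken:
            self.taken += 1
        else:
            self.not_taken += 1

    def is_stable(self) -> bool:
        total = self.taken + self.not_taken
        if total < MIN_SAMPLES:
            return False
        return max(self.taken, self.not_taken) / total >= BIAS_THRESHOLD


def build_trace(executed, stats, max_len=16):
    """Collect one trace from `executed`, a list of (pc, is_branch) pairs.

    Stable branches may sit inside the trace; an unstable (or unseen)
    branch is kept as the last instruction and ends the trace.
    """
    trace = []
    for pc, is_branch in executed:
        trace.append(pc)
        if len(trace) >= max_len:
            break
        if is_branch and not stats.get(pc, BranchStats()).is_stable():
            break
    return trace
```

    With a stable branch at 0x10 and an unstable branch at 0x20, a trace starting at 0x0 would run through 0x10 and stop at 0x20, which becomes its final instruction.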

    Multi-table Signature Prefetch
    Patent application

    Publication No.: US20230023860A1

    Publication date: 2023-01-26

    Application No.: US17382123

    Filing date: 2021-07-21

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to signature-based instruction prefetching. In some embodiments, processor pipeline circuitry executes a computer program that includes control transfer instructions, such that the execution follows a taken path through the computer program. First signature prefetch table circuitry indicates prefetch addresses for signatures generated using a first signature generation technique and second signature prefetch table circuitry indicates prefetch addresses for signatures generated using a second, different signature generation technique. Signature prefetch circuitry, in response to a prefetch training event, determines a first signature according to the first technique and a second signature according to the second technique and selects one but not both of the first and second signature prefetch tables to train using the first signature or the second signature.
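
    Illustrative sketch: the following Python model shows one way two signature tables could be trained so that each training event updates exactly one of them. The abstract does not say how signatures are formed or how the table is chosen, so everything here is an assumption: signatures are tuples of recent taken-branch targets (short and long history), and the second table is trained only when the first table already maps the short signature to a different address. The class and method names are hypothetical.

```python
from collections import deque


class MultiTableSignaturePrefetcher:
    def __init__(self, short_len=2, long_len=4):
        self.history = deque(maxlen=long_len)  # recent taken-branch targets
        self.short_len = short_len
        self.table1 = {}  # short-history signature -> prefetch address
        self.table2 = {}  # long-history signature -> prefetch address

    def record_taken_branch(self, target):
        self.history.append(target)

    def _sig1(self):
        # First technique (assumed): the last `short_len` targets.
        return tuple(self.history)[-self.short_len:]

    def _sig2(self):
        # Second technique (assumed): the full recorded history.
        return tuple(self.history)

    def train(self, miss_addr):
        """Handle a training event; return which table (1 or 2) was trained."""
        s1, s2 = self._sig1(), self._sig2()
        if self.table1.get(s1, miss_addr) == miss_addr:
            self.table1[s1] = miss_addr
            return 1
        # The short signature is ambiguous, so train only the second table.
        self.table2[s2] = miss_addr
        return 2

    def predict(self):
        """Return prefetch addresses indicated by either table."""
        lookups = ((self.table1, self._sig1()), (self.table2, self._sig2()))
        return [table[sig] for table, sig in lookups if sig in table]
```

    In this sketch the longer signature acts as a tie-breaker for paths that share their most recent branches but diverge earlier. Other selection policies would satisfy the abstract equally well.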

    Sequential prefetch boost
    Granted patent

    Publication No.: US10346309B1

    Publication date: 2019-07-09

    Application No.: US15497338

    Filing date: 2017-04-26

    Applicant: Apple Inc.

    Abstract: In an embodiment, a prefetch circuit may implement prefetch “boosting” to reduce the cost of cold (compulsory) misses and thus potentially improve performance. When a demand miss occurs, the prefetch circuit may generate one or more prefetch requests. The prefetch circuit may monitor the progress of the demand miss (and optionally the previously-generated prefetch requests as well) through the cache hierarchy to memory. At various progress points, if the demand miss remains a miss, additional prefetch requests may be launched. For example, if the demand miss accesses a lower level cache and misses, additional prefetch requests may be launched because the latency avoided in prefetching the additional cache blocks is higher, which may override the potential that the additional cache blocks are incorrectly prefetched.
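
    Illustrative sketch: the boosting idea can be shown as a small Python function that launches more sequential prefetches each time the demand miss is found to miss at another cache level. The function name, the level names, and the `per_stage` count are assumptions for illustration; the abstract does not fix how many requests are launched at each progress point.

```python
def boosted_prefetches(miss_block, hit_level, levels=("L1", "L2", "L3"), per_stage=2):
    """Return [(level, [prefetched blocks])] for each level the demand misses in.

    miss_block: block number of the demand miss
    hit_level:  name of the level where the demand finally hits,
                or None if it goes all the way to memory
    """
    launched = []
    next_block = miss_block + 1  # sequential prefetch: start at the next block
    for level in levels:
        if level == hit_level:
            break  # the demand hit here, so no further boosting
        blocks = list(range(next_block, next_block + per_stage))
        launched.append((level, blocks))
        next_block += per_stage
    return launched
```

    For a miss on block 100 that hits in L3, this model launches blocks 101 and 102 after the L1 miss and blocks 103 and 104 after the L2 miss. A miss that goes to memory triggers one more batch.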

    Conditional Instructions Distribution and Execution

    Publication No.: US20230244495A1

    Publication date: 2023-08-03

    Application No.: US17590722

    Filing date: 2022-02-01

    Applicant: Apple Inc.

    Abstract: A processor may include an instruction distribution circuit and a plurality of execution pipelines. The instruction distribution circuit may distribute a conditional instruction to a first execution pipeline for execution when the conditional instruction is associated with a prediction of a high confidence level, or to a second execution pipeline for execution when the conditional instruction is associated with a prediction of a low confidence level. The second execution pipeline, not the first execution pipeline, may directly instruct the processor to obtain an instruction from a target address for execution, when the conditional instruction is mispredicted. Thus, when the conditional instruction is distributed to the first execution pipeline for execution and determined to be mispredicted, the first execution pipeline may cause the conditional instruction to be re-executed in the second execution pipeline to cause the instruction from the correct target address to be obtained for execution.
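
    Illustrative sketch: the distribution and recovery flow described in the abstract can be written as a short Python function that returns the sequence of pipeline events. The function name and the event strings are hypothetical and serve only to make the control flow explicit.

```python
def execute_conditional(high_confidence: bool, mispredicted: bool) -> list:
    """Return the ordered pipeline events for one conditional instruction."""
    events = []
    if high_confidence:
        # High-confidence predictions go to the first pipeline,
        # which cannot redirect instruction fetch on its own.
        events.append("first: execute")
        if mispredicted:
            events.append("first: request re-execution in second")
            events.append("second: execute")
            events.append("second: redirect fetch to target")
    else:
        # Low-confidence predictions go straight to the second pipeline.
        events.append("second: execute")
        if mispredicted:
            events.append("second: redirect fetch to target")
    return events
```

    The sketch highlights the trade-off: a correctly predicted high-confidence instruction needs only the first pipeline, while a mispredicted one pays for a second execution before fetch is redirected.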

    NEXT FETCH PREDICTOR RETURN ADDRESS STACK
    Patent application
    Legal status: in force

    Publication No.: US20140344558A1

    Publication date: 2014-11-20

    Application No.: US13893898

    Filing date: 2013-05-14

    Applicant: Apple Inc.

    CPC classification number: G06F9/3806 G06F9/30054 G06F9/382 G06F9/3848

    Abstract: A system and method for efficient branch prediction. A processor includes a next fetch predictor to generate a fast branch prediction for branch instructions at an early pipeline stage. The processor also includes a main return address stack (RAS) at a later pipeline stage for predicting the target of return instructions. When a return instruction is encountered, the prediction from the next fetch predictor is replaced by the top of the main RAS. If there are any recent call or return instructions in flight toward the main RAS, then a separate prediction is generated by a mini-RAS.
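
    Illustrative sketch: the abstract does not describe how the mini-RAS forms its prediction, so the Python model below makes a loud assumption: the mini-RAS replays the call and return instructions that are still in flight on top of the current main RAS contents. The class `ReturnPredictor` and its methods are hypothetical names.

```python
class ReturnPredictor:
    def __init__(self):
        self.main_ras = []   # updated at a later pipeline stage
        self.in_flight = []  # ("call", return_addr) or ("ret", None), not yet in main_ras

    def fetch_call(self, return_addr):
        self.in_flight.append(("call", return_addr))

    def fetch_return(self):
        prediction = self.predict_return()
        self.in_flight.append(("ret", None))
        return prediction

    def predict_return(self):
        if not self.in_flight:
            # Nothing in flight: the top of the main RAS is used.
            return self.main_ras[-1] if self.main_ras else None
        # Assumed mini-RAS behavior: replay in-flight operations on a copy.
        view = list(self.main_ras)
        for kind, addr in self.in_flight:
            if kind == "call":
                view.append(addr)
            elif view:
                view.pop()
        return view[-1] if view else None

    def drain_one(self):
        """The oldest in-flight instruction reaches the main RAS stage."""
        kind, addr = self.in_flight.pop(0)
        if kind == "call":
            self.main_ras.append(addr)
        elif self.main_ras:
            self.main_ras.pop()
```

    The point of the model is that a return fetched shortly after its call still gets the right target, even though the main RAS has not yet seen the call.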

    Multi-table signature prefetch
    Granted patent

    Publication No.: US11630670B2

    Publication date: 2023-04-18

    Application No.: US17382123

    Filing date: 2021-07-21

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to signature-based instruction prefetching. In some embodiments, processor pipeline circuitry executes a computer program that includes control transfer instructions, such that the execution follows a taken path through the computer program. First signature prefetch table circuitry indicates prefetch addresses for signatures generated using a first signature generation technique and second signature prefetch table circuitry indicates prefetch addresses for signatures generated using a second, different signature generation technique. Signature prefetch circuitry, in response to a prefetch training event, determines a first signature according to the first technique and a second signature according to the second technique and selects one but not both of the first and second signature prefetch tables to train using the first signature or the second signature.

    Systems and methods for optimizing authentication branch instructions

    Publication No.: US11468168B1

    Publication date: 2022-10-11

    Application No.: US15484439

    Filing date: 2017-04-11

    Applicant: Apple Inc.

    Abstract: Systems, apparatuses, and methods for efficient handling of subroutine epilogues. When an indirect control transfer instruction corresponding to a procedure return for a subroutine is identified, the return address and a signature are retrieved from one or more of a return address stack and the memory stack. An authenticator generates a signature based on at least a portion of the retrieved return address. While the signature is being generated, instruction processing speculatively continues. No instructions are permitted to commit yet. The generated signature is later compared to a copy of the signature generated earlier during the corresponding procedure call. A mismatch causes an exception.
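
    Illustrative sketch: the check-before-commit flow can be mimicked in Python with the standard `hmac` module standing in for the hardware authenticator. HMAC-SHA256, the key, and the function names are assumptions made for illustration; the abstract does not name a signing algorithm.

```python
import hashlib
import hmac

KEY = b"hypothetical-per-process-key"  # assumed secret; not from the source


def sign(return_addr: int) -> bytes:
    """Signature computed at procedure call time (HMAC is a stand-in)."""
    return hmac.new(KEY, return_addr.to_bytes(8, "little"), hashlib.sha256).digest()


def procedure_return(return_addr: int, stored_sig: bytes, speculative_work):
    """Run work speculatively, then commit only if the signature matches."""
    # Processing continues while the signature is being generated,
    # but nothing is committed yet.
    results = [op() for op in speculative_work]
    if not hmac.compare_digest(sign(return_addr), stored_sig):
        raise RuntimeError("return address authentication failed")
    return results  # commit point
```

    A return to the address that was signed at call time commits its speculative results, whereas a tampered return address raises the exception and the results are discarded.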

    Next fetch predictor return address stack
    Granted patent
    Legal status: in force

    Publication No.: US09405544B2

    Publication date: 2016-08-02

    Application No.: US13893898

    Filing date: 2013-05-14

    Applicant: Apple Inc.

    CPC classification number: G06F9/3806 G06F9/30054 G06F9/382 G06F9/3848

    Abstract: A system and method for efficient branch prediction. A processor includes a next fetch predictor to generate a fast branch prediction for branch instructions at an early pipeline stage. The processor also includes a main return address stack (RAS) at a later pipeline stage for predicting the target of return instructions. When a return instruction is encountered, the prediction from the next fetch predictor is replaced by the top of the main RAS. If there are any recent call or return instructions in flight toward the main RAS, then a separate prediction is generated by a mini-RAS.
