Instruction elimination through hardware-driven memoization of loop instances

    Publication number: US20240103874A1

    Publication date: 2024-03-28

    Application number: US17951859

    Filing date: 2022-09-23

    CPC classification number: G06F9/381 G06F9/30065 G06F9/325

    Abstract: Methods and apparatus for instruction elimination through hardware-driven memoization of loop instances. A hardware-based loop memoization technique learns repeating sequences of loops and transparently removes the loops' instructions from instruction sequences while making their outputs available to dependent instructions as if the loop instructions had been executed. A path-based predictor is implemented at the front end to predict these loop instances and remove their instructions from instruction sequences. A novel memoization prediction micro-operation (Uop) is inserted into the instruction sequence for instances of loops that are predicted to be memoized. The memoization prediction Uop compares the input signature (the expected set of input values for the loop) with the actual signature to distinguish correct from incorrect predictions. The learned input signature is based on all live-ins of a loop: both explicit register-based live-ins and loads from memory in the loop body that determine the code path and outputs.
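
The signature check described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the names `MemoEntry`, `LoopMemoTable`, and `run_loop_instance` are hypothetical, the path-based predictor is reduced to a dictionary keyed by loop address and path, and live-ins are modeled as a plain tuple of values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoEntry:
    # Expected live-in values (register live-ins and loaded values).
    input_signature: tuple
    # Outputs recorded for that loop instance (hypothetical register names).
    outputs: dict


class LoopMemoTable:
    """Toy stand-in for the front-end predictor (an assumption, not the patent's structure)."""

    def __init__(self):
        self._entries = {}  # keyed by (loop_pc, path)

    def learn(self, loop_pc, path, live_ins, outputs):
        self._entries[(loop_pc, path)] = MemoEntry(tuple(live_ins), dict(outputs))

    def predict(self, loop_pc, path):
        return self._entries.get((loop_pc, path))


def run_loop_instance(table, loop_pc, path, live_ins, execute_loop):
    """Return (outputs, eliminated). `eliminated` is True when the loop body was skipped."""
    entry = table.predict(loop_pc, path)
    if entry is not None:
        # Models the memoization prediction Uop: expected vs. actual signature.
        if entry.input_signature == tuple(live_ins):
            return dict(entry.outputs), True
    # No prediction, or a mismatch: execute the loop and (re)learn the instance.
    outputs = execute_loop(live_ins)
    table.learn(loop_pc, path, live_ins, outputs)
    return outputs, False


def sum_loop(live_ins):
    # Example loop body: sums 0..n-1, where n is the single live-in.
    return {"r1": sum(range(live_ins[0]))}


table = LoopMemoTable()
first = run_loop_instance(table, 0x400, "TTN", (5,), sum_loop)   # executed and learned
second = run_loop_instance(table, 0x400, "TTN", (5,), sum_loop)  # signature matches
third = run_loop_instance(table, 0x400, "TTN", (4,), sum_loop)   # signature mismatch
```

In this model the second call skips the loop body because the live-ins match the learned signature, while the third call falls back to normal execution, mirroring the correct and incorrect prediction cases the abstract distinguishes.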

    Selectively updating branch predictors for loops executed from loop buffers in a processor

    Publication number: US11928474B2

    Publication date: 2024-03-12

    Application number: US17832350

    Filing date: 2022-06-03

    CPC classification number: G06F9/3844 G06F9/325 G06F9/381

    Abstract: Selectively updating branch predictors for loops executed from loop buffers is disclosed herein. In some aspects, a branch predictor update circuit of a processor is configured to detect a loop comprising a plurality of loop instructions in an instruction stream, and to determine that the loop is stored within a loop buffer circuit of the processor. The branch predictor update circuit is further configured to determine a count of potential history register updates to the history register for the plurality of loop instructions, and to determine whether the count of potential history register updates exceeds a size of the history register. The branch predictor update circuit is also configured to, responsive to determining that the count of potential history register updates does not exceed the size of the history register, update a branch predictor of the branch predictor circuit based on the plurality of loop instructions.
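
The decision described in the abstract reduces to comparing a count against the history register size. The following is a minimal sketch of that comparison only, assuming a hypothetical instruction representation (dictionaries with an `updates_history` flag) and a hypothetical function name; the behavior for loops that are not in the loop buffer is not stated in the abstract and is marked as an assumption in the code.

```python
def should_update_branch_predictor(loop_instructions, in_loop_buffer, history_register_size):
    """Decide whether to update the branch predictor for a detected loop.

    loop_instructions: list of dicts; an instruction that would update the
        history register carries {"updates_history": True} (hypothetical encoding).
    in_loop_buffer: True when the loop is stored in the loop buffer.
    history_register_size: number of updates the history register can hold.
    """
    if not in_loop_buffer:
        # Assumption: loops outside the loop buffer follow the normal update path.
        return True
    # Count potential history register updates across the loop's instructions.
    count = sum(1 for ins in loop_instructions if ins.get("updates_history"))
    # Update only when the count does not exceed the history register size.
    return count <= history_register_size


loop = [
    {"op": "add"},
    {"op": "beq", "updates_history": True},
    {"op": "bne", "updates_history": True},
]
fits = should_update_branch_predictor(loop, True, 2)
overflows = should_update_branch_predictor(loop, True, 1)
```

Here `fits` is true because two potential updates fit in a two-entry history register, and `overflows` is false because the same loop would exceed a one-entry register.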
