Predicting an outcome of an instruction following a flush

    Publication No.: US11157284B1

    Publication date: 2021-10-26

    Application No.: US16891431

    Filing date: 2020-06-03

    Applicant: Arm Limited

    Abstract: An apparatus is described, comprising processing circuitry to speculatively execute an earlier instruction and a later instruction by generating a prediction of an outcome of the earlier instruction and a prediction of an outcome of the later instruction, wherein the prediction of the outcome of the earlier instruction causes a first control flow path to be executed. The apparatus also comprises storage circuitry to store the outcome of the later instruction in response to the later instruction completing, and flush circuitry to generate a flush in response to the prediction of the outcome of the earlier instruction being incorrect. Permission circuitry permits the generating of the prediction by the processing circuitry. When re-executing the later instruction in a second control flow path following the flush, the processing circuitry is adapted to generate the prediction of the outcome of the later instruction as the outcome stored in the storage circuitry during execution of the first control flow path. The permission circuitry is adapted to permit or inhibit generating the prediction of the outcome of the later instruction as the outcome stored in the storage circuitry in dependence on a condition.
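As a rough illustration of the reuse described in this abstract, the sketch below stores the outcome of a completed later instruction and, after a flush, offers it as the prediction only when a permission check allows it. The class name, the `permit` callable, and the fallback predictor are all hypothetical; the abstract does not say what the condition is.

```python
class ReplayPredictor:
    """Toy model: reuse a stored outcome as the prediction after a flush."""

    def __init__(self, fallback, permit):
        self.stored = {}          # pc -> outcome recorded on the first path
        self.fallback = fallback  # hypothetical ordinary predictor: pc -> outcome
        self.permit = permit      # hypothetical permission condition: pc -> bool

    def record_completion(self, pc, outcome):
        # Storage circuitry: keep the outcome once the later instruction completes.
        self.stored[pc] = outcome

    def predict_after_flush(self, pc):
        # Use the stored outcome only if one exists and the condition permits it.
        if pc in self.stored and self.permit(pc):
            return self.stored[pc]
        return self.fallback(pc)


# The later instruction at 0x40 completed as "taken" on the first path.
allowed = ReplayPredictor(fallback=lambda pc: False, permit=lambda pc: True)
allowed.record_completion(0x40, True)
inhibited = ReplayPredictor(fallback=lambda pc: False, permit=lambda pc: False)
inhibited.record_completion(0x40, True)
```

Here `allowed` would return the stored outcome for `0x40`, while `inhibited` falls back to the ordinary predictor.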

    Memcpy micro-operation reduction
    Type: Granted invention patent

    Publication No.: US12204785B2

    Publication date: 2025-01-21

    Application No.: US17871332

    Filing date: 2022-07-22

    Applicant: Arm Limited

    Abstract: There is provided a data processing apparatus in which decode circuitry receives a memory copy instruction containing an indication of a source area of memory, an indication of a destination area of memory, and an indication of a remaining copy length. In response to receiving the memory copy instruction, the decode circuitry generates at least one active memory copy operation or a null memory copy operation. The active memory copy operation causes one or more execution units to perform a memory copy from part of the source area of memory to part of the destination area of memory and the null memory copy operation leaves the destination area of memory unmodified.
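The decode behaviour in this abstract can be pictured with a small software model. The sketch below is only an assumption-laden illustration: the 16-byte chunk size, the cap of two operations per instruction, and the choice to emit the null operation when the remaining length is zero are all hypothetical and are not stated in the abstract.

```python
from dataclasses import dataclass

CHUNK_BYTES = 16  # hypothetical width of one active copy operation


@dataclass(frozen=True)
class ActiveCopy:
    src: int
    dst: int
    length: int


@dataclass(frozen=True)
class NullCopy:
    pass


def decode_memcpy(src, dst, remaining, max_ops=2):
    """Turn one memory copy instruction into active operations or a null one."""
    if remaining <= 0:
        # Assumption: nothing left to copy yields the null operation.
        return [NullCopy()]
    ops = []
    while remaining > 0 and len(ops) < max_ops:
        n = min(CHUNK_BYTES, remaining)
        ops.append(ActiveCopy(src, dst, n))
        src, dst, remaining = src + n, dst + n, remaining - n
    return ops


def execute(ops, memory):
    """Apply the operations to a bytearray; a null operation changes nothing."""
    for op in ops:
        if isinstance(op, ActiveCopy):
            memory[op.dst:op.dst + op.length] = memory[op.src:op.src + op.length]


mem = bytearray(range(64))
execute(decode_memcpy(0, 32, 20), mem)  # two active operations: 16 + 4 bytes
```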

    Shared pointer for local history records used by prediction circuitry

    Publication No.: US11334361B2

    Publication date: 2022-05-17

    Application No.: US16806063

    Filing date: 2020-03-02

    Applicant: Arm Limited

    Abstract: An apparatus has processing circuitry, and history storage circuitry to store local history records. Each local history record corresponds to a respective subset of instruction addresses and tracks a sequence of instruction behaviour observed for successive instances of instructions having addresses in that subset. Pointer storage circuitry stores a shared pointer shared between the local history records. The shared pointer indicates a common storage position reached in each local history record. Prediction circuitry determines predicted instruction behaviour for a given instruction address based on a selected portion of a selected local history record stored in the history storage circuitry. The prediction circuitry selects the selected local history record based on the given instruction address and selects the selected portion based on the shared pointer.
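One way to picture the shared pointer is as a single write position used by every local history record. The sketch below is a simplified guess rather than the claimed design: the modulo mapping from address to record, the filler value written to non-selected records, and the choice of the most recent entries as the "selected portion" are all assumptions.

```python
class SharedPointerHistory:
    """Toy local-history store in which all records share one pointer."""

    def __init__(self, num_records=4, record_len=8, portion_len=3):
        self.records = [[0] * record_len for _ in range(num_records)]
        self.record_len = record_len
        self.portion_len = portion_len
        self.pointer = 0  # common storage position reached in every record

    def _record_index(self, addr):
        # Hypothetical mapping from instruction address to its subset.
        return addr % len(self.records)

    def update(self, addr, behaviour):
        selected = self._record_index(addr)
        for i, record in enumerate(self.records):
            # Assumption: non-selected records receive a filler 0 so that
            # every record advances to the same position.
            record[self.pointer] = behaviour if i == selected else 0
        self.pointer = (self.pointer + 1) % self.record_len

    def select_portion(self, addr):
        # Pick the record by address, then the portion by the shared pointer.
        record = self.records[self._record_index(addr)]
        start = self.pointer - self.portion_len
        return tuple(record[(start + i) % self.record_len]
                     for i in range(self.portion_len))


hist = SharedPointerHistory(num_records=2, record_len=4, portion_len=2)
for addr in (0, 1, 0):
    hist.update(addr, 1)
```

A real predictor would use the selected portion to look up or compute the predicted behaviour; that step is left out here.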

    Apparatus and method for performing multiple control flow predictions

    Publication No.: US10963258B2

    Publication date: 2021-03-30

    Application No.: US16155049

    Filing date: 2018-10-09

    Applicant: Arm Limited

    Abstract: A data processing apparatus is provided that includes lookup circuitry to provide first prediction data in respect of a first block of instructions and second prediction data in respect of a second block of instructions. First processing circuitry provides a first control flow prediction in respect of the first block of instructions using the first prediction data and second processing circuitry provides a second control flow prediction in respect of the second block of instructions using the second prediction data. The first block of instructions and the second block of instructions collectively define a prediction block and the lookup circuitry uses a reference to the prediction block as at least part of an index to both the first prediction data and the second prediction data.
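The indexing idea in this abstract, one lookup keyed by the prediction block that yields data for both instruction blocks, can be sketched as follows. The block size, the use of 2-bit saturating counters as prediction data, and the default counter value are hypothetical choices made only for illustration.

```python
BLOCK_BYTES = 16      # hypothetical size of each instruction block
DEFAULT_COUNTER = 1   # hypothetical "weakly not taken" starting state


def prediction_block_ref(addr):
    # Two adjacent instruction blocks form one prediction block.
    return addr // (2 * BLOCK_BYTES)


def lookup(table, addr):
    """Return (first, second) prediction data from a single index."""
    return table.get(prediction_block_ref(addr),
                     (DEFAULT_COUNTER, DEFAULT_COUNTER))


def predict_taken(counter):
    # Stand-in for one processing circuit: 2-bit counter, taken when >= 2.
    return counter >= 2


def predict_pair(table, addr):
    first_data, second_data = lookup(table, addr)
    return predict_taken(first_data), predict_taken(second_data)


table = {1: (3, 0)}  # prediction block 1 covers addresses 32..63
```

Any address inside the same prediction block reaches the same table entry, so both predictions come from one lookup.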

    Shortcut path for a branch target buffer

    Publication No.: US10817298B2

    Publication date: 2020-10-27

    Application No.: US15335741

    Filing date: 2016-10-27

    Applicant: ARM LIMITED

    Abstract: An apparatus comprises a branch target buffer (BTB) to store predicted target addresses of branch instructions. In response to a fetch block address identifying a fetch block comprising two or more program instructions, the BTB performs a lookup to identify whether it stores one or more predicted target addresses for one or more branch instructions in the fetch block. When the BTB is identified in the lookup as storing predicted target addresses for more than one branch instruction in said fetch block, branch target selecting circuitry selects a next fetch block address from among the multiple predicted target addresses returned in the lookup. A shortcut path bypassing the branch target selecting circuitry is provided to forward a predicted target address identified in the lookup as the next fetch block address when a predetermined condition is satisfied.
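A minimal software sketch of the two paths follows. The abstract does not define the "predetermined condition"; here it is assumed, purely for illustration, to be a single hit on an unconditional branch. The entry layout, the block size, and the direction-predictor callable are likewise hypothetical.

```python
BLOCK_BYTES = 16  # hypothetical fetch block size


def next_fetch_block(btb, fetch_block_addr, predicted_taken):
    """Return (next fetch block address, path used).

    btb maps a fetch block address to a list of
    (offset, target, is_unconditional) entries.
    """
    hits = btb.get(fetch_block_addr, [])

    # Shortcut path: assumed condition is exactly one hit, unconditional.
    if len(hits) == 1 and hits[0][2]:
        return hits[0][1], "shortcut"

    # Selection path: earliest branch in the block that is predicted taken.
    for offset, target, unconditional in sorted(hits):
        if unconditional or predicted_taken(fetch_block_addr + offset):
            return target, "select"

    # No taken branch found: continue with the next sequential block.
    return fetch_block_addr + BLOCK_BYTES, "sequential"


btb = {
    0x100: [(4, 0x200, True)],
    0x200: [(0, 0x300, False), (8, 0x400, True)],
}
```

With this table, a fetch from `0x100` takes the shortcut, while a fetch from `0x200` has two candidate targets and goes through selection.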

    Cache hierarchy management
    Type: Granted invention patent

    Publication No.: US10268581B2

    Publication date: 2019-04-23

    Application No.: US15479348

    Filing date: 2017-04-05

    Applicant: ARM Limited

    Abstract: A cache hierarchy and a method of operating the cache hierarchy are disclosed. The cache hierarchy comprises a first cache level comprising an instruction cache, and predecoding circuitry to perform a predecoding operation on instructions having a first encoding format retrieved from memory to generate predecoded instructions having a second encoding format for storage in the instruction cache. The cache hierarchy further comprises a second cache level comprising a cache and the first cache level instruction cache comprises cache control circuitry to control an eviction procedure for the instruction cache in which a predecoded instruction having the second encoding format which is evicted from the instruction cache is stored at the second cache level in the second encoding format. This enables the latency and power cost of the predecoding operation to be avoided when the predecoded instruction is then retrieved from the second cache level for storage in the first level instruction cache again.
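The saving described in this abstract can be seen in a toy model that counts predecode operations. The LRU replacement, the exclusive second level, and the tuple used as the "second encoding format" are assumptions for the sketch, not details from the abstract.

```python
from collections import OrderedDict


class CacheHierarchy:
    """Toy two-level hierarchy that keeps evicted lines in predecoded form."""

    def __init__(self, memory, l1_capacity=2):
        self.memory = memory        # addr -> instruction in the first format
        self.l1 = OrderedDict()     # addr -> predecoded instruction (LRU order)
        self.l2 = {}                # addr -> predecoded instruction
        self.l1_capacity = l1_capacity
        self.predecode_count = 0

    def _predecode(self, raw):
        self.predecode_count += 1
        return ("predecoded", raw)  # stand-in for the second encoding format

    def fetch(self, addr):
        if addr in self.l1:
            self.l1.move_to_end(addr)
            return self.l1[addr]
        if addr in self.l2:
            line = self.l2.pop(addr)  # already predecoded: no predecode needed
        else:
            line = self._predecode(self.memory[addr])
        self.l1[addr] = line
        if len(self.l1) > self.l1_capacity:
            victim, victim_line = self.l1.popitem(last=False)
            self.l2[victim] = victim_line  # evicted in the second format
        return line


caches = CacheHierarchy({0: "a", 1: "b", 2: "c"}, l1_capacity=2)
for addr in (0, 1, 2):   # third fetch evicts address 0 into the second level
    caches.fetch(addr)
refetched = caches.fetch(0)  # served from the second level
```

After the final fetch the predecode counter is unchanged from the three initial fills, which is the latency and power saving the abstract refers to.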
