Branch target look up suppression

    Publication (Announcement) No.: US11029959B2

    Publication (Announcement) Date: 2021-06-08

    Application No.: US16120674

    Filing Date: 2018-09-04

    Applicant: Arm Limited

    Abstract: Branch prediction circuitry processes blocks of instructions and provides instruction fetch circuitry with indications of predicted next blocks of instructions to be retrieved from memory. Main branch target storage stores branch target predictions for branch instructions in the blocks of instructions. Secondary branch target storage caches the branch target predictions from the main branch target storage. Look-ups in the secondary branch target storage and the main branch target storage are performed in parallel. The main branch target storage is set-associative and an entry in the main branch target storage comprises multiple ways, wherein each way of the multiple ways stores a branch target prediction for one branch instruction. The branch prediction circuitry stores a way prediction for which of the multiple ways contain the branch target predictions for a predicted next block of instructions and stores a flag associated with the way prediction indicating whether all branch target predictions stored for the predicted next block of instructions in the main branch target storage are also stored in the secondary branch target storage. An active value of the flag suppresses the look-up in the main branch target storage for the predicted next block of instructions.
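
    As an illustration only, here is a minimal behavioural sketch in Python of the flag-based suppression described in the abstract; the names (BranchPredictor, way_prediction, all_in_secondary) are invented for the sketch and are not taken from the patent.

        class BranchPredictor:
            """Toy model of a large, set-associative main branch target storage
            backed by a small secondary branch target storage that caches its
            entries, plus a per-block way prediction and flag."""

            def __init__(self):
                self.main_btb = {}       # (set_index, way) -> predicted target
                self.secondary_btb = {}  # block_address -> predicted target
                # block_address -> (predicted ways, all_in_secondary flag)
                self.way_prediction = {}

            def lookup(self, block_address, set_index):
                targets = []
                # The secondary look-up always proceeds: it is small and cheap.
                if block_address in self.secondary_btb:
                    targets.append(self.secondary_btb[block_address])
                ways, all_in_secondary = self.way_prediction.get(block_address, ((), False))
                # An active flag records that every target for this block is
                # already held in the secondary storage, so the larger, more
                # expensive main look-up is suppressed.
                if not all_in_secondary:
                    for way in ways:
                        target = self.main_btb.get((set_index, way))
                        if target is not None:
                            targets.append(target)
                return targets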

    Prefetching techniques

    Publication (Announcement) No.: US10817426B2

    Publication (Announcement) Date: 2020-10-27

    Application No.: US16139160

    Filing Date: 2018-09-24

    Applicant: Arm Limited

    Abstract: A variety of data processing apparatuses are provided in which stride determination circuitry determines a stride value as a difference between a current address and a previously received address. Stride storage circuitry stores an association between stride values determined by the stride determination circuitry and a frequency during a training period. Prefetch circuitry causes a further data value to be proactively retrieved from a further address. The further address is the current address modified by the stride value in the stride storage circuitry having the highest frequency during the training period. These apparatuses are variously directed towards improving efficiency by disregarding certain candidate stride values, considering additional further addresses for prefetching using multiple stride values, using feedback to adjust the training process, and compensating for page table boundaries.
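
    A hedged sketch in Python of the training scheme described in the abstract, assuming a fixed-length training window and a simple frequency count; StridePrefetcher and training_length are illustrative names, not taken from the patent.

        from collections import Counter

        class StridePrefetcher:
            """Toy model: count the strides seen during a training period and
            prefetch from the current address plus the most frequent stride."""

            def __init__(self, training_length=16):
                self.training_length = training_length
                self.stride_counts = Counter()
                self.previous_address = None
                self.samples = 0

            def access(self, address):
                if self.previous_address is not None:
                    stride = address - self.previous_address
                    if stride != 0:                      # disregard zero strides
                        self.stride_counts[stride] += 1
                        self.samples += 1
                self.previous_address = address
                if self.samples >= self.training_length:
                    best_stride, _ = self.stride_counts.most_common(1)[0]
                    return address + best_stride         # candidate prefetch address
                return None                              # still training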

    Handling move instructions via register renaming or writing to a different physical register using control flags

    Publication (Announcement) No.: US10528355B2

    Publication (Announcement) Date: 2020-01-07

    Application No.: US14757576

    Filing Date: 2015-12-24

    Applicant: ARM LIMITED

    Abstract: An apparatus has processing circuitry, register rename circuitry and control circuitry which selects one of first and second move handling techniques for handling a move instruction specifying a source logical register and a destination logical register. In the first technique, the register rename circuitry maps the destination logical register of the move to the same physical register as the source logical register. In the second technique, the processing circuitry writes a data value read from a physical register corresponding to the source logical register to a different physical register corresponding to the destination logical register. The second technique is selected when the source logical register of the move instruction is the same as one of the source and destination logical registers of an earlier move instruction handled according to the first technique, and the register mapping used for that register when handling the earlier move instruction is still current.
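
    A rough Python sketch of the two move-handling techniques, under the simplifying assumption that the rename map is a plain dictionary; RenameTable, move_linked and the helper methods are invented for illustration and are not the patent's terminology.

        class RenameTable:
            """Toy model: by default a move is handled by aliasing the destination
            logical register to the source's physical register (first technique);
            a later move whose source took part in such an aliasing, while that
            mapping is still current, falls back to an actual copy into a freshly
            allocated physical register (second technique)."""

            def __init__(self, num_logical):
                self.mapping = {r: r for r in range(num_logical)}  # logical -> physical
                self.next_free_physical = num_logical
                # Logical registers whose current mapping was created by a
                # first-technique move and has not been overwritten since.
                self.move_linked = set()

            def write(self, logical):
                # Any ordinary write breaks the link created by an earlier move.
                self.mapping[logical] = self.next_free_physical
                self.next_free_physical += 1
                self.move_linked.discard(logical)

            def handle_move(self, dst, src):
                if src in self.move_linked:
                    # Second technique: allocate a new physical register; the
                    # value is copied to it at execute time.
                    self.write(dst)
                    return "copy"
                # First technique: destination shares the source's physical register.
                self.mapping[dst] = self.mapping[src]
                self.move_linked.update({src, dst})
                return "rename"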

    Circuitry and method

    Publication (Announcement) No.: US11989583B2

    Publication (Announcement) Date: 2024-05-21

    Application No.: US17218425

    Filing Date: 2021-03-31

    Applicant: Arm Limited

    Abstract: Circuitry comprises two or more clusters of execution units, each cluster comprising one or more execution units to execute processing instructions, and scheduler circuitry to maintain one or more queues of processing instructions, the scheduler circuitry comprising picker circuitry to select a queued processing instruction for issue to an execution unit of one of the clusters of execution units for execution. The scheduler circuitry maintains dependency data associated with each queued processing instruction, the dependency data for a queued processing instruction indicating any source operands which are required to be available for use in execution of that queued processing instruction, and inhibits issue of that queued processing instruction until all of its required source operands are available; the scheduler circuitry is responsive to an indication of the availability of a given operand as a source operand for use in execution of queued processing instructions. In response to an indication of the availability of one or more last awaited source operands for a given queued processing instruction, the scheduler circuitry inhibits issue of that queued processing instruction to an execution unit in a cluster of execution units other than the cluster containing the execution unit which generated at least one of those last awaited source operands.
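
    A simplified Python sketch of the issue restriction described in the abstract, assuming tag-based operand tracking and one picker invocation per cluster; Scheduler, QueuedInstruction and preferred_cluster are illustrative names only.

        from dataclasses import dataclass

        @dataclass
        class QueuedInstruction:
            name: str
            awaited_sources: set            # operand tags not yet available
            preferred_cluster: int = -1     # -1: no cluster restriction yet

        class Scheduler:
            """Toy model: each queued instruction tracks its outstanding source
            operands; when the last awaited operand is produced, issue is
            restricted to the cluster that produced it, where forwarding of
            that result is cheapest."""

            def __init__(self):
                self.queue = []

            def enqueue(self, instruction):
                self.queue.append(instruction)

            def operand_available(self, tag, producing_cluster):
                for instruction in self.queue:
                    if tag in instruction.awaited_sources:
                        instruction.awaited_sources.discard(tag)
                        if not instruction.awaited_sources:   # last awaited operand
                            instruction.preferred_cluster = producing_cluster

            def pick(self, cluster):
                # Picker for one cluster: choose a ready instruction that is
                # either unrestricted or restricted to this cluster.
                for instruction in self.queue:
                    ready = not instruction.awaited_sources
                    allowed = instruction.preferred_cluster in (-1, cluster)
                    if ready and allowed:
                        self.queue.remove(instruction)
                        return instruction
                return None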

    Cache storage techniques

    Publication (Announcement) No.: US10810126B2

    Publication (Announcement) Date: 2020-10-20

    Application No.: US16139517

    Filing Date: 2018-09-24

    Applicant: Arm Limited

    Abstract: The present disclosure is concerned with improvements to cache systems that can increase performance (e.g. hit rates) and/or bandwidth within a memory hierarchy. For instance, a data processing apparatus is provided that comprises a cache. Access circuitry receives one or more requests for data and, when the data is present in the cache, the data is returned. Retrieval circuitry retrieves the data and stores it in the cache, either proactively or in response to the one or more requests for the data. Control circuitry evicts the data from the cache and, in dependence on at least one condition, stores the data in a further cache. The at least one condition comprises a requirement that the data was stored into the cache proactively and that a number of the one or more requests is above a threshold value.
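
    A small Python sketch of the eviction condition described in the abstract, with per-line metadata recording whether the line was filled proactively and how many requests hit it; Cache, fill and evict are invented names for the sketch.

        class Cache:
            """Toy model: on eviction, pass a line down to a further cache only
            if it was brought in proactively (prefetched) and was then requested
            more than `threshold` times."""

            def __init__(self, further_cache, threshold=1):
                self.lines = {}              # addr -> {"data", "prefetched", "hits"}
                self.further_cache = further_cache
                self.threshold = threshold

            def fill(self, addr, data, prefetched):
                self.lines[addr] = {"data": data, "prefetched": prefetched, "hits": 0}

            def read(self, addr):
                line = self.lines.get(addr)
                if line is not None:
                    line["hits"] += 1
                    return line["data"]
                return None                  # miss handling omitted

            def evict(self, addr):
                line = self.lines.pop(addr)
                if line["prefetched"] and line["hits"] > self.threshold:
                    self.further_cache[addr] = line["data"]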

    Storage circuitry request tracking

    Publication (Announcement) No.: US10776043B2

    Publication (Announcement) Date: 2020-09-15

    Application No.: US16118610

    Filing Date: 2018-08-31

    Applicant: Arm Limited

    Abstract: Storage circuitry is provided that is designed to form part of a memory hierarchy. The storage circuitry comprises receiver circuitry for receiving a request to obtain data from the memory hierarchy. Transfer circuitry causes the data to be stored at a selected destination in response to the request, wherein the selected destination is selected in dependence on at least one selection condition. Tracker circuitry tracks the request while the request is unresolved. If the at least one selection condition is met, the destination is the storage circuitry itself; otherwise, the destination is other storage circuitry in the memory hierarchy.
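
    A brief Python sketch of the destination selection and request tracking described in the abstract, where a single keep_here flag stands in for "the at least one selection condition is met"; StorageLevel and its methods are illustrative only.

        class StorageLevel:
            """Toy model of one level of a memory hierarchy: outstanding requests
            are tracked, and returned data is kept at this level only when the
            selection condition held for the request."""

            def __init__(self, next_level_store):
                self.local_store = {}
                self.next_level_store = next_level_store
                self.pending = {}            # addr -> keep_here flag (tracker)

            def request(self, addr, keep_here):
                self.pending[addr] = keep_here

            def data_returned(self, addr, data):
                keep_here = self.pending.pop(addr)      # request is now resolved
                if keep_here:
                    self.local_store[addr] = data       # destination: this storage
                else:
                    self.next_level_store[addr] = data  # destination: other storage
                return data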

    Arbitration of requests requiring a variable number of resources

    Publication (Announcement) No.: US10521368B2

    Publication (Announcement) Date: 2019-12-31

    Application No.: US14757577

    Filing Date: 2015-12-24

    Applicant: ARM LIMITED

    Abstract: Arbitration circuitry is provided for arbitrating between requests awaiting servicing. The requests require variable numbers of resources and the arbitration circuitry permits the requests to be serviced in a different order from the order in which they were received. Checking circuitry prevents a given request other than an oldest request from being serviced when a number of available resources is less than a threshold number of resources. The threshold number is varied based on the number of resources required for at least one other request awaiting servicing.
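
    A Python sketch of the arbitration guard described in the abstract; taking the threshold to be the oldest waiting request's resource requirement is just one possible reading of the threshold rule, and Arbiter, pick and release are invented names.

        from collections import deque

        class Arbiter:
            """Toy model: requests needing varying numbers of resources may be
            serviced out of order, but a younger request is blocked whenever the
            available resources fall below a threshold derived from an older
            waiting request's requirement."""

            def __init__(self, available_resources):
                self.available = available_resources
                self.pending = deque()       # (request_id, resources_needed), oldest first

            def add(self, request_id, resources_needed):
                self.pending.append((request_id, resources_needed))

            def pick(self):
                if not self.pending:
                    return None
                _, oldest_need = self.pending[0]
                threshold = oldest_need      # varies with the waiting requests
                for position, (request_id, need) in enumerate(self.pending):
                    is_oldest = (position == 0)
                    if need > self.available:
                        continue             # not enough resources for this request
                    if not is_oldest and self.available < threshold:
                        continue             # checking circuitry blocks younger requests
                    del self.pending[position]
                    self.available -= need
                    return request_id
                return None

            def release(self, resources):
                self.available += resources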

    Apparatus and method for performing multiple control flow predictions

    Publication (Announcement) No.: US10963258B2

    Publication (Announcement) Date: 2021-03-30

    Application No.: US16155049

    Filing Date: 2018-10-09

    Applicant: Arm Limited

    Abstract: A data processing apparatus is provided that includes lookup circuitry to provide first prediction data in respect of a first block of instructions and second prediction data in respect of a second block of instructions. First processing circuitry provides a first control flow prediction in respect of the first block of instructions using the first prediction data and second processing circuitry provides a second control flow prediction in respect of the second block of instructions using the second prediction data. The first block of instructions and the second block of instructions collectively define a prediction block and the lookup circuitry uses a reference to the prediction block as at least part of an index to both the first prediction data and the second prediction data.
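
    A compact Python sketch of the shared-index look-up described in the abstract, assuming a 64-byte prediction block covering two instruction blocks; DualBlockPredictor and PREDICTION_BLOCK_BYTES are assumptions made for the sketch.

        PREDICTION_BLOCK_BYTES = 64      # assumed size spanning two instruction blocks

        class DualBlockPredictor:
            """Toy model: one look-up, indexed by the prediction block address,
            returns prediction data for both instruction blocks; separate
            processing logic would then form a control flow prediction for each."""

            def __init__(self, table_size=1024):
                self.table_size = table_size
                self.first_block_table = [None] * table_size
                self.second_block_table = [None] * table_size

            def index(self, prediction_block_address):
                # The prediction block address forms (part of) the index for both tables.
                return (prediction_block_address // PREDICTION_BLOCK_BYTES) % self.table_size

            def lookup(self, prediction_block_address):
                idx = self.index(prediction_block_address)
                return self.first_block_table[idx], self.second_block_table[idx]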

    Scheduling in a data processing apparatus

    Publication (Announcement) No.: US10754687B2

    Publication (Announcement) Date: 2020-08-25

    Application No.: US16005811

    Filing Date: 2018-06-12

    Applicant: Arm Limited

    Abstract: There is provided a data processing apparatus that includes processing circuitry for executing a plurality of instructions. Storage circuitry stores a plurality of entries, each entry relating to an instruction in the plurality of instructions and including a dependency field. The dependency field stores a data dependency of that instruction on a previous instruction in the plurality of instructions. Scheduling circuitry schedules the execution of the plurality of instructions in an order that depends on each data dependency. When the previous instruction is a single-cycle instruction, the dependency field includes a reference to one of the entries that relates to the previous instruction; otherwise, the dependency field includes an indication of an output destination of the previous instruction.
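
    A short Python sketch of the two dependency-field encodings described in the abstract; EntryReference, OutputDestination and encode_dependency are invented names, not the patent's terminology.

        from dataclasses import dataclass
        from typing import Union

        @dataclass
        class EntryReference:
            entry_index: int        # queue entry of a single-cycle producer

        @dataclass
        class OutputDestination:
            register_tag: int       # output destination of a longer-latency producer

        Dependency = Union[EntryReference, OutputDestination]

        def encode_dependency(producer_entry, producer_is_single_cycle, producer_dest_register):
            """Toy encoding of the dependency field: a single-cycle producer is
            named by its queue entry (its result timing is implicit), while any
            other producer is named by the output destination it will eventually
            write, which is then watched for availability."""
            if producer_is_single_cycle:
                return EntryReference(producer_entry)
            return OutputDestination(producer_dest_register)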
