Arbitration of requests requiring a variable number of resources

    Publication number: US20170185542A1

    Publication date: 2017-06-29

    Application number: US14757577

    Filing date: 2015-12-24

    Applicant: ARM LIMITED

    Abstract: Arbitration circuitry is provided for arbitrating between requests awaiting servicing. The requests require variable numbers of resources and the arbitration circuitry permits the requests to be serviced in a different order to the order in which they were received. Checking circuitry prevents a given request other than an oldest request from being serviced when a number of available resources is less than a threshold number of resources. The threshold number is varied based on the number of resources required for at least one other request awaiting servicing.
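
    Illustrative sketch (not part of the patent record): the check described in the abstract might be modelled as below. The abstract does not say how the threshold is computed; this sketch assumes it is the request's own requirement plus the requirement of the oldest pending request, so that servicing a younger request never starves the oldest one. The names `Request`, `age`, `needed` and `may_service` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    age: int     # lower value = received earlier (hypothetical field)
    needed: int  # number of resources this request requires


def may_service(req, pending, available):
    """Return True if `req` may be serviced given `available` free resources."""
    oldest = min(pending, key=lambda r: r.age)
    if req is oldest:
        # The oldest request is limited only by its own requirement.
        return req.needed <= available
    # Assumed policy: the threshold for a younger request also reserves
    # the resources that the oldest pending request will need.
    threshold = req.needed + oldest.needed
    return available >= threshold


pending = [Request(age=0, needed=3), Request(age=1, needed=1), Request(age=2, needed=2)]
allowed = [may_service(r, pending, available=4) for r in pending]
```

    With four free resources, the oldest request and the one-resource request pass the check, while the two-resource request is held back because servicing it would leave too few resources for the oldest.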

    Speculative register file read suppression

    Type: Granted invention patent

    Status: In force

    Publication number: US09542194B2

    Publication date: 2017-01-10

    Application number: US14482146

    Filing date: 2014-09-10

    Applicant: ARM LIMITED

    CPC classification number: G06F9/3857 G06F9/384

    Abstract: A single threaded out-of-order processor 2 includes an architected register file 22 and a speculative register file 20. Speculative register allocation circuitry 24 serves to allocate speculative registers for use in accordance with an allocation sequence and taken from a position determined by a tail point. Read suppression circuitry 30 serves to maintain a boundary pointer corresponding to a position within the allocation sequence such that no speculative register more recently allocated within the allocation sequence than that corresponding to the boundary pointer can have a valid register value. The read suppression circuitry 30 serves to suppress read operations for source operands lying within a read-suppression region delimited by the tail point and the boundary pointer. Separate boundary pointers may be maintained for different types of register values, such as integer register values and floating point register values.
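
    Illustrative sketch (not part of the patent record): the read-suppression region can be pictured as a span of a circular allocation sequence. The sketch assumes registers are numbered 0 to size-1 and allocated in increasing order with wrap-around, that `boundary` is the most recently allocated register that may still hold a valid value, and that `tail` is the next register to be allocated. These conventions and the name `in_suppression_region` are assumptions, not details taken from the abstract.

```python
def in_suppression_region(reg, boundary, tail, size):
    """True if `reg` was allocated after `boundary` and before `tail`."""
    # Distances measured in allocation order, starting at the boundary.
    span = (tail - boundary) % size
    offset = (reg - boundary) % size
    # Registers strictly between boundary and tail cannot hold a valid
    # value yet, so a read of them would be suppressed.
    return 0 < offset < span


# 8 speculative registers, boundary at 6, tail wrapped around to 2.
suppressed = [r for r in range(8) if in_suppression_region(r, 6, 2, 8)]
```

    In this example registers 7, 0 and 1 fall inside the region, so source-operand reads for them would be skipped. When `boundary` equals `tail` the sketch treats the region as empty.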

    Responding to branch misprediction for predicated-loop-terminating branch instruction

    Publication number: US11693666B2

    Publication date: 2023-07-04

    Application number: US17505854

    Filing date: 2021-10-20

    Applicant: Arm Limited

    Abstract: A predicated-loop-terminating branch instruction controls, based on whether a loop termination condition is satisfied, whether the processing circuitry should process a further iteration of a predicated loop body or process a following instruction. If at least one unnecessary iteration of the predicated loop body is processed following a mispredicted-non-termination branch misprediction when the loop termination condition is mispredicted as unsatisfied for a given iteration when it should have been satisfied, processing of the at least one unnecessary iteration of the predicated loop body is predicated to suppress an effect of the at least one unnecessary iteration. When the mispredicted-non-termination branch misprediction is detected for the given iteration of the predicated-loop-terminating branch instruction, in response to determining that a flush suppressing condition is satisfied, flushing of the at least one unnecessary iteration of the predicated loop body is suppressed as a response to the mispredicted-non-termination branch misprediction.

    Writebacks of prefetched data

    Type: Granted invention patent

    Publication number: US11204878B1

    Publication date: 2021-12-21

    Application number: US17065834

    Filing date: 2020-10-08

    Applicant: Arm Limited

    Abstract: An apparatus is provided that includes a memory hierarchy comprising a plurality of caches and a memory. Prefetch circuitry acquires data from the memory hierarchy before the data is explicitly requested by processing circuitry configured to execute a stream of instructions. Writeback circuitry causes the data to be written back from a higher level cache of the memory hierarchy to a lower level cache of the memory hierarchy and tracking circuitry tracks a proportion of entries that are stored in the lower level cache of the memory hierarchy having been written back from the higher level cache of the memory hierarchy, that are subsequently explicitly requested by the processing circuitry in response to one of the instructions.
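
    Illustrative sketch (not part of the patent record): the tracked proportion can be modelled with two counters. The class `WritebackTracker` and its method names are hypothetical, and addresses stand in for cache entries; the abstract does not describe how the tracking circuitry is organised.

```python
class WritebackTracker:
    """Counts written-back entries and how many are later demanded."""

    def __init__(self):
        self.written_back = 0     # entries written back to the lower-level cache
        self.later_requested = 0  # of those, entries later explicitly requested
        self._pending = set()     # written-back entries not yet requested

    def on_writeback(self, addr):
        # Called when an entry moves from the higher to the lower-level cache.
        if addr not in self._pending:
            self._pending.add(addr)
            self.written_back += 1

    def on_demand_request(self, addr):
        # Called when the processor explicitly requests an address.
        if addr in self._pending:
            self._pending.discard(addr)
            self.later_requested += 1

    def proportion(self):
        if self.written_back == 0:
            return 0.0
        return self.later_requested / self.written_back


tracker = WritebackTracker()
for addr in (0x100, 0x140, 0x180, 0x1C0):
    tracker.on_writeback(addr)
tracker.on_demand_request(0x140)
tracker.on_demand_request(0x140)  # repeat request is counted only once
tracker.on_demand_request(0x999)  # never written back, so ignored
```

    Here one of four written-back entries is later requested, giving a proportion of 0.25. A low value would suggest that writing prefetched data back is rarely useful.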

    Misprediction of predicted taken branches in a data processing apparatus

    Publication number: US11086629B2

    Publication date: 2021-08-10

    Application number: US16185073

    Filing date: 2018-11-09

    Applicant: Arm Limited

    Abstract: Apparatus and a method of operating the same are disclosed. Instruction fetch circuitry is provided to fetch a block of instructions from memory and branch prediction circuitry to generate branch prediction indications for each branch instruction present in the block of instructions. The branch prediction circuitry is responsive to identification of a first conditional branch instruction in the block of instructions that is predicted to be taken to modify a branch prediction indication generated for the first conditional branch instruction to include a subsequent branch status indicator. When there is a subsequent branch instruction after the first conditional branch instruction in the block of instructions that is predicted to be taken, the subsequent branch status indicator has a first value; otherwise the subsequent branch status indicator has a second value. This supports improved handling of a misprediction as taken.
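
    Illustrative sketch (not part of the patent record): the subsequent branch status indicator can be pictured as a flag computed per fetched block. The sketch assumes each branch in the block is represented as a hypothetical `(is_conditional, predicted_taken)` pair in program order, reads "first" as the earliest predicted-taken conditional branch, and uses True and False for the first and second values.

```python
def subsequent_branch_status(branches):
    """Return the indicator for the earliest predicted-taken conditional branch.

    True  = a later branch in the block is also predicted taken.
    False = no later branch in the block is predicted taken.
    None  = the block has no predicted-taken conditional branch.
    """
    for i, (is_conditional, taken) in enumerate(branches):
        if is_conditional and taken:
            # Look only at branches after this one in the same block.
            return any(later_taken for _, later_taken in branches[i + 1:])
    return None


block = [(True, False), (True, True), (False, True)]
indicator = subsequent_branch_status(block)
```

    If the predicted-taken branch later resolves as not taken, such a flag would tell the front end whether another taken branch follows in the already fetched block.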

    Handling multiple control flow instructions

    Publication number: US10817299B2

    Publication date: 2020-10-27

    Application number: US16124264

    Filing date: 2018-09-07

    Applicant: Arm Limited

    Abstract: A data processing apparatus is provided that includes a plurality of control flow execution circuits to simultaneously execute a first control flow instruction having a first type and a second control flow instruction having a second type from a plurality of instructions. A control flow prediction update circuit updates at most one of: a prediction of the first control flow instruction based on a result of the first control flow instruction, and a prediction of the second control flow instruction based on a result of the second control flow instruction.

    Cache storage techniques

    Type: Invention patent application

    Publication number: US20200097410A1

    Publication date: 2020-03-26

    Application number: US16139517

    Filing date: 2018-09-24

    Applicant: Arm Limited

    Abstract: The present disclosure is concerned with improvements to cache systems that can be used to improve the performance (e.g. hit performance) and/or bandwidth within a memory hierarchy. For instance, a data processing apparatus is provided that comprises a cache. Access circuitry receives one or more requests for data and, when the data is present in the cache, the data is returned. Retrieval circuitry retrieves the data and stores the data in the cache, either proactively or in response to the one or more requests for the data. Control circuitry evicts the data from the cache and, in dependence on at least one condition, stores the data in a further cache. The at least one condition comprises a requirement that the data was stored into the cache proactively and that a number of the one or more requests is above a threshold value.
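
    Illustrative sketch (not part of the patent record): the eviction condition stated in the abstract reduces to a two-part test. The `Line` record, its field names and the function `store_in_further_cache` are hypothetical, and any additional conditions the full specification may impose are not modelled.

```python
from dataclasses import dataclass


@dataclass
class Line:
    prefetched: bool    # True if the data was stored proactively
    request_count: int  # number of requests seen for this data


def store_in_further_cache(line, threshold):
    """Decide, on eviction, whether the line goes to the further cache."""
    # Both parts of the stated requirement must hold: proactive fill
    # and a request count above the threshold.
    return line.prefetched and line.request_count > threshold


decisions = [
    store_in_further_cache(Line(prefetched=True, request_count=3), threshold=2),
    store_in_further_cache(Line(prefetched=True, request_count=2), threshold=2),
    store_in_further_cache(Line(prefetched=False, request_count=5), threshold=2),
]
```

    Only the first line qualifies: it was prefetched and requested more than twice.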

    Prefetching techniques

    Type: Invention patent application

    Publication number: US20200097409A1

    Publication date: 2020-03-26

    Application number: US16139160

    Filing date: 2018-09-24

    Applicant: Arm Limited

    Abstract: A variety of data processing apparatuses are provided in which stride determination circuitry determines a stride value as a difference between a current address and a previously received address. Stride storage circuitry stores an association between stride values determined by the stride determination circuitry and a frequency during a training period. Prefetch circuitry causes a further data value to be proactively retrieved from a further address. The further address is the current address modified by a stride value in the stride storage circuitry having a highest frequency during the training period. The variety of data processing apparatuses are directed towards improving efficiency by variously disregarding certain candidate stride values, considering additional further addresses for prefetching by using multiple stride values, using feedback to adjust the training process and compensating for page table boundaries.
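
    Illustrative sketch (not part of the patent record): the core idea of choosing the most frequent stride seen during training might look like the following. The class `StridePrefetcher` and its methods are hypothetical, and the refinements listed at the end of the abstract (disregarding candidate strides, multiple strides, feedback, page boundary handling) are not modelled.

```python
from collections import Counter


class StridePrefetcher:
    """Learns stride frequencies and predicts the next address to fetch."""

    def __init__(self):
        self.freq = Counter()  # stride value -> times observed in training
        self.prev = None       # previously received address

    def train(self, addr):
        # Stride = difference between the current and the previous address.
        if self.prev is not None:
            self.freq[addr - self.prev] += 1
        self.prev = addr

    def prefetch_address(self, current):
        # Apply the most frequently observed stride to the current address.
        if not self.freq:
            return None
        stride, _count = self.freq.most_common(1)[0]
        return current + stride


prefetcher = StridePrefetcher()
for addr in (100, 164, 228, 232, 296):
    prefetcher.train(addr)
next_addr = prefetcher.prefetch_address(296)
```

    The training addresses produce strides of 64, 64, 4 and 64, so the stride of 64 wins and the address suggested for prefetching is 360.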
