Technique for training a prediction apparatus

    Publication No.: US11580032B2

    Publication Date: 2023-02-14

    Application No.: US17153147

    Filing Date: 2021-01-20

    Applicant: Arm Limited

    Abstract: A technique is provided for training a prediction apparatus. The apparatus has an input interface for receiving a sequence of training events indicative of program instructions, and identifier value generation circuitry for performing an identifier value generation function to generate, for a given training event received at the input interface, an identifier value for that given training event. The identifier value generation function is arranged such that the generated identifier value is dependent on at least one register referenced by a program instruction indicated by that given training event. Prediction storage is provided with a plurality of training entries, where each training entry is allocated an identifier value as generated by the identifier value generation function, and is used to maintain training data derived from training events having that allocated identifier value. Matching circuitry is then responsive to the given training event to detect whether the prediction storage has a matching training entry (i.e. an entry whose allocated identifier value matches the identifier value for the given training event). If so, it causes the training data in the matching training entry to be updated in dependence on the given training event.
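The training flow described above can be sketched as a small simulation. This is an illustrative model only, not the patent's implementation: the identifier value generation function here is a simple hash over the registers referenced by a training event, and all names are invented.

```python
# Hypothetical sketch: identifier values derived from the registers a training
# event references select which training entry to update.

def identifier_value(registers):
    """Generate an identifier from the set of registers an event references."""
    value = 0
    for reg in sorted(registers):
        value = (value * 31 + reg) & 0xFFFF  # fold into a 16-bit identifier
    return value

class PredictionStorage:
    def __init__(self):
        self.entries = {}  # allocated identifier value -> training data

    def train(self, event_registers, outcome):
        ident = identifier_value(event_registers)
        if ident in self.entries:                # matching circuitry hit
            self.entries[ident].append(outcome)  # update training data
        else:
            self.entries[ident] = [outcome]      # allocate a new entry

storage = PredictionStorage()
storage.train({1, 5}, outcome=True)
storage.train({5, 1}, outcome=False)  # same registers -> same entry
assert len(storage.entries) == 1
```

Because the identifier depends only on the registers referenced, training events for instructions touching the same registers share one entry.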

    Data processing apparatus and method for performing load-exclusive and store-exclusive operations
    Invention grant (in force)

    Publication No.: US09223701B2

    Publication Date: 2015-12-29

    Application No.: US13861622

    Filing Date: 2013-04-12

    Applicant: ARM LIMITED

    Abstract: A data processing apparatus is provided in which a processor unit accesses data values stored in a memory and a cache stores local copies of a subset of the data values. The cache maintains a status value for each local copy stored in the cache. When the processor unit executes a load-exclusive operation, a first data value is loaded from a specified memory location and an exclusive use monitor begins monitoring the specified memory location for accesses. When the processor unit executes a store-exclusive operation, a second data value is stored to the specified memory location if the exclusive use monitor indicates that the first data value has not been modified since the load-exclusive operation was executed. When a local copy of the first data value is stored in the cache and the status value for the local copy of the first data value indicates that the processor unit has exclusive usage of the first data value, the data processing apparatus is configured to prevent modification of the status value for a predetermined time period after the processor unit has executed the load-exclusive operation.
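The load-exclusive / store-exclusive pairing above is akin to load-link/store-conditional. The following is a minimal illustrative model, assuming a single monitor watching one address; the cache status value and the predetermined protection window are omitted.

```python
# Illustrative model of an exclusive-use monitor (simplified assumption:
# one monitored address, single processor unit).

class ExclusiveMonitor:
    def __init__(self):
        self.watched = None  # address currently monitored, if any

    def mark(self, addr):
        self.watched = addr

    def check_and_clear(self, addr):
        ok = self.watched == addr
        self.watched = None
        return ok

memory = {0x100: 7}
monitor = ExclusiveMonitor()

def load_exclusive(addr):
    monitor.mark(addr)           # begin monitoring the specified location
    return memory[addr]

def store_exclusive(addr, value):
    """Store succeeds only if the monitor still covers addr."""
    if monitor.check_and_clear(addr):
        memory[addr] = value
        return True
    return False

old = load_exclusive(0x100)
assert store_exclusive(0x100, old + 1)  # succeeds: no intervening access
assert memory[0x100] == 8
assert not store_exclusive(0x100, 99)   # fails: monitor already cleared
```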

    Sharing instruction encoding space between a coprocessor and auxiliary execution circuitry

    Publication No.: US11263014B2

    Publication Date: 2022-03-01

    Application No.: US16531295

    Filing Date: 2019-08-05

    Applicant: Arm Limited

    Abstract: Data processing apparatuses, methods of data processing, and non-transitory computer-readable media on which computer-readable code is stored defining logical configurations of processing devices are disclosed. In an apparatus, fetch circuitry retrieves a sequence of instructions and execution circuitry performs data processing operations with respect to data values in a set of registers. An auxiliary execution circuitry interface is provided, as well as a coprocessor interface providing a connection to a coprocessor outside the apparatus. Decoding circuitry generates control signals in dependence on the instructions of the sequence of instructions, wherein the decoding circuitry is responsive to instructions in a first subset of an instruction encoding space to generate the control signals to control the execution circuitry to perform first data processing operations, and the decoding circuitry is responsive to instructions in a second subset of the instruction encoding space to generate the control signals in dependence on a configuration condition: to generate the control signals for the auxiliary execution circuitry interface when the configuration condition has a first state; and to generate the control signals for the coprocessor interface when the configuration condition has a second state.
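The decode-time routing can be sketched as follows. The encoding boundary, the two-state condition, and all names are invented for illustration; the patent defines the subsets architecturally, not by a numeric threshold.

```python
# Sketch: instructions in a second encoding subset are steered either to an
# auxiliary execution unit or to an external coprocessor, depending on a
# configuration condition. Encodings here are assumptions.

AUX, COPROC = "auxiliary", "coprocessor"

def decode(opcode, config_condition):
    if opcode < 0x80:  # first subset: handled by the main execution circuitry
        return "execute"
    # second subset: destination depends on the configuration condition
    return AUX if config_condition == 1 else COPROC

assert decode(0x10, 1) == "execute"
assert decode(0x90, 1) == AUX    # first state -> auxiliary interface
assert decode(0x90, 2) == COPROC # second state -> coprocessor interface
```

The point of the scheme is that one encoding subset is shared: the same instruction bits drive either interface, selected at decode time by the configuration condition.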

    Apparatus and method for performing branch prediction

    Publication No.: US10831499B2

    Publication Date: 2020-11-10

    Application No.: US16106382

    Filing Date: 2018-08-21

    Applicant: Arm Limited

    Abstract: An apparatus and method are provided for performing branch prediction. The apparatus has processing circuitry for executing instructions, and branch prediction circuitry for making branch outcome predictions in respect of branch instructions. The branch prediction circuitry includes loop prediction circuitry having a plurality of entries, where each entry is used to maintain branch outcome prediction information for a loop controlling branch instruction that controls repeated execution of a loop comprising a number of instructions. The branch prediction circuitry is arranged to analyse blocks of instructions and to produce a prediction result for each block that is dependent on branch outcome predictions made for any branch instructions appearing in the associated block. A prediction queue then stores the prediction results produced by the branch prediction circuitry in order to determine the instructions to be executed by the processing circuitry. When the block of instructions being analysed comprises a loop controlling branch instruction that has an active entry in the loop prediction circuitry, and a determined condition is detected in respect of the associated loop, the loop prediction circuitry is arranged to produce a prediction result that identifies multiple iterations of the loop. This can significantly boost prediction bandwidth for certain types of loop.
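A rough model of the loop-prediction idea: once a loop-controlling branch has an active entry with a learned trip count, one prediction result can cover many iterations at once. The trip-count table, the batch cap, and the "determined condition" (modelled here simply as having an active entry) are illustrative assumptions.

```python
# Toy loop predictor: an active entry lets one prediction result identify
# multiple iterations of the loop, boosting prediction bandwidth.

class LoopPredictor:
    def __init__(self):
        self.entries = {}  # branch PC -> learned trip count

    def train(self, pc, trip_count):
        self.entries[pc] = trip_count

    def predict(self, pc, max_batch=8):
        """Return how many loop iterations one prediction result covers."""
        if pc not in self.entries:
            return 1  # no active entry: predict one block at a time
        return min(self.entries[pc], max_batch)

lp = LoopPredictor()
lp.train(pc=0x400, trip_count=100)
assert lp.predict(0x400) == 8  # one result spans multiple iterations
assert lp.predict(0x800) == 1  # unknown branch: normal single prediction
```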

    Apparatus and method for mapping architectural registers to physical registers
    Invention grant (in force)

    Publication No.: US09311088B2

    Publication Date: 2016-04-12

    Application No.: US13927552

    Filing Date: 2013-06-26

    Applicant: ARM Limited

    Abstract: An apparatus and method are provided for performing register renaming. Available register identifying circuitry is provided to identify which physical registers form a pool of physical registers available to be mapped by register renaming circuitry to an architectural register specified by an instruction to be executed. Configuration data whose value is modified during operation of the processing circuitry is stored such that, when the configuration data has a first value, the configuration data identifies at least one architectural register of the architectural register set which does not require mapping to a physical register by the register renaming circuitry. The available register identifying circuitry is arranged to reference the configuration data, such that when the configuration data has the first value, the number of physical registers in the pool is increased due to the reduction in the number of architectural registers which require mapping to physical registers.
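The pool-size effect can be shown with a toy calculation. The register counts, and which architectural registers the configuration data excludes, are assumptions chosen for illustration.

```python
# Toy model: when configuration data marks some architectural registers as
# not needing a mapping, their physical registers return to the free pool.

NUM_PHYSICAL = 16
ARCH_REGS = set(range(8))  # architectural registers r0..r7 (assumed)

def free_pool_size(config_excludes_regs):
    """Physical registers left unmapped, given the configuration data."""
    unmapped = {6, 7} if config_excludes_regs else set()  # assumed exclusions
    needed = len(ARCH_REGS - unmapped)
    return NUM_PHYSICAL - needed

assert free_pool_size(False) == 8   # all 8 architectural registers mapped
assert free_pool_size(True) == 10   # two fewer mappings -> larger pool
```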

    Abstract translation: 提供了一种用于执行寄存器重命名的装置和方法。 提供可用的寄存器识别电路以识别哪些物理寄存器形成可由寄存器重命名电路映射到由要执行的指令指定的架构寄存器的物理寄存器池。 存储其值在处理电路的操作期间被修改的配置数据,使得当配置数据具有第一值时,配置数据识别架构寄存器集合的至少一个体系结构寄存器,其不需要映射到物理寄存器 寄存器重命名电路。 寄存器识别电路被布置为引用修改的数据值,使得当配置数据具有第一值时,由于需要映射到物理寄存器的架构寄存器的数量的减少,池中的物理寄存器的数量增加 。

    Task dispatch
    Invention grant

    Publication No.: US11550620B2

    Publication Date: 2023-01-10

    Application No.: US17190729

    Filing Date: 2021-03-03

    Applicant: Arm Limited

    Abstract: Apparatuses and methods are disclosed for performing data processing operations in main processing circuitry and delegating certain tasks to auxiliary processing circuitry. User-specified instructions executed by the main processing circuitry comprise a task dispatch specification specifying an indication of the auxiliary processing circuitry and multiple data words defining a delegated task comprising at least one virtual address indicator. In response to the task dispatch specification, the main processing circuitry performs virtual-to-physical address translation with respect to the at least one virtual address indicator to derive at least one physical address indicator, and issues to the auxiliary processing circuitry a task dispatch memory write transaction comprising the indication of the auxiliary processing circuitry and the multiple data words, wherein the at least one virtual address indicator in the multiple data words is substituted by the at least one physical address indicator.
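The dispatch flow above amounts to: translate the virtual address indicators in the task's data words, then issue the write transaction with the physical addresses substituted in. The page table, word layout, and 4 KiB page size below are invented for illustration.

```python
# Sketch: virtual-to-physical substitution before dispatching a task to the
# auxiliary processing circuitry (page table and layout are assumptions).

PAGE_TABLE = {0x1000: 0x8000}  # virtual page base -> physical page base

def translate(vaddr):
    page, offset = vaddr & ~0xFFF, vaddr & 0xFFF
    return PAGE_TABLE[page] | offset

def dispatch_task(aux_id, data_words, vaddr_indices):
    """Substitute translated addresses, then issue the write transaction."""
    words = list(data_words)
    for i in vaddr_indices:
        words[i] = translate(words[i])  # VA indicator -> PA indicator
    return {"target": aux_id, "payload": words}

txn = dispatch_task(aux_id=2, data_words=[0xABCD, 0x1234, 0x42],
                    vaddr_indices=[1])
assert txn["payload"][1] == 0x8234  # virtual 0x1234 -> physical 0x8234
```

Doing the translation in the main processor means the auxiliary unit never needs its own MMU walk for the delegated task.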

    Apparatus and method for controlling allocation of data into a cache storage

    Publication No.: US10394716B1

    Publication Date: 2019-08-27

    Application No.: US15946848

    Filing Date: 2018-04-06

    Applicant: Arm Limited

    Abstract: An apparatus and method are provided for controlling allocation of data into cache storage. The apparatus comprises processing circuitry for executing instructions, and a cache storage for storing data accessed when executing the instructions. Cache control circuitry is arranged, while a sensitive allocation condition is determined to exist, to be responsive to the processing circuitry speculatively executing a memory access instruction that identifies data to be allocated into the cache storage, to allocate the data into the cache storage and to set a conditional allocation flag in association with the data allocated into the cache storage. The cache control circuitry is then responsive to detecting an allocation resolution event, to determine based on the type of the allocation resolution event whether to clear the conditional allocation flag such that the data is thereafter treated as unconditionally allocated, or to cause invalidation of the data in the cache storage. Such an approach can reduce the vulnerability of a cache to speculation-based cache timing side-channel attacks.
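The conditional-allocation scheme can be modelled as: speculative fills are flagged while the sensitive condition exists, then either confirmed or invalidated when the allocation resolution event arrives. The cache structure below is an illustrative assumption, not the patent's design.

```python
# Simplified model: flagged speculative cache fills are invalidated on
# mis-speculation, closing a speculation-based timing side channel.

class Cache:
    def __init__(self):
        self.lines = {}  # address -> (data, conditional_allocation_flag)

    def alloc_speculative(self, addr, data, sensitive):
        self.lines[addr] = (data, sensitive)  # flag set while sensitive

    def resolve(self, addr, speculation_correct):
        data, conditional = self.lines[addr]
        if not conditional:
            return
        if speculation_correct:
            self.lines[addr] = (data, False)  # now unconditionally allocated
        else:
            del self.lines[addr]              # invalidate the flagged line

c = Cache()
c.alloc_speculative(0x40, data=123, sensitive=True)
c.resolve(0x40, speculation_correct=False)
assert 0x40 not in c.lines  # mis-speculated fill leaves no cache footprint
```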
