Abstract:
Apparatuses and methods are disclosed for performing data processing operations in main processing circuitry and delegating certain tasks to auxiliary processing circuitry. User-specified instructions executed by the main processing circuitry comprise a task dispatch specification specifying an indication of the auxiliary processing circuitry and multiple data words defining a delegated task comprising at least one virtual address indicator. In response to the task dispatch specification the main processing circuitry performs virtual-to-physical address translation with respect to the at least one virtual address indicator to derive at least one physical address indicator, and issues a task dispatch memory write transaction to the auxiliary processing circuitry comprises the indication of the auxiliary processing circuitry and the multiple data words, wherein the at least one virtual address indicator in the multiple data words is substituted by the at least one physical address indicator.
Abstract:
A technique is provided for training a prediction apparatus. The apparatus has an input interface for receiving a sequence of training events indicative of program instructions, and identifier value generation circuitry for performing an identifier value generation function to generate, for a given training event received at the input interface, an identifier value for that given training event. The identifier value generation function is arranged such that the generated identifier value is dependent on at least one register referenced by a program instruction indicated by that given training event. Prediction storage is provided with a plurality of training entries, where each training entry is allocated an identifier value as generated by the identifier value generation function, and is used to maintain training data derived from training events having that allocated identifier value. Matching circuitry is then responsive to the given training event to detect whether the prediction storage has a matching training entry (i.e. an entry whose allocated identifier value matches the identifier value for the given training event). If so, it causes the training data in the matching training entry to be updated in dependence on the given training event.
Abstract:
An apparatus and method are provided for processing instructions. The apparatus has execution circuitry for executing instructions, where each instruction requires an associated operation to be performed using one or more source operand values in order to produce a result value. Issue circuitry is used to maintain a record of pending instructions awaiting execution by the execution circuitry, and prediction circuitry is used to produce a predicted source operand value for a chosen pending instruction. Optimisation circuitry is then arranged to detect an optimisation condition for the chosen pending instruction when the predicted source operand value is such that, having regard to the associated operation for the chosen pending instruction, the result value is known without performing the associated operation. In response to detection of the optimisation condition, an optimisation operation is implemented instead of causing the execution circuitry to perform the associated operation in order to execute the chosen pending instruction. This can lead to significant performance and/or power consumption improvements.
Abstract:
A data processing apparatus is provided in which a processor unit accesses data values stored in a memory and a cache stores local copies of a subset of the data values. The cache maintains a status value for each local copy stored in the cache. When the processor unit executes a load-exclusive operation, a first data value is loaded from a specified memory location and an exclusive use monitor begins monitoring the specified memory location for accesses. When the processor unit executes a store-exclusive operation, a second data value is stored to the specified memory location if the exclusive use monitor indicates that the first data value has not been modified since the load-exclusive operation was executed. When a local copy of the first data value is stored in the cache and the status value for the local copy of the first data value indicates that the processor unit has exclusive usage of the first data value, the data processing apparatus is configured to prevent modification of the status value for a predetermined time period after the processor unit has executed the load-exclusive operation.
Abstract:
A data processing apparatus comprises receiver circuitry for receiving instructions from each of a plurality of requester devices. Processing circuitry executes the instructions associated with each of a subset of the requester devices at a time and arbitration circuitry determines the subset of the requester devices and causes the instructions associated with each of the subset of the requester devices to be executed next. In response to the receiver circuitry receiving an instruction of a predetermined type from one of the requester devices outside the subset of requester devices, the arbitration circuitry causes the instruction of the predetermined type to be executed next.
Abstract:
Data processing apparatuses, methods of data processing, and non-transitory computer-readable media on which computer-readable code is stored defining logical configurations of processing devices are disclosed. In an apparatus, fetch circuitry retrieves a sequence of instructions and execution circuitry performs data processing operations with respect to data values in a set of registers. An auxiliary execution circuitry interface and a coprocessor interface to provide a connection to a coprocessor outside the apparatus are provided. Decoding circuitry generates control signals in dependence on the instructions of the sequence of instructions, wherein the decoding circuitry is responsive to instructions in a first subset of an instruction encoding space to generate the control signals to control the execution circuitry to perform the first data processing operations, and the decoding circuitry is responsive to instructions in a second subset of the instruction encoding space to generate the control signals in dependence on a configuration condition: to generate the control signals for the auxiliary execution circuitry interface when the configuration condition has a first state; and to generate the control signals for the coprocessor interface when the configuration condition has a second state.
Abstract:
An apparatus and method includes execution circuitry including a wide operand execution unit configured to allow up to N bits of operand data to be processed during execution of a single instruction. Decoder circuitry decodes and generates, for each instruction, at least one control data block identifying an operation to be performed by the execution circuitry and at least two re-combineable control data blocks for the instruction. Issue queue control circuitry then allocates a slot in the issue queue for each of the at least two data blocks and up to M bits of associated operand data, and marks those allocated slots to identify that they contain re-combineable control data blocks. The issue queue control circuitry issues the combined block to said wide operand execution unit along with the operand data contained in each of the allocated slots for said at least two control data blocks.
Abstract:
An apparatus has processing circuitry for executing instructions and fetch circuitry for fetching the instructions for execution. When a branch instruction is encountered by the fetch circuitry, it determines subsequent instructions to be fetched in dependence on an initial branch direction prediction for the branch instruction made by branch prediction circuitry. Value prediction circuitry is used to maintain a predicted result value for one or more instructions, and dispatch circuitry maintains a record of pending instructions that have been fetched by the fetch circuitry and are awaiting execution by the processing circuitry, and selects pending instructions from the record for dispatch to the processing circuitry. When a given instruction whose predicted result value is maintained by the value prediction circuitry has a dependent instruction whose outcome is dependent on a result value of the given instruction, the dispatch circuitry nay be arranged to enable speculative execution of that dependent instruction using the predicted result value of the given instruction. Analysis circuitry is arranged, when the dependent instruction is the branch instruction, to detect a mispredict condition when an additional branch direction prediction for the branch instruction determined using the predicted result value for the given instruction is considered more accurate that the initial branch direction prediction, and the additional branch direction prediction differs to the initial branch direction prediction. On detection of the mispredict condition, a control signal is issued to indicate that the branch instruction has been mispredicted.
Abstract:
Coherency control circuitry (10) supports processing of a safe-speculative-read transaction received from a requesting master device (4). The safe-speculative-read transaction is of a type requesting that target data is returned to a requesting cache (11) of the requesting master device (4) while prohibiting any change in coherency state associated with the target data in other caches (12) in response to the safe-speculative-read transaction. In response, at least when the target data is cached in a second cache associated with a second master device, at least one of the coherency control circuitry (10) and the second cache (12) is configured to return a safe-speculative-read response while maintaining the target data in the same coherency state within the second cache. This helps to mitigate against speculative side-channel attacks.
Abstract:
An apparatus and method of operating a data processing apparatus are disclosed. The apparatus comprises data processing circuitry to perform data processing operations in response to a sequence of instructions, wherein the data processing circuitry is capable of performing speculative execution of at least some of the sequence of instructions. A cache structure comprising entries stores temporary copies of data items which are subjected to the data processing operations and speculative execution tracking circuitry monitors correctness of the speculative execution and responsive to indication of incorrect speculative execution to cause entries in the cache structure allocated by the incorrect speculative execution to be evicted from the cache structure.