Abstract:
Examples of the present disclosure relate to an apparatus comprising execution circuitry to execute instructions defining data processing operations on data items. The apparatus comprises cache storage to store temporary copies of the data items. The apparatus comprises prefetching circuitry to a) predict that a data item will be subject to the data processing operations by the execution circuitry by determining that the data item is consistent with an extrapolation of previous data item retrieval by the execution circuitry, and identifying that at least one control flow element of the instructions indicates that the data item will be subject to the data processing operations by the execution circuitry; and b) prefetch the data item into the cache storage.
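The two-part prediction described above can be illustrated with a minimal Python sketch. It assumes the extrapolation is a constant-stride check over the last three addresses, and it models the control flow element as a remaining loop-iteration count. Both choices, and the names `predict_next_address` and `should_prefetch`, are hypothetical and not taken from the disclosure.

```python
from typing import List, Optional


def predict_next_address(history: List[int]) -> Optional[int]:
    """Extrapolate a constant stride from the last three accesses (assumed scheme)."""
    if len(history) < 3:
        return None
    stride_a = history[-2] - history[-3]
    stride_b = history[-1] - history[-2]
    if stride_a != stride_b or stride_b == 0:
        return None  # no stable pattern to extrapolate
    return history[-1] + stride_b


def should_prefetch(history: List[int], remaining_loop_iterations: int) -> Optional[int]:
    """Return the address to prefetch, or None.

    The control flow hint is modelled (as an assumption) by the number of
    loop iterations still to run: with none left, the extrapolated item
    would not actually be processed, so nothing is prefetched.
    """
    candidate = predict_next_address(history)
    if candidate is None:
        return None
    if remaining_loop_iterations < 1:
        return None
    return candidate
```

In this sketch, a stride match alone is not enough: the candidate is dropped when the loop is about to exit, which is the kind of over-fetch the control flow check is meant to avoid.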
Abstract:
A processing pipeline that processes instructions, with instructions from multiple threads in flight concurrently, may have control circuitry to detect a stalling event associated with a given thread. In response, at least one instruction of the given thread may be flushed from the pipeline, and the control circuitry may trigger fetch circuitry to reduce the fraction of fetched instructions that are fetched from the given thread. A mechanism is also described to trigger a predetermined action when a delay in accessing information becomes greater than a delay threshold, and to update the delay threshold based on a difference between the delay threshold and a return delay measured when the information is returned from the storage circuitry.
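The adaptive delay threshold can be sketched as follows. The abstract only states that the update is based on the difference between the return delay and the threshold. Moving the threshold a fixed fraction (`gain`) of that difference is an assumption made for illustration, and the class name `DelayMonitor` is hypothetical.

```python
class DelayMonitor:
    """Tracks a delay threshold and adapts it from observed return delays."""

    def __init__(self, threshold: float, gain: float = 0.25):
        self.threshold = threshold
        self.gain = gain  # assumed update fraction, not from the source

    def should_trigger(self, elapsed: int) -> bool:
        """True once the access delay exceeds the current threshold."""
        return elapsed > self.threshold

    def on_return(self, return_delay: int) -> None:
        """Update the threshold from the difference to the observed return delay."""
        difference = return_delay - self.threshold
        self.threshold += self.gain * difference
```

For example, with a threshold of 100 cycles and a return delay of 140 cycles, the difference is 40 and the threshold moves to 110.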
Abstract:
An apparatus for data processing and a method of data processing are provided. Data processing operations are performed in response to instructions which reference architectural registers using physical registers to store data values when performing the data processing operations. Mappings between the architectural registers and the physical registers are stored, and when a data hazard condition is identified with respect to out-of-order program execution of an instruction, an architectural register specified in the instruction is remapped to an available physical register. A reorder buffer stores an entry for each destination architectural register specified by the instruction, entries being stored in program order, and an entry specifies a destination architectural register and an original physical register to which the destination architectural register was mapped before the architectural register was remapped to an available physical register.
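A minimal sketch of the reorder buffer entry format described above: each entry records the destination architectural register and the physical register it was mapped to before remapping, which is enough to undo the remapping or to free the old register. As a simplification, this sketch remaps on every destination write rather than only when a hazard is detected. The class and method names are hypothetical.

```python
from collections import deque


class Renamer:
    def __init__(self, num_arch: int, num_phys: int):
        self.map = list(range(num_arch))              # arch register i -> phys register i
        self.free = deque(range(num_arch, num_phys))  # available physical registers
        self.rob = deque()                            # (arch_dest, original_phys), program order

    def rename_dest(self, arch: int) -> int:
        """Remap a destination register and log the original mapping."""
        if not self.free:
            raise RuntimeError("no free physical register")
        original = self.map[arch]
        new = self.free.popleft()
        self.rob.append((arch, original))
        self.map[arch] = new
        return new

    def commit_oldest(self) -> None:
        """Retire the oldest entry; the original physical register is now free."""
        _arch, original = self.rob.popleft()
        self.free.append(original)

    def squash_youngest(self) -> None:
        """Undo the youngest remapping by restoring the original mapping."""
        arch, original = self.rob.pop()
        self.free.appendleft(self.map[arch])
        self.map[arch] = original
```

Storing the original rather than the new physical register means recovery needs no separate checkpoint of the mapping table: walking the entries from youngest to oldest restores it.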
Abstract:
Apparatus for processing data 2 is provided with fetch circuitry 16 for fetching program instructions for execution from one or more active threads of instructions having respective program counter values. Pipeline circuitry 22, 24 has a first operating mode and a second operating mode. Mode switching circuitry 30 switches the pipeline circuitry 22, 24 between the first operating mode and the second operating mode in dependence upon a number of active threads of program instructions having program instructions available to be executed. The first operating mode has a lower average energy consumption per instruction executed than the second operating mode and the second operating mode has a higher average rate of instruction execution for a single thread than the first operating mode. The first operating mode may utilise a barrel processing pipeline 22 to perform interleaved multiple thread processing. The second operating mode may utilise an out-of-order processing pipeline 24 for performing out-of-order processing.
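The mode selection can be sketched as a simple policy function. The abstract says only that the mode depends on the number of active threads with instructions available. The direction of the comparison (many ready threads select the barrel pipeline) and the threshold value are assumptions, and the names are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    BARREL = "barrel"              # first mode: lower energy per instruction
    OUT_OF_ORDER = "out_of_order"  # second mode: higher single-thread rate


def select_mode(ready_threads: int, min_threads_for_barrel: int = 4) -> Mode:
    """Pick the operating mode from the count of threads with work available.

    Assumed policy: enough ready threads keep an interleaved barrel
    pipeline busy; with fewer, single-thread speed matters more.
    """
    if ready_threads >= min_threads_for_barrel:
        return Mode.BARREL
    return Mode.OUT_OF_ORDER
```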
Abstract:
Processing circuitry performs a processing operation to generate a two's complement result value representing a positive or negative number in two's complement representation. Normalization-and-rounding circuitry converts the two's complement result value to a normalized-and-rounded floating-point result value represented using sign-magnitude representation. The normalization-and-rounding circuitry comprises incrementing circuitry to perform an increment addition (e.g. a rounding increment or a conversion increment) to generate a fraction of the normalized-and-rounded floating-point result value. For an operation where the increment addition is required to be performed, tininess detection circuitry detects the after-rounding tininess status based on a still-to-be-incremented version of the normalized-and-rounded floating-point result value prior to the increment addition by the incrementing circuitry.
Abstract:
An apparatus and method are provided for controlling allocation of instructions into an instruction cache storage. The apparatus comprises processing circuitry to execute instructions, fetch circuitry to fetch instructions from memory for execution by the processing circuitry, and an instruction cache storage to store instructions fetched from the memory by the fetch circuitry. Cache control circuitry is responsive to the fetch circuitry fetching a target instruction from a memory address determined as a target address of an instruction flow changing instruction, at least when the memory address is within a specific address range, to prevent allocation of the fetched target instruction into the instruction cache storage unless the fetched target instruction is at least one specific type of instruction. It has been found that such an approach can inhibit the performance of speculation-based cache timing side-channel attacks.
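The allocation rule above reduces to a three-part check, sketched here. The instruction kind labels (such as `"landing_pad"`) and the half-open address range are hypothetical stand-ins. The abstract does not name the specific instruction types or say how the range is configured.

```python
from typing import FrozenSet, Tuple


def may_allocate(
    is_flow_change_target: bool,
    address: int,
    protected_range: Tuple[int, int],
    kind: str,
    allowed_kinds: FrozenSet[str],
) -> bool:
    """Decide whether a fetched instruction may be allocated into the cache."""
    start, end = protected_range
    if not is_flow_change_target:
        return True  # the rule only concerns targets of flow-changing instructions
    if not (start <= address < end):
        return True  # outside the protected range (assumed to be half-open)
    return kind in allowed_kinds  # inside the range: only specific types allocate
```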
Abstract:
A data processing apparatus comprises branch prediction circuitry adapted to store at least one branch prediction state entry in relation to a stream of instructions, input circuitry to receive at least one input to generate a new branch prediction state entry, wherein the at least one input comprises a plurality of bits; and coding circuitry adapted to perform an encoding operation to encode at least some of the plurality of bits based on a value associated with a current execution environment in which the stream of instructions is being executed. This guards against potential attacks which exploit the ability for branch prediction entries trained by one execution environment to be used by another execution environment as a basis for branch predictions.
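One way to picture the encoding operation is an XOR of the stored bits with a per-environment key. XOR is an assumption chosen for simplicity, since the abstract does not specify the encoding, and the function names and 32-bit width are hypothetical.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit entry width


def encode_target(target: int, env_key: int) -> int:
    """Encode a branch target with a key tied to the current execution environment."""
    return (target ^ env_key) & MASK


def decode_target(stored: int, env_key: int) -> int:
    """Decode a stored entry using the key of the environment reading it."""
    return (stored ^ env_key) & MASK
```

An entry trained under one key decodes to the original target only under that same key. Under a different key it decodes to a different value, so one environment cannot steer another's predictions to a chosen address.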
Abstract:
Data processing circuitry comprises instruction queue circuitry to maintain one or more instruction queues to store fetched instructions; instruction decode circuitry to decode instructions dispatched from the one or more instruction queues, the instruction decode circuitry being configured to allocate one or more processor resources of a set of processor resources to a decoded instruction for use in execution of that decoded instruction; detection circuitry to detect, for an instruction to be dispatched from a given instruction queue, a prediction indicating whether sufficient processor resources are predicted to be available for allocation to that instruction by the instruction decode circuitry; and dispatch circuitry to dispatch an instruction from the given instruction queue to the instruction decode circuitry, the dispatch circuitry being responsive to the detection circuitry to allow deletion of the dispatched instruction from that instruction queue when the prediction indicates that sufficient processor resources are predicted to be available for allocation to that instruction by the instruction decode circuitry.
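The dispatch behaviour can be sketched as follows: an instruction is removed from its queue on dispatch only when resources are predicted to be sufficient, and otherwise stays queued so it can be dispatched again. Modelling the prediction as a count of free resources compared against a per-instruction cost is an assumption, and the names are hypothetical.

```python
from collections import deque
from typing import Callable, Tuple


def dispatch(
    queue: deque,
    predicted_free: int,
    needed: Callable[[str], int],
) -> Tuple[str, bool]:
    """Dispatch the head of the queue to decode.

    Returns the dispatched instruction and whether it was deleted from
    the queue. Deletion is allowed only when the prediction says enough
    resources will be available for allocation at decode.
    """
    instr = queue[0]
    sufficient = needed(instr) <= predicted_free
    if sufficient:
        queue.popleft()
    return instr, sufficient
```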
Abstract:
An apparatus comprises a processing pipeline comprising out-of-order execution circuitry and second execution circuitry. Control circuitry monitors at least one reordering metric indicative of an extent to which instructions are executed out of order by the out-of-order execution circuitry, and controls whether instructions are executed using the out-of-order execution circuitry or the second execution circuitry based on the reordering metric. A speculation metric indicative of a fraction of executed instructions that are flushed due to a mis-speculation can also be used to determine whether to execute instructions on first or second execution circuitry having different performance or energy consumption characteristics.
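A reordering metric of the kind described could be computed as the fraction of instructions issued ahead of at least one older instruction. That definition, the threshold, and the names below are assumptions made for illustration. The abstract does not define the metric.

```python
from typing import List


def reordering_metric(issue_order: List[int]) -> float:
    """Fraction of instructions that issued before some older instruction.

    `issue_order` lists program-order sequence numbers in the order in
    which the instructions were actually issued.
    """
    if not issue_order:
        return 0.0
    reordered = 0
    min_later = float("inf")  # smallest sequence number issued after this point
    for seq in reversed(issue_order):
        if seq > min_later:
            reordered += 1  # an older instruction issued later than this one
        min_later = min(min_later, seq)
    return reordered / len(issue_order)


def choose_pipeline(metric: float, threshold: float = 0.1) -> str:
    """Keep out-of-order execution only while reordering is actually happening."""
    return "out_of_order" if metric >= threshold else "second"
```

For the issue order `[0, 2, 1, 3]`, only instruction 2 issues ahead of an older instruction, so the metric is 0.25. A fully in-order sequence gives 0.0 and selects the second execution circuitry.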