Abstract:
A data processing apparatus 2 performs multi-threaded processing using a processing pipeline 6, 8, 10, 12, 14, 16, 18. Flush control circuitry 30 is responsive to multiple different types of flush trigger. Different types of flush trigger result in different sets of state being flushed for the thread which gave rise to the flush trigger, while state for other threads is not flushed. For example, a relatively low latency stall may result in flushing back to a first flush point, whereas a longer latency stall results in flushing back to a second flush point and the loss of more state data. The data flushed back to the first flush point may be a subset of the data flushed back to the second flush point.
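The two-level flush might be pictured with the following Python sketch, in which a short stall flushes a shallow set of stages and a long stall flushes a deeper set, for the triggering thread only. The stage names and the latency threshold are illustrative assumptions, not details from the abstract.

```python
# Behavioural sketch of per-thread, two-level flushing.
# Stage names and the latency threshold are illustrative assumptions.

FIRST_FLUSH = {"execute", "issue"}                 # shallow flush: first flush point
SECOND_FLUSH = FIRST_FLUSH | {"decode", "fetch"}   # deeper flush: second flush point

def flush_for_thread(pipeline, thread_id, stall_latency, threshold=10):
    """Discard state only for the triggering thread; other threads keep theirs."""
    stages = FIRST_FLUSH if stall_latency < threshold else SECOND_FLUSH
    for stage in stages:
        pipeline[stage] = [op for op in pipeline[stage] if op["thread"] != thread_id]
    return pipeline
```

Note that FIRST_FLUSH is a subset of SECOND_FLUSH, mirroring the abstract's final sentence.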
Abstract:
An apparatus and method are provided for processing instructions. The apparatus has execution circuitry for executing instructions, where each instruction requires an associated operation to be performed using one or more source operand values in order to produce a result value. Issue circuitry is used to maintain a record of pending instructions awaiting execution by the execution circuitry, and prediction circuitry is used to produce a predicted source operand value for a chosen pending instruction. Optimisation circuitry is then arranged to detect an optimisation condition for the chosen pending instruction when the predicted source operand value is such that, having regard to the associated operation for the chosen pending instruction, the result value is known without performing the associated operation. In response to detection of the optimisation condition, an optimisation operation is implemented instead of causing the execution circuitry to perform the associated operation in order to execute the chosen pending instruction. This can lead to significant performance and/or power consumption improvements.
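As a hypothetical illustration of the optimisation condition, the sketch below covers cases where a predicted operand fixes the result regardless of the other operand. The specific operations shown are assumptions; the abstract does not enumerate any.

```python
# Hypothetical cases where a predicted source operand makes the result
# known without performing the associated operation.

def known_result(op, predicted):
    """Return the result value if it is known without executing, else None."""
    if op in ("mul", "and") and predicted == 0:
        return 0                      # x * 0 == 0 and x & 0 == 0
    if op == "or" and predicted == 0xFFFFFFFF:
        return 0xFFFFFFFF             # x | all-ones == all-ones
    return None                       # no optimisation condition: execute normally
```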
Abstract:
An apparatus comprises prediction circuitry (40, 100, 80) for determining, based on current prediction policy information (43, 82, 104), a predicted behaviour to be used for processing instructions. The current prediction policy information is updated based on an outcome of processing of instructions. A storage structure (50) stores at least one entry identifying previous prediction policy information (60) for a corresponding block of instructions. In response to an instruction from a block having a corresponding entry in the storage structure (50) which identifies the previous prediction policy information (60), the current prediction policy information (43, 82, 104) can be reset based on the previous prediction policy information (60) identified in the corresponding entry of the storage structure (50).
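The save/reset behaviour might be modelled as follows. The dictionary keyed by block address and the explicit record() call are illustrative assumptions about the storage structure (50).

```python
# Sketch of saving and restoring prediction policy per instruction block.

class PolicyCheckpoints:
    def __init__(self, initial_policy):
        self.current = dict(initial_policy)  # updated as instruction outcomes arrive
        self.saved = {}                      # block address -> previous policy (60)

    def record(self, block_addr):
        """Capture the current policy for this block."""
        self.saved[block_addr] = dict(self.current)

    def on_block_entry(self, block_addr):
        # Reset the current policy from the stored entry, if one exists.
        if block_addr in self.saved:
            self.current = dict(self.saved[block_addr])
```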
Abstract:
An apparatus and method are described, the apparatus comprising processing circuitry to perform data processing operations, microarchitecture circuitry used by the processing circuitry during performance of the data processing operations, and an interface to receive interrupt requests. The processing circuitry is responsive to a received interrupt request to perform an interrupt service routine, and the apparatus comprises prediction circuitry to determine a predicted time of reception of a next interrupt of at least one given type. The apparatus also comprises microarchitecture control circuitry arranged to vary a configuration of the microarchitecture circuitry between a performance based configuration and a responsiveness based configuration in dependence on the predicted time, so as to seek to increase the responsiveness of the apparatus to interrupts as the predicted time is approached.
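One way to picture the mechanism is a moving-average estimate of the next interrupt's arrival time, with a guard window before it during which the microarchitecture is reconfigured for responsiveness. Both the predictor and the guard window are assumptions, not details from the abstract.

```python
# Sketch: switch microarchitecture configuration as the predicted arrival
# of the next interrupt approaches. Predictor and guard window are assumed.

class InterruptAwareConfig:
    def __init__(self, guard_window=1000):
        self.intervals = []               # observed gaps between interrupts (cycles)
        self.last = None
        self.guard_window = guard_window  # cycles before the predicted arrival

    def on_interrupt(self, now):
        if self.last is not None:
            self.intervals.append(now - self.last)
        self.last = now

    def configuration(self, now):
        if not self.intervals:
            return "performance"
        predicted = self.last + sum(self.intervals) / len(self.intervals)
        near = now >= predicted - self.guard_window
        return "responsiveness" if near else "performance"
```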
Abstract:
A processing pipeline may have first and second execution circuits having different performance or energy consumption characteristics. Instruction supply circuitry may support different instruction supply schemes with different energy consumption or performance characteristics. This can allow a further trade-off between performance and energy efficiency. Architectural state storage can be shared between the execute units to reduce the overhead of switching between the units. In a parallel execution mode, groups of instructions can be executed on both execute units in parallel.
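The shared-state idea can be sketched as below, where both execute units operate on a single register file so that switching between them transfers no state. The add-only behaviour is purely illustrative.

```python
# Illustration of shared architectural state between two execute units.

class SharedStatePipeline:
    def __init__(self):
        self.regs = [0] * 32   # architectural registers shared by both units

    def execute_add(self, dst, lhs, rhs, unit="big"):
        # Identical architectural effect whichever unit runs the instruction;
        # the units differ only in performance/energy characteristics, so a
        # switch between them requires no state transfer.
        self.regs[dst] = self.regs[lhs] + self.regs[rhs]
```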
Abstract:
An apparatus (2) has a processing pipeline (4) supporting at least a first processing mode and a second processing mode with different energy consumption or performance characteristics. A storage structure (22, 30, 36, 50, 40, 64, 44) is accessible in both the first and second processing modes. When the second processing mode is selected, control circuitry (70) triggers a subset (102) of the entries of the storage structure to be placed in a power saving state.
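A behavioural sketch of the mode-dependent power saving follows, assuming (purely for illustration) that the second mode places the upper half of the entries in the power saving state and confines lookups to the remainder.

```python
# Sketch: in the second processing mode only a subset of entries stays
# powered. The "upper half powered down" policy is an assumption.

class ModalStorage:
    def __init__(self, num_entries):
        self.entries = [None] * num_entries
        self.active = num_entries            # first mode: all entries usable

    def set_mode(self, second_mode):
        half = len(self.entries) // 2
        self.active = half if second_mode else len(self.entries)

    def index_for(self, key):
        # Lookups are confined to the powered subset of entries.
        return hash(key) % self.active
```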
Abstract:
A data processing apparatus and method are provided for handling retrieval of instructions from an instruction cache. Fetch circuitry retrieves instructions from the instruction cache into a temporary buffer, and execution circuitry executes a sequence of instructions retrieved from the temporary buffer, that sequence including branch instructions. Branch prediction circuitry is configured to predict, for each identified branch instruction in the sequence, whether that branch instruction will result in a taken branch when it is subsequently executed by the execution circuitry. In a normal operating mode, the fetch circuitry retrieves one or more speculative instructions from the instruction cache between the time a source branch instruction is retrieved from the instruction cache and the time the branch prediction circuitry predicts whether that source branch instruction will result in a taken branch. If the source branch instruction is predicted as taken, the one or more speculative instructions are discarded. When a source branch instruction is predicted as taken, throttle prediction circuitry maintains a count value indicative of the number of instructions appearing in the sequence between that source branch instruction and a subsequent branch instruction in the sequence that is also predicted as taken. Responsive to a subsequent occurrence of the source branch instruction that is predicted as taken, the throttle prediction circuitry operates the fetch circuitry in a throttled mode in which the number of instructions subsequently retrieved by the fetch circuitry from the instruction cache is limited dependent on the count value, after which the fetch circuitry is prevented from retrieving any further instructions from the instruction cache for a predetermined number of clock cycles. This serves to reduce the power consumed in accessing the instruction cache to retrieve speculative instructions which later need to be discarded.
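The throttle predictor might be modelled as below: it learns the number of instructions between one taken branch and the next, then caps fetch to that count on a repeat occurrence and pauses for a fixed number of cycles. The table layout and the stall length are assumed details.

```python
# Sketch of throttle prediction for fetch from the instruction cache.

class ThrottlePredictor:
    def __init__(self, stall_cycles=4):
        self.counts = {}                 # taken-branch PC -> instructions to next taken branch
        self.stall_cycles = stall_cycles

    def train(self, prev_taken_pc, insns_between):
        """Record how many instructions separated two taken branches."""
        if prev_taken_pc is not None:
            self.counts[prev_taken_pc] = insns_between

    def fetch_limit(self, pc):
        """Return (max instructions, pause cycles), or None for normal mode."""
        if pc in self.counts:
            return self.counts[pc], self.stall_cycles
        return None
```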
Abstract:
A processor has a processing pipeline with first, second and third stages. An instruction at the first stage takes fewer cycles to reach the second stage than the third stage. The second and third stages each have a duplicated processing resource. For a pending instruction which requires the duplicated resource and can be processed using the duplicated resource at either of the second and third stages, the first stage determines whether a required operand would be available when the pending instruction would reach the second stage. If the operand would be available, then the pending instruction is processed using the duplicated resource at the second stage, while if the operand would not be available in time then the instruction is processed using the duplicated resource at the third stage. This technique helps to reduce delays caused by data dependency hazards.
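The steering decision reduces to comparing operand readiness against the time to reach the earlier stage; the cycle counts in the sketch below are illustrative.

```python
# The steering decision: use the duplicated resource at the earlier stage
# only if the operand will be ready by the time the instruction gets there.

CYCLES_TO_SECOND_STAGE = 1
CYCLES_TO_THIRD_STAGE = 3

def choose_stage(operand_ready_in):
    """Pick which stage's copy of the duplicated resource to use."""
    if operand_ready_in <= CYCLES_TO_SECOND_STAGE:
        return "second"   # operand arrives in time: take the early copy
    return "third"        # later copy avoids stalling on the dependency
```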
Abstract:
Coherency control circuitry (10) supports processing of a safe-speculative-read transaction received from a requesting master device (4). The safe-speculative-read transaction is of a type requesting that target data is returned to a requesting cache (11) of the requesting master device (4) while prohibiting any change in coherency state associated with the target data in other caches (12) in response to the safe-speculative-read transaction. In response, at least when the target data is cached in a second cache associated with a second master device, at least one of the coherency control circuitry (10) and the second cache (12) is configured to return a safe-speculative-read response while maintaining the target data in the same coherency state within the second cache. This helps to mitigate speculative side-channel attacks.
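A simplified model of the transaction follows, assuming MESI-style line states purely for illustration: data is returned to the requester while the holder's coherency state is deliberately left unchanged, so a mis-speculated access leaves no observable footprint in peer caches.

```python
# Simplified model of a safe-speculative-read. MESI-style states assumed.

from dataclasses import dataclass, field

@dataclass
class Cache:
    owner: str
    lines: dict = field(default_factory=dict)   # addr -> (state, data)

def safe_speculative_read(addr, requester, caches):
    """Return target data without changing any peer cache's coherency state."""
    for cache in caches:
        if cache.owner != requester and addr in cache.lines:
            state, data = cache.lines[addr]
            # An ordinary read might downgrade e.g. Modified -> Shared here;
            # the safe-speculative-read leaves `state` untouched.
            return data
    return None  # miss in all peer caches: fall back to memory
```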