Abstract:
Data processing apparatus comprises vector processing circuitry to apply a vector processing instruction to data vectors having a data vector length, each data vector comprising a plurality of data items equal in number to the data vector length, the vector processing circuitry having circuitry defining a plurality of processing lanes, there being at least as many processing lanes as a maximum data vector length; and control circuitry to selectively vary the data vector length used by the vector processing circuitry amongst a plurality of possible data vector length values up to the maximum data vector length and to disable operation of a subset of the processing lanes so that the disabled subset of processing lanes are unavailable for use by the vector processing circuitry and there remain at least as many enabled processing lanes as the data vector length set by the control circuitry.
Abstract:
Analytics processing circuitry can include a data scavenger and a data analyzer coupled to receive the data from the data scavenger. The data scavenger collects data from at least one element of interest of a plurality of elements of interest of an IC. The data analyzer identifies patterns in the data from the data scavenger over a time frame or for a snapshot of time based on a predefined metric. The analytics processing circuitry can further include a moderator and a risk predictor. The risk predictor generates a risk assessment regarding whether the data collected by the data scavenger is indicative of normal behavior or abnormal behavior based at least on the output of the data analyzer and a behavioral model for the IC, which can be device and application specific. A threat response can be performed based on the risk assessment.
Abstract:
A data processing apparatus and method are provided for executing a vector scan instruction. The data processing apparatus comprises a vector register store configured to store vector operands, and processing circuitry configured to perform operations on vector operands retrieved from said vector register store. Further, control circuitry is configured to control the processing circuitry to perform the operations required by one or more instructions, said one or more instructions including a vector scan instruction specifying a vector operand comprising N vector elements and defining a scan operation to be performed on a sequence of vector elements within the vector operand. The control circuitry is responsive to the vector scan instruction to partition the N vector elements of the specified vector operand into P groups of adjacent vector elements, where P is between 2 and N/2, and to control the processing circuitry to perform a partitioned scan operation yielding the same result as the defined scan operation. The processing circuitry is configured to perform the partitioned scan operation by performing separate scan operations on those vector elements of the sequence contained within each group to produce intermediate results for each group, and to perform a computation operation to combine the intermediate results into a final result vector operand containing a sequence of result vector elements. The partitioned scan operation approach of the present invention enables a balance to be achieved between energy consumption and performance.
Abstract:
Apparatuses and method are disclosed for protecting the integrity of data stored in a protected area of memory. Data in the protected area of memory is retrieved in data blocks and an authentication code is associated with a memory granule contiguously comprising a first data block and a second data block. Calculation of the authentication code comprises a cryptographic calculation based on a first hash value determined from the first data block and a second hash value determined from the second data block. A hash value cache is provided to store hash values determined from data blocks retrieved from the protected area of the memory. When the first data block and its associated authentication code are retrieved from memory, a lookup for the second hash value in the hash value cache is performed, and a verification authentication code is calculated for the memory granule to which that data block belongs. The integrity of the first data block is contingent on the verification authentication code matching the retrieved authentication code.
Abstract:
A data processing apparatus (2) has scalar processing circuitry (32-42) and vector processing circuitry (38, 40, 42). When executing main scalar processing on the scalar processing circuitry (32-42), or main vector processing using a subset of said plurality of lanes on the vector processing circuitry (38, 40, 42), checker processing is executed using at least one lane of the plurality of lanes on the vector processing circuitry (38, 40, 42), the checker processing comprising operations corresponding to at least part of the main scalar/vector processing. Errors can then be detected based on a comparison of an outcome of the main processing and an outcome of the checker processing. This provides a technique for achieving functional safety in a high end processor with better performance and reduced hardware cost compared to a dual/triple core lockstep approach.
Abstract:
An apparatus and method are provided for transferring a plurality of data structures from memory into one or more vectors of data elements stored in a register bank. The apparatus has first interface circuitry to receive data structures retrieved from memory, where each data structure has an associated identifier and comprises N data elements. Multi-axial buffer circuitry is provided having an array of storage elements, where along a first axis the array is organised as N sets of storage elements, each set containing a plurality VL of storage elements, and where along a second axis the array is organised as groups of N storage elements, with each group containing a storage element from each of the N sets. Access control circuitry then stores the N data elements of a received data structure in one of the groups selected in dependence on the associated identifier. Responsive to an indication that all required data structures have been stored in the multi-axial buffer circuitry, second interface circuitry then outputs the data elements stored in one or more of the sets of storage elements as one or more corresponding vectors of data elements for storage in a register bank, each vector containing VL data elements. Such an approach can significantly increase the performance of handling such load operations, and give rise to potential energy savings.
Abstract:
A masked-vector-comparison instruction specifies a source vector operand comprising a plurality of source data elements, a mask value, and a comparison target operand. In response to the masked-vector-comparison instruction, an instruction decoder 10 controls processing circuitry 16 to: for each active source data element of the source vector operand, determine whether the active source data element satisfies a comparison condition, based on a masked comparison between one or more compared bits of the active source data element and one or more compared bits of the comparison target operand, the mask value specifying a pattern of compared bits and non-compared bits within the comparison target operand and the active source data element; and generate a result value indicative of which of the source data elements of the source vector operand, if any, is an active source data element satisfying the comparison condition. This instruction is useful for variable length decoding operations.
Abstract:
An apparatus comprising data processing circuitry for processing data in one of a plurality of operating states, an instruction decoder for decoding instructions and error checking circuitry for performing error checking operations. In response to a touch instruction being decoded by the instruction decoder, error checking operation is performed on selected architectural state. The architectural state is architecturally inaccessible to the operating state. As a result of the touch instruction, the architectural state remains unchanged, at least when no error is detected.
Abstract:
A data processing apparatus comprises branch prediction circuitry adapted to store at least one branch prediction state entry in relation to a stream of instructions, input circuitry to receive at least one input to generate a new branch prediction state entry, wherein the at least one input comprises a plurality of bits; and coding circuitry adapted to perform an encoding operation to encode at least some of the plurality of bits based on a value associated with a current execution environment in which the stream of instructions is being executed. This guards against potential attacks which exploit the ability for branch prediction entries trained by one execution environment to be used by another execution environment as a basis for branch predictions.
Abstract:
A method of operation concealment for a cryptographic system includes randomly selecting which one of at least two cryptographic operation blocks receives a key to apply a valid operation to data and outputs a result that is used for subsequent operations. Noise can be added by operating the other of the at least two cryptographic operation blocks using a modified key. The modified key can be generated by mixing the key with a block-unique-identifier, a device secret, a slowly adjusting output of a counter, or a combination thereof. In some cases, noise can be added to a cryptographic system by transforming input data of the other cryptographic operation block(s) by mixing the input data with the block-unique-identifier, device secret, counter output, or a combination thereof. A cryptographic system with operation concealment can further include a distributed (across a chip) or interweaved arrangement of subblocks of the cryptographic operation blocks.