Abstract:
A data processing apparatus includes first vector registers and second vector registers, both dynamically spatially and dynamically temporally dividable. Decode circuitry receives one or more matrix multiplication instructions that indicate a set of first elements in the first vector registers and a set of second elements in the second vector registers, and in response to receiving the matrix multiplication instructions they generate a matrix multiplication operation. The matrix multiplication operation causes one or more execution units to perform a matrix multiplication of the set of first elements by the set of second elements and an average bit width of the first elements is different to an average bit width of the second elements.
Abstract:
A control table (22) defines information for controlling a processing component (20) to perform an operation. The table (22) comprises entries each corresponding to avariable size region defined by a first limit address and one of a second limit address and size. A binary search procedure is provided for looking up the table, comprising a number of search window narrowing steps, each narrowing a current search window of candidate entries to a narrower search window comprising fewer entries, based on a comparison of a query address against the first limit address of a selected candidate entry of the current search window. The comparison is independent of the second limit address or size of the selected candidate entry. After the search window is narrowed to a single entry, the query address is compared with the second limit address or size of that single entry.
Abstract:
An apparatus comprises: processing circuitry to perform data processing in one of a plurality of security domains including at least a secure domain and a less secure domain, and memory access checking circuitry to check whether a memory access is allowed depending on security attribute data indicating which domain is associated with a target address. In response to a given change of program flow from processing in the less secure domain to a target instruction having an address associated with the secure domain: a fault is triggered when the target instruction is an instruction other than a gateway instruction indicating a valid entry point to the secure domain. When the target instruction is said gateway instruction, a stack pointer verifying action is triggered to verify whether it is safe to use a selected stack pointer stored in a selected stack pointer register.
Abstract:
An apparatus comprises processing circuitry (4) and an instruction decoder (6) which supports vector instructions for which multiple lanes of processing are performed on respective data elements of a vector value. In response to a vector predication instruction, the instruction decoder (6) controls the processing circuitry (4) to set control information based on the outcome of a number of element comparison operations each for determining whether a corresponding element passes or fails a test condition. The control information controls processing of a predetermined number of subsequent vector instructions after the vector predication instruction. The predetermined number is hard-wired or identified by the vector predication instruction. For one of the subsequent vector instructions, an operation for a given portion of a given lane of vector processing is masked based on the outcome indicated by the control information for a corresponding data element.
Abstract:
A processing apparatus 2 has a secure domain 90 and a less secure domain 80. Security protection hardware 40 performs security checking operations when the processing circuitry 2 calls between domains. A data store 6 stores several software libraries 100 and library management software 110. The library management software 110 selects at least one of the libraries 100 as an active library which is executable by the processing circuitry 4 and at least one other library 100 as inactive libraries which are not executable. In response to an access to an inactive library, the library management software 110 switches which library is active.
Abstract:
Processing circuitry can operate in a secure domain and a less secure domain. In response to an initial exception from background processing performed by the processing circuitry, state saving of data from a first subset of registers is performed by exception control circuitry before triggering an exception handling routine, while the exception handling routine has responsibility for performing state saving of data from a second subset of registers. In response to a first exception causing a transition from the secure domain from a less secure domain, where the background processing was in the less secure domain, the exception control circuitry performs additional state saving of data from the second set of registers before triggering the exception handling routine. In response to a tail-chained exception causing a transition from the secure domain to the less secure domain, the exception handling routine is triggered without performing an additional state saving.
Abstract:
First and second forms of a complex multiply instruction are provided for operating on first and second operand vectors comprising multiple data elements including at least one real data element for representing the real part of a complex number and at least one imaginary element for representing an imaginary part of the complex number. One of the first and second forms of the instruction targets at least one real element of the destination vector and the other targets at least one imaginary element. By executing one of each instruction, complex multiplications of the form (a+ib)*(c+id) can be calculated using relatively few instructions and with only two vector register read ports, enabling DSP algorithms such as FFTs to be calculated more efficiently using relatively low power hardware implementations.
Abstract:
An apparatus comprises processing circuitry, a number of vector register and a number of scalar registers. An instruction decoder is provided which supports decoding of a vector multiply-add instruction specifying at least one vector register and at least one scalar register. In response to the vector multiply-add instruction, the decoder controls the processing circuitry to perform a vector multiply-add instruction in which each lane of processing generates a respective result data element corresponding to a sum of difference of a product value and an addend value, with the product value comprising the product of a respective data element of a first vector value and a multiplier value. In each lane of processing at least one of the multiplier value and the addend value is specified as a portion of a scalar value stored in a scalar register.
Abstract:
Processing circuitry supports overlapped execution of vector instructions when at least one beat of a first vector instruction is performed in parallel with at least one beat of a second vector instruction. The processing circuitry also supports mixed-scalar-vector instructions for which one of a destination register and one or more source registers is a vector register and another is a scalar register. In a sequence including first and subsequent mixed-scalar-vector instructions, instances of relaxed execution which can potentially lead to uncertain and incorrect results are permitted by the processing circuitry when the instructions are separated by fewer than a predetermined number of intervening instructions. In practice the situations which lead to the uncertain results are very rare and so it is not justified providing relatively expensive dependency checking circuitry for eliminating such cases.
Abstract:
Processing circuitry can operate in a secure domain and a less secure domain. In response to an initial exception from background processing performed by the processing circuitry, state saving of data from a first subset of registers is performed by exception control circuitry before triggering an exception handling routine, while the exception handling routine has responsibility for performing state saving of data from a second subset of registers. In response to a first exception causing a transition from the secure domain from a less secure domain, where the background processing was in the less secure domain, the exception control circuitry performs additional state saving of data from the second set of registers before triggering the exception handling routine. In response to a tail-chained exception causing a transition from the secure domain to the less secure domain, the exception handling routine is triggered without performing an additional state saving.