A DATA PROCESSING APPARATUS AND METHOD FOR HANDLING STALLED DATA

    公开(公告)号:US20240296132A1

    公开(公告)日:2024-09-05

    申请号:US18574277

    申请日:2022-06-21

    Applicant: Arm Limited

    CPC classification number: G06F13/1673 G06F13/161 G06F13/26

    Abstract: There is provided a data processing apparatus and method. The data processing apparatus comprises a plurality of processing elements connected via a network arranged on a single chip to form a spatial architecture. Each processing element comprising processing circuitry to perform processing operations and memory control circuitry to perform data transfer operations and to issue data transfer requests for requested data to the network. The memory control circuitry is configured to monitor the network to retrieve the requested data from the network. Each processing element is further provided with local storage circuitry comprising a plurality of local storage sectors to store data associated with the processing operations, and auxiliary memory control circuitry to monitor the network to detect stalled data (S60). The auxiliary memory control circuitry is configured to transfer the stalled data from the network to an auxiliary storage buffer (S66) dynamically selected from amongst the plurality of local storage sectors (S64).

    Input channel processing for triggered-instruction processing element

    公开(公告)号:US12045622B2

    公开(公告)日:2024-07-23

    申请号:US17941404

    申请日:2022-09-09

    Applicant: Arm Limited

    CPC classification number: G06F9/3856 G06F9/3802

    Abstract: One or more triggered-instruction processing elements are provided, a given triggered-instruction processing element comprising execution circuitry to execute processing operations in response to instructions according to a triggered instruction architecture. Input channel processing circuitry receives a given tagged data item (comprising a data value and a tag value) for a given input channel, and in response controls enqueuing of the data value of the given tagged data item to a selected buffer structure selected from among at least two buffer structures mapped onto register storage accessible to one or more of the triggered-instruction processing elements in response to a computation instruction for controlling performance of a computation operation. The selected buffer structure is selected based at least on the tag value, so data values of tagged data items specifying different tag values for the given input channel are allocatable to different buffer structures.

    Vector interleaving in a data processing apparatus

    公开(公告)号:US11093243B2

    公开(公告)日:2021-08-17

    申请号:US16630622

    申请日:2018-07-02

    Applicant: ARM LIMITED

    Abstract: Vector interleaving techniques in a data processing apparatus are disclosed, comprising apparatuses, instructions, methods of operating the apparatuses, and simulator implementations. A vector interleaving instruction specifies a first source register, second source register, and destination register. A first set of input data items is retrieved from the first source register and a second set of input data items from the second source register. A data processing operation is performed on selected input data item pairs taken from the first and second set of input data items to generate a set of result data items, which are stored as a result data vector in the destination register. First source register dependent result data items are stored in a first set of alternating positions in the destination data vector and second source register dependent result data items are stored in a second set of alternating positions in the destination data vector.

    Data processing
    4.
    发明授权

    公开(公告)号:US10445093B2

    公开(公告)日:2019-10-15

    申请号:US15371670

    申请日:2016-12-07

    Applicant: ARM Limited

    Abstract: Data processing apparatus comprises vector processing circuitry to apply a vector processing instruction to data vectors having a data vector length, each data vector comprising a plurality of data items equal in number to the data vector length, the vector processing circuitry having circuitry defining a plurality of processing lanes, there being at least as many processing lanes as a maximum data vector length; and control circuitry to selectively vary the data vector length used by the vector processing circuitry amongst a plurality of possible data vector length values up to the maximum data vector length and to disable operation of a subset of the processing lanes so that the disabled subset of processing lanes are unavailable for use by the vector processing circuitry and there remain at least as many enabled processing lanes as the data vector length set by the control circuitry.

    Element size increasing instruction

    公开(公告)号:US09965275B2

    公开(公告)日:2018-05-08

    申请号:US14814582

    申请日:2015-07-31

    Applicant: ARM LIMITED

    Abstract: An apparatus comprises processing circuitry to generate a result vector including at least one N-bit data element in response to an element size increasing instruction identifying at least a first input vector including M-bit data elements, where N>M. First and second forms of the element size increasing instruction are provided for generating the result vector using first and second subsets of data elements of the first input vector respectively. Positions of the first and second subsets of data elements in the first input vector are interleaved.

    Performance monitoring information informed register renaming

    公开(公告)号:US12182575B2

    公开(公告)日:2024-12-31

    申请号:US18079308

    申请日:2022-12-12

    Applicant: Arm Limited

    Inventor: Mbou Eyole

    Abstract: A data processing apparatus comprises: a physical register array, prediction circuitry, register rename circuitry, and hardware execution circuitry. The physical register array comprises a plurality of sectors having one or more different access properties, each of the plurality of sectors having one or more different access properties compared to other sectors of the plurality of sectors, each sector of the plurality of sectors comprising at least one physical register. The prediction circuitry to predict, for a given instruction, a sector identifier identifying one of the sectors of the physical register array to be used for a destination register of the given instruction. The prediction circuitry is configured to select the sector identifier in dependence on prediction information learnt from performance monitoring information indicative of performance achieved for a sequence of instructions when using different sector identifiers for the given instruction. The register rename circuitry to map a destination architectural register identifier specified by the given instruction to a destination physical register in the sector identified by the sector identifier predicted by the prediction circuitry. The execution circuitry to execute the given instruction and generate a result to be written to the destination physical register mapped to the destination architectural register identifier by the register rename circuitry.

    CIRCUITRY AND METHOD FOR INSTRUCTION EXECUTION IN DEPENDENCE UPON TRIGGER CONDITIONS

    公开(公告)号:US20240220269A1

    公开(公告)日:2024-07-04

    申请号:US18261966

    申请日:2022-01-19

    Applicant: Arm Limited

    CPC classification number: G06F9/3853

    Abstract: Circuitry comprises processing circuitry configured to execute program instructions in dependence upon respective trigger conditions matching a current trigger state and to set a next trigger state in response to program instruction execution; the processing circuitry comprising: instruction storage configured to selectively provide a group of two or more program instructions for execution in parallel; and trigger circuitry responsive to the generation of a trigger state by execution of program instructions and to a trigger condition associated with a given group of program instructions, to control the instruction storage to provide program instructions of the given group of program instructions for execution.

    Replicate partition instruction
    8.
    发明授权

    公开(公告)号:US11947962B2

    公开(公告)日:2024-04-02

    申请号:US16468098

    申请日:2017-11-10

    Applicant: ARM LIMITED

    CPC classification number: G06F9/30032 G06F9/30018 G06F9/30036 G06F9/30109

    Abstract: In response to a replicate partition instruction specifying partition information defining positions of a plurality of variable size partitions within a result vector, an instruction decoder (20) controls the processing circuitry (80) to generate a result vector in which each partition having more than one data element comprises data values or element indices of a sequence of data elements of a source vector starting or ending at a selected data element position. This instruction can be useful for accelerating processing of data structures smaller than the vector length.

Patent Agency Ranking