REGISTER RENAMING
    1.
    Invention publication
    Status: under examination (published)

    Publication number: US20240192959A1

    Publication date: 2024-06-13

    Application number: US18079308

    Filing date: 2022-12-12

    Applicant: Arm Limited

    Inventor: Mbou EYOLE

    CPC classification number: G06F9/384 G06F9/3844

    Abstract: A data processing apparatus comprises: a physical register array comprising a plurality of sectors having one or more different access properties, each sector of the plurality of sectors comprising at least one physical register; prediction circuitry to predict, for a given instruction, a sector identifier identifying one of the sectors of the physical register array to be used for a destination register of the given instruction, wherein the prediction circuitry is configured to select the sector identifier in dependence on prediction information learnt from performance monitoring information indicative of performance achieved for a sequence of instructions when using different sector identifiers for the given instruction; register rename circuitry to map a destination architectural register identifier specified by the given instruction to a destination physical register in the sector identified by the sector identifier predicted by the prediction circuitry; and execution circuitry to execute the given instruction and generate a result to be written to the destination physical register mapped to the destination architectural register identifier by the register rename circuitry.
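
The flow in this abstract (predict a sector, then rename the destination register into that sector) can be pictured with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the classes `SectorPredictor` and `Renamer`, the per-instruction-address score table, and the idea that performance monitoring feeds back a single numeric score per (instruction, sector) pair.

```python
from collections import defaultdict


class SectorPredictor:
    """Toy predictor: per instruction address, prefer the best-scoring sector."""

    def __init__(self, num_sectors):
        self.num_sectors = num_sectors
        # scores[pc][sector] accumulates a performance score (higher is better).
        self.scores = defaultdict(lambda: [0.0] * num_sectors)

    def train(self, pc, sector, perf_score):
        # Assumed feedback path: performance observed when `sector` was used.
        self.scores[pc][sector] += perf_score

    def predict(self, pc):
        row = self.scores[pc]
        # Ties (including untrained instructions) resolve to the lowest sector id.
        return max(range(self.num_sectors), key=lambda s: row[s])


class Renamer:
    """Toy rename stage with one free list of physical registers per sector."""

    def __init__(self, sector_sizes):
        self.free_lists = []
        next_id = 0
        for size in sector_sizes:
            self.free_lists.append(list(range(next_id, next_id + size)))
            next_id += size
        self.rename_map = {}  # architectural register -> physical register

    def rename_dest(self, arch_reg, sector):
        # Raises IndexError if the sector has no free register; a real design
        # would need a fallback policy, which the abstract does not describe.
        phys = self.free_lists[sector].pop(0)
        self.rename_map[arch_reg] = phys
        return phys
```

A caller would combine the two as `renamer.rename_dest(dest, predictor.predict(pc))`; how the prediction information is actually learnt and stored is left open by the abstract.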

    STATISTICAL MODE DETERMINATION
    2.
    Invention application

    Publication number: US20200371806A1

    Publication date: 2020-11-26

    Application number: US16417840

    Filing date: 2019-05-21

    Applicant: Arm Limited

    Abstract: Apparatuses, methods of operating apparatuses, and corresponding computer programs are disclosed. In the apparatuses input circuitry receives input data comprising at least one data element and shift circuitry generates, for each data element of the input data, a bit-map giving a one-hot encoding representation of the data element, wherein a position of a set bit in the bit-map is dependent on the data element. Summation circuitry generates a position summation value for each position in the bit-map, wherein each position summation value is a sum across all bit-maps generated by the shift circuitry from the input data. Maximum identification circuitry determines at least one largest position summation value generated by the summation circuitry, and output circuitry generates an indication of at least one data element corresponding to the at least one largest position summation value. The statistical mode of the data elements in the input data is thereby efficiently determined.
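
The four stages named in this abstract (shift, summation, maximum identification, output) map directly onto a few lines of software. The sketch below is only a software analogy of the described circuitry; the function name `statistical_modes` and the `width` parameter (the number of bit-map positions) are assumptions made for illustration.

```python
def statistical_modes(elements, width):
    """Return every value with the highest count; values must lie in range(width)."""
    # Shift stage: one-hot bit-map per element, set bit position = element value.
    bitmaps = [1 << e for e in elements]
    # Summation stage: for each bit position, sum that bit across all bit-maps.
    sums = [sum((bm >> pos) & 1 for bm in bitmaps) for pos in range(width)]
    # Maximum identification stage: find the largest position summation value.
    largest = max(sums)
    # Output stage: report every position (data element value) reaching it.
    return [pos for pos, s in enumerate(sums) if s == largest]
```

Returning a list rather than a single value reflects the "at least one" wording: when two values are equally frequent, both are reported.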

    AN APPARATUS AND METHOD FOR MANAGING ADDRESS COLLISIONS WHEN PERFORMING VECTOR OPERATIONS

    Publication number: US20190114172A1

    Publication date: 2019-04-18

    Application number: US16090357

    Filing date: 2017-04-06

    Applicant: ARM Limited

    Abstract: An apparatus and method are provided for managing address collisions when performing vector operations. The apparatus has a register store for storing vector operands, each vector operand comprising a plurality of elements, and execution circuitry for executing instructions in order to perform operations specified by the instructions. The execution circuitry has access circuitry for performing memory access operations in order to move the vector operands between the register store and memory, and processing circuitry for performing data processing operations using the vector operands. The execution circuitry may be arranged to iteratively execute a vector loop, where during each iteration the execution circuitry executes a sequence of instructions to implement the vector loop. The sequence includes a check instruction identifying a plurality of memory addresses, and the execution circuitry is responsive to execution of the check instruction to determine whether an address hazard condition exists amongst the plurality of memory addresses. For each iteration of the vector loop, the execution circuitry is responsive to execution of the check instruction determining an absence of the address hazard condition, to employ a default level of vectorisation when executing the sequence of instructions to implement the vector loop. In contrast, in the presence of the address hazard condition, the execution circuitry employs a reduced level of vectorisation when executing the sequence of instructions to implement the vector loop. Such an approach has been found to provide a low latency mechanism for dynamically adjusting the level of vectorisation employed during each iteration of the vector loop, enabling code to be vectorised whilst still enabling efficient performance in the presence of address hazard conditions.
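
A histogram-style update is a common case where such a check pays off, and the sketch below models it in software. Several things here are assumptions rather than details from the abstract: the hazard is taken to mean "the same address appears twice within one vector of addresses", the reduced level of vectorisation is modelled as a fully scalar fallback, and the names `has_address_hazard` and `histogram_update` are invented for illustration.

```python
def has_address_hazard(addresses):
    """Assumed hazard test: any address appearing more than once in the vector."""
    return len(set(addresses)) != len(addresses)


def histogram_update(memory, indices, vector_length):
    """Increment memory[i] for every index, one vector-sized chunk per iteration.

    Returns the level of vectorisation used for each iteration.
    """
    levels = []
    for start in range(0, len(indices), vector_length):
        chunk = indices[start:start + vector_length]
        if not has_address_hazard(chunk):
            # Default level: gather all lanes, add, then scatter all lanes.
            gathered = [memory[i] for i in chunk]
            for i, value in zip(chunk, gathered):
                memory[i] = value + 1
            levels.append("vector")
        else:
            # Reduced level: process lanes one at a time so repeated
            # addresses see each other's updates.
            for i in chunk:
                memory[i] += 1
            levels.append("scalar")
    return levels
```

Running the gather/add/scatter path on a chunk with repeated indices would lose increments, which is why the decision is made per iteration.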

    INPUT CHANNEL PROCESSING FOR TRIGGERED-INSTRUCTION PROCESSING ELEMENT

    Publication number: US20240086201A1

    Publication date: 2024-03-14

    Application number: US17941404

    Filing date: 2022-09-09

    Applicant: Arm Limited

    CPC classification number: G06F9/3855 G06F9/3802

    Abstract: One or more triggered-instruction processing elements are provided, a given triggered-instruction processing element comprising execution circuitry to execute processing operations in response to instructions according to a triggered instruction architecture. Input channel processing circuitry receives a given tagged data item (comprising a data value and a tag value) for a given input channel, and in response controls enqueuing of the data value of the given tagged data item to a selected buffer structure selected from among at least two buffer structures mapped onto register storage accessible to one or more of the triggered-instruction processing elements in response to a computation instruction for controlling performance of a computation operation. The selected buffer structure is selected based at least on the tag value, so data values of tagged data items specifying different tag values for the given input channel are allocatable to different buffer structures.
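
The tag-based steering described here can be sketched as a small dispatcher. This is an illustration only: the class `InputChannel`, the dictionary that maps tag values to buffer indices, and the use of unbounded queues in place of register-mapped buffer structures are all assumptions.

```python
from collections import deque


class InputChannel:
    """Toy input channel that steers data values to buffers by tag value."""

    def __init__(self, tag_to_buffer, num_buffers):
        # Hypothetical configuration: which buffer each tag value selects.
        self.tag_to_buffer = tag_to_buffer
        self.buffers = [deque() for _ in range(num_buffers)]

    def receive(self, data_value, tag_value):
        """Enqueue the data value into the buffer selected by the tag."""
        index = self.tag_to_buffer[tag_value]
        # Only the data value is stored; the tag is consumed by the selection.
        self.buffers[index].append(data_value)
        return index
```

Items with different tags arriving on the same channel end up in different buffers, so a consumer waiting on one tag is not blocked behind items carrying another.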

    DEVICES AND HEADSETS
    5.
    Invention application

    Publication number: US20210295464A1

    Publication date: 2021-09-23

    Application number: US16824040

    Filing date: 2020-03-19

    Applicant: Arm Limited

    Abstract: A device has a content processing component operable in first and second content processing states, a display, at least one sensor operable to output sensor data indicative of at least one eye positional characteristic of a user, and a processor. The processor is configured to process the data, and in the first processing state, determine a region of the display corresponding to a foveal region of an eye of a user, and perform foveated processing of content to be displayed on the display such that a relatively high-quality video content is generated for display in the region and a relatively low-quality video content is generated for display outside the region. The second processing state is entered in response to a trigger. In the second processing state, the foveated processing used is overridden such that relatively low-quality video content is generated for display in at least a portion of the region.

    ONLINE INSTRUCTION TAGGING
    6.
    Invention application

    Publication number: US20210117204A1

    Publication date: 2021-04-22

    Application number: US16658490

    Filing date: 2019-10-21

    Applicant: Arm Limited

    Abstract: Apparatuses and methods of data processing are disclosed for tagging instructions on-line. Instruction tag storage stores information indicative of a tag applied to certain instruction identifiers. A data processing operation performed by the data processing circuitry in response to an executed instruction is dependent on whether there is a corresponding instruction identifier for the executed instruction in the instruction tag storage which has the instruction tag. Register writer storage is maintained, and an entry is created for each register writing instruction encountered which causes a result value to be written to a destination register, where the entry comprises an indication of the destination register and the register writing instruction. An instruction tagging queue buffers instruction identifiers and an instruction identifier is added to the queue for a predetermined type of instruction when it is encountered. Instruction tagging circuitry tags the instructions in the instruction tagging queue and determines one or more producer instructions which each produce at least one data value which is a source operand of a subject instruction and adds the one or more producer instructions to the instruction tagging queue. Data dependency graphs are thus elaborated and online tagging of such data dependency graphs is thus supported.
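
The interplay between the register writer storage and the instruction tagging queue can be sketched as a backward walk over producers. This is a simplified software model under stated assumptions: the function `tag_producers`, the tuple layout of `program`, and the extra `sources` table (kept so a queued producer's own operands can be looked up) are hypothetical. The writer table records only the most recent writer of each register, so a producer's sources are resolved against the table as it stands when the queue is drained.

```python
from collections import deque


def tag_producers(program, is_seed):
    """Tag seed instructions and, transitively, the producers of their operands.

    program: iterable of (ident, dest_reg_or_None, source_regs) in program order.
    is_seed: predicate selecting the predetermined type of instruction.
    """
    writers = {}   # register writer storage: register -> last writing instruction
    sources = {}   # hypothetical helper: instruction -> its source registers
    tagged = set()
    queue = deque()  # instruction tagging queue
    for ident, dest, srcs in program:
        sources[ident] = srcs
        if is_seed(ident):
            queue.append(ident)
        while queue:
            current = queue.popleft()
            if current in tagged:
                continue
            tagged.add(current)
            # Add each producer of a source operand to the tagging queue.
            for reg in sources[current]:
                producer = writers.get(reg)
                if producer is not None and producer not in tagged:
                    queue.append(producer)
        # Record the writer only after the lookup, so an instruction that
        # overwrites one of its own sources does not shadow the real producer.
        if dest is not None:
            writers[dest] = ident
    return tagged
```

With a program of four instructions in which `i3` reads a value produced by `i1`, which in turn reads a value produced by `i0`, seeding on `i3` tags the chain `i3`, `i1`, `i0` and leaves an unrelated `i2` untagged.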

    VECTOR INTERLEAVING IN A DATA PROCESSING APPARATUS

    Publication number: US20210026629A1

    Publication date: 2021-01-28

    Application number: US16630622

    Filing date: 2018-07-02

    Applicant: ARM LIMITED

    Abstract: Vector interleaving techniques in a data processing apparatus are disclosed, comprising apparatuses, instructions, methods of operating the apparatuses, and simulator implementations. A vector interleaving instruction specifies a first source register, second source register, and destination register. A first set of input data items is retrieved from the first source register and a second set of input data items from the second source register. A data processing operation is performed on selected input data item pairs taken from the first and second set of input data items to generate a set of result data items, which are stored as a result data vector in the destination register. First source register dependent result data items are stored in a first set of alternating positions in the destination data vector and second source register dependent result data items are stored in a second set of alternating positions in the destination data vector.

    REPLICATE ELEMENTS INSTRUCTION
    8.
    Invention application

    Publication number: US20190303155A1

    Publication date: 2019-10-03

    Application number: US16468108

    Filing date: 2017-11-10

    Applicant: ARM LIMITED

    Abstract: A replicate elements instruction defining a plurality of variable length segments in a result vector controls processing circuitry to generate a result vector in which, in each respective segment, a repeating value is repeated throughout that segment of the result vector, the repeating value comprising a data value or element index of a selected data element of a source vector. This instruction is useful for accelerating processing of data structures smaller than the vector length.
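
A short sketch makes the segment behaviour concrete. The abstract does not say which element of the source vector is selected for each segment; the code below assumes it is the element at the segment's first position, and the function name and the `use_index` flag are likewise invented for illustration.

```python
def replicate_elements(source, segment_lengths, use_index=False):
    """Fill each segment of the result with one repeated value.

    Assumption: the selected element for a segment is the source element at the
    segment's first position. With use_index=True the element index is repeated
    instead of the data value. The segment lengths must fit within the source.
    """
    result = []
    start = 0
    for length in segment_lengths:
        value = start if use_index else source[start]
        result.extend([value] * length)
        start += length
    return result
```

For a source of `[7, 8, 9, 4, 5, 6]` and segment lengths `[2, 3, 1]`, the segments start at positions 0, 2 and 5, so the data-value form repeats 7, 9 and 6 while the index form repeats 0, 2 and 5.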

    APPARATUS AND METHOD FOR TRANSFERRING A PLURALITY OF DATA STRUCTURES BETWEEN MEMORY AND A PLURALITY OF VECTOR REGISTERS
    9.
    Invention application
    Status: granted

    Publication number: US20170031865A1

    Publication date: 2017-02-02

    Application number: US14814590

    Filing date: 2015-07-31

    Abstract: An apparatus and method are provided for transferring a plurality of data structures between memory and a plurality of vector registers, each vector register being arranged to store a vector operand comprising a plurality of data elements. Access circuitry is used to perform access operations to move data elements of vector operands between the data structures in memory and specified vector registers, each data structure comprising multiple data elements stored at contiguous addresses in the memory. Decode circuitry is responsive to a single access instruction identifying a plurality of vector registers and a plurality of data structures that are located discontiguously with respect to each other in the memory, to generate control signals to control the access circuitry to perform a sequence of access operations to move the plurality of data structures between the memory and the plurality of vector registers such that the vector operand in each vector register holds a corresponding data element from each of the plurality of data structures. This provides a very efficient mechanism for performing complex access operations, resulting in an increase in execution speed, and potential reductions in power consumption.
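
The load direction of this transfer amounts to gathering several scattered structures and de-interleaving them across registers. The sketch below models memory as a flat list and each vector register as a Python list; the function name `load_structures` and the use of one base address per structure are assumptions for illustration.

```python
def load_structures(memory, base_addresses, num_registers):
    """Load one structure per base address and de-interleave across registers.

    Each structure occupies num_registers contiguous elements starting at its
    base address; the structures themselves may sit anywhere in memory.
    Register r ends up holding element r of every structure.
    """
    registers = [[] for _ in range(num_registers)]
    for base in base_addresses:
        for r in range(num_registers):
            registers[r].append(memory[base + r])
    return registers
```

With three-element structures at bases 0, 8 and 4, register 0 collects the first field of each structure, register 1 the second, and so on, which is the layout a subsequent vector operation on one field would want.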


    APPARATUS AND METHOD FOR PERFORMING A SPLICE OPERATION

    Publication number: US20240354105A1

    Publication date: 2024-10-24

    Application number: US18762800

    Filing date: 2024-07-03

    Applicant: ARM LIMITED

    CPC classification number: G06F9/30032 G06F9/30018 G06F9/30036

    Abstract: An apparatus and a method are provided for performing a splice operation, the apparatus having a set of vector registers and one or more control registers. Processing circuitry is arranged to execute a sequence of instructions including a splice instruction that identifies at least a first vector register and at least one control register. The first vector register stores a first vector of data elements having a vector length, and the at least one control register stores control data identifying, independently of the vector length, one or more data elements occupying sequential data element positions within the first vector of data elements. The processing circuitry is responsive to execution of the splice instruction to extract from the first vector each data element identified by the control data in the at least one control register, and to output the extracted data elements within a result vector of data elements that also contains data elements from a second vector. Since the control data in the at least one control register identifies the data elements to be extracted without reference to the vector length, this provides a great deal of flexibility as to how the data elements to be extracted may be selected within the first vector.
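
A minimal sketch of the splice behaviour follows. Two points are assumptions rather than statements from the abstract: the control data is modelled as an inclusive `start` and `end` position pair, and the extracted elements are placed at the low end of the result with the remaining positions filled from the low end of the second vector.

```python
def splice(first, second, start, end):
    """Extract first[start..end] (inclusive) and pad the result from `second`.

    The control values start and end are plain positions, so the same values
    work whatever the vector length turns out to be.
    """
    vector_length = len(first)
    extracted = first[start:end + 1]
    # Fill the remaining result positions with leading elements of `second`.
    return extracted + second[:vector_length - len(extracted)]
```

For eight-element vectors, extracting positions 2 to 4 of the first vector yields three elements followed by the first five elements of the second vector, keeping the result at the full vector length.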
