AN APPARATUS AND METHOD FOR TRANSFERRING A PLURALITY OF DATA STRUCTURES BETWEEN MEMORY AND ONE OR MORE VECTORS OF DATA ELEMENTS STORED IN A REGISTER BANK
    1.
    Invention application
    Status: under examination (published)

    Publication number: WO2017021676A1

    Publication date: 2017-02-09

    Application number: PCT/GB2016/051769

    Filing date: 2016-06-15

    Applicant: ARM LIMITED

    CPC classification number: G06F9/30036 G06F9/30018 G06F9/30043 G06F9/3824

    Abstract: An apparatus and method are provided for transferring a plurality of data structures from memory into one or more vectors of data elements stored in a register bank. The apparatus has first interface circuitry to receive data structures retrieved from memory, where each data structure has an associated identifier and comprises N data elements. Multi-axial buffer circuitry is provided having an array of storage elements, where along a first axis the array is organised as N sets of storage elements, each set containing a plurality VL of storage elements, and where along a second axis the array is organised as groups of N storage elements, with each group containing a storage element from each of the N sets. Access control circuitry then stores the N data elements of a received data structure in one of the groups selected in dependence on the associated identifier. Responsive to an indication that all required data structures have been stored in the multi-axial buffer circuitry, second interface circuitry then outputs the data elements stored in one or more of the sets of storage elements as one or more corresponding vectors of data elements for storage in a register bank, each vector containing VL data elements. Such an approach can significantly increase the performance of handling such load operations, and give rise to potential energy savings.
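
The buffer organisation described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuitry: the class name `MultiAxialBuffer`, its methods, and the choice to treat the identifier as a plain index in the range 0 to VL-1 are all hypothetical. It shows how structures arriving in any order are written along one axis (a group of N elements) and read out along the other (N vectors of VL elements).

```python
from typing import List, Optional, Sequence


class MultiAxialBuffer:
    """Toy model of the array: N sets of VL storage elements.

    Set i is a row of VL elements; group g is the column made of
    element g from every set. All names here are illustrative.
    """

    def __init__(self, n: int, vl: int) -> None:
        self.n = n
        self.vl = vl
        self.sets: List[List[Optional[int]]] = [[None] * vl for _ in range(n)]
        self.filled = [False] * vl

    def store(self, identifier: int, structure: Sequence[int]) -> None:
        """Store one N-element structure in the group picked by its identifier."""
        if len(structure) != self.n:
            raise ValueError("structure must contain exactly N data elements")
        if not 0 <= identifier < self.vl:
            raise ValueError("identifier must select one of the VL groups")
        for i, element in enumerate(structure):
            self.sets[i][identifier] = element
        self.filled[identifier] = True

    def all_stored(self) -> bool:
        """Assumed completion condition: every group has been written."""
        return all(self.filled)

    def output_vectors(self) -> List[List[Optional[int]]]:
        """Read each set out as one vector of VL data elements."""
        if not self.all_stored():
            raise RuntimeError("not all required data structures are stored")
        return [list(row) for row in self.sets]


if __name__ == "__main__":
    buf = MultiAxialBuffer(n=3, vl=4)
    # Structures may return from memory out of order; the identifier
    # decides which group each one lands in.
    arrivals = [(2, (20, 21, 22)), (0, (0, 1, 2)), (3, (30, 31, 32)), (1, (10, 11, 12))]
    for identifier, structure in arrivals:
        buf.store(identifier, structure)
    print(buf.output_vectors())
    # prints [[0, 10, 20, 30], [1, 11, 21, 31], [2, 12, 22, 32]]
```

In this model the de-interleaving falls out of the two access directions, so no separate permute step is needed once the last structure has arrived.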


    A DATA PROCESSING APPARATUS AND METHOD FOR PERFORMING SEGMENTED OPERATIONS
    2.
    Invention application
    Status: under examination (published)

    Publication number: WO2015118299A1

    Publication date: 2015-08-13

    Application number: PCT/GB2015/050132

    Filing date: 2015-01-21

    Applicant: ARM LIMITED

    Abstract: A data processing apparatus and method are provided for performing segmented operations. The data processing apparatus comprises a vector register store for storing vector operands, and vector processing circuitry providing N lanes of parallel processing, and arranged to perform a segmented operation on up to N data elements provided by a specified vector operand, each data element being allocated to one of the N lanes. The up to N data elements form a plurality of segments, and performance of the segmented operation comprises performing a separate operation on the data elements of each segment, the separate operation involving interaction between the lanes containing the data elements of the associated segment. Predicate generation circuitry is responsive to a compute descriptor instruction specifying an input vector operand comprising a plurality of segment descriptors, to generate per-lane predicate information used by the vector processing circuitry when performing the segmented operation to maintain a boundary between each of the plurality of segments. As a result, interaction between lanes containing data elements from different segments is prevented. This allows very effective utilisation of the lanes of parallel processing within the vector processing circuitry.
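
A segmented operation of the kind summarised here can be sketched in software as follows. Several assumptions are made that the abstract does not confirm: each segment descriptor is taken to be a segment length, the per-lane predicate is modelled as a "segment starts here" flag, and the separate operation is an inclusive running sum. The function names `compute_descriptor` and `segmented_add_scan` are hypothetical, and the sequential loop is only a reference model of what the lanes would compute in parallel.

```python
from typing import List, Sequence


def compute_descriptor(segment_lengths: Sequence[int], n_lanes: int) -> List[bool]:
    """Build per-lane predicates from segment descriptors.

    Assumption: a descriptor is a positive segment length. The returned
    flag for a lane is True when that lane holds the first element of a
    segment, which marks the boundary that lane interaction must not cross.
    """
    if any(length <= 0 for length in segment_lengths):
        raise ValueError("segment lengths must be positive")
    if sum(segment_lengths) > n_lanes:
        raise ValueError("segments do not fit in the available lanes")
    starts = [False] * n_lanes
    lane = 0
    for length in segment_lengths:
        starts[lane] = True
        lane += length
    return starts


def segmented_add_scan(values: Sequence[int], starts: Sequence[bool]) -> List[int]:
    """Inclusive prefix sum that restarts at every segment boundary.

    Assumes the segment lengths used to build `starts` add up to
    len(values), so every element belongs to exactly one segment.
    """
    result: List[int] = []
    running = 0
    for value, is_start in zip(values, starts):
        # The predicate stops the running sum from leaking across segments.
        running = value if is_start else running + value
        result.append(running)
    return result


if __name__ == "__main__":
    predicates = compute_descriptor([3, 2, 3], n_lanes=8)
    print(segmented_add_scan([1, 2, 3, 4, 5, 6, 7, 8], predicates))
    # prints [1, 3, 6, 4, 9, 6, 13, 21]
```

Three segments of different lengths share one 8-lane vector in this example, which is the lane-utilisation benefit the abstract points to.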

