APPARATUS AND METHOD FOR MEMORY-HIERARCHY AWARE PRODUCER-CONSUMER INSTRUCTIONS
    1.
    Patent application
    Status: under examination, published

    Publication No.: US20140208031A1

    Publication date: 2014-07-24

    Application No.: US13994724

    Filing date: 2011-12-21

    IPC classification: G06F12/08 G06T1/60

    Abstract: An apparatus and method are described for efficiently transferring data from a producer core to a consumer core within a central processing unit (CPU). For example, one embodiment of a method for transferring a chunk of data from a producer core of a CPU to a consumer core of the CPU comprises: writing data to a fill buffer within the producer core of the CPU until a designated amount of data has been written; upon detecting that the designated amount of data has been written, responsively generating an eviction cycle, the eviction cycle causing the data to be transferred from the fill buffer to a cache accessible by both the producer core and the consumer core; and upon the consumer core detecting that data is available in the cache, providing the data to the consumer core from the cache upon receipt of a read signal from the consumer core.
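The flow in this abstract can be pictured with a small software model. This is only an illustrative sketch, not the patented hardware: the class names (`SharedCache`, `ProducerCore`, `ConsumerCore`), the `chunk_size` parameter, and the polling interface are all assumptions made for the example.

```python
from collections import deque


class SharedCache:
    """Hypothetical stand-in for a cache level visible to both cores."""

    def __init__(self):
        self.lines = deque()

    def fill(self, chunk):
        self.lines.append(chunk)

    def data_available(self):
        return bool(self.lines)

    def read(self):
        # A consumer read request is served from the shared cache.
        return self.lines.popleft()


class ProducerCore:
    def __init__(self, shared_cache, chunk_size):
        self.shared_cache = shared_cache
        self.chunk_size = chunk_size  # the "designated amount" of data
        self.fill_buffer = []

    def write(self, value):
        self.fill_buffer.append(value)
        if len(self.fill_buffer) == self.chunk_size:
            self._evict()

    def _evict(self):
        # Eviction cycle: move the completed chunk into the shared cache.
        self.shared_cache.fill(tuple(self.fill_buffer))
        self.fill_buffer.clear()


class ConsumerCore:
    def __init__(self, shared_cache):
        self.shared_cache = shared_cache

    def poll(self):
        # Issue a read only once data is detected in the shared cache.
        if self.shared_cache.data_available():
            return self.shared_cache.read()
        return None


cache = SharedCache()
producer = ProducerCore(cache, chunk_size=4)
consumer = ConsumerCore(cache)

first_poll = consumer.poll()          # nothing has been evicted yet
for value in range(6):
    producer.write(value)
chunk = consumer.poll()               # the first complete chunk
pending = list(producer.fill_buffer)  # partial chunk still held by the producer
```

In this model the consumer never sees a partially written chunk: data becomes visible only after the producer's buffer reaches the designated size and is evicted.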

    METHOD AND APPARATUS FOR EFFICIENT MATRIX ALIGNMENT IN A SYSTOLIC ARRAY

    Publication No.: US20190042262A1

    Publication date: 2019-02-07

    Application No.: US16147506

    Filing date: 2018-09-28

    IPC classification: G06F9/38 G06F15/80 G06F9/30

    Abstract: An apparatus and method for efficient matrix alignment in a systolic array are described. For example, one embodiment of a processor comprises: a first set of physical tile registers to store first matrix data in rows or columns; a second set of physical tile registers to store second matrix data in rows or columns; a decoder to decode a matrix instruction identifying a first input matrix, a first offset, a second input matrix, and a second offset; and execution circuitry, responsive to the matrix instruction, to read a subset of rows or columns from the first set of physical tile registers in accordance with the first offset, spanning multiple physical tile registers from the first set if indicated by the first offset, to generate a first input matrix; to read a subset of rows or columns from the second set of physical tile registers in accordance with the second offset, spanning multiple physical tile registers from the second set if indicated by the second offset, to generate a second input matrix; and to perform an arithmetic operation with the first and second input matrices in accordance with an opcode of the matrix instruction.
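The offset-based read described in this abstract can be sketched in plain Python. The tile height (`TILE_ROWS`), the helper names (`read_aligned`, `execute`), and the two opcodes shown are assumptions chosen for illustration; the abstract does not specify them. The sketch reads rows only, though the abstract also allows columns.

```python
TILE_ROWS = 4  # assumed number of rows held by one physical tile register


def read_aligned(tiles, offset, rows=TILE_ROWS):
    """Return `rows` consecutive rows starting `offset` rows into a tile set.

    When the offset is not a multiple of the tile height, the returned
    window spans two adjacent physical tiles.
    """
    stacked = [row for tile in tiles for row in tile]
    if offset < 0 or offset + rows > len(stacked):
        raise ValueError("offset runs past the end of the tile register set")
    return stacked[offset:offset + rows]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


OPCODES = {"matmul": matmul, "add": add}  # hypothetical opcode table


def execute(opcode, tiles_a, offset_a, tiles_b, offset_b):
    """Model of one matrix instruction: align both inputs, then operate."""
    a = read_aligned(tiles_a, offset_a)
    b = read_aligned(tiles_b, offset_b)
    return OPCODES[opcode](a, b)


# Two 4x4 tiles per set; row r of set A holds the value r in every column.
tiles_a = [[[r] * 4 for r in range(0, 4)], [[r] * 4 for r in range(4, 8)]]
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
tiles_b = [identity, identity]

window = read_aligned(tiles_a, 2)                    # rows 2..5, spanning both tiles
product = execute("matmul", tiles_a, 2, tiles_b, 4)  # window times the identity
total = execute("add", tiles_a, 0, tiles_a, 4)       # rows 0..3 plus rows 4..7
```

Here a single instruction supplies both offsets, so no separate data-movement step is needed to realign a matrix that straddles two tile registers.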

    GATHER CACHE ARCHITECTURE
    6.
    Patent application

    Publication No.: US20120254542A1

    Publication date: 2012-10-04

    Application No.: US13078380

    Filing date: 2011-04-01

    IPC classification: G06F12/08

    CPC classification: G06F12/0815 G06F12/0804

    Abstract: Apparatuses and methods to perform gather instructions are presented. In one embodiment, an apparatus comprises a gather logic module which includes a gather logic unit to identify locality of data elements in response to a gather instruction. The apparatus includes memory comprising a plurality of memory rows including a memory row associated with the gather instruction. The apparatus further includes a memory structure to store data element addresses accessed in response to the gather instruction.
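One plausible reading of "identify locality of data elements" is that elements falling in the same memory row can be served by a single row access. The sketch below models that reading only; the row size (`ROW_BYTES`), the function names, and the grouping strategy are assumptions and are not taken from the patent.

```python
ROW_BYTES = 64  # assumed size of one memory row


def group_by_row(addresses, row_bytes=ROW_BYTES):
    """Record which gather lanes touch which memory row."""
    rows = {}
    for lane, addr in enumerate(addresses):
        rows.setdefault(addr // row_bytes, []).append((lane, addr))
    return rows


def gather(memory, addresses, row_bytes=ROW_BYTES):
    """Gather one byte per address, reading each touched row only once."""
    result = [None] * len(addresses)
    row_reads = 0
    for row, elements in group_by_row(addresses, row_bytes).items():
        base = row * row_bytes
        row_data = memory[base:base + row_bytes]  # a single access per row
        row_reads += 1
        for lane, addr in elements:
            result[lane] = row_data[addr - base]
    return result, row_reads


memory = bytes(range(256))
addresses = [3, 70, 5, 64, 200, 7]
values, row_reads = gather(memory, addresses)
```

With these six addresses only three distinct rows are touched, so the model performs three row reads instead of six element reads.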

    Gather using index array and finite state machine
    7.
    Granted patent
    Status: in force

    Publication No.: US08972697B2

    Publication date: 2015-03-03

    Application No.: US13487184

    Filing date: 2012-06-02

    IPC classification: G06F12/02

    Abstract: Methods and apparatus are disclosed for using an index array and finite state machine for scatter/gather operations. Embodiments of the apparatus may comprise: decode logic to decode a scatter/gather instruction and generate a set of micro-operations, and an index array to hold a set of indices and a corresponding set of mask elements. A finite state machine facilitates the gather operation. Address generation logic generates an address from an index of the set of indices for at least each of the corresponding mask elements having a first value. An address is accessed to load a corresponding data element if the mask element has the first value. The data element is written at an in-register position in a destination vector register according to a respective in-register position of the index. Values of corresponding mask elements are changed from the first value to a second value responsive to completion of their respective loads.
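The mask-driven gather in this abstract can be sketched as a loop that loads only the elements whose mask is still set and clears each mask element once its load completes. The function name `gather_step`, the `max_loads` budget (used here to imitate an interrupted operation), and the choice of 1 and 0 for the first and second mask values are assumptions made for the example.

```python
def gather_step(memory, base, indices, mask, dest, max_loads=None):
    """Perform pending element loads, up to an optional budget.

    mask[i] == 1 means element i still has to be loaded (the "first value");
    it is set to 0 (the "second value") once its load has completed.
    Returns the number of loads performed in this step.
    """
    loads = 0
    for pos, (index, pending) in enumerate(zip(indices, mask)):
        if not pending:
            continue
        if max_loads is not None and loads == max_loads:
            break
        dest[pos] = memory[base + index]  # address generated from base and index
        mask[pos] = 0                     # load complete
        loads += 1
    return loads


memory = list(range(100, 200))
indices = [9, 0, 4, 7]
mask = [1, 0, 1, 1]
dest = [-1, -1, -1, -1]

gather_step(memory, 10, indices, mask, dest, max_loads=2)  # interrupted early
partial_dest, partial_mask = list(dest), list(mask)
gather_step(memory, 10, indices, mask, dest)               # resumes where it stopped
```

Because the mask records which loads have completed, the second call performs only the remaining load and leaves the already written positions untouched.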

    Gather cache architecture
    8.
    Granted patent
    Status: in force

    Publication No.: US08688962B2

    Publication date: 2014-04-01

    Application No.: US13078380

    Filing date: 2011-04-01

    IPC classification: G06F9/30

    CPC classification: G06F12/0815 G06F12/0804

    Abstract: Apparatuses and methods to perform gather instructions are presented. In one embodiment, an apparatus comprises a gather logic module which includes a gather logic unit to identify locality of data elements in response to a gather instruction. The apparatus includes memory comprising a plurality of memory rows including a memory row associated with the gather instruction. The apparatus further includes a memory structure to store data element addresses accessed in response to the gather instruction.

    GATHER USING INDEX ARRAY AND FINITE STATE MACHINE
    9.
    Patent application
    Status: in force

    Publication No.: US20130326160A1

    Publication date: 2013-12-05

    Application No.: US13487184

    Filing date: 2012-06-02

    IPC classification: G06F12/00

    Abstract: Methods and apparatus are disclosed for using an index array and finite state machine for scatter/gather operations. Embodiments of the apparatus may comprise: decode logic to decode a scatter/gather instruction and generate a set of micro-operations, and an index array to hold a set of indices and a corresponding set of mask elements. A finite state machine facilitates the gather operation. Address generation logic generates an address from an index of the set of indices for at least each of the corresponding mask elements having a first value. An address is accessed to load a corresponding data element if the mask element has the first value. The data element is written at an in-register position in a destination vector register according to a respective in-register position of the index. Values of corresponding mask elements are changed from the first value to a second value responsive to completion of their respective loads.

    Apparatus and method for down conversion of data types

    Publication No.: US10474463B2

    Publication date: 2019-11-12

    Application No.: US13997006

    Filing date: 2011-12-23

    IPC classification: G06F9/30

    Abstract: An apparatus and method are described for down-converting from a source operand to a destination operand with masking. For example, a method according to one embodiment includes the following operations: reading a source operand value to be down-converted from a first value to a down-converted value and stored in a destination location; reading each mask register bit stored in a mask register, the mask register bit(s) indicating whether to perform a masking operation or a conversion operation on the source operand value; if the mask register bit(s) indicate that a masking operation is to be performed, then performing a specified masking operation and storing the results of the masking operation in the destination location; and if the mask register bit(s) indicate that a masking operation is not to be performed, then down-converting the source operand value and storing the down-converted value in the specified destination location.
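The per-element choice between masking and conversion can be sketched as follows. Several details are assumptions made for illustration and are not stated in the abstract: a mask bit of 1 selects the masking operation, the masking operation is either zeroing or keeping the old destination value, and the down-conversion is a saturating narrowing from 32-bit to 8-bit signed integers.

```python
INT8_MIN, INT8_MAX = -128, 127


def down_convert(value):
    """Saturating narrowing to the signed 8-bit range (assumed conversion)."""
    return max(INT8_MIN, min(INT8_MAX, value))


def masked_down_convert(source, mask_bits, dest, zeroing=True):
    """Return the new destination contents for a masked down-conversion.

    A mask bit of 1 selects the masking operation for that element
    (zeroing, or keeping the old destination value when zeroing=False);
    a mask bit of 0 selects the down-conversion.
    """
    out = list(dest)
    for i, (value, masked) in enumerate(zip(source, mask_bits)):
        if masked:
            out[i] = 0 if zeroing else dest[i]
        else:
            out[i] = down_convert(value)
    return out


source = [300, -5, 42, -1000]
mask_bits = [0, 1, 0, 0]
dest = [11, 22, 33, 44]

zeroed = masked_down_convert(source, mask_bits, dest)
merged = masked_down_convert(source, mask_bits, dest, zeroing=False)
```

Only the second element is masked, so it is either zeroed or left at its previous value, while the other three elements are narrowed and saturated.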