METHODS, APPARATUS, INSTRUCTIONS AND LOGIC TO PROVIDE VECTOR PACKED TUPLE CROSS-COMPARISON FUNCTIONALITY
    1.
    Type: invention application
    Status: under examination (published)

    Publication number: WO2016109170A1

    Publication date: 2016-07-07

    Application number: PCT/US2015/065514

    Filing date: 2015-12-14

    CPC classification number: G06F9/30036 G06F9/30018 G06F9/30021 G06F9/3834

    Abstract: Instructions and logic provide SIMD vector packed tuple cross-comparison functionality. Some processor embodiments include first and second registers with a variable plurality of data fields, each of the data fields to store an element of a first data type. The processor executes SIMD instructions for vector packed tuple cross-comparisons in some embodiments, which for each data field of a portion of data fields in a tuple of the first register, compares its corresponding element with every element of a corresponding portion of data fields in a tuple of the second register and sets mask bits corresponding to elements of the second register portion, in a bit-mask corresponding to unmasked elements of the corresponding first register portion, according to the corresponding comparison. In some embodiments bit-masks are shifted by corresponding elements in data fields of a third register. The comparison type is indicated by an immediate operand.
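As a rough illustration of the cross-comparison described in this abstract, the following Python sketch models the registers as plain lists. The function name `tuple_cross_compare`, the `COMPARISONS` encoding of the immediate operand, and the treatment of masked elements (their bit-mask is left at zero) are assumptions made for this example and are not taken from the patent. The optional shift by a third register is omitted.

```python
import operator

# Hypothetical encoding of the immediate operand. The abstract only says
# that an immediate selects the comparison type, not how it is encoded.
COMPARISONS = {0: operator.eq, 1: operator.lt, 2: operator.le, 4: operator.ne}


def tuple_cross_compare(first, second, tuple_size, imm=0, write_mask=None):
    """Return one bit-mask per element of `first`.

    Bit j of the mask for first[i] is set when first[i] compares true
    against element j of the matching tuple in `second`.
    """
    if len(first) != len(second) or len(first) % tuple_size != 0:
        raise ValueError("registers must have equal length, a multiple of tuple_size")
    compare = COMPARISONS[imm]
    if write_mask is None:
        write_mask = [True] * len(first)

    result = []
    for base in range(0, len(first), tuple_size):
        for i in range(base, base + tuple_size):
            bits = 0
            if write_mask[i]:  # masked elements keep a zero bit-mask (assumption)
                for j in range(tuple_size):
                    if compare(first[i], second[base + j]):
                        bits |= 1 << j
            result.append(bits)
    return result
```

For instance, with tuples of two elements and an equality comparison, `tuple_cross_compare([1, 2, 3, 4], [2, 2, 4, 1], 2)` compares each of the first two elements against `[2, 2]` and each of the last two against `[4, 1]`.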


    PACKED DATA OPERATION MASK COMPARISON PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS
    2.
    Type: invention application
    Status: under examination (published)

    Publication number: WO2013101124A1

    Publication date: 2013-07-04

    Application number: PCT/US2011/067972

    Filing date: 2011-12-29

    Abstract: Receive packed data operation mask comparison instruction indicating first packed data operation mask having first packed data operation mask bits and second packed data operation mask having second packed data operation mask bits. Each packed data operation mask bit of first mask corresponds to a packed data operation mask bit of second mask in corresponding position. Modify first flag to first value if bitwise AND of each packed data operation mask bit of first mask with each corresponding packed data operation mask bit of second mask is zero. Otherwise modify first flag to second value. Modify second flag to third value if bitwise AND of each packed data operation mask bit of first mask with bitwise NOT of each corresponding packed data operation mask bit of second mask is zero. Otherwise modify second flag to fourth value.
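The two flag updates in this abstract can be sketched in a few lines of Python. The function name `compare_masks`, the 16-bit default width, and the choice of 1 and 0 for the flag values are assumptions for illustration only; the abstract speaks only of a first, second, third and fourth value.

```python
def compare_masks(first_mask, second_mask, width=16):
    """Return (first_flag, second_flag) for two operation masks.

    first_flag  is 1 when (first AND second) is all zeros, else 0.
    second_flag is 1 when (first AND NOT second) is all zeros, else 0.
    The 1/0 flag values and the 16-bit width are assumptions.
    """
    limit = (1 << width) - 1
    a = first_mask & limit
    b = second_mask & limit
    first_flag = 1 if (a & b) == 0 else 0
    # Mask the complement so Python's unbounded ints stay within `width` bits.
    second_flag = 1 if (a & ~b & limit) == 0 else 0
    return first_flag, second_flag
```

Under these assumptions, the first flag reports that the masks share no set bits, and the second flag reports that every set bit of the first mask is also set in the second.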


    FLOATING POINT SCALING PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS
    3.
    Type: invention application
    Status: under examination (published)

    Publication number: WO2013101010A1

    Publication date: 2013-07-04

    Application number: PCT/US2011/067684

    Filing date: 2011-12-28

    CPC classification number: G06F7/483 G06F9/30014 G06F9/30036

    Abstract: A method of an aspect includes receiving a floating point scaling instruction. The floating point scaling instruction indicates a first source including one or more floating point data elements, a second source including one or more corresponding floating point data elements, and a destination. A result is stored in the destination in response to the floating point scaling instruction. The result includes one or more corresponding result floating point data elements each including a corresponding floating point data element of the second source multiplied by a base of the one or more floating point data elements of the first source raised to a power of an integer representative of the corresponding floating point data element of the first source. Other methods, apparatus, systems, and instructions are disclosed.
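A minimal Python sketch of the element-wise scaling described in this abstract follows. The name `scale_elements`, the default base of 2, and the use of `math.floor` to obtain the "integer representative" of each first-source element are assumptions; the abstract does not fix any of them. Special values such as NaN and infinity are not handled here.

```python
import math


def scale_elements(first, second, base=2):
    """Scale each element of `second` by base ** floor(matching element of `first`).

    Base 2 and floor rounding are assumptions made for this sketch.
    """
    if len(first) != len(second):
        raise ValueError("sources must have the same number of elements")
    return [y * float(base) ** math.floor(x) for x, y in zip(first, second)]
```

For example, a first-source element of 3.0 paired with a second-source element of 1.5 gives 1.5 multiplied by 2 to the power 3 under these assumptions.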


    APPARATUS AND METHOD OF IMPROVED PERMUTE INSTRUCTIONS
    4.
    Type: invention application
    Status: under examination (published)

    Publication number: WO2013095637A1

    Publication date: 2013-06-27

    Application number: PCT/US2011/067210

    Filing date: 2011-12-23

    CPC classification number: G06F9/30029 G06F9/30018 G06F9/30032 G06F9/30036

    Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.
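The routing-then-masking arrangement in this abstract can be modelled loosely in software. In the Python sketch below, a register is a plain integer, `indices` names the input element routed to each output position, and `mask` has one bit per output element. The name `masked_permute`, the widths in `ALLOWED_WIDTHS`, and the choice to zero masked-out elements are assumptions; the abstract names neither the three widths nor the masking behaviour.

```python
ALLOWED_WIDTHS = (16, 32, 64)  # hypothetical: the abstract does not name the three widths


def masked_permute(source, indices, mask, element_bits, register_bits=128):
    """Route source elements to output positions, then apply a per-element mask.

    Output element i takes source element indices[i] when bit i of `mask`
    is set; otherwise it is zeroed (an assumption of this sketch).
    """
    if element_bits not in ALLOWED_WIDTHS:
        raise ValueError("unsupported element width")
    count = register_bits // element_bits
    field = (1 << element_bits) - 1

    result = 0
    for out in range(count):
        if not (mask >> out) & 1:
            continue  # masked-out element stays zero
        src = indices[out] % count  # wrap out-of-range selectors
        element = (source >> (src * element_bits)) & field
        result |= element << (out * element_bits)
    return result
```

Because the mask is indexed by element rather than by bit, the same function masks at a different granularity for each element width, which is the property the abstract attributes to the masking layer.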


    APPARATUS AND METHOD FOR MEMORY-HIERARCHY AWARE PRODUCER-CONSUMER INSTRUCTION
    8.
    Type: invention application
    Status: under examination (published)

    Publication number: WO2013095464A1

    Publication date: 2013-06-27

    Application number: PCT/US2011/066630

    Filing date: 2011-12-21

    Abstract: An apparatus and method are described for efficiently transferring data from a producer core to a consumer core within a central processing unit (CPU). For example, one embodiment of a method comprises: A method for transferring a chunk of data from a producer core of a central processing unit (CPU) to consumer core of the CPU, comprising: writing data to a buffer within the producer core of the CPU until a designated amount of data has been written; upon detecting that the designated amount of data has been written, responsively generating an eviction cycle, the eviction cycle causing the data to be transferred from the fill buffer to a cache accessible by both the producer core and the consumer core; and upon the consumer core detecting that data is available in the cache, providing the data to the consumer core from the cache upon receipt of a read signal from the consumer core.
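The buffering and eviction flow in this abstract is a hardware mechanism, but its control flow can be imitated in software. The Python sketch below is only a toy model: the class `ProducerCore`, the function `consumer_read`, and the use of a `deque` as the shared cache are all hypothetical names and structures chosen for this example.

```python
from collections import deque


class ProducerCore:
    """Toy model of a producer that buffers writes and evicts full chunks."""

    def __init__(self, shared_cache, chunk_size):
        self.shared_cache = shared_cache  # stands in for the cache both cores can reach
        self.chunk_size = chunk_size      # the "designated amount of data"
        self.buffer = []                  # stands in for the producer-side buffer

    def write(self, item):
        self.buffer.append(item)
        if len(self.buffer) >= self.chunk_size:
            # Model of the eviction cycle: move the whole chunk to the shared cache.
            self.shared_cache.append(list(self.buffer))
            self.buffer.clear()


def consumer_read(shared_cache):
    """Model of a consumer read: return the next chunk, or None if none is available."""
    return shared_cache.popleft() if shared_cache else None
```

In this model the consumer sees nothing until the producer has written a full chunk, which mirrors the abstract's point that data becomes visible only after the eviction.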

