IN-LANE VECTOR SHUFFLE INSTRUCTIONS
    Patent application

    Publication No.: US20180121198A1

    Publication date: 2018-05-03

    Application No.: US15801652

    Filing date: 2017-11-02

    Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.
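
The single-source form described in this abstract can be modeled in software. The sketch below is illustrative only and is not taken from the patent text: the function name `in_lane_shuffle`, the list-of-integers register model, the default of four elements per lane, and the packing of one selector field per destination element (lowest bits first) are all assumptions, chosen to resemble familiar SIMD shuffle encodings.

```python
def in_lane_shuffle(src, control, lane_width=4):
    """Model of an in-lane shuffle: the same control bits are applied to every lane.

    src        -- list of data elements; len(src) must be a multiple of lane_width
    control    -- integer holding one selector field per destination element
    lane_width -- elements per lane (assumed to be a power of two)
    """
    if lane_width <= 0 or lane_width & (lane_width - 1):
        raise ValueError("lane_width must be a power of two")
    if len(src) % lane_width:
        raise ValueError("source length must be a multiple of lane_width")

    bits = (lane_width - 1).bit_length()   # selector bits per destination element
    field_mask = (1 << bits) - 1

    dst = []
    for base in range(0, len(src), lane_width):
        lane = src[base:base + lane_width]
        for pos in range(lane_width):
            # Selector for destination position `pos`, read from the shared control bits.
            sel = (control >> (bits * pos)) & field_mask
            dst.append(lane[sel])          # elements never cross a lane boundary
    return dst


# Two lanes of four elements; control 0b00011011 reverses the order inside each lane.
print(in_lane_shuffle([0, 1, 2, 3, 4, 5, 6, 7], 0b00011011))
# → [3, 2, 1, 0, 7, 6, 5, 4]
```

Because the selectors index only into the current lane, widening the register adds lanes without changing the size of the control field.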

    Instruction and Logic for Early Underflow Detection and Rounder Bypass

    Publication No.: US20180088940A1

    Publication date: 2018-03-29

    Application No.: US15280324

    Filing date: 2016-09-29

    CPC classification number: G06F9/30014 G06F7/00 G06F7/483 G06F7/5443

    Abstract: A processor for floating point underflow detection includes circuitry to decode a first instruction and a floating point unit. The decoded instruction, when executed by the processor, may be for performing a fused multiply-add (FMA) operation. The floating point unit includes circuitry to determine a non-normalized result of the first instruction based on a first input, a second input, and a third input. The floating point unit further includes circuitry to determine whether underflow exists in the non-normalized result based on a first exponent of the first input, a second exponent of the second input, and a third exponent of the third input.
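
The idea of judging underflow from the three input exponents, before the result is normalized, can be illustrated with a conservative software check. This is only a sketch under stated assumptions, not the circuitry claimed in the application: the names `magnitude_bound_exp` and `surely_tiny_fma` are hypothetical, inputs are assumed to be finite binary64 values, and the test answers only "is the exact result guaranteed to lie below the smallest normal number?" using a simple magnitude bound.

```python
import math
import sys

# Smallest normal binary64 value is 2**MIN_NORMAL_EXP (MIN_NORMAL_EXP == -1022).
MIN_NORMAL_EXP = sys.float_info.min_exp - 1


def magnitude_bound_exp(x):
    """Return e such that |x| < 2**e, or None when x is zero."""
    if x == 0.0:
        return None
    return math.frexp(x)[1]   # x == m * 2**e with 0.5 <= |m| < 1


def surely_tiny_fma(a, b, c):
    """Exponent-only check: True only if |a*b + c| is guaranteed below 2**-1022.

    Conservative: False means "cannot tell from the exponents alone".
    A True result includes the case where a*b and c cancel exactly to zero.
    """
    ea, eb, ec = (magnitude_bound_exp(v) for v in (a, b, c))
    prod_exp = None if ea is None or eb is None else ea + eb
    bounds = [e for e in (prod_exp, ec) if e is not None]
    if not bounds:
        return False          # result is an exact zero, which is not an underflow
    # |a*b| < 2**prod_exp and |c| < 2**ec, hence |a*b + c| < 2**(max(bounds) + 1).
    return max(bounds) + 1 <= MIN_NORMAL_EXP


print(surely_tiny_fma(1e-200, 1e-200, 0.0))   # product far below the normal range
print(surely_tiny_fma(1e-200, 1e-200, 1.0))   # addend dominates, no underflow
```

A hardware implementation would work on the biased exponent fields directly; the point of the sketch is only that a decision of this kind needs no mantissa arithmetic.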

    GATHER USING INDEX ARRAY AND FINITE STATE MACHINE

    Publication No.: US20170192934A1

    Publication date: 2017-07-06

    Application No.: US14616323

    Filing date: 2015-02-06

    Abstract: Methods and apparatus are disclosed for using an index array and finite state machine for scatter/gather operations. Embodiments of the apparatus may comprise decode logic to decode a scatter/gather instruction and generate a set of micro-operations, and an index array to hold a set of indices and a corresponding set of mask elements. A finite state machine facilitates the gather operation. Address generation logic generates an address from an index of the set of indices for at least each of the corresponding mask elements having a first value. An address is accessed to load a corresponding data element if the mask element has the first value. The data element is written at an in-register position in a destination vector register according to the respective in-register position of the index. Values of corresponding mask elements are changed from the first value to a second value responsive to completion of their respective loads.
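
The architectural effect of the masked gather described in this abstract can be sketched as a simple loop. The code below models only the visible behavior, not the index array hardware or the finite state machine; the function name `masked_gather`, the `base` and `scale` parameters, the use of a Python list as memory, and the choice of 1 and 0 for the "first" and "second" mask values are assumptions made for illustration.

```python
def masked_gather(memory, base, indices, mask, dest, scale=1):
    """Software model of a masked gather; updates dest and mask in place.

    memory  -- list standing in for addressable memory (one element per address)
    base    -- base address added to every scaled index
    indices -- one index per vector element
    mask    -- 1 (first value) = load still pending, 0 (second value) = done or disabled
    dest    -- destination vector, same length as indices
    """
    for pos, idx in enumerate(indices):
        if mask[pos] != 1:
            continue                  # element disabled or already loaded
        addr = base + idx * scale     # address generation from the index
        dest[pos] = memory[addr]      # data lands at the index's own position
        mask[pos] = 0                 # mark this element's load as complete
    return dest, mask


memory = list(range(100, 132))        # memory[a] == 100 + a
dest, mask = masked_gather(memory, 0, [3, 0, 7, 5], [1, 0, 1, 1], [-1] * 4)
print(dest, mask)
# → [103, -1, 107, 105] [0, 0, 0, 0]
```

Clearing each mask element as its load completes means that if the loop is interrupted part-way (for example by a fault), rerunning it with the updated mask repeats only the loads that have not yet finished.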

    Vector shuffle instructions operating on multiple lanes each having a plurality of data elements using a same set of per-lane control bits

    Publication No.: US09672034B2

    Publication date: 2017-06-06

    Application No.: US13838048

    Filing date: 2013-03-15

    CPC classification number: G06F9/30032 G06F9/30036 G06F9/3885 G06F9/3887

    Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.
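
The two-source form mentioned in this abstract can also be modeled in software. The sketch below is illustrative and not taken from the patent text: the function name `in_lane_shuffle2` is hypothetical, and the layout (four elements per lane, 2-bit selectors, the two low destination elements drawn from the first source and the two high ones from the second) is an assumption modeled on the x86 SHUFPS-style encoding.

```python
LANE = 4  # assumed number of data elements per lane


def in_lane_shuffle2(src1, src2, control):
    """Model of a two-source in-lane shuffle with one control byte shared by all lanes.

    In each lane, destination elements 0-1 come from src1 and elements 2-3
    come from src2; each is picked by a 2-bit selector in `control`.
    """
    if len(src1) != len(src2) or len(src1) % LANE:
        raise ValueError("sources must have equal length, a multiple of LANE")

    dst = []
    for base in range(0, len(src1), LANE):
        for pos in range(LANE):
            sel = (control >> (2 * pos)) & 0b11    # selector for this position
            source = src1 if pos < 2 else src2     # low half: src1, high half: src2
            dst.append(source[base + sel])         # stays inside the current lane
    return dst


a = [0, 1, 2, 3, 4, 5, 6, 7]
b = [10, 11, 12, 13, 14, 15, 16, 17]
print(in_lane_shuffle2(a, b, 0b11100100))
# → [0, 1, 12, 13, 4, 5, 16, 17]
```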
