Vector mask driven clock gating for power efficiency of a processor

    Publication No.: US10133577B2

    Publication date: 2018-11-20

    Application No.: US13997791

    Filing date: 2012-12-19

    Abstract: A processor includes an instruction schedule and dispatch (schedule/dispatch) unit to receive a single instruction multiple data (SIMD) instruction to perform an operation on multiple data elements stored in a storage location indicated by a first source operand. The instruction schedule/dispatch unit is to determine a first of the data elements that will not be operated to generate a result written to a destination operand based on a second source operand. The processor further includes multiple processing elements coupled to the instruction schedule/dispatch unit to process the data elements of the SIMD instruction in a vector manner, and a power management unit coupled to the instruction schedule/dispatch unit to reduce power consumption of a first of the processing elements configured to process the first data element.
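
The masking idea in this abstract can be illustrated in software. The following is a minimal sketch, not the patented hardware: the function name `masked_simd_op`, the per-lane `gated` flags, and the choice that masked-off lanes keep the old destination value are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple


def masked_simd_op(
    op: Callable[[int], int],
    src: List[int],
    dest: List[int],
    mask: int,
) -> Tuple[List[int], List[bool]]:
    """Apply op lane by lane; lanes whose mask bit is 0 do no work.

    Returns the new destination and a per-lane flag saying which
    processing elements could have been clock gated (hypothetical model).
    """
    if len(src) != len(dest):
        raise ValueError("src and dest must have the same number of lanes")
    result = list(dest)  # assumption: masked-off lanes keep the old value
    gated: List[bool] = []
    for lane, value in enumerate(src):
        active = bool((mask >> lane) & 1)
        gated.append(not active)  # inactive lane: candidate for gating
        if active:
            result[lane] = op(value)
    return result, gated


# Lanes 0 and 2 are enabled by the mask; lanes 1 and 3 are skipped.
result, gated = masked_simd_op(lambda x: x * 2, [1, 2, 3, 4], [0, 0, 0, 0], 0b0101)
print(result, gated)
```

The `gated` list stands in for the decision the schedule/dispatch unit passes to the power management unit: a lane that produces no result needs no clock for that instruction.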

    Memory fault suppression via re-execution and hardware FSM

    Publication No.: US09715432B2

    Publication date: 2017-07-25

    Application No.: US14581859

    Filing date: 2014-12-23

    Abstract: Exemplary aspects are directed toward resolving fault suppression in hardware without incurring a performance hit. For example, when multiple instructions are executing simultaneously, a mask can specify which elements need not be executed. If the mask is disabled for given elements, those elements do not need to be executed. A determination is then made as to whether a fault happens in one of the elements that have been disabled. If so, a state machine re-fetches the instructions in a special mode. More specifically, the state machine determines whether the fault is on a disabled element and, if it is, specifies that the fault should be ignored. If no mask was applied during the first execution and an error occurs during execution, the element is re-run with the mask to see whether the error is a “real” fault.
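
The re-execution flow in this abstract can be mimicked in software. The sketch below is only an analogy for the hardware state machine: `MemoryFault`, `load`, and `masked_load` are hypothetical names, memory is modelled as a dictionary, and suppressed lanes are reported as `None` by assumption.

```python
from typing import Dict, List, Optional


class MemoryFault(Exception):
    """Raised when a load touches an unmapped address (illustrative)."""


def load(memory: Dict[int, int], addr: int) -> int:
    if addr not in memory:
        raise MemoryFault(f"unmapped address {addr:#x}")
    return memory[addr]


def masked_load(
    memory: Dict[int, int], addrs: List[int], mask: int
) -> List[Optional[int]]:
    """Fast path ignores the mask; on a fault, re-run with the mask applied."""
    try:
        # First execution: optimistic, no mask consulted.
        values = [load(memory, a) for a in addrs]
    except MemoryFault:
        # Re-execution in the "special mode": check the mask per element.
        out: List[Optional[int]] = []
        for lane, addr in enumerate(addrs):
            if (mask >> lane) & 1:
                out.append(load(memory, addr))  # a fault here is real
            else:
                out.append(None)  # fault on a disabled element is ignored
        return out
    return [v if (mask >> lane) & 1 else None for lane, v in enumerate(values)]


mem = {0x10: 7, 0x18: 9}
print(masked_load(mem, [0x10, 0xDEAD, 0x18], 0b101))
```

With mask `0b101` the unmapped middle address belongs to a disabled element, so its fault is suppressed. With mask `0b111` the same address belongs to an enabled element and the second pass raises `MemoryFault`.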

    INSTRUCTION AND LOGIC TO PROVIDE VECTOR LINEAR INTERPOLATION FUNCTIONALITY
    Type: Invention application
    Legal status: Active

    Publication No.: US20160266902A1

    Publication date: 2016-09-15

    Application No.: US13977736

    Filing date: 2011-12-16

    Abstract: Instructions and logic provide vector linear interpolation functionality. In some embodiments, responsive to an instruction specifying: a first operand from a set of vector registers, a size of each of the vector elements, a portion of the vector elements upon which to compute linear interpolations, a second operand from a set of vector registers, and a third operand; an execution unit reads a first, a second and a third value of the size of vector elements from corresponding data fields in the first, the second and the third operand respectively, and computes an interpolated value as the first value multiplied by the second value, minus the second value multiplied by the third value, plus the third value.
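
Written out, the interpolated value in this abstract is a*b - b*c + c, which equals b*a + (1 - b)*c, so the second operand acts as the blend weight. A minimal element-wise sketch follows; the function name `vector_lerp`, the optional bit mask used to select "a portion of the vector elements", and the `None` placeholder for unselected lanes are assumptions, not details from the patent.

```python
from typing import List, Optional


def vector_lerp(
    a: List[float],
    b: List[float],
    c: List[float],
    mask: Optional[int] = None,
) -> List[Optional[float]]:
    """Compute a[i]*b[i] - b[i]*c[i] + c[i] for every selected lane."""
    if not (len(a) == len(b) == len(c)):
        raise ValueError("operands must have the same number of elements")
    out: List[Optional[float]] = []
    for lane, (x, w, y) in enumerate(zip(a, b, c)):
        if mask is None or (mask >> lane) & 1:
            out.append(x * w - w * y + y)  # formula as stated in the abstract
        else:
            out.append(None)  # assumption: unselected lanes are left undefined
    return out


print(vector_lerp([10.0, 20.0], [0.25, 0.5], [2.0, 4.0]))
```

A weight of 0 returns the third operand and a weight of 1 returns the first, which is the usual behaviour of a linear interpolation.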

    Vector address conflict resolution with vector population count functionality
    Type: Granted invention patent
    Legal status: Active

    Publication No.: US09411592B2

    Publication date: 2016-08-09

    Application No.: US13731005

    Filing date: 2012-12-29

    Abstract: Instructions and logic provide SIMD address conflict resolution with vector population count functionality. Some embodiments include processors with a register with a variable plurality of data fields, each of the data fields to store a variable second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of bits set to one for corresponding data fields. Responsive to decoding a vector population count instruction, execution units count the number of bits set to one for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector population count instructions can be used with variable sized elements and conflict masks to generate iteration counts and completion masks to be used each iteration to resolve dependencies in gather-modify-scatter SIMD operations.
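
One way to picture how per-element population counts and conflict masks could drive a gather-modify-scatter loop is the histogram update below. This is a sketch under stated assumptions, not the patented instruction sequence: `vector_popcount`, `conflict_masks`, and `histogram_update` are hypothetical helpers, and the conflict mask convention (bit j of element i is set when an earlier lane j uses the same index) is chosen for illustration.

```python
from typing import List


def vector_popcount(values: List[int], width: int) -> List[int]:
    """Count the bits set to one in each width-bit data field."""
    field_mask = (1 << width) - 1
    return [bin(v & field_mask).count("1") for v in values]


def conflict_masks(indices: List[int]) -> List[int]:
    """Bit j of element i is set when lane j < i targets the same index."""
    masks = []
    for i, idx in enumerate(indices):
        m = 0
        for j in range(i):
            if indices[j] == idx:
                m |= 1 << j
        masks.append(m)
    return masks


def histogram_update(hist: List[int], indices: List[int]) -> List[int]:
    """Increment hist[idx] for every idx, resolving duplicate indices."""
    # The popcount of a lane's conflict mask is the number of earlier
    # duplicates, i.e. the iteration in which that lane may safely run.
    counts = vector_popcount(conflict_masks(indices), width=len(indices))
    hist = list(hist)
    for iteration in range(max(counts, default=-1) + 1):
        lanes = [i for i, c in enumerate(counts) if c == iteration]
        gathered = [hist[indices[i]] for i in lanes]  # gather
        for i, g in zip(lanes, gathered):             # modify and scatter
            hist[indices[i]] = g + 1
    return hist


print(histogram_update([0, 0, 0, 0], [3, 1, 3, 3, 1]))
```

Within one iteration all selected lanes target distinct indices, so the gather and scatter steps cannot overwrite each other. The number of iterations equals the largest duplicate count plus one.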

    INSTRUCTION AND LOGIC TO PERFORM AN INVERSE CENTRIFUGE OPERATION
    Type: Invention application
    Legal status: Under examination (published)

    Publication No.: US20160179548A1

    Publication date: 2016-06-23

    Application No.: US14580055

    Filing date: 2014-12-22

    Abstract: In one embodiment a processing device implements a set of instructions to perform an inverse centrifuge operation using vector or general purpose registers. The inverse centrifuge operation interleaves bits from opposite regions of a source and writes the interleaved bits to a destination. The instructions use a control mask where each bit with a mask value of one is obtained from one side of the source register or vector element, and each bit with a mask value of zero is obtained from the opposing side.
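
A bit-level sketch of the interleaving described in this abstract is given below. The abstract does not say which side of the source feeds which mask value or in what order bits are consumed, so this example assumes that mask-zero bits are read from the low region and mask-one bits from the high region, both from least significant to most significant. The forward `centrifuge` helper is included only to check the round trip; both function names are hypothetical.

```python
def centrifuge(src: int, mask: int, width: int = 8) -> int:
    """Separate bits: mask-zero bits go to the low side, mask-one bits high."""
    low, high = [], []
    for i in range(width):
        bit = (src >> i) & 1
        (high if (mask >> i) & 1 else low).append(bit)
    out = 0
    for pos, bit in enumerate(low + high):
        out |= bit << pos
    return out


def inverse_centrifuge(src: int, mask: int, width: int = 8) -> int:
    """Interleave bits from the two regions of src as directed by mask."""
    zeros = width - bin(mask & ((1 << width) - 1)).count("1")
    low_pos = 0       # next unread bit of the low region (mask-zero bits)
    high_pos = zeros  # next unread bit of the high region (mask-one bits)
    dest = 0
    for i in range(width):
        if (mask >> i) & 1:
            bit = (src >> high_pos) & 1
            high_pos += 1
        else:
            bit = (src >> low_pos) & 1
            low_pos += 1
        dest |= bit << i
    return dest


# The four high ones land on the odd positions selected by the mask.
print(bin(inverse_centrifuge(0b11110000, 0b10101010)))
```

Under these assumptions the two functions are inverses for any fixed mask: applying `centrifuge` and then `inverse_centrifuge` with the same mask returns the original value.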
