METHOD AND APPARATUS FOR PERFORMING CONFLICT DETECTION
    1.
    发明申请
    METHOD AND APPARATUS FOR PERFORMING CONFLICT DETECTION 有权
    用于执行冲突检测的方法和装置

    公开(公告)号:US20160179528A1

    公开(公告)日:2016-06-23

    申请号:US14581607

    申请日:2014-12-23

    Abstract: An apparatus and method are described for performing conflict detection operations. For example, one embodiment of a processor comprises: a first source vector register to store a first set of data elements; a second source vector register to store a second set of data elements; conflict detection logic to perform a specified comparison operation comparing each of the first set of data elements with specified data elements from the second set and generating a set of comparison results, the comparison operation to be selected from a group consisting of a greater than comparison, a less than comparison, a greater than or equal to comparison, a less than or equal to comparison, and a not equal to comparison.

    Abstract translation: 描述了用于执行冲突检测操作的装置和方法。 例如,处理器的一个实施例包括:第一源向量寄存器,用于存储第一组数据元素; 第二源向量寄存器,用于存储第二组数据元素; 冲突检测逻辑,用于执行指定的比较操作,将第一组数据元素与来自第二组的指定数据元素进行比较,并生成一组比较结果,从大于比较的组中选择的比较操作, 小于比较,大于或等于比较,小于或等于比较,不等于比较。

    SYSTEMS, APPARATUSES, AND METHODS FOR CHAINED FUSED MULTIPLY ADD

    公开(公告)号:US20190121637A1

    公开(公告)日:2019-04-25

    申请号:US16169456

    申请日:2018-10-24

    Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand. Execution circuitry executes the decoded single instruction to perform iterations of packed fused multiply accumulate operations by multiplying packed data elements of the sources of the first type by sub-elements of the scalar value, and adding results of these multiplications to an initial value in a first iteration and a result from a previous iteration in subsequent iterations.

    SYSTEMS, APPARATUSES, AND METHODS FOR CHAINED FUSED MULTIPLY ADD

    公开(公告)号:US20250060963A1

    公开(公告)日:2025-02-20

    申请号:US18815382

    申请日:2024-08-26

    Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand. Execution circuitry executes the decoded single instruction to perform iterations of packed fused multiply accumulate operations by multiplying packed data elements of the sources of the first type by sub-elements of the scalar value, and adding results of these multiplications to an initial value in a first iteration and a result from a previous iteration in subsequent iterations.

    SYSTEMS, APPARATUSES, AND METHODS FOR CHAINED FUSED MULTIPLY ADD

    公开(公告)号:US20230083705A1

    公开(公告)日:2023-03-16

    申请号:US17952001

    申请日:2022-09-23

    Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand. Execution circuitry executes the decoded single instruction to perform iterations of packed fused multiply accumulate operations by multiplying packed data elements of the sources of the first type by sub-elements of the scalar value, and adding results of these multiplications to an initial value in a first iteration and a result from a previous iteration in subsequent iterations.

    SYSTEMS, APPARATUSES, AND METHODS FOR CHAINED FUSED MULTIPLY ADD

    公开(公告)号:US20210081198A1

    公开(公告)日:2021-03-18

    申请号:US17107134

    申请日:2020-11-30

    Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand. Execution circuitry executes the decoded single instruction to perform iterations of packed fused multiply accumulate operations by multiplying packed data elements of the sources of the first type by sub-elements of the scalar value, and adding results of these multiplications to an initial value in a first iteration and a result from a previous iteration in subsequent iterations.

    METHOD AND APPARATUS FOR PERFORMING A VECTOR PERMUTE WITH AN INDEX AND AN IMMEDIATE
    8.
    发明申请
    METHOD AND APPARATUS FOR PERFORMING A VECTOR PERMUTE WITH AN INDEX AND AN IMMEDIATE 审中-公开
    用索引和立即执行矢量保护的方法和装置

    公开(公告)号:US20160188530A1

    公开(公告)日:2016-06-30

    申请号:US14583644

    申请日:2014-12-27

    Abstract: An apparatus and method for performing a vector permute. For example, one embodiment of a processor comprises: a source vector register to store a plurality of source data elements; a destination vector register to store a plurality of destination data elements; a control vector register to store a plurality of control data elements, each control data element corresponding to one of the destination data elements and including an N bit value indicating whether a source data element is to be copied to the corresponding destination data element; vector permute logic to compare the N bit value of each control data element to an N bit portion of an immediate to determine whether to copy a source data element to the corresponding destination data element, wherein if the N bit values match, then the vector permute logic is to identify a source data element using an index value included in the control data element and to responsively copy the source data element to the corresponding destination data element in the destination vector register.

    Abstract translation: 用于执行向量置换的装置和方法。 例如,处理器的一个实施例包括:源向量寄存器,用于存储多个源数据元素; 目的地向量寄存器,用于存储多个目的地数据元素; 用于存储多个控制数据元素的控制向量寄存器,与目的地数据元素之一对应的每个控制数据元素,并且包括指示源数据元素是否被复制到对应的目的地数据元素的N位值; 向量置换逻辑,以将每个控制数据元素的N位值与立即数的N位部分进行比较,以确定是否将源数据元素复制到对应的目标数据元素,其中如果N位值匹配,则向量置换 逻辑是使用包括在控制数据元素中的索引值来识别源数据元素,并且将源数据元素响应地复制到目的地向量寄存器中的相应目的地数据元素。

Patent Agency Ranking