-
11.
Publication No.: US20180189061A1
Publication date: 2018-07-05
Application No.: US15396111
Filing date: 2016-12-30
IPC classification: G06F9/30
CPC classification: G06F9/3016, G06F9/3001, G06F9/30018, G06F9/30021, G06F9/30032, G06F9/30036, G06F9/3838
Abstract: Embodiments of systems, apparatuses, and methods for instruction execution. In some embodiments, an instruction has fields for a first and a second source operand, and a destination operand. When executed, the instruction causes an arithmetic operation on broadcasted packed data elements of the first source operand and storage of results of each arithmetic operation in the destination operand, wherein the packed data elements of the first source operand to be broadcast are dictated by values of packed data elements stored in a second source operand, wherein the arithmetic operation is defined by the instruction.
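The abstract does not say how the second source operand selects elements, so the following is only a scalar sketch under one assumed reading: each element of the second source is treated as a bitmask, and bit j selects element j of the first source to be broadcast and folded into that lane. The function name `broadcast_arith`, the bitmask encoding, and the choice to start each lane from its own element are all hypothetical and are not taken from the patent.

```python
import operator

def broadcast_arith(src1, src2, op=operator.add):
    """Hypothetical model: lane i folds in every src1[j] whose bit j is set in src2[i]."""
    dest = []
    for i, mask in enumerate(src2):
        acc = src1[i]  # assumption: each lane starts from its own element
        for j, value in enumerate(src1):
            if (mask >> j) & 1:       # bit j of the control element selects lane j
                acc = op(acc, value)  # src1[j] is "broadcast" into lane i
        dest.append(acc)
    return dest

values = [1, 2, 3, 4]
controls = [0b0000, 0b0001, 0b0011, 0b0100]  # assumed bitmask encoding
result = broadcast_arith(values, controls)
print(result)  # [1, 3, 6, 7]
```

Passing a different `op` (for example `operator.mul`) stands in for the statement that the arithmetic operation is defined by the instruction.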
-
12.
Publication No.: US20180004513A1
Publication date: 2018-01-04
Application No.: US15201138
Filing date: 2016-07-01
Applicant: MIKHAIL PLOTNIKOV, IGOR ERMOLAEV
Inventor: MIKHAIL PLOTNIKOV, IGOR ERMOLAEV
IPC classification: G06F9/30
CPC classification: G06F9/3016, G06F9/30021, G06F9/30036, G06F9/30192
Abstract: Systems, methods, and apparatuses relating to element sorting of vectors are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction; and an execution unit to execute the decoded instruction to: provide storage for a comparison matrix to store a comparison value for each element of an input vector compared against the other elements of the input vector, perform a comparison operation on elements of the input vector corresponding to storage of comparison values above a main diagonal of the comparison matrix, perform a different operation on elements of the input vector corresponding to storage of comparison values below the main diagonal of the comparison matrix, and store results of the comparison operation and the different operation in the comparison matrix.
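A minimal scalar sketch of how such a comparison matrix could drive a sort. The abstract only says that one operation is used above the main diagonal and a different one below it; the specific pair chosen here (strict `>` above, `>=` below) and the step of summing each row to obtain a rank are assumptions, picked because together they give equal elements distinct, stable ranks. The names `comparison_matrix` and `rank_sort` are hypothetical.

```python
def comparison_matrix(vec):
    """Build an n x n matrix of 0/1 comparison values for the input vector."""
    n = len(vec)
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j > i:    # above the main diagonal: strict comparison (assumed)
                m[i][j] = int(vec[i] > vec[j])
            elif j < i:  # below the main diagonal: non-strict comparison (assumed)
                m[i][j] = int(vec[i] >= vec[j])
    return m

def rank_sort(vec):
    """Use row sums of the matrix as output positions (an assumed follow-up step)."""
    ranks = [sum(row) for row in comparison_matrix(vec)]
    out = [None] * len(vec)
    for value, rank in zip(vec, ranks):
        out[rank] = value
    return out, ranks

data = [3, 1, 3, 2]
sorted_data, ranks = rank_sort(data)
print(ranks)        # [2, 0, 3, 1]
print(sorted_data)  # [1, 2, 3, 3]
```

Using two different comparisons on the two sides of the diagonal is what keeps the two equal elements (the 3s) from receiving the same rank.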
-
13.
Publication No.: US09740493B2
Publication date: 2017-08-22
Application No.: US13994549
Filing date: 2012-09-28
CPC classification: G06F9/38, G06F8/4441, G06F8/452, G06F9/30018, G06F9/30036, G06F15/8084
Abstract: Loop vectorization methods and apparatus are disclosed. An example method includes generating a first control mask for a set of iterations of a loop by evaluating a condition of the loop, wherein generating the first control mask includes setting a bit of the control mask to a first value when the condition indicates that an operation of the loop is to be executed, and setting the bit of the first control mask to a second value when the condition indicates that the operation of the loop is to be bypassed. The example method also includes compressing indexes corresponding to the first set of iterations of the loop according to the first control mask.
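The two steps in the abstract, building a control mask from the loop condition and compressing the iteration indexes under that mask, can be sketched in scalar form as follows. The function names, the use of 1 for "execute" and 0 for "bypass", and the sample condition `x > 0` are illustrative assumptions, not details from the patent.

```python
def control_mask(values, condition):
    """One bit per iteration: 1 if the loop body should run, 0 if it is bypassed."""
    return [1 if condition(v) else 0 for v in values]

def compress_indexes(first_index, mask):
    """Pack the indexes of the active iterations contiguously, preserving order."""
    return [first_index + k for k, bit in enumerate(mask) if bit]

data = [5, -2, 7, 0, -9, 4, 3, -1]
mask = control_mask(data, lambda x: x > 0)
active = compress_indexes(0, mask)
print(mask)    # [1, 0, 1, 0, 0, 1, 1, 0]
print(active)  # [0, 2, 5, 6]

# The conditional operation then runs only on the densely packed active iterations.
doubled = list(data)
for idx in active:
    doubled[idx] = data[idx] * 2
print(doubled)  # [10, -2, 14, 0, -9, 8, 6, -1]
```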
-
14.
Publication No.: US20140201497A1
Publication date: 2014-07-17
Application No.: US13976004
Filing date: 2011-12-23
IPC classification: G06F9/30
CPC classification: G06F9/3555, G06F9/3001, G06F9/30036, G06F9/30098, G06F9/30145, G06F9/3016, G06F9/355, G06F9/3802, G06F9/3893
Abstract: An apparatus is described having functional unit logic circuitry. The functional unit logic circuitry has a first register to store a first input vector operand having an element for each dimension of a multi-dimensional data structure. Each element of the first vector operand specifies the size of its respective dimension. The functional unit has a second register to store a second input vector operand specifying coordinates of a particular segment of the multi-dimensional structure. The functional unit also has logic circuitry to calculate an address offset for the particular segment relative to an address of an origin segment of the multi-dimensional structure.
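The offset calculation described in the abstract can be illustrated with a short sketch that takes a vector of dimension sizes and a vector of coordinates. The memory layout (first dimension varying fastest), the `segment_bytes` parameter, and the function name `segment_offset` are assumptions made for the example; the abstract does not specify them.

```python
def segment_offset(dims, coords, segment_bytes=1):
    """Offset of the segment at `coords` relative to the origin segment.

    dims[k]   -- size of dimension k (the first input vector operand)
    coords[k] -- coordinate along dimension k (the second input vector operand)
    Assumed layout: dimension 0 varies fastest in memory.
    """
    if len(dims) != len(coords):
        raise ValueError("dims and coords must have the same length")
    linear = 0
    stride = 1
    for size, c in zip(dims, coords):
        if not 0 <= c < size:
            raise ValueError("coordinate out of range")
        linear += c * stride  # contribution of this dimension
        stride *= size        # stride of the next, slower-varying dimension
    return linear * segment_bytes

offset = segment_offset([4, 3, 2], [1, 2, 1], segment_bytes=8)
print(offset)  # 168
```

Here the linear index is 1 + 4 * (2 + 3 * 1) = 21, and 21 segments of 8 bytes give an offset of 168 bytes; the origin segment `[0, 0, 0]` has offset 0.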
-
15.
Publication No.: US10162639B2
Publication date: 2018-12-25
Application No.: US15912498
Filing date: 2018-03-05
Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of the data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.
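A scalar sketch of the per-field leading zero count that the abstract describes. The 32-bit field width and the function name `vector_lzcnt` are assumptions made for illustration; the sketch does not model how permute controls or completion masks are derived from the counts.

```python
def vector_lzcnt(fields, width=32):
    """Count the most significant contiguous zero bits in each data field."""
    limit = (1 << width) - 1
    # int.bit_length() is the position of the highest set bit plus one,
    # so width - bit_length is the number of leading zeros (width for a zero field).
    return [width - (value & limit).bit_length() for value in fields]

source = [0x00000000, 0x00000001, 0x80000000, 0x00FF0000]
counts = vector_lzcnt(source)
print(counts)  # [32, 31, 0, 8]
```

For a nonzero field, `width - 1 - count` is the position of its highest set bit.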
-
16.
Publication No.: US10162638B2
Publication date: 2018-12-25
Application No.: US15912486
Filing date: 2018-03-05
Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of the data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.
-
17.
Publication No.: US10162637B2
Publication date: 2018-12-25
Application No.: US15912468
Filing date: 2018-03-05
Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of the data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.
-
18.
Publication No.: US20140189287A1
Publication date: 2014-07-03
Application No.: US13728506
Filing date: 2012-12-27
IPC classification: G06F9/30
CPC classification: G06F9/30145, G06F9/30018, G06F9/30021, G06F9/30036, G06F9/30065, G06F9/3016, G06F9/325
Abstract: In an embodiment, the present invention is directed to a processor including a decode logic to receive a multi-dimensional loop counter update instruction and to decode the multi-dimensional loop counter update instruction into at least one decoded instruction, and an execution logic to execute the at least one decoded instruction to update at least one loop counter value of a first operand associated with the multi-dimensional loop counter update instruction by a first amount. Methods to collapse loops using such instructions are also disclosed. Other embodiments are described and claimed.
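A rough sketch of how a multi-dimensional counter update can collapse a loop nest into a single loop. The carry-propagating update shown here, the convention that the last counter is the innermost loop, and the name `update_counters` are assumptions for illustration; the abstract only states that at least one loop counter value of an operand is updated by a given amount.

```python
def update_counters(counters, bounds, step=1):
    """Advance a vector of loop counters by `step`, carrying into outer loops.

    Assumed convention: the last counter belongs to the innermost loop.
    Returns the new counters and a nonzero carry once the nest is exhausted.
    """
    out = list(counters)
    carry = step
    for d in reversed(range(len(out))):
        total = out[d] + carry
        out[d] = total % bounds[d]
        carry = total // bounds[d]
    return out, carry

# Collapsed form of: for i in range(2): for j in range(3): visit(i, j)
bounds = [2, 3]
counters, done = [0, 0], 0
visited = []
while not done:
    visited.append(tuple(counters))
    counters, done = update_counters(counters, bounds)
print(visited)  # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
```

A step larger than 1 advances several iterations at once, which is the case a vectorized collapsed loop would need; for example, `update_counters([0, 0], [2, 3], step=4)` yields counters `[1, 1]` with no carry.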
-
19.
Publication No.: US20180196671A1
Publication date: 2018-07-12
Application No.: US15912486
Filing date: 2018-03-05
CPC classification: G06F9/30145, G06F9/30018, G06F9/30032, G06F9/30036, G06F9/3834
Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of the data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.
-
20.
Publication No.: US20180189064A1
Publication date: 2018-07-05
Application No.: US15396199
Filing date: 2016-12-30
IPC classification: G06F9/30
CPC classification: G06F9/30036, G06F9/3001, G06F9/30018, G06F9/30021, G06F9/30101, G06F9/3016, G06F9/3017
Abstract: Embodiments of systems, apparatuses, and methods for executing an instruction. In some instances, the instruction has fields for a first source operand and a second source operand, and a destination operand. A decoded instruction causes a reduction of broadcasted packed data elements of a first packed data source with a reduction operation and stores a result of each of the reductions in a packed data destination, wherein the packed data elements of the first packed data source to be used in the reduction are dictated by a result of a comparison of broadcasted values of packed data elements stored in a second packed data source to the packed data elements stored in the second packed data source without broadcasting.
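One way to read this abstract is as a reduction grouped by matching values: each element of the second source is broadcast in turn and compared with every element of the second source, and where the comparison succeeds the corresponding element of the first source is folded into that lane's result. The sketch below follows that reading. The equality comparison, the inclusion of each lane's own element, and the name `broadcast_compare_reduce` are assumptions, not details confirmed by the patent.

```python
import operator

def broadcast_compare_reduce(src1, src2, op=operator.add):
    """Hypothetical model: dest[i] reduces every src1[j] for which src2[j] == src2[i]."""
    n = len(src1)
    dest = [None] * n
    for j in range(n):      # lane j is broadcast to all lanes
        for i in range(n):  # compare against the unbroadcast element in lane i
            if src2[j] == src2[i]:
                dest[i] = src1[j] if dest[i] is None else op(dest[i], src1[j])
    return dest             # every lane matches itself, so no None remains

values = [1, 2, 3, 4]
keys = [7, 9, 7, 9]
sums = broadcast_compare_reduce(values, keys)
print(sums)  # [4, 6, 4, 6]
```

Swapping in another reduction, such as `max`, gives `[3, 4, 3, 4]` for the same inputs.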
-