Patent search ap:("INTEL CORPORATION") AND inv:"ROBERT VALENTINE" Page 3

21.

Invention application
FUNCTIONAL UNIT FOR INSTRUCTION EXECUTION PIPELINE CAPABLE OF SHIFTING DIFFERENT CHUNKS OF A PACKED DATA OPERAND BY DIFFERENT AMOUNTS (Pending, published)

Publication (announcement) number: US20180217841A1

Publication (announcement) date: 2018-08-02

Application number: US15849333

Application date: 2017-12-20

Applicant: Intel Corporation

Inventor: TAL ULIEL, ROBERT VALENTINE

IPC: G06F9/38, G06F9/30

CPC classification number: G06F9/3802, G06F9/30018, G06F9/30032, G06F9/30036, G06F9/3824, G06F9/3867

Abstract: A method is described that includes fetching an instruction. The method further includes decoding the instruction. The instruction specifies an operation, a first operand and a second operand. The method further includes fetching the first and second operands of the instruction. The first and second operands are each composed of a plurality of larger chunks having constituent elements. The method further includes performing the operation specified by the instruction including generating a resultant composed of a plurality of larger chunks having constituent elements. The generating of the resultant includes selecting for each element in the resultant a contiguous group of bits from a same positioned chunk of the first operand as the chunk of the element in the resultant, the contiguous group of bits being identified by a same positioned element of the second operand as the element in the resultant.
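The selection described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 64-bit chunk size, the 8-bit element size, the wrap-around (rotate) behaviour and the name `select_bits_per_chunk` are all hypothetical choices, not details taken from the application.

```python
def select_bits_per_chunk(src, ctrl, chunk_bits=64, elem_bits=8):
    """Pick, for every result element, a contiguous bit group from the
    same-positioned chunk of `src`; the starting offset comes from the
    same-positioned element of `ctrl`.

    `src` and `ctrl` are lists of chunk values (unsigned integers).
    Chunk and element widths are assumptions made for this sketch.
    """
    elems_per_chunk = chunk_bits // elem_bits
    elem_mask = (1 << elem_bits) - 1
    chunk_mask = (1 << chunk_bits) - 1
    result = []
    for src_chunk, ctrl_chunk in zip(src, ctrl):
        out = 0
        for i in range(elems_per_chunk):
            # Control element i holds the bit offset for result element i.
            offset = ((ctrl_chunk >> (i * elem_bits)) & elem_mask) % chunk_bits
            # Assumption: the selected group wraps around the chunk boundary,
            # which is modelled here as a rotate right by `offset`.
            rotated = ((src_chunk >> offset)
                       | (src_chunk << (chunk_bits - offset))) & chunk_mask
            out |= (rotated & elem_mask) << (i * elem_bits)
        result.append(out)
    return result
```

For example, with a single chunk `0x0123456789ABCDEF` and control offsets 0, 8 and 4 for the three lowest elements, the three lowest result bytes are `0xEF`, `0xCD` and `0xDE`.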

22.

Invention application
APPARATUS AND METHOD OF IMPROVED EXTRACT INSTRUCTIONS (Pending, published)

Publication (announcement) number: US20180081689A1

Publication (announcement) date: 2018-03-22

Application number: US15809818

Application date: 2017-11-10

Applicant: Intel Corporation

Inventor: ELMOUSTAPHA OULD-AHMED-VALL, ROBERT VALENTINE, JESUS CORBAL, BRET L. TOLL, MARK J. CHARNEY, ZEEV SPERBER, AMIT GRADSTEIN

IPC: G06F9/30

Abstract: An apparatus is described that includes instruction execution circuitry to execute first, second, third, and fourth instructions, the first and second instructions select a first group of input vector elements from one of multiple first non-overlapping sections of respective first and second input vectors. Each of the multiple first non-overlapping sections have a same bit width as the first group. Both the third and fourth instructions select a second group of input vector elements from one of multiple second non-overlapping sections of respective third and fourth input vectors. The second group has a second bit width that is larger than the first bit width. Each of multiple second non-overlapping sections have a same bit width as the second group. The apparatus includes masking layer circuitry to mask the first and second groups at a first granularity and second granularity.
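A rough Python model of the extract-with-masking behaviour follows. It treats the input vector as a list of elements split into equal, non-overlapping sections. The function name `extract_section`, the per-element mask encoding and the zeroing or merging fallback are assumptions for illustration only.

```python
def extract_section(vector, section_index, group_len, mask=None, merge_dest=None):
    """Select one non-overlapping section of `vector` and optionally mask it.

    Each section has the same length as the extracted group. Bit i of `mask`
    keeps element i of the group; masked-off elements take the value from
    `merge_dest`, or zero when no merge destination is given (assumption).
    """
    start = section_index * group_len
    group = vector[start:start + group_len]
    if len(group) != group_len:
        raise ValueError("section_index is out of range")
    if mask is None:
        return list(group)
    fallback = merge_dest if merge_dest is not None else [0] * group_len
    return [g if (mask >> i) & 1 else d
            for i, (g, d) in enumerate(zip(group, fallback))]
```

Calling the same function with a larger `group_len` corresponds to the wider second group in the abstract; the mask then covers correspondingly more elements.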

23.

Invention application
FUSIBLE INSTRUCTIONS AND LOGIC TO PROVIDE OR-TEST AND AND-TEST FUNCTIONALITY USING MULTIPLE TEST SOURCES (Pending, published)

Publication (announcement) number: US20170052788A1

Publication (announcement) date: 2017-02-23

Application number: US15340916

Application date: 2016-11-01

Applicant: Intel Corporation

Inventor: MAXIM LOKTYUKHIN, ROBERT VALENTINE, JULIAN C. HORN, MARK J. CHARNEY

IPC: G06F9/38, G06F9/30

CPC classification number: G06F9/3822, G06F9/30029, G06F9/30058, G06F9/30094, G06F9/3836

Abstract: Fusible instructions and logic provide OR-test and AND-test functionality on multiple test sources. Some embodiments include a processor decode stage to decode a test instruction for execution, the instruction specifying first, second and third source data operands, and an operation type. Execution units, responsive to the decoded test instruction, perform one logical operation, according to the specified operation type, between data from the first and second source data operands, and perform a second logical operation between the data from the third source data operand and the result of the first logical operation to set a condition flag. Some embodiments generate the test instruction dynamically by fusing one logical instruction with a prior-art test instruction. Other embodiments generate the test instruction through a just-in-time compiler. Some embodiments also fuse the test instruction with a subsequent conditional branch instruction, and perform a branch according to how the condition flag is set.
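The three-source test can be modelled as below. The abstract does not name the second logical operation; this sketch assumes it is a bitwise AND whose zero result sets a zero flag, in the style of a conventional test instruction. The function name `fused_test`, the string operation types and the 64-bit width are hypothetical.

```python
def fused_test(src1, src2, src3, op, width=64):
    """Return the zero flag produced by (src1 OP src2) AND src3.

    `op` selects the first logical operation ("or" or "and").
    Assumption: the second operation is a bitwise AND and the
    condition flag of interest is the zero flag.
    """
    if op == "or":
        first = src1 | src2
    elif op == "and":
        first = src1 & src2
    else:
        raise ValueError("op must be 'or' or 'and'")
    result = first & src3 & ((1 << width) - 1)
    return result == 0  # True means the zero flag would be set


def branch_if_zero(zero_flag, taken, fallthrough):
    """Model the fused conditional branch: pick a target from the flag."""
    return taken if zero_flag else fallthrough
```

Folding the OR or AND into the test replaces a separate logical instruction plus a test, which is the fusion opportunity the abstract describes.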

24.

Invention application
METHOD AND APPARATUS FOR PERFORMING A VECTOR BIT REVERSAL AND CROSSING (Granted)

Publication (announcement) number: US20160179529A1

Publication (announcement) date: 2016-06-23

Application number: US14581738

Application date: 2014-12-23

Applicant: INTEL CORPORATION

Inventor: JESUS CORBAL, ELMOUSTAPHA OULD-AHMED-VALL, ROBERT VALENTINE, MARK J. CHARNEY

IPC: G06F9/30

CPC classification number: G06F9/30036, G06F9/30018, G06F9/30032

Abstract: An apparatus and method for performing a vector bit reversal and crossing. For example, one embodiment of a processor comprises: a first source vector register to store a first plurality of source bit groups, wherein a size for the bit groups is to be specified in an immediate of an instruction; a second source vector to store a second plurality of source bit groups; vector bit reversal and crossing logic to determine a bit group size from the immediate and to responsively reverse positions of contiguous bit groups within the first source vector register to generate a set of reversed bit groups, wherein the vector bit reversal and crossing logic is to additionally interleave the set of reversed bit groups with the second plurality of bit groups; and a destination vector register to store the reversed bit groups interleaved with the first plurality of bit groups.
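One possible reading of the reverse-and-interleave step is sketched below. The abstract leaves several details open, so the following are assumptions: the order of all bit groups in the first source is reversed, the groups are interleaved starting with the reversed group, and the result is returned at double width. The names `split_groups` and `reverse_and_cross` are hypothetical.

```python
def split_groups(value, width, group_bits):
    """Split `value` into bit groups of `group_bits`, lowest group first."""
    mask = (1 << group_bits) - 1
    return [(value >> shift) & mask for shift in range(0, width, group_bits)]


def reverse_and_cross(src1, src2, group_bits, width=16):
    """Reverse the bit-group order of `src1`, then interleave with `src2`.

    `group_bits` plays the role of the size taken from the immediate.
    Assumption: the interleaved result is twice as wide as each source.
    """
    reversed_groups = split_groups(src1, width, group_bits)[::-1]
    other_groups = split_groups(src2, width, group_bits)
    out = 0
    position = 0
    for a, b in zip(reversed_groups, other_groups):
        out |= a << (position * group_bits)        # reversed group from src1
        out |= b << ((position + 1) * group_bits)  # matching group from src2
        position += 2
    return out
```

With 4-bit groups, `0xABCD` reverses to the group sequence A, B, C, D (lowest first), which is then interleaved with the groups of the second source.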

25.

Invention application
METHOD AND APPARATUS FOR VECTOR INDEX LOAD AND STORE (Granted)

Publication (announcement) number: US20160179526A1

Publication (announcement) date: 2016-06-23

Application number: US14581289

Application date: 2014-12-23

Applicant: INTEL CORPORATION

Inventor: ASHISH JHA, ROBERT VALENTINE, ELMOUSTAPHA OULD-AHMED-VALL

IPC: G06F9/30

CPC classification number: G06F9/30036, G06F9/30018, G06F9/30043, G06F9/30101, G06F15/8053

Abstract: An apparatus and method for performing vector index loads and stores. For example, one embodiment of a processor comprises: a vector index register to store a plurality of index values; a mask register to store a plurality of mask bits; a vector register to store a plurality of vector data elements loaded from memory; and vector index load logic to identify an index stored in the vector index register to be used for a load operation using an immediate value and to responsively combine the index with a base memory address to determine a memory address for the load operation, the vector index load logic to load vector data elements from the memory address to the vector register in accordance with the plurality of mask bits.
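The load path in this abstract can be illustrated with a small Python model. Memory is modelled as a flat list addressed in element units, and masked-off destination elements keep their previous values. Both of these, along with the name `vector_index_load`, are assumptions made for the sketch.

```python
def vector_index_load(memory, base, index_reg, imm, mask, dest):
    """Load len(dest) contiguous elements starting at base + index_reg[imm].

    `imm` selects which index in the vector index register to use.
    Bit i of `mask` enables the load of element i; disabled elements
    keep the old destination value (assumption: merging behaviour).
    """
    index = index_reg[imm]   # index chosen by the immediate
    address = base + index   # combine the index with the base address
    out = list(dest)
    for i in range(len(dest)):
        if (mask >> i) & 1:
            out[i] = memory[address + i]
    return out
```

A store variant would follow the same address computation and write the enabled elements back to `memory` instead.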

26.

Invention application
METHOD AND APPARATUS FOR EXPANDING A MASK TO A VECTOR OF MASK VALUES (Pending, published)

Publication (announcement) number: US20160179521A1

Publication (announcement) date: 2016-06-23

Application number: US14581578

Application date: 2014-12-23

Applicant: INTEL CORPORATION

Inventor: ASHISH JHA, ELMOUSTAPHA OULD-AHMED-VALL, ROBERT VALENTINE

IPC: G06F9/30

CPC classification number: G06F9/30018, G06F9/30032, G06F9/30036, G06F9/30072

Abstract: An apparatus and method for performing a mask expand. For example, one embodiment of a processor comprises: a source mask register to store a plurality of mask values; mask expand logic to identify a first mask bit in the source mask register to be expanded using an index value and to determine a number of bit positions within a destination mask register into which the first mask bit is to be expanded using a second value, the mask expand logic to responsively copy the first mask bit to each of the determined bit positions within the destination mask register.
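A minimal sketch of the mask expand operation is shown below. It assumes the expanded bit is written to the lowest `count` positions of the destination mask; the abstract does not say which positions are used, and the name `mask_expand` is hypothetical.

```python
def mask_expand(src_mask, index, count):
    """Copy bit `index` of `src_mask` into `count` destination bit positions.

    `index` is the index value that identifies the mask bit, and `count`
    is the second value that determines how many positions receive it.
    Assumption: the positions are the `count` least significant bits.
    """
    bit = (src_mask >> index) & 1
    return (1 << count) - 1 if bit else 0
```

For instance, expanding bit 2 of `0b0100` into eight positions yields `0xFF`, while expanding a clear bit yields an all-zero mask.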

27.

Invention application
APPARATUS AND METHOD FOR SCALING PRE-SCALED RESULTS OF COMPLEX MULTIPLY-ACCUMULATE OPERATIONS ON PACKED REAL AND IMAGINARY DATA ELEMENTS (Granted)

Publication (announcement) number: US20220326946A1

Publication (announcement) date: 2022-10-13

Application number: US17589428

Application date: 2022-01-31

Applicant: Intel Corporation

Inventor: VENKATESWARA MADDURI, ELMOUSTAPHA OULD-AHMED-VALL, MARK CHARNEY, ROBERT VALENTINE, JESUS CORBAL, BINWEI YANG

IPC: G06F9/30, G06F7/544, G06F17/14, G06F7/48

Abstract: An apparatus and method for performing a transform on complex data. For example, one embodiment of a processor comprises: multiplier circuitry to multiply packed real N-bit data elements in the first source register with packed real M-bit data elements in the second source register and to multiply packed imaginary N-bit data elements in the first source register with packed imaginary M-bit data elements in the second source register to generate at least four real products, adder circuitry to subtract a first selected real product from a second selected real product to generate a first temporary result and to subtract a third selected real product from a fourth selected real product to generate a second temporary result, the adder circuitry to add the first temporary result to a first packed N-bit data element from the third source register to generate a first pre-scaled result, to subtract the first temporary result from the first packed N-bit data element to generate a second pre-scaled result, to add the second temporary result to a second packed N-bit data element from the third source register to generate a third pre-scaled result, and to subtract the second temporary result from the second packed N-bit data element to generate a fourth pre-scaled result; scaling circuitry to scale the first, second, third and fourth pre-scaled results to a specified bit width to generate first, second, third, and fourth final results; and a destination register to store the first, second, third, and fourth final results in specified data element positions.
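The arithmetic in this abstract resembles the real part of a butterfly step, and can be sketched as follows. The scaling method is not specified in the abstract, so this sketch assumes an arithmetic right shift followed by signed saturation to a 16-bit result. The name `scaled_butterfly_real` and the use of (real, imaginary) tuples are also assumptions.

```python
def scaled_butterfly_real(a, b, c, shift, out_bits=16):
    """Compute scaled values of c + t and c - t with t = ar*br - ai*bi.

    `a` and `b` are lists of (real, imag) integer pairs from the first two
    sources; `c` holds the packed values from the third source.
    Assumption: scaling is a right shift followed by signed saturation.
    """
    low = -(1 << (out_bits - 1))
    high = (1 << (out_bits - 1)) - 1
    results = []
    for (ar, ai), (br, bi), cr in zip(a, b, c):
        temp = ar * br - ai * bi            # temporary result from two real products
        for pre_scaled in (cr + temp, cr - temp):
            scaled = pre_scaled >> shift    # arithmetic shift on Python ints
            results.append(max(low, min(high, scaled)))
    return results
```

Two input pairs produce the four final results in the order listed in the abstract: first sum, first difference, second sum, second difference.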

28.

Invention application
INSTRUCTION EXECUTION THAT BROADCASTS AND MASKS DATA VALUES AT DIFFERENT LEVELS OF GRANULARITY (Granted)

Publication (announcement) number: US20220215117A1

Publication (announcement) date: 2022-07-07

Application number: US17677958

Application date: 2022-02-22

Applicant: Intel Corporation

Inventor: ELMOUSTAPHA OULD-AHMED-VALL, ROBERT VALENTINE, JESUS CORBAL, BRET L. TOLL, MARK J. CHARNEY

IPC: G06F21/62, G06F16/27, G06F21/70, G06F9/30, G06F9/38

Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and, to replicate the second data structure when executing the second data instruction to create a second replication data structure. The execution unit also includes masking logic circuitry to mask the first replication data structure at a first granularity and mask the second replication data structure at a second granularity. The second granularity is twice as fine as the first granularity.
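The replicate-then-mask flow can be modelled in Python as shown here. A packed data structure is represented as a list of values, and the mask has one bit per replicated value, so a structure of narrower values is masked at a finer granularity. The name `broadcast_with_mask` and the zeroing of masked-off values are assumptions.

```python
def broadcast_with_mask(structure, copies, mask):
    """Replicate `structure` `copies` times, then mask each value.

    Bit i of `mask` keeps value i of the replicated result.
    Assumption: masked-off values are written as zero.
    """
    replicated = list(structure) * copies
    return [value if (mask >> i) & 1 else 0
            for i, value in enumerate(replicated)]
```

Under this model, the two instructions in the abstract differ only in the width of the values inside `structure`, which in turn sets how much data each mask bit controls.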

29.

Invention application
APPARATUS AND METHOD FOR COMPLEX BY COMPLEX CONJUGATE MULTIPLICATION (Granted)

Publication (announcement) number: US20220171624A1

Publication (announcement) date: 2022-06-02

Application number: US17672504

Application date: 2022-02-15

Applicant: Intel Corporation

Inventor: VENKATESWARA MADDURI, ELMOUSTAPHA OULD-AHMED-VALL, JESUS CORBAL, MARK CHARNEY, ROBERT VALENTINE, BINWEI YANG

IPC: G06F9/30

Abstract: An apparatus and method for multiplying packed real and imaginary components of complex numbers are described. A processor embodiment includes: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; and execution circuitry to execute the decoded instruction. The execution circuitry includes: multiplier circuitry to select real and imaginary data elements in the first source register and second source, multiply each selected imaginary data element in the first source register with a selected real data element in the second source register, and multiply each selected real data element in the first source register with a selected imaginary data element in the second source register to generate a plurality of imaginary products; adder circuitry to add a first subset of the plurality of imaginary products and subtract a second subset of the plurality of imaginary products to generate a first temporary result, and to add a third subset of the plurality of imaginary products and subtract a fourth subset of the plurality of imaginary products to generate a second temporary result; and accumulation circuitry to combine the first temporary result with first data from a destination register to generate a first final result, combine the second temporary result with second data from the destination register to generate a second final result, and store the first final result and second final result back in the destination register.
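For a complex value a and the conjugate of b, the imaginary part of the product is ai*br - ar*bi. The sketch below computes that term per element pair and accumulates it into a destination value. It assumes that "combine" means plain addition without saturation; the name `conj_mul_imag_accumulate` and the tuple layout are hypothetical.

```python
def conj_mul_imag_accumulate(a, b, acc):
    """Accumulate the imaginary part of a * conj(b) into `acc`.

    `a` and `b` are lists of (real, imag) integer pairs; `acc` holds the
    existing destination values. For each pair the two imaginary products
    are ai*br (added) and ar*bi (subtracted).
    Assumption: accumulation is an unsaturated addition.
    """
    out = []
    for (ar, ai), (br, bi), dest in zip(a, b, acc):
        temp = ai * br - ar * bi   # imaginary part of a * conj(b)
        out.append(dest + temp)
    return out
```

As a check, (1 + 2i) times the conjugate of (5 + 6i) equals 17 + 4i, so the temporary result for that pair is 4.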

30.

Invention application
APPARATUS AND METHOD FOR PERFORMING DUAL SIGNED AND UNSIGNED MULTIPLICATION OF PACKED DATA ELEMENTS (Granted)

Publication (announcement) number: US20210004227A1

Publication (announcement) date: 2021-01-07

Application number: US17027230

Application date: 2020-09-21

Applicant: Intel Corporation

Inventor: VENKATESWARA MADDURI, ELMOUSTAPHA OULD-AHMED-VALL, JESUS CORBAL, MARK CHARNEY, ROBERT VALENTINE, BINWEI YANG

IPC: G06F9/30, G06F7/00

Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements. For example one embodiment of a processor comprises: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed byte data elements; a second source register to store a second plurality of packed byte data elements; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to concurrently multiply each of the packed byte data elements of the first plurality with a corresponding packed byte data element of the second plurality to generate a plurality of products; adder circuitry to add specified sets of the products to generate temporary results for each set of products; zero-extension or sign-extension circuitry to zero-extend or sign-extend the temporary result for each set to generate an extended temporary result for each set; accumulation circuitry to combine each of the extended temporary results with a selected packed data value stored in a third source register to generate a plurality of final results; and a destination register to store the plurality of final results as a plurality of packed data elements in specified data element positions.
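The multiply, add, extend and accumulate sequence can be sketched as below. The grouping of four adjacent byte products per sum, the function names and the `signed` flag are assumptions for illustration. Python integers have unbounded width, so the zero-extension or sign-extension step is implicit once each byte has been interpreted as unsigned or signed.

```python
def to_signed(value, bits):
    """Interpret the low `bits` of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def byte_dot_accumulate(a_bytes, b_bytes, acc, signed=True, group=4):
    """Multiply corresponding bytes, sum each group, add to `acc`.

    `a_bytes` and `b_bytes` are lists of raw byte values (0..255).
    Assumption: each set of products consists of `group` adjacent bytes,
    and accumulation is an unsaturated addition.
    """
    out = []
    for g, base in enumerate(range(0, len(a_bytes), group)):
        total = 0
        for x, y in zip(a_bytes[base:base + group], b_bytes[base:base + group]):
            if signed:
                x, y = to_signed(x, 8), to_signed(y, 8)
            total += x * y         # one byte product
        out.append(acc[g] + total)  # combine with the third source value
    return out
```

The same byte pattern gives different sums in the two modes: `0xFF` counts as 255 when unsigned and as -1 when signed.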
