Patent search ap:("Intel Corporation") AND inv:"Jesus Corbal" Page 3

21.

Invention application
SYSTEMS AND METHODS FOR EXECUTING A FUSED MULTIPLY-ADD INSTRUCTION FOR COMPLEX NUMBERS [In force]

Publication (announcement) number: US20210357217A1

Publication (announcement) date: 2021-11-18

Application number: US17335942

Application date: 2021-06-01

Applicant: Intel Corporation

Inventor: Roman S. Dubtsov, Robert Valentine, Jesus Corbal, Milind Girkar, Elmoustapha Ould-Ahmed-Vall

IPC: G06F9/30

Abstract: Disclosed embodiments relate to executing a vector-complex fused multiply-add instruction. In one example, a method includes fetching an instruction, a format of the instruction including an opcode, a first source operand identifier, a second source operand identifier, and a destination operand identifier, wherein each of the identifiers identifies a location storing packed data comprising at least one complex number; decoding the instruction; retrieving data associated with the first and second source operand identifiers; and executing the decoded instruction to, for each packed data element position of the identified first and second source operands, cross-multiply the real and imaginary components to generate four products (a product of real components, a product of imaginary components, and two mixed products), generate a complex result by using the four products according to the instruction, and store a result to the corresponding position of the identified destination operand.
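
The per-element arithmetic in this abstract can be sketched roughly as follows. This is only an illustration: the abstract says the four products are combined "according to the instruction", so the combination used here (an ordinary complex multiply added to the existing destination value) is an assumption, and the function name `complex_fma` and the (real, imag) tuple layout are hypothetical.

```python
def complex_fma(src1, src2, dst):
    """Per-element complex multiply-accumulate on lists of (real, imag) pairs.

    Assumption: result = dst + src1 * src2 for each element position.
    """
    out = []
    for (a_re, a_im), (b_re, b_im), (d_re, d_im) in zip(src1, src2, dst):
        # Cross-multiply to form the four products named in the abstract.
        rr = a_re * b_re  # product of real components
        ii = a_im * b_im  # product of imaginary components
        ri = a_re * b_im  # first mixed product
        ir = a_im * b_re  # second mixed product
        # Assumed combination: standard complex product plus the accumulator.
        out.append((d_re + rr - ii, d_im + ri + ir))
    return out


# (1 + 2i) * (3 + 4i) accumulated onto (1 + 1i)
result = complex_fma([(1, 2)], [(3, 4)], [(1, 1)])
```

A conjugating variant would only flip the signs applied to `ii` and one mixed product, which may be why the abstract treats the four products as the shared building block.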

22.

Granted invention patent
Systems, methods, and apparatus for tile configuration [In force]

Publication (announcement) number: US11080048B2

Publication (announcement) date: 2021-08-03

Application number: US16487777

Application date: 2017-07-01

Applicant: Intel Corporation

Inventor: Menachem Adelman, Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Jesus Corbal, Dan Baum, Alexander F. Heinecke, Elmoustapha Ould-Ahmed-Vall, Yuri Gebil, Raanan Sade

IPC: G06F9/30, G06F7/485, G06F7/487, G06F17/16, G06F7/76, G06F9/38

Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address, and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile is a set of 2-dimensional registers, are discussed.

23.

Invention application
APPARATUSES, METHODS, AND SYSTEMS FOR SWIZZLE OPERATIONS IN A CONFIGURABLE SPATIAL ACCELERATOR [Under examination, published]

Publication (announcement) number: US20200310797A1

Publication (announcement) date: 2020-10-01

Application number: US16370915

Application date: 2019-03-30

Applicant: Intel Corporation

Inventor: Jesus Corbal, Rohan Sharma, Simon Steely, Jr., Chinmay Ashok, Kent D. Glossop, Dennis Bradford, Paul Caprioli, Louise Huot, Kermin ChoFleming, Barry Tannenbaum

IPC: G06F9/30, G06F9/54

Abstract: Systems, methods, and apparatuses relating to swizzle operations and disable operations in a configurable spatial accelerator (CSA) are described. Certain embodiments herein provide for an encoding system for a specific set of swizzle primitives across a plurality of packed data elements in a CSA. In one embodiment, a CSA includes a plurality of processing elements, a circuit switched interconnect network between the plurality of processing elements, and a configuration register within each processing element to store a configuration value having a first portion that, when set to a first value that indicates a first mode, causes the processing element to pass an input value to operation circuitry of the processing element without modifying the input value, and, when set to a second value that indicates a second mode, causes the processing element to perform a swizzle operation on the input value to form a swizzled input value before sending the swizzled input value to the operation circuitry of the processing element, and a second portion that causes the processing element to perform an operation indicated by the second portion of the configuration value on the input value in the first mode and the swizzled input value in the second mode with the operation circuitry.
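
The two-part configuration value described in this abstract can be modeled with a toy sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the swizzle primitives or the operations, so lane reversal stands in for "a swizzle operation", and the `first` and `diff` operations, the `(mode, op)` tuple encoding, and the function name `run_pe` are all hypothetical.

```python
PASS_THROUGH = 0  # first mode: input goes to the operation circuitry unmodified
SWIZZLE = 1       # second mode: input is swizzled before the operation

# Hypothetical operations selected by the second portion of the config value.
OPERATIONS = {
    "first": lambda lanes: lanes[0],
    "diff": lambda lanes: lanes[0] - lanes[1],
}


def run_pe(config, lanes):
    """Model one processing element; config is a (mode, op_name) pair."""
    mode, op_name = config
    if mode == SWIZZLE:
        # Stand-in swizzle primitive (assumed): reverse the packed elements.
        lanes = lanes[::-1]
    # The same operation runs in both modes; only its input differs.
    return OPERATIONS[op_name](lanes)


plain = run_pe((PASS_THROUGH, "diff"), [5, 3, 1, 0])
swizzled = run_pe((SWIZZLE, "diff"), [5, 3, 1, 0])
```

The point of the sketch is only that the mode field changes what the operation sees, not which operation is performed.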

24.

Granted invention patent
Floating point to fixed point conversion [In force]

Publication (announcement) number: US10763891B2

Publication (announcement) date: 2020-09-01

Application number: US16291231

Application date: 2019-03-04

Applicant: Intel Corporation

Inventor: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Mark Charney

IPC: H03M7/24, H03M7/40, H03M7/42

Abstract: Embodiments of an instruction, its operation, and executional support for the instruction are described. In some embodiments, a processor comprises decode circuitry to decode an instruction having fields for an opcode, a packed data source operand identifier, and a packed data destination operand identifier; and execution circuitry to execute the decoded instruction to convert a single precision floating point data element of a least significant packed data element position of the identified packed data source operand to a fixed-point representation, store the fixed-point representation as a 32-bit integer and a 32-bit integer exponent in the two least significant packed data element positions of the identified packed data destination operand, and zero all remaining packed data elements of the identified packed data destination operand.
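
One way to picture the conversion in this abstract is the sketch below. The abstract does not give the scaling of the integer or the order of the two result elements, so both are assumptions here: the integer is taken to be a signed 24-bit significand placed in element 0, with the exponent in element 1, so that the value equals `mant * 2**exp`. The name `float_to_fixed` is hypothetical, and infinities and NaNs are not handled.

```python
import math
import struct


def float_to_fixed(src, lanes=4):
    """Convert src[0] to (integer significand, integer exponent); zero the rest."""
    # Round to single precision first, since Python floats are double precision.
    x = struct.unpack("<f", struct.pack("<f", src[0]))[0]
    m, e = math.frexp(x)            # x == m * 2**e with 0.5 <= |m| < 1, or m == 0
    mant = int(m * (1 << 24))       # assumed scaling: exact for any float32 value
    exp = e - 24 if x != 0 else 0   # chosen so that x == mant * 2**exp
    dst = [0] * lanes               # all remaining elements are zeroed
    dst[0], dst[1] = mant, exp      # assumed element order
    return dst


converted = float_to_fixed([1.5, 9.0, 9.0, 9.0])
```

Only the lowest source element is read; the other source elements do not influence the result.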

25.

Granted invention patent
Apparatus and method for multiplying, summing, and accumulating sets of packed bytes [In force]

Publication (announcement) number: US10705839B2

Publication (announcement) date: 2020-07-07

Application number: US15850499

Application date: 2017-12-21

Applicant: Intel Corporation

Inventor: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney, Jesus Corbal

IPC: G06F9/30

Abstract: A processor having a decoder to decode an instruction to generate a decoded instruction; a first source register to store a first plurality of packed signed bytes; a second source register to store a second plurality of packed signed bytes; execution circuitry to execute the decoded instruction, the execution circuitry including: multiplier circuitry to multiply each packed signed byte from the first source register with a corresponding packed signed byte from the second source register to generate temporary products; adder circuitry to add a plurality of sets of the temporary products to generate a plurality of temporary sums; negation and extension circuitry to negate and extend each of the temporary sums to doubleword sums; and accumulation circuitry to add each of the doubleword sums to a doubleword from a third source register to generate final doubleword results; and a packed data destination register to store the final doubleword results.
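
The multiply, sum, negate, and accumulate pipeline in this abstract can be sketched as below. Two details are assumptions rather than statements from the abstract: each "set" is taken to be four adjacent bytes (one doubleword's worth), and the final addition wraps to a signed 32-bit value instead of saturating. The function name `mul_sum_neg_acc` is hypothetical.

```python
def wrap32(value):
    """Wrap an integer to a signed 32-bit doubleword (assumed, non-saturating)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def mul_sum_neg_acc(src1, src2, acc, group=4):
    """src1, src2: lists of signed bytes; acc: one doubleword per group of bytes."""
    results = []
    for i, accumulator in enumerate(acc):
        lo = i * group
        # Multiply corresponding signed bytes to form the temporary products.
        products = [a * b for a, b in zip(src1[lo:lo + group], src2[lo:lo + group])]
        temp_sum = sum(products)   # add one set of temporary products
        negated = -temp_sum        # negate; Python ints need no explicit extension
        results.append(wrap32(accumulator + negated))  # add the third source
    return results


final = mul_sum_neg_acc([1, 2, 3, 4, -1, -2, -3, -4],
                        [1, 1, 1, 1, 2, 2, 2, 2],
                        [100, 0])
```

Because of the negation step, the net effect per doubleword is `acc - sum(products)`.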

26.

Granted invention patent
Systems and methods for implementing chained tile operations [In force]

Publication (announcement) number: US10664287B2

Publication (announcement) date: 2020-05-26

Application number: US15942201

Application date: 2018-03-30

Applicant: Intel Corporation

Inventor: Christopher J. Hughes, Alexander F. Heinecke, Robert Valentine, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall

IPC: G06F9/30, G06F15/80, G06F17/16, G06F9/302, G06F9/312, G06F9/38, G06F15/78

Abstract: Disclosed embodiments relate to systems and methods for implementing chained tile operations. In one example, a processor includes fetch circuitry to fetch one or more instructions until a plurality of instructions has been fetched, each instruction to specify source and destination tile operands, decode circuitry to decode the fetched instructions, and execution circuitry, responsive to the decoded instructions, to: identify first and second decoded instructions belonging to a chain of instructions, dynamically select and configure a SIMD path comprising first and second processing engines (PE) to execute the first and second decoded instructions, and set aside the specified destination of the first decoded instruction, and instead route a result of the first decoded instruction from the first PE to be used by the second PE to perform the second decoded instruction.

27.

Granted invention patent
Zeroing a cache line [In force]

Publication (announcement) number: US10282296B2

Publication (announcement) date: 2019-05-07

Application number: US15376647

Application date: 2016-12-12

Applicant: Intel Corporation

Inventor: Jason W. Brandt, Robert S. Chappell, Jesus Corbal, Edward T. Grochowski, Stephen H. Gunther, Buford M. Guy, Thomas R. Huff, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, David Papworth, James D. Allen

IPC: G06F12/0831, G06F12/1009, G06F12/1027

Abstract: Embodiments of an invention related to a processor architecture are disclosed. In an embodiment, a processor includes a decoder, an execution unit, a coherent cache, and an interconnect. The decoder is to decode an instruction to zero a cache line. The execution unit is to issue a write command to initiate a cache line sized write of zeros. The coherent cache is to receive the write command, to determine whether there is a hit in the coherent cache and whether a cache coherency protocol state of the hit cache line is a modified state or an exclusive state, to configure a cache line to indicate all zeros, and to issue the write command toward the interconnect. The interconnect is to, responsive to receipt of the write command, issue a snoop to each of a plurality of other coherent caches for which it must be determined if there is a hit.

28.

Invention application
SYSTEMS AND METHODS TO ZERO A TILE REGISTER PAIR [Under examination, published]

Publication (announcement) number: US20190042256A1

Publication (announcement) date: 2019-02-07

Application number: US15858947

Application date: 2017-12-29

Applicant: Intel Corporation

Inventor: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman, Eyal Hadas

IPC: G06F9/30

Abstract: Embodiments detailed herein relate to systems and methods to zero a tile register pair. In one example, a processor includes decode circuitry to decode a matrix pair zeroing instruction having fields for an opcode and an identifier to identify a destination matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded matrix pair zeroing instruction to zero every element of a left matrix and a right matrix of the identified destination matrix.

29.

Granted invention patent
Apparatus and method of improved permute instructions with multiple granularities [In force]

Publication (announcement) number: US09946540B2

Publication (announcement) date: 2018-04-17

Application number: US15601960

Application date: 2017-05-22

Applicant: Intel Corporation

Inventor: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

IPC: G06F9/30

CPC classification number: G06F9/30029, G06F9/30018, G06F9/30032, G06F9/30036, G06F9/30109

Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector element routing circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.
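
A rough software model of the routing and masking stages in this abstract is sketched below. The abstract only says there are three available bit widths, so the specific widths (8, 16, and 32 bits) are an assumption, as are the zeroing of masked-off elements by default, the index wrap-around, and the name `masked_permute`.

```python
ASSUMED_WIDTHS = (8, 16, 32)  # the abstract does not name the three widths


def masked_permute(src, indices, mask, width_bits, old=None):
    """Permute elements of src, then apply a per-element write mask.

    src: list of element values; indices: source position for each output
    position; mask: integer with one bit per output element.
    """
    if width_bits not in ASSUMED_WIDTHS:
        raise ValueError("unsupported element width")
    count = len(src)
    limit = (1 << width_bits) - 1
    old = old if old is not None else [0] * count
    out = []
    for pos in range(count):
        if (mask >> pos) & 1:
            # Routing stage: pull the selected input element into this output.
            out.append(src[indices[pos] % count] & limit)
        else:
            # Masking stage: keep the prior value (zero unless old is given).
            out.append(old[pos])
    return out


reversed_masked = masked_permute([10, 20, 30, 40], [3, 2, 1, 0], 0b0111, 16)
```

Because the mask carries one bit per element, the same function covers all three granularities: a narrower element width simply means more elements and therefore more mask bits for the same register size.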

30.

Granted invention patent
Instruction and logic to perform a centrifuge operation [In force]

Publication (announcement) number: US09904548B2

Publication (announcement) date: 2018-02-27

Application number: US14580069

Application date: 2014-12-22

Applicant: Intel Corporation

Inventor: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Mark J. Charney

IPC: G06F15/00, G06F7/38, G06F9/44, G06F9/00, G06F9/30

CPC classification number: G06F9/30145, G06F9/30018, G06F9/30032, G06F9/30036, G06F9/3013

Abstract: A processing device implements a set of instructions to perform a centrifuge operation using vector or general purpose registers. In one embodiment, the centrifuge operation separates bits in a source register to opposing regions of a destination register based on a control mask, where each source register bit with a corresponding control mask value of one is written to one region in a destination register, while source register bits with a corresponding control mask value of zero are written to an opposing region of the destination register.
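
The bit separation described in this abstract can be illustrated with a short sketch. The abstract says only that the two groups of bits go to "opposing regions", so placing the mask-one bits in the high region and the mask-zero bits in the low region, and keeping each group in its original relative order, are assumptions. The function name `centrifuge` is hypothetical.

```python
def centrifuge(src, mask, width=8):
    """Separate the bits of src into two regions according to mask."""
    ones = zeros = 0
    n_ones = n_zeros = 0
    for i in range(width):
        bit = (src >> i) & 1
        if (mask >> i) & 1:
            # Control mask value of one: collect into the first group.
            ones |= bit << n_ones
            n_ones += 1
        else:
            # Control mask value of zero: collect into the opposing group.
            zeros |= bit << n_zeros
            n_zeros += 1
    # Assumed layout: mask-one bits on the high side, mask-zero bits on the low side.
    return (ones << n_zeros) | zeros


separated = centrifuge(0b10110010, 0b11001100)
```

With source `0b10110010` and mask `0b11001100`, the bits at the masked positions (2, 3, 6, 7) form `0b1000` and the remaining bits form `0b1110`, giving `0b10001110`.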
