-
Publication No.: US11249761B2
Publication Date: 2022-02-15
Application No.: US16934003
Application Date: 2020-07-20
Applicant: Intel Corporation
Inventor: Dan Baum , Michael Espig , James Guilford , Wajdi K. Feghali , Raanan Sade , Christopher J. Hughes , Robert Valentine , Bret Toll , Elmoustapha Ould-Ahmed-Vall , Mark J. Charney , Vinodh Gopal , Ronen Zohar , Alexander F. Heinecke
Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instruction, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
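For illustration only, a minimal C sketch of the non-zero-packing variant described in the abstract: non-zero elements are packed into the destination and their positions recorded in a header. The element type, header layout, and function names are assumptions for this sketch, not the patented encoding.

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Illustrative header: a bitmap marking which matrix positions held
     * non-zero elements (layout assumed for this sketch only). */
    typedef struct {
        uint64_t bitmap[4];   /* supports up to 256 elements */
        size_t   count;       /* number of packed non-zero elements */
    } compress_header;

    /* Pack non-zero int8 elements of a row-major matrix into dst and
     * record each element's position in the header. Returns packed count. */
    static size_t compress_pack_nonzero(const int8_t *src, size_t elems,
                                        int8_t *dst, compress_header *hdr)
    {
        size_t n = 0;
        memset(hdr, 0, sizeof *hdr);
        for (size_t i = 0; i < elems && i < 256; i++) {
            if (src[i] != 0) {
                hdr->bitmap[i / 64] |= 1ULL << (i % 64);
                dst[n++] = src[i];
            }
        }
        hdr->count = n;
        return n;
    }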
-
Publication No.: US20220027154A1
Publication Date: 2022-01-27
Application No.: US17496632
Application Date: 2021-10-07
Applicant: Intel Corporation
Inventor: Vinodh Gopal , James D. Guilford , Gilbert M. Wolrich , Wajdi K. Feghali , Erdinc Ozturk , Martin G. Dixon , Sean P. Mirkes , Matthew C. Merten , Tong Li , Bret T. Toll, I
Abstract: A number of addition instructions are provided that have no data dependency on one another. A first addition instruction stores its carry output in a first flag of a flags register without modifying a second flag in the flags register. A second addition instruction stores its carry output in the second flag of the flags register without modifying the first flag in the flags register.
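Flag-isolated additions of this kind let two big-number carry chains be interleaved without serializing on a single carry flag. A minimal C sketch of that idea, keeping each chain's carry in its own variable; this models the effect in software rather than the instructions themselves, and the names are illustrative.

    #include <stddef.h>
    #include <stdint.h>

    /* Add two independent multi-word operand pairs in one interleaved loop.
     * Each chain keeps its carry in its own variable, mirroring the idea of
     * two architecturally separate carry flags. */
    static void dual_chain_add(uint64_t *sum1, const uint64_t *a1, const uint64_t *b1,
                               uint64_t *sum2, const uint64_t *a2, const uint64_t *b2,
                               size_t words)
    {
        uint64_t carry1 = 0, carry2 = 0;
        for (size_t i = 0; i < words; i++) {
            uint64_t s1 = a1[i] + carry1;          /* chain 1 */
            carry1 = (s1 < carry1);
            s1 += b1[i];
            carry1 += (s1 < b1[i]);
            sum1[i] = s1;

            uint64_t s2 = a2[i] + carry2;          /* chain 2, independent of chain 1 */
            carry2 = (s2 < carry2);
            s2 += b2[i];
            carry2 += (s2 < b2[i]);
            sum2[i] = s2;
        }
    }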
-
Publication No.: US11200054B2
Publication Date: 2021-12-14
Application No.: US16019302
Application Date: 2018-06-26
Applicant: Intel Corporation
Inventor: Vinodh Gopal
Abstract: Apparatus and associated methods for implementing atomic instructions for copy-XOR of data. An atomic-copy-xor instruction is defined having a first operand comprising an address of a first cacheline and a second operand comprising an address of a second cacheline. The atomic-copy-xor instruction, which may be included in an instruction set architecture (ISA) of a processor, performs a bitwise XOR operation on copies of data retrieved from the first cacheline and second cacheline to generate an XOR result, and replaces the data in the first cacheline with a copy of data from the second cacheline when the XOR result is non-zero. In addition to implementation using a processor core, the atomic-copy-xor instruction may be implemented using various offloading schemes under which the processor core executing the atomic-copy-xor instruction offloads operations to other components in the processor or system in which the processor is implemented, including offloading operations to a last level cache (LLC) engine, a memory controller, or a DIMM controller.
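A non-atomic C sketch of the described semantics on 64-byte lines; the atomicity, cacheline handling, and offload options are what the hardware adds, and the names here are illustrative only.

    #include <stdint.h>
    #include <string.h>

    #define CACHELINE_BYTES 64

    /* Software model: XOR copies of the two lines; if any bit differs
     * (non-zero XOR result), overwrite the first line with the second. */
    static void copy_xor_line(uint8_t *line1, const uint8_t *line2)
    {
        uint8_t xor_result[CACHELINE_BYTES];
        int nonzero = 0;

        for (int i = 0; i < CACHELINE_BYTES; i++) {
            xor_result[i] = line1[i] ^ line2[i];
            nonzero |= xor_result[i];
        }
        if (nonzero)
            memcpy(line1, line2, CACHELINE_BYTES);
    }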
-
Publication No.: US11188335B2
Publication Date: 2021-11-30
Application No.: US17087536
Application Date: 2020-11-02
Applicant: Intel Corporation
Inventor: Regev Shemy , Zeev Sperber , Wajdi Feghali , Vinodh Gopal , Amit Gradstein , Simon Rubanovich , Sean Gulley , Ilya Albrekht , Jacob Doweck , Jose Yallouz , Ittai Anati
Abstract: Systems, methods, and apparatuses relating to performing hashing operations on packed data elements are described. In one embodiment, a processor includes a decode circuit to decode a single instruction into a decoded single instruction, the single instruction including at least one first field that identifies eight 32-bit state elements A, B, C, D, E, F, G, and H for a round according to a SM3 hashing standard and at least one second field that identifies an input message; and an execution circuit to execute the decoded single instruction to: rotate state element C left by 9 bits to form a rotated state element C, rotate state element D left by 9 bits to form a rotated state element D, rotate state element G left by 19 bits to form a rotated state element G, rotate state element H left by 19 bits to form a rotated state element H, perform two rounds according to the SM3 hashing standard on the input message and state element A, state element B, rotated state element C, rotated state element D, state element E, state element F, rotated state element G, and rotated state element H to generate an updated state element A, an updated state element B, an updated state element E, and an updated state element F, and store the updated state element A, the updated state element B, the updated state element E, and the updated state element F into a location specified by the single instruction.
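A minimal C sketch of just the pre-rotation step spelled out in the abstract (C and D rotated left by 9 bits, G and H by 19 bits); the two SM3 rounds themselves are omitted, and the struct and field names are assumptions for the sketch.

    #include <stdint.h>

    /* Eight 32-bit SM3 state words A..H (field names are illustrative). */
    typedef struct { uint32_t a, b, c, d, e, f, g, h; } sm3_state;

    static uint32_t rotl32(uint32_t x, unsigned n)
    {
        return (x << n) | (x >> (32 - n));
    }

    /* Pre-rotate the state as described, before the two SM3 rounds are
     * applied to the partly rotated state. */
    static void sm3_prerotate(sm3_state *s)
    {
        s->c = rotl32(s->c, 9);
        s->d = rotl32(s->d, 9);
        s->g = rotl32(s->g, 19);
        s->h = rotl32(s->h, 19);
    }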
-
Publication No.: US11126663B2
Publication Date: 2021-09-21
Application No.: US15604793
Application Date: 2017-05-25
Applicant: Intel Corporation
Inventor: Sudhir K. Satpathy , Vikram B. Suresh , Sanu K. Mathew , Vinodh Gopal
IPC: G06F16/903 , G06F16/901 , G06F12/02
Abstract: In one embodiment, an apparatus comprises a decompression engine to determine a plurality of tokens used to encode a block of data; populate a lookup table with at least two of the tokens in order of increasing token length; disable a first portion of the lookup table and enable a second portion of the lookup table based on a value of a payload of the block of data; and search for a match between a token and the payload in the second portion of the lookup table.
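A rough C sketch of the length-ordered lookup idea: tokens sit in the table in order of increasing code length, and only the slice whose lengths are compatible with the current payload bits is searched. The table layout, slice selection, and names are assumptions, not the hardware design.

    #include <stddef.h>
    #include <stdint.h>

    /* One table entry: a variable-length code (1..32 bits), its length,
     * and its decoded symbol. Entries are stored in order of increasing
     * code length. */
    typedef struct {
        uint32_t code;     /* code bits, right-aligned */
        uint8_t  length;   /* code length in bits, assumed >= 1 */
        uint16_t symbol;   /* decoded symbol */
    } token_entry;

    /* Search only the enabled slice [start, end) of the table for a token
     * that matches the top bits of `payload` (a 32-bit window of input).
     * Which slice is enabled would be derived from the payload value; here
     * the caller passes it in. Returns the matching index or -1. */
    static int lut_match(const token_entry *table, size_t start, size_t end,
                         uint32_t payload)
    {
        for (size_t i = start; i < end; i++) {
            uint32_t prefix = payload >> (32 - table[i].length);
            if (prefix == table[i].code)
                return (int)i;
        }
        return -1;
    }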
-
Publication No.: US20210211139A1
Publication Date: 2021-07-08
Application No.: US16996012
Application Date: 2020-08-18
Applicant: Intel Corporation
Inventor: Vinodh Gopal , James D. Guilford , Sudhir K. Satpathy , Sanu K. Mathew
Abstract: Methods and apparatus to parallelize data decompression are disclosed. An example method includes selecting initial starting positions in a compressed data bitstream; adjusting a first one of the initial starting positions to determine a first adjusted starting position by decoding the bitstream starting at a training position in the bitstream, the decoding including traversing the bitstream from the training position as though first data located at the training position is a valid token; outputting first decoded data generated by decoding a first segment of the bitstream starting from the first adjusted starting position; and merging the first decoded data with second decoded data generated by decoding a second segment of the bitstream, the decoding of the second segment starting from a second position in the bitstream and being performed in parallel with the decoding of the first segment, and the second segment preceding the first segment in the bitstream.
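A conceptual C sketch of the starting-position adjustment: decoding begins at an earlier training position and walks the stream token by token as though every token were valid, and the first token boundary at or past the tentative split becomes the adjusted start. The callback, parameter names, and training-offset choice are illustrative, not the patented method.

    #include <stddef.h>

    /* Adjust a tentative segment start. `token_length_at` stands in for the
     * real decoder's "bit length of the token at this position" step. */
    static size_t adjust_start(const unsigned char *stream,
                               size_t training_bitpos, size_t tentative_bitpos,
                               size_t (*token_length_at)(const unsigned char *, size_t))
    {
        size_t pos = training_bitpos;
        while (pos < tentative_bitpos)
            pos += token_length_at(stream, pos);
        return pos;   /* first token boundary at or after the tentative start */
    }

Each adjusted segment can then be decoded in parallel with the preceding segment and the outputs merged in stream order, as the abstract describes.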
-
Publication No.: US10924591B2
Publication Date: 2021-02-16
Application No.: US16014690
Application Date: 2018-06-21
Applicant: Intel Corporation
Inventor: Wajdi Feghali , Vinodh Gopal , Kirk Yap , Sean Gulley , Simon Peffers
IPC: H04L29/06 , H04L12/863
Abstract: Methods and apparatus for low-latency link compression schemes. Under the schemes, packets or messages are dynamically selected for compression in view of current transmit queue levels. The latency incurred during compression and decompression is not added to the data-path, but sits on the side of the transmit queue. The system monitors the queue depth and, accordingly, initiates compression jobs based on the depth. Different compression levels may be dynamically selected and used based on queue depth. Under various schemes, either packets or messages are enqueued in the transmit queue or pointers to such packets and messages are enqueued. Additionally, packets/messages may be compressed prior to being enqueued, or after being enqueued, wherein an original uncompressed packet is replaced with a compressed packet. Compressed and uncompressed packets may be stored in queues or buffers and transmitted using different numbers of transmit cycles based on their compression ratios. The schemes may be implemented to improve the effective bandwidth of various types of links, including serial links, bus-type links, and socket-to-socket links in multi-socket systems.
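A small C sketch of the queue-depth policy described above: compression is skipped when the transmit queue is shallow, and a heavier level is chosen as the queue backs up. The thresholds, levels, and names are placeholders, not values from the patent.

    #include <stddef.h>

    /* Map transmit-queue depth to a compression level.
     * 0 means "send uncompressed"; higher numbers trade latency for ratio. */
    static int select_compression_level(size_t queue_depth,
                                        size_t low_mark, size_t high_mark)
    {
        if (queue_depth < low_mark)
            return 0;               /* link is keeping up: no compression */
        if (queue_depth < high_mark)
            return 1;               /* moderate backlog: fast compression */
        return 3;                   /* deep backlog: heavier compression */
    }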
-
Publication No.: US10686591B2
Publication Date: 2020-06-16
Application No.: US16208542
Application Date: 2018-12-03
Applicant: Intel Corporation
Inventor: Gilbert M. Wolrich , Vinodh Gopal , Kirk S. Yap
Abstract: Instructions and logic provide SIMD secure hashing round slice functionality. Some embodiments include a processor comprising: a decode stage to decode an instruction for a SIMD secure hashing algorithm round slice, the instruction specifying a source data operand set, a message-plus-constant operand set, a round-slice portion of the secure hashing algorithm round, and a rotator set portion of rotate settings. Processor execution units, responsive to the decoded instruction, perform a secure hashing round-slice set of round iterations upon the source data operand set, applying the message-plus-constant operand set and the rotator set, and store a result of the instruction in a SIMD destination register. One embodiment of the instruction specifies a hash round type as one of four MD5 round types. Other embodiments may specify a hash round type by an immediate operand as one of three SHA-1 round types or as a SHA-2 round type.
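For context, a C sketch of selecting one of the four standard MD5 round mixing functions by an immediate-style selector, as a software stand-in for specifying the hash round type in an instruction encoding; the wrapper and its name are assumptions.

    #include <stdint.h>

    /* The four standard MD5 round mixing functions, chosen by a 2-bit
     * immediate-style value (0..3). */
    static uint32_t md5_round_fn(unsigned imm, uint32_t x, uint32_t y, uint32_t z)
    {
        switch (imm & 3) {
        case 0:  return (x & y) | (~x & z);   /* F: rounds  0..15 */
        case 1:  return (x & z) | (y & ~z);   /* G: rounds 16..31 */
        case 2:  return x ^ y ^ z;            /* H: rounds 32..47 */
        default: return y ^ (x | ~z);         /* I: rounds 48..63 */
        }
    }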
-
Publication No.: US10684855B2
Publication Date: 2020-06-16
Application No.: US15686889
Application Date: 2017-08-25
Applicant: Intel Corporation
Inventor: Vinodh Gopal , James D. Guilford , Erdinc Ozturk , Wajdi K. Feghali , Gilbert M. Wolrich , Martin G. Dixon
Abstract: Method and apparatus for performing a shift and XOR operation. In one embodiment, an apparatus includes execution resources to execute a first instruction. In response to the first instruction, said execution resources perform a shift and XOR on at least one value.
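The operation is simple enough to model in a line of C; a sketch, with the operand width, shift direction, and parameterized shift amount chosen only for illustration.

    #include <stdint.h>

    /* Model of a fused shift-and-XOR: shift `a` left by `n` bits and XOR
     * the result with `b`. The shift amount is masked to stay in range. */
    static uint64_t shift_xor(uint64_t a, unsigned n, uint64_t b)
    {
        return (a << (n & 63)) ^ b;
    }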
-
Publication No.: US10581590B2
Publication Date: 2020-03-03
Application No.: US14984637
Application Date: 2015-12-30
Applicant: Intel Corporation
Inventor: Shay Gueron , Wajdi K. Feghali , Vinodh Gopal , Raghunandan Makaram , Martin G. Dixon , Srinivas Chennupaty , Michael E. Kounavis
IPC: H04L9/28 , G06F21/72 , H04L9/06 , G06F9/30 , G06F9/38 , H04L9/08 , G06F12/14 , G06F21/60 , G06F12/0875 , G06F12/0862 , G11C7/10 , G06F3/06
Abstract: A flexible AES instruction set for a general-purpose processor is provided. The instruction set includes instructions to perform a “one round” pass for AES encryption or decryption and also includes instructions to perform key generation. An immediate may be used to indicate round number and key size for key generation for 128/192/256-bit keys. The flexible AES instruction set enables full use of pipelining capabilities because it does not require tracking of implicit registers.
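This style of single-round and key-generation-assist instruction surfaces to C code today through the AES-NI intrinsics; a minimal sketch of that usage model, shown as one possible realization rather than the patent's required interface (the round-constant choice is illustrative).

    #include <wmmintrin.h>   /* AES-NI intrinsics; compile with -maes */

    /* One AES encryption round on a 128-bit state with a 128-bit round key. */
    static __m128i aes_one_round(__m128i state, __m128i round_key)
    {
        return _mm_aesenc_si128(state, round_key);
    }

    /* Key-generation assist: the 8-bit immediate carries the round constant. */
    static __m128i aes_keygen_step(__m128i key)
    {
        return _mm_aeskeygenassist_si128(key, 0x01);   /* rcon for round 1 */
    }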
-