-
221.
Publication No.: US20220197813A1
Publication Date: 2022-06-23
Application No.: US17133624
Application Date: 2020-12-23
Applicant: Intel Corporation
Inventor: Jayesh Gaur, Adarsh Chauhan, Vinodh Gopal, Vedvyas Shanbhogue, Sreenivas Subramoney, Wajdi Feghali
IPC: G06F12/0875, G06F12/0813, G06F12/0811, G06F12/1045
Abstract: Methods and apparatus relating to techniques for increasing per core memory bandwidth by using forget store operations are described. In an embodiment, a cache stores a buffer. Execution circuitry executes an instruction. The instruction causes one or more cachelines in the cache to be marked based on a start address for the buffer and a size of the buffer. A marked cacheline in the cache is to be prevented from being written back to memory. Other embodiments are also disclosed and claimed.
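The marking step in the abstract above can be illustrated with a toy model. This is a minimal sketch, not the patented mechanism: the 64-byte line size, the choice to mark only lines lying entirely inside the buffer, and the names `lines_fully_covered` and `ToyCache` are all assumptions made for illustration.

```python
CACHELINE_BYTES = 64  # assumption: the abstract does not state a line size


def lines_fully_covered(start, size, line_bytes=CACHELINE_BYTES):
    """Base addresses of cachelines lying entirely inside [start, start + size)."""
    first = -(-start // line_bytes) * line_bytes      # round start up to a line boundary
    end = (start + size) // line_bytes * line_bytes   # round end down to a line boundary
    return list(range(first, end, line_bytes))


class ToyCache:
    """Hypothetical cache model: tracks dirty lines and 'forget' marks only."""

    def __init__(self):
        self.dirty = set()
        self.forget = set()

    def store(self, addr):
        # A store makes the containing line dirty.
        self.dirty.add(addr // CACHELINE_BYTES * CACHELINE_BYTES)

    def forget_store(self, start, size):
        # Mark lines derived from the buffer's start address and size.
        self.forget.update(lines_fully_covered(start, size))

    def evict_all(self):
        # Marked lines are dropped instead of being written back to memory.
        written_back = sorted(self.dirty - self.forget)
        self.dirty.clear()
        self.forget.clear()
        return written_back
```

In this model, a scratch buffer that is stored to and then marked never appears in the write-back list, which is the bandwidth saving the abstract describes.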
-
Publication No.: US11258459B2
Publication Date: 2022-02-22
Application No.: US16996012
Application Date: 2020-08-18
Applicant: Intel Corporation
Inventor: Vinodh Gopal, James D. Guilford, Sudhir K. Satpathy, Sanu K. Mathew
Abstract: Methods and apparatus to parallelize data decompression are disclosed. An example method includes selecting initial starting positions in a compressed data bitstream; adjusting a first one of the initial starting positions to determine a first adjusted starting position by decoding the bitstream starting at a training position in the bitstream, the decoding including traversing the bitstream from the training position as though first data located at the training position is a valid token; outputting first decoded data generated by decoding a first segment of the bitstream starting from the first adjusted starting position; and merging the first decoded data with second decoded data generated by decoding a second segment of the bitstream, the decoding of the second segment starting from a second position in the bitstream and being performed in parallel with the decoding of the first segment, and the second segment preceding the first segment in the bitstream.
-
Publication No.: US20210365264A1
Publication Date: 2021-11-25
Application No.: US17393361
Application Date: 2021-08-03
Applicant: Intel Corporation
Inventor: Vinodh Gopal, James D. Guilford, Gilbert M. Wolrich, Wajdi K. Feghali, Erdinc Ozturk, Martin G. Dixon, Sean P. Mirkes, Matthew C. Merten, Tong Li, Bret T. Toll, I
Abstract: A number of addition instructions are provided that have no data dependency between each other. A first addition instruction stores its carry output in a first flag of a flags register without modifying a second flag in the flags register. A second addition instruction stores its carry output in the second flag of the flags register without modifying the first flag in the flags register.
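The two-flag scheme in the abstract above can be modeled in software. This is a minimal sketch under assumptions: the 64-bit operand width, the flag names `"c"` and `"o"`, and the helper `add_carry` are hypothetical. The x86 ADCX/ADOX pair behaves this way with the carry and overflow flags; whether this publication maps to exactly those instructions is an assumption.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit operands


def add_carry(a, b, flags, flag):
    """Add a + b + flags[flag]; write the carry-out to flags[flag] only.

    Every other entry in `flags` is left untouched, so two carry chains
    can be interleaved without one clobbering the other.
    """
    total = a + b + flags[flag]
    flags[flag] = total >> 64
    return total & MASK64


flags = {"c": 0, "o": 0}
low_a = add_carry(MASK64, 1, flags, "c")   # chain A overflows, sets flag "c"
low_b = add_carry(1, 2, flags, "o")        # chain B runs without touching "c"
high_a = add_carry(0, 0, flags, "c")       # chain A consumes its own carry
```

Because each chain reads and writes only its own flag, the two chains carry no dependency on each other and could be scheduled in parallel.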
-
Publication No.: US11106461B2
Publication Date: 2021-08-31
Application No.: US15939693
Application Date: 2018-03-29
Applicant: Intel Corporation
Inventor: Vinodh Gopal, James D. Guilford, Gilbert M. Wolrich, Wajdi K. Feghali, Erdinc Ozturk, Martin G. Dixon, Sean P. Mirkes, Bret L. Toll, Maxim Loktyukhin, Mark C. Davis, Alexandre J. Farcy
IPC: G06F9/30
Abstract: A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.
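A rotate that neither reads nor writes a carry flag is a pure function of its two inputs. The sketch below assumes a right rotate and a configurable operand width; the function name `rotate_right` is hypothetical and the abstract does not fix a direction.

```python
def rotate_right(value, amount, width=64):
    """Rotate `value` right by `amount` bits within `width` bits.

    The result depends only on the source operand and the rotate amount;
    no flag state is consulted, matching the behavior the abstract describes.
    """
    mask = (1 << width) - 1
    amount %= width
    value &= mask
    return ((value >> amount) | (value << (width - amount))) & mask
```

Since no flag is an input, such a rotate does not have to wait for an earlier flag-writing instruction to complete.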
-
Publication No.: US20210049013A1
Publication Date: 2021-02-18
Application No.: US17087536
Application Date: 2020-11-02
Applicant: Intel Corporation
Inventor: Regev Shemy, Zeev Sperber, Wajdi Feghali, Vinodh Gopal, Amit Gradstein, Simon Rubanovich, Sean Gulley, Ilya Albrekht, Jacob Doweck, Jose Yallouz, Ittai Anati
Abstract: Systems, methods, and apparatuses relating to performing hashing operations on packed data elements are described. In one embodiment, a processor includes a decode circuit to decode a single instruction into a decoded single instruction, the single instruction including at least one first field that identifies eight 32-bit state elements A, B, C, D, E, F, G, and H for a round according to a SM3 hashing standard and at least one second field that identifies an input message; and an execution circuit to execute the decoded single instruction to: rotate state element C left by 9 bits to form a rotated state element C, rotate state element D left by 9 bits to form a rotated state element D, rotate state element G left by 19 bits to form a rotated state element G, rotate state element H left by 19 bits to form a rotated state element H, perform two rounds according to the SM3 hashing standard on the input message and state element A, state element B, rotated state element C, rotated state element D, state element E, state element F, rotated state element G, and rotated state element H to generate an updated state element A, an updated state element B, an updated state element E, and an updated state element F, and store the updated state element A, the updated state element B, the updated state element E, and the updated state element F into a location specified by the single instruction.
-
226.
Publication No.: US10922079B2
Publication Date: 2021-02-16
Application No.: US15856245
Application Date: 2017-12-28
Applicant: Intel Corporation
Inventor: Vinodh Gopal, Kirk S. Yap, James Guilford, Simon N. Peffers
IPC: G06F9/00, G06F9/30, G06F16/2455, G06F16/2453, G06F16/245
Abstract: Data element filter logic (“hardware accelerator”) in a processor that offloads computation for an in-memory database select/extract operation from a Central Processing Unit (CPU) core in the processor is provided. The data element filter logic provides balanced performance across the entire range of widths (number of bits) of data elements in a column-oriented Database Management System.
-
Publication No.: US10694217B2
Publication Date: 2020-06-23
Application No.: US16137985
Application Date: 2018-09-21
Applicant: Intel Corporation
Inventor: Sudhir K. Satpathy, Vinodh Gopal, James D. Guilford, Sanu K. Mathew, Vikram B. Suresh
Abstract: A processing device includes compression circuitry to encode an input stream with an encoding that translates multiple symbols of fixed length into multiple codes of variable length between one and a maximum length, to generate a compressed stream. The compression circuitry is to: determine at least a first symbol of the multiple symbols having a first code that exceeds the maximum length; identify a short code of the multiple codes that is to be lengthened to provide an increased encoding capacity for the at least the first symbol; generate multiple code-length converted values including to increase the length of the short code to the maximum length and decrease, to the maximum length, a length of the first code of the at least the first symbol; and generate, with use of the set of code-length converted values, the compressed stream at the output terminal.
-
Publication No.: US10592245B2
Publication Date: 2020-03-17
Application No.: US15600200
Application Date: 2017-05-19
Applicant: Intel Corporation
Inventor: Gilbert M. Wolrich, Vinodh Gopal, Sean M. Gulley, Kirk S. Yap, Wajdi K. Feghali
Abstract: Instructions and logic provide SIMD SM3 cryptographic hashing functionality. Some embodiments include a processor comprising: a decoder to decode instructions for a SIMD SM3 message expansion, specifying first and second source data operand sets, and an expansion extent. Processor execution units, responsive to the instruction, perform a number of SM3 message expansions, from the first and second source data operand sets, determined by the specified expansion extent and store the result into a SIMD destination register. Some embodiments also execute instructions for a SIMD SM3 hash round-slice portion of the hashing algorithm, from an intermediate hash value input, a source data set, and a round constant set. Processor execution units perform a set of SM3 hashing round iterations upon the source data set, applying the intermediate hash value input and the round constant set, and store a new hash value result in a SIMD destination register.
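For reference, the SM3 message expansion that such instructions accelerate can be written in scalar form. This is a sketch of the expansion as defined in the SM3 standard, not of the SIMD instruction itself; the function names `rotl32`, `p1`, and `expand_message` are chosen here for illustration.

```python
def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def p1(x):
    """SM3 permutation P1 used in message expansion."""
    return x ^ rotl32(x, 15) ^ rotl32(x, 23)


def expand_message(block_words):
    """Expand 16 big-endian 32-bit words into W[0..67] and W'[0..63]."""
    w = list(block_words)
    for j in range(16, 68):
        w.append(
            p1(w[j - 16] ^ w[j - 9] ^ rotl32(w[j - 3], 15))
            ^ rotl32(w[j - 13], 7)
            ^ w[j - 6]
        )
    w_prime = [w[j] ^ w[j + 4] for j in range(64)]
    return w, w_prime


# Padded single block for the message "abc".
abc_block = [0x61626380] + [0] * 14 + [0x00000018]
w, w_prime = expand_message(abc_block)
```

Each new word depends on words at most 16 positions back and at least 3 positions back, so several consecutive words can be produced per step, which is what makes a configurable "expansion extent" practical in a SIMD register.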
-
Publication No.: US10579380B2
Publication Date: 2020-03-03
Application No.: US14568754
Application Date: 2014-12-12
Applicant: Intel Corporation
Inventor: Maxim Loktyukhin, Eric W Mahurin, Bret L Toll, Martin G Dixon, Sean P Mirkes, David L Kreitzer, Elmoustapha Ould-Ahmed-Vall, Vinodh Gopal
Abstract: Receiving an instruction indicating a source operand and a destination operand. Storing a result in the destination operand in response to the instruction. The result may have: (1) a first range of bits having a first end explicitly specified by the instruction, in which each bit is identical in value to a bit of the source operand in a corresponding position; and (2) a second range of bits that all have the same value regardless of the values of bits of the source operand in corresponding positions. Execution of the instruction may complete without moving the first range of the result relative to the bits of identical value in the corresponding positions of the source operand, regardless of the location of the first range of bits in the result. Execution units to execute such instructions, computer systems having processors to execute such instructions, and machine-readable media storing such an instruction are also disclosed.
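One possible reading of the result layout can be sketched as a mask operation. The sketch assumes that the preserved range is the low-order bits below a given index and that the other range is cleared to zero; the abstract only requires that the second range hold a single uniform value, and the name `zero_high_bits` is hypothetical.

```python
def zero_high_bits(src, index, width=64):
    """Keep bits [0, index) of `src` in place and clear bits [index, width).

    The kept bits stay in their original positions (no shift is applied),
    which mirrors the 'without moving the first range' property.
    """
    mask = (1 << width) - 1
    if index >= width:
        return src & mask          # nothing to clear
    return src & ((1 << index) - 1)
```

A single instruction of this kind would replace the usual shift-left, shift-right pair or the explicit construction of a mask constant.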
-
Publication No.: US10554386B2
Publication Date: 2020-02-04
Application No.: US14014091
Application Date: 2013-08-29
Applicant: Intel Corporation
Inventor: Shay Gueron, Wajdi K. Feghali, Vinodh Gopal, Raghunandan Makaram, Martin G. Dixon, Srinivas Chennupaty, Michael E. Kounavis
IPC: H04L9/28, G06F21/72, H04L9/06, G06F9/30, G06F9/38, H04L9/08, G06F12/14, G06F21/60, G06F12/0875, G06F12/0862, G11C7/10, G06F3/06
Abstract: A flexible AES instruction set for a general purpose processor is provided. The instruction set includes instructions to perform a “one round” pass for AES encryption or decryption and also includes instructions to perform key generation. An immediate may be used to indicate round number and key size for key generation for 128/192/256 bit keys. The flexible AES instruction set enables full use of pipelining capabilities because it does not require tracking of implicit registers.
-