Patent search ap:("INTEL CORPORATION") AND inv:"Wajdi Feghali" Page 2

11.

发明授权
Apparatuses, methods, and systems for hashing instructions 有权

公开(公告)号：US10824428B2

公开(公告)日：2020-11-03

申请号：US16370459

申请日：2019-03-29

Applicant: Intel Corporation

Inventor： Regev Shemy , Zeev Sperber , Wajdi Feghali , Vinodh Gopal , Amit Gradstein , Simon Rubanovich , Sean Gulley , Ilya Albrekht , Jacob Doweck , Jose Yallouz , Ittai Anati

IPC: G06F9/30 , H04L9/06 , G06F9/38

Abstract: Systems, methods, and apparatuses relating to performing hashing operations on packed data elements are described. In one embodiment, a processor includes a decode circuit to decode a single instruction into a decoded single instruction, the single instruction including at least one first field that identifies eight 32-bit state elements A, B, C, D, E, F, G, and H for a round according to a SM3 hashing standard and at least one second field that identifies an input message; and an execution circuit to execute the decoded single instruction to: rotate state element C left by 9 bits to form a rotated state element C, rotate state element D left by 9 bits to form a rotated state element D, rotate state element G left by 19 bits to form a rotated state element G, rotate state element H left by 19 bits to form a rotated state element H, perform two rounds according to the SM3 hashing standard on the input message and state element A, state element B, rotated state element C, rotated state element D, state element E, state element F, rotated state element G, and rotated state element H to generate an updated state element A, an updated state element B, an updated state element E, and an updated state element F, and store the updated state element A, the updated state element B, the updated state element E, and the updated state element F into a location specified by the single instruction.

12.

发明授权
Apparatus and method for low-latency decompression acceleration via a single job descriptor 有权

公开(公告)号：US11989582B2

公开(公告)日：2024-05-21

申请号：US17033760

申请日：2020-09-26

Applicant: Intel Corporation

Inventor： James Guilford , George Powley , Vinodh Gopal , Wajdi Feghali

IPC: G06F9/38 , G06F9/48

CPC classification number: G06F9/4881 , G06F9/3887 , G06F2209/483

Abstract: Apparatus and method for performing low-latency multi-job submission via a single job descriptor is described herein. An apparatus embodiment includes a plurality of descriptor queues to stores job descriptors describing work to be performed and enqueue circuitry to receive a first job descriptor which includes a first field to store a Single Instruction Multiple Data (SIMD) width. If the SIMD width indicates that the first job descriptor is an SIMD job descriptor and open slots are available in the descriptor queues to store new job descriptors, then the enqueue circuitry is to generate a plurality of job descriptors based on fields of the first job descriptor and to store them in the open slots of the descriptor queues. The generated job descriptors are processed by processing pipelines to perform the work described. At least some of the generated job descriptors are processed concurrently or in parallel by different processing pipelines.

13.

发明公开
EFFICIENT IMPLEMENTATION OF ZUC AUTHENTICATION 审中-公开

公开(公告)号：US20240113863A1

公开(公告)日：2024-04-04

申请号：US18129814

申请日：2023-03-31

Applicant: Intel Corporation

Inventor： Pablo De Lara Guarch , Tomasz Kantecki , Krystian Matusiewicz , Wajdi Feghali , Vinodh Gopal , James D. Guilford

IPC: H04L9/06 , H04L9/32 , H04W12/037

CPC classification number: H04L9/065 , H04L9/3215 , H04W12/037

Abstract: Methods and apparatus relating to an efficient implementation of ZUC authentication are described. In one embodiment, a processor computes a tag update, based at least in part on stored data, for an authentication operation. The tag update is computed by replacing a ‘for’ loop with a carry-less multiply operation. Other embodiments are also claimed and disclosed.

14.

发明公开
HARDWARE ACCELERATED STRING FILTER 审中-公开

公开(公告)号：US20240028577A1

公开(公告)日：2024-01-25

申请号：US18225939

申请日：2023-07-25

Applicant: Intel Corporation

Inventor： Jixing Gu , Vinodh Gopal , Fang Xie , David Cohen , Wajdi Feghali

IPC: G06F16/22

CPC classification number: G06F16/2272 , G06F16/24568

Abstract: An apparatus may include an accelerator and a processor. The processor may receive an input string targeting a data buffer comprising a plurality of strings. The processor may receive, from the accelerator, a fixed-length data buffer based on the data buffer, respective ones of a plurality of entries of the fixed-length data buffer based on respective ones of the strings. The processor may receive, from the accelerator, a plurality of streams, respective ones of the plurality of streams to comprise a portion of respective entries in the fixed-length data buffer. The processor may generate, based on the input string, a plurality of target portions of the input string. The processor may receive, from the accelerator, indexes of the plurality of streams based on respective target portions of the input string matching respective entries of the plurality of streams. The processor may aggregate the indexes received from the accelerator.

15.

发明申请
APPLICATION PROGRAMMING INTERFACE FOR FINE GRAINED LOW LATENCY DECOMPRESSION WITHIN PROCESSOR CORE 有权

公开(公告)号：US20220197813A1

公开(公告)日：2022-06-23

申请号：US17133624

申请日：2020-12-23

Applicant: Intel Corporation

Inventor： Jayesh Gaur , Adarsh Chauhan , Vinodh Gopal , Vedvyas Shanbhogue , Sreenivas Subramoney , Wajdi Feghali

IPC: G06F12/0875 , G06F12/0813 , G06F12/0811 , G06F12/1045

Abstract: Methods and apparatus relating to techniques for increasing per core memory bandwidth by using forget store operations are described. In an embodiment, a cache stores a buffer. Execution circuitry executes an instruction. The instruction causes one or more cachelines in the cache to be marked based on a start address for the buffer and a size of the buffer. A marked cacheline in the cache is to be prevented from being written back to memory. Other embodiments are also disclosed and claimed.

16.

发明申请
APPARATUSES, METHODS, AND SYSTEMS FOR HASHING INSTRUCTIONS 有权

公开(公告)号：US20210049013A1

公开(公告)日：2021-02-18

申请号：US17087536

申请日：2020-11-02

Applicant: Intel Corporation

Inventor： Regev Shemy , Zeev Sperber , Wajdi Feghali , Vinodh Gopal , Amit Gradstein , Simon Rubanovich , Sean Gulley , Ilya Albrekht , Jacob Doweck , Jose Yallouz , Ittai Anati

IPC: G06F9/30 , G06F9/38 , H04L9/06

Abstract: Systems, methods, and apparatuses relating to performing hashing operations on packed data elements are described. In one embodiment, a processor includes a decode circuit to decode a single instruction into a decoded single instruction, the single instruction including at least one first field that identifies eight 32-bit state elements A, B, C, D, E, F, G, and H for a round according to a SM3 hashing standard and at least one second field that identifies an input message; and an execution circuit to execute the decoded single instruction to: rotate state element C left by 9 bits to form a rotated state element C, rotate state element D left by 9 bits to form a rotated state element D, rotate state element G left by 19 bits to form a rotated state element G, rotate state element H left by 19 bits to form a rotated state element H, perform two rounds according to the SM3 hashing standard on the input message and state element A, state element B, rotated state element C, rotated state element D, state element E, state element F, rotated state element G, and rotated state element H to generate an updated state element A, an updated state element B, an updated state element E, and an updated state element F, and store the updated state element A, the updated state element B, the updated state element E, and the updated state element F into a location specified by the single instruction.

17.

发明授权
Instruction-set architecture for programmable Cyclic Redundancy Check (CRC) computations 有权
Title translation: 可编程循环冗余校验（CRC）计算的指令集架构

公开(公告)号：US09047082B2

公开(公告)日：2015-06-02

申请号：US14254233

申请日：2014-04-16

Applicant: Intel Corporation

Inventor： Vinodh Gopal , Shay Gueron , Gilbert Wolrich , Wajdi Feghali , Kirk Yap , Bradley Burres

IPC: G06F11/00 , G06F9/30 , H03M13/09 , G06F11/10

CPC classification number: G06F9/30145 , G06F9/30018 , G06F11/1004 , H03M13/09 , H03M13/091

Abstract: A method and apparatus to perform Cyclic Redundancy Check (CRC) operations on a data block using a plurality of different n-bit polynomials is provided. A flexible CRC instruction performs a CRC operation using a programmable n-bit polynomial. The n-bit polynomial is provided to the CRC instruction by storing the n-bit polynomial in one of two operands.

Abstract translation: 提供了使用多个不同的n位多项式对数据块执行循环冗余校验（CRC）操作的方法和装置。灵活的CRC指令使用可编程的n位多项式来执行CRC操作。通过将n位多项式存储在两个操作数之一中，将n位多项式提供给CRC指令。

18.

发明公开
APPARATUSES, METHODS, AND SYSTEMS FOR HASHING INSTRUCTIONS 审中-公开

公开(公告)号：US20240036865A1

公开(公告)日：2024-02-01

申请号：US18336985

申请日：2023-06-17

Applicant: Intel Corporation

Inventor： Regev Shemy , Zeev Sperber , Wajdi Feghali , Vinodh Gopal , Amit Gradstein , Simon Rubanovich , Sean Gulley , Ilya Albrekht , Jacob Doweck , Jose Yallouz , Ittai Anati

IPC: G06F9/30 , G06F9/38 , H04L9/06

CPC classification number: G06F9/30145 , G06F9/30043 , G06F9/30196 , G06F9/3887 , H04L9/0643

Abstract: Systems, methods, and apparatuses relating to performing hashing operations on packed data elements are described. In one embodiment, a processor includes a decode circuit to decode a single instruction into a decoded single instruction, the single instruction including at least one first field that identifies eight 32-bit state elements A, B, C, D, E, F, G, and H for a round according to a SM3 hashing standard and at least one second field that identifies an input message; and an execution circuit to execute the decoded single instruction to: rotate state element C left by 9 bits to form a rotated state element C, rotate state element D left by 9 bits to form a rotated state element D, rotate state element G left by 19 bits to form a rotated state element G, rotate state element H left by 19 bits to form a rotated state element H, perform two rounds according to the SM3 hashing standard on the input message and state element A, state element B, rotated state element C, rotated state element D, state element E, state element F, rotated state element G, and rotated state element H to generate an updated state element A, an updated state element B, an updated state element E, and an updated state element F, and store the updated state element A, the updated state element B, the updated state element E, and the updated state element F into a location specified by the single instruction.

19.

发明申请
APPARATUSES, METHODS, AND SYSTEMS FOR HASHING INSTRUCTIONS 有权

公开(公告)号：US20220147356A1

公开(公告)日：2022-05-12

申请号：US17537373

申请日：2021-11-29

Applicant: Intel Corporation

Inventor： Regev Shemy , Zeev Sperber , Wajdi Feghali , Vinodh Gopal , Amit Gradstein , Simon Rubanovich , Sean Gulley , Ilya Albrekht , Jacob Doweck , Jose Yallouz , Ittai Anati

IPC: G06F9/30 , G06F9/38 , H04L9/06

Abstract: Systems, methods, and apparatuses relating to performing hashing operations on packed data elements are described. In one embodiment, a processor includes a decode circuit to decode a single instruction into a decoded single instruction, the single instruction including at least one first field that identifies eight 32-bit state elements A, B, C, D, E, F, G, and H for a round according to a SM3 hashing standard and at least one second field that identifies an input message; and an execution circuit to execute the decoded single instruction to: rotate state element C left by 9 bits to form a rotated state element C, rotate state element D left by 9 bits to form a rotated state element D, rotate state element G left by 19 bits to form a rotated state element G, rotate state element H left by 19 bits to form a rotated state element H, perform two rounds according to the SM3 hashing standard on the input message and state element A, state element B, rotated state element C, rotated state element D, state element E, state element F, rotated state element G, and rotated state element H to generate an updated state element A, an updated state element B, an updated state element E, and an updated state element F, and store the updated state element A, the updated state element B, the updated state element E, and the updated state element F into a location specified by the single instruction.

20.

发明申请
APPARATUS AND METHOD FOR LOW-LATENCY DECOMPRESSION ACCELERATION VIA A SINGLE JOB DESCRIPTOR 有权

公开(公告)号：US20220100526A1

公开(公告)日：2022-03-31

申请号：US17033760

申请日：2020-09-26

Applicant: Intel Corporation

Inventor： James Guilford , George Powley , Vinodh Gopal , Wajdi Feghali

IPC: G06F9/38

Abstract: Apparatus and method for performing low-latency multi-job submission via a single job descriptor is described herein. An apparatus embodiment includes a plurality of descriptor queues to stores job descriptors describing work to be performed and enqueue circuitry to receive a first job descriptor which includes a first field to store a Single Instruction Multiple Data (SIMD) width. If the SIMD width indicates that the first job descriptor is an SIMD job descriptor and open slots are available in the descriptor queues to store new job descriptors, then the enqueue circuitry is to generate a plurality of job descriptors based on fields of the first job descriptor and to store them in the open slots of the descriptor queues. The generated job descriptors are processed by processing pipelines to perform the work described. At least some of the generated job descriptors are processed concurrently or in parallel by different processing pipelines.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification