Accelerating neural networks with low precision-based multiplication and exploiting sparsity in higher order bits

    Publication number: US11714998B2

    Publication date: 2023-08-01

    Application number: US16909295

    Application date: 2020-06-23

    CPC classification number: G06N3/063 G06N3/0454 G06N3/088

    Abstract: An apparatus to facilitate accelerating neural networks with low precision-based multiplication and exploiting sparsity in higher order bits is disclosed. The apparatus includes a processor comprising a re-encoder to re-encode a first input number of signed input numbers represented in a first precision format as part of a machine learning model, the first input number re-encoded into two signed input numbers of a second precision format, wherein the first precision format is a higher precision format than the second precision format. The processor further includes a multiply-add circuit to perform operations in the first precision format using the two signed input numbers of the second precision format, and a sparsity hardware circuit to reduce computation on zero values at the multiply-add circuit, wherein the processor is to execute the machine learning model using the re-encoder, the multiply-add circuit, and the sparsity hardware circuit.
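
    The decomposition the abstract describes can be modeled in a few lines. The sketch below is a minimal, hypothetical Python model, not the patented circuit: it splits each signed 8-bit operand into an arithmetically shifted signed high nibble and an unsigned low nibble (the patent re-encodes into two signed low-precision numbers, which would also zero the high part of small negative values; this simpler split only does so for small non-negative values), rebuilds the wide product from up to four narrow multiplies, and skips any partial product whose high part is zero, i.e., the "sparsity in higher order bits." All names and the exact split are illustrative assumptions.

```python
# Hypothetical model of low-precision multiplication with high-order-bit
# sparsity skipping. Illustrative only; not the patented encoding.

def split_int8(x: int) -> tuple[int, int]:
    """Split int8 x into (hi, lo) with x == (hi << 4) + lo.

    hi is the arithmetically shifted signed high nibble in [-8, 7];
    lo is the unsigned low nibble in [0, 15].
    """
    hi = x >> 4        # arithmetic shift keeps the sign in the high part
    lo = x & 0xF
    assert (hi << 4) + lo == x
    return hi, lo

def narrow_multiply_add(a: int, b: int, acc: int = 0) -> int:
    """Compute acc + a*b using only narrow (4-bit-operand) multiplies.

    Partial products involving a zero high nibble are skipped: operands
    whose high nibble is zero (values 0..15 under this split) need fewer
    narrow multiplies, which is the sparsity the hardware exploits.
    """
    a_hi, a_lo = split_int8(a)
    b_hi, b_lo = split_int8(b)

    acc += a_lo * b_lo                # always needed
    if a_hi != 0:
        acc += (a_hi * b_lo) << 4     # skipped when a's high bits are zero
    if b_hi != 0:
        acc += (a_lo * b_hi) << 4     # skipped when b's high bits are zero
    if a_hi != 0 and b_hi != 0:
        acc += (a_hi * b_hi) << 8
    return acc

# Quick check against a direct wide multiply over the full int8 range.
for a in range(-128, 128):
    for b in range(-128, 128):
        assert narrow_multiply_add(a, b) == a * b
```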

    ACCELERATING NEURAL NETWORKS WITH LOW PRECISION-BASED MULTIPLICATION AND EXPLOITING SPARSITY IN HIGHER ORDER BITS

    Publication number: US20200320375A1

    Publication date: 2020-10-08

    Application number: US16909295

    Application date: 2020-06-23

    Abstract: An apparatus to facilitate accelerating neural networks with low precision-based multiplication and exploiting sparsity in higher order bits is disclosed. The apparatus includes a processor comprising a re-encoder to re-encode a first input number of signed input numbers represented in a first precision format as part of a machine learning model, the first input number re-encoded into two signed input numbers of a second precision format, wherein the first precision format is a higher precision format than the second precision format. The processor further includes a multiply-add circuit to perform operations in the first precision format using the two signed input numbers of the second precision format, and a sparsity hardware circuit to reduce computation on zero values at the multiply-add circuit, wherein the processor is to execute the machine learning model using the re-encoder, the multiply-add circuit, and the sparsity hardware circuit.

    ACCELERATING NEURAL NETWORKS WITH LOW PRECISION-BASED MULTIPLICATION AND EXPLOITING SPARSITY IN HIGHER ORDER BITS

    Publication number: US20240005135A1

    Publication date: 2024-01-04

    Application number: US18135958

    Application date: 2023-04-18

    CPC classification number: G06N3/063 G06N3/045 G06N3/088

    Abstract: An apparatus to facilitate accelerating neural networks with low precision-based multiplication and exploiting sparsity in higher order bits is disclosed. The apparatus includes a processor comprising a re-encoder to re-encode a first input number of signed input numbers represented in a first precision format as part of a machine learning model, the first input number re-encoded into two signed input numbers of a second precision format, wherein the first precision format is a higher precision format than the second precision format. The processor further includes a multiply-add circuit to perform operations in the first precision format using the two signed input numbers of the second precision format, and a sparsity hardware circuit to reduce computation on zero values at the multiply-add circuit, wherein the processor is to execute the machine learning model using the re-encoder, the multiply-add circuit, and the sparsity hardware circuit.

    HARDWARE-SOFTWARE CO-DESIGNED MULTI-CAST FOR IN-MEMORY COMPUTING ARCHITECTURES

    Publication number: US20220113974A1

    Publication date: 2022-04-14

    Application number: US17561029

    Application date: 2021-12-23

    Abstract: A memory architecture includes processing circuits co-located with memory subarrays for performing computations within the memory architecture. The memory architecture includes a plurality of decoders arranged in hierarchical levels that provide a multicast capability for distributing data or compute operations to individual subarrays. The multicast may be configurable with respect to individual fan-outs at each hierarchical level. A computation workflow may be organized into a compute supertile that processes one or more "supertiles" of input data. The individual data tiles of an input data supertile may be used by multiple compute tiles executed by the processing circuits of the subarrays, and the data tiles are multicast to the respective processing circuits for efficient data loading and parallel computation.
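
    The multicast and supertile ideas lend themselves to a short software model. The sketch below is an assumption-laden illustration, not the patented design: two decoder levels (bank, then subarray) with configurable per-level selection masks stand in for the hierarchical decoders, a data tile is delivered once to every selected subarray, and the co-located "processing circuits" then reuse that tile for their compute tiles instead of each receiving a separate unicast write. All class and function names are hypothetical.

```python
# Hypothetical two-level multicast model for in-memory compute; the masks
# play the role of the configurable per-level fan-outs in the abstract.

from dataclasses import dataclass, field

@dataclass
class Subarray:
    """A memory subarray with a co-located processing circuit."""
    tiles: list = field(default_factory=list)

    def compute(self, weight_tile):
        # Toy "compute tile": multiply-accumulate of the most recently
        # multicast data tile against a locally stored weight tile.
        data = self.tiles[-1]
        return sum(d * w for d, w in zip(data, weight_tile))

@dataclass
class Bank:
    subarrays: list

def multicast(banks, bank_mask, subarray_mask, tile):
    """Deliver `tile` once per selected leaf instead of one unicast per consumer."""
    for b, bank in enumerate(banks):
        if not bank_mask[b]:
            continue                    # level-1 decoder prunes this bank
        for s, sub in enumerate(bank.subarrays):
            if subarray_mask[s]:
                sub.tiles.append(tile)  # level-2 decoder fans the tile out

# Usage: one input data tile feeds compute tiles in four subarrays at once.
banks = [Bank([Subarray() for _ in range(2)]) for _ in range(2)]
multicast(banks, bank_mask=[1, 1], subarray_mask=[1, 1], tile=[1, 2, 3])
results = [sub.compute([4, 5, 6]) for bank in banks for sub in bank.subarrays]
print(results)  # [32, 32, 32, 32]: the same tile reused by four compute tiles
```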
