-
Publication Number: US20240070799A1
Publication Date: 2024-02-29
Application Number: US18461038
Application Date: 2023-09-05
Applicant: Intel Corporation
Inventor: Dhiraj D. KALAMKAR , Karthikeyan VAIDYANATHAN , Srinivas SRIDHARAN , Dipankar DAS
Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
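As a rough illustration of the abstract's idea, the sketch below builds a "global view" of per-layer allreduce operations and picks the endpoint count with the lowest estimated communication time. The `Layer`, `allreduce_cost`, and `choose_endpoints` names and the alpha-beta ring cost model are assumptions for the sketch, not the patent's actual method.

```python
# Minimal sketch: estimate communication cost from a global view of
# per-layer allreduce operations, then pick an endpoint count.
# Cost model and all names are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    grad_bytes: int  # size of the gradient tensor exchanged for this layer

def allreduce_cost(nbytes: int, nodes: int, endpoints: int,
                   latency_s: float = 5e-6, bw_bytes_s: float = 12.5e9) -> float:
    """Ring-allreduce time estimate, split across `endpoints` parallel channels."""
    steps = 2 * (nodes - 1)
    per_step = nbytes / (nodes * endpoints)
    return steps * (latency_s + per_step / bw_bytes_s)

def choose_endpoints(model: list[Layer], nodes: int, max_endpoints: int = 8) -> int:
    """Pick the endpoint count minimizing total estimated communication time."""
    def total(e: int) -> float:
        return sum(allreduce_cost(l.grad_bytes, nodes, e) for l in model)
    return min(range(1, max_endpoints + 1), key=total)

model = [Layer("conv1", 4 << 20), Layer("fc1", 64 << 20)]
print(choose_endpoints(model, nodes=16))
```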
-
Publication Number: US20190042242A1
Publication Date: 2019-02-07
Application Number: US15940774
Application Date: 2018-03-29
Applicant: Intel Corporation
Inventor: Dipankar DAS , Naveen K. MELLEMPUDI , Mrinmay DUTTA , Arun KUMAR , Dheevatsa MUDIGERE , Abhisek KUNDU
Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
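The behavior of the asymmetric FMA can be emulated in a few lines. The sketch below is a plain-Python model under assumed widths (8-bit first-source elements, 2-bit second-source elements, a 32-bit lane); the function name and packing are illustrative, not the instruction's actual encoding.

```python
# Emulation of the asymmetric FMA described above: multiply narrow
# elements of src2 by matching elements of src1 and accumulate into
# the destination. Widths here are assumptions for the sketch.
def asymmetric_fma(dest: int, src1: list[int], src2: list[int],
                   lane_bits: int = 32, w1: int = 8, w2: int = 2) -> int:
    """Accumulate as many src2 elements as fit in one lane (lane_bits // w2)."""
    n = lane_bits // w2           # elements of the second source per lane
    acc = dest                    # previous contents of the destination
    for a, b in zip(src1[:n], src2[:n]):
        assert 0 <= a < (1 << w1) and 0 <= b < (1 << w2)  # width checks
        acc += a * b              # product of paired elements
    return acc

print(asymmetric_fma(0, [17, 200, 3], [1, 2, 3]))  # 17*1 + 200*2 + 3*3 = 426
```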
-
Publication Number: US20220115362A1
Publication Date: 2022-04-14
Application Number: US17067069
Application Date: 2020-10-09
Applicant: Intel Corporation
Inventor: Debendra MALLIK , Ravindranath MAHAJAN , Dipankar DAS
Abstract: A processor package module comprises a processor-memory stack including one or more compute die stacked and interconnected with a memory stack on a substrate. One or more photonic die is on the substrate to transmit and receive optical I/O, the one or more photonic die connected to the processor-memory stack and connected to external components through a fiber array. The substrate is mounted into a socket housing, such as a land grid array (LGA) socket. An array of processor package modules are interconnected on a processor substrate via fiber arrays and optical connectors to form a processor chip complex.
-
Publication Number: US20190303743A1
Publication Date: 2019-10-03
Application Number: US16317497
Application Date: 2016-09-27
Applicant: Intel Corporation
Inventor: Swagath VENKATARAMANI , Dipankar DAS , Ashish RANJAN , Subarno BANERJEE , Sasikanth AVANCHA , Ashok JAGANNATHAN , Ajaya V. DURG , Dheemanth NAGARAJ , Bharat KAUL , Anand RAGHUNATHAN
Abstract: Methods and apparatuses relating to processing neural networks are described. In one embodiment, an apparatus to process a neural network includes a plurality of fully connected layer chips coupled by an interconnect; a plurality of convolutional layer chips each coupled by an interconnect to a respective fully connected layer chip of the plurality of fully connected layer chips and each of the plurality of fully connected layer chips and the plurality of convolutional layer chips including an interconnect to couple each of a forward propagation compute intensive tile, a back propagation compute intensive tile, and a weight gradient compute intensive tile of a column of compute intensive tiles between a first memory intensive tile and a second memory intensive tile.
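To make the topology concrete, the sketch below models one column of compute-intensive tiles coupled between two memory-intensive tiles as a plain data structure. The `Column` class and all tile names are hypothetical, and this models connectivity only, not the hardware itself.

```python
# Schematic model of the tile topology in the abstract: a column of
# forward, backward, and weight-gradient compute tiles, each coupled
# between two memory-intensive tiles. Names are illustrative only.
from dataclasses import dataclass

@dataclass
class Column:
    """One column of compute-intensive tiles between two memory-intensive tiles."""
    mem_in: str = "mem_tile_0"
    compute: tuple = ("fwd_tile", "bwd_tile", "wgrad_tile")
    mem_out: str = "mem_tile_1"

    def interconnect(self) -> list[tuple[str, str]]:
        """Links coupling each compute tile to both bounding memory tiles."""
        links = []
        for tile in self.compute:
            links.append((self.mem_in, tile))
            links.append((tile, self.mem_out))
        return links

print(Column().interconnect())
```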
-
Publication Number: US20240160931A1
Publication Date: 2024-05-16
Application Number: US18532795
Application Date: 2023-12-07
Applicant: Intel Corporation
Inventor: Abhisek KUNDU , Naveen MELLEMPUDI , Dheevatsa MUDIGERE , Dipankar DAS
CPC classification number: G06N3/08 , G06F9/46 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06N5/04 , G06T15/005 , G06T17/20
Abstract: One embodiment provides for a computer-readable medium storing instructions that cause one or more processors to perform operations comprising determining a per-layer scale factor to apply to tensor data associated with layers of a neural network model and converting the tensor data to converted tensor data. The tensor data may be converted from a floating point datatype to a second datatype that is an 8-bit datatype. The instructions further cause the one or more processors to generate an output tensor based on the converted tensor data and the per-layer scale factor.
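A minimal sketch of the per-layer scale-factor conversion described above, assuming symmetric max-abs scaling from float32 to int8; the scale-factor choice is an assumption for illustration, not necessarily the patented method.

```python
# Per-layer symmetric quantization to an 8-bit datatype, in the spirit
# of the abstract. The max-abs scale rule is an assumption.
import numpy as np

def quantize_layer(tensor: np.ndarray) -> tuple[np.ndarray, float]:
    """Return int8 data plus the per-layer scale factor that maps it back."""
    scale = float(np.max(np.abs(tensor))) / 127.0 or 1.0  # avoid zero scale
    q = np.clip(np.round(tensor / scale), -127, 127).astype(np.int8)
    return q, scale

def output_tensor(q: np.ndarray, scale: float) -> np.ndarray:
    """Generate an output tensor from converted data and its scale factor."""
    return q.astype(np.float32) * scale

x = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_layer(x)
print(np.max(np.abs(x - output_tensor(q, s))))  # error bounded by ~s/2
```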
-
Publication Number: US20240118892A1
Publication Date: 2024-04-11
Application Number: US18543357
Application Date: 2023-12-18
Applicant: Intel Corporation
Inventor: Swagath VENKATARAMANI , Dipankar DAS , Ashish RANJAN , Subarno BANERJEE , Sasikanth AVANCHA , Ashok JAGANNATHAN , Ajaya V. DURG , Dheemanth NAGARAJ , Bharat KAUL , Anand RAGHUNATHAN
CPC classification number: G06F9/30145 , G06F9/3004 , G06F9/30043 , G06F9/30087 , G06F9/3834 , G06F9/52 , G06N3/04 , G06N3/063 , G06N3/084
Abstract: Methods and apparatuses relating to processing neural networks are described. In one embodiment, an apparatus to process a neural network includes a plurality of fully connected layer chips coupled by an interconnect; a plurality of convolutional layer chips each coupled by an interconnect to a respective fully connected layer chip of the plurality of fully connected layer chips and each of the plurality of fully connected layer chips and the plurality of convolutional layer chips including an interconnect to couple each of a forward propagation compute intensive tile, a back propagation compute intensive tile, and a weight gradient compute intensive tile of a column of compute intensive tiles between a first memory intensive tile and a second memory intensive tile.
-
Publication Number: US20230087364A1
Publication Date: 2023-03-23
Application Number: US18060414
Application Date: 2022-11-30
Applicant: Intel Corporation
Inventor: Abhisek KUNDU , Naveen MELLEMPUDI , Dheevatsa MUDIGERE , Dipankar DAS
Abstract: One embodiment provides for a computer-readable medium storing instructions that cause one or more processors to perform operations comprising determining a per-layer scale factor to apply to tensor data associated with layers of a neural network model and converting the tensor data to converted tensor data. The tensor data may be converted from a floating point datatype to a second datatype that is an 8-bit datatype. The instructions further cause the one or more processors to generate an output tensor based on the converted tensor data and the per-layer scale factor.
-
Publication Number: US20210382719A1
Publication Date: 2021-12-09
Application Number: US17410934
Application Date: 2021-08-24
Applicant: Intel Corporation
Inventor: Swagath VENKATARAMANI , Dipankar DAS , Sasikanth AVANCHA , Ashish RANJAN , Subarno BANERJEE , Bharat KAUL , Anand RAGHUNATHAN
Abstract: Systems, methods, and apparatuses relating to access synchronization in a shared memory are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction, and an execution unit to execute the decoded instruction to: receive a first input operand of a memory address to be tracked and a second input operand of an allowed sequence of memory accesses to the memory address, and cause a block of a memory access that violates the allowed sequence of memory accesses to the memory address. In one embodiment, a circuit separate from the execution unit compares a memory address for a memory access request to one or more memory addresses in a tracking table, and blocks a memory access for the memory access request when a type of access violates a corresponding allowed sequence of memory accesses to the memory address for the memory access request.
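A software sketch of the tracking-table behavior described above: each tracked address carries an allowed sequence of accesses, and an access that deviates from the sequence is blocked. The `AccessTracker` class and the `"W"`/`"R"` encoding are illustrative assumptions, not the circuit's design.

```python
# Tracking table that blocks memory accesses violating an allowed
# sequence for a tracked address. Purely illustrative.
class AccessTracker:
    def __init__(self) -> None:
        self.table: dict[int, list[str]] = {}  # addr -> remaining allowed accesses

    def track(self, addr: int, allowed: list[str]) -> None:
        """Register an address with its allowed sequence, e.g. ['W', 'R', 'R']."""
        self.table[addr] = list(allowed)

    def access(self, addr: int, kind: str) -> bool:
        """Return True if the access proceeds, False if it is blocked."""
        seq = self.table.get(addr)
        if seq is None:
            return True            # untracked addresses are unrestricted
        if not seq or seq[0] != kind:
            return False           # violates the allowed sequence: block
        seq.pop(0)                 # consume one step of the sequence
        return True

t = AccessTracker()
t.track(0x1000, ["W", "R"])
print(t.access(0x1000, "R"))  # False: a read before the write is blocked
print(t.access(0x1000, "W"))  # True
```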
-
Publication Number: US20210072955A1
Publication Date: 2021-03-11
Application Number: US16562979
Application Date: 2019-09-06
Applicant: Intel Corporation
Inventor: Naveen MELLEMPUDI , Dipankar DAS , Chunhui MEI , Kristopher WONG , Dhiraj D. KALAMKAR , Hong H. JIANG , Subramaniam MAIYURAN , Varghese GEORGE
Abstract: An apparatus to facilitate a computer number format conversion is disclosed. The apparatus comprises a control unit to receive data format information indicating a first precision data format in which input data is to be received, and converter hardware to receive the input data and convert it from the first precision data format to a second precision data format based on the data format information.
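As one concrete instance of such a conversion, the sketch below truncates float32 to bfloat16 and widens it back; this format pair is an assumption chosen for illustration, and the converter hardware described above is not limited to it.

```python
# One example precision conversion: float32 -> bfloat16 by truncating
# to the top 16 bits, and the reverse widening. Format pair is assumed.
import struct

def f32_to_bf16_bits(x: float) -> int:
    """Truncate an IEEE-754 float32 to its top 16 bits (bfloat16)."""
    bits32 = struct.unpack("<I", struct.pack("<f", x))[0]
    return bits32 >> 16

def bf16_bits_to_f32(bits16: int) -> float:
    """Widen bfloat16 bits back to float32 by zero-filling the low mantissa."""
    return struct.unpack("<f", struct.pack("<I", bits16 << 16))[0]

x = 3.14159
print(bf16_bits_to_f32(f32_to_bf16_bits(x)))  # ~3.140625, reduced precision
```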
-
Publication Number: US20240126544A1
Publication Date: 2024-04-18
Application Number: US18399578
Application Date: 2023-12-28
Applicant: Intel Corporation
Inventor: Dipankar DAS , Naveen K. MELLEMPUDI , Mrinmay DUTTA , Arun KUMAR , Dheevatsa MUDIGERE , Abhisek KUNDU
CPC classification number: G06F9/30014 , G06F7/483 , G06F7/5443 , G06F9/30036 , G06F9/30145 , G06F9/3802 , G06F9/382 , G06F9/384 , G06F9/3887 , G06N3/063 , G06F9/30065 , G06F2207/382
Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.