Patent search ap:("Intel Corporation") AND inv:"DIPANKAR DAS"

1.

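Invention application
SCALING HALF-PRECISION FLOATING POINT TENSORS FOR TRAINING DEEP NEURAL NETWORKS (Under examination, published)

Publication (announcement) number: US20190354846A1

Publication (announcement) date: 2019-11-21

Application number: US16526376

Application date: 2019-07-30

Applicant: Intel Corporation

Inventor: NAVEEN MELLEMPUDI , DIPANKAR DAS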

IPC: G06N3/063 , G06T1/20 , G06F7/544 , G06F7/487 , G06N3/04 , G06N3/08 , G06F5/01

Abstract: A graphics processor is described that includes a single instruction, multiple thread (SIMT) architecture including hardware multithreading. The multiprocessor can execute parallel threads of instructions associated with a command stream, where the multiprocessor includes a set of functional units to execute at least one of the parallel threads of the instructions. The set of functional units can include a mixed precision tensor processor to perform tensor computations to generate loss data. The loss data is stored as a floating-point data type and scaled by a scaling factor to enable a data distribution of a gradient tensor generated based on the loss data to be represented by a 16-bit floating point data type.
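The scaling idea in this abstract can be illustrated with a small sketch. It is not the patented hardware method: it only shows, in software, why multiplying by a scaling factor helps small gradient values survive storage in a 16-bit floating-point type. Because backpropagation is linear in the loss, scaling the loss by S scales every gradient by S, so the sketch scales the gradients directly. The helper names (`to_fp16`, `scaled_fp16_gradients`), the gradient values, and the scale of 1024 are all assumptions chosen for illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def scaled_fp16_gradients(grads, scale):
    """Scale gradients, store them as fp16, then unscale in higher precision."""
    stored = [to_fp16(g * scale) for g in grads]  # what a 16-bit buffer would hold
    return [s / scale for s in stored]            # unscale using Python floats


# Hypothetical gradient values; the first two lie below the fp16 subnormal range.
grads = [1e-8, 3e-9, 2e-5]

unscaled = [to_fp16(g) for g in grads]                # tiny entries flush to zero
rescued = scaled_fp16_gradients(grads, scale=1024.0)  # tiny entries stay nonzero

print(unscaled)
print(rescued)
```

In this sketch the first two entries of `unscaled` are zero, while the corresponding entries of `rescued` stay within about one percent of the original values.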

2.

Invention application
DYNAMIC PRECISION MANAGEMENT FOR INTEGER DEEP LEARNING PRIMITIVES (Granted)

Publication (announcement) number: US20240412318A1

Publication (announcement) date: 2024-12-12

Application number: US18751799

Application date: 2024-06-24

Applicant: Intel Corporation

Inventor: Naveen K. MELLEMPUDI , DHEEVATSA MUDIGERE , DIPANKAR DAS , SRINIVAS SRIDHARAN

IPC: G06T1/20 , G06F5/01 , G06F7/501 , G06F7/523 , G06F7/544 , G06F17/15 , G06F17/16 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084

Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a hardware processing unit having a dynamic precision fixed-point unit that is configurable to convert elements of a floating-point tensor, thereby converting the floating-point tensor into a fixed-point tensor.
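As a rough software analogy for the float-to-fixed-point conversion described here, the sketch below quantizes a list of floats to signed integers that share one power-of-two exponent, chosen from the largest magnitude in the tensor. The shared-exponent scheme, the 8-bit default, and the function names are assumptions made for illustration; the abstract does not specify how the hardware unit chooses its precision.

```python
import math


def to_dynamic_fixed_point(values, bits=8):
    """Quantize floats to signed `bits`-bit integers with one shared exponent."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0 for _ in values], 0
    # Smallest power-of-two step for which max_abs / step still fits in qmax.
    exponent = math.ceil(math.log2(max_abs / qmax))
    step = 2.0 ** exponent
    ints = [max(-qmax, min(qmax, round(v / step))) for v in values]
    return ints, exponent


def from_fixed_point(ints, exponent):
    """Recover approximate float values from integers and the shared exponent."""
    return [i * 2.0 ** exponent for i in ints]


ints, exponent = to_dynamic_fixed_point([0.5, -1.25, 3.0, 0.01])
print(ints, exponent)                    # [16, -40, 96, 0] -5
print(from_fixed_point(ints, exponent))  # [0.5, -1.25, 3.0, 0.0]
```

The value 0.01 rounds to zero because it is smaller than half the step size (2^-5), which shows the precision trade-off of deriving the exponent from the largest element.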

3.

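Invention publication
SCALING HALF-PRECISION FLOATING POINT TENSORS FOR TRAINING DEEP NEURAL NETWORKS (Under examination, published)

Publication (announcement) number: US20230141038A1

Publication (announcement) date: 2023-05-11

Application number: US17960947

Application date: 2022-10-06

Applicant: Intel Corporation

Inventor: NAVEEN MELLEMPUDI , DIPANKAR DAS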

IPC: G06N3/063 , G06F7/487 , G06F7/544 , G06T1/20 , G06F5/01 , G06N3/084 , G06N3/044 , G06N3/045

CPC classification number: G06N3/063 , G06F7/487 , G06F7/5443 , G06T1/20 , G06F5/012 , G06N3/084 , G06N3/044 , G06N3/045

Abstract: A graphics processor is described that includes a single instruction, multiple thread (SIMT) architecture including hardware multithreading. The multiprocessor can execute parallel threads of instructions associated with a command stream, where the multiprocessor includes a set of functional units to execute at least one of the parallel threads of the instructions. The set of functional units can include a mixed precision tensor processor to perform tensor computations to generate loss data. The loss data is stored as a first floating-point data type and scaled by a scaling factor to enable a data distribution of a gradient tensor generated based on the loss data to be represented by a second floating point data type.

4.

Invention application
SCALING HALF-PRECISION FLOATING POINT TENSORS FOR TRAINING DEEP NEURAL NETWORKS (Granted)

Publication (announcement) number: US20220269931A1

Publication (announcement) date: 2022-08-25

Application number: US17742138

Application date: 2022-05-11

Applicant: Intel Corporation

Inventor: NAVEEN MELLEMPUDI , DIPANKAR DAS

IPC: G06N3/063 , G06F7/487 , G06F7/544 , G06T1/20 , G06N3/04 , G06F5/01 , G06N3/08

Abstract: A graphics processor is described that includes a single instruction, multiple thread (SIMT) architecture including hardware multithreading. The multiprocessor can execute parallel threads of instructions associated with a command stream, where the multiprocessor includes a set of functional units to execute at least one of the parallel threads of the instructions. The set of functional units can include a mixed precision tensor processor to perform tensor computations. The functional units can also include circuitry to analyze statistics for output values of the tensor computations, determine a target format to convert the output values, the target format determined based on the statistics for the output values and a precision associated with a second layer of the neural network, and convert the output values to the target format.
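The statistics-driven format selection in this abstract can be sketched as follows. This is only a toy model: the statistics used (largest and smallest nonzero magnitudes), the two candidate formats, and the function name `choose_target_format` are assumptions made for illustration and are not taken from the patent.

```python
FP16_MAX = 65504.0            # largest finite IEEE 754 binary16 value
FP16_MIN_NORMAL = 2.0 ** -14  # smallest normal binary16 value


def choose_target_format(outputs, next_layer_bits):
    """Pick a storage format from output statistics and the next layer's precision."""
    magnitudes = [abs(v) for v in outputs if v != 0.0]
    if next_layer_bits > 16:
        return "fp32"  # the next layer expects more than 16 bits
    if not magnitudes:
        return "fp16"  # an all-zero tensor fits any format
    fits_fp16 = max(magnitudes) <= FP16_MAX and min(magnitudes) >= FP16_MIN_NORMAL
    return "fp16" if fits_fp16 else "fp32"


print(choose_target_format([0.5, -3.0, 100.0], next_layer_bits=16))  # fp16
print(choose_target_format([1e-7, 2.0], next_layer_bits=16))         # fp32
```

A fuller version would presumably combine this decision with a scale factor so that out-of-range tensors can still be stored in 16 bits, but that goes beyond what the abstract states.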

5.

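Invention application
SCALING HALF-PRECISION FLOATING POINT TENSORS FOR TRAINING DEEP NEURAL NETWORKS (Under examination, published)

Publication (announcement) number: US20180322382A1

Publication (announcement) date: 2018-11-08

Application number: US15869582

Application date: 2018-01-12

Applicant: Intel Corporation

Inventor: NAVEEN MELLEMPUDI , DIPANKAR DAS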

IPC: G06N3/063 , G06T1/20 , G06F7/544 , G06F7/487

CPC classification number: G06N3/063 , G06F7/487 , G06F7/5443 , G06T1/20

Abstract: One embodiment provides for a machine-learning accelerator device comprising a multiprocessor to execute parallel threads of an instruction stream, the multiprocessor including a compute unit, the compute unit including a set of functional units, each functional unit to execute at least one of the parallel threads of the instruction stream. The compute unit includes compute logic configured to execute a single instruction to scale an input tensor associated with a layer of a neural network according to a scale factor, the input tensor stored in a floating-point data type, the compute logic to scale the input tensor to enable a data distribution of data of the input tensor to be represented by a 16-bit floating point data type.

6.

Invention application
ABSTRACTION LAYERS FOR SCALABLE DISTRIBUTED MACHINE LEARNING (Under examination, published)

Publication (announcement) number: US20180293493A1

Publication (announcement) date: 2018-10-11

Application number: US15482953

Application date: 2017-04-10

Applicant: Intel Corporation

Inventor: Dhiraj D. Kalamkar , KARTHIKEYAN VAIDYANATHAN , SRINIVAS SRIDHARAN , DIPANKAR DAS

IPC: G06N3/08 , G06T1/60 , G06T1/20

Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.

7.

Invention application
ABSTRACTION LIBRARY TO ENABLE SCALABLE DISTRIBUTED MACHINE LEARNING (Under examination, published)

Publication (announcement) number: US20180293492A1

Publication (announcement) date: 2018-10-11

Application number: US15482925

Application date: 2017-04-10

Applicant: Intel Corporation

Inventor: Dhiraj D. Kalamkar , KARTHIKEYAN VAIDYANATHAN , SRINIVAS SRIDHARAN , DIPANKAR DAS

IPC: G06N3/08

Abstract: One embodiment provides for a non-transitory machine readable medium storing instructions which, when executed by one or more processors, cause the one or more processors to perform operations comprising providing an interface to define a neural network using machine-learning domain specific terminology, wherein the interface enables selection of a neural network topology and abstracts low-level communication details of distributed training of the neural network.

8.

Invention application
SCALING HALF-PRECISION FLOATING POINT TENSORS FOR TRAINING DEEP NEURAL NETWORKS (Granted)

Publication (announcement) number: US20250061318A1

Publication (announcement) date: 2025-02-20

Application number: US18818154

Application date: 2024-08-28

Applicant: Intel Corporation

Inventor: NAVEEN MELLEMPUDI , DIPANKAR DAS

IPC: G06N3/063 , G06F5/01 , G06F7/487 , G06F7/544 , G06N3/044 , G06N3/045 , G06N3/084 , G06T1/20

Abstract: One embodiment provides for a machine-learning accelerator device comprising a multiprocessor to execute parallel threads of an instruction stream, the multiprocessor including a compute unit, the compute unit including a set of functional units, each functional unit to execute at least one of the parallel threads of the instruction stream. The compute unit includes compute logic configured to execute a single instruction to scale an input tensor associated with a layer of a neural network according to a scale factor, the input tensor stored in a floating-point data type, the compute logic to scale the input tensor to enable a data distribution of data of the input tensor to be represented by a 16-bit floating point data type.

9.

Invention publication
DYNAMIC PRECISION MANAGEMENT FOR INTEGER DEEP LEARNING PRIMITIVES (Under examination, published)

Publication (announcement) number: US20230351542A1

Publication (announcement) date: 2023-11-02

Application number: US18306033

Application date: 2023-04-24

Applicant: Intel Corporation

Inventor: Naveen K. MELLEMPUDI , DHEEVATSA MUDIGERE , DIPANKAR DAS , SRINIVAS SRIDHARAN

IPC: G06T1/20 , G06F5/01 , G06F7/501 , G06F7/523 , G06F7/544 , G06F17/15 , G06F17/16 , G06N3/063 , G06N3/084 , G06N3/044 , G06N3/045

CPC classification number: G06T1/20 , G06F5/01 , G06F7/501 , G06F7/523 , G06F7/5443 , G06F17/153 , G06F17/16 , G06N3/063 , G06N3/084 , G06N3/044 , G06N3/045 , G06F2207/382 , G06F2207/4824

Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a hardware processing unit having a dynamic precision fixed-point unit that is configurable to convert elements of a floating-point tensor, thereby converting the floating-point tensor into a fixed-point tensor.

10.

Invention application
ABSTRACTION LIBRARY TO ENABLE SCALABLE DISTRIBUTED MACHINE LEARNING (Granted)

Publication (announcement) number: US20210350212A1

Publication (announcement) date: 2021-11-11

Application number: US17328028

Application date: 2021-05-24

Applicant: Intel Corporation

Inventor: DHIRAJ D. KALAMKAR , KARTHIKEYAN VAIDYANATHAN , SRINIVAS SRIDHARAN , DIPANKAR DAS

IPC: G06N3/04 , G06N3/063 , G06T1/20 , G06N3/08

Abstract: One embodiment provides for a non-transitory machine readable medium storing instructions which, when executed by one or more processors, cause the one or more processors to perform operations comprising providing an interface to define a neural network using machine-learning domain specific terminology, wherein the interface enables selection of a neural network topology and abstracts low-level communication details of distributed training of the neural network.
