-
Publication number: US20220140841A1
Publication date: 2022-05-05
Application number: US17085196
Filing date: 2020-10-30
Inventors: Yasushi Negishi, Tung D. Le, Haruki Imai, Kiyokuni Kawachiya
Abstract: A method is presented for compressing data of a Rectified Linear Unit (ReLU) function on a graphics processing unit (GPU) employed in a learning process of a deep neural network. The method includes converting an initial data structure including nonzero data and zero data into a compressed data structure including only the nonzero data of the initial data structure as compressed data by generating a nonzero data bitmap region, generating a nonzero data number table region by employing a parallel reduction algorithm, calculating a nonzero data array index per block region of all blocks from the nonzero data number table region by employing a parallel prefix sum scan algorithm, allocating a buffer for the compressed data, and copying the nonzero data from the initial data structure into a nonzero data array region in a compressed data format in parallel.
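The abstract's four-step pipeline can be made concrete with a small sketch. The following NumPy emulation is illustrative only: names such as `BLOCK_SIZE` and `compress_relu` are assumptions, and the steps that would run as parallel GPU kernels (block-wise reduction, prefix-sum scan, parallel copy) are emulated sequentially here.

```python
# Hypothetical NumPy emulation of the four-step compression described in the
# abstract; on a GPU each step would run as a parallel kernel.
import numpy as np

BLOCK_SIZE = 4  # elements handled per GPU thread block (assumed value)

def compress_relu(data: np.ndarray):
    # Step 1: nonzero data bitmap region (one flag per element).
    bitmap = (data != 0).astype(np.uint8)

    # Step 2: nonzero data number table (per-block nonzero counts); on a GPU
    # this would be a parallel reduction within each block.
    counts = np.add.reduceat(bitmap, np.arange(0, data.size, BLOCK_SIZE))

    # Step 3: nonzero data array index per block, via an exclusive prefix sum
    # (a parallel prefix sum scan on the GPU).
    offsets = np.concatenate(([0], np.cumsum(counts)[:-1]))

    # Step 4: allocate the compressed buffer and copy the nonzero values;
    # each block writes its elements starting at offsets[b], in parallel.
    compressed = np.empty(int(bitmap.sum()), dtype=data.dtype)
    compressed[:] = data[bitmap.astype(bool)]
    return bitmap, counts, offsets, compressed

x = np.array([0.0, 1.5, 0.0, 2.0, 0.0, 0.0, 3.0, 0.5])
bitmap, counts, offsets, compressed = compress_relu(x)
# counts -> [2, 2]; offsets -> [0, 2]; compressed -> [1.5, 2.0, 3.0, 0.5]
```

The prefix-sum offsets give every block a disjoint slice of the output buffer to write into, which is what allows the final copy to proceed fully in parallel on the GPU.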
-
Publication number: US20210174190A1
Publication date: 2021-06-10
Application number: US16704240
Filing date: 2019-12-05
Inventors: Gradus Janssen, Vladimir Zolotov, Tung D. Le
Abstract: A neural network data flow graph having a set of nodes and a set of edges is processed. An insertion point is determined for a memory reduction or memory restoration operation. The determination is based on: computing tensor timing slacks (TTS) for a set of input tensors; compiling a candidate list (SI) of input tensors, from the set of input tensors, using input tensors whose corresponding TTS values are larger than a threshold value (thTTS); filtering the SI to retain input tensors whose size meets a threshold value (thS); and determining an insertion point for the operation using the SI based on the filtering. A new data flow graph is generated, or an existing one is modified, using this process.
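The candidate-selection logic in this abstract reduces to two threshold filters over tensor metadata. The sketch below is a hedged reading of it: `Tensor`, `tts`, `thTTS`, and `thS` are illustrative stand-ins, and the abstract does not specify here how the slack values themselves are computed.

```python
# Hedged sketch of the candidate selection described in the abstract.
from dataclasses import dataclass

@dataclass
class Tensor:
    name: str
    size_bytes: int
    tts: float  # tensor timing slack: spare time before the tensor is needed

def select_insertion_candidates(tensors, thTTS, thS):
    # Keep tensors whose timing slack exceeds thTTS: swapping them out and
    # back in can hide behind computation that runs in the meantime.
    SI = [t for t in tensors if t.tts > thTTS]
    # Retain only tensors large enough (>= thS) for the memory saving to
    # justify the added transfer.
    SI = [t for t in SI if t.size_bytes >= thS]
    # A memory-reduction op (e.g. swap-out) would be inserted after the
    # producer of each surviving tensor, and a restoration op before its
    # consumer.
    return SI

tensors = [Tensor("conv1_out", 64 << 20, tts=12.0),
           Tensor("fc_out", 1 << 20, tts=30.0),
           Tensor("conv2_out", 128 << 20, tts=0.5)]
print([t.name for t in select_insertion_candidates(tensors, thTTS=5.0, thS=32 << 20)])
# -> ['conv1_out']
```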
-
Publication number: US20210019613A1
Publication date: 2021-01-21
Application number: US16514528
Filing date: 2019-07-17
Inventors: Tung D. Le
IPC classification: G06N3/08
Abstract: Methods and systems for generating a program include parameterizing a high-order function to replace data with primitive functions. A neural programmer interpreter (NPI) model is trained for the high-order function. Respective neural network models are trained for each primitive function. The neural network models generate data for the NPI model when called.
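A rough intuition for the parameterization, as a toy sketch: a high-order function such as `map` takes primitive functions as arguments rather than baking in data. Here ordinary Python callables stand in for the trained NPI model and the per-primitive neural networks; all names are illustrative, not the patent's.

```python
# Illustrative sketch only: in the patented scheme, npi_map would be realized
# by a trained NPI model and each primitive by a neural network model that
# generates data for the NPI model when called.
def npi_map(primitive, xs):
    # High-order "map", parameterized by a primitive function instead of data.
    return [primitive(x) for x in xs]

# Stand-ins for per-primitive neural network models (hypothetical).
double = lambda x: 2 * x
square = lambda x: x * x

print(npi_map(double, [1, 2, 3]))  # -> [2, 4, 6]
print(npi_map(square, [1, 2, 3]))  # -> [1, 4, 9]
```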
-
Publication number: US11521062B2
Publication date: 2022-12-06
Application number: US16704240
Filing date: 2019-12-05
Inventors: Gradus Janssen, Vladimir Zolotov, Tung D. Le
Abstract: A neural network data flow graph having a set of nodes and a set of edges is processed. An insertion point is determined for a memory reduction or memory restoration operation. The determination is based on: computing tensor timing slacks (TTS) for a set of input tensors; compiling a candidate list (SI) of input tensors, from the set of input tensors, using input tensors whose corresponding TTS values are larger than a threshold value (thTTS); filtering the SI to retain input tensors whose size meets a threshold value (thS); and determining an insertion point for the operation using the SI based on the filtering. A new data flow graph is generated, or an existing one is modified, using this process.
-
Publication number: US11362670B2
Publication date: 2022-06-14
Application number: US17085196
Filing date: 2020-10-30
Inventors: Yasushi Negishi, Tung D. Le, Haruki Imai, Kiyokuni Kawachiya
Abstract: A method is presented for compressing data of a Rectified Linear Unit (ReLU) function on a graphics processing unit (GPU) employed in a learning process of a deep neural network. The method includes converting an initial data structure including nonzero data and zero data into a compressed data structure including only the nonzero data of the initial data structure as compressed data by generating a nonzero data bitmap region, generating a nonzero data number table region by employing a parallel reduction algorithm, calculating a nonzero data array index per block region of all blocks from the nonzero data number table region by employing a parallel prefix sum scan algorithm, allocating a buffer for the compressed data, and copying the nonzero data from the initial data structure into a nonzero data array region in a compressed data format in parallel.
-
Publication number: US10558914B2
Publication date: 2020-02-11
Application number: US16384985
Filing date: 2019-04-16
Inventors: Taro Sekiyama, Kiyokuni Kawachiya, Tung D. Le, Yasushi Negishi
Abstract: A generated algorithm used by a neural network is captured during execution of an iteration of the neural network. A candidate algorithm is identified based on the generated algorithm. A determination is made that the candidate algorithm utilizes less memory than the generated algorithm. Based on the determination, the neural network is updated by replacing the generated algorithm with the candidate algorithm.
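A minimal sketch of the replacement policy, assuming hypothetical helpers: `profile_memory` and the candidate list are stand-ins that the abstract leaves to the implementation.

```python
# Hedged sketch: replace the captured algorithm only when a candidate is
# determined to use less memory. profile_memory is an assumed helper.
def maybe_replace(generated_algo, candidates, profile_memory):
    best = generated_algo
    for cand in candidates:
        if profile_memory(cand) < profile_memory(best):
            best = cand
    return best

mem_mib = {"fft_conv": 512, "winograd": 384, "direct": 128}  # made-up figures
print(maybe_replace("fft_conv", ["winograd", "direct"], mem_mib.get))
# -> 'direct': the network would be updated to use the cheaper algorithm
```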
-
Publication number: US11836613B2
Publication date: 2023-12-05
Application number: US16514528
Filing date: 2019-07-17
Inventors: Tung D. Le
CPC classification: G06N3/08
Abstract: Methods and systems for generating a program include parameterizing a high-order function to replace data with primitive functions. A neural programmer interpreter (NPI) model is trained for the high-order function. Respective neural network models are trained for each primitive function. The neural network models generate data for the NPI model when called.
-
Publication number: US20220138580A1
Publication date: 2022-05-05
Application number: US17089245
Filing date: 2020-11-04
Inventors: Haruki Imai, Tung D. Le, Yasushi Negishi, Kiyokuni Kawachiya
IPC classification: G06N3/08
Abstract: Methods and systems for training a neural network include identifying units within a neural network, including a first unit for memory swapping and a second unit for re-computation, to balance memory efficiency with computational efficiency. Each unit includes at least one layer of the neural network, and each unit has a first layer that is a checkpoint operation. During a feed-forward training stage, feature maps output by the at least one layer of the first unit are stored in a first memory and then swapped from the first memory to a second memory. During a backpropagation stage, the feature maps for the first unit are swapped from the second memory back to the first memory, while feature maps for the second unit are re-computed.
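A minimal sketch of the swap/recompute split, assuming a `Unit` abstraction whose first layer serves as the checkpoint; the class, dictionary stores, and hook names are illustrative, not the patent's implementation.

```python
# Hedged sketch: "swap" units move feature maps to host memory after forward
# and restore them in backward; "recompute" units re-run forward from their
# checkpoint instead of storing feature maps.
class Unit:
    def __init__(self, layers, policy):
        self.layers = layers          # at least one layer per unit
        self.policy = policy          # "swap" or "recompute"
        self.checkpoint = layers[0]   # first layer acts as the checkpoint op

def forward(units, x, gpu_store, host_store):
    for u in units:
        for layer in u.layers:
            x = layer(x)
        if u.policy == "swap":
            gpu_store[u] = x                     # stored in first (GPU) memory
            host_store[u] = gpu_store.pop(u)     # swapped to second (host) memory
        # "recompute" units keep nothing beyond the checkpoint input.
    return x

def backward(units, gpu_store, host_store, recompute):
    for u in reversed(units):
        if u.policy == "swap":
            gpu_store[u] = host_store.pop(u)     # swap back to GPU memory
        else:
            gpu_store[u] = recompute(u)          # re-run forward from checkpoint
        # ... gradient computation for u's layers would follow here ...

units = [Unit([lambda x: x + 1], "swap"), Unit([lambda x: x * 2], "recompute")]
gpu, host = {}, {}
y = forward(units, 3, gpu, host)                 # y == 8; unit 0's maps on host
backward(units, gpu, host, recompute=lambda u: None)
```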
-
Publication number: US11164079B2
Publication date: 2021-11-02
Application number: US15843244
Filing date: 2017-12-15
Inventors: Tung D. Le, Haruki Imai, Taro Sekiyama, Yasushi Negishi
Abstract: A computer-implemented method, computer program product, and computer processing system are provided for accelerating neural network data-parallel training on multiple graphics processing units (GPUs) using at least one central processing unit (CPU). The method includes forming a set of chunks, each of which includes a respective group of neural network layers other than the last layer. The method further includes performing one or more chunk-wise synchronization operations during the backward phase of the neural network data-parallel training, by each of the multiple GPUs and the at least one CPU.
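The chunking and synchronization order can be sketched as follows; `make_chunks`, `backprop_layer`, and `cpu_all_reduce` are assumed names for illustration. The point of chunk-wise synchronization is overlap: gradients for later chunks are reduced across GPUs while earlier chunks are still being backpropagated.

```python
# Hedged sketch of chunk-wise gradient synchronization during the backward
# phase of data-parallel training.
def make_chunks(layers, chunk_size):
    # Group all layers except the last one into chunks, per the abstract.
    body = layers[:-1]
    return [body[i:i + chunk_size] for i in range(0, len(body), chunk_size)]

def backward_with_chunk_sync(layers, chunk_size, backprop_layer, cpu_all_reduce):
    chunks = make_chunks(layers, chunk_size)
    # Backward proceeds from the last chunk toward the first; as soon as a
    # chunk's gradients are ready they are handed to the CPU for reduction,
    # while the GPU keeps backpropagating earlier chunks.
    for chunk in reversed(chunks):
        grads = [backprop_layer(layer) for layer in reversed(chunk)]
        cpu_all_reduce(grads)  # chunk-wise synchronization across GPUs

layers = ["L1", "L2", "L3", "L4", "out"]
backward_with_chunk_sync(layers, chunk_size=2,
                         backprop_layer=lambda l: f"grad({l})",
                         cpu_all_reduce=print)
# Reduces the grads for chunk ["L3", "L4"] first, then for ["L1", "L2"].
```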
-
Publication number: US11106970B2
Publication date: 2021-08-31
Application number: US15815771
Filing date: 2017-11-17
Inventors: Tung D. Le, Taro Sekiyama
Abstract: In an approach to localizing tree-based convolutional neural networks, a method includes creating a first tree-based convolution layer (TBCL) corresponding to a tree, where the tree includes a first plurality of nodes and a node that has been indicated to be a first pivotal node. The first TBCL includes a second plurality of nodes and a second pivotal node having a feature vector based on node data from the first pivotal node. The method also includes creating a second TBCL corresponding to the tree. The second TBCL may include a third plurality of nodes. The method further includes determining a feature vector for a third pivotal node in the third plurality of nodes based on the feature vectors from: (i) the second pivotal node, (ii) a parent node of the second pivotal node, and (iii) a child node of the second pivotal node.
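One plausible reading of the layer-to-layer pivotal-node update, as a toy sketch: the next TBCL's pivotal feature vector is a function of the current pivotal node, its parent, and a child. The weight matrices and the `tanh` nonlinearity are assumptions for illustration, not taken from the patent.

```python
# Toy sketch: convolution over the tree neighborhood of the pivotal node.
import numpy as np

rng = np.random.default_rng(0)
d = 4  # feature dimension (assumed)
W_self, W_parent, W_child = (rng.standard_normal((d, d)) for _ in range(3))

def next_pivotal_feature(h_pivot, h_parent, h_child):
    # Combine the pivotal node's feature vector with those of its parent and
    # child, then apply a nonlinearity (tanh chosen for illustration).
    return np.tanh(W_self @ h_pivot + W_parent @ h_parent + W_child @ h_child)

h_next = next_pivotal_feature(rng.standard_normal(d),
                              rng.standard_normal(d),
                              rng.standard_normal(d))
print(h_next.shape)  # -> (4,): the third pivotal node's feature vector
```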