Efficient parallel training of a network model on multiple graphics processing units

    Publication No.: US10949746B2

    Publication Date: 2021-03-16

    Application No.: US15423900

    Filing Date: 2017-02-03

    IPC Classes: G06N3/08 G06N3/04

    Abstract: A system and method provide efficient parallel training of a neural network model on multiple graphics processing units. A training module reduces the time and communication overhead of gradient accumulation and parameter updating of the network model in a neural network by overlapping processes in an advantageous way. In a described embodiment, a training module overlaps backpropagation, gradient transfer, and accumulation in a Synchronous Stochastic Gradient Descent algorithm on a convolutional neural network. The training module collects gradients of multiple layers during backpropagation of training from a plurality of graphics processing units (GPUs), accumulates the gradients on at least one processor, and then delivers the gradients of the layers to the plurality of GPUs during the backpropagation of the training. The whole model parameters can then be updated on the GPUs after receipt of the gradient of the last layer.
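    The layer-wise overlap described in the abstract can be sketched as follows. This is a minimal simulation, not the patented implementation: simulated "GPUs" are plain lists of per-layer gradients, and the accumulation that would run concurrently with backpropagation is modeled as a step inside the reversed layer loop.

```python
# Sketch of overlapping gradient accumulation with backpropagation:
# each layer's gradients are summed across workers and delivered back
# as soon as that layer's backward pass completes, rather than after
# the whole backward pass. Names are illustrative, not from the patent.

def backward_with_overlap(worker_grads):
    """worker_grads: one list of per-layer gradients per simulated GPU,
    ordered from first layer to last. Backprop visits layers in reverse,
    so each layer can be reduced and broadcast immediately."""
    num_layers = len(worker_grads[0])
    accumulated = [None] * num_layers
    for layer in reversed(range(num_layers)):   # backpropagation order
        total = sum(g[layer] for g in worker_grads)  # accumulate on host
        accumulated[layer] = total
        for g in worker_grads:
            g[layer] = total   # deliver the summed gradient to each GPU
    return accumulated

grads_gpu0 = [1.0, 2.0, 3.0]   # per-layer gradients from simulated GPU 0
grads_gpu1 = [0.5, 0.5, 0.5]   # per-layer gradients from simulated GPU 1
result = backward_with_overlap([grads_gpu0, grads_gpu1])
```

    Once the last layer's (index 0) summed gradient arrives, every GPU holds the full accumulated gradient and can update the whole model's parameters, matching the final step in the abstract.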

    APPLICATION PERFORMANCE SIMULATOR
    12.
    Invention application

    Publication No.: US20200065214A1

    Publication Date: 2020-02-27

    Application No.: US16110324

    Filing Date: 2018-08-23

    IPC Classes: G06F11/34 G06F1/32 G06F9/50

    Abstract: A computer-implemented method, system, and computer program product are provided to simulate a target system. The method includes determining system performance metrics for a target system and an execution system. The method also includes generating a ratio of estimation between the system performance metrics for the target system and the execution system. The method additionally includes throttling components in the execution system to adjust all of the system performance metrics of the execution system responsive to the ratio of estimation to create a throttled execution system. The method further includes measuring a throttled execution time while running an application on the throttled execution system. The method also includes estimating a target execution time for the application on the target system responsive to the throttled execution time.
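    The estimation flow above can be sketched as below. The metric names and the assumption that throttling can scale each component linearly are illustrative choices for the example, not details from the patent.

```python
# Sketch of the simulator's flow: compute a per-metric ratio of
# estimation, then throttle the execution system so its effective
# metrics match the target system's.

def ratio_of_estimation(target_metrics, exec_metrics):
    """Per-metric ratio between the target and execution systems."""
    return {k: target_metrics[k] / exec_metrics[k] for k in target_metrics}

def throttle(exec_metrics, ratios):
    """Throttle every component so the execution system's effective
    metrics equal the target system's (linear scaling assumed)."""
    return {k: exec_metrics[k] * ratios[k] for k in exec_metrics}

target = {"cpu_ghz": 2.0, "mem_gbps": 50.0}     # illustrative metrics
exec_sys = {"cpu_ghz": 4.0, "mem_gbps": 200.0}
ratios = ratio_of_estimation(target, exec_sys)  # cpu 0.5, memory 0.25
throttled = throttle(exec_sys, ratios)
# With all metrics matched, the execution time measured on the
# throttled system serves as the estimate of the target execution time.
```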

    ESTIMATING PERFORMANCE OF GPU APPLICATION FOR DIFFERENT GPU-LINK PERFORMANCE RATIO

    Publication No.: US20190325549A1

    Publication Date: 2019-10-24

    Application No.: US15956321

    Filing Date: 2018-04-18

    IPC Classes: G06T1/20 G06F9/38

    Abstract: A computer-implemented method is provided for estimating the performance of a GPU application on a new computing machine having an increased GPU-link performance ratio relative to a current computing machine having a current GPU-link performance ratio. The method includes adding a delay to CPU-GPU communication on the current computing machine to simulate a delayed-communication environment on the current computing machine. The method further includes executing the target GPU application in the delayed-communication environment. The method also includes measuring the performance of the target GPU application in the delayed-communication environment. The method additionally includes estimating the performance of the new computing machine having the increased GPU-link performance ratio, based on the measured performance of the target GPU application in the delayed-communication environment.
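    One way the delayed-communication measurement can drive an estimate is via a simple two-component time model. The linear compute-plus-communication model and the factor-of-2 delayed run below are assumptions made for this sketch, not details from the patent.

```python
# Hedged sketch: slow CPU-GPU communication by a known factor, then
# solve total = compute + comm from the baseline and delayed runs.

def decompose(time_baseline, time_delayed, delay_factor=2.0):
    """Assume time_delayed = compute + delay_factor * comm and
    time_baseline = compute + comm; solve for both components."""
    comm = (time_delayed - time_baseline) / (delay_factor - 1.0)
    compute = time_baseline - comm
    return compute, comm

def estimate_for_ratio(compute, comm, rel_comm_slowdown):
    """Estimated runtime when the GPU-link performance ratio changes so
    that communication costs rel_comm_slowdown times its current cost."""
    return compute + rel_comm_slowdown * comm

compute, comm = decompose(10.0, 14.0)           # baseline 10 s, delayed 14 s
t_new = estimate_for_ratio(compute, comm, 3.0)  # communication 3x relatively slower
```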

    MEMORY REDUCTION FOR NEURAL NETWORKS WITH FIXED STRUCTURES

    Publication No.: US20190303025A1

    Publication Date: 2019-10-03

    Application No.: US15943079

    Filing Date: 2018-04-02

    IPC Classes: G06F3/06 G06N3/08

    Abstract: A method is provided for reducing memory consumption in a propagation process for a neural network (NN) having fixed structures for computation order and node data dependency. The memory includes memory segments for allocation to nodes. The method collects, in an NN training iteration, information for each node relating to its allocation, size, and lifetime. Responsive to the information, the method chooses a first node having a maximum memory size relative to the remaining nodes, and a second node whose lifetime does not overlap the first node's. It chooses another node whose lifetime also does not overlap the first node's, provided the sum of the memory sizes of the second node and the other node does not exceed the first node's memory size. The method then reallocates the memory segment allocated to the first node to the second node and the other node, so that both can reuse it.
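    The segment-reuse selection can be sketched as below. Node names, sizes, and lifetime intervals are made-up inputs; the selection rule follows the abstract: pick the largest node, then pack non-overlapping nodes into its segment while their combined size fits.

```python
# Sketch of the reuse plan: the max-size node donates its memory
# segment to nodes whose lifetimes do not overlap its own, as long as
# their combined sizes fit within the donated segment.

def lifetimes_overlap(a, b):
    """Closed intervals (start, end) overlap iff each starts before the
    other ends."""
    return a[0] <= b[1] and b[0] <= a[1]

def plan_reuse(nodes):
    """nodes: dict name -> (size, (start, end)).
    Returns (donor, sharers): the max-size node and the nodes chosen to
    reuse its segment."""
    donor = max(nodes, key=lambda n: nodes[n][0])
    d_size, d_life = nodes[donor]
    sharers, used = [], 0
    for name, (size, life) in nodes.items():
        if name == donor or lifetimes_overlap(life, d_life):
            continue
        if used + size <= d_size:       # combined sizes must fit
            sharers.append(name)
            used += size
    return donor, sharers

nodes = {"A": (100, (0, 5)), "B": (40, (6, 8)),
         "C": (30, (7, 9)), "D": (50, (3, 7))}
donor, sharers = plan_reuse(nodes)   # D overlaps A's lifetime, so only B and C share
```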

    Packet communication system, communication method and program
    16.
    Granted patent (in force)

    Publication No.: US09066289B2

    Publication Date: 2015-06-23

    Application No.: US13890338

    Filing Date: 2013-05-09

    Abstract: A system including multiple nodes performing radio communication, wherein each node stores routing information, uses it to determine a transmission path, and performs cut-through transmission, exchanging packets with nodes on the determined path over radio waves given directivity by controlling their phases. In the system, time synchronization and the exchange of packet communication records are performed during a certain time period by carrying out cut-through transmission while controlling the phases of the radio waves so that all of the nodes form one or more closed loops. Each node transmits and receives packets in accordance with the routing information and a time frame assigned to it as the time when it is allowed to transmit and receive a packet, updates the routing information, and shares it with the other nodes.
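    The routing-table lookup and the per-node time-frame rule can be sketched as below. The routing-table shape, the one-slot-per-node round-robin schedule, and the node names are illustrative assumptions, not details from the patent.

```python
# Sketch of the two per-node rules in the abstract: forward along the
# stored route, and transmit only during the node's assigned time frame.

def next_hop(routing, src, dst):
    """Look up the next node on the path from src toward dst."""
    return routing[src][dst]

def can_transmit(schedule, node, t, frame_len=1):
    """A node may transmit only while the current time t falls inside
    its assigned frame; frames repeat round-robin over all nodes."""
    start = schedule[node]
    period = frame_len * len(schedule)
    return start <= t % period < start + frame_len

routing = {"A": {"C": "B"}, "B": {"C": "C"}}  # A reaches C via B
schedule = {"A": 0, "B": 1, "C": 2}           # one slot per node
hop = next_hop(routing, "A", "C")
ok = can_transmit(schedule, "A", 3)           # t=3 wraps back to A's frame
```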


    DATA SWAPPING FOR NEURAL NETWORK MEMORY CONSERVATION

    Publication No.: US20220138580A1

    Publication Date: 2022-05-05

    Application No.: US17089245

    Filing Date: 2020-11-04

    IPC Classes: G06N3/08

    Abstract: Methods and systems for training a neural network include identifying units within a neural network, including a first unit for memory swapping and a second unit for re-computation, to balance memory efficiency with computational efficiency. Each unit includes at least one layer of the neural network. Each unit has a first layer that is a checkpoint operation. During a feed-forward training stage, feature maps are stored in a first memory. The feature maps are output by the at least one layer of the first unit. The feature maps are swapped from the first memory to a second memory. During a backpropagation stage, the feature maps for the first unit are swapped from the second memory to the first memory. Feature maps for the second unit are re-computed.
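    The swap/re-compute split can be sketched as below. Dictionaries stand in for GPU ("fast") and host ("slow") memory, and the re-computation is simulated rather than actually re-running a forward pass; unit names and strategies are illustrative.

```python
# Sketch of the two memory-conservation strategies in the abstract:
# 'swap' units move their feature maps to slow memory after the forward
# pass; 'recompute' units discard them and re-derive them on backward.

class Unit:
    def __init__(self, name, strategy):
        self.name = name
        self.strategy = strategy   # 'swap' or 'recompute'

def forward(units, fast, slow):
    """Feed-forward: produce each unit's feature maps in fast memory,
    swapping 'swap' units out to slow memory right away."""
    for u in units:
        fast[u.name] = f"fmap:{u.name}"
        if u.strategy == "swap":
            slow[u.name] = fast.pop(u.name)

def backward(units, fast, slow):
    """Backpropagation: swap 'swap' units back into fast memory and
    re-compute the rest from their checkpoint (simulated here)."""
    for u in reversed(units):
        if u.strategy == "swap":
            fast[u.name] = slow.pop(u.name)
        else:
            fast[u.name] = f"fmap:{u.name}"   # stand-in for re-computation

units = [Unit("u1", "swap"), Unit("u2", "recompute")]
fast, slow = {}, {}
forward(units, fast, slow)    # u1's maps now live in slow memory
backward(units, fast, slow)   # everything needed is back in fast memory
```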

    Multi-GPU deep learning using CPUs
    18.
    Granted patent

    Publication No.: US11164079B2

    Publication Date: 2021-11-02

    Application No.: US15843244

    Filing Date: 2017-12-15

    IPC Classes: G06N3/08 G06T1/20 G06N3/04

    Abstract: A computer-implemented method, computer program product, and computer processing system are provided for accelerating neural network data parallel training in multiple graphics processing units (GPUs) using at least one central processing unit (CPU). The method includes forming a set of chunks. Each of the chunks includes a respective group of neural network layers other than a last layer. The method further includes performing one or more chunk-wise synchronization operations during a backward phase of the neural network data parallel training, by each of the multiple GPUs and the at least one CPU.
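    The chunk formation and chunk-wise synchronization can be sketched as below. The chunk size and the plain-Python reduction standing in for the CPU-side synchronization are assumptions for the example.

```python
# Sketch of chunk-wise synchronization: layers other than the last are
# grouped into chunks, and each chunk's gradients are reduced (here, a
# simple sum standing in for the CPU's role) during the backward phase.

def make_chunks(num_layers, chunk_size):
    """Group all layers except the last into chunks of chunk_size."""
    layers = list(range(num_layers - 1))
    return [layers[i:i + chunk_size] for i in range(0, len(layers), chunk_size)]

def backward_chunkwise(gpu_grads, chunks):
    """gpu_grads: one list of per-layer gradients per simulated GPU.
    Reduce each chunk as soon as its backward pass completes."""
    reduced = {}
    for chunk in reversed(chunks):        # backward phase visits chunks in reverse
        for layer in reversed(chunk):
            reduced[layer] = sum(g[layer] for g in gpu_grads)
    return reduced

chunks = make_chunks(5, 2)    # layer 4, the last layer, is excluded
grads = backward_chunkwise([[1, 1, 1, 1, 1], [2, 2, 2, 2, 2]], chunks)
```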

    Estimating performance of GPU application for different GPU-link performance ratio

    Publication No.: US10453167B1

    Publication Date: 2019-10-22

    Application No.: US15956321

    Filing Date: 2018-04-18

    IPC Classes: G06F11/34 G06T1/20 G06F9/38

    Abstract: A computer-implemented method is provided for estimating the performance of a GPU application on a new computing machine having an increased GPU-link performance ratio relative to a current computing machine having a current GPU-link performance ratio. The method includes adding a delay to CPU-GPU communication on the current computing machine to simulate a delayed-communication environment on the current computing machine. The method further includes executing the target GPU application in the delayed-communication environment. The method also includes measuring the performance of the target GPU application in the delayed-communication environment. The method additionally includes estimating the performance of the new computing machine having the increased GPU-link performance ratio, based on the measured performance of the target GPU application in the delayed-communication environment.