-
Publication No.: US20220398431A1
Publication Date: 2022-12-15
Application No.: US17775549
Filing Date: 2019-11-13
Inventors: Kenji Tanaka, Yuki Arikawa, Kenji Kawai, Junichi Kato, Tsuyoshi Ito, Huycu Ngo, Takeshi Sakamoto
Abstract: A distributed deep learning system includes a plurality of distributed processing nodes. Each node includes a header reading unit that reads layer information from the headers of a first data frame that has arrived at the node and of a second data frame that arrives next. The two pieces of layer information are compared; calculation processing is executed for the data frame whose data belongs to the layer closer to the input layer, and calculation processing for the data frame whose data belongs to the layer closer to the output layer is skipped.
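The layer-comparison rule in this abstract can be sketched as follows. This is an illustrative sketch only, not the patented implementation; the frame representation, the field name `layer`, and the convention that a smaller index means "closer to the input layer" are all assumptions.

```python
def select_frame(first_frame: dict, second_frame: dict) -> tuple:
    """Compare the layer indices read from two frame headers and return
    (frame_to_process, frame_to_skip).  A smaller index is assumed to
    mean the layer is closer to the input layer, so that frame wins."""
    if first_frame["layer"] <= second_frame["layer"]:
        return first_frame, second_frame
    return second_frame, first_frame
```

For example, given a frame carrying layer-2 data and one carrying layer-5 data, the layer-2 frame is processed and the layer-5 frame is skipped.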
-
Publication No.: US20210357723A1
Publication Date: 2021-11-18
Application No.: US17291229
Filing Date: 2019-10-23
Inventors: Kenji Kawai, Junichi Kato, Huycu Ngo, Yuki Arikawa, Kenji Tanaka, Takeshi Sakamoto, Tsuyoshi Ito
Abstract: A distributed processing system includes a plurality of lower-order aggregation networks and a higher-order aggregation network. Each lower-order aggregation network includes a plurality of distributed processing nodes arranged in a ring. Each distributed processing node generates distributed data for each weight of its own neural network. Each lower-order aggregation network aggregates the distributed data generated by its distributed processing nodes. The higher-order aggregation network generates aggregated data by further aggregating the results of the lower-order aggregation networks, and distributes it to the lower-order aggregation networks. Each lower-order aggregation network distributes the aggregated data to the distributed processing nodes belonging to it. The distributed processing nodes update the weights of their neural networks based on the distributed aggregated data.
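The two-level aggregation described above can be sketched numerically. This is a minimal illustration, assuming each node's "distributed data" is a per-weight gradient vector; the function names and data layout are assumptions, not the patented design.

```python
def aggregate(ring: list[list[float]]) -> list[float]:
    """Lower-order step: element-wise sum of the distributed data of all
    nodes in one lower-order aggregation network (one ring)."""
    return [sum(vals) for vals in zip(*ring)]

def two_level_allreduce(rings: list[list[list[float]]]) -> list[float]:
    """Lower-order networks aggregate locally; the higher-order network
    sums the per-ring results into the aggregated data, which is then
    distributed back to every node."""
    per_ring = [aggregate(r) for r in rings]        # lower-order aggregation
    return [sum(vals) for vals in zip(*per_ring)]   # higher-order aggregation
```

With two rings of two nodes each, the result equals the element-wise sum over all four nodes, which is what every node receives before updating its weights.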
-
Publication No.: US10193630B2
Publication Date: 2019-01-29
Application No.: US15555960
Filing Date: 2016-03-04
Inventors: Shoko Ohteru, Namiko Ikeda, Saki Hatta, Satoshi Shigematsu, Nobuyuki Tanaka, Kenji Kawai, Junichi Kato, Tomoaki Kawamura, Hiroyuki Uzawa, Yuki Arikawa, Naoki Miura
IPC Classes: H04J14/08, H04B10/27, H04B10/272, H04L12/44, H04B10/038, H04B10/40, H04J14/02
Abstract: A selection and distribution circuit (13) is provided between N optical transceivers (11) and one PON control circuit (12). The selection and distribution circuit (13) selects the optical transceiver (11) corresponding to an upstream frame that arrives in a time-division manner, transfers the upstream frame opto-electrically converted by that transceiver (11) to the PON control circuit (12), and distributes a downstream frame from the PON control circuit (12) to each optical transceiver (11). A power supply control circuit (23) stops power supply to at least one of an optical transceiver (11) that is not being used to transfer a frame and a circuit in the selection and distribution circuit (13) that is not being used to transfer a frame. This reduces the system cost per ONU of the optical transmission system.
-
Publication No.: US12008468B2
Publication Date: 2024-06-11
Application No.: US16967702
Filing Date: 2019-02-06
Inventors: Junichi Kato, Kenji Kawai, Huycu Ngo, Yuki Arikawa, Tsuyoshi Ito, Takeshi Sakamoto
Abstract: Each learning node calculates gradients of a loss function from the output obtained by inputting learning data to the neural network being trained, packetizes the result, and transmits the packet to a computing interconnect device. The computing interconnect device receives the packets transmitted from the learning nodes, acquires the gradient values stored in them, calculates the sum of the gradients, packetizes the result, and transmits the packet to each learning node. Each learning node receives the packet transmitted from the computing interconnect device and updates a constituent parameter of its neural network based on the value stored in the packet.
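The reduce-and-update loop described above can be sketched as two small functions. This is an illustrative sketch under assumed names; each "packet" is modeled as a plain gradient vector, and the learning rate is an assumption.

```python
def interconnect_reduce(packets: list[list[float]]) -> list[float]:
    """Computing-interconnect side: element-wise sum of the gradient
    values stored in the packets received from all learning nodes."""
    return [sum(vals) for vals in zip(*packets)]

def update_parameters(weights: list[float], grad_sum: list[float],
                      lr: float = 0.1) -> list[float]:
    """Learning-node side: update the constituent parameters using the
    summed gradients returned by the interconnect (plain SGD step)."""
    return [w - lr * g for w, g in zip(weights, grad_sum)]
```

A node would send its gradients, receive `interconnect_reduce`'s output broadcast back, and apply `update_parameters` locally.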
-
Publication No.: US11360741B2
Publication Date: 2022-06-14
Application No.: US16959968
Filing Date: 2018-12-18
Inventors: Kenji Kawai, Ryo Awata, Kazuhito Takei, Masaaki Iizuka
Abstract: An arithmetic circuit includes an LUT generation circuit (1) that, when the coefficients c[n] (n = 1, …, N) are paired two by two, outputs a value calculated for each pair, and distributed arithmetic circuits (2-m) that calculate, in parallel for each of M data sets, the product-sum values y[m] obtained by multiplying the data x[m, n] of a data set X[m] by the coefficients c[n] and summing the products. Each distributed arithmetic circuit (2-m) includes a plurality of binomial distributed arithmetic circuits that calculate the binomial product-sum value in parallel for each pair, based on the N data x[m, n] paired two by two, the coefficients c[n] paired two by two, and the value calculated by the LUT generation circuit (1), and a binomial distributed arithmetic result summing circuit that sums the values calculated by the binomial distributed arithmetic circuits and outputs the sum as y[m].
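The pairwise ("binomial") product-sum structure can be illustrated arithmetically. This sketch is not the bit-level LUT circuit of the patent; it only shows one way a per-pair precomputed table can serve a two-term product-sum, using the identity x1·c1 + x2·c2 = ((x1+x2)(c1+c2) + (x1−x2)(c1−c2)) / 2, where the pair values (c1+c2, c1−c2) play the role of LUT entries. All names are assumptions.

```python
def make_lut(coeffs: list[float]) -> list[tuple]:
    """Pair coefficients two by two and precompute (c1+c2, c1-c2)
    for each pair, standing in for the LUT generation circuit."""
    pairs = [(coeffs[i], coeffs[i + 1]) for i in range(0, len(coeffs), 2)]
    return [(c1 + c2, c1 - c2) for c1, c2 in pairs]

def product_sum(x: list[float], lut: list[tuple]) -> float:
    """Sum the binomial results x1*c1 + x2*c2 over all pairs, each
    computed from the precomputed pair values (the 'LUT')."""
    total = 0.0
    for k, (s, d) in enumerate(lut):
        x1, x2 = x[2 * k], x[2 * k + 1]
        total += ((x1 + x2) * s + (x1 - x2) * d) / 2
    return total
```

Expanding the identity confirms each binomial unit reproduces x1·c1 + x2·c2 exactly, so the summing stage yields the full product-sum y[m].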
-
Publication No.: US11240296B2
Publication Date: 2022-02-01
Application No.: US17287063
Filing Date: 2019-10-07
Inventors: Kenji Kawai, Junichi Kato, Huycu Ngo, Yuki Arikawa, Tsuyoshi Ito, Takeshi Sakamoto
Abstract: A first distributed processing node transmits its distributed data to a second distributed processing node as intermediate consolidated data. A third distributed processing node generates updated intermediate consolidated data from the received intermediate consolidated data and its own distributed data, and transmits it to a fourth distributed processing node. The first distributed processing node transmits the intermediate consolidated data it receives to a fifth distributed processing node as consolidated data. The third distributed processing node transmits the received consolidated data to a sixth distributed processing node. When the aggregation communication time period required by each distributed processing node to consolidate the distributed data, or the aggregation-dispatch communication time period (the total of the aggregation communication time period and the time required by each distributed processing node to dispatch the consolidated data), exceeds a predetermined time period, the first distributed processing node issues a warning.
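The ring-style consolidation and the timeout warning described above can be sketched as follows. This is an illustrative sketch only; the node ordering, data layout, and function names are assumptions, and real nodes would accumulate incrementally as frames circulate rather than in one loop.

```python
def ring_consolidate(node_data: list[list[float]]) -> list[float]:
    """Aggregation pass: starting from the first node's distributed
    data, each subsequent node adds its own data to the intermediate
    consolidated data as it travels around the ring."""
    consolidated = list(node_data[0])
    for data in node_data[1:]:
        consolidated = [c + d for c, d in zip(consolidated, data)]
    return consolidated  # then dispatched around the ring as consolidated data

def aggregation_overdue(elapsed_s: float, limit_s: float) -> bool:
    """Warning condition: True when the aggregation (or aggregation plus
    dispatch) communication time exceeds the predetermined period."""
    return elapsed_s > limit_s
```

For three nodes the result is the element-wise sum of all three data vectors; the first node would raise its warning whenever `aggregation_overdue` returns True.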
-
Publication No.: US11036871B2
Publication Date: 2021-06-15
Application No.: US16332777
Filing Date: 2017-09-13
Inventors: Takeshi Sakamoto, Kenji Kawai, Junichi Kato, Kazuhiko Terada, Hiroyuki Uzawa, Nobuyuki Tanaka, Tomoaki Kawamura
IPC Classes: G06F21/60, H04L12/44, H04B10/2575, H04B10/11, H04B10/27
Abstract: An OLT (10) is provided with a priority control bypass circuit (16) and an encryption/decryption bypass circuit (17), or an ONU (20) is provided with a priority control bypass circuit (26) and an encryption/decryption bypass circuit (27), and one or both of the encryption/decryption processing and the priority control processing are bypassed in accordance with a priority control bypass instruction (BP) and an encryption/decryption bypass instruction (BE) that are set in advance. This reduces the processing delay that occurs in the OLT or the ONU.
-
Publication No.: US20210056416A1
Publication Date: 2021-02-25
Application No.: US16979066
Filing Date: 2019-02-25
Inventors: Junichi Kato, Kenji Kawai, Huycu Ngo, Yuki Arikawa, Tsuyoshi Ito, Takeshi Sakamoto
Abstract: Each learning node calculates the gradient of a loss function from the output obtained when learning data is input to the neural network to be learned, generates a packet for a plurality of gradient components, and transmits the packet to the computing interconnect device. The computing interconnect device acquires the values of the gradient components stored in the packets transmitted from the learning nodes, performs, in parallel over the configuration values of each gradient, a calculation that takes as input the gradient values for the same configuration parameter of the neural network, generates a packet for the calculation results, and transmits the packet to each learning node. Each learning node updates the configuration parameters of its neural network based on the values stored in the packet.
-
Publication No.: US10485008B2
Publication Date: 2019-11-19
Application No.: US15756016
Filing Date: 2016-08-26
Inventors: Yuki Arikawa, Hiroyuki Uzawa, Kenji Kawai, Satoshi Shigematsu
Abstract: A convergence pattern selection unit (10A) sequentially generates a plurality of different patterns from designated initial conditions, selects as a convergence pattern the pattern in which the evaluation value has converged to an extreme value, and repeats the selection, changing the initial conditions each time a convergence pattern is selected. A transmission pattern determination unit (10B) selects, as the optimum transmission pattern, the convergence pattern obtained by the convergence pattern selection unit (10A) that has the highest evaluation value. This enables searching for an optimum transmission pattern with a better evaluation value.
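The search scheme described above resembles a multi-start local search: climb from each initial condition until the evaluation value reaches a local extreme, then keep the best converged result. The sketch below illustrates that idea only; the actual pattern encoding, evaluation function, and neighbor generation of the patent are not given, so those are assumptions here.

```python
def local_search(evaluate, pattern, neighbors, max_steps=100):
    """Climb toward higher evaluation values from `pattern` until no
    neighbor improves, i.e. the evaluation value has converged to an
    extreme value (the 'convergence pattern')."""
    for _ in range(max_steps):
        best = max(neighbors(pattern), key=evaluate, default=pattern)
        if evaluate(best) <= evaluate(pattern):
            return pattern  # local extreme reached
        pattern = best
    return pattern

def select_transmission_pattern(evaluate, initial_conditions, neighbors):
    """Repeat the local search from each initial condition and select
    the convergence pattern with the highest evaluation value."""
    converged = [local_search(evaluate, p0, neighbors) for p0 in initial_conditions]
    return max(converged, key=evaluate)
```

Restarting from varied initial conditions is what lets the search escape a poor local extreme and find a convergence pattern with a better evaluation value.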
-
Publication No.: US10397133B2
Publication Date: 2019-08-27
Application No.: US15745413
Filing Date: 2016-07-14
Inventors: Saki Hatta, Tomoaki Kawamura, Kenji Kawai, Nobuyuki Tanaka, Satoshi Shigematsu, Namiko Ikeda, Shoko Ohteru, Junichi Kato
Abstract: An upstream allocation circuit (14) and a downstream allocation circuit (15) are provided in an OLT (1). For example, a superimposed frame obtained by bundling the upstream frames (upstream control frames plus upstream data frames) from all ONUs is input to the upstream allocation circuit (14) via a frame reproduction circuit (12-1). The superimposed frame may be generated at the optical-signal stage or after converting the optical signals into electrical signals. The upstream allocation circuit (14) allocates each upstream control frame bundled into the superimposed frame to a predetermined PON control circuit (13) based on information (PON port number or LLID) added to the frames. The downstream allocation circuit (15) allocates each downstream control frame output from the PON control circuits (13) to a preset frame reproduction circuit (12).
-