-
Publication No.: US20240135147A1
Publication date: 2024-04-25
Application No.: US18450839
Filing date: 2023-08-15
Inventors: Jung Ho AHN, Sun Jung LEE, Jae Wan CHOI
IPC classification: G06N3/0455
CPC classification: G06N3/0455
Abstract: A device including processors configured to execute instructions and memories storing the instructions, which, when executed by the processors, configure the processors to perform an operation for training a transformer model having a plurality of encoders and a plurality of decoders by configuring the processors to divide batches of training data into a plurality of micro-batches, select layer pairs for the plurality of micro-batches, assemble a processing order of the layer pairs, determine resource information to be allocated to the layer pairs, and allocate resources to the layer pairs based on the determined resource information, dependent on the processing order of the layer pairs.
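The steps named in this abstract (micro-batching, ordering work over layer pairs, and allocating resources) can be illustrated with a minimal sketch. The abstract does not say how layer pairs are selected, ordered, or costed, so everything below is an assumption made for illustration: the function names, the wavefront ordering rule, and the proportional allocation rule are hypothetical and are not taken from the patent.

```python
from itertools import product


def split_into_micro_batches(batch, num_micro_batches):
    """Split a list of samples into contiguous, nearly equal micro-batches."""
    size, extra = divmod(len(batch), num_micro_batches)
    micro_batches, start = [], 0
    for i in range(num_micro_batches):
        end = start + size + (1 if i < extra else 0)
        micro_batches.append(batch[start:end])
        start = end
    return micro_batches


def processing_order(num_micro_batches, num_layer_pairs):
    """Assumed ordering rule (not from the abstract): a wavefront in which
    work item (m, p) runs after both (m, p - 1) and (m - 1, p)."""
    items = product(range(num_micro_batches), range(num_layer_pairs))
    return sorted(items, key=lambda item: (item[0] + item[1], item[0]))


def allocate_resources(order, pair_costs, total_units):
    """Assumed allocation rule: share units in proportion to estimated cost."""
    total_cost = sum(pair_costs)
    return [(m, p, total_units * pair_costs[p] / total_cost) for m, p in order]


micro_batches = split_into_micro_batches(list(range(10)), 4)
order = processing_order(len(micro_batches), 2)
allocation = allocate_resources(order, pair_costs=[1.0, 3.0], total_units=8)
```

Here each work item is a (micro-batch index, layer-pair index) tuple, and `pair_costs` stands in for whatever resource information a real scheduler would measure or estimate.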
-
Publication No.: US20240232581A9
Publication date: 2024-07-11
Application No.: US18450839
Filing date: 2023-08-16
Inventors: Jung Ho AHN, Sun Jung LEE, Jae Wan CHOI
IPC classification: G06N3/0455
CPC classification: G06N3/0455
Abstract: A device including processors configured to execute instructions and memories storing the instructions, which, when executed by the processors, configure the processors to perform an operation for training a transformer model having a plurality of encoders and a plurality of decoders by configuring the processors to divide batches of training data into a plurality of micro-batches, select layer pairs for the plurality of micro-batches, assemble a processing order of the layer pairs, determine resource information to be allocated to the layer pairs, and allocate resources to the layer pairs based on the determined resource information, dependent on the processing order of the layer pairs.
-
Publication No.: US20240184630A1
Publication date: 2024-06-06
Application No.: US18526603
Filing date: 2023-12-01
Inventors: Jung Ho AHN, Sun Jung LEE, Jae Wan CHOI, Seung Hwan HWANG
CPC classification: G06F9/5027, G06F5/01, G06F15/8046
Abstract: A device and method with batch normalization are provided. An accelerator includes: core modules, each core module including a respective plurality of cores configured to perform a first convolution operation using feature map data and a weight; local reduction operation modules adjacent to the respective core modules, each including a respective plurality of local reduction operators configured to perform a first local operation that obtains first local statistical values of the corresponding core module; a global reduction operation module configured to perform a first global operation that generates first global statistical values of the core module based on the first local statistical values of the core modules; and a normalization operation module configured to perform a first normalization operation on the feature map data based on the first global statistical values.
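The local-then-global reduction described in this abstract can be sketched in software as a two-stage computation of batch-normalization statistics. This is a minimal sketch under stated assumptions: the abstract does not name the statistics, so the choice of count, sum, and sum of squares as the local values, the `eps` constant, and all function names are hypothetical, and each inner list merely stands in for the feature-map values held by one core module.

```python
import math


def local_stats(values):
    """Per-module partial statistics: count, sum, and sum of squares."""
    return len(values), sum(values), sum(v * v for v in values)


def global_stats(partials):
    """Combine per-module partials into a global mean and (biased) variance."""
    n = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    total_sq = sum(p[2] for p in partials)
    mean = total / n
    var = total_sq / n - mean * mean
    return mean, var


def normalize(values, mean, var, eps=1e-5):
    """Normalize values with the global statistics (eps is an assumed constant)."""
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std for v in values]


# Two hypothetical core modules, each holding part of the feature map.
modules = [[1.0, 2.0], [3.0, 4.0]]
mean, var = global_stats([local_stats(m) for m in modules])
normalized = [normalize(m, mean, var) for m in modules]
```

Reducing locally first means each module forwards only three numbers to the global stage, regardless of how many feature-map values it holds.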
-