-
Publication No.: US20230297487A1
Publication Date: 2023-09-21
Application No.: US17889135
Filing Date: 2022-08-16
Inventors: Myeong Woo KIM, Kunwoo KIM, Changsu KIM, Hanjun KIM
CPC Classification: G06F11/3423, G06F11/3037
Abstract: A method and apparatus for estimating the execution time of a neural network in a multi-core accelerator are provided. The method includes generating trace information including operation timing information for each core of the multi-core accelerator, and calculating the execution time of the neural network based on the trace information, reflecting both the communication overhead between cores of the multi-core accelerator and the memory access time of each core.
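A minimal sketch of the idea described in the abstract: per-core trace events carry timing information, and the estimate folds in communication overhead and memory access time per core. All names here (`TraceEvent`, `estimate_time`, the cycle fields) are illustrative assumptions, not taken from the patent itself.

```python
from dataclasses import dataclass

@dataclass
class TraceEvent:
    core: int
    compute_cycles: int      # operation timing from the trace
    mem_access_cycles: int   # memory access time for this core
    comm_cycles: int         # inter-core communication overhead

def estimate_time(trace: list[TraceEvent]) -> int:
    """Accumulate compute, memory-access, and communication cycles per core;
    the network finishes when the slowest core finishes."""
    per_core: dict[int, int] = {}
    for ev in trace:
        per_core[ev.core] = per_core.get(ev.core, 0) + (
            ev.compute_cycles + ev.mem_access_cycles + ev.comm_cycles)
    return max(per_core.values(), default=0)
```

Taking the maximum over per-core totals models the cores running in parallel: the overall execution time is bounded by the busiest core.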
-
Publication No.: US20230140239A1
Publication Date: 2023-05-04
Application No.: US17868361
Filing Date: 2022-07-19
Inventors: Myeong Woo KIM, Yongdeok KIM, Narankhuu TUVSHINJARGAL, Gunhee KIM, Seungwon LEE, Changin CHOI
IPC Classification: G06F16/906, G06F40/20, G06K9/62
Abstract: A processor-implemented method with data loading includes: dividing a training data set into a plurality of subsets based on the sizes of the data files included in the training data set; loading, from each subset, a portion of its data files to a plurality of processors based on the ratio of the number of data files among the subsets and the batch size of distributed training; and reallocating the loaded data files to processors in a same group among the plurality of processors, based on the sizes of the data files loaded to those processors.
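A rough sketch of the two first steps in the abstract: splitting a file set into size-based subsets, and deciding how many files to draw from each subset in proportion to its share of the data for one distributed-training batch. Function names and the exact splitting policy are assumptions for illustration only.

```python
def divide_by_size(files: dict[str, int], num_subsets: int) -> list[list[str]]:
    """Sort files by size and slice into contiguous buckets, so each
    subset holds files of similar size."""
    ordered = sorted(files.items(), key=lambda kv: kv[1])
    base, extra = divmod(len(ordered), num_subsets)
    subsets, start = [], 0
    for i in range(num_subsets):
        end = start + base + (1 if i < extra else 0)
        subsets.append([name for name, _ in ordered[start:end]])
        start = end
    return subsets

def files_per_subset(subset_counts: list[int], batch_size: int) -> list[int]:
    """Load from each subset in proportion to its share of all files,
    scaled to one distributed-training batch."""
    total = sum(subset_counts)
    return [round(batch_size * n / total) for n in subset_counts]
```

Grouping similar-sized files keeps the per-processor load balanced before the final size-based reallocation step the abstract mentions.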
-
Publication No.: US20240231944A1
Publication Date: 2024-07-11
Application No.: US18351737
Filing Date: 2023-07-13
Inventors: Myeong Woo KIM, Yongdeok KIM, Changin CHOI, Seungwon LEE
CPC Classification: G06F9/5055, G06F16/1724
Abstract: A processor-implemented method with data loading includes: dividing a training dataset into a plurality of sub-sets based on the sizes of the data files in the training dataset; loading some data files in each sub-set into a plurality of processors; determining a packing combination of one or more data files loaded to processors in a same group among the plurality of processors, based on the ratio of the number of data files among the sub-sets and the batch size of distributed training; determining packed data files by packing the one or more data files according to the packing combination; and reallocating the packed data files to the processors in the same group.
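One plausible way to form a "packing combination" as described above is greedy first-fit-decreasing bin packing: combine small files into packed units of bounded total size before reallocating them. This is a sketch under that assumption; the patent does not specify the packing algorithm, and `pack_files` and `capacity` are hypothetical names.

```python
def pack_files(sizes: dict[str, int], capacity: int) -> list[list[str]]:
    """First-fit-decreasing packing: place each file (largest first)
    into the first packed unit it fits in, or open a new unit."""
    units = []  # each unit: [total_size, [file names]]
    for name in sorted(sizes, key=sizes.get, reverse=True):
        for unit in units:
            if unit[0] + sizes[name] <= capacity:
                unit[0] += sizes[name]
                unit[1].append(name)
                break
        else:
            units.append([sizes[name], [name]])
    return [names for _, names in units]
```

Packing before reallocation reduces the number of small transfers between processors in the same group.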
-
Publication No.: US20220114426A1
Publication Date: 2022-04-14
Application No.: US17191292
Filing Date: 2021-03-03
Inventors: Myeong Woo KIM, Hanwoong JUNG
Abstract: A neural network operation apparatus includes: a receiver configured to receive a first input feature map; a controller configured to control the operation states of multiplier-accumulators (MACs) included in a first MAC array; and a first operation engine comprising the first MAC array and configured to process the first input feature map based on the MACs whose operation states are controlled.
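A toy model of the controlled MAC array described above: each MAC multiplies an input-feature-map element by a weight and accumulates, and a per-MAC enable flag stands in for the controller setting operation states (e.g., gating off MACs whose contribution is not needed). The function name and the boolean-mask representation are assumptions, not the apparatus itself.

```python
def mac_array(ifmap: list[int], weights: list[int], enabled: list[bool]) -> int:
    """Multiply-accumulate over one row of the input feature map,
    skipping MACs the controller has disabled."""
    acc = 0
    for x, w, on in zip(ifmap, weights, enabled):
        if on:  # operation state controlled per MAC
            acc += x * w
    return acc
```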