METHOD AND APPARATUS WITH DATA LOADING

    Publication No.: US20230140239A1

    Publication date: 2023-05-04

    Application No.: US17868361

    Filing date: 2022-07-19

    IPC classification: G06F16/906 G06F40/20 G06K9/62

    Abstract: A processor-implemented method with data loading includes: dividing a training data set into a plurality of subsets based on sizes of a plurality of data files included in the training data set; loading, from each of the plurality of subsets, a portion of the data files in the subset to a plurality of processors, based on the proportion of the number of data files in that subset relative to the plurality of subsets and on a batch size of distributed training; and reallocating, based on sizes of the data files loaded to processors in a same group among the plurality of processors, the loaded data files to the processors in the same group.
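
The three steps named in this abstract (size-based subsets, proportional loading, in-group reallocation) can be sketched as follows. This is only an illustrative sketch, not the method claimed in the application: the function names, the fixed size boundaries, the rounding of per-subset shares, and the greedy largest-first balancing rule are all assumptions introduced here.

```python
from typing import List


def split_by_size(sizes: List[int], boundaries: List[int]) -> List[List[int]]:
    """Bucket file sizes; `boundaries` (ascending) are hypothetical cut points."""
    buckets: List[List[int]] = [[] for _ in range(len(boundaries) + 1)]
    for s in sizes:
        # Index of the bucket = number of boundaries the size exceeds.
        buckets[sum(s > b for b in boundaries)].append(s)
    return buckets


def sample_batch(buckets: List[List[int]], batch_size: int) -> List[int]:
    """Take from each bucket a share of the batch proportional to its file count.

    Assumption: shares are rounded, so the result may differ slightly from
    batch_size when the proportions do not divide evenly.
    """
    total = sum(len(b) for b in buckets)
    batch: List[int] = []
    for b in buckets:
        k = round(batch_size * len(b) / total)
        batch.extend(b[:k])
    return batch


def rebalance(files: List[int], num_procs: int) -> List[List[int]]:
    """Reassign loaded files within one group so total sizes are roughly even.

    Assumption: greedy largest-first assignment to the least-loaded processor.
    """
    loads = [0] * num_procs
    assignment: List[List[int]] = [[] for _ in range(num_procs)]
    for s in sorted(files, reverse=True):
        i = loads.index(min(loads))  # least-loaded processor so far
        assignment[i].append(s)
        loads[i] += s
    return assignment


if __name__ == "__main__":
    sizes = [10, 20, 30, 40, 100, 200, 300, 400]
    buckets = split_by_size(sizes, boundaries=[50])
    batch = sample_batch(buckets, batch_size=4)
    print(rebalance(batch, num_procs=2))
```

The greedy rule is a common load-balancing heuristic chosen here for brevity; the abstract only states that reallocation depends on the sizes of the loaded files.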

    METHOD AND APPARATUS WITH DATA LOADING
    Published invention application

    Publication No.: US20240231944A1

    Publication date: 2024-07-11

    Application No.: US18351737

    Filing date: 2023-07-13

    IPC classification: G06F9/50 G06F16/17

    CPC classification: G06F9/5055 G06F16/1724

    Abstract: A processor-implemented method with data loading includes: based on sizes of a plurality of data files in a training dataset, dividing the training dataset into a plurality of sub-sets; loading some data files in each sub-set into a plurality of processors; determining a packing combination of one or more data files loaded to processors in a same group among the plurality of processors, based on a ratio of a number of data files between the plurality of sub-sets and a batch size of distributed training; determining packed data files by packing the one or more data files according to the packing combination; and reallocating the packed data files to the processors in the same group.
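
The packing and reallocation steps in this abstract can be illustrated with a minimal sketch. It is not the application's method: the abstract derives the packing combination from the ratio of file counts between sub-sets and the batch size, whereas this sketch substitutes a plain first-fit-decreasing heuristic with a hypothetical `capacity` limit per pack, and hands out packs round-robin. All names here are assumptions.

```python
from typing import List


def pack_files(sizes: List[int], capacity: int) -> List[List[int]]:
    """Group file sizes into packs whose total does not exceed `capacity`.

    Assumption: first-fit decreasing; `capacity` is a hypothetical per-pack limit.
    A file larger than `capacity` ends up alone in its own pack.
    """
    packs: List[List[int]] = []
    for s in sorted(sizes, reverse=True):
        for p in packs:
            if sum(p) + s <= capacity:
                p.append(s)  # fits into an existing pack
                break
        else:
            packs.append([s])  # no pack had room; open a new one
    return packs


def assign_packs(packs: List[List[int]], num_procs: int) -> List[List[List[int]]]:
    """Distribute packed files round-robin to the processors of one group."""
    out: List[List[List[int]]] = [[] for _ in range(num_procs)]
    for i, p in enumerate(packs):
        out[i % num_procs].append(p)
    return out


if __name__ == "__main__":
    packs = pack_files([60, 50, 40, 30, 20], capacity=100)
    print(assign_packs(packs, num_procs=2))
```

Packing short files together with longer ones keeps the per-processor workload closer to uniform, which is the apparent motivation for reallocating packed files rather than individual ones.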