METHOD AND APPARATUS WITH NEURAL NETWORK INFERENCE OPTIMIZATION IMPLEMENTATION

    Publication (Announcement) No.: US20220083838A1

    Publication (Announcement) Date: 2022-03-17

    Application No.: US17244006

    Application Date: 2021-04-29

    Abstract: A method includes predicting, for sets of input data, an input data number of a subsequent interval of a first interval using an input data number of the first interval and an input data number of a previous interval of the first interval set in a neural network inference optimization, determining the predicted input data number to be a batch size of the subsequent interval, determining whether pipelining is to be performed in a target device based on a resource state of the target device, and applying, to the target device, an inference policy including the determined batch size and a result of the determining of whether the pipelining is to be performed.
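The flow described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the next interval's input count is predicted or how the resource state is evaluated, so everything here is an assumption made for illustration: the linear extrapolation in `predict_next_count`, the utilization threshold in `should_pipeline`, and all of the names (`InferencePolicy`, `build_policy`, `utilization`, `threshold`) are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class InferencePolicy:
    """Hypothetical container for the two decisions named in the abstract."""
    batch_size: int
    use_pipelining: bool


def predict_next_count(prev_count: int, curr_count: int) -> int:
    """Predict the input count of the subsequent interval.

    Assumption: simple linear extrapolation from the previous and current
    intervals; the abstract only says both counts are used.
    """
    predicted = curr_count + (curr_count - prev_count)
    # A batch must hold at least one input.
    return max(1, predicted)


def should_pipeline(utilization: float, threshold: float = 0.8) -> bool:
    """Decide on pipelining from the target device's resource state.

    Assumption: the resource state is a utilization fraction in [0, 1] and
    pipelining is enabled only while utilization is below the threshold.
    """
    return utilization < threshold


def build_policy(prev_count: int, curr_count: int, utilization: float) -> InferencePolicy:
    """Combine the predicted batch size and the pipelining decision."""
    return InferencePolicy(
        batch_size=predict_next_count(prev_count, curr_count),
        use_pipelining=should_pipeline(utilization),
    )


if __name__ == "__main__":
    # Input counts rose from 8 to 12, and the device is half utilized.
    print(build_policy(prev_count=8, curr_count=12, utilization=0.5))
```

In this sketch the predicted count is used directly as the batch size of the subsequent interval, mirroring the abstract. A real system would then apply the resulting policy to the target device, a step that is omitted here.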
