ULTRA PIPELINED ACCELERATOR FOR MACHINE LEARNING INFERENCE
Abstract:
A method of pipelining inference of a neural network is provided. The neural network includes an i-th layer (i being an integer greater than zero), an (i+1)-th layer, and an (i+2)-th layer. The method includes: processing a first set of i-th values of the i-th layer to generate (i+1)-th values for the (i+1)-th layer; determining that a quantity of the (i+1)-th values is sufficient for processing; and, in response to the determining, processing the (i+1)-th values to generate an output value for the (i+2)-th layer while concurrently processing a second set of i-th values of the i-th layer.
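The flow described in the abstract can be illustrated with a minimal software sketch. It is not the claimed hardware implementation. It assumes two hypothetical layer operations (scaling by 2 for the i-th layer, a windowed sum for the (i+1)-th layer) and a hypothetical `window` parameter that stands in for the "sufficient quantity" of (i+1)-th values. The names `stage_i`, `stage_i_plus_1`, and `run_pipeline` are illustrative only. Python threads are used here to show the overlap between stages, not to model accelerator timing.

```python
import queue
import threading


def stage_i(input_sets, out_q):
    """i-th layer: convert each set of i-th values into (i+1)-th values."""
    for values in input_sets:          # first set, then second set, ...
        for v in values:
            out_q.put(2 * v)           # hypothetical layer-i operation
    out_q.put(None)                    # sentinel: no more values will arrive


def stage_i_plus_1(in_q, window, results):
    """(i+1)-th layer: fire as soon as `window` values are buffered."""
    buf = []
    while True:
        v = in_q.get()
        if v is None:
            break                      # leftover values (< window) are dropped
        buf.append(v)
        if len(buf) >= window:         # the "sufficient quantity" determination
            # Hypothetical layer-(i+1) operation: sum over the window,
            # producing one output value for the (i+2)-th layer.
            results.append(sum(buf[:window]))
            buf = buf[window:]


def run_pipeline(input_sets, window):
    """Run both stages concurrently, connected by a FIFO queue."""
    q = queue.Queue()
    results = []
    producer = threading.Thread(target=stage_i, args=(input_sets, q))
    consumer = threading.Thread(target=stage_i_plus_1, args=(q, window, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results


if __name__ == "__main__":
    # Two sets of i-th values; the consumer can start on the first window
    # while the producer is still working through the second set.
    print(run_pipeline([[1, 2, 3], [4, 5, 6]], window=3))
```

In this sketch the consumer does not wait for the whole (i+1)-th layer to be computed. It begins as soon as the buffered count reaches `window`, which is the point of the pipelining described above. A real accelerator would presumably derive the threshold from the receptive field of the (i+1)-th layer, but that detail is an assumption and is not stated in the abstract.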