RUNTIME OPTIMIZATION OF COMPUTATIONS OF AN ARTIFICIAL NEURAL NETWORK COMPILED FOR EXECUTION ON A DEEP LEARNING ACCELERATOR
Abstract:
Systems, devices, and methods related to a Deep Learning Accelerator and memory are described. For example, an integrated circuit device may be configured with random access memory (RAM) and configured to execute instructions with matrix operands. A compiler is configured to generate instructions executable by the Deep Learning Accelerator from a description of a target artificial neural network. The instructions may call routines in a runtime library that has an embedded artificial neural network configured to predict optimized execution options available to implement the routines. The prediction is based at least in part on a pattern of data being processed in the target artificial neural network and/or a pattern of usages of the routines by the instructions.
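The following is a minimal Python sketch of the dispatch mechanism the abstract describes, under assumed names (RuntimeLibrary, EmbeddedPredictor, ExecutionOption) that do not appear in the patent: each runtime-library routine tracks its own usage, and an embedded predictor chooses among candidate implementations from the observed data pattern and usage pattern. It is an illustration of the idea, not an implementation from the disclosure.

```python
# Hypothetical sketch only; names and structure are assumptions, not taken from the patent.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ExecutionOption:
    """One candidate way to carry out a routine on the Deep Learning Accelerator."""
    name: str
    run: Callable[[object], object]


class EmbeddedPredictor:
    """Stand-in for the embedded artificial neural network that predicts
    the optimized execution option from observed patterns."""

    def predict(self, routine: str, data_pattern: Dict, usage_pattern: Dict) -> int:
        # A real predictor would run a small ANN over features derived from
        # the data pattern (e.g. tensor shapes, sparsity) and the usage
        # pattern (e.g. call frequency). This placeholder always picks option 0.
        return 0


class RuntimeLibrary:
    """Routines callable from compiler-generated accelerator instructions."""

    def __init__(self, predictor: EmbeddedPredictor):
        self.predictor = predictor
        self.options: Dict[str, List[ExecutionOption]] = {}
        self.usage: Dict[str, int] = {}

    def register(self, routine: str, options: List[ExecutionOption]) -> None:
        self.options[routine] = options

    def call(self, routine: str, operand, data_pattern: Dict):
        # Record how often each routine is invoked; this usage pattern is one
        # of the inputs the embedded predictor conditions on.
        self.usage[routine] = self.usage.get(routine, 0) + 1
        choice = self.predictor.predict(routine, data_pattern, dict(self.usage))
        return self.options[routine][choice].run(operand)


# Example: a matrix-multiply routine with two candidate implementations.
lib = RuntimeLibrary(EmbeddedPredictor())
lib.register("matmul", [
    ExecutionOption("dense_tiles", lambda x: f"dense({x})"),
    ExecutionOption("sparse_skip", lambda x: f"sparse({x})"),
])
print(lib.call("matmul", "A@B", {"sparsity": 0.1, "shape": (128, 128)}))
```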