POWER-EFFICIENT DEEP NEURAL NETWORK MODULE CONFIGURED FOR EXECUTING A LAYER DESCRIPTOR LIST

    Publication Number: US20180300614A1

    Publication Date: 2018-10-18

    Application Number: US15951106

    Filing Date: 2018-04-11

    Abstract: A deep neural network (DNN) processor is configured to execute descriptors in layer descriptor lists. The descriptors define instructions for performing a pass of a DNN by the DNN processor. Several types of descriptors can be utilized: memory-to-memory move (M2M) descriptors; operation descriptors; host communication descriptors; configuration descriptors; branch descriptors; and synchronization descriptors. A DMA engine uses M2M descriptors to perform multi-dimensional strided DMA operations. Operation descriptors define the type of operation to be performed by neurons in the DNN processor and the activation function to be used by the neurons. M2M descriptors are buffered separately from operation descriptors and can be executed as soon as possible, subject to explicitly set dependencies. As a result, latency can be reduced and, consequently, the neurons can complete their processing faster. The DNN module can then be powered down earlier than it otherwise would be, thereby saving power.
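    For illustration only, the sketch below models a layer descriptor list in which M2M descriptors and operation descriptors carry explicit dependencies and are dispatched as soon as those dependencies are satisfied. The class names, fields, and scheduling loop are illustrative assumptions, not the descriptor formats or execution logic defined in the application.

    ```python
    # Illustrative sketch of a layer descriptor list with dependency-gated
    # dispatch. All names and fields are assumptions for illustration.
    from dataclasses import dataclass, field
    from typing import List


    @dataclass
    class M2MDescriptor:
        """Hypothetical strided memory-to-memory (DMA) move descriptor."""
        src: str                                              # e.g. "DRAM"
        dst: str                                              # e.g. "SRAM"
        strides: List[int]                                    # multi-dimensional stride pattern
        depends_on: List[int] = field(default_factory=list)  # indices of prerequisite descriptors


    @dataclass
    class OperationDescriptor:
        """Hypothetical work item for the neurons: operation plus activation."""
        op: str                                               # e.g. "conv2d"
        activation: str                                       # e.g. "relu"
        depends_on: List[int] = field(default_factory=list)


    def execute_layer_list(descriptors):
        """Execute each descriptor once all of its dependencies have completed.
        M2M moves with no unmet dependencies run immediately, ahead of later
        operation descriptors. (No cycle detection; sketch only.)"""
        completed = set()
        while len(completed) < len(descriptors):
            for idx, desc in enumerate(descriptors):
                if idx in completed:
                    continue
                if all(dep in completed for dep in desc.depends_on):
                    print(f"descriptor {idx}: executing {type(desc).__name__}")
                    completed.add(idx)


    if __name__ == "__main__":
        layer_list = [
            M2MDescriptor(src="DRAM", dst="SRAM", strides=[224, 1]),
            OperationDescriptor(op="conv2d", activation="relu", depends_on=[0]),
            M2MDescriptor(src="SRAM", dst="DRAM", strides=[112, 1], depends_on=[1]),
        ]
        execute_layer_list(layer_list)
    ```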

    DYNAMIC SEQUENCING OF DATA PARTITIONS FOR OPTIMIZING MEMORY UTILIZATION AND PERFORMANCE OF NEURAL NETWORKS

    Publication Number: US20180300601A1

    Publication Date: 2018-10-18

    Application Number: US15719351

    Filing Date: 2017-09-28

    IPC Classification: G06N3/04 G06F3/06 G06N3/063

    Abstract: Optimized memory usage and management are crucial to the overall performance of a neural network (NN) or deep neural network (DNN) computing environment. Using various characteristics of the input data's dimensions, an apportionment sequence is calculated for the input data to be processed by the NN or DNN that optimizes the efficient use of the local and external memory components. The apportionment sequence can describe how to parcel the input data (and its associated processing parameters—e.g., processing weights) into one or more portions, as well as how such portions of input data (and their associated processing parameters) are passed between the local memory, external memory, and processing unit components of the NN or DNN. Additionally, the apportionment sequence can include instructions to store generated output data in the local and/or external memory components so as to optimize the efficient use of the local and/or external memory components.
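    As a rough illustration, the sketch below computes a simple apportionment sequence: the input is split into portions sized to fit a local-memory budget alongside the processing weights, and each step records where the portion is loaded from, processed, and stored. The function name, parameters, and sizing rule are assumptions for illustration, not the claimed sequencing method.

    ```python
    # Illustrative sketch of an apportionment sequence that parcels input data
    # into local-memory-sized portions. Names and sizing rule are assumptions.
    def apportionment_sequence(total_rows, row_bytes, weight_bytes, local_capacity):
        """Return one step per portion, describing the load/process/store moves
        between external memory, local memory, and the processing units."""
        usable = local_capacity - weight_bytes        # space left after the weights
        if usable <= 0:
            raise ValueError("weights alone exceed local memory capacity")
        rows_per_portion = max(1, usable // row_bytes)

        steps = []
        start = 0
        while start < total_rows:
            end = min(start + rows_per_portion, total_rows)
            steps.append({
                "load": ("external", "local", (start, end)),    # move input portion in
                "process": (start, end),                        # processing units consume it
                "store": ("local", "external", (start, end)),   # write output portion back out
            })
            start = end
        return steps


    if __name__ == "__main__":
        # Made-up sizes: 1 MiB local memory, 4 KiB per input row, 256 KiB of weights.
        for step in apportionment_sequence(total_rows=1000, row_bytes=4096,
                                           weight_bytes=256 * 1024,
                                           local_capacity=1024 * 1024):
            print(step)
    ```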