1.
Neural network processor for handling differing datatypes

    Publication number: US11880757B2

    Publication date: 2024-01-23

    Application number: US18095960

    Application date: 2023-01-11

    Applicant: Apple Inc.

    CPC classification number: G06N3/04 G06F7/50 G06F7/523 G06N3/08

    Abstract: Embodiments relate to a neural engine circuit that includes an input buffer circuit, a kernel extract circuit, and a multiply-accumulator (MAC) circuit. The MAC circuit receives input data from the input buffer circuit and a kernel coefficient from the kernel extract circuit. The MAC circuit contains several multiply-add (MAD) circuits and accumulators used to perform neural network operations on the received input data and kernel coefficients. The MAD circuits are configured to support both fixed-point precision (e.g., INT8) and floating-point precision (e.g., FP16) operands. In floating-point mode, each MAD circuit multiplies the integer bits of the input data and kernel coefficients and adds their exponent bits to determine a binary point for alignment. In fixed-point mode, the input data and kernel coefficients are multiplied directly. In both operation modes, the output data is stored in an accumulator and may be fed back as accumulated values for further multiply-add operations in subsequent processing cycles.
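The dual-mode MAD behavior described in the abstract can be sketched in software. The following is an illustrative model only, not the patented circuit: the function names and the `(mantissa, exponent)` pair encoding are assumptions made for clarity.

```python
def mad_fixed(acc: int, x: int, w: int) -> int:
    """Fixed-point mode (e.g., INT8): multiply and accumulate directly."""
    return acc + x * w

def mad_float(acc, x, w):
    """Floating-point mode sketch: each operand is a (mantissa, exponent)
    pair. Multiply the integer (mantissa) bits, add the exponent bits,
    then shift so the product and accumulator share a binary point."""
    xm, xe = x
    wm, we = w
    pm, pe = xm * wm, xe + we      # multiply mantissas, add exponents
    am, ae = acc
    # Align both terms to the smaller exponent before adding.
    if pe > ae:
        pm <<= (pe - ae)
        pe = ae
    elif ae > pe:
        am <<= (ae - pe)
        ae = pe
    return (am + pm, ae)
```

For example, `mad_float((0, 0), (3, 1), (2, 2))` models (3 × 2¹) × (2 × 2²) = 48, returned as `(48, 0)`.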

2.
CACHE PREFETCH FOR NEURAL PROCESSOR CIRCUIT
Invention Publication

    Publication number: US20230289291A1

    Publication date: 2023-09-14

    Application number: US17691609

    Application date: 2022-03-10

    Applicant: Apple Inc.

    CPC classification number: G06F12/0862 G06F2212/602

    Abstract: A neural processor may include a system memory access circuit coupled to a system memory. The system memory access circuit is configured to fetch, from the system memory, first input data of a first task associated with a neural network. The neural processor may also include neural engines coupled to the system memory access circuit. The neural engines are configured to perform convolution operations on the first input data in a first set of operating cycles. The neural processor may further include a cache access circuit coupled to a cache. The cache access circuit is configured to instruct the cache to prefetch from the system memory, during the first set of operating cycles corresponding to the first task, second input data of a second task of the neural network. The second task is scheduled for processing in a second set of operating cycles after the first set of operating cycles.
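The scheduling pattern in this abstract, fetching the next task's input while the current task's convolutions run, can be sketched as a software pipeline. This is a minimal illustrative model, not the patented circuit; `run_tasks`, `fetch`, and `compute` are hypothetical names standing in for the cache access circuit and neural engines.

```python
from concurrent.futures import ThreadPoolExecutor

def run_tasks(tasks, fetch, compute):
    """Process tasks in order. While task i's compute runs, issue the
    prefetch for task i+1's input data, so its fetch latency overlaps
    with task i's operating cycles."""
    if not tasks:
        return []
    results = []
    with ThreadPoolExecutor(max_workers=1) as prefetcher:
        future = prefetcher.submit(fetch, tasks[0])  # initial demand fetch
        for i in range(len(tasks)):
            data = future.result()                   # wait for this task's input
            if i + 1 < len(tasks):
                # Prefetch the next task's input during this task's cycles.
                future = prefetcher.submit(fetch, tasks[i + 1])
            results.append(compute(data))
    return results
```

In hardware the prefetch fills a cache from system memory; here a background thread plays that role, so each `fetch` after the first is hidden behind a `compute`.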
