-
Publication number: US11880757B2
Publication date: 2024-01-23
Application number: US18095960
Filing date: 2023-01-11
Applicant: Apple Inc.
Inventor: Christopher L Mills
Abstract: Embodiments relate to a neural engine circuit that includes an input buffer circuit, a kernel extract circuit, and a multiply-accumulator (MAC) circuit. The MAC circuit receives input data from the input buffer circuit and a kernel coefficient from the kernel extract circuit. The MAC circuit contains several multiply-add (MAD) circuits and accumulators used to perform neural networking operations on the received input data and kernel coefficients. MAD circuits are configured to support fixed-point precision (e.g., INT8) and floating-point precision (FP16) of operands. In floating-point mode, each MAD circuit multiplies the integer bits of input data and kernel coefficients and adds their exponent bits to determine a binary point for alignment. In fixed-point mode, input data and kernel coefficients are multiplied. In both operation modes, the output data is stored in an accumulator, and may be sent back as accumulated values for further multiply-add operations in subsequent processing cycles.
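The two operating modes described in the abstract can be illustrated with a small software model. This is a minimal sketch, not the patented circuit: the function names (`fp16_fields`, `mad_fixed`, `mad_float`), the accumulator layout, and the constant `ACC_LSB_EXP` are assumptions made for illustration, and infinities and NaNs are not handled. In floating-point mode the sketch multiplies the integer significands of the two FP16 operands and adds their exponents to find where the product's binary point sits relative to a fixed-point accumulator. In fixed-point mode it simply multiplies and accumulates.

```python
import struct

# Assumed accumulator layout: one accumulator unit equals 2**ACC_LSB_EXP.
# -48 is the smallest exponent sum two FP16 operands can produce (-24 + -24).
ACC_LSB_EXP = -48

def fp16_fields(x):
    """Round x to FP16 and return (sign, significand, exponent) such that
    value == sign * significand * 2**exponent. Inf/NaN are not handled."""
    bits = struct.unpack("<H", struct.pack("<e", x))[0]
    sign = -1 if bits >> 15 else 1
    exp_field = (bits >> 10) & 0x1F
    frac = bits & 0x3FF
    if exp_field == 0:                 # zero or subnormal: no implicit leading 1
        return sign, frac, -24
    return sign, frac | 0x400, exp_field - 25

def mad_fixed(acc, x, k):
    """Fixed-point mode (e.g., INT8 operands): multiply, then accumulate."""
    return acc + x * k

def mad_float(acc, x, k):
    """Floating-point mode: multiply integer significands, add exponents,
    and shift the product so it aligns with the accumulator's binary point."""
    sx, mx, ex = fp16_fields(x)
    sk, mk, ek = fp16_fields(k)
    product = sx * sk * mx * mk        # integer multiply of significands
    shift = (ex + ek) - ACC_LSB_EXP    # exponent sum fixes the alignment
    return acc + (product << shift)

def acc_to_float(acc):
    """Convert the assumed fixed-point accumulator back to a Python float."""
    return acc * 2.0 ** ACC_LSB_EXP

# Accumulated values are fed back in for the next multiply-add.
acc = 0
for x, k in [(1.5, 2.0), (0.25, -4.0)]:
    acc = mad_float(acc, x, k)
```

Python integers are unbounded, so this model never overflows or rounds; a hardware accumulator would have a finite width that the abstract does not specify.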
-
Publication number: US20230289291A1
Publication date: 2023-09-14
Application number: US17691609
Filing date: 2022-03-10
Applicant: Apple Inc.
Inventor: Seungjin Lee, Jaewon Shin, Christopher L Mills
IPC: G06F12/0862
CPC classification number: G06F12/0862, G06F2212/602
Abstract: A neural processor may include a system memory access circuit coupled to a system memory. The system memory access circuit is configured to fetch, from the system memory, first input data of a first task associated with a neural network. The neural processor may also include neural engines coupled to the system memory access circuit. The neural engines are configured to perform convolution operations on the first input data in a first set of operating cycles. The neural processor may further include a cache access circuit coupled to a cache. The cache access circuit is configured to instruct the cache to prefetch from the system memory, during the first set of operating cycles corresponding to the first task, second input data of a second task of the neural network. The second task is scheduled for processing in a second set of operating cycles after the first set of operating cycles.
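The overlap described in the abstract, where the next task's input is prefetched while the current task is being computed, can be sketched as a simple schedule. This is an illustrative model only: the function `plan_prefetch` and the task names are hypothetical, and the sketch assumes that a task whose input was prefetched later reads that input from the cache, which the abstract implies but does not state.

```python
def plan_prefetch(tasks):
    """For each task, in order, report where its input is read from and which
    task's input the cache is told to prefetch during its operating cycles.

    Returns a list of (task, input_source, prefetched_next_task) tuples.
    """
    cached = set()   # tasks whose input has already been prefetched
    plan = []
    for i, task in enumerate(tasks):
        # The first task has nothing prefetched, so it fetches from system memory.
        source = "cache" if task in cached else "system memory"
        # While this task's convolutions run, prefetch the following task's input.
        next_task = tasks[i + 1] if i + 1 < len(tasks) else None
        if next_task is not None:
            cached.add(next_task)
        plan.append((task, source, next_task))
    return plan

for step in plan_prefetch(["task1", "task2", "task3"]):
    print(step)
```

The point of the arrangement is that the memory latency for the second task is hidden behind the first task's compute cycles, rather than being paid when the second task starts.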
-