Compute accelerator with 3D data flows

    Publication No.: US11341086B2

    Publication Date: 2022-05-24

    Application No.: US17093227

    Filing Date: 2020-11-09

    Applicant: Rambus Inc.

    Abstract: Processing elements are arranged in a three-dimensional array. Each of the processing elements includes or is coupled to a dedicated memory. The processing elements of the array are intercoupled to their nearest neighbor processing elements. A processing element on a first die may be intercoupled to a first processing element on a second die that is located directly above the processing element, a second processing element on a third die that is located directly below the processing element, and the four adjacent processing elements on the first die. This intercoupling allows data to flow from processing element to processing element in the three directions. These data flows are reconfigurable so that they may be optimized for the task. The data flows of the array may be configured into one or more loops that periodically recycle data in order to accomplish different parts of a calculation.
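The nearest-neighbor intercoupling and the looped data flows described in the abstract can be pictured with a small software model. The following is only a sketch under stated assumptions: the `(die, row, col)` coordinate scheme and the names `nearest_neighbors` and `step_loop` are hypothetical and do not come from the patent, and edge elements are assumed simply to have fewer neighbors.

```python
from typing import Dict, List, Tuple

# Hypothetical addressing scheme: (die, row, col). Not taken from the patent.
Coord = Tuple[int, int, int]


def nearest_neighbors(pe: Coord, dies: int, rows: int, cols: int) -> List[Coord]:
    """Return the in-bounds neighbors of a processing element:
    one directly above, one directly below, and up to four on the same die."""
    d, r, c = pe
    candidates = [
        (d + 1, r, c),  # element on the die directly above
        (d - 1, r, c),  # element on the die directly below
        (d, r - 1, c),  # four adjacent elements on the same die
        (d, r + 1, c),
        (d, r, c - 1),
        (d, r, c + 1),
    ]
    return [
        (z, y, x)
        for z, y, x in candidates
        if 0 <= z < dies and 0 <= y < rows and 0 <= x < cols
    ]


def step_loop(values: Dict[Coord, int], loop: List[Coord]) -> Dict[Coord, int]:
    """Advance data one hop along a closed loop of processing elements.
    The value held at loop[i] moves to loop[i + 1], wrapping at the end."""
    moved = {loop[(i + 1) % len(loop)]: values[loop[i]] for i in range(len(loop))}
    return {**values, **moved}


if __name__ == "__main__":
    # An interior element of a 3x3x3 array has all six neighbors.
    print(nearest_neighbors((1, 1, 1), dies=3, rows=3, cols=3))

    # A four-element loop spanning two dies; each hop is a nearest-neighbor link.
    loop = [(0, 0, 0), (1, 0, 0), (1, 0, 1), (0, 0, 1)]
    data = {pe: i for i, pe in enumerate(loop)}
    for _ in range(len(loop)):
        data = step_loop(data, loop)
    print(data)  # after one full circuit the data is back where it started
```

Reconfiguring a data flow in this model amounts to choosing a different `loop` list, provided each consecutive pair of coordinates is a nearest-neighbor link.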

    SYSTEMS AND METHODS FOR ACCELERATED NEURAL-NETWORK CONVOLUTION AND TRAINING

    Publication No.: US20220335283A1

    Publication Date: 2022-10-20

    Application No.: US17845769

    Filing Date: 2022-06-21

    Applicant: Rambus Inc.

    Abstract: An application-specific integrated circuit for an artificial neural network is integrated with a high-bandwidth memory. The neural network includes a systolic array of interconnected processing elements, including upstream processing elements and downstream processing elements. Each processing element includes input/output port pairs for concurrent forward and back propagation. The processing elements can be used for convolution, in which case the input/output port pairs can support the fast and efficient scanning of kernels relative to activations.
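The idea of scanning a kernel relative to activations can be illustrated with a one-dimensional example. This is a minimal arithmetic sketch only, assuming a 1D "valid" convolution (computed as cross-correlation, as is common in neural networks) in which each hypothetical processing element holds one kernel weight. The function name `scan_kernel` is an assumption, and the sketch does not model the port pairs, the concurrency, or the back-propagation path mentioned in the abstract.

```python
from typing import List, Sequence


def scan_kernel(activations: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide the kernel across the activations, one position at a time.

    Each kernel weight stands in for one hypothetical processing element;
    the partial sum is passed from element to element and emitted once the
    last element has contributed.
    """
    n_out = len(activations) - len(kernel) + 1
    outputs: List[float] = []
    for offset in range(n_out):
        partial = 0.0
        for pe_index, weight in enumerate(kernel):
            # Element pe_index multiplies its weight by the activation
            # currently aligned with it and adds to the running sum.
            partial += weight * activations[offset + pe_index]
        outputs.append(partial)
    return outputs


if __name__ == "__main__":
    print(scan_kernel([1, 2, 3, 4], [1, 0, -1]))
```

In hardware the inner loop would be spread across elements operating in parallel; here it is serialized to keep the arithmetic easy to follow.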
