SELF-ADAPTABLE ACCELERATORS HAVING ALTERNATING PRODUCTION/OPTIMIZING MODES

    Publication No.: US20240362031A1

    Publication Date: 2024-10-31

    Application No.: US18308275

    Application Date: 2023-04-27

    CPC classification number: G06F9/44505 G06F11/3495

    Abstract: Systems and methods are provided for an accelerator system that includes a baseline (production) accelerator, an optimizing accelerator, and control hardware, together with an operation that alternately switches the two accelerators between production and optimizing modes. With two such accelerators, at any given point in time one accelerator adapts while the other processes data. Once the adapting accelerator starts doing a better job (e.g., has adapted to data drift), the accelerators swap modes, and the trainable accelerator becomes the "optimized" production one. The accelerators repeat this non-stop, thus maintaining redundancy, providing the expected quality of service (QoS), and adapting to data/concept drift.
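
    Illustrative sketch (not part of the patent text): the control logic described above can be pictured as a small scheduling loop. In this minimal Python sketch, the Accelerator class, its process/adapt methods, and the scalar quality score are assumed names standing in for the patent's accelerators and for whatever drift-detection signal its control hardware would use.

        # Hypothetical sketch of the alternating production/optimizing scheme.
        # All names here are illustrative assumptions, not patent terminology.

        class Accelerator:
            """Stand-in for one hardware accelerator instance."""

            def __init__(self, name: str):
                self.name = name
                self.quality = 0.0  # stand-in for a rolling quality metric

            def process(self, batch):
                """Production mode: serve inference traffic (placeholder)."""
                return list(batch)

            def adapt(self, batch):
                """Optimizing mode: tune against recent data (placeholder)."""
                self.quality += 0.01  # pretend adaptation improves quality

        def control_loop(a, b, stream):
            """One accelerator serves while the other adapts; they swap roles
            once the adapting one outperforms the serving one, non-stop."""
            production, optimizing = a, b
            for batch in stream:
                results = production.process(batch)  # QoS is maintained here
                optimizing.adapt(batch)              # tracks data/concept drift
                if optimizing.quality > production.quality:
                    # Swap modes: the adapted accelerator goes into production.
                    production, optimizing = optimizing, production
                yield results

    Because one accelerator is always serving while the other adapts, each swap preserves redundancy and the expected QoS while the pair continuously tracks drift.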

    MULTI-DIE DOT-PRODUCT ENGINE TO PROVISION LARGE SCALE MACHINE LEARNING INFERENCE APPLICATIONS

    Publication No.: US20240211212A1

    Publication Date: 2024-06-27

    Application No.: US18601259

    Application Date: 2024-03-11

    CPC classification number: G06F7/5443 G06F9/3867 G06F9/522 G06F40/20 G06N3/063

    Abstract: Systems and methods are provided for a multi-die dot-product engine (DPE) to provision large-scale machine learning inference applications. The multi-die DPE leverages a multi-chip architecture: a multi-chip interface includes a plurality of DPE chips, each of which performs inference computations for deep learning operations. A hardware interface between the memory of a host computer and the plurality of DPE chips communicatively connects the chips to the host memory during an inference operation, so that the deep learning operations are spanned across the plurality of DPE chips. This multi-die architecture allows multiple silicon devices to be used for inference, thereby enabling power-efficient inference for large-scale machine learning applications and complex deep neural networks. The multi-die DPE can be used to build a multi-device DNN inference system that performs specific applications, such as object recognition, with high accuracy.
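
    Illustrative sketch (not part of the patent text): spanning a dot-product across dies can be pictured as partitioning a layer's weight matrix over chips and gathering partial results through the host interface. In this minimal Python/NumPy sketch, the DPEDie class, the row-wise partitioning, and all function names are assumptions made for illustration; NumPy matrix multiplies stand in for the per-chip dot-product hardware.

        # Hypothetical sketch of spanning one matrix-vector (dot-product)
        # operation across multiple DPE dies. The names and the row-wise
        # partitioning scheme are illustrative assumptions, not the patent's.

        import numpy as np

        class DPEDie:
            """One dot-product-engine chip holding a slice of the weights."""

            def __init__(self, weight_slice: np.ndarray):
                self.weights = weight_slice

            def dot(self, activations: np.ndarray) -> np.ndarray:
                # On real hardware this would run on the die; here it's NumPy.
                return self.weights @ activations

        def multi_die_inference(weights, activations, num_dies):
            """Split the weights row-wise across dies, compute each die's
            partial dot product, and concatenate the results on the host."""
            slices = np.array_split(weights, num_dies, axis=0)
            dies = [DPEDie(s) for s in slices]
            partials = [die.dot(activations) for die in dies]  # parallelizable
            return np.concatenate(partials)

        # Example: a 1024x512 layer spanned across 4 dies.
        W = np.random.randn(1024, 512).astype(np.float32)
        x = np.random.randn(512).astype(np.float32)
        assert np.allclose(multi_die_inference(W, x, 4), W @ x, atol=1e-3)

    A row-wise split keeps each die's partial output independent, so the host only concatenates results; a column-wise split would instead require summing partial vectors across dies.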
