EFFICIENT SCALING OF PARTITIONED NEURAL NETWORK INFERENCE

    Publication (Announcement) No.: US20250094823A1

    Publication (Announcement) Date: 2025-03-20

    Application No.: US18368801

    Filing Date: 2023-09-15

    Abstract: In one implementation, a controller determines performance of a partitioned neural network. The controller identifies, based on the performance, a particular partition of the partitioned neural network as a bottleneck. The controller configures a first device to execute a replica of the particular partition. The controller configures a multiplexer that provides an output of the particular partition or the replica of the particular partition as input to a downstream partition of the partitioned neural network.
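
The control flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `Partition`, `Multiplexer`, `find_bottleneck`, and `scale_out` are hypothetical, per-partition latency is assumed as the performance measure, and round-robin selection between the original and the replica is an assumed multiplexing policy (the abstract only states that one output or the other is provided downstream).

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Callable, Iterator, List, Tuple


@dataclass
class Partition:
    """One stage of a partitioned network (hypothetical model)."""
    name: str
    fn: Callable[[float], float]   # stand-in for the partition's forward pass
    device: str                    # device currently executing the partition
    latency_ms: float              # assumed performance metric


@dataclass
class Multiplexer:
    """Provides the output of the original or the replica to the downstream stage."""
    sources: List[Partition]
    _next: Iterator[Partition] = field(init=False, repr=False)

    def __post_init__(self) -> None:
        # Assumed policy: alternate between the sources in round-robin order.
        self._next = cycle(self.sources)

    def run(self, x: float) -> Tuple[str, float]:
        source = next(self._next)
        return source.name, source.fn(x)


def find_bottleneck(partitions: List[Partition]) -> Partition:
    """Identify the bottleneck as the partition with the highest latency."""
    return max(partitions, key=lambda p: p.latency_ms)


def scale_out(partitions: List[Partition], spare_device: str) -> Tuple[Partition, Multiplexer]:
    """Replicate the bottleneck on a spare device and multiplex both outputs."""
    bottleneck = find_bottleneck(partitions)
    replica = Partition(
        name=bottleneck.name + "-replica",
        fn=bottleneck.fn,
        device=spare_device,
        latency_ms=bottleneck.latency_ms,
    )
    return bottleneck, Multiplexer([bottleneck, replica])


# Usage: a three-stage pipeline in which the middle stage is the slowest.
pipeline = [
    Partition("p0", lambda x: x + 1, "dev0", 4.0),
    Partition("p1", lambda x: x * 2, "dev1", 11.0),
    Partition("p2", lambda x: x - 3, "dev2", 5.0),
]
bottleneck, mux = scale_out(pipeline, spare_device="dev3")
first = mux.run(5)    # served by the original partition
second = mux.run(5)   # served by the replica
```

In this sketch the downstream partition (`p2`) would consume the value returned by `mux.run`, so it does not need to know which copy of the bottleneck produced it.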
