HARDWARE ACCELERATOR OPTIMIZED GROUP CONVOLUTION BASED NEURAL NETWORK MODELS

    Publication No.: US20240386260A1

    Publication date: 2024-11-21

    Application No.: US18693724

    Filing date: 2021-10-08

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer-readable media, are described for processing an input image using an integrated circuit that implements a convolutional neural network with a group convolution layer. The processing includes determining a mapping of partitions along a channel dimension of an input feature map to multiply-accumulate cells (MACs) in a computational unit of the circuit and applying a group convolution to the input feature map. Applying the group convolution includes, for each partition: providing weights for the group convolution layer to a subset of MACs based on the mapping; providing, via an input bus of the circuit, an input of the feature map to each MAC in the subset; and computing, at each MAC in the subset, a product using the input and a weight for the group convolution layer. An output feature map is generated for the group convolution layer based on an accumulation of products.
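
The per-partition flow in this abstract can be illustrated with a small software sketch. It is not the patented circuit: it assumes a 1x1 kernel, a single spatial position, and a hypothetical mapping in which each output channel of a group is owned by one MAC, so the MAC subset for a partition is simply a contiguous range of MAC ids. The names `partition_channels` and `group_conv_1x1` are illustrative only.

```python
def partition_channels(num_channels, num_groups):
    """Split channel indices into contiguous, equally sized partitions."""
    if num_channels % num_groups != 0:
        raise ValueError("channels must divide evenly into groups")
    size = num_channels // num_groups
    return [list(range(g * size, (g + 1) * size)) for g in range(num_groups)]


def group_conv_1x1(pixel, weights):
    """Apply a 1x1 group convolution at one spatial position.

    pixel: input channel values at that position.
    weights[g][o][i]: weight for group g, output channel o within the
    group, input channel i within the group.
    Returns (outputs, mac_subsets), where mac_subsets[g] lists the
    hypothetical MAC ids assigned to partition g.
    """
    partitions = partition_channels(len(pixel), len(weights))
    outputs = []
    mac_subsets = []
    next_mac = 0
    for channels, group_weights in zip(partitions, weights):
        # Assumed mapping: one MAC per output channel of this group.
        subset = list(range(next_mac, next_mac + len(group_weights)))
        next_mac += len(group_weights)
        mac_subsets.append(subset)
        for w_row in group_weights:
            acc = 0
            # Each MAC multiplies inputs from its own partition only,
            # then accumulates the products.
            for c, w in zip(channels, w_row):
                acc += w * pixel[c]
            outputs.append(acc)
    return outputs, mac_subsets
```

Because each group reads only its own channel partition, the weights for one group never need to reach the MACs assigned to another group, which is the property the mapping step relies on.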

    NEURAL NETWORK ARCHITECTURE FOR IMPLEMENTING GROUP CONVOLUTIONS

    Publication No.: US20250124700A1

    Publication date: 2025-04-17

    Application No.: US18694626

    Filing date: 2021-10-08

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer-readable media, are described for processing an input image using a convolutional neural network (CNN). The CNN includes a sequence of layer blocks. Each of a first subset of the layer blocks in the sequence is configured to perform operations that include: i) receiving an input feature map for the layer block, ii) generating an expanded feature map from the input feature map using a group convolution, and iii) generating a reduced feature map from the expanded feature map. The input feature map is an h × w feature map with c1 channels. The expanded feature map is an h × w feature map with c2 channels, whereas the reduced feature map is an h × w feature map with c1 channels, where c2 is greater than c1. An output feature map is generated for the layer block from the reduced feature map.
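
The shape bookkeeping for one such layer block can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the kernel size or the number of groups, so `k` and `groups` are hypothetical parameters, and the function names are not taken from the patent.

```python
def group_conv_params(c_in, c_out, groups, k=1):
    """Weight count of a k x k group convolution (bias ignored)."""
    if c_in % groups != 0 or c_out % groups != 0:
        raise ValueError("channel counts must be divisible by groups")
    # Each group connects c_in/groups inputs to c_out/groups outputs.
    return groups * (c_in // groups) * (c_out // groups) * k * k


def layer_block_shapes(h, w, c1, c2, groups, k=1):
    """Feature-map shapes through one expand-then-reduce layer block."""
    if c2 <= c1:
        raise ValueError("the expanded map must have more channels (c2 > c1)")
    return {
        "input": (h, w, c1),
        "expanded": (h, w, c2),   # produced by the group convolution
        "reduced": (h, w, c1),    # back to the input channel count
        "expand_params": group_conv_params(c1, c2, groups, k),
    }
```

For example, `layer_block_shapes(8, 8, 16, 64, groups=4)` reports 256 expansion weights, a quarter of the 1024 that an ungrouped 1x1 convolution with the same channel counts would need.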

    NEURAL ARCHITECTURE AND HARDWARE ACCELERATOR SEARCH

    Publication No.: US20240005129A1

    Publication date: 2024-01-04

    Application No.: US18029849

    Filing date: 2021-10-01

    Applicant: Google LLC

    CPC classification number: G06N3/045 G06N3/092 G06N3/0464 G06N3/044 G06N3/063

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for jointly determining neural network architectures and hardware accelerator architectures. In one aspect, a method includes: generating, using a controller policy, a batch of one or more output sequences, each output sequence in the batch defining a respective architecture of a child neural network and a respective architecture of a hardware accelerator; for each output sequence in the batch: training a respective instance of the child neural network having the architecture defined by the output sequence; evaluating a network performance of the trained instance of the child neural network; and evaluating an accelerator performance of a respective instance of the hardware accelerator having the architecture defined by the output sequence to determine an accelerator performance metric for the instance of the hardware accelerator; and using the network performance metrics and the accelerator performance metrics to adjust the controller policy.
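
The loop described in this abstract (sample a batch, score each candidate on both metrics, adjust the controller) can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the search space, the toy `network_metric` and `accelerator_metric` functions that stand in for real training and accelerator evaluation, the summed reward, and the simple baseline-subtracted preference update. The abstract does not specify any of these.

```python
import random

# Hypothetical search space; the abstract does not list the real decisions.
SEARCH_SPACE = {
    "num_layers": [2, 4, 8],   # child neural network decision
    "pe_array": [16, 32, 64],  # hardware accelerator decision
}


def sample_sequence(policy, rng):
    """Draw one output sequence covering both architectures."""
    return {
        name: rng.choices(choices, weights=policy[name])[0]
        for name, choices in SEARCH_SPACE.items()
    }


def network_metric(seq):
    """Toy stand-in for training a child network and measuring quality."""
    return seq["num_layers"] / 8.0


def accelerator_metric(seq):
    """Toy stand-in for an accelerator performance evaluation."""
    return min(1.0, seq["pe_array"] / (8.0 * seq["num_layers"]))


def search(steps=200, batch_size=8, lr=0.5, seed=0):
    rng = random.Random(seed)
    # The controller policy is one preference weight per choice.
    policy = {name: [1.0] * len(c) for name, c in SEARCH_SPACE.items()}
    for _ in range(steps):
        batch = [sample_sequence(policy, rng) for _ in range(batch_size)]
        rewards = [network_metric(s) + accelerator_metric(s) for s in batch]
        baseline = sum(rewards) / len(rewards)
        for seq, reward in zip(batch, rewards):
            for name, choice in seq.items():
                i = SEARCH_SPACE[name].index(choice)
                # Raise preferences for above-average samples, lower the rest;
                # the floor keeps every choice sampleable.
                policy[name][i] = max(0.01, policy[name][i] + lr * (reward - baseline))
    return policy
```

Calling `search()` returns the final preference weights. Because both decisions are sampled in the same sequence and scored by a shared reward, the network and accelerator choices are adjusted jointly rather than one after the other.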
