NEURAL NETWORK LAYER OPTIMIZATION
    Invention publication

    Publication number: US20240062059A1

    Publication date: 2024-02-22

    Application number: US18191700

    Filing date: 2023-03-28

    CPC classification number: G06N3/08

    Abstract: Various examples disclosed herein relate to neural network quantization techniques, and more particularly, to selecting inference precisions for the layers of the neural network. In an example embodiment, a method is provided herein that includes determining an accuracy improvement of a layer of a neural network implemented using a first bit precision relative to using a second bit precision and determining a latency degradation of the layer of the neural network implemented using the first bit precision relative to using the second bit precision. The method further includes selecting, based on the accuracy improvement and the latency degradation, the first bit precision or the second bit precision for use in implementing the layer of the neural network.
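
As an illustration of the per-layer selection described in the abstract, the following Python sketch chooses between two bit precisions from a layer's measured accuracy improvement and latency degradation. The abstract does not state how the two quantities are combined, so the ratio-and-threshold rule here is an assumption made for illustration only. The names `LayerProfile`, `select_precision`, and `min_gain_per_ms`, the 16-bit and 8-bit defaults, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LayerProfile:
    """Hypothetical per-layer measurements (not taken from the patent)."""
    name: str
    accuracy_gain: float    # accuracy improvement of the first precision over the second
    latency_cost_ms: float  # latency degradation of the first precision over the second


def select_precision(profile, first_bits=16, second_bits=8, min_gain_per_ms=0.5):
    """Pick a bit precision for one layer.

    Assumed rule: keep the first (higher) precision only when the accuracy
    gained per millisecond of added latency reaches min_gain_per_ms.
    """
    if profile.accuracy_gain <= 0:
        # No accuracy benefit, so the second precision is sufficient.
        return second_bits
    if profile.latency_cost_ms <= 0:
        # Accuracy benefit at no latency cost, so take the first precision.
        return first_bits
    ratio = profile.accuracy_gain / profile.latency_cost_ms
    return first_bits if ratio >= min_gain_per_ms else second_bits


# Illustrative, made-up measurements for two layers.
layers = [
    LayerProfile("conv1", accuracy_gain=0.8, latency_cost_ms=0.4),
    LayerProfile("fc", accuracy_gain=0.05, latency_cost_ms=1.0),
]
plan = {p.name: select_precision(p) for p in layers}
print(plan)
```

Under this assumed rule, a layer whose accuracy gain is large relative to its added latency keeps the first precision, while a layer with little to gain falls back to the second. A different combining function, such as a weighted difference, would fit the abstract equally well.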
