APPARATUS, METHOD, DEVICE AND MEDIUM FOR LABEL-BALANCED CALIBRATION IN POST-TRAINING QUANTIZATION OF DNN

    公开(公告)号:US20250077861A1

    公开(公告)日:2025-03-06

    申请号:US18574995

    申请日:2021-11-03

    Abstract: The disclosure provides an apparatus, method, device and medium for label-balanced calibration in post-training quantization of DNNs. An apparatus includes interface circuitry configured to receive a training dataset and processor circuitry coupled to the interface circuitry. The processor circuitry is configured to generate a small ground truth dataset by selecting images with a ground truth number of 1 from the training dataset; generate a calibration dataset randomly from the training dataset; if any image in the calibration dataset has the ground truth number of 1, remove the image from the small ground truth dataset; generate a label balanced calibration dataset by replacing an image with a ground truth number greater than a preset threshold in the calibration dataset with a replacing image selected randomly from the small ground truth dataset; and perform calibration using the label balanced calibration dataset in post-training quantization. Other embodiments are disclosed and claimed.

    METHOD AND APPARATUS FOR OPTIMIZING INFERENCE OF DEEP NEURAL NETWORKS

    公开(公告)号:US20240289612A1

    公开(公告)日:2024-08-29

    申请号:US18571150

    申请日:2021-10-26

    CPC classification number: G06N3/08 G06N5/04

    Abstract: The application provides a hardware-aware cost model for optimizing inference of a deep neural network (DNN) comprising: a computation cost estimator configured to compute estimated computation cost based on input tensor, weight tensor and output tensor from the DNN; and a memory/cache cost estimator configured to perform memory/cache cost estimation strategy based on hardware specifications, wherein the hardware-aware cost model is used to perform performance simulation on target hardware to provide dynamic quantization knobs to quantization as required for converting a conventional precision inference model to an optimized inference model based on the result of the performance simulation.

Patent Agency Ranking