COMPRESSING IMAGE-TO-IMAGE MODELS WITH AVERAGE SMOOTHING

    Publication Number: US20220292724A1

    Publication Date: 2022-09-15

    Application Number: US17191970

    Application Date: 2021-03-04

    Abstract: Systems and methods for compressing image-to-image models. Generative Adversarial Networks (GANs) have achieved success in generating high-fidelity images. An image compression system and method adds a novel variant of class-dependent parameters (CLADE), referred to as CLADE-Avg, which recovers image quality without introducing extra computational cost. An extra layer of average smoothing is performed between the parameter and normalization layers. Compared to CLADE, this system and method smooths abrupt boundaries and introduces more possible values for the scaling and shift. In addition, the kernel size for the average smoothing can be selected as a hyperparameter, such as a 3×3 kernel. The method introduces only additions, not extra multiplications, and thus adds little computational overhead, as the division can be absorbed into the parameters after training.
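A minimal sketch of the CLADE-Avg idea described in the abstract, assuming a per-class scale/shift lookup followed by a box-filter smoothing of the parameter maps (the function name, shapes, and instance-style normalization are illustrative assumptions, not the patent's exact formulation):

```python
import numpy as np

def clade_avg_modulation(features, seg_map, gamma, beta, k=3):
    """Hypothetical CLADE-Avg sketch: look up class-dependent scale/shift
    per pixel, average-smooth the parameter maps with a k x k box filter,
    then modulate the normalized features."""
    h, w = seg_map.shape
    gamma_map = gamma[seg_map]  # (H, W) per-pixel scale from class ids
    beta_map = beta[seg_map]    # (H, W) per-pixel shift from class ids

    # k x k average smoothing: additions only; the division by k*k could
    # be folded into gamma/beta after training, as the abstract notes.
    pad = k // 2
    def box_smooth(m):
        p = np.pad(m, pad, mode="edge")
        out = np.zeros_like(m, dtype=float)
        for dy in range(k):
            for dx in range(k):
                out += p[dy:dy + h, dx:dx + w]
        return out / (k * k)

    gamma_s, beta_s = box_smooth(gamma_map), box_smooth(beta_map)

    # Normalize, then scale and shift with the smoothed parameter maps.
    norm = (features - features.mean()) / (features.std() + 1e-5)
    return gamma_s * norm + beta_s
```

At a class boundary the smoothed maps take intermediate values between the two classes' parameters, which is the "more possible values for the scaling and shift" effect the abstract describes.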

    Efficientformer vision transformer

    Publication Number: US12236668B2

    Publication Date: 2025-02-25

    Application Number: US17865178

    Application Date: 2022-07-14

    Abstract: A vision transformer network having extremely low latency and usable on mobile devices, such as smart eyewear devices and other augmented reality (AR) and virtual reality (VR) devices. The transformer network processes an input image and includes a convolution stem configured to patch embed the image. A first stack of stages including at least two stages of 4-Dimension (4D) metablocks (MBs) (MB4D) follows the convolution stem. A second stack of stages including at least two stages of 3-Dimension MBs (MB3D) follows the MB4D stages. Each of the MB4D stages and each of the MB3D stages includes a different layer configuration, and each includes a token mixer. The MB3D stages each additionally include a multi-head self-attention (MHSA) processing block.
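The stage ordering described above can be sketched as follows. This is a shape-level toy, not the patented network: the patch size, dimensions, pooling token mixer for MB4D, and single-head attention standing in for MHSA are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_stem(img, dim=8):
    # Hypothetical patch embedding: non-overlapping 4x4 patches projected
    # to dim channels, kept as a spatial (4D-style) feature map.
    h, w, c = img.shape
    p = img.reshape(h // 4, 4, w // 4, 4, c).transpose(0, 2, 1, 3, 4)
    p = p.reshape(h // 4, w // 4, 16 * c)
    W = rng.standard_normal((16 * c, dim)) * 0.1
    return p @ W

def mb4d(x):
    # MB4D stand-in: 3x3 average-pooling token mixer with a residual.
    h, w, _ = x.shape
    pad = np.pad(x, ((1, 1), (1, 1), (0, 0)), mode="edge")
    pooled = sum(pad[dy:dy + h, dx:dx + w]
                 for dy in range(3) for dx in range(3)) / 9.0
    return x + pooled

def mb3d(tokens):
    # MB3D stand-in: single-head self-attention over the token sequence
    # (the patent's MB3D stages use multi-head self-attention, MHSA).
    dim = tokens.shape[-1]
    Wq, Wk, Wv = (rng.standard_normal((dim, dim)) * 0.1 for _ in range(3))
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    att = np.exp(q @ k.T / np.sqrt(dim))
    att /= att.sum(axis=-1, keepdims=True)
    return tokens + att @ v

img = rng.standard_normal((32, 32, 3))
x = conv_stem(img)                   # 8x8 grid of 8-dim features
for _ in range(2):                   # first stack: MB4D stages
    x = mb4d(x)
tokens = x.reshape(-1, x.shape[-1])  # flatten grid to a 3D token sequence
for _ in range(2):                   # second stack: MB3D stages
    tokens = mb3d(tokens)
```

The point of the ordering is that the cheap spatial (MB4D) stages run at high resolution, and attention (MB3D) only runs after flattening to a shorter token sequence.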

    LAYER FREEZING & DATA SIEVING FOR SPARSE TRAINING

    Publication Number: US20240070521A1

    Publication Date: 2024-02-29

    Application Number: US17893241

    Application Date: 2022-08-23

    CPC classification number: G06N20/00 G06K9/6256

    Abstract: A layer freezing and data sieving technique used in a sparse training domain for object recognition, providing end-to-end dataset-efficient training. The layer freezing and data sieving methods are seamlessly incorporated into a sparse training algorithm to form a generic framework. The generic framework consistently outperforms prior approaches and significantly reduces training floating-point operations (FLOPs) and memory costs while preserving high accuracy. The reduction in training FLOPs comes from three sources: weight sparsity, frozen layers, and a shrunken dataset. The training acceleration depends on different factors, e.g., the support of the sparse computation, layer type and size, and system overhead. The FLOPs reduction from the frozen layers and shrunken dataset leads to higher actual training acceleration than weight sparsity.
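The three FLOPs-reduction sources named above can be combined in a simple accounting sketch (a hypothetical cost model for illustration, not the patent's measurement methodology): sparsity thins every pass, frozen layers skip backward computation, and data sieving scales the whole epoch by the surviving dataset fraction.

```python
def training_flops(base_flops_per_layer, frozen, sparsity, data_fraction):
    """Hypothetical per-epoch FLOPs model.

    base_flops_per_layer: dense forward FLOPs of each layer
    frozen: per-layer flags; frozen layers skip backward/update cost
    sparsity: fraction of weights pruned (applied to every pass)
    data_fraction: fraction of the dataset kept after sieving
    """
    total = 0.0
    for flops, is_frozen in zip(base_flops_per_layer, frozen):
        fwd = flops * (1.0 - sparsity)         # sparse forward pass
        bwd = 0.0 if is_frozen else 2.0 * fwd  # backward ~2x forward cost
        total += fwd + bwd
    return total * data_fraction               # shrunken dataset
```

For example, freezing one of three equal layers at 50% sparsity with 80% of the data kept cuts per-epoch cost well below a third of the dense baseline, which mirrors the abstract's point that frozen layers and the shrunken dataset translate into actual acceleration more readily than sparsity alone.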
