HARDWARE-AWARE EFFICIENT ARCHITECTURES FOR TEXT-TO-IMAGE DIFFUSION MODELS

    Publication No.: US20250131606A1

    Publication date: 2025-04-24

    Application No.: US18492572

    Filing date: 2023-10-23

    Abstract: A processor-implemented method includes receiving a text-semantic input at a first stage of a neural network, the first stage including a first convolutional block and no attention layers. The method receives, at a second stage, a first output from the first stage. The second stage comprises a first down sampling block including a first attention layer and a second convolutional block. The method receives, at a third stage, a second output from the second stage. The third stage comprises a first up sampling block including a second attention layer and a first set of convolutional blocks. The method receives, at a fourth stage, the first output from the first stage and a third output from the third stage. The fourth stage comprises a second up sampling block including no attention layers and a second set of convolutional blocks. The method generates an image at the fourth stage, based on the text-semantic input.
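The stage wiring in this abstract can be written down as a small data structure. The following is only an illustrative sketch: the names `Stage`, `STAGES`, `execution_trace`, and `attention_stages` are hypothetical and do not come from the patent, and no actual convolution or attention computation is performed. It records which stages carry attention layers and checks that the fourth stage receives both the first-stage output and the third-stage output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str            # illustrative label, not a term from the patent
    kind: str            # "conv", "down", or "up"
    has_attention: bool  # whether the stage contains an attention layer
    inputs: tuple        # names of the sources feeding this stage


# Layout as described in the abstract; "text_input" stands for the
# text-semantic input received by the first stage.
STAGES = (
    Stage("stage1", "conv", False, ("text_input",)),
    Stage("stage2", "down", True, ("stage1",)),
    Stage("stage3", "up", True, ("stage2",)),
    Stage("stage4", "up", False, ("stage1", "stage3")),
)


def execution_trace(stages):
    """Walk the stages in order, checking that every input already exists."""
    available = {"text_input"}
    trace = []
    for stage in stages:
        missing = [src for src in stage.inputs if src not in available]
        if missing:
            raise ValueError(f"{stage.name} is missing inputs: {missing}")
        trace.append((stage.name, stage.inputs, stage.has_attention))
        available.add(stage.name)
    return trace


def attention_stages(stages):
    """Return the names of the stages that include an attention layer."""
    return [stage.name for stage in stages if stage.has_attention]
```

Calling `attention_stages(STAGES)` lists only the second and third stages, which reflects the abstract's placement of attention in the middle stages while the first and fourth stages are convolution-only.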

    EFFICIENT DIFFUSION MACHINE LEARNING MODELS

    Publication No.: US20250124551A1

    Publication date: 2025-04-17

    Application No.: US18488786

    Filing date: 2023-10-17

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for improved machine learning. During a first iteration of processing data using a first denoising backbone of a teacher diffusion machine learning model, a first latent tensor is generated using a lower resolution block of the first denoising backbone. During a first iteration of processing data using a second denoising backbone of a student diffusion machine learning model, a second latent tensor is generated using an adapter block of the second denoising backbone. A loss is generated based on the first and second latent tensors, and one or more parameters of the adapter block are updated based on the loss.
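The update described in this abstract, a loss between a teacher latent tensor and a student adapter output that drives an update of the adapter parameters, can be sketched in a few lines. The sketch below rests on assumptions that the abstract does not state: the latent tensors are flattened to lists of floats, the adapter is a single scale and bias, the loss is mean squared error, and the update is plain gradient descent. All function names are hypothetical.

```python
def mse_loss(student, teacher):
    """Mean squared error between two equal-length lists of floats."""
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(teacher)


def adapter_forward(features, scale, bias):
    """Assumed toy adapter: an elementwise affine map of student features."""
    return [scale * f + bias for f in features]


def distill_step(teacher_latent, student_features, scale, bias, lr=0.1):
    """One update of the adapter parameters; returns (loss, scale, bias).

    The returned loss is the value measured before the update.
    """
    student_latent = adapter_forward(student_features, scale, bias)
    n = len(teacher_latent)
    loss = mse_loss(student_latent, teacher_latent)
    # Analytic gradients of the MSE with respect to scale and bias.
    grad_scale = sum(
        2.0 * (s - t) * f
        for s, t, f in zip(student_latent, teacher_latent, student_features)
    ) / n
    grad_bias = sum(
        2.0 * (s - t) for s, t in zip(student_latent, teacher_latent)
    ) / n
    return loss, scale - lr * grad_scale, bias - lr * grad_bias


# Made-up example values, chosen only to show the loss decreasing.
teacher = [2.0, 4.0, 6.0]
features = [1.0, 2.0, 3.0]
loss_1, scale, bias = distill_step(teacher, features, scale=1.0, bias=0.0)
loss_2, scale, bias = distill_step(teacher, features, scale, bias)
```

Only the adapter parameters change here; the teacher values are treated as fixed targets, which matches the abstract's statement that the adapter block is what gets updated.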

    DISPARITY-BASED DEPTH REFINEMENT USING CONFIDENCE INFORMATION AND STEREOSCOPIC DEPTH INFORMATION

    Publication No.: US20240404093A1

    Publication date: 2024-12-05

    Application No.: US18327380

    Filing date: 2023-06-01

    Abstract: Systems and techniques are provided for generating disparity information from two or more images. For example, a process can include obtaining first disparity information corresponding to a pair of images, the pair of images including a first image of a scene and a second image of the scene. The process can include obtaining confidence information associated with the first disparity information. The process can include processing, using a machine learning network, the first disparity information and the confidence information to generate second disparity information corresponding to the pair of images. The process can include combining, based on the confidence information, the first disparity information with the second disparity information to generate a refined disparity map corresponding to the pair of images.
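The final combining step in this abstract can be illustrated with a per-pixel blend. The abstract says only that the two disparity maps are combined based on the confidence information. The linear weighting below is an assumption made for illustration, as are the function name `combine_disparity` and the convention that confidence lies in [0, 1] and refers to the first disparity map.

```python
def combine_disparity(first, second, confidence):
    """Blend two disparity maps per pixel (assumed linear weighting).

    Each argument is a list of rows of floats with identical shape.
    High confidence keeps the first disparity value; low confidence
    favors the second, network-generated value.
    """
    refined = []
    for row_d1, row_d2, row_c in zip(first, second, confidence):
        refined_row = []
        for d1, d2, c in zip(row_d1, row_d2, row_c):
            c = min(1.0, max(0.0, c))  # clamp confidence into [0, 1]
            refined_row.append(c * d1 + (1.0 - c) * d2)
        refined.append(refined_row)
    return refined


# Made-up example values.
first_disparity = [[10.0, 20.0, 30.0]]
second_disparity = [[12.0, 16.0, 33.0]]
confidence_map = [[1.0, 0.5, 0.0]]
refined_map = combine_disparity(first_disparity, second_disparity, confidence_map)
```

With these inputs the fully confident pixel keeps its first value, the zero-confidence pixel takes the second value, and the middle pixel is the average of the two.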
