GENERATION OF IMAGE CORRESPONDING TO INPUT TEXT USING DYNAMIC VALUE CLIPPING

    Publication No.: US20240153152A1

    Publication Date: 2024-05-09

    Application No.: US18052866

    Application Date: 2022-11-04

    Applicant: Lemon Inc.

    CPC classification number: G06T11/00 G06F40/40 G06T5/002

    Abstract: Systems and methods are provided that include a processor executing a program to receive input text from a user. The processor is further configured to, for a predetermined number of iterations, input an initial image into a diffusion process to generate a processed image, back-propagate the processed image through a text-image match gradient calculator to calculate a gradient against the input text, and update the initial image with an image generated by applying the calculated gradient to the processed image. The pixel values of the processed image during a first portion of the predetermined number of iterations are value clamped to a first range, and pixel values of the processed image during a second portion of the predetermined number of iterations are value clamped to a second range that is a subset of the first range.
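
Below is a minimal sketch of the iterative loop this abstract describes, in a PyTorch-style setting. The callables `diffusion_step` and `match_score`, the step size, and the clamp ranges and iteration split are illustrative assumptions, not the claimed implementation.

```python
import torch

def generate_with_dynamic_clipping(initial_image, text_embedding, diffusion_step,
                                   match_score, num_iters=100, lr=0.1):
    """Sketch: guided diffusion where pixel values are clamped to a wide
    range early on and to a narrower subset of that range later."""
    image = initial_image.clone()
    for i in range(num_iters):
        # Diffusion process produces a processed image from the current image.
        processed = diffusion_step(image, i)

        # Back-propagate a text-image match score to obtain a gradient
        # of the processed image against the input text.
        processed = processed.detach().requires_grad_(True)
        score = match_score(processed, text_embedding)
        grad = torch.autograd.grad(score, processed)[0]

        # Update the working image by applying the gradient to the processed image.
        image = (processed + lr * grad).detach()

        # Dynamic value clipping: first portion of iterations uses a wide range,
        # second portion uses a narrower subset of that range (values illustrative).
        if i < num_iters // 2:
            image = image.clamp(-1.0, 1.0)
        else:
            image = image.clamp(-0.8, 0.8)
    return image
```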

    VIDEO GENERATION METHOD, AND TRAINING METHOD FOR VIDEO GENERATION MODEL

    Publication No.: US20250131613A1

    Publication Date: 2025-04-24

    Application No.: US18834154

    Application Date: 2022-12-15

    Applicant: Lemon Inc.

    Abstract: Embodiments of the present disclosure provide a video generation method and a training method for a video generation model. The video generation method includes: acquiring a first video that includes a first object image; and inputting the first video into a pre-trained video generation model to obtain a second video. The video generation model is trained on the basis of a target image and a plurality of sample image pairs obtained from a plurality of first sample images; an object image in the second video is generated on the basis of a preset animal image in the target image and the first object image; and a background image of the second video is generated on the basis of a first background image of the first video.
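
A minimal sketch of the inference flow, assuming a hypothetical call signature for the pre-trained `video_generation_model` and illustrative tensor shapes; the claimed training procedure is not reproduced here.

```python
import torch

def generate_second_video(first_video, target_image, video_generation_model):
    """Sketch: the pre-trained model swaps the object in the first video for
    the preset animal image from the target image, while the output video's
    background follows the first video's background."""
    with torch.no_grad():
        # first_video: (frames, C, H, W); target_image: (C, H, W)  (assumed shapes)
        second_video = video_generation_model(first_video, target_image)
    return second_video
```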

    GENERATION OF IMAGE CORRESPONDING TO INPUT TEXT USING MULTI-TEXT GUIDED IMAGE CROPPING

    Publication No.: US20240153153A1

    Publication Date: 2024-05-09

    Application No.: US18052870

    Application Date: 2022-11-04

    Applicant: Lemon Inc.

    CPC classification number: G06T11/00 G06F40/40 G06T5/002

    Abstract: Systems and methods are provided that include a processor executing a program to receive an input from a user, where the input includes a first input text and a second input text. The processor is further configured to provide an initial image and, for a predetermined number of iterations, define first and second regions of the initial image associated with the first and second input texts, respectively, define a plurality of patches of the initial image, input the initial image into a diffusion process to generate a processed image, and back-propagate the processed image through a text-image match gradient calculator by generating an image embedding based on the processed image, generating a text embedding based on the region and the input text that are associated with a patch, and calculating a differential between the image embedding and the text embedding.
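
A minimal sketch of the per-patch gradient calculation, assuming CLIP-style `image_encoder` and `text_encoder` callables, a cosine-similarity differential, and a simple patch-to-text assignment format; all of these are assumptions rather than the claimed implementation.

```python
import torch
import torch.nn.functional as F

def multi_text_guidance_gradient(processed, patch_assignments, texts,
                                 image_encoder, text_encoder):
    """Sketch: each patch of the processed image is embedded and compared
    against the input text of the region it is associated with; the gradient
    of the summed differential is back-propagated to the image."""
    processed = processed.detach().requires_grad_(True)
    total = processed.new_zeros(())
    # patch_assignments: list of ((y0, y1, x0, x1), text_index) pairs (assumed format).
    for (y0, y1, x0, x1), text_idx in patch_assignments:
        patch = processed[..., y0:y1, x0:x1]
        image_emb = image_encoder(patch)          # image embedding from the patch
        text_emb = text_encoder(texts[text_idx])  # text embedding for the associated region
        # Differential between image and text embeddings (1 - cosine similarity).
        total = total + (1.0 - F.cosine_similarity(image_emb, text_emb, dim=-1)).mean()
    return torch.autograd.grad(total, processed)[0]
```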

    GENERATION OF CURATED TRAINING DATA FOR DIFFUSION MODELS

    Publication No.: US20240153194A1

    Publication Date: 2024-05-09

    Application No.: US18052865

    Application Date: 2022-11-04

    Applicant: Lemon Inc.

    CPC classification number: G06T15/02 G06F40/289 G06T5/002 G06T2207/20081

    Abstract: Systems and methods are provided that include a processor executing a program to match sentences from a sentence dataset with artistic phrases from an artistic phrase dataset to generate a plurality of safe phrases. The processor is further configured to, for each of the safe phrases, generate a safe image by, for a predetermined number of iterations, performing steps to input an initial image into a diffusion process to generate a processed image, wherein the diffusion process includes a first diffusion model, back-propagate the processed image through a text-image match gradient calculator to calculate a gradient against the safe phrase, and update the initial image by applying the gradient to the processed image. The processor is further configured to pair each of the generated safe images with its respective safe phrase to form a plurality of safe phrase-image pairs.
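
A minimal sketch of the curation pipeline, assuming hypothetical `match` and `generate_image` callables that stand in for the phrase-matching step and the guided-diffusion generation loop described above; the phrase composition format is also an assumption.

```python
def build_safe_phrase_image_pairs(sentences, artistic_phrases, match, generate_image):
    """Sketch: sentences are matched with artistic phrases to form safe
    phrases, a safe image is generated for each phrase, and each image is
    paired with its phrase to form safe phrase-image training pairs."""
    # Compose each sentence with its best-matching artistic phrase (assumed format).
    safe_phrases = [f"{sentence}, {match(sentence, artistic_phrases)}"
                    for sentence in sentences]
    pairs = []
    for phrase in safe_phrases:
        safe_image = generate_image(phrase)   # iterative diffusion + gradient guidance
        pairs.append((phrase, safe_image))    # safe phrase-image pair for training
    return pairs
```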

    GENERATION OF IMAGES CORRESPONDING TO INPUT TEXT USING MULTI-ALGORITHM DIFFUSION SAMPLING

    Publication No.: US20240153151A1

    Publication Date: 2024-05-09

    Application No.: US18052862

    Application Date: 2022-11-04

    Applicant: Lemon Inc.

    CPC classification number: G06T11/00 G06F40/40 G06T5/002

    Abstract: Systems and methods are provided that include a processor executing a program to process an initial image through a first diffusion stage to generate a final first stage image, wherein the first diffusion stage includes using a diffusion model, a gradient estimator model smaller than the diffusion model, and a text-image match gradient calculator. The processor further executes the program to process the final first stage image through a second diffusion stage to generate a final second stage image. The second diffusion stage includes, for a second predetermined number of iterations, inputting the final first stage image through the diffusion model, back-propagating the image through the text-image match gradient calculator to calculate a second stage gradient against the input text, and updating the final first stage image by applying the second stage gradient to the final first stage image.
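
A minimal sketch of the two-stage sampling flow, assuming the smaller `gradient_estimator` directly approximates the text-image match gradient in the first stage; the callables, stage lengths, and step size are assumptions, not the claimed design.

```python
import torch

def two_stage_sampling(initial_image, text_embedding,
                       diffusion_model, gradient_estimator, match_score,
                       n_first=50, n_second=50, lr=0.1):
    """Sketch: a cheap first stage guided by a small gradient estimator,
    followed by a second stage guided by back-propagating the text-image
    match score through the full diffusion output."""
    image = initial_image.clone()

    # First diffusion stage: the smaller gradient estimator approximates the
    # text-image match gradient without back-propagating through the full model.
    for i in range(n_first):
        processed = diffusion_model(image, i).detach()
        grad = gradient_estimator(processed, text_embedding)  # approximate gradient
        image = (processed + lr * grad).detach()
    final_first_stage_image = image

    # Second diffusion stage: back-propagate the processed image through the
    # text-image match gradient calculator to get a second stage gradient
    # against the input text, then apply it to update the image.
    image = final_first_stage_image
    for i in range(n_second):
        processed = diffusion_model(image, i).detach().requires_grad_(True)
        score = match_score(processed, text_embedding)
        second_stage_grad = torch.autograd.grad(score, processed)[0]
        image = (processed + lr * second_stage_grad).detach()
    return image
```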

    EXPRESSION DRIVING METHOD AND DEVICE, AND EXPRESSION DRIVING MODEL TRAINING METHOD AND DEVICE

    Publication No.: US20250078570A1

    Publication Date: 2025-03-06

    Application No.: US18726709

    Application Date: 2023-01-04

    Applicant: Lemon Inc.

    Abstract: The present disclosure provides an expression driving method and apparatus, and a training method and apparatus of an expression driving model. The expression driving method includes acquiring a first video; and inputting the first video into a pre-trained expression driving model to obtain a second video. The expression driving model is trained based on a target sample image and a plurality of first sample images. A facial image in the second video is generated based on the target sample image. A gesture expression feature of the facial image in the second video is the same as a gesture expression feature of a facial image in the first video.
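
A minimal sketch of the inference flow, assuming a hypothetical call signature for the pre-trained `expression_driving_model`; the training on the target sample image and first sample images is not reproduced here.

```python
import torch

def drive_expression(first_video, expression_driving_model):
    """Sketch: the pre-trained model outputs a second video whose face comes
    from the target sample image it was trained on, while the face's gesture
    and expression features follow the faces in the first (driving) video."""
    with torch.no_grad():
        second_video = expression_driving_model(first_video)
    return second_video
```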
