-
Publication number: US20240364925A1
Publication date: 2024-10-31
Application number: US18636126
Filing date: 2024-04-15
Applicant: QUALCOMM Incorporated
Inventor: Hoang Cong Minh LE , Qiqi HOU , Farzad FARHADZADEH , Amir SAID , Auke Joris WIGGERS , Guillaume Konrad SAUTIERE , Reza POURREZA
IPC: H04N19/597 , H04N19/137 , H04N19/436
CPC classification number: H04N19/597 , H04N19/137 , H04N19/436
Abstract: Systems and techniques are described herein for processing video data. For example, a machine-learning based stereo video coding system can obtain video data including at least a right-view image of a right view of a scene and a left-view image of a left view of the scene. The machine-learning based stereo video coding system can compress the right-view image and the left-view image in parallel to generate a latent representation of the right-view image and the left-view image. The right-view image and the left-view image can be compressed in parallel based on inter-view information between the right-view image and the left-view image, determined using one or more parallel autoencoders.
-
Publication number: US20250166236A1
Publication date: 2025-05-22
Application number: US18511692
Filing date: 2023-11-16
Applicant: QUALCOMM Incorporated
Inventor: Kambiz AZARIAN YAZDI , Fatih Murat PORIKLI , Qiqi HOU , Debasmit DAS
IPC: G06T11/00 , G06F40/284 , G06T5/00
Abstract: Certain aspects of the present disclosure provide techniques for generating an output image based on a text prompt. A method may include receiving the text prompt; providing a user interface comprising one or more input elements associated with one or more words of the text prompt; receiving input corresponding to at least one of the one or more input elements, the input indicating a semantic importance for each of at least one of the one or more words associated with the at least one of the one or more input elements; and generating the output image based on the text prompt and the input.
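The abstract above does not say how the per-word semantic importance reaches the image generator. One plausible reading, offered only as a minimal sketch, is to attach a user-chosen weight to each word and scale that word's embedding before conditioning. The function names `weight_prompt` and `scale_embeddings`, the whitespace tokenization, and the default weight of 1.0 are all hypothetical and do not come from the filing.

```python
def weight_prompt(prompt, importance, default=1.0):
    """Pair each word of the prompt with an importance weight.

    `importance` maps a word to the value set through its input element
    (for example a slider); words without an entry get `default`.
    Whitespace tokenization is an assumption made for illustration.
    """
    return [(word, float(importance.get(word, default)))
            for word in prompt.split()]


def scale_embeddings(embeddings, weights):
    """Scale each word's embedding vector by that word's weight.

    Scaling the embedding is an assumed mechanism; the abstract only
    states that the output image is based on the prompt and the input.
    """
    return [[w * x for x in vec] for vec, w in zip(embeddings, weights)]


# Usage: emphasize "red", de-emphasize "beach".
weighted = weight_prompt("a red car on a beach", {"red": 1.5, "beach": 0.5})
weights = [w for _, w in weighted]
```

In this sketch a weight above 1.0 strengthens a word's contribution and a weight below 1.0 weakens it, while unlisted words pass through unchanged.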
-
Publication number: US20250131606A1
Publication date: 2025-04-24
Application number: US18492572
Filing date: 2023-10-23
Applicant: QUALCOMM Incorporated
Inventor: Shubhankar Mangesh BORSE , Risheek GARREPALLI , Qiqi HOU , Jisoo JEONG , Shreya KADAMBI , Munawar HAYAT , Fatih Murat PORIKLI
Abstract: A processor-implemented method includes receiving a text-semantic input at a first stage of a neural network, including a first convolutional block and no attention layers. The method receives, at a second stage, a first output from the first stage. The second stage comprises a first down sampling block including a first attention layer and a second convolutional block. The method receives, at a third stage, a second output from the second stage. The third stage comprises a first up sampling block including a second attention layer and a first set of convolutional blocks. The method receives, at a fourth stage, the first output from the first stage and a third output from the third stage. The fourth stage comprises a second up sampling block including no attention layers and a second set of convolutional blocks. The method generates an image at the fourth stage, based on the text-semantic input.
-
Publication number: US20250131325A1
Publication date: 2025-04-24
Application number: US18492492
Filing date: 2023-10-23
Applicant: QUALCOMM Incorporated
Inventor: Risheek GARREPALLI , Shubhankar Mangesh BORSE , Jisoo JEONG , Qiqi HOU , Shreya KADAMBI , Munawar HAYAT , Fatih Murat PORIKLI
IPC: G06N20/00
Abstract: A method for training a diffusion model includes compressing the diffusion model by removing at least one of: one or more model parameters or one or more giga multiply-accumulate operations (GMACs). The method also includes performing guidance conditioning to train the compressed diffusion model, the guidance conditioning combining a conditional output and an unconditional output from respective teacher models. The method further includes performing, after the guidance conditioning, step distillation on the compressed diffusion model.
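The guidance-conditioning step described above combines a conditional and an unconditional teacher output. The abstract does not give the formula; the sketch below assumes the widely used classifier-free guidance combination, `uncond + scale * (cond - uncond)`, and a mean-squared-error loss against the compressed student. The names `guided_target` and `mse`, and the plain-list representation of model outputs, are illustrative assumptions.

```python
def guided_target(cond_out, uncond_out, guidance_scale):
    """Combine conditional and unconditional teacher outputs.

    Assumes the classifier-free guidance form u + s * (c - u); the
    filing's exact combination rule is not stated in the abstract.
    """
    return [u + guidance_scale * (c - u)
            for c, u in zip(cond_out, uncond_out)]


def mse(prediction, target):
    """Mean squared error between student prediction and guided target."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)


# Usage: the student is trained to match the guided target in one pass.
target = guided_target([1.0, 2.0], [0.0, 1.0], guidance_scale=2.0)
loss = mse([2.0, 2.0], target)
```

With a guidance scale of 1.0 the target reduces to the conditional output, so the scale controls how far the student is pushed beyond it.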
-
Publication number: US20250131277A1
Publication date: 2025-04-24
Application number: US18492529
Filing date: 2023-10-23
Applicant: QUALCOMM Incorporated
Inventor: Risheek GARREPALLI , Shubhankar Mangesh BORSE , Jisoo JEONG , Qiqi HOU , Shreya KADAMBI , Munawar HAYAT , Fatih Murat PORIKLI
IPC: G06N3/09
Abstract: A method for training a control neural network includes initializing a baseline diffusion model for training the control neural network, each stage of a control neural network training pipeline corresponding to an element of the baseline diffusion model. The method also includes training the control neural network in a stage-wise manner, each stage of the control neural network training pipeline receiving an input from a previous stage of the control neural network training pipeline and the corresponding element of the diffusion model.
-
Publication number: US20250131276A1
Publication date: 2025-04-24
Application number: US18492508
Filing date: 2023-10-23
Applicant: QUALCOMM Incorporated
Inventor: Risheek GARREPALLI , Shubhankar Mangesh BORSE , Jisoo JEONG , Qiqi HOU , Shreya KADAMBI , Munawar HAYAT , Fatih Murat PORIKLI
IPC: G06N3/09
Abstract: A method for training a diffusion model includes randomly selecting, for each iteration of a step distillation training process, a teacher model of a group of teacher models. The method also includes applying, at each iteration, a clipped input space within step distillation of the randomly selected teacher model. The method further includes updating, at each iteration, parameters of the diffusion model based on guidance from the randomly selected teacher model.
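The training loop in the abstract above has three parts per iteration: pick a teacher at random, clip the input space, and update the student from that teacher's guidance. The toy sketch below illustrates only that control flow. It assumes scalar linear models (`y = w * x`), element-wise clipping of the inputs to `[low, high]`, and a plain gradient step on squared error; none of these details, nor the names `clip` and `distill_step`, come from the filing.

```python
import random


def clip(values, low, high):
    """Clamp every input value into the range [low, high]."""
    return [min(max(v, low), high) for v in values]


def distill_step(student_w, teachers, x, rng, low=-1.0, high=1.0, lr=0.1):
    """Run one toy distillation iteration and return the updated weight.

    teachers: list of callables mapping an input value to a target value.
    The student is the toy model y = student_w * x (an assumption).
    """
    teacher = rng.choice(teachers)      # random teacher for this iteration
    x_clipped = clip(x, low, high)      # assumed form of the clipped input space
    grad = 0.0
    for xi in x_clipped:
        err = student_w * xi - teacher(xi)
        grad += 2.0 * err * xi          # derivative of squared error w.r.t. w
    grad /= len(x_clipped)
    return student_w - lr * grad


# Usage: two toy teachers that agree on the mapping y = 2x.
rng = random.Random(0)
teachers = [lambda v: 2.0 * v, lambda v: 2.0 * v]
w = 0.0
for _ in range(200):
    w = distill_step(w, teachers, [0.5, -3.0, 1.0], rng)
```

Because both toy teachers agree, the student weight converges toward 2.0; with teachers that disagree, the random selection would pull the student toward a blend of their behaviors.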