Patent search ap:("QUALCOMM Incorporated") AND inv:"Manish Kumar SINGH" Page 1

1.

Invention patent application
DEPTH ESTIMATION BASED ON FEATURE RECONSTRUCTION WITH ADAPTIVE MASKING AND MOTION PREDICTION [Legal status: In force]

Publication number: US20250148633A1

Publication date: 2025-05-08

Application number: US18666502

Filing date: 2024-05-16

Applicant: QUALCOMM Incorporated

Inventor: Rajeev YASARLA, Hong CAI, Risheek GARREPALLI, Yinhao ZHU, Jisoo JEONG, Yunxiao SHI, Manish Kumar SINGH, Fatih Murat PORIKLI

IPC: G06T7/593, G06T7/20

Abstract: Systems and techniques are provided for generating depth information. For example, a process can include obtaining a first feature volume including visual features corresponding to each respective frame included in a first set of frames. A first query generator network can generate reconstruction features associated with a reconstructed feature volume corresponding to the first feature volume. Based on the first feature volume, a second query generator network can generate motion features associated with predicted future motion corresponding to the first feature volume. An initial depth prediction can be generated for each respective frame based on cross-attention between features of a depth prediction decoder, the reconstruction features, and the motion features. A refined depth prediction can be generated for each respective frame based on cross-attention between the initial depth prediction, the reconstruction features, and the motion features.
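
The cross-attention step named in this abstract can be illustrated with a minimal NumPy sketch. This is not the patented method: it shows only generic single-head scaled dot-product cross-attention, with no learned projections and no depth head. The names `decoder_feats`, `recon_feats`, and `motion_feats`, the array sizes, and the residual fusion are assumptions chosen for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def cross_attention(queries, context):
    """Single-head scaled dot-product cross-attention without learned projections.

    queries: (n_q, d) array, context: (n_c, d) array.
    Returns the attended features (n_q, d) and the attention weights (n_q, n_c).
    """
    d = queries.shape[-1]
    weights = softmax(queries @ context.T / np.sqrt(d), axis=-1)
    return weights @ context, weights

rng = np.random.default_rng(0)
dim = 8
# Hypothetical stand-ins: random arrays replace real network outputs.
decoder_feats = rng.normal(size=(16, dim))  # depth-decoder features for 16 pixels
recon_feats = rng.normal(size=(4, dim))     # reconstruction query features
motion_feats = rng.normal(size=(4, dim))    # motion query features

# Decoder features attend jointly to both sets of query features.
context = np.concatenate([recon_feats, motion_feats], axis=0)
attended, weights = cross_attention(decoder_feats, context)
fused = decoder_feats + attended  # residual fusion (an assumption)
print(fused.shape)  # (16, 8)
```

In a two-stage arrangement like the one the abstract describes, the same function could be called a second time with features derived from the initial depth prediction as the queries; how those features are produced is not specified here.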

2.

Invention patent application
RE-ARRANGING FEED FORWARD NETWORKS (FFNs) IN TRANSFORMER-BASED MODELS [Legal status: In force]

Publication number: US20250094793A1

Publication date: 2025-03-20

Application number: US18469909

Filing date: 2023-09-19

Applicant: QUALCOMM Incorporated

Inventor: Manish Kumar SINGH, Tianyu JIANG, Hsin-Pai CHENG, Kartikeya BHARDWAJ, Hong CAI, Mingu LEE, Munawar HAYAT, Christopher LOTT, Fatih Murat PORIKLI

IPC: G06N3/0499

Abstract: A processor-implemented method for image or text processing includes receiving, by an artificial neural network (ANN) model, a set of tokens corresponding to an input. A token interaction block of the ANN model processes the set of tokens according to each channel of the input to generate a spatial mixture of a set of features for each channel of the input. A feed forward network block of the ANN model generates a mixture of channel features based on the spatial mixture of the set of features for each channel of the input. An attention block of the ANN model determines a set of attended features of the mixture of channel features according to a set of attention weights. In turn, the ANN model generates an inference based on the set of attended features of the mixture of channel features.
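
The block ordering in this abstract (token interaction, then feed forward network, then attention) can be sketched in NumPy as follows. The concrete operators are assumptions, not taken from the patent: the token interaction is modeled as a single token-mixing matrix applied to every channel independently, the FFN as a two-layer ReLU network, and the attention as projection-free self-attention. All weights are random placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def token_interaction(x, w_spatial):
    """Mix tokens within each channel. x: (n_tokens, channels), w_spatial: (n_tokens, n_tokens)."""
    return w_spatial @ x

def feed_forward(x, w1, w2):
    """Mix channels within each token using a two-layer ReLU network."""
    return np.maximum(x @ w1, 0.0) @ w2

def self_attention(x):
    """Projection-free scaled dot-product self-attention over tokens."""
    weights = softmax(x @ x.T / np.sqrt(x.shape[-1]), axis=-1)
    return weights @ x

def block(x, w_spatial, w1, w2):
    """Apply the three stages in the order the abstract lists them."""
    spatial_mix = token_interaction(x, w_spatial)   # spatial mixture per channel
    channel_mix = feed_forward(spatial_mix, w1, w2)  # mixture of channel features
    return self_attention(channel_mix)               # attended features

rng = np.random.default_rng(0)
n_tokens, channels, hidden = 6, 4, 8
x = rng.normal(size=(n_tokens, channels))
w_spatial = rng.normal(size=(n_tokens, n_tokens))
w1 = rng.normal(size=(channels, hidden))
w2 = rng.normal(size=(hidden, channels))

out = block(x, w_spatial, w1, w2)
print(out.shape)  # (6, 4)
```

Because `token_interaction` multiplies from the left, changing one input channel changes only the same channel of its output, which is what "for each channel" means in this sketch.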

3.

Invention patent application
DEPTH COMPLETION USING ATTENTION-BASED REFINEMENT OF FEATURES [Legal status: In force]

Publication number: US20250148628A1

Publication date: 2025-05-08

Application number: US18633302

Filing date: 2024-04-11

Applicant: QUALCOMM Incorporated

Inventor: Yunxiao SHI, Hong CAI, Manish Kumar SINGH, Shizhong Steve HAN, Yinhao ZHU, Fatih Murat PORIKLI

IPC: G06T7/50, G06T3/40, G06V10/44

Abstract: Systems and techniques are provided for generating depth information from one or more images. For example, a process can include obtaining a first depth map corresponding to an input comprising an image of the one or more images and a sparse depth measurement. A three-dimensional (3D) point cloud can be generated based on the first depth map and multi-scale visual features of the input, wherein the 3D point cloud includes a plurality of 3D point features uplifted from the multi-scale visual features. At least a portion of the plurality of 3D point features can be processed using one or more self-attention layers to generate refined 3D point features. A two-dimensional (2D) projection of the refined 3D point features can be generated and a second depth map can be generated based on the 2D projection of the refined 3D point features.
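
The lift-refine-project pipeline in this abstract can be sketched with a pinhole camera model, which is an assumption; the abstract does not state the camera model. The intrinsics (`fx`, `fy`, `cx`, `cy`), the single feature scale, the projection-free attention, and the random inputs below are all hypothetical, and the final step that turns the projected features into a second depth map is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def unproject(depth, fx, fy, cx, cy):
    """Lift an (h, w) depth map to (h*w, 3) camera-space points (pinhole model)."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]  # v: row index, u: column index
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

def project(points, fx, fy, cx, cy):
    """Project (n, 3) camera-space points back to (n, 2) pixel coordinates (u, v)."""
    u = fx * points[:, 0] / points[:, 2] + cx
    v = fy * points[:, 1] / points[:, 2] + cy
    return np.stack([u, v], axis=-1)

def refine(points, feats):
    """Self-attention over points; positions and features both shape the weights."""
    tokens = np.concatenate([points, feats], axis=1)
    weights = softmax(tokens @ tokens.T / np.sqrt(tokens.shape[-1]), axis=-1)
    return weights @ feats

h, w, c = 4, 5, 3
rng = np.random.default_rng(0)
depth = rng.uniform(1.0, 5.0, size=(h, w))   # stand-in for the first depth map
pixel_feats = rng.normal(size=(h, w, c))     # stand-in for visual features
fx = fy = 50.0
cx, cy = 2.0, 1.5

points = unproject(depth, fx, fy, cx, cy)    # 3D point cloud
point_feats = pixel_feats.reshape(-1, c)     # features uplifted to the points
refined = refine(points, point_feats)        # refined 3D point features

# 2D projection: scatter each refined feature to the pixel its point lands on.
uv = project(points, fx, fy, cx, cy)
cols = np.rint(uv[:, 0]).astype(int)
rows = np.rint(uv[:, 1]).astype(int)
feature_map = np.zeros((h, w, c))
feature_map[rows, cols] = refined
```

With the same intrinsics used in both directions, every point projects back onto the pixel it was lifted from, so the scatter fills the whole grid.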

4.

Invention patent application
TRANSFORMER WITH MULTI-SCALE MULTI-CONTEXT ATTENTIONS [Legal status: In force]

Publication number: US20240428576A1

Publication date: 2024-12-26

Application number: US18613263

Filing date: 2024-03-22

Applicant: QUALCOMM Incorporated

Inventor: Tianyu JIANG, Manish Kumar SINGH, Hsin-Pai CHENG, Hong CAI, Mingu LEE, Kartikeya BHARDWAJ, Christopher LOTT, Fatih Murat PORIKLI

IPC: G06V10/82, G06V10/70, G06V10/77

Abstract: Certain aspects of the present disclosure provide techniques and apparatus for improved machine learning. A transformed version of image pixels is accessed as input to an attention layer of a machine learning model. A number of local attention operations to apply, in one transformer, to the transformed version of image pixels is selected based at least in part on a size of the transformed version of image pixels. A transformer output for the attention layer of the machine learning model is generated based on applying the number of local attention operations and at least one global attention operation to the transformed version of image pixels.
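
The abstract says the number of local attention operations depends on the input size but does not give the rule. The sketch below uses a purely hypothetical rule (one local attention per window size that is smaller than the feature map, doubling the window each time) to show how such a selection could be combined with one global attention. The function names, `base_window`, and `max_ops` are assumptions.

```python
def num_local_attentions(height, width, base_window=7, max_ops=4):
    """Hypothetical rule: count window sizes (7, 14, 28, ...) smaller than the map."""
    count = 0
    window = base_window
    while window < min(height, width) and count < max_ops:
        count += 1
        window *= 2
    return count

def attention_plan(height, width, base_window=7, max_ops=4):
    """List the attention operations for one transformer layer: locals, then one global."""
    count = num_local_attentions(height, width, base_window, max_ops)
    plan = [f"local(window={base_window * 2 ** i})" for i in range(count)]
    plan.append("global")
    return plan

print(attention_plan(56, 56))
# ['local(window=7)', 'local(window=14)', 'local(window=28)', 'global']
```

Under this rule a 14x14 map gets one local attention and a 7x7 map gets none, so small inputs fall back to global attention only.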

5.

Invention patent application
TEST-TIME SELF-SUPERVISED GUIDANCE FOR DIFFUSION MODELS [Legal status: In force]

Publication number: US20240412493A1

Publication date: 2024-12-12

Application number: US18537404

Filing date: 2023-12-12

Applicant: QUALCOMM Incorporated

Inventor: Risheek GARREPALLI, Yunxiao SHI, Hong CAI, Yinhao ZHU, Shubhankar Mangesh BORSE, Jisoo JEONG, Debasmit DAS, Manish Kumar SINGH, Rajeev YASARLA, Shizhong Steve HAN, Fatih Murat PORIKLI

IPC: G06V10/776, G06T7/50, G06V10/764, G06V10/82, G06V20/70

Abstract: Systems and techniques are provided for processing image data. According to some aspects, a computing device can generate a gradient (e.g., a classifier gradient using a trained classifier) associated with a current sample. The computing device can combine the gradient with an iterative model estimated score function or data associated with the current sample to generate a score function estimate. The computing device can predict, using the diffusion machine learning model and based on the score function estimate, a new sample.
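
Combining a guidance gradient with a model's score estimate can be illustrated with a one-dimensional toy example. Everything concrete here is an assumption: `model_score` stands in for the diffusion model's estimate using the score of a standard normal, `classifier_gradient` uses a unit-variance Gaussian likelihood around a target value, and the update is a noise-free gradient step so the result is reproducible. It shows only the additive combination, not the patented procedure.

```python
def model_score(x):
    """Hypothetical stand-in for the model-estimated score: gradient of log N(x; 0, 1)."""
    return -x

def classifier_gradient(x, target):
    """Hypothetical guidance gradient: gradient of log N(target; x, 1) with respect to x."""
    return target - x

def guided_score(x, target, scale):
    """Score function estimate: model score plus scaled guidance gradient."""
    return model_score(x) + scale * classifier_gradient(x, target)

def predict_next(x, target, scale, step):
    """Predict a new sample by stepping along the guided score (noise omitted)."""
    return x + step * guided_score(x, target, scale)

x = 0.0
for _ in range(100):
    x = predict_next(x, target=2.0, scale=1.0, step=0.1)
print(round(x, 6))  # 1.0
```

The sample settles where the two terms cancel, at `scale * target / (1 + scale)`, so a larger `scale` pulls the result closer to the guidance target.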
