Patent search: ap:("QUALCOMM Incorporated") AND inv:"Mihir JAIN"

1.

Invention application
ACTOR-DEFORMATION-INVARIANT ACTION PROPOSALS (Under examination, published)

Publication (announcement) No.: US20190108400A1

Publication (announcement) date: 2019-04-11

Application No.: US16152755

Application date: 2018-10-05

Applicant: QUALCOMM Incorporated

Inventor: Victor Augusto ESCORCIA, Mihir JAIN, Amirhossein HABIBIAN, Cornelis Gerardus Maria SNOEK

IPC: G06K9/00, G06K9/62

Abstract: A method for generating action proposals in a sequence of frames comprises determining, at each frame of the sequence of frames, at least one possible action location for a type of actor to be detected. The method also expands, for each frame of the sequence of frames, the at least one possible action location to neighboring regions in neighboring frames from a given frame to identify a similar location between the given frame and each one of the neighboring frames. The method further comprises associating a most similar possible action location over the sequence of frames to generate the action proposals. The method also comprises classifying an action in the sequence of frames based on the action proposals and controlling an action of a device based on the classifying.
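The linking step in this abstract can be illustrated with a minimal sketch. It assumes that "similarity" between locations is measured by intersection-over-union (IoU) of bounding boxes and that each proposal is extended greedily one frame at a time; the abstract specifies neither, and the names `iou` and `link_proposals` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes (assumed similarity measure)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_proposals(frames: List[List[Box]]) -> List[List[Box]]:
    """Start one proposal per box in the first frame and extend it greedily
    with the most similar box in each following frame."""
    if not frames:
        return []
    tubes = []
    for start in frames[0]:
        tube = [start]
        for candidates in frames[1:]:
            if not candidates:
                break  # no candidate location in this frame; stop extending
            tube.append(max(candidates, key=lambda c: iou(tube[-1], c)))
        tubes.append(tube)
    return tubes


frames = [
    [(0, 0, 10, 10)],
    [(1, 1, 11, 11), (50, 50, 60, 60)],
    [(2, 2, 12, 12)],
]
tubes = link_proposals(frames)
```

Here the single box in the first frame is linked to the overlapping box in the second frame rather than the distant one, giving one proposal that spans all three frames.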

2.

Invention application
RECURRENT NETWORKS WITH MOTION-BASED ATTENTION FOR VIDEO UNDERSTANDING (Under examination, published)

Publication (announcement) No.: US20170262705A1

Publication (announcement) date: 2017-09-14

Application No.: US15267621

Application date: 2016-09-16

Applicant: QUALCOMM Incorporated

Inventor: Zhenyang LI, Efstratios GAVVES, Mihir JAIN, Cornelis Gerardus Maria SNOEK

IPC: G06K9/00, G06K9/62

CPC classification number: G06K9/00718, G06K9/00342, G06K9/6269, G06N3/0445, G06N3/0454

Abstract: A method of predicting action labels for a video stream includes receiving the video stream and calculating an optical flow of consecutive frames of the video stream. An attention map is generated from the current frame of the video stream and the calculated optical flow. An action label is predicted for the current frame based on the optical flow, a previous hidden state and the attention map.

3.

Invention application
ADAPTIVE USE OF VIDEO MODELS FOR HOLISTIC VIDEO UNDERSTANDING (Granted)

Publication (announcement) No.: US20220318553A1

Publication (announcement) date: 2022-10-06

Application No.: US17219460

Application date: 2021-03-31

Applicant: QUALCOMM Incorporated

Inventor: Haitam BEN YAHIA, Amir GHODRATI, Mihir JAIN, Amirhossein HABIBIAN

IPC: G06K9/00, G06K9/62, G10L25/57, G06N3/08, G06K9/46, G06N3/04

Abstract: Systems and techniques are provided for performing holistic video understanding. For example, a process can include obtaining a first video and determining, using a machine learning model decision engine, a first machine learning model from a set of machine learning models to use for processing at least a portion of the first video. The first machine learning model can be determined based on one or more characteristics of at least the portion of the first video. The process can include processing at least the portion of the first video using the first machine learning model.

4.

Invention application
VIDEO ACTION LOCALIZATION FROM PROPOSAL-ATTENTION (Under examination, published)

Publication (announcement) No.: US20190108399A1

Publication (announcement) date: 2019-04-11

Application No.: US16152301

Application date: 2018-10-04

Applicant: QUALCOMM Incorporated

Inventor: Victor Augusto ESCORCIA, Mihir JAIN, Amirhossein HABIBIAN, Cornelis Gerardus Maria SNOEK

IPC: G06K9/00

Abstract: A method for processing a sequence of frames includes receiving a sequence of frames and multiple action proposals for the sequence of frames. The method also includes generating a representation of the sequence of frames and pooling the representation around each of the action proposals. The method further includes classifying the action proposals based on the pooled representations and controlling a device based on the classifying.
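As a rough illustration of "pooling the representation around each of the action proposals", the sketch below averages the values of a per-frame feature map inside each proposal box and then averages across frames. The choice of average pooling, the scalar 2D feature maps, and the names `pool_region` and `pool_proposal` are all assumptions made for illustration; the abstract does not state the pooling operator or the feature layout.

```python
from typing import List, Tuple

FeatureMap = List[List[float]]        # feature_map[y][x], one scalar per cell
CellBox = Tuple[int, int, int, int]   # (x1, y1, x2, y2) in cell units, x2/y2 exclusive


def pool_region(feature_map: FeatureMap, box: CellBox) -> float:
    """Average the feature values inside one box (assumed pooling operator)."""
    x1, y1, x2, y2 = box
    cells = [feature_map[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    return sum(cells) / len(cells) if cells else 0.0


def pool_proposal(feature_maps: List[FeatureMap], proposal: List[CellBox]) -> float:
    """Pool one proposal (one box per frame) into a single value by
    averaging the per-frame pooled values."""
    per_frame = [pool_region(f, b) for f, b in zip(feature_maps, proposal)]
    return sum(per_frame) / len(per_frame) if per_frame else 0.0


fm = [[1.0, 2.0],
      [3.0, 4.0]]
pooled = pool_proposal([fm, fm], [(0, 0, 2, 2), (1, 0, 2, 2)])
```

A classifier would then score each proposal from its pooled value; in practice the pooled representation would be a vector per proposal rather than a single number.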

5.

Invention publication
COMMON ACTION LOCALIZATION (Under examination, published)

Publication (announcement) No.: US20240303987A1

Publication (announcement) date: 2024-09-12

Application No.: US18360741

Application date: 2023-07-27

Applicant: QUALCOMM Incorporated

Inventor: Juntae LEE, Mihir JAIN, Sungrack YUN

IPC: G06V20/40, G06F16/732, G06F16/735, G06F16/75

CPC classification number: G06V20/48, G06F16/7328, G06F16/735, G06F16/75, G06V20/41, G06V10/82

Abstract: Aspects of the disclosure are directed to an apparatus configured to perform common-action localization. In certain aspects, the apparatus may receive a query video comprising a plurality of frames, wherein a first query proposal is determined based on a subset of frames of the plurality of frames, the first query proposal indicative of an action depicted on the subset of frames. In certain aspects, the apparatus may determine a first attendance for a first support video of a plurality of support videos. In certain aspects, the apparatus may determine a second attendance for a second support video of the plurality of support videos after computing the first attendance.

6.

Invention application
MULTI-MODAL REPRESENTATION BASED EVENT LOCALIZATION (Granted)

Publication (announcement) No.: US20220101087A1

Publication (announcement) date: 2022-03-31

Application No.: US17405879

Application date: 2021-08-18

Applicant: QUALCOMM Incorporated

Inventor: Juntae LEE, Mihir JAIN, Sungrack YUN, Hyoungwoo PARK, Kyu Woong HWANG

IPC: G06N3/04, G06K9/62

Abstract: A method performed by an artificial neural network (ANN) includes determining, at a first stage of a multi-stage cross-attention model of the ANN, a first cross-correlation between a first representation of each modality of a number of modalities associated with a sequence of inputs. The method still further includes determining, at each second stage of one or more second stages of the multi-stage cross-attention model, a second cross-correlation between first attended representations of each modality. The method also includes generating a concatenated feature representation associated with a final second stage of the one or more second stages based on the second cross-correlation associated with the final second stage, the first attended representation of each modality, and the first representation of each modality. The method further includes determining a probability distribution between a set of background actions and a set of foreground actions from the concatenated feature representation. The method still further includes localizing an action in the sequence of inputs based on the probability distribution.

7.

Invention application
ACTION LOCALIZATION IN SEQUENTIAL DATA WITH ATTENTION PROPOSALS FROM A RECURRENT NETWORK (Under examination, published)

Publication (announcement) No.: US20170262996A1

Publication (announcement) date: 2017-09-14

Application No.: US15250755

Application date: 2016-08-29

Applicant: QUALCOMM Incorporated

Inventor: Mihir JAIN, Zhenyang LI, Efstratios GAVVES, Cornelis Gerardus Maria SNOEK

IPC: G06T7/00, G06N3/04, G06K9/62, G06K9/32, G06K9/40

CPC classification number: G06T7/0087, G06K9/00718, G06K9/3216, G06K9/3241, G06K9/40, G06K9/4671, G06K9/628, G06K2009/00738, G06N3/0445, G06N3/0454, G06T7/143, G06T2207/10016, G06T2210/12

Abstract: A method generates bounding-boxes within frames of a sequence of frames. The bounding-boxes may be generated via a recurrent neural network (RNN) such as a long short-term memory (LSTM) network. The method includes receiving the sequence of frames and generating an attention feature map for each frame of the sequence of frames. Each attention feature map indicates at least one potential moving object. The method also includes up-sampling each attention feature map to determine an attention saliency for pixels in each frame of the sequence of frames. The method further includes generating a bounding-box within each frame based on the attention saliency and temporally smoothing multiple bounding-boxes along the sequence of frames to obtain a smooth sequence of bounding-boxes. The method still further includes localizing an action location within each frame based on the smooth sequence of bounding-boxes.
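The temporal smoothing step can be pictured with a minimal sketch, assuming a centered moving average over box coordinates. The abstract does not say which smoothing filter is used, so the moving average, the `window` parameter, and the name `smooth_boxes` are illustrative assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def smooth_boxes(boxes: List[Box], window: int = 3) -> List[Box]:
    """Smooth a per-frame box sequence with a centered moving average.
    Near the ends of the sequence the window is truncated to the frames available."""
    half = window // 2
    smoothed = []
    for t in range(len(boxes)):
        lo, hi = max(0, t - half), min(len(boxes), t + half + 1)
        neighbors = boxes[lo:hi]
        smoothed.append(
            tuple(sum(b[i] for b in neighbors) / len(neighbors) for i in range(4))
        )
    return smoothed


raw = [(0, 0, 10, 10), (3, 3, 13, 13), (6, 6, 16, 16)]
smooth = smooth_boxes(raw, window=3)
```

A wider window gives a steadier box track at the cost of lagging behind fast motion.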

8.

Invention application
VIDEO ANALYSIS WITH CONVOLUTIONAL ATTENTION RECURRENT NEURAL NETWORKS (Granted)

Publication (announcement) No.: US20170262995A1

Publication (announcement) date: 2017-09-14

Application No.: US15249280

Application date: 2016-08-26

Applicant: QUALCOMM Incorporated

Inventor: Zhenyang LI, Efstratios GAVVES, Mihir JAIN, Cornelis Gerardus Maria SNOEK

IPC: G06T7/00, G06N3/08, G06N3/04

CPC classification number: G06T7/11, G06K9/00335, G06K9/00718, G06N3/0445, G06N3/0454, G06N3/08, G06T7/0081, G06T2207/10004, G06T2207/20084

Abstract: A method of processing data within a convolutional attention recurrent neural network (RNN) includes generating a current multi-dimensional attention map. The current multi-dimensional attention map indicates areas of interest in a first frame from a sequence of spatio-temporal data. The method further includes receiving a multi-dimensional feature map. The method also includes convolving the current multi-dimensional attention map and the multi-dimensional feature map to obtain a multi-dimensional hidden state and a next multi-dimensional attention map. The method identifies a class of interest in the first frame based on the multi-dimensional hidden state and training data.
