-
Publication No.: US20250148768A1
Publication Date: 2025-05-08
Application No.: US18937628
Application Date: 2024-11-05
Applicant: NEC Laboratories America, Inc.
Inventor: Kai Li, Deep Patel, Renqiang Min, Wentao Bao
Abstract: Methods and systems for action detection include encoding a text feature of an input textual description of an action using a visual language model (VLM). A video feature of an input video is encoded using the VLM. The action in the video is recognized, based on the text feature and the video feature, to localize the action within the video. A person performing the action is located within the video using the VLM.
-
Publication No.: US11087174B2
Publication Date: 2021-08-10
Application No.: US16580497
Application Date: 2019-09-24
Applicant: NEC Laboratories America, Inc.
Inventor: Renqiang Min, Kai Li, Bing Bai, Hans Peter Graf
Abstract: A method is provided for visual inspection. The method includes learning, by a processor, group disentangled visual feature embedding vectors of input images. The input images include defective objects and defect-free objects. The method further includes generating, by the processor using a weight generation network, classification weights from visual features and semantic descriptions. Both the visual features and the semantic descriptions are for predicting defective and defect-free labels. The method also includes calculating, by the processor, a cosine similarity score between the classification weights and the group disentangled visual feature embedding vectors. The method additionally includes episodically training, by the processor, the weight generation network on the input images to update parameters of the weight generation network. The method further includes generating, by the processor using the trained weight generation network, a prediction of a test image as including any of defective objects and defect-free objects.
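The cosine-similarity scoring step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each label already has a generated classification weight vector and that the image embedding is a plain list of floats. The names `cosine_similarity`, `predict_label`, and the toy vectors are hypothetical and are not taken from the patent.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def predict_label(embedding, class_weights):
    """Score the embedding against every class weight vector and
    return the best-scoring label together with all scores."""
    scores = {
        label: cosine_similarity(embedding, weights)
        for label, weights in class_weights.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy values, for illustration only.
class_weights = {"defective": [1.0, 0.0], "defect-free": [0.0, 1.0]}
label, scores = predict_label([0.9, 0.1], class_weights)
```

Because cosine similarity ignores vector magnitude, the prediction depends only on the direction of the embedding relative to each weight vector.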
-
Publication No.: US20200097771A1
Publication Date: 2020-03-26
Application No.: US16580497
Application Date: 2019-09-24
Applicant: NEC Laboratories America, Inc.
Inventor: Renqiang Min, Kai Li, Bing Bai, Hans Peter Graf
Abstract: A method is provided for visual inspection. The method includes learning, by a processor, group disentangled visual feature embedding vectors of input images. The input images include defective objects and defect-free objects. The method further includes generating, by the processor using a weight generation network, classification weights from visual features and semantic descriptions. Both the visual features and the semantic descriptions are for predicting defective and defect-free labels. The method also includes calculating, by the processor, a cosine similarity score between the classification weights and the group disentangled visual feature embedding vectors. The method additionally includes episodically training, by the processor, the weight generation network on the input images to update parameters of the weight generation network. The method further includes generating, by the processor using the trained weight generation network, a prediction of a test image as including any of defective objects and defect-free objects.
-
Publication No.: US20240161473A1
Publication Date: 2024-05-16
Application No.: US18504469
Application Date: 2023-11-08
Applicant: NEC Laboratories America, Inc.
Inventor: Kai Li, Deep Patel, Erik Kruus, Renqiang Min
IPC: G06V10/774, G06V10/75, G06V20/40, G16H15/00
CPC classification number: G06V10/7753, G06V10/751, G06V20/44, G16H15/00
Abstract: Methods and systems for training a model include performing spatial augmentation on an unlabeled input video to generate spatially augmented video. Temporal augmentation is performed on the input video to generate temporally augmented video. Predictions are generated, using a model that was pre-trained on a labeled dataset, for the unlabeled input video, the spatially augmented video, and the temporally augmented video. Parameters of the model are adapted using the predictions while enforcing spatial consistency, temporal consistency, and historical consistency. The model may be used for action recognition in a healthcare context, with recognition results being used for determining whether patients are performing a rehabilitation exercise correctly.
-
Publication No.: US20240087179A1
Publication Date: 2024-03-14
Application No.: US18462703
Application Date: 2023-09-07
Applicant: NEC Laboratories America, Inc.
Inventor: Renqiang Min, Kai Li, Hans Peter Graf, Haomiao Ni
CPC classification number: G06T11/00, G06T3/0093, G06V20/46
Abstract: Methods and systems for training a model include training an encoder in an unsupervised fashion based on a backward latent flow between a reference frame and a driving frame taken from a same video. A diffusion model is trained that generates a video sequence responsive to an input image and a text condition, using the trained encoder to determine a latent flow sequence and occlusion map sequence of a labeled training video.
-
Publication No.: US20240054783A1
Publication Date: 2024-02-15
Application No.: US18449393
Application Date: 2023-08-14
Applicant: NEC Laboratories America, Inc.
Inventor: Kai Li, Renqiang Min, Haifeng Xia
IPC: G06V20/40, G06V10/82, G06V10/774, G06T7/246, G06V10/776
CPC classification number: G06V20/41, G06V10/82, G06V10/774, G06V20/46, G06T7/246, G06V10/776, G06T2207/10016, G06T2207/20084, G06T2207/20081
Abstract: Methods and systems for video processing include extracting flow features and appearance features from frames of a video stream. The flow features are processed using a flow model that is trained on a first set of training data. An output of the flow model is processed using a sub-network that is trained on the first set of training data and a second set of domain-specific training data to generate a flow parameter. The appearance features are processed using an appearance model that is trained on the first set of training data and that further processes the appearance features using the flow parameter, to classify the frames of the video stream. An action is performed responsive to the classified frames.
-
Publication No.: US20240087196A1
Publication Date: 2024-03-14
Application No.: US18463784
Application Date: 2023-09-08
Applicant: NEC Laboratories America, Inc.
Inventor: Renqiang Min, Kai Li, Shaobo Han, Hans Peter Graf, Changhao Shi
IPC: G06T11/60, G06T9/00, G06V10/764, G06V10/774
CPC classification number: G06T11/60, G06T9/002, G06V10/764, G06V10/774
Abstract: Methods and systems for image generation include generating a latent representation of an image, modifying the latent representation of the image based on a trained attribute classifier and a specified attribute input, and decoding the modified latent representation to generate an output image that matches the specified attribute input.
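One common way to modify a latent representation with a trained attribute classifier is gradient-based guidance. The sketch below assumes a linear logistic classifier over the latent vector and takes a few gradient-ascent steps on the log-likelihood of the requested attribute value. The abstract does not say which update rule or classifier the filing uses, so `edit_latent`, `attribute_probability`, the step size, and the toy values are all hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attribute_probability(z, w, b):
    """Probability that latent z has the attribute, under an assumed
    linear logistic classifier with weights w and bias b."""
    return sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b)


def edit_latent(z, w, b, target, step_size=0.5, num_steps=20):
    """Move z toward the requested attribute value (target is 0 or 1).

    For a logistic classifier, the gradient of the log-likelihood of
    `target` with respect to z is (target - p) * w.
    """
    z = list(z)  # work on a copy so the input latent is untouched
    for _ in range(num_steps):
        p = attribute_probability(z, w, b)
        z = [zi + step_size * (target - p) * wi for zi, wi in zip(z, w)]
    return z


# Hypothetical toy classifier and latent, for illustration only.
w, b = [1.0, -1.0], 0.0
z0 = [-1.0, 1.0]
z1 = edit_latent(z0, w, b, target=1)
```

In a full pipeline the edited latent would then be passed to the decoder to produce the output image. That decoding step is omitted here.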
-
Publication No.: US20240054782A1
Publication Date: 2024-02-15
Application No.: US18366931
Application Date: 2023-08-08
Applicant: NEC Laboratories America, Inc.
Inventor: Kai Li, Renqiang Min, Haifeng Xia
IPC: G06V20/40, G06V10/774
CPC classification number: G06V20/41, G06V20/46, G06V10/774, G06V20/48
Abstract: Methods and systems for video processing include enriching an input video feature from an input video frame set using a meta-action bank of video sub-actions to generate enriched features. Reinforced image representation is performed using reinforcement learning to compare support image frames and query image frames and determine an importance of the input video frame. A classification is performed on the input video frame based on the importance and the enriched features to generate a label. An action is performed responsive to the generated label.
-
Publication No.: US20240046606A1
Publication Date: 2024-02-08
Application No.: US18363175
Application Date: 2023-08-01
Applicant: NEC Laboratories America, Inc.
Inventor: Kai Li, Renqiang Min, Deep Patel, Erik Kruus, Xin Hu
IPC: G06V10/62, G06V20/40, G06V10/82, G06V10/774, G06V10/776, G06V10/77
CPC classification number: G06V10/62, G06V20/41, G06V20/46, G06V10/82, G06V10/774, G06V10/776, G06V10/7715
Abstract: Methods and systems for temporal action localization include processing a video stream to identify an action and a start time and a stop time for the action using a neural network model that separately processes information of appearance and motion modalities from the video stream using transformer branches that include a self-attention and a cross-attention between the appearance and motion modalities. An action is performed responsive to the identified action.
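The cross-attention between appearance and motion modalities can be sketched as standard scaled dot-product attention, with queries drawn from one modality and keys and values from the other. This is a minimal illustration that omits the learned projection matrices, multiple heads, and the self-attention branch. The function names and token values are hypothetical and do not come from the filing.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention.

    Each query attends over all keys; the output for that query is
    the attention-weighted average of the corresponding values.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        weights = softmax(scores)
        outputs.append([
            sum(wt * v[j] for wt, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Hypothetical toy tokens: appearance queries attend over motion tokens.
appearance_tokens = [[1.0, 1.0]]
motion_keys = [[1.0, 0.0], [1.0, 0.0]]
motion_values = [[0.0, 2.0], [2.0, 0.0]]
attended = cross_attention(appearance_tokens, motion_keys, motion_values)
```

Swapping the roles, with motion queries over appearance keys and values, gives the attention in the other direction.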
-
Publication No.: US20200097757A1
Publication Date: 2020-03-26
Application No.: US16580199
Application Date: 2019-09-24
Applicant: NEC Laboratories America, Inc.
Inventor: Renqiang Min, Kai Li, Bing Bai, Hans Peter Graf
Abstract: A computer-implemented method and system are provided for training a model for New Class Categorization (NCC) of a test image. The method includes decoupling, by a hardware processor, a feature extraction part from a classifier part of a deep classification model by reparametrizing learnable weight variables of the classifier part as a combination of learnable variables of the feature extraction part and of a classification weight generator of the classifier part. The method further includes training, by the hardware processor, the deep classification model to obtain a trained deep classification model by (i) learning the feature extraction part as a multiclass classification task, and (ii) episodically training the classifier part by learning a classification weight generator which outputs classification weights given a training image.