-
Publication number: US20150254834A1
Publication date: 2015-09-10
Application number: US14639536
Application date: 2015-03-05
Applicant: NEC Laboratories America, Inc.
Inventor: Manmohan Chandraker, Shiyu Song
IPC: G06T7/00
CPC classification number: G06T7/0071, G06T7/20, G06T7/579, G06T7/70, G06T2207/10016, G06T2207/10028
Abstract: Methods and systems for moving object localization include estimating a ground plane in a video frame based on a detected object within the video frame and monocular structure-from-motion (SFM) information; computing object pose for objects in the frame based on the SFM information using dense feature tracking; and determining a three-dimensional location for the detected object based on the estimated ground plane and the computed object pose.
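The last step of this abstract, placing a detected object in 3D from the estimated ground plane, can be illustrated by intersecting a pixel's viewing ray with that plane. The following is a minimal sketch under assumptions that the abstract does not state: a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`; a ground plane written as `normal . X = height` in camera coordinates; and the bottom-center pixel of the detection box taken as the ground contact point. The function name is hypothetical, and the object pose term from the abstract is left out.

```python
def backproject_to_ground(u, v, fx, fy, cx, cy, normal, height):
    """Intersect the viewing ray of pixel (u, v) with the plane normal . X = height.

    Illustrative only: all parameter names are assumptions, not taken from the patent.
    """
    # Ray direction in camera coordinates for a pinhole camera: K^-1 [u, v, 1]^T.
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)
    # Projection of the ray onto the plane normal; it must be positive for the
    # ray to meet the ground in front of the camera.
    denom = sum(n * r for n, r in zip(normal, ray))
    if denom <= 1e-9:
        raise ValueError("ray does not hit the ground plane in front of the camera")
    scale = height / denom
    return tuple(scale * r for r in ray)


# Camera 1.5 m above the ground, y-axis pointing down, so the normal is (0, 1, 0).
point = backproject_to_ground(640.0, 510.0, 1000.0, 1000.0, 640.0, 360.0,
                              (0.0, 1.0, 0.0), 1.5)
```

With these example numbers the contact pixel maps to a point about 10 m ahead of the camera. Because the depth is `height / denom`, small errors in the plane estimate change the recovered depth directly.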
-
Publication number: US20140139635A1
Publication date: 2014-05-22
Application number: US13858041
Application date: 2013-04-06
Applicant: NEC Laboratories America, Inc.
Inventor: Manmohan Chandraker, Shiyu Song
IPC: H04N13/02
CPC classification number: H04N13/204, G06T7/579, G06T7/74, G06T2207/30244, G06T2207/30252
Abstract: Systems and methods are disclosed for multithreaded navigation assistance by acquiring images with a single camera on-board a vehicle; using 2D-3D correspondences for continuous pose estimation; and combining the pose estimation with 2D-2D epipolar search to replenish 3D points.
-
Publication number: US20140132604A1
Publication date: 2014-05-15
Application number: US14073726
Application date: 2013-11-06
Applicant: NEC Laboratories America, Inc.
Inventor: Yingze Bao, Manmohan Chandraker, Yuanqing Lin, Silvio Savarese
IPC: G06T17/00
CPC classification number: G06T17/00, G06T3/00, G06T7/579, G06T2207/20076, G06T2207/20081
Abstract: A method to reconstruct a 3D model of an object includes receiving, with a processor, a set of training data including images of the object from various viewpoints; learning a prior comprising a mean shape describing a commonality of shapes across a category and a set of weighted anchor points encoding similarities between instances in appearance and spatial consistency; matching anchor points across instances to enable learning a mean shape for the category; and modeling the shape of an object instance as a warped version of a category mean, along with instance-specific details.
-
Publication number: US20130156327A1
Publication date: 2013-06-20
Application number: US13716294
Application date: 2012-12-17
Applicant: NEC LABORATORIES AMERICA, INC.
Inventor: Manmohan Chandraker, Kai Yu
IPC: G06K9/62
CPC classification number: G06K9/6267, G06T7/579, G06T2207/10016, G06T2207/10024, G06T2207/10152, G06T2207/30244
Abstract: A computer-implemented method for determining shape from differential motion with unknown reflectance includes deriving a general relation that relates spatial and temporal image derivatives to bidirectional reflectance distribution function (BRDF) derivatives, responsive to 3D points and relative camera poses from images and feature tracks of an object in motion under colocated and unknown directional light conditions; employing a rank deficiency in image sequences from the deriving for shape determinations, under predetermined multiple camera and lighting conditions, to eliminate BRDF terms; and recovering a surface depth for determining a shape of the object.
-
Publication number: US20250148757A1
Publication date: 2025-05-08
Application number: US18931681
Application date: 2024-10-30
Applicant: NEC Laboratories America, Inc.
Inventor: Jong-Chyi Su, Sparsh Garg, Samuel Schulter, Manmohan Chandraker, Mingfu Liang
Abstract: Systems and methods for a self-improving data engine for autonomous vehicles are presented. To train the self-improving data engine for autonomous vehicles (SIDE), multi-modality dense captioning (MMDC) models can detect unrecognized classes from diversified descriptions for input images. A vision-language model (VLM) can generate textual features from the diversified descriptions and image features from the images corresponding to the diversified descriptions. Curated features, including curated textual features and curated image features, can be obtained by comparing similarity scores between the textual features and top-ranked image features based on their likelihood scores. Annotations, including bounding boxes and labels, can be generated for the curated features by comparing the similarity scores of labels generated by a zero-shot classifier and the curated textual features. The SIDE can be trained using the curated features, annotations, and feedback.
-
Publication number: US20250148697A1
Publication date: 2025-05-08
Application number: US18936290
Application date: 2024-11-04
Applicant: NEC Laboratories America, Inc.
Inventor: Ziyu Jiang, Bingbing Zhuang, Manmohan Chandraker
IPC: G06T15/08
Abstract: Methods and systems include training a model for rendering a three-dimensional volume using a loss function that includes a depth loss term and a distribution loss term that regularize an output of the model to produce realistic scenarios. A simulated scenario is generated based on an original scenario, with the simulated scenario including a different position and pose relative to the original scenario in a three-dimensional (3D) scene that is generated by the model from the original scenario. A self-driving model is trained for an autonomous vehicle using the simulated scenario.
-
Publication number: US20250139527A1
Publication date: 2025-05-01
Application number: US18930402
Application date: 2024-10-29
Applicant: NEC Laboratories America, Inc.
IPC: G06N20/00
Abstract: Systems and methods for a self-improving model for agentic visual program synthesis are presented. An agent can be continuously trained using an optimal training tuple to perform a corrective action on a monitored entity, which in turn generates new input data for the training. To train the agent, an input question can be decomposed into vision model tasks to generate task outputs. The task outputs can be corrected based on feedback to obtain corrected task outputs. The optimal training tuple can be generated by comparing an optimal tuple threshold with a similarity score of the input image, the input question, and the corrected task outputs.
-
Publication number: US20250115250A1
Publication date: 2025-04-10
Application number: US18903538
Application date: 2024-10-01
Applicant: NEC Laboratories America, Inc.
Inventor: Bingbing Zhuang, Manmohan Chandraker, Di Liu
Abstract: Methods and systems for motion detection include performing a first prediction to predict voxel occupancy based on a sequence of input point clouds including a current point cloud and a set of previous point clouds. A second prediction is performed to predict voxel occupancy for the sequence of input point clouds using predicted voxel occupancy between the input point clouds. Motion detection is performed based on the completed voxel occupancy. An action is performed responsive to a detected motion.
-
Publication number: US12205324B2
Publication date: 2025-01-21
Application number: US17519894
Application date: 2021-11-05
Applicant: NEC Laboratories America, Inc.
Inventor: Bingbing Zhuang, Manmohan Chandraker
Abstract: A computer-implemented method for fusing geometrical and Convolutional Neural Network (CNN) relative camera pose is provided. The method includes receiving two images having different camera poses. The method further includes inputting the two images into a geometric solver branch to return, as a first solution, an estimated camera pose and an associated pose uncertainty value determined from a Jacobian of a reprojection error function. The method also includes inputting the two images into a CNN branch to return, as a second solution, a predicted camera pose and an associated pose uncertainty value. The method additionally includes fusing, by a processor device, the first solution and the second solution in a probabilistic manner using Bayes' rule to obtain a fused pose.
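The probabilistic fusion step can be illustrated with the textbook product of two Gaussian estimates. This is a minimal sketch, not the patented procedure: it assumes that each branch reports its pose as a small parameter vector with an independent per-component variance (a diagonal covariance), and that the parameters can be averaged linearly. The function name and the example values are hypothetical.

```python
def fuse_gaussian(mean_a, var_a, mean_b, var_b):
    """Fuse two independent Gaussian estimates component by component.

    Under Bayes' rule with Gaussian densities, precisions (inverse variances)
    add and the fused mean is the precision-weighted average of the inputs.
    """
    fused_mean, fused_var = [], []
    for ma, va, mb, vb in zip(mean_a, var_a, mean_b, var_b):
        precision = 1.0 / va + 1.0 / vb
        fused_var.append(1.0 / precision)
        fused_mean.append((ma / va + mb / vb) / precision)
    return fused_mean, fused_var


# Hypothetical example: the geometric branch is confident in the first
# component, the CNN branch in the second.
geo_mean, geo_var = [0.10, 0.50], [0.01, 1.00]
cnn_mean, cnn_var = [0.20, 0.30], [1.00, 0.01]
pose, variance = fuse_gaussian(geo_mean, geo_var, cnn_mean, cnn_var)
```

Each fused component ends up close to whichever branch has the lower variance for it, and the fused variance is never larger than either input variance.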
-
Publication number: US20240354336A1
Publication date: 2024-10-24
Application number: US18639500
Application date: 2024-04-18
Applicant: NEC Laboratories America, Inc.
IPC: G06F16/538, G06F16/532, G06F40/30, G06V10/40, G06V10/74, G06V10/94
CPC classification number: G06F16/538, G06F16/532, G06F40/30, G06V10/40, G06V10/761, G06V10/945
Abstract: Systems and methods are provided for identifying and retrieving semantically similar images from a database. Semantic analysis is performed on an input query utilizing a vision language model to identify semantic concepts associated with the input query. A preliminary set of images is retrieved from the database for the semantic concepts identified. Relevant concepts are extracted for the images with a tokenizer by comparing the images against a predefined label space. A ranked list of relevant concepts is generated based on occurrence frequency within the set. The preliminary set of images is refined by combining the input query with specific relevant concepts that the user selects from the ranked list. Additional semantic analysis is iteratively performed to retrieve additional sets of images semantically similar to the combined input query and selected concepts until a threshold condition is met.
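The ranking step, ordering relevant concepts by how often they occur in the preliminary image set, can be sketched with a plain frequency count. This is only an illustration under stated assumptions: the per-image concept lists are taken as already extracted, each concept is counted at most once per image, and ties are broken alphabetically. The function name and the sample data are hypothetical.

```python
from collections import Counter


def rank_concepts(image_concepts):
    """Rank concepts by the number of images in which they occur.

    image_concepts maps an image identifier to the concepts extracted for it.
    """
    counts = Counter()
    for concepts in image_concepts.values():
        # set() ensures a concept repeated within one image is counted once.
        counts.update(set(concepts))
    # Most frequent first; alphabetical order breaks ties deterministically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


preliminary_set = {
    "img_001": ["car", "road"],
    "img_002": ["car", "tree"],
    "img_003": ["car", "road", "road"],
}
ranked = rank_concepts(preliminary_set)
# ranked == [("car", 3), ("road", 2), ("tree", 1)]
```

A user would then pick concepts from `ranked` to combine with the original query for the next retrieval round.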