HIGH ACCURACY MONOCULAR MOVING OBJECT LOCALIZATION
    131.
    Invention Application (Granted)

    Publication No.: US20150254834A1

    Publication Date: 2015-09-10

    Application No.: US14639536

    Filing Date: 2015-03-05

    Abstract: Methods and systems for moving object localization include estimating a ground plane in a video frame based on a detected object within the video frame and monocular structure-from-motion (SFM) information; computing object pose for objects in the frame based on the SFM information using dense feature tracking; and determining a three-dimensional location for the detected object based on the estimated ground plane and the computed object pose.
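
    The abstract's final step, placing a detected object in 3D by combining a ground plane with image evidence, can be illustrated with a minimal sketch: back-project the bottom-center pixel of the detection and intersect that ray with the estimated ground plane. The intrinsics K, plane normal n, camera height h, and bounding box below are made-up values, and the SFM-based ground-plane estimation and dense-feature object pose computation are assumed to have already run upstream.

```python
import numpy as np

# Illustrative camera intrinsics and ground plane (assumed values, not from the patent).
K = np.array([[720.0, 0.0, 640.0],
              [0.0, 720.0, 360.0],
              [0.0, 0.0, 1.0]])
n = np.array([0.0, 1.0, 0.0])   # ground-plane normal in camera coordinates
h = 1.5                         # camera height above the ground plane (meters)

def localize_on_ground(bbox, K, n, h):
    """Back-project the bottom-center of a 2D detection and intersect the
    viewing ray with the ground plane n.X = h to get a 3D object location."""
    x1, y1, x2, y2 = bbox
    foot = np.array([(x1 + x2) / 2.0, y2, 1.0])  # bottom-center pixel, homogeneous
    ray = np.linalg.inv(K) @ foot                # viewing ray direction
    depth = h / (n @ ray)                        # scale at which the ray meets the plane
    return depth * ray                           # 3D point in camera coordinates

print(localize_on_ground((600, 300, 700, 500), K, n, h))
```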


    REAL-TIME MONOCULAR STRUCTURE FROM MOTION
    132.
    Invention Application (Pending, Published)

    Publication No.: US20140139635A1

    Publication Date: 2014-05-22

    Application No.: US13858041

    Filing Date: 2013-04-06

    Abstract: Systems and methods are disclosed for multithreaded navigation assistance from images acquired with a single camera on-board a vehicle, by using 2D-3D correspondences for continuous pose estimation and combining the pose estimation with 2D-2D epipolar search to replenish 3D points.
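
    A minimal sketch of the two ingredients named above, on synthetic data with OpenCV: cv2.solvePnPRansac gives continuous pose from 2D-3D correspondences, and an essential-matrix (2D-2D epipolar) step followed by triangulation replenishes 3D points. The intrinsics, point cloud, and poses are fabricated for the example; this is not the patent's multithreaded pipeline.

```python
import numpy as np
import cv2

# All values below are illustrative, not from the patent.
K = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
dist = np.zeros(4)
pts3d = np.random.uniform([-2, -2, 4], [2, 2, 8], (50, 3))

def project(pts, rvec, tvec):
    img, _ = cv2.projectPoints(pts, rvec, tvec, K, dist)
    return img.reshape(-1, 2)

# Current-frame observations of known 3D points (synthetic ground truth).
pts2d = project(pts3d, np.array([0.0, 0.05, 0.0]), np.array([0.1, 0.0, 0.2]))

# Continuous pose estimation from 2D-3D correspondences (PnP + RANSAC).
ok, rvec, tvec, inliers = cv2.solvePnPRansac(pts3d, pts2d, K, dist)

# Replenish 3D points: 2D-2D epipolar search between two frames, then
# triangulate the matched features into new landmarks.
pts2d_prev = project(pts3d, np.zeros(3), np.zeros(3))
E, _ = cv2.findEssentialMat(pts2d_prev, pts2d, K)
_, R, t, _ = cv2.recoverPose(E, pts2d_prev, pts2d, K)
P0 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P1 = K @ np.hstack([R, t])
new_pts_h = cv2.triangulatePoints(P0, P1, pts2d_prev.T, pts2d.T)
new_landmarks = (new_pts_h[:3] / new_pts_h[3]).T
print("pose:", rvec.ravel(), tvec.ravel(), "replenished:", new_landmarks.shape)
```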


    Semantic Dense 3D Reconstruction
    133.
    Invention Application (Granted)

    Publication No.: US20140132604A1

    Publication Date: 2014-05-15

    Application No.: US14073726

    Filing Date: 2013-11-06

    Abstract: A method to reconstruct 3D model of an object includes receiving with a processor a set of training data including images of the object from various viewpoints; learning a prior comprised of a mean shape describing a commonality of shapes across a category and a set of weighted anchor points encoding similarities between instances in appearance and spatial consistency; matching anchor points across instances to enable learning a mean shape for the category; and modeling the shape of an object instance as a warped version of a category mean, along with instance-specific details.
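
    In the same spirit, the toy sketch below builds a category mean shape from aligned instances and models a new instance as that mean warped by smoothly interpolated anchor-point offsets. The point sets, anchor selection, and Gaussian interpolation are stand-ins; the patent learns the anchors and their weights from appearance and spatial consistency rather than choosing them at random.

```python
import numpy as np

# Toy data: aligned shapes for one category, each an (N, 3) point set
# (illustrative stand-in for real training reconstructions).
rng = np.random.default_rng(0)
instances = [rng.normal(0.0, 1.0, (200, 3)) + 0.1 * i for i in range(5)]

# Category prior: mean shape over the aligned instances.
mean_shape = np.mean(np.stack(instances), axis=0)

# Anchor points: a subset of the mean shape with per-anchor weights
# (uniform here; the patent encodes instance similarities in these weights).
anchor_idx = rng.choice(len(mean_shape), size=20, replace=False)
anchor_weights = np.ones(len(anchor_idx))

def warp_mean_to_instance(mean_shape, anchor_idx, anchor_offsets, sigma=0.5):
    """Warp the category mean by smoothly interpolating anchor displacements
    with Gaussian weights (a simple stand-in for the learned warp)."""
    anchors = mean_shape[anchor_idx]
    d = np.linalg.norm(mean_shape[:, None, :] - anchors[None, :, :], axis=-1)
    w = np.exp(-(d ** 2) / (2 * sigma ** 2)) * anchor_weights
    w /= w.sum(axis=1, keepdims=True) + 1e-9
    return mean_shape + w @ anchor_offsets

# Instance-specific detail: observed offsets at the anchor points.
offsets = instances[0][anchor_idx] - mean_shape[anchor_idx]
reconstruction = warp_mean_to_instance(mean_shape, anchor_idx, offsets)
print(reconstruction.shape)
```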


    Shape from Differential Motion with Unknown Reflectance
    134.
    Invention Application (Granted)

    Publication No.: US20130156327A1

    Publication Date: 2013-06-20

    Application No.: US13716294

    Filing Date: 2012-12-17

    Abstract: A computer-implemented method for determining shape from differential motion with unknown reflectance includes deriving a general relation that relates spatial and temporal image derivatives to bidirectional reflectance distribution function (BRDF) derivatives, responsive to 3D points and relative camera poses obtained from images and feature tracks of an object in motion under colocated and unknown directional light conditions; employing a rank deficiency in the derived relations over image sequences, under predetermined multiple camera and lighting conditions, to eliminate the BRDF terms; and recovering a surface depth for determining a shape of the object.
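
    The elimination step has a generic linear-algebra form: stack the derivative constraints, then project them onto the left null space of the BRDF-coefficient block so the unknown reflectance terms drop out. The sketch below uses random matrices purely to show that mechanism; the patent's actual relation couples specific spatial and temporal image derivatives.

```python
import numpy as np

# Illustrative only: constraints of the form A_depth @ x + A_brdf @ y = b,
# where x holds depth-related unknowns and y holds nuisance BRDF terms.
rng = np.random.default_rng(1)
n_frames, n_depth, n_brdf = 12, 3, 2

A_depth = rng.normal(size=(n_frames, n_depth))   # coefficients on depth terms
A_brdf = rng.normal(size=(n_frames, n_brdf))     # coefficients on BRDF terms
b = rng.normal(size=n_frames)                    # observed derivative terms

# Rank deficiency: project onto the left null space of A_brdf so that the
# unknown reflectance terms are annihilated, leaving equations in depth alone.
U, S, Vt = np.linalg.svd(A_brdf, full_matrices=True)
null_basis = U[:, n_brdf:]                       # directions with null_basis.T @ A_brdf = 0
A_reduced = null_basis.T @ A_depth               # BRDF-free constraints on depth
b_reduced = null_basis.T @ b

depth_params, *_ = np.linalg.lstsq(A_reduced, b_reduced, rcond=None)
print(depth_params)
```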


    SELF-IMPROVING DATA ENGINE FOR AUTONOMOUS VEHICLES

    Publication No.: US20250148757A1

    Publication Date: 2025-05-08

    Application No.: US18931681

    Filing Date: 2024-10-30

    Abstract: Systems and methods for a self-improving data engine for autonomous vehicles are presented. To train the self-improving data engine for autonomous vehicles (SIDE), multi-modality dense captioning (MMDC) models can detect unrecognized classes from diversified descriptions for input images. A vision-language model (VLM) can generate textual features from the diversified descriptions and image features from the images corresponding to those descriptions. Curated features, including curated textual features and curated image features, can be obtained by comparing similarity scores between the textual features and top-ranked image features based on their likelihood scores. Annotations, including bounding boxes and labels, can be generated for the curated features by comparing the similarity scores of labels generated by a zero-shot classifier against the curated textual features. The SIDE can be trained using the curated features, annotations, and feedback.
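
    A rough sketch of the curation and zero-shot labeling steps, using random vectors as stand-ins for VLM textual and image features and for a CLIP-style label embedding space (all names and thresholds below are assumptions): pairs whose text and image features agree are kept, and labels are assigned by comparing curated textual features against the label space.

```python
import numpy as np

rng = np.random.default_rng(2)

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Stand-ins for VLM outputs: textual features for diversified captions and
# image features for the corresponding images (placeholders, not real encoders).
text_feats = normalize(rng.normal(size=(8, 512)))
image_feats = normalize(rng.normal(size=(8, 512)))
label_space = ["car", "pedestrian", "construction barrel", "scooter"]
label_feats = normalize(rng.normal(size=(len(label_space), 512)))

# Curation: keep caption/image pairs whose features agree strongly.
pair_scores = np.sum(text_feats * image_feats, axis=1)
keep = pair_scores > 0.0                       # illustrative threshold
curated_text, curated_img = text_feats[keep], image_feats[keep]

# Zero-shot labeling: compare curated textual features against the label space.
label_scores = curated_text @ label_feats.T
labels = [label_space[i] for i in label_scores.argmax(axis=1)]
print(list(zip(labels, np.round(pair_scores[keep], 3))))
```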

    PHOTOREALISTIC TRAINING DATA AUGMENTATION

    Publication No.: US20250148697A1

    Publication Date: 2025-05-08

    Application No.: US18936290

    Filing Date: 2024-11-04

    Abstract: Methods and systems include training a model for rendering a three-dimensional volume using a loss function that includes a depth loss term and a distribution loss term that regularize an output of the model to produce realistic scenarios. A simulated scenario is generated based on an original scenario, with the simulated scenario including a different position and pose relative to the original scenario in a three-dimensional (3D) scene that is generated by the model from the original scenario. A self-driving model is trained for an autonomous vehicle using the simulated scenario.
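
    The abstract only states that the objective contains a depth loss term and a distribution loss term; one plausible shape for such an objective is sketched below, with a photometric term, a masked depth-supervision term, and a distortion-style distribution regularizer on ray weights. The specific loss forms and weights are assumptions.

```python
import numpy as np

def depth_loss(pred_depth, ref_depth, mask):
    """Penalize rendered depth that disagrees with reference depth where valid
    (an assumed form; the abstract only says a depth term is used)."""
    return np.mean(mask * (pred_depth - ref_depth) ** 2)

def distribution_loss(weights, midpoints):
    """Distortion-style regularizer encouraging ray weights to concentrate
    (one common choice of 'distribution loss'; assumed here)."""
    diff = np.abs(midpoints[:, :, None] - midpoints[:, None, :])
    return np.mean(np.sum(weights[:, :, None] * weights[:, None, :] * diff, axis=(1, 2)))

def total_loss(rgb_pred, rgb_gt, pred_depth, ref_depth, mask, weights, mids,
               lambda_depth=0.1, lambda_dist=0.01):
    photometric = np.mean((rgb_pred - rgb_gt) ** 2)
    return (photometric
            + lambda_depth * depth_loss(pred_depth, ref_depth, mask)
            + lambda_dist * distribution_loss(weights, mids))

# Toy shapes: 4 rays, 16 samples per ray, RGB targets.
rng = np.random.default_rng(3)
loss = total_loss(rng.random((4, 3)), rng.random((4, 3)),
                  rng.random(4), rng.random(4), np.ones(4),
                  rng.random((4, 16)), np.sort(rng.random((4, 16)), axis=1))
print(loss)
```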

    SELF-IMPROVING MODELS FOR AGENTIC VISUAL PROGRAM SYNTHESIS

    Publication No.: US20250139527A1

    Publication Date: 2025-05-01

    Application No.: US18930402

    Filing Date: 2024-10-29

    Abstract: Systems and methods for a self-improving model for agentic visual program synthesis are presented. An agent can be continuously trained using an optimal training tuple to perform a corrective action on a monitored entity, which in turn generates new input data for training. To train the agent, an input question can be decomposed into vision model tasks to generate task outputs. The task outputs can be corrected based on feedback to obtain corrected task outputs. The optimal training tuple can be generated by comparing an optimal tuple threshold with a similarity score of the input image, the input question, and the corrected task outputs.
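
    The selection logic, keeping only training tuples whose similarity score clears a threshold, can be sketched with placeholder components. The decompose, run_task, apply_feedback, and similarity functions below are hypothetical stand-ins for the vision models and scoring described in the abstract.

```python
from dataclasses import dataclass

@dataclass
class TrainingTuple:
    image_id: str
    question: str
    corrected_outputs: list

def decompose(question):
    """Placeholder decomposition of a question into vision-model subtasks."""
    return [("detect", "traffic light"), ("classify_state", "traffic light")]

def run_task(image_id, task):
    """Placeholder vision-model call; returns a raw task output."""
    return {"task": task, "output": "green", "confidence": 0.4}

def apply_feedback(output, feedback):
    """Overwrite a task output with corrective feedback when provided."""
    return {**output, "output": feedback.get(output["task"], output["output"])}

def similarity(image_id, question, outputs):
    """Placeholder similarity score between image, question, and outputs."""
    return 0.82

def build_training_tuple(image_id, question, feedback, threshold=0.75):
    outputs = [run_task(image_id, t) for t in decompose(question)]
    corrected = [apply_feedback(o, feedback) for o in outputs]
    score = similarity(image_id, question, corrected)
    # Only tuples whose agreement score clears the threshold are kept
    # to continue training the agent.
    if score >= threshold:
        return TrainingTuple(image_id, question, corrected)
    return None

print(build_training_tuple("frame_0001", "Is the traffic light red?",
                           {("classify_state", "traffic light"): "red"}))
```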

    Learning to fuse geometrical and CNN relative camera pose via uncertainty

    Publication No.: US12205324B2

    Publication Date: 2025-01-21

    Application No.: US17519894

    Filing Date: 2021-11-05

    Abstract: A computer-implemented method for fusing geometrical and Convolutional Neural Network (CNN) relative camera pose is provided. The method includes receiving two images having different camera poses. The method further includes inputting the two images into a geometric solver branch to return, as a first solution, an estimated camera pose and an associated pose uncertainty value determined from a Jacobian of a reprojection error function. The method also includes inputting the two images into a CNN branch to return, as a second solution, a predicted camera pose and an associated pose uncertainty value. The method additionally includes fusing, by a processor device, the first solution and the second solution in a probabilistic manner using Bayes' rule to obtain a fused pose.
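
    When both branches are treated as Gaussian estimates over a local pose parameterization, Bayes' rule fusion reduces to an information-weighted average: the fused covariance is the inverse of the summed information matrices, and the fused mean weights each estimate by its information. The sketch below shows that combination on toy 6-DoF vectors; the covariances stand in for the Jacobian-derived and CNN-predicted uncertainties, and treating the rotation parameters additively is a small-angle approximation.

```python
import numpy as np

def fuse_gaussian_poses(pose_geo, cov_geo, pose_cnn, cov_cnn):
    """Fuse two pose estimates with uncertainties by multiplying Gaussians
    (Bayes' rule under a Gaussian assumption)."""
    info_geo = np.linalg.inv(cov_geo)
    info_cnn = np.linalg.inv(cov_cnn)
    cov_fused = np.linalg.inv(info_geo + info_cnn)
    pose_fused = cov_fused @ (info_geo @ pose_geo + info_cnn @ pose_cnn)
    return pose_fused, cov_fused

# Toy 6-DoF pose vectors (3 rotation, 3 translation parameters); the
# covariances below are illustrative, not values from either branch.
pose_geo = np.array([0.01, 0.02, 0.00, 0.5, 0.0, 1.0])
pose_cnn = np.array([0.00, 0.03, 0.01, 0.6, 0.1, 0.9])
cov_geo = np.diag([1e-4, 1e-4, 1e-4, 1e-2, 1e-2, 1e-2])
cov_cnn = np.diag([1e-3, 1e-3, 1e-3, 5e-2, 5e-2, 5e-2])

print(fuse_gaussian_poses(pose_geo, cov_geo, pose_cnn, cov_cnn)[0])
```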

    MULTIMODAL SEMANTIC ANALYSIS AND IMAGE RETRIEVAL

    Publication No.: US20240354336A1

    Publication Date: 2024-10-24

    Application No.: US18639500

    Filing Date: 2024-04-18

    Abstract: Systems and methods are provided for identifying and retrieving semantically similar images from a database. Semantic analysis is performed on an input query using a vision language model to identify semantic concepts associated with the query. A preliminary set of images is retrieved from the database for the identified semantic concepts. Relevant concepts are extracted for these images with a tokenizer by comparing the images against a predefined label space. A ranked list of relevant concepts is generated based on occurrence frequency within the set. The preliminary set of images is refined when the user selects specific relevant concepts from the ranked list, which are combined with the input query. Additional semantic analysis is performed iteratively to retrieve further sets of images semantically similar to the combined input query and selected concepts until a threshold condition is met.
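
    The retrieve / rank-concepts / refine loop can be sketched with placeholder embeddings and tags (the encoder, database, and label space below are fabricated): images are retrieved by cosine similarity, concepts in the preliminary set are ranked by frequency, and a selected concept is appended to the query for the next retrieval round. Only one refinement iteration is shown rather than the full threshold-driven loop.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(4)

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def embed_text(text):
    """Placeholder text encoder (a real system would use a vision-language
    model); maps the text to a deterministic pseudo-random embedding."""
    return normalize(np.random.default_rng(abs(hash(text)) % (2 ** 32)).normal(size=256))

# Fake database: image embeddings plus per-image concept tags from a label space.
label_space = ["sunset", "beach", "mountain", "city", "dog", "snow"]
db_embeds = normalize(rng.normal(size=(100, 256)))
db_concepts = [list(rng.choice(label_space, size=2, replace=False)) for _ in range(100)]

def retrieve(query, top_k=10):
    sims = db_embeds @ embed_text(query)
    return np.argsort(-sims)[:top_k]

def refine(query, selected_concepts):
    return query + " " + " ".join(selected_concepts)

query = "quiet evening by the water"
hits = retrieve(query)
# Rank concepts in the preliminary set by occurrence frequency.
ranked = Counter(c for i in hits for c in db_concepts[i]).most_common()
# The user would pick concepts; here the top-ranked one is taken as a stand-in.
refined_query = refine(query, [ranked[0][0]])
refined_hits = retrieve(refined_query)
print(ranked[:3], refined_hits[:5])
```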
