IMAGE AND LIDAR ADAPTIVE TRANSFORMER FOR FUSION-BASED PERCEPTION

    Publication No.: US20250060481A1

    Publication Date: 2025-02-20

    Application No.: US18452279

    Application Date: 2023-08-18

    Abstract: An apparatus includes a memory and processing circuitry in communication with the memory. The processing circuitry is configured to apply, based on a positional encoding model, a first feature conditioning module to a set of bird's eye view (BEV) position data features corresponding to position data to generate a set of conditioned BEV position data features, and apply, based on the positional encoding model, a second feature conditioning module to a set of perspective image data features corresponding to image data to generate a set of conditioned perspective image data features. The processing circuitry is also configured to generate, based on the positional encoding model, the set of conditioned BEV position data features, and the set of conditioned perspective image data features, a weighted summation. Additionally, the processing circuitry is configured to generate, based on the weighted summation, a set of BEV image data features.
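
    The weighted summation of the two conditioned feature sets can be illustrated with a minimal sketch. The shapes, the softmax gating, and the function name `weighted_fusion` are illustrative assumptions, not the claimed implementation:

    ```python
    import numpy as np

    def weighted_fusion(bev_feats, img_feats, w_logits):
        """Combine two conditioned feature sets with softmax-normalized
        modality weights (a hypothetical stand-in for the learned weighting)."""
        w = np.exp(w_logits) / np.exp(w_logits).sum()  # softmax over the two modalities
        return w[0] * bev_feats + w[1] * img_feats

    bev = np.ones((4, 4))        # conditioned BEV position-data features
    img = np.full((4, 4), 3.0)   # conditioned perspective image-data features
    logits = np.zeros(2)         # equal logits -> equal weights -> plain average
    fused = weighted_fusion(bev, img, logits)
    print(fused[0, 0])  # 2.0 with equal weights
    ```

    With learned (unequal) logits, the summation would lean toward whichever modality the model weights more heavily.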

    GATED LOAD BALANCING FOR UNCERTAINTY AWARE CAMERA-LIDAR FUSION

    Publication No.: US20250058789A1

    Publication Date: 2025-02-20

    Application No.: US18452292

    Application Date: 2023-08-18

    Abstract: A system for processing image data and position data, the system comprising: a memory for storing the image data and the position data; and processing circuitry in communication with the memory. The processing circuitry is configured to: apply a first encoder to extract, from the image data, a first set of features; apply a first decoder to determine, based on the first set of features, a first uncertainty score. Additionally, the processing circuitry is configured to apply a second encoder to extract, from the position data, a second set of features; apply a second decoder to determine, based on the second set of features, a second uncertainty score; and fuse the first set of features and the second set of features based on the first uncertainty score and the second uncertainty score.
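
    One simple way to realize uncertainty-gated fusion is to weight each modality's features inversely to its uncertainty score. This is a sketch under that assumption; the inverse-uncertainty gate and `eps` smoothing term are illustrative, not the patented mechanism:

    ```python
    import numpy as np

    def gated_fuse(f_img, f_pos, u_img, u_pos, eps=1e-6):
        """Fuse camera and lidar features, down-weighting the branch
        with the higher uncertainty score (weights normalized to sum to 1)."""
        w_img = 1.0 / (u_img + eps)
        w_pos = 1.0 / (u_pos + eps)
        total = w_img + w_pos
        return (w_img * f_img + w_pos * f_pos) / total

    f_cam = np.array([2.0, 2.0])    # features from the image encoder
    f_lidar = np.array([4.0, 4.0])  # features from the position-data encoder
    fused = gated_fuse(f_cam, f_lidar, u_img=1.0, u_pos=1.0)
    print(fused)  # equal uncertainty -> plain average: [3. 3.]
    ```

    Raising `u_img` relative to `u_pos` shifts the fused result toward the lidar branch, which is the load-balancing behavior the title describes.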

    ROBUST FEATURE EXTRACTION FROM OCCLUDED IMAGE FRAMES FOR VEHICLE APPLICATIONS

    Publication No.: US20240395007A1

    Publication Date: 2024-11-28

    Application No.: US18321520

    Application Date: 2023-05-22

    Abstract: This disclosure provides systems, methods, and devices for vehicle driving assistance systems that support image processing. In a first aspect, a method of image processing includes receiving a plurality of image frames by a computing device and using machine learning models to identify corrupted or occluded image frames. A first machine learning model may identify corrupted image frames, while a second machine learning model may identify partially occluded image frames. The method may further include generating updated versions of image frames captured by vehicle cameras, such as based on feature vectors from the first and second machine learning models. The feature vectors may be fused and provided to a third machine learning model to generate updated versions of occluded image frames. The method may further include determining vehicle control instructions based on the updated versions. Other aspects and features are also claimed and described.

    OCCLUDED OBJECT DETECTION AND CORRECTION FOR VEHICLE APPLICATIONS

    Publication No.: US20240371168A1

    Publication Date: 2024-11-07

    Application No.: US18311784

    Application Date: 2023-05-03

    Abstract: This disclosure provides systems, methods, and devices for vehicle driving assistance systems that support image processing. In a first aspect, a method is provided that includes generating a top view image of an object using a plurality of images captured from different views. The method involves determining portions of the images that depict the object and generating novel views of the object from at least one novel view not present within the plurality of images. Corresponding portions containing an occluded view and an unobstructed view of the object are identified, and corrected views for occluded views are determined based on corresponding unobstructed views using a machine learning model. A top view image may then be generated based on the corrected views. The invention enables improved visibility for autonomous driving systems in situations where objects are occluded or partially obstructed. Other aspects and features are also claimed and described.

    FEATURE FUSION FOR NEAR FIELD AND FAR FIELD IMAGES FOR VEHICLE APPLICATIONS

    Publication No.: US20240371147A1

    Publication Date: 2024-11-07

    Application No.: US18313287

    Application Date: 2023-05-05

    Abstract: This disclosure provides systems, methods, and devices for vehicle driving assistance systems that support image processing. In a first aspect, a method of fusing features from near-field images and far-field images is provided that includes determining feature vectors and spatial locations for received images from near-field and far-field image sensors. A first set of weighted feature vectors may be determined based on spatial locations of the features and a second set of weighted feature vectors may be determined based on corresponding features between the feature vectors. Fused feature vectors may then be determined based on the weighted feature vectors, such as using a transformer attention process trained to select and combine features from both sets of weighted feature vectors. Vehicle control instructions may be determined based on the fused feature vectors. Other aspects and features are also claimed and described.
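
    The "transformer attention process" that selects and combines features from the two weighted sets can be sketched as single-head scaled dot-product attention, with near-field features querying far-field features. All shapes and the final averaging step are assumptions for illustration only:

    ```python
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def attention_fuse(near, far):
        """Near-field feature vectors attend over far-field feature vectors
        (scaled dot-product), then the attended result is averaged with near."""
        d = near.shape[-1]
        scores = near @ far.T / np.sqrt(d)       # similarity between the two sets
        attended = softmax(scores, axis=-1) @ far
        return 0.5 * (near + attended)           # hypothetical combination rule

    near = np.eye(3)        # weighted near-field feature vectors (toy values)
    far = np.eye(3) * 2.0   # weighted far-field feature vectors (toy values)
    fused = attention_fuse(near, far)
    print(fused.shape)  # (3, 3)
    ```

    In a trained model the query/key/value projections would be learned; here they are identity for brevity.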

    ASYNCHRONOUS MULTIMODAL FEATURE FUSION

    Publication No.: US20250157204A1

    Publication Date: 2025-05-15

    Application No.: US18509026

    Application Date: 2023-11-14

    Abstract: This disclosure provides systems, methods, and devices for vehicle driving assistance systems that support image processing. In a first aspect, a method of image processing includes receiving image data from an image sensor; receiving ranging data from a ranging sensor; embedding first spatial features of the image data with first temporal information associated with the image data; embedding second spatial features of the ranging data with second temporal information associated with the ranging data; determining first bird's-eye-view (BEV) features based on the first spatial features embedded with first temporal information; determining second BEV features based on the second spatial features embedded with second temporal information; and determining, based on the first and second BEV features, a feature set for processing by a transformer network. The feature set includes at least a portion of both the first and second BEV features. Other aspects and features are also claimed and described.
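
    Embedding spatial features with temporal information can be sketched with a sinusoidal timestamp encoding added to each feature vector, in the spirit of transformer positional encodings. The encoding scheme and the function `temporal_embed` are illustrative assumptions, not the claimed method:

    ```python
    import numpy as np

    def temporal_embed(features, t):
        """Add a sinusoidal encoding of timestamp t (seconds) to spatial
        features, so asynchronous sensor streams carry their capture time."""
        d = features.shape[-1]
        i = np.arange(d)
        freq = 1.0 / (10000 ** (2 * (i // 2) / d))  # geometric frequency ladder
        enc = np.where(i % 2 == 0, np.sin(t * freq), np.cos(t * freq))
        return features + enc

    img_feats = np.zeros((4, 8))               # spatial features, image branch
    embedded = temporal_embed(img_feats, t=0.25)
    print(embedded.shape)  # (4, 8)
    ```

    Each modality would be embedded with its own timestamps before the BEV projection, letting the downstream transformer reason about the time offset between camera and ranging data.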

    EARLY FUSION OF NEURAL RAY GRAPH NETWORKS FOR MULTI-VIEW CAMERA SETUPS

    Publication No.: US20250157178A1

    Publication Date: 2025-05-15

    Application No.: US18506507

    Application Date: 2023-11-10

    Abstract: This disclosure provides systems, methods, and devices for vehicle driving assistance systems that support image processing. In a first aspect, an image processing method includes receiving image frames; determining an ordered set of neural rays based on the image frames; determining a graph network that represents each neural ray of the ordered set of neural rays as a sequence of points; and determining a feature set based on the graph network. Each neural ray of the ordered set of neural rays represents three-dimensional positions of pixels of an image frame. Each point on the graph network is associated with a node of a plurality of nodes of the graph network. The feature set includes features of each of the image frames. Other aspects and features are also claimed and described.

    CYLINDRICAL PARTITIONING FOR THREE-DIMENSIONAL (3D) PERCEPTION OPERATIONS

    Publication No.: US20250139882A1

    Publication Date: 2025-05-01

    Application No.: US18498995

    Application Date: 2023-10-31

    Abstract: In some aspects of the disclosure, an apparatus includes a processing system that includes one or more processors and one or more memories coupled to the one or more processors. The processing system is configured to receive sensor data associated with a scene and to generate a cylindrical representation associated with the scene. The processing system is further configured to modify the cylindrical representation based on detecting a feature of the cylindrical representation being included in a first region of the cylindrical representation. Modifying the cylindrical representation includes relocating the feature from the first region to a second region that is different than the first region. The processing system is further configured to perform, based on the modified cylindrical representation, one or more three-dimensional (3D) perception operations associated with the scene.
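
    The first step, generating a cylindrical representation of the scene, amounts to converting Cartesian sensor points to (radius, azimuth, height) coordinates before binning. A minimal conversion sketch (the partitioning and feature-relocation logic is omitted):

    ```python
    import numpy as np

    def to_cylindrical(points):
        """Map an (N, 3) array of Cartesian (x, y, z) points to
        cylindrical (radius, azimuth, z) coordinates."""
        x, y, z = points[:, 0], points[:, 1], points[:, 2]
        r = np.hypot(x, y)           # distance from the vertical axis
        theta = np.arctan2(y, x)     # azimuth angle in radians, (-pi, pi]
        return np.stack([r, theta, z], axis=1)

    pts = np.array([[1.0, 0.0, 0.5],
                    [0.0, 2.0, -1.0]])
    cyl = to_cylindrical(pts)
    print(cyl[1])  # radius 2.0, azimuth pi/2, z -1.0
    ```

    Cylindrical bins keep angular resolution uniform with distance, which is one common motivation for this representation in lidar perception.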

    TRAINING A NEURAL NETWORK TO GENERATE A DENSE DEPTH MAP FROM IMAGE AND LIDAR DATA

    Publication No.: US20250095173A1

    Publication Date: 2025-03-20

    Application No.: US18467035

    Application Date: 2023-09-14

    Abstract: An example device for training a neural network includes a memory configured to store a neural network model for the neural network; and a processing system comprising one or more processors implemented in circuitry, the processing system being configured to: extract image features from an image of an area, the image features representing objects in the area; extract point cloud features from a point cloud representation of the area, the point cloud features representing the objects in the area; add Gaussian noise to a ground truth depth map for the area to generate a noisy ground truth depth map, the ground truth depth map representing accurate positions of the objects in the area; and train the neural network using the image features, the point cloud features, and the noisy ground truth depth map to generate a depth map.
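
    The noisy-ground-truth step is straightforward to sketch: perturb the depth map with zero-mean Gaussian noise before it is used as a training target. The `sigma` value and seeded generator are illustrative assumptions:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)  # seeded for reproducibility in this sketch

    def noisy_ground_truth(depth_map, sigma=0.1):
        """Add zero-mean Gaussian noise (std = sigma, in the depth map's
        units) to a ground-truth depth map, as a training-time perturbation."""
        return depth_map + rng.normal(0.0, sigma, size=depth_map.shape)

    gt = np.full((8, 8), 5.0)        # toy ground-truth depth map, meters
    noisy = noisy_ground_truth(gt)
    print(noisy.shape)  # (8, 8)
    ```

    Training against the perturbed map rather than the clean one can make the learned depth network less prone to overfitting exact sensor readings.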
