STOCHASTIC DYNAMIC FIELD OF VIEW FOR MULTI-CAMERA BIRD’S EYE VIEW PERCEPTION IN AUTONOMOUS DRIVING

    Publication No.: US20250156997A1

    Publication date: 2025-05-15

    Application No.: US18505923

    Application date: 2023-11-09

    Abstract: An apparatus for processing image data includes a memory for storing the image data, wherein the image data comprises a first set of image data collected by a first camera comprising a first field of view (FOV) and a second set of image data collected by a second camera comprising a second FOV; and processing circuitry in communication with the memory. The processing circuitry is configured to: apply an encoder to extract, from the first set of image data, a first set of perspective view features; apply the encoder to extract, from the second set of image data, a second set of perspective view features; and project the first set of perspective view features and the second set of perspective view features onto a grid to generate a set of bird's eye view (BEV) features.
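
    The projection step in this abstract can be illustrated with a minimal sketch. It assumes each perspective view feature already carries a ground-plane position (x, y) in metres in the vehicle frame, for example from depth estimation and camera calibration, neither of which the abstract details. The function name `project_to_bev`, the grid parameters, and the choice to average features that land in the same cell are hypothetical and are not taken from the patent.

```python
def project_to_bev(camera_features, grid_size=4, cell_size=1.0):
    """Scatter perspective view features from several cameras onto one BEV grid.

    camera_features: one list per camera of (x, y, feature) tuples, where
    (x, y) is an assumed ground-plane position in metres and feature is a
    list of floats. Returns {(row, col): mean feature vector of that cell}.
    """
    sums = {}    # (row, col) -> running sum of feature vectors
    counts = {}  # (row, col) -> number of features accumulated
    for features in camera_features:
        for x, y, feat in features:
            row, col = int(y // cell_size), int(x // cell_size)
            if not (0 <= row < grid_size and 0 <= col < grid_size):
                continue  # position falls outside the BEV grid
            if (row, col) not in sums:
                sums[(row, col)] = [0.0] * len(feat)
                counts[(row, col)] = 0
            sums[(row, col)] = [s + f for s, f in zip(sums[(row, col)], feat)]
            counts[(row, col)] += 1
    # Average the features that fell into the same cell (an assumed pooling rule).
    return {cell: [s / counts[cell] for s in total] for cell, total in sums.items()}


# Two cameras with overlapping fields of view contribute to the same cell.
first_camera = [(0.5, 0.5, [1.0, 2.0])]
second_camera = [(0.2, 0.9, [3.0, 4.0]), (5.0, 0.0, [9.0, 9.0])]
bev = project_to_bev([first_camera, second_camera])
```

    Features from both cameras that land in the same grid cell are merged, and the feature at x = 5.0 is dropped because it lies outside the 4 x 4 grid.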

    VOXEL-LEVEL FEATURE FUSION WITH GRAPH NEURAL NETWORKS AND DIFFUSION FOR 3D OBJECT DETECTION

    Publication No.: US20250095354A1

    Publication date: 2025-03-20

    Application No.: US18467657

    Application date: 2023-09-14

    Abstract: An apparatus includes a memory and processing circuitry in communication with the memory. The processing circuitry is configured to process a joint graph representation using a graph neural network (GNN) to form an enhanced graph representation. The joint graph representation includes first features from a voxelized point cloud, and second features from a plurality of camera images. The enhanced graph representation includes enhanced first features and enhanced second features. The processing circuitry is further configured to perform a diffusion process on the enhanced first features and the enhanced second features of the enhanced graph representation to form a denoised graph representation having denoised first features and denoised second features, and fuse the denoised first features and the denoised second features of the denoised graph representation using a graph attention network (GAT) to form a fused point cloud having fused features.

    GRAPH NEURAL NETWORK (GNN) IMPLEMENTED MULTI-MODAL SPATIOTEMPORAL FUSION

    Publication No.: US20250086979A1

    Publication date: 2025-03-13

    Application No.: US18463109

    Application date: 2023-09-07

    Abstract: Systems that support graph neural network (GNN) implemented multi-modal spatiotemporal fusion are provided. Identifying and tracking an object in images captured by an imaging system is facilitated by generating a graph based on multimodal data received from a plurality of sensors. The graph encodes spatial components and spatial data associated with the images and encodes temporal data associated with the images. Pooled features are generated, through application of a first graph attention network (GAT), by pooling spatial features and temporal features. The spatial features are based on the spatial component and on the spatial relationship, and the temporal features are based on the temporal relationship. A three-dimensional bounding box associated with the object is decoded by propagating the pooled features through a fully connected layer.

    RADAR AND CAMERA FUSION FOR VEHICLE APPLICATIONS

    Publication No.: US20250085413A1

    Publication date: 2025-03-13

    Application No.: US18463049

    Application date: 2023-09-07

    Abstract: This disclosure provides systems, methods, and devices for vehicle driving assistance systems that support image processing. In a first aspect, a method of image processing includes receiving image BEV features and receiving first radio detection and ranging (RADAR) BEV features. The first RADAR BEV features that are received are determined based on first RADAR data associated with a first data type. First normalized RADAR BEV features are determined, which includes rescaling the first RADAR BEV features using a first attention mechanism based on the image BEV features and the first RADAR BEV features. Fused data is determined that combines the first normalized RADAR BEV features and the image BEV features. Other aspects and features are also claimed and described.
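
    One way to picture the rescaling and fusion steps is the sketch below. It is an illustration under stated assumptions, not the claimed method: the attention mechanism is stood in for by a per-cell sigmoid gate on the scaled dot product of the image and RADAR features, and fusion is plain concatenation. The function name `fuse_radar_camera` and the dictionary layout are hypothetical.

```python
import math


def fuse_radar_camera(image_bev, radar_bev):
    """Rescale RADAR BEV features with an attention-style gate, then fuse.

    image_bev, radar_bev: {cell: feature vector}, with equal vector lengths.
    Cells missing from radar_bev are treated as all-zero RADAR features.
    """
    fused = {}
    for cell, img in image_bev.items():
        rad = radar_bev.get(cell, [0.0] * len(img))
        # Assumed attention score: scaled dot product of the two feature vectors.
        score = sum(i * r for i, r in zip(img, rad)) / math.sqrt(len(img))
        gate = 1.0 / (1.0 + math.exp(-score))   # sigmoid, in (0, 1)
        normalized = [gate * r for r in rad]    # rescaled RADAR features
        fused[cell] = img + normalized          # fusion by concatenation
    return fused


fused = fuse_radar_camera({(0, 0): [1.0, 0.0]}, {(0, 0): [0.0, 2.0]})
```

    In this example the image and RADAR features are orthogonal, so the score is zero, the gate is 0.5, and the RADAR features are halved before concatenation.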

    CAMERA SOILING DETECTION USING ATTENTION-GUIDED CAMERA DEPTH AND LIDAR RANGE CONSISTENCY GATING

    Publication No.: US20250085407A1

    Publication date: 2025-03-13

    Application No.: US18464769

    Application date: 2023-09-11

    Abstract: A method includes receiving a plurality of images, wherein a first image of the plurality of images comprises a range image and a second image of the plurality of images comprises a camera image, and filtering the first image to generate a filtered first image. The method also includes generating a plurality of depth estimates based on the second image and generating an attention map by combining the filtered first image and the plurality of depth estimates. Additionally, the method includes generating a consistency score indicative of a consistency of depth estimates between the first image and the second image based on the attention map, modulating one or more features extracted from the second image based on the consistency score using a gating mechanism to generate modulated one or more features, and generating a classification of one or more soiled regions in the second image based on the modulated one or more features.
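
    The filtering, attention map, consistency score, and gating steps can be sketched as follows. Every concrete choice here is an assumption made for illustration: the filter simply discards non-positive range returns, the attention map is an exponential of the absolute depth disagreement, the score is the mean of that map, and the gate multiplies the camera features by the score. The function name `consistency_gate` and the `tolerance` parameter are hypothetical, and the final soiling classifier is omitted.

```python
import math


def consistency_gate(range_image, camera_depth, camera_features, tolerance=1.0):
    """Gate camera features by camera-depth / LiDAR-range agreement.

    range_image, camera_depth: flat lists of per-pixel depths in metres.
    camera_features: list of feature vectors extracted from the camera image.
    Returns (consistency_score, modulated_features).
    """
    # Filtering (assumed): keep only pixels with a valid, positive range return.
    # Attention map (assumed): close to 1 where the two depths agree,
    # decaying toward 0 as they diverge.
    attention = [
        math.exp(-abs(r - d) / tolerance)
        for r, d in zip(range_image, camera_depth)
        if r > 0
    ]
    # Consistency score: mean attention. With no valid LiDAR pixels the
    # features are left unchanged, which is an arbitrary neutral choice.
    score = sum(attention) / len(attention) if attention else 1.0
    # Gating: scale every camera feature by the consistency score.
    modulated = [[score * f for f in feat] for feat in camera_features]
    return score, modulated


score, modulated = consistency_gate([2.0, 0.0, 4.0], [2.0, 9.0, 4.0], [[1.0, 2.0]])
```

    Here the invalid middle pixel is filtered out and the remaining depths agree exactly, so the score is 1.0 and the features pass through unchanged. A lower score would attenuate the features before they reach a downstream soiling classifier.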

    NERF-BASED MULTI-SENSOR DATA FUSION FOR VEHICLE APPLICATIONS

    Publication No.: US20250029393A1

    Publication date: 2025-01-23

    Application No.: US18356504

    Application date: 2023-07-21

    Abstract: This disclosure provides systems, methods, and devices for vehicle driving assistance systems that support image processing. In a first aspect, a method of image processing includes receiving a plurality of image frames representative of a scene; receiving point cloud data representative of the scene; determining, using a NeRF model, a three-dimensional reconstruction of the scene based on the plurality of image frames; and outputting fused data that combines first BEV features of the three-dimensional reconstruction of the scene and second BEV features of the point cloud data. Other aspects and features are also claimed and described.

    MULTI-MODAL ENCODER CHANNEL FUSION WITH CROSS-MODALITY AWARENESS

    Publication No.: US20250029355A1

    Publication date: 2025-01-23

    Application No.: US18354074

    Application date: 2023-07-18

    Abstract: This disclosure provides systems, methods, and devices for vehicle driving assistance systems that support image processing. In a first aspect, a method includes receiving an image frame representing a scene; receiving point cloud data representing the scene; determining first sets of image frame features; determining second sets of point cloud data features based on a plurality of voxels representing the point cloud data; determining a third set of features of the image frame based on a first set of features of the plurality of first sets of features of the image frame and a second set of features of the plurality of second sets of features of the point cloud data; and outputting fused data that combines the third set of features of the image frame and a fourth set of features of the point cloud data. Other aspects and features are also claimed and described.

    ONLINE ADAPTIVE MULTI-SENSOR FUSION

    Publication No.: US20240412494A1

    Publication date: 2024-12-12

    Application No.: US18332394

    Application date: 2023-06-09

    Abstract: This disclosure provides systems, methods, and devices that support image processing. In a first aspect, a method for multi-sensor fusion includes receiving first information indicative of a first set of BEV features of image data captured by an image sensor; receiving second information indicative of a second set of BEV features of non-image sensor data captured by a non-image sensor; and determining fused data that combines the image data and the non-image sensor data based on the first information, the second information, and third information indicative of differences between BEV features of training data and the first set of BEV features and the second set of BEV features. The BEV features of the training data include a third set of BEV features associated with the image sensor and a fourth set of BEV features associated with the non-image sensor. Other aspects and features are also claimed and described.
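
    The role of the "third information" can be illustrated with a small sketch in which each sensor's contribution shrinks as its current BEV features drift away from the corresponding training-time features. The abstract does not say how the differences are measured or applied, so the mean absolute difference, the inverse weighting, and the weighted-sum fusion below are all assumptions, as are the function and parameter names.

```python
def adaptive_fusion(image_bev, sensor_bev, train_image_bev, train_sensor_bev):
    """Fuse two BEV feature vectors, down-weighting the one that has drifted
    further from its training-time reference.

    All arguments are flat lists of floats with equal length.
    Returns (fused_features, (image_weight, sensor_weight)).
    """
    def shift(current, reference):
        # Assumed difference measure: mean absolute difference.
        return sum(abs(c - r) for c, r in zip(current, reference)) / len(current)

    # A smaller shift gives a larger raw weight; weights are then normalized.
    image_weight = 1.0 / (1.0 + shift(image_bev, train_image_bev))
    sensor_weight = 1.0 / (1.0 + shift(sensor_bev, train_sensor_bev))
    total = image_weight + sensor_weight
    image_weight, sensor_weight = image_weight / total, sensor_weight / total

    fused = [image_weight * i + sensor_weight * s
             for i, s in zip(image_bev, sensor_bev)]
    return fused, (image_weight, sensor_weight)


fused, weights = adaptive_fusion([1.0, 1.0], [3.0, 3.0], [1.0, 1.0], [0.0, 0.0])
```

    In this example the image features match their training reference while the non-image features have shifted by 3.0 on average, so the image branch receives roughly four times the weight of the non-image branch.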

    ADAPTIVE BEV FEATURE MAPPING FOR VEHICLE APPLICATIONS

    Publication No.: US20240412486A1

    Publication date: 2024-12-12

    Application No.: US18330113

    Application date: 2023-06-06

    Abstract: This disclosure provides systems, methods, and devices for vehicle driving assistance systems that support image processing. In a first aspect, a method of image processing includes receiving an image frame from an image sensor of a camera; receiving an indicator associated with a type of lens of the camera; determining a first tensor grid associated with the indicator, the first tensor grid including a plurality of image framework positions associated with the type of lens; and determining, using a machine learning model, a BEV feature map corresponding to the image frame based on features of the image frame and the first tensor grid. Other aspects and features are also claimed and described.
