ADAPTIVE LEARNABLE POOLING FOR OVERLAPPING CAMERA FEATURES

    Publication No.: US20250157179A1

    Publication Date: 2025-05-15

    Application No.: US18506583

    Filing Date: 2023-11-10

    Abstract: An apparatus is configured to fuse features from a plurality of camera images using a set of learnable parameters. The apparatus may extract features from a respective image from each camera of a plurality of cameras, and fuse the features into a fused image having a grid structure. To fuse the features, the apparatus may determine a contribution of the respective image from each of the plurality of cameras to a respective cell of the grid structure, and aggregate, based on the contribution to the respective cell and a respective set of learnable parameters for each cell, the features from each of the respective images to each respective cell of the fused image to generate aggregated features.
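
The per-cell aggregation described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the learnable parameters are one logit per camera per cell and that aggregation is a contribution-masked softmax average; the abstract does not specify either choice, and the name `fuse_cell` is hypothetical.

```python
import math

def fuse_cell(features, contributions, logits):
    """Fuse per-camera feature vectors for one grid cell.

    features:      one feature vector per camera
    contributions: how much each camera contributes to this cell (0.0 = not visible)
    logits:        hypothetical learnable parameters for this cell, one per camera
    """
    # Softmax over cameras, masked by each camera's contribution to the cell.
    scores = [c * math.exp(w) for c, w in zip(contributions, logits)]
    total = sum(scores)
    dim = len(features[0])
    if total == 0.0:
        return [0.0] * dim  # no camera sees this cell
    weights = [s / total for s in scores]
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]

# Two cameras overlap on this cell; a third camera does not see it at all.
features = [[1.0, 0.0], [3.0, 2.0], [100.0, 100.0]]
contributions = [1.0, 1.0, 0.0]
logits = [0.0, 0.0, 5.0]  # the large logit is ignored because its contribution is 0
fused = fuse_cell(features, contributions, logits)
print(fused)
```

In a trained system the logits would be optimized jointly with the feature extractor; here they are fixed values chosen only to show that a camera with zero contribution has no effect on the fused cell.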

    FREE SPACE DETECTION FOR PARKING AND DRIVING IN PUDDLE AREAS WITH PERMUTED FUSION NETWORK

    Publication No.: US20250078294A1

    Publication Date: 2025-03-06

    Application No.: US18458654

    Filing Date: 2023-08-30

    Abstract: A method includes receiving one or more images, wherein at least one of the one or more images depicts a water region, and analyzing, by one or more processors, the one or more images using a first machine learning model to determine a depth of the water region. The method also includes analyzing, by the one or more processors, the one or more images using a second machine learning model to determine a surface normal of the water region and performing, by the one or more processors, using a third machine learning model, multi-class segmentation of the one or more images. Additionally, the method includes performing one or more fusion operations on outputs of at least two of the first machine learning model, the second machine learning model, and the third machine learning model to generate a classification of the water region.

    UPSAMPLING FOR POINT CLOUD FEATURES

    Publication No.: US20250069184A1

    Publication Date: 2025-02-27

    Application No.: US18454940

    Filing Date: 2023-08-24

    Abstract: A method of processing image content includes constructing a first graph representation having a first level of point sparsity from a first point cloud data, and performing diffusion-based upsampling on the first graph representation to generate a second graph representation having a second level of point sparsity. Performing diffusion-based upsampling includes inputting the first graph representation into a diffusion-based trained model to generate a first intermediate graph representation having a first intermediate level of point sparsity, inputting the first intermediate graph representation into the diffusion-based trained model to generate a second intermediate graph representation having a second intermediate level of point sparsity, and generating the second graph representation based at least on the second intermediate graph representation. The method includes generating second point cloud data having the second level of point sparsity based on the second graph representation having the second level of point sparsity.

    MULTIMODAL 3D OBJECT DETECTION USING TEMPORAL AND STRUCTURE CONSISTENCY IN VOXEL FEATURE SPACE

    Publication No.: US20250166216A1

    Publication Date: 2025-05-22

    Application No.: US18516590

    Filing Date: 2023-11-21

    Abstract: An example device for detecting objects through processing of media data, such as image data and point cloud data, includes a processing system configured to form voxel representations of a real-world three-dimensional (3D) space using images and point clouds captured for the 3D space at consecutive time steps, extract image and/or point cloud features for voxels in voxel representations of the 3D space, determine correspondences between the voxels at consecutive time steps according to similarities between the extracted features, and determine positions of objects in the 3D space using the correspondences between the voxels. For example, the processing system may perform triangulation according to positions of a moving object to positions of the voxels at the time steps. In this manner, the processing system may generate an accurate bird's eye view (BEV) representation of the real-world 3D space.
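
The step of determining correspondences between voxels at consecutive time steps from feature similarity might look roughly like the sketch below. It assumes cosine similarity with a greedy best-match rule and a hypothetical threshold `min_sim`; the abstract names neither the similarity measure nor the matching strategy.

```python
import math

def cosine(a, b):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def match_voxels(feats_t0, feats_t1, min_sim=0.9):
    """Map each voxel index at t0 to its most similar voxel index at t1.

    feats_t0, feats_t1: dicts from (i, j, k) voxel index to feature vector.
    min_sim: hypothetical threshold below which a voxel is left unmatched.
    """
    matches = {}
    for v0, f0 in feats_t0.items():
        best, best_sim = None, min_sim
        for v1, f1 in feats_t1.items():
            s = cosine(f0, f1)
            if s > best_sim:
                best, best_sim = v1, s
        if best is not None:
            matches[v0] = best
    return matches

# The first voxel's content moves one cell along y; the second has no counterpart.
feats_t0 = {(0, 0, 0): [1.0, 0.0, 0.0], (1, 0, 0): [0.0, 1.0, 0.0]}
feats_t1 = {(0, 1, 0): [0.9, 0.1, 0.0], (2, 0, 0): [0.0, 0.0, 1.0]}
matches = match_voxels(feats_t0, feats_t1)
print(matches)
```

The matched index pairs give per-voxel displacements between time steps, which is the kind of input a later triangulation or position-estimation stage would consume.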

    MOTION FORECASTING FOR SCENE FLOW ESTIMATION

    Publication No.: US20250080685A1

    Publication Date: 2025-03-06

    Application No.: US18462191

    Filing Date: 2023-09-06

    Abstract: A method of image processing includes receiving first feature data from image content captured with a sensor, the first feature data having a first set of states with values that change non-linearly over time, generating second feature data based at least in part on the first feature data, the second feature data having a second set of states with values that change approximately linearly over time relative to a linear operator, wherein the second set of states is greater than the first set of states, and predicting movement of one or more objects in the image content based at least in part on the second feature data.
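
The idea of mapping a small nonlinear state into a larger state that evolves linearly under a linear operator can be shown with a toy system. The dynamics, the coefficients `A`, `B`, `C`, and the hand-written lifting function below are all assumptions for illustration; in the abstract the lifted features would presumably be produced by a learned model, and the evolution is only approximately linear, whereas this toy case happens to be exact.

```python
A, B, C = 0.9, 0.5, 0.3  # hypothetical coefficients of a toy nonlinear system

def step_nonlinear(state):
    """Original 2-state dynamics: y depends on x squared, so it is nonlinear."""
    x, y = state
    return (A * x, B * y + C * x * x)

def lift(state):
    """Lift the 2 states to 3 states by appending x squared."""
    x, y = state
    return [x, y, x * x]

# Linear operator acting on the lifted state [x, y, x^2].
K = [[A, 0.0, 0.0],
     [0.0, B, C],
     [0.0, 0.0, A * A]]

def step_linear(z):
    """One step of the lifted dynamics: a plain matrix-vector product."""
    return [sum(k * v for k, v in zip(row, z)) for row in K]

state = (1.0, 2.0)
z = lift(state)
for _ in range(5):
    state = step_nonlinear(state)  # ground-truth nonlinear rollout
    z = step_linear(z)             # forecast made purely with the linear operator

predicted = (z[0], z[1])
print(state, predicted)
```

Because the lifted state advances by repeated multiplication with `K`, a multi-step forecast reduces to linear algebra, which is the practical appeal of working in the larger state space.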

    ALIGNING SENSOR DATA FOR VEHICLE APPLICATIONS

    Publication No.: US20250050894A1

    Publication Date: 2025-02-13

    Application No.: US18448034

    Filing Date: 2023-08-10

    Abstract: This disclosure provides systems, methods, and devices for vehicle driving assistance systems that support image processing. A method is disclosed for aligning top-down features from two sensor arrangements and generating vehicle control instructions. The method includes receiving first sensor data from a first sensor arrangement and second sensor data from a second sensor arrangement. The method further includes determining a first set of top-down features and a second set of top-down features based on the sensor data. A transformation is determined based on the first set of top-down features and the second set of top-down features to align the second set of top-down features with the first set of top-down features. Finally, vehicle control instructions for a vehicle are determined based on the transformation. Other aspects and features are also claimed and described.
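
One simple way to picture the alignment step is a 2D rigid fit between matched top-down points. The sketch below assumes that the transformation is a rotation plus translation and that point correspondences between the two feature sets are already known; neither assumption comes from the abstract, and `estimate_rigid_2d` and `apply_rigid_2d` are hypothetical names.

```python
import math

def estimate_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src points onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    num = den = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy  # centered source point
        bx, by = qx - dx, qy - dy  # centered target point
        num += ax * by - ay * bx   # cross term, proportional to sin(theta)
        den += ax * bx + ay * by   # dot term, proportional to cos(theta)
    theta = math.atan2(num, den)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that carries the rotated source centroid onto the target centroid.
    t = (dx - (c * sx - s * sy), dy - (s * sx + c * sy))
    return theta, t

def apply_rigid_2d(theta, t, pts):
    """Rotate each point by theta, then translate by t."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in pts]

# Second sensor's top-down points (src) and the first sensor's matching points (dst).
second = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
first = [(2.0, 3.0), (2.0, 4.0), (1.0, 3.0)]  # second rotated 90 degrees, shifted by (2, 3)
theta, t = estimate_rigid_2d(second, first)
aligned = apply_rigid_2d(theta, t, second)
print(theta, t, aligned)
```

Once the second set is expressed in the first set's frame, the two can be compared or merged cell by cell before any downstream planning step.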

    DEGRADED IMAGE FRAME CORRECTION

    Publication No.: US20240420293A1

    Publication Date: 2024-12-19

    Application No.: US18336580

    Filing Date: 2023-06-16

    Abstract: A method of processing image data includes receiving, with a frame correction machine-learning (ML) model executing on processing circuitry, an image frame captured from a first camera of a plurality of cameras; performing, with the frame correction ML model executing on the processing circuitry, image frame correction to generate a corrected image frame based on weights or biases of the frame correction ML model applied to two or more of: samples of the image frame, samples of previously captured image frames from the first camera, or samples from image frames from other cameras of the plurality of cameras; and performing, with the processing circuitry, post-processing based on the corrected image frame.
