Scalable semantic image retrieval with deep template matching

    Publication No.: US12272148B2

    Publication Date: 2025-04-08

    Application No.: US17226584

    Filing Date: 2021-04-09

    Abstract: Approaches presented herein provide for semantic data matching, as may be useful for selecting data from a large unlabeled dataset to train a neural network. For an object detection use case, such a process can identify images within an unlabeled set even when an object of interest represents a relatively small portion of an image or there are many other objects in the image. A query image can be processed to extract image features or feature maps from only one or more regions of interest in that image, as may correspond to objects of interest. These features are compared with images in an unlabeled dataset, with similarity scores being calculated between the features of the region(s) of interest and individual images in the unlabeled set. One or more highest scored images can be selected as training images showing objects that are semantically similar to the object in the query image.
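The abstract describes scoring an unlabeled dataset against features extracted from a query region of interest and keeping the highest-scoring images. A minimal sketch of that ranking step, using cosine similarity over precomputed feature vectors (the feature extractor itself, function names, and `top_k` are assumptions, not from the patent):

```python
import numpy as np

def cosine_similarity(a, b):
    # Normalize both feature vectors, then take the dot product.
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(np.dot(a, b))

def rank_unlabeled_images(roi_feature, dataset_features, top_k=5):
    """Score each unlabeled image's feature vector against the query
    region-of-interest feature and return the indices of the top_k
    highest-scoring images (hypothetical interface)."""
    scores = [cosine_similarity(roi_feature, f) for f in dataset_features]
    # Highest similarity first.
    order = np.argsort(scores)[::-1]
    return order[:top_k].tolist()
```

In a real pipeline the feature vectors would come from a neural network backbone; here they are plain arrays so the scoring logic stands alone.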

    BI-DIRECTIONAL FEATURE PROJECTION FOR 3D PERCEPTION SYSTEMS AND APPLICATIONS

    Publication No.: US20240378799A1

    Publication Date: 2024-11-14

    Application No.: US18642531

    Filing Date: 2024-04-22

    Abstract: In various examples, bi-directional projection techniques may be used to generate enhanced Bird's-Eye View (BEV) representations. For example, a system(s) may generate one or more BEV features associated with a BEV of an environment using a projection process that associates 2D image features to one or more first locations of a 3D space. At least partially using the BEV feature(s), the system(s) may determine one or more second locations of the 3D space that correspond to one or more regions of interest in the environment. The system(s) may then generate one or more additional BEV features corresponding to the second location(s) using a different projection process that associates the second location(s) from the 3D space to at least a portion of the 2D image features. The system(s) may then generate an updated BEV of the environment based at least on the BEV feature(s) and/or the additional BEV feature(s).
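One direction of the projection described above associates 3D locations (e.g., BEV grid cell centers) with 2D image features. A minimal sketch of that gather step with a pinhole camera model and nearest-neighbor sampling (the function names, intrinsics layout, and sampling scheme are illustrative assumptions, not the patent's method):

```python
import numpy as np

def project_points_to_image(points_3d, intrinsics):
    # Pinhole projection: u = fx * X / Z + cx, v = fy * Y / Z + cy.
    fx, fy, cx, cy = intrinsics
    us = fx * points_3d[:, 0] / points_3d[:, 2] + cx
    vs = fy * points_3d[:, 1] / points_3d[:, 2] + cy
    return np.stack([us, vs], axis=1)

def gather_bev_features(image_features, points_3d, intrinsics):
    """For each 3D location, sample the 2D image feature map at its
    projected pixel (nearest neighbor); out-of-view points get zeros."""
    h, w, c = image_features.shape
    pixels = np.round(project_points_to_image(points_3d, intrinsics)).astype(int)
    feats = np.zeros((len(points_3d), c), dtype=image_features.dtype)
    for i, (u, v) in enumerate(pixels):
        if 0 <= v < h and 0 <= u < w:
            feats[i] = image_features[v, u]
    return feats
```

Production systems typically use bilinear sampling and batched tensor ops; the loop above keeps the geometry of the 3D-to-2D association easy to follow.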

    CLASS AGNOSTIC OBJECT MASK GENERATION
    Invention Publication

    Publication No.: US20240169545A1

    Publication Date: 2024-05-23

    Application No.: US18355856

    Filing Date: 2023-07-20

    Abstract: Class agnostic object mask generation uses a vision transformer-based auto-labeling framework requiring only images and object bounding boxes to generate object (segmentation) masks. The generated object masks, images, and object labels may then be used to train instance segmentation models or other neural networks to localize and segment objects with pixel-level accuracy. The generated object masks may supplement or replace conventional human-generated annotations, which may be misaligned with the object boundaries, resulting in poor quality labeled segmentation masks. In contrast with conventional techniques, the generated object masks are class agnostic and are automatically generated based only on a bounding box image region without relying on either labels or semantic information.
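The key interface implied by the abstract is that a mask predictor sees only the bounding-box crop, and its output is pasted back into full-image coordinates. A minimal sketch of that crop-predict-paste flow (`segment_fn` is a hypothetical stand-in for the transformer-based mask predictor; the patent does not specify this API):

```python
import numpy as np

def box_to_full_mask(image, box, segment_fn):
    """Run a mask predictor on only the bounding-box crop, then paste
    the resulting binary mask into a full-image canvas. No class label
    or semantic information is used, only the box region itself."""
    x0, y0, x1, y1 = box
    crop = image[y0:y1, x0:x1]
    crop_mask = segment_fn(crop)  # binary mask, same height/width as crop
    full_mask = np.zeros(image.shape[:2], dtype=bool)
    full_mask[y0:y1, x0:x1] = crop_mask
    return full_mask
```

Because the predictor never sees pixels outside the box, the resulting mask is class agnostic by construction, matching the contrast the abstract draws with label-dependent methods.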
