TECHNIQUES FOR TRAINING VISION FOUNDATION MODELS VIA MULTI-TEACHER DISTILLATION

    Publication Number: US20250165777A1

    Publication Date: 2025-05-22

    Application Number: US18740294

    Filing Date: 2024-06-11

    Abstract: One embodiment of a method for training a first machine learning model includes processing first data via a plurality of trained machine learning models to generate a plurality of first outputs, processing the first data via the first machine learning model to generate a second output, processing the second output via a plurality of projection heads to generate a plurality of third outputs, computing a plurality of losses based on the plurality of first outputs and the plurality of third outputs, and performing one or more operations to update one or more parameters of the first machine learning model and one or more parameters of the plurality of projection heads based on the plurality of losses.
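
    A minimal PyTorch sketch of the training step described in this abstract is shown below. The backbone and teacher architectures, the feature dimensions, and the cosine-distance loss are illustrative assumptions; only the overall flow (teacher outputs, student output, per-teacher projection heads, per-teacher losses, joint update of the student and the heads) follows the abstract.

        # Minimal sketch; architectures, dimensions, and the cosine loss are assumptions.
        import torch
        import torch.nn as nn

        student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256))   # first machine learning model
        teachers = [nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, d)).eval()
                    for d in (128, 512)]                                      # trained models (frozen; pretrained in practice)
        heads = nn.ModuleList([nn.Linear(256, 128), nn.Linear(256, 512)])     # one projection head per teacher

        optimizer = torch.optim.Adam(list(student.parameters()) + list(heads.parameters()), lr=1e-4)
        images = torch.randn(8, 3, 32, 32)                                    # stand-in for the "first data"

        with torch.no_grad():
            teacher_outputs = [t(images) for t in teachers]                   # plurality of first outputs
        student_output = student(images)                                      # second output
        projected = [head(student_output) for head in heads]                  # plurality of third outputs

        # One loss per (projected student output, teacher output) pair.
        losses = [1.0 - torch.nn.functional.cosine_similarity(p, t, dim=-1).mean()
                  for p, t in zip(projected, teacher_outputs)]

        optimizer.zero_grad()
        torch.stack(losses).sum().backward()                                  # gradients for student and projection heads
        optimizer.step()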

    TECHNIQUES FOR GENERATING DEPTH MAPS FROM VIDEOS

    Publication Number: US20240303840A1

    Publication Date: 2024-09-12

    Application Number: US18508139

    Filing Date: 2023-11-13

    CPC classification number: G06T7/50 G06T7/20 G06V10/762

    Abstract: The disclosed method for generating a first depth map for a first frame of a video includes performing one or more operations to generate a first intermediate depth map based on the first frame and a second frame preceding the first frame within the video, performing one or more operations to generate a second intermediate depth map based on the first frame, and performing one or more operations to combine the first intermediate depth map and the second intermediate depth map to generate the first depth map.
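
    As a rough illustration of this two-branch design, the sketch below pairs a hypothetical two-frame (temporal) depth network with a hypothetical single-frame network and fuses their outputs with a learned per-pixel blend; the tiny convolutional networks and the sigmoid-weighted fusion rule are assumptions, not details taken from the publication.

        # Minimal sketch; both sub-networks and the fusion rule are illustrative assumptions.
        import torch
        import torch.nn as nn

        class TwoFrameDepth(nn.Module):
            """Estimates depth from the current frame and the preceding frame."""
            def __init__(self):
                super().__init__()
                self.net = nn.Conv2d(6, 1, kernel_size=3, padding=1)
            def forward(self, prev_frame, frame):
                return self.net(torch.cat([prev_frame, frame], dim=1))

        class SingleFrameDepth(nn.Module):
            """Estimates depth from the current frame alone."""
            def __init__(self):
                super().__init__()
                self.net = nn.Conv2d(3, 1, kernel_size=3, padding=1)
            def forward(self, frame):
                return self.net(frame)

        class FusedDepth(nn.Module):
            def __init__(self):
                super().__init__()
                self.temporal = TwoFrameDepth()
                self.monocular = SingleFrameDepth()
                self.blend = nn.Conv2d(2, 1, kernel_size=1)      # predicts a per-pixel blend weight
            def forward(self, prev_frame, frame):
                d_temporal = self.temporal(prev_frame, frame)    # first intermediate depth map
                d_mono = self.monocular(frame)                   # second intermediate depth map
                alpha = torch.sigmoid(self.blend(torch.cat([d_temporal, d_mono], dim=1)))
                return alpha * d_temporal + (1 - alpha) * d_mono # combined first depth map

        model = FusedDepth()
        prev_frame, frame = torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64)
        depth = model(prev_frame, frame)                         # shape (1, 1, 64, 64)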

    IMAGE STITCHING WITH DYNAMIC SEAM PLACEMENT BASED ON OBJECT SALIENCY FOR SURROUND VIEW VISUALIZATION

    Publication Number: US20230316458A1

    Publication Date: 2023-10-05

    Application Number: US18173589

    Filing Date: 2023-02-23

    CPC classification number: G06T3/4038 G06T7/74

    Abstract: In various examples, dynamic seam placement is used to position seams in regions of overlapping image data to avoid crossing salient objects or regions. Objects may be detected from image frames representing overlapping views of an environment surrounding an ego-object such as a vehicle. The images may be aligned to create an aligned composite image or surface (e.g., a panorama, a 360° image, a bowl-shaped surface) with regions of overlapping image data, and a representation of the detected objects and/or salient regions (e.g., a saliency mask) may be generated and projected onto the aligned composite image or surface. Seams may be positioned in the overlapping regions to avoid or minimize crossing salient pixels represented in the projected masks, and the image data may be blended at the seams to create a stitched image or surface (e.g., a stitched panorama, a stitched 360° image, a stitched textured surface).
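
    The sketch below illustrates only the seam-search step, assuming the overlap region has already been rasterized into a 2D cost map in which salient pixels (from the projected saliency mask) carry a high cost; the dynamic-programming routine is a standard seam-carving-style search and stands in for whatever seam optimization the publication actually uses.

        # Minimal sketch; the cost map and the DP seam search are illustrative assumptions.
        import numpy as np

        def find_vertical_seam(cost: np.ndarray) -> np.ndarray:
            """Return, for each row, the column of a minimum-cost top-to-bottom seam."""
            h, w = cost.shape
            acc = cost.astype(float)                      # accumulated cost table
            for y in range(1, h):
                up_left = np.roll(acc[y - 1], 1)
                up_left[0] = np.inf
                up_right = np.roll(acc[y - 1], -1)
                up_right[-1] = np.inf
                acc[y] += np.minimum(np.minimum(up_left, acc[y - 1]), up_right)
            seam = np.empty(h, dtype=int)
            seam[-1] = int(np.argmin(acc[-1]))
            for y in range(h - 2, -1, -1):                # backtrack through the table
                lo = max(seam[y + 1] - 1, 0)
                hi = min(seam[y + 1] + 2, w)
                seam[y] = lo + int(np.argmin(acc[y, lo:hi]))
            return seam

        # Saliency mask projected into the overlap region: salient pixels are expensive to cross.
        saliency = np.zeros((100, 40))
        saliency[30:60, 15:30] = 1.0                      # hypothetical salient object
        seam_cols = find_vertical_seam(saliency * 1000.0 + 1.0)
        # The two overlapping images would then be blended on either side of seam_cols (not shown).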

    ADAPTIVE TOKEN DEPTH ADJUSTMENT IN TRANSFORMER NEURAL NETWORKS

    Publication Number: US20230186077A1

    Publication Date: 2023-06-15

    Application Number: US17841577

    Filing Date: 2022-06-15

    CPC classification number: G06N3/08 G06N3/0481

    Abstract: One embodiment of the present invention sets forth a technique for executing a transformer neural network. The technique includes computing a first set of halting scores for a first set of tokens that has been input into a first layer of the transformer neural network. The technique also includes determining that a first halting score included in the first set of halting scores exceeds a threshold value. The technique further includes, in response to the first halting score exceeding the threshold value, causing a first token that is included in the first set of tokens and is associated with the first halting score not to be processed by one or more layers within the transformer neural network that are subsequent to the first layer.
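
    A minimal PyTorch sketch of per-token halting is shown below; the per-layer linear halting heads, the sigmoid scores, the fixed threshold, and the choice to carry halted tokens forward unchanged are illustrative assumptions rather than details of the claimed technique.

        # Minimal sketch; halting heads, threshold, and handling of halted tokens are assumptions.
        import torch
        import torch.nn as nn

        class HaltingTransformer(nn.Module):
            def __init__(self, dim=64, depth=4, threshold=0.5):
                super().__init__()
                self.layers = nn.ModuleList([
                    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
                    for _ in range(depth)])
                self.halt_heads = nn.ModuleList([nn.Linear(dim, 1) for _ in range(depth)])
                self.threshold = threshold

            def forward(self, tokens):                              # tokens: (batch, seq, dim)
                active = torch.ones(tokens.shape[:2], dtype=torch.bool, device=tokens.device)
                for layer, halt_head in zip(self.layers, self.halt_heads):
                    updated = layer(tokens)
                    # Only tokens that have not halted receive this layer's update.
                    tokens = torch.where(active.unsqueeze(-1), updated, tokens)
                    halting_scores = torch.sigmoid(halt_head(tokens)).squeeze(-1)
                    # Tokens whose halting score exceeds the threshold skip all subsequent layers.
                    active = active & (halting_scores <= self.threshold)
                return tokens

        model = HaltingTransformer()
        out = model(torch.randn(2, 16, 64))                         # (2, 16, 64)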

    TRAINING ENERGY-BASED VARIATIONAL AUTOENCODERS

    Publication Number: US20220101145A1

    Publication Date: 2022-03-31

    Application Number: US17357738

    Filing Date: 2021-06-24

    Abstract: One embodiment sets forth a technique for creating a generative model. The technique includes generating a trained generative model with a first component that converts data points in the training dataset into latent variable values, a second component that learns a distribution of the latent variable values, and a third component that converts the latent variable values into output distributions. The technique also includes training an energy-based model to learn an energy function based on values sampled from a first distribution associated with the training dataset and values sampled from a second distribution during operation of the trained generative model. The technique further includes creating a joint model that includes one or more portions of the trained generative model and the energy-based model, and that applies energy values from the energy-based model to samples from the second distribution to produce additional values used to generate a new data point.
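
    The sketch below wires together the three components named in the abstract (an encoder, a latent prior, and a decoder) with a small energy-based model, and shows one illustrative way of applying energy values to samples from the model distribution, namely resampling them with weights proportional to exp(-energy); the MLP architectures, the fixed Gaussian prior, and the reweighting rule are assumptions.

        # Minimal sketch; component architectures, the fixed prior, and the reweighting rule are assumptions.
        import torch
        import torch.nn as nn

        latent_dim, data_dim = 8, 32
        encoder = nn.Linear(data_dim, 2 * latent_dim)     # first component: data -> latent mean/log-variance (used during training)
        prior = torch.distributions.Normal(torch.zeros(latent_dim), torch.ones(latent_dim))  # second component (fixed here)
        decoder = nn.Linear(latent_dim, data_dim)         # third component: latent -> output distribution mean
        energy_model = nn.Sequential(nn.Linear(data_dim, 64), nn.SiLU(), nn.Linear(64, 1))   # learned energy function

        def sample_joint(num_samples=64):
            """Draw from the generative model, then resample according to exp(-energy)."""
            z = prior.sample((num_samples,))              # latent variable values
            x = decoder(z)                                # samples from the model's ("second") distribution
            energies = energy_model(x).squeeze(-1)        # energy values from the energy-based model
            weights = torch.softmax(-energies, dim=0)     # lower energy -> higher weight
            idx = torch.multinomial(weights, 1)           # resample one point according to the weights
            return x[idx]

        with torch.no_grad():
            new_point = sample_joint()                    # a new data point from the joint model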

    LEARNING AND PROPAGATING VISUAL ATTRIBUTES

    Publication Number: US20220076128A1

    Publication Date: 2022-03-10

    Application Number: US17017597

    Filing Date: 2020-09-10

    Abstract: One embodiment of the present invention sets forth a technique for performing spatial propagation. The technique includes generating a first directed acyclic graph (DAG) by connecting spatially adjacent points included in a set of unstructured points via directed edges along a first direction. The technique also includes applying a first set of neural network layers to one or more images associated with the set of unstructured points to generate (i) a set of features for the set of unstructured points and (ii) a set of pairwise affinities between the spatially adjacent points connected by the directed edges. The technique further includes generating a set of labels for the set of unstructured points by propagating the set of features across the first DAG based on the set of pairwise affinities.
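
    A small sketch of one-direction propagation over unstructured points follows. Building the DAG by sorting the points along x and linking each point to a few nearest predecessors, and blending each point's feature with an affinity-weighted average of its parents' features, are illustrative choices; the per-point features and pairwise affinities are assumed to come from an upstream network, as the abstract describes.

        # Minimal sketch; the DAG construction and blending rule are illustrative assumptions.
        import torch

        def build_dag(points, k=3):
            """Connect each point to up to k spatially nearest predecessors along the +x direction."""
            order = torch.argsort(points[:, 0])           # topological order along the first direction
            edges = []                                    # (source, target) pairs; source precedes target
            for rank, idx in enumerate(order.tolist()):
                if rank == 0:
                    continue
                preds = order[:rank]
                dists = torch.cdist(points[idx].unsqueeze(0), points[preds]).squeeze(0)
                nearest = preds[torch.topk(dists, k=min(k, rank), largest=False).indices]
                edges += [(int(p), idx) for p in nearest]
            return order, edges

        def propagate(features, affinities, order, edges):
            """Blend each point's feature with the affinity-weighted features of its DAG parents."""
            out = features.clone()
            parents = {i: [] for i in order.tolist()}
            for e, (src, dst) in enumerate(edges):
                parents[dst].append((src, affinities[e]))
            for idx in order.tolist():                    # visit points in topological order
                if parents[idx]:
                    msg = torch.stack([a * out[src] for src, a in parents[idx]]).mean(dim=0)
                    out[idx] = 0.5 * out[idx] + 0.5 * msg
            return out

        points = torch.rand(20, 2)                        # unstructured 2D points
        order, edges = build_dag(points)
        features = torch.rand(20, 4)                      # per-point features (from a network, in practice)
        affinities = torch.rand(len(edges))               # pairwise affinities (from a network, in practice)
        propagated = propagate(features, affinities, order, edges)   # basis for the per-point labels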
