SOURCE-FREE CROSS DOMAIN DETECTION METHOD WITH STRONG DATA AUGMENTATION AND SELF-TRAINED MEAN TEACHER MODELING

    Publication Number: US20230154167A1

    Publication Date: 2023-05-18

    Application Number: US17966017

    Application Date: 2022-10-14

    CPC classification number: G06V10/7747 G06V10/25 G06V10/765 G06V2201/07

    Abstract: A method for implementing source-free domain adaptive detection is presented. The method includes, in a pretraining phase, applying strong data augmentation to labeled source images to produce perturbed labeled source images and training an object detection model by using the perturbed labeled source images to generate a source-only model. The method further includes, in an adaptation phase, training a self-trained mean teacher model by generating a weakly augmented image and multiple strongly augmented images from unlabeled target images, generating a plurality of region proposals from the weakly augmented image, selecting a region proposal from the plurality of region proposals as a pseudo ground truth, detecting, by the self-trained mean teacher model, object boxes and selecting pseudo ground truth boxes by employing a confidence constraint and a consistency constraint, and training a student model by using one of the multiple strongly augmented images jointly with an object detection loss.
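
    A minimal sketch, in PyTorch, of two adaptation-phase pieces described in this abstract: the mean-teacher (EMA) update and pseudo-ground-truth box selection under a confidence constraint and a consistency constraint. This is illustrative only, not the patented implementation; the thresholds and the two-pass reading of the consistency constraint are assumptions, and box_iou is torchvision's standard IoU helper.

    import copy
    import torch
    import torch.nn as nn
    from torchvision.ops import box_iou

    def ema_update(teacher, student, momentum=0.999):
        # Mean-teacher update: teacher weights track an exponential
        # moving average of the student weights.
        with torch.no_grad():
            for t, s in zip(teacher.parameters(), student.parameters()):
                t.mul_(momentum).add_(s, alpha=1.0 - momentum)

    def select_pseudo_boxes(boxes_a, scores_a, boxes_b,
                            conf_thresh=0.8, iou_thresh=0.5):
        # Keep teacher-detected boxes that are (i) confident and
        # (ii) consistent, i.e. re-detected with high IoU in a second
        # teacher pass over the weakly augmented image (an assumed
        # reading of the consistency constraint).
        keep = []
        for box, score in zip(boxes_a, scores_a):
            if score < conf_thresh:                    # confidence constraint
                continue
            ious = box_iou(box.unsqueeze(0), boxes_b)  # consistency constraint
            if ious.numel() > 0 and ious.max() >= iou_thresh:
                keep.append(box)
        return torch.stack(keep) if keep else boxes_a.new_zeros((0, 4))

    # Toy usage: random boxes stand in for two teacher passes.
    boxes_a = torch.tensor([[0., 0., 10., 10.], [20., 20., 40., 40.]])
    scores_a = torch.tensor([0.95, 0.40])
    boxes_b = torch.tensor([[1., 1., 11., 11.]])
    print(select_pseudo_boxes(boxes_a, scores_a, boxes_b))  # keeps only the first box
    student = nn.Linear(4, 2)
    teacher = copy.deepcopy(student)
    ema_update(teacher, student)  # teacher slowly tracks the student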

    COMPOSITIONAL TEXT-TO-IMAGE SYNTHESIS WITH PRETRAINED MODELS

    Publication Number: US20230153606A1

    Publication Date: 2023-05-18

    Application Number: US17968923

    Application Date: 2022-10-19

    Abstract: A method is provided that includes training a CLIP model to learn embeddings of images and text from matched image-text pairs. The text represents image attributes. The method trains a StyleGAN on images in a training dataset of matched image-text pairs. The method also trains, using a CLIP model guided contrastive loss which attracts matched text embedding pairs and repels unmatched pairs, a text-to-direction model to predict a text direction that is semantically aligned with an input text responsive to the input text and a random latent code. A triplet loss is used to learn text directions using the embeddings learned by the trained CLIP model. The method generates, by the trained StyleGAN, positive and negative synthesized images by respectively adding and subtracting the text direction in the latent space of the trained StyleGAN corresponding to a word for each of the words in the training dataset.
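
    A minimal sketch of the text-to-direction idea in this abstract: an MLP maps a text embedding plus a random latent code to a direction in the generator's latent space, and positive/negative images come from adding and subtracting that direction. The contrastive loss below is a generic InfoNCE-style loss that attracts matched pairs and repels unmatched ones; the network sizes are assumptions, and random tensors stand in for pretrained CLIP embeddings and StyleGAN latents.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    CLIP_DIM, W_DIM = 512, 512  # assumed embedding/latent sizes

    class TextToDirection(nn.Module):
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(CLIP_DIM + W_DIM, 1024), nn.ReLU(),
                nn.Linear(1024, W_DIM))

        def forward(self, text_emb, z):
            # Predict a latent-space direction from a text embedding
            # and a random latent code.
            return self.net(torch.cat([text_emb, z], dim=-1))

    def contrastive_loss(img_emb, txt_emb, temperature=0.07):
        # InfoNCE over a batch: diagonal entries are matched pairs
        # (attracted), off-diagonals are unmatched pairs (repelled).
        img_emb = F.normalize(img_emb, dim=-1)
        txt_emb = F.normalize(txt_emb, dim=-1)
        logits = img_emb @ txt_emb.t() / temperature
        labels = torch.arange(logits.size(0))
        return F.cross_entropy(logits, labels)

    model = TextToDirection()
    text_emb = torch.randn(8, CLIP_DIM)  # stand-in for CLIP text embeddings
    z = torch.randn(8, W_DIM)            # random latent codes
    w = torch.randn(8, W_DIM)            # stand-in for StyleGAN W latents
    d = model(text_emb, z)
    w_pos, w_neg = w + d, w - d          # positive / negative latent edits
    loss = contrastive_loss(torch.randn(8, CLIP_DIM), text_emb)  # stand-in image embeddings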

    Network reparameterization for new class categorization

    Publication Number: US11087184B2

    Publication Date: 2021-08-10

    Application Number: US16580199

    Application Date: 2019-09-24

    Abstract: A computer-implemented method and system are provided for training a model for New Class Categorization (NCC) of a test image. The method includes decoupling, by a hardware processor, a feature extraction part from a classifier part of a deep classification model by reparameterizing learnable weight variables of the classifier part as a combination of learnable variables of the feature extraction part and of a classification weight generator of the classifier part. The method further includes training, by the hardware processor, the deep classification model to obtain a trained deep classification model by (i) learning the feature extraction part as a multiclass classification task, and (ii) episodically training the classifier part by learning a classification weight generator which outputs classification weights given a training image.
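
    A minimal sketch of the reparameterization idea: instead of a fixed learned weight matrix, a weight generator produces classification weights from features of exemplar images, so new classes only require exemplars. Generating weights from per-class mean features is an assumed simplification (the abstract says the generator takes a training image), and all module names and sizes are illustrative.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    FEAT_DIM = 256

    feature_extractor = nn.Sequential(  # stand-in backbone
        nn.Flatten(), nn.Linear(3 * 32 * 32, FEAT_DIM), nn.ReLU())
    weight_generator = nn.Linear(FEAT_DIM, FEAT_DIM)  # prototype -> classifier weight

    def classify(images, support_images, support_labels, num_classes):
        # Score query images against classifier weights generated from
        # per-class mean features of the support (exemplar) images.
        feats = feature_extractor(images)
        support_feats = feature_extractor(support_images)
        prototypes = torch.stack([
            support_feats[support_labels == c].mean(dim=0)
            for c in range(num_classes)])          # one prototype per class
        weights = weight_generator(prototypes)     # generated classification weights
        return F.normalize(feats, dim=-1) @ F.normalize(weights, dim=-1).t()

    # Toy episode: 5 classes, 1 support image each, 4 query images.
    support = torch.randn(5, 3, 32, 32)
    labels = torch.arange(5)
    queries = torch.randn(4, 3, 32, 32)
    print(classify(queries, support, labels, num_classes=5).shape)  # torch.Size([4, 5])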
