SALIENCY-GUIDED MIXUP WITH OPTIMAL RE-ARRANGEMENTS FOR EFFICIENT DATA AUGMENTATION

    Publication number: US20240144652A1

    Publication date: 2024-05-02

    Application number: US18201521

    Filing date: 2023-05-24

    CPC classification number: G06V10/771 G06V10/774 G06V10/80

    Abstract: The present disclosure provides methods, apparatuses, and computer-readable mediums for performing data augmentation. In some embodiments, a method of performing data augmentation by a device includes obtaining a plurality of images from a dataset. The method further includes computing, for each image of the plurality of images, a corresponding saliency map based on a gradient of a full loss function of that image. The method further includes selecting, from a subset of arrangements of a plurality of possible arrangements, a rearrangement offset that maximizes an overall saliency of a resulting image combining the plurality of images. The method further includes generating, using the rearrangement offset and a plurality of mixing ratios, a new mixed image from the plurality of images and a new mixed label from corresponding labels of the plurality of images. The method further includes augmenting the dataset with the new mixed image and the new mixed label.
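
The offset search and mixing described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the saliency maps are already computed (the gradient-based step needs a trained model and is omitted), treats a "rearrangement offset" as a circular shift of the second image, and derives per-pixel mixing ratios from the relative saliency of the two images. All function names and these choices are hypothetical.

```python
def shift(grid, dy, dx):
    """Circularly shift a 2D list by dy rows and dx columns."""
    h, w = len(grid), len(grid[0])
    return [[grid[(y - dy) % h][(x - dx) % w] for x in range(w)] for y in range(h)]

def mix_ratios(sal_a, sal_b, eps=1e-8):
    """Per-pixel weight of image A (assumed form: relative saliency)."""
    return [[a / (a + b + eps) for a, b in zip(ra, rb)] for ra, rb in zip(sal_a, sal_b)]

def overall_saliency(sal_a, sal_b):
    """Total saliency kept in the mix when each pixel is weighted by its ratio."""
    m = mix_ratios(sal_a, sal_b)
    return sum(mi * a + (1 - mi) * b
               for rows in zip(m, sal_a, sal_b)
               for mi, a, b in zip(*rows))

def saliency_mixup(img_a, img_b, sal_a, sal_b, label_a, label_b, offsets):
    """Pick the best offset from a candidate subset, then mix images and labels."""
    # Only the given subset of offsets is searched, not every possible arrangement.
    best = max(offsets, key=lambda o: overall_saliency(sal_a, shift(sal_b, *o)))
    img_b_s, sal_b_s = shift(img_b, *best), shift(sal_b, *best)
    m = mix_ratios(sal_a, sal_b_s)
    mixed = [[mi * a + (1 - mi) * b for mi, a, b in zip(rm, ra, rb)]
             for rm, ra, rb in zip(m, img_a, img_b_s)]
    # Assumption: the label weight is the mean mixing ratio over all pixels.
    lam = sum(map(sum, m)) / (len(m) * len(m[0]))
    mixed_label = [lam * p + (1 - lam) * q for p, q in zip(label_a, label_b)]
    return best, mixed, mixed_label
```

When both images are salient in the same region, a non-zero offset moves the second image's salient region away from the first, so more total saliency survives in the mix.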

    METHOD OF PERSONALIZED IMAGE AND VIDEO SEARCHING BASED ON A NATURAL LANGUAGE QUERY, AND AN APPARATUS FOR THE SAME

    Publication number: US20230394079A1

    Publication date: 2023-12-07

    Application number: US18453838

    Filing date: 2023-08-22

    CPC classification number: G06F16/535 G06N20/00

    Abstract: A method of personalized image retrieval includes obtaining a natural language query including a name; replacing the name in the natural language query with a generic term to provide an anonymized query and named entity information; obtaining a plurality of initial ranking scores and a plurality of attention weights corresponding to a plurality of images using a trained scoring model that inputs the anonymized query and the plurality of images; obtaining a plurality of delta scores corresponding to the plurality of images using a re-scoring model that inputs the plurality of attention weights and the named entity information; and obtaining a plurality of final ranking scores by modifying the plurality of initial ranking scores based on the plurality of delta scores. The trained scoring model performs semantic based searching and the re-scoring model determines a probability that faces detected in the plurality of images correspond to the name.
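
The query anonymization and score adjustment steps can be sketched as follows. The trained scoring and re-scoring models are out of scope here, so their outputs are passed in as plain lists. The whitespace tokenization, the generic term "person", and the use of addition to combine scores are assumptions made for illustration, and the function names are hypothetical.

```python
def anonymize_query(query, known_names, generic_term="person"):
    """Replace known names with a generic term; return the query and the names found."""
    words = query.split()
    names = [w for w in words if w in known_names]
    anonymized = " ".join(generic_term if w in known_names else w for w in words)
    return anonymized, names

def final_ranking(initial_scores, delta_scores):
    """Combine scores (assumed: by addition) and return image indices, best first."""
    final = [s + d for s, d in zip(initial_scores, delta_scores)]
    order = sorted(range(len(final)), key=lambda i: final[i], reverse=True)
    return final, order
```

For example, `anonymize_query("Alice riding a bike", {"Alice"})` yields the anonymized query `"person riding a bike"` together with the named entity list `["Alice"]`. A positive delta score can lift an image whose detected face likely matches the name above one that only matches the scene semantically.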

    PROBABILISTIC PROCEDURE PLANNING FOR INSTRUCTIONAL VIDEOS

    Publication number: US20230153344A1

    Publication date: 2023-05-18

    Application number: US17984685

    Filing date: 2022-11-10

    CPC classification number: G06F16/532 G06N7/01

    Abstract: The present disclosure provides methods and apparatuses for probabilistic procedure planning for generating a plan based on a goal relating to an end state. In some embodiments, a method includes receiving a request from a user to generate an action plan comprising T intermediate actions between a start state and the end state. The method further includes constructing an input query matrix based on T, the start state, the end state, positional encodings, and pseudo-random noise information. The method further includes generating, using a machine learning transformer decoder, the action plan based on the input query matrix and a plurality of learnable vectors. The method further includes providing the action plan to the user. The action plan indicates a probability distribution of a plurality of distinct action sequences, to be performed by the user, that transform the start state to the end state.
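
One way to picture the input query matrix is sketched below. The abstract does not say how the start state, end state, positional encodings, and noise are combined, so this example simply concatenates them into one row per action step. The sinusoidal positional encoding is the standard transformer formulation, swapped in as an assumption. All names and dimensions are hypothetical.

```python
import math
import random

def positional_encoding(t, dim):
    """Standard sinusoidal encoding of position t (assumed, not from the source)."""
    return [
        math.sin(t / 10000 ** (j / dim)) if j % 2 == 0
        else math.cos(t / 10000 ** ((j - 1) / dim))
        for j in range(dim)
    ]

def build_query_matrix(start, end, T, noise_dim=4, pe_dim=4, seed=None):
    """Return T rows, each [start | end | noise | positional encoding]."""
    rng = random.Random(seed)
    # One pseudo-random noise vector per plan; resampling it gives a different
    # plan, which is how a distribution over action sequences could be sampled.
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return [list(start) + list(end) + noise + positional_encoding(t, pe_dim)
            for t in range(T)]
```

Calling `build_query_matrix` several times with different seeds produces different matrices for the same start and end states, which a decoder could map to distinct candidate plans.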

    APPARATUS FOR DEEP REPRESENTATION LEARNING AND METHOD THEREOF

    Publication number: US20200380358A1

    Publication date: 2020-12-03

    Application number: US16805051

    Filing date: 2020-02-28

    Abstract: An apparatus for providing similar contents, using a neural network, includes a memory storing instructions, and a processor configured to execute the instructions to obtain a plurality of similarity values between a user query and a plurality of images, using a similarity neural network, obtain a rank of each of the obtained plurality of similarity values, and provide, as a most similar image to the user query, at least one among the plurality of images that has a respective one among the plurality of similarity values that corresponds to a highest rank among the obtained rank of each of the plurality of similarity values. The similarity neural network is trained with a divergence neural network for outputting a divergence between a first distribution of first similarity values for positive pairs, among the plurality of similarity values, and a second distribution of second similarity values for negative pairs, among the plurality of similarity values.
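
The ranking and selection step at inference time can be sketched as below. The similarity values are taken as given, since the similarity neural network and its divergence-based training are not reproduced here. The function names are hypothetical.

```python
def rank_similarities(similarities):
    """Return the rank of each similarity value (1 = most similar)."""
    order = sorted(range(len(similarities)), key=lambda i: similarities[i], reverse=True)
    ranks = [0] * len(similarities)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

def most_similar(images, similarities, k=1):
    """Return the k images whose similarity values have the highest ranks."""
    ranks = rank_similarities(similarities)
    best = sorted(range(len(images)), key=lambda i: ranks[i])[:k]
    return [images[i] for i in best]
```

With `k=1` this returns the single image whose similarity value has the highest rank; a larger `k` returns a short list of the closest matches.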
