METHOD OF PERSONALIZED IMAGE AND VIDEO SEARCHING BASED ON A NATURAL LANGUAGE QUERY, AND AN APPARATUS FOR THE SAME

    Publication number: US20230394079A1

    Publication date: 2023-12-07

    Application number: US18453838

    Filing date: 2023-08-22

    CPC classification number: G06F16/535 G06N20/00

    Abstract: A method of personalized image retrieval includes obtaining a natural language query including a name; replacing the name in the natural language query with a generic term to provide an anonymized query and named entity information; obtaining a plurality of initial ranking scores and a plurality of attention weights corresponding to a plurality of images using a trained scoring model that inputs the anonymized query and the plurality of images; obtaining a plurality of delta scores corresponding to the plurality of images using a re-scoring model that inputs the plurality of attention weights and the named entity information; and obtaining a plurality of final ranking scores by modifying the plurality of initial ranking scores based on the plurality of delta scores. The trained scoring model performs semantic based searching and the re-scoring model determines a probability that faces detected in the plurality of images correspond to the name.
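
The steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the trained scoring model and the face recognizer are replaced by hard-coded toy numbers, and the function names (`anonymize_query`, `delta_scores`, `final_scores`), the generic term "person", the attention-weighted sum used for the delta score, and the additive update of the initial score are all assumptions made for illustration.

```python
import re


def anonymize_query(query, known_names, generic_term="person"):
    """Replace the first known name found in the query with a generic term.

    Returns the anonymized query and the matched name (the "named entity
    information"), or the unchanged query and None if no name is present.
    """
    for name in known_names:
        pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        if pattern.search(query):
            return pattern.sub(generic_term, query), name
    return query, None


def delta_scores(attention_weights, face_match_probs):
    """Assumed re-scoring rule: per image, sum attention * face-match probability.

    attention_weights[i][j] is the (toy) attention the scoring model placed on
    face j of image i; face_match_probs[i][j] is the (toy) probability that the
    same face belongs to the named person.
    """
    return [
        sum(a * p for a, p in zip(att, probs))
        for att, probs in zip(attention_weights, face_match_probs)
    ]


def final_scores(initial_scores, deltas):
    """Assumed modification rule: add each delta to its initial ranking score."""
    return [s + d for s, d in zip(initial_scores, deltas)]


# Toy walk-through with two images.
anonymized, name = anonymize_query("Alice playing guitar", ["Alice", "Bob"])
initial = [0.5, 0.6]                 # stand-in for the scoring model output
attention = [[0.8, 0.2], [0.5]]      # stand-in attention weights per face
face_probs = [[0.9, 0.1], [0.0]]     # stand-in face-match probabilities
deltas = delta_scores(attention, face_probs)
scores = final_scores(initial, deltas)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In this toy run the second image has the higher semantic score, but the first image moves to the top of the ranking once the face-based delta is applied, which is the effect the abstract attributes to the re-scoring step.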

    REPRESENTING 3D SHAPES WITH PROBABILISTIC DIRECTED DISTANCE FIELDS

    Publication number: US20230154102A1

    Publication date: 2023-05-18

    Application number: US17984521

    Filing date: 2022-11-10

    CPC classification number: G06T15/06 G06T15/10

    Abstract: The present disclosure provides methods, apparatuses, and computer-readable mediums for representing shapes with probabilistic directed distance fields. In some embodiments, a method includes obtaining a camera representation and a latent shape vector representation of a scene. The camera representation indicates position information and direction information of a view of the scene. The method further includes calculating, based on the latent shape vector representation of the scene, a visibility score and a depth for each ray of a plurality of rays emanating from a corresponding plurality of positions and directions. The plurality of positions and directions are determined from the camera representation of the scene. The method further includes generating renders of geometric information of the scene using the visibility score and the depth of the plurality of rays.
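
The per-ray interface described in this abstract (a visibility score and a depth for each position and direction) can be sketched as follows. This is a hedged illustration only: the learned field conditioned on a latent shape vector is replaced by an analytic sphere centered at the origin, so the visibility score is simply 0 or 1 rather than a learned probability, and the names `sphere_ddf` and `render_depth` and the simple pinhole camera are assumptions.

```python
import math


def sphere_ddf(origin, direction, radius=1.0):
    """Toy stand-in for a directed distance field.

    Takes a ray origin and a unit direction and returns (visibility, depth)
    for a sphere of the given radius centered at the origin.
    """
    b = sum(o * d for o, d in zip(origin, direction))
    c = sum(o * o for o in origin) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return 0.0, 0.0          # ray misses the sphere
    root = math.sqrt(disc)
    t = -b - root                # nearest intersection along the ray
    if t < 0.0:
        t = -b + root            # origin is inside the sphere
    if t < 0.0:
        return 0.0, 0.0          # sphere is entirely behind the ray
    return 1.0, t


def render_depth(camera_position, size=5, field=sphere_ddf):
    """Render a size x size depth map with a camera looking along +z.

    Pixels whose rays are not visible are stored as None.
    """
    image = []
    for row in range(size):
        y = 1.0 - 2.0 * row / (size - 1)
        line = []
        for col in range(size):
            x = -1.0 + 2.0 * col / (size - 1)
            norm = math.sqrt(x * x + y * y + 1.0)
            direction = (x / norm, y / norm, 1.0 / norm)
            visibility, depth = field(camera_position, direction)
            line.append(depth if visibility > 0.5 else None)
        image.append(line)
    return image


depth_map = render_depth((0.0, 0.0, -3.0))
```

Each pixel needs a single field query per ray rather than a marching loop, which is the practical appeal of a directed distance representation; a learned model would replace `sphere_ddf` while keeping the same call shape.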

    APPARATUS FOR VIDEO SEARCHING USING MULTI-MODAL CRITERIA AND METHOD THEREOF

    Publication number: US20210193187A1

    Publication date: 2021-06-24

    Application number: US16725609

    Filing date: 2019-12-23

    Abstract: An apparatus for video searching includes a memory storing instructions, and a processor configured to execute the instructions to split a video into scenes, obtain, from the scenes into which the video is split, one or more textual descriptors describing each of the scenes, encode the obtained one or more textual descriptors describing each of the scenes into a video scene vector of each of the scenes, encode a user query into a query vector having a same semantic representation as that of the video scene vector of each of the scenes into which the one or more textual descriptors describing each of the scenes are encoded, and identify whether the video scene vector of at least one among the scenes corresponds to the query vector into which the user query is encoded.
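
The matching step in this abstract, where scene descriptors and the user query are encoded into the same representation and then compared, can be sketched in Python. This is an illustrative assumption rather than the claimed encoder: a bag-of-words count vector stands in for the semantic encoding, cosine similarity with a fixed threshold stands in for the correspondence test, and scene splitting and descriptor extraction are assumed to have already produced the descriptor lists. The names `encode`, `cosine`, and `search_scenes` are hypothetical.

```python
import math
from collections import Counter


def encode(text):
    """Toy shared encoder: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_scenes(scene_descriptors, query, threshold=0.3):
    """Return (scene_index, score) pairs whose similarity meets the threshold.

    scene_descriptors is a list with one entry per scene, each entry being a
    list of textual descriptors for that scene.
    """
    query_vector = encode(query)
    hits = []
    for index, descriptors in enumerate(scene_descriptors):
        scene_vector = encode(" ".join(descriptors))
        score = cosine(scene_vector, query_vector)
        if score >= threshold:
            hits.append((index, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


scenes = [
    ["a dog runs on the beach"],
    ["two people cook in a kitchen"],
]
hits = search_scenes(scenes, "dog on beach")
```

Because both sides pass through the same `encode` function, the query and scene vectors are directly comparable; swapping in a learned sentence encoder would change only that function.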
