CONFIDENCE-BASED INTERACTABLE NEURAL-SYMBOLIC VISUAL QUESTION ANSWERING

    Publication No.: US20240160842A1

    Publication date: 2024-05-16

    Application No.: US18387728

    Filing date: 2023-11-07

    CPC classification number: G06F40/205 G06F16/24578

    Abstract: A method of performing visual question answering (VQA), including: obtaining an image and a question corresponding to the image; generating a plurality of feature predictions about at least one object included in the image by providing the image to an artificial intelligence (AI) scene perception model; generating a plurality of symbolic programs and a plurality of program confidence scores by providing the question to an AI question parsing model; selecting a symbolic program associated with a program confidence score which is highest among the plurality of program confidence scores; executing the selected symbolic program by performing a set of logic operations included in the selected symbolic program on the plurality of feature predictions; and determining a natural language answer to the question based on a result of the set of logic operations.
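The steps in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the patented implementation: the two AI models are replaced by hand-written outputs, the names `CandidateProgram`, `select_program`, `execute` and `answer_question` are hypothetical, and feature predictions are simplified to one fixed label per attribute.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

# Hypothetical stand-in for the scene perception model's output:
# one dict of predicted attributes per detected object.
Scene = List[Dict[str, str]]


@dataclass
class CandidateProgram:
    """One parsed symbolic program plus the parser's confidence in it."""
    ops: List[Tuple[str, str]]  # (operation, argument) pairs; hypothetical encoding
    confidence: float


def select_program(candidates: List[CandidateProgram]) -> CandidateProgram:
    """Pick the candidate with the highest program confidence score."""
    return max(candidates, key=lambda c: c.confidence)


def execute(program: CandidateProgram, scene: Scene) -> Union[int, bool]:
    """Run the program's logic operations over the feature predictions."""
    objects = list(scene)
    for op, arg in program.ops:
        if op == "filter_color":
            objects = [o for o in objects if o["color"] == arg]
        elif op == "filter_shape":
            objects = [o for o in objects if o["shape"] == arg]
        elif op == "count":
            return len(objects)
        elif op == "exist":
            return len(objects) > 0
        else:
            raise ValueError(f"unknown operation: {op}")
    raise ValueError("program ended without a terminal operation")


def to_answer(result: Union[int, bool]) -> str:
    """Turn the execution result into a natural language answer."""
    if isinstance(result, bool):  # check bool first: bool is a subclass of int
        return "yes" if result else "no"
    return str(result)


def answer_question(scene: Scene, candidates: List[CandidateProgram]) -> str:
    return to_answer(execute(select_program(candidates), scene))


# Hand-written stand-ins for the two model outputs.
scene = [
    {"color": "red", "shape": "cube"},
    {"color": "red", "shape": "sphere"},
    {"color": "blue", "shape": "cube"},
]
candidates = [
    CandidateProgram([("filter_color", "red"), ("count", "")], confidence=0.9),
    CandidateProgram([("filter_shape", "sphere"), ("exist", "")], confidence=0.1),
]
print(answer_question(scene, candidates))
```

In this sketch the question "How many red objects are there?" is assumed to parse into the two candidates shown; the first has the higher confidence, so it is the one executed.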
