Invention Grant
- Patent Title: Interpretable counting in visual question answering
- Application No.: US16781179
- Application Date: 2020-02-04
- Publication No.: US11270145B2
- Publication Date: 2022-03-08
- Inventor: Alexander Richard Trott, Caiming Xiong, Richard Socher
- Applicant: salesforce.com, inc.
- Applicant Address: US CA San Francisco
- Assignee: salesforce.com, inc.
- Current Assignee: salesforce.com, inc.
- Current Assignee Address: US CA San Francisco
- Agency: Haynes and Boone, LLP
- Main IPC: G06K9/00
- IPC: G06K9/00; G06K9/46; G06F16/332; G06N5/04; G06N3/04

Abstract:
Approaches for interpretable counting for visual question answering include a digital image processor, a language processor, a scorer, and a counter. The digital image processor identifies objects in an image, maps the identified objects into an embedding space, generates a bounding box for each of the identified objects, and outputs the embedded objects paired with their bounding boxes. The language processor embeds a question into the embedding space. The scorer determines scores for the identified objects. Each respective score indicates how well a corresponding one of the identified objects is responsive to the question. The counter determines a count of the objects in the digital image that are responsive to the question based on the scores. The count and a corresponding bounding box for each object included in the count are output. In some embodiments, the counter determines the count interactively based on interactions between counted and uncounted objects.
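
The flow described in the abstract (score each embedded object against the embedded question, then count the responsive objects and return their bounding boxes) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the names `DetectedObject`, `score_objects`, `iou`, and `count_objects`, the dot-product score, the fixed `threshold`, and the IoU-based `overlap_penalty` that stands in for the interaction between counted and uncounted objects. The embodiments may well use learned components for each of these steps.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


@dataclass
class DetectedObject:
    """An object embedding paired with its bounding box (illustrative type)."""
    embedding: List[float]
    box: Box


def score_objects(objects: List[DetectedObject],
                  question_embedding: List[float]) -> List[float]:
    """Score each object by a dot product with the question embedding.

    The dot product is an assumed stand-in for the scorer.
    """
    return [
        sum(a * b for a, b in zip(obj.embedding, question_embedding))
        for obj in objects
    ]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_objects(objects: List[DetectedObject],
                  question_embedding: List[float],
                  threshold: float = 0.5,
                  overlap_penalty: float = 1.0) -> Tuple[int, List[Box]]:
    """Greedily count responsive objects, one at a time.

    After an object is counted, the scores of the remaining (uncounted)
    objects are lowered in proportion to their overlap with it. This is an
    assumed, hand-written form of interaction, used so that near-duplicate
    detections of the same object are not counted twice.
    """
    scores = score_objects(objects, question_embedding)
    remaining = list(range(len(objects)))
    counted: List[int] = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        if scores[best] < threshold:
            break  # nothing left that is responsive enough
        counted.append(best)
        remaining.remove(best)
        for i in remaining:
            scores[i] -= overlap_penalty * iou(objects[best].box, objects[i].box)
    return len(counted), [objects[i].box for i in counted]
```

Returning the boxes alongside the count is what makes the result inspectable: each unit of the count can be traced back to a specific region of the image.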
Public/Granted literature
- US20200175305A1 Interpretable Counting in Visual Question Answering, Public/Granted day: 2020-06-04