-
Publication No.: US11854283B2
Publication Date: 2023-12-26
Application No.: US17169112
Application Date: 2021-02-05
Inventor: Pengyuan Lv, Xiaoqiang Zhang, Shanshan Liu, Chengquan Zhang, Qiming Peng, Sijin Wu, Hua Lu, Yongfeng Chen
IPC: G06V30/262, G06T7/70, G06V30/413, G06V20/62, G06F16/33, G06V30/19, G06V10/82, G06V30/416
CPC classification number: G06V30/274, G06F16/3344, G06T7/70, G06V10/82, G06V20/62, G06V30/19173, G06V30/413, G06V30/416, G06T2207/30176
Abstract: The present disclosure provides a method for visual question answering, which relates to the fields of computer vision and natural language processing. The method includes: acquiring an input image and an input question; detecting visual information and position information of each of at least one text region in the input image; determining semantic information and attribute information of each of the at least one text region based on the visual information and the position information; determining a global feature of the input image based on the visual information, the position information, the semantic information, and the attribute information; determining a question feature based on the input question; and generating a predicted answer for the input image and the input question based on the global feature and the question feature. The present disclosure further provides a device for visual question answering, a computer device, and a medium.
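The data flow listed in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how features are computed or fused, so every function here (`semantic_and_attribute`, `global_feature`, `question_feature`, `predict`) is a hypothetical stand-in, the "encoders" are toy arithmetic rather than learned models, and mean-pooling plus element-wise fusion is an assumption chosen only to keep the example short.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class TextRegion:
    visual: List[float]    # visual information of one text region (toy vector)
    position: List[float]  # position information, assumed to be a box [x0, y0, x1, y1]


def semantic_and_attribute(region: TextRegion) -> Tuple[List[float], List[float]]:
    """Hypothetical stand-in: derive semantic and attribute vectors from
    the region's visual and position information."""
    semantic = [sum(region.visual)]          # assumption: toy "semantic" value
    x0, y0, x1, y1 = region.position
    attribute = [(x1 - x0) * (y1 - y0)]      # assumption: box area as an "attribute"
    return semantic, attribute


def global_feature(regions: Sequence[TextRegion]) -> List[float]:
    """Concatenate the four kinds of information per region, then mean-pool
    over regions (the pooling choice is an assumption)."""
    rows = []
    for r in regions:
        sem, attr = semantic_and_attribute(r)
        rows.append(r.visual + r.position + sem + attr)
    return [sum(col) / len(rows) for col in zip(*rows)]


def question_feature(question: str, dim: int) -> List[float]:
    """Hypothetical question encoder: fold character codes into `dim` slots."""
    feat = [0.0] * dim
    for i, ch in enumerate(question):
        feat[i % dim] += ord(ch) / 1000.0
    return feat


def predict(regions: Sequence[TextRegion], question: str,
            answers: List[str], weights: List[List[float]]) -> str:
    """Fuse global and question features, score each candidate answer with
    one weight row per answer, and return the best-scoring answer."""
    g = global_feature(regions)
    q = question_feature(question, len(g))
    fused = [a * b for a, b in zip(g, q)]    # assumption: element-wise fusion
    scores = [sum(w * f for w, f in zip(row, fused)) for row in weights]
    return answers[scores.index(max(scores))]


regions = [
    TextRegion(visual=[1.0, 2.0], position=[0.0, 0.0, 0.5, 0.5]),
    TextRegion(visual=[3.0, 4.0], position=[0.5, 0.5, 1.0, 1.0]),
]
weights = [[1.0] + [0.0] * 7, [0.0] * 8]     # made-up weights for two answers
print(predict(regions, "What is the title?", ["yes", "no"], weights))
```

In a real system each stand-in would be replaced by a trained component (for example a text detector, a recognizer, and a question encoder); only the order of the steps is taken from the abstract.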
-
Publication No.: US20220188509A1
Publication Date: 2022-06-16
Application No.: US17456765
Application Date: 2021-11-29
IPC: G06F40/205, G06F16/93, G06F16/31, G06F16/33
Abstract: The disclosure provides a method and an apparatus for extracting content from a document, an electronic device, and a storage medium, which relate to the field of artificial intelligence (AI) technologies such as natural language processing (NLP), deep learning (DL), and knowledge graph (KG). The detailed implementation scheme is: obtaining the document; performing anchor search on the document to obtain anchor information corresponding to the document; determining region information of content to be extracted based on the anchor information; and extracting the content to be extracted from the document based on the region information.
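The four steps in the abstract (obtain the document, search for anchors, determine the region, extract the content) can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the document is assumed to be plain text, an anchor is assumed to be a literal label string, and the region is assumed to run from the end of the target anchor to the start of the next anchor found. The names `find_anchors`, `region_after`, and `extract` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def find_anchors(document: str, anchors: List[str]) -> Dict[str, int]:
    """Anchor search: map each anchor found in the document to its start offset."""
    positions = {}
    for anchor in anchors:
        idx = document.find(anchor)   # first occurrence only (assumption)
        if idx != -1:
            positions[anchor] = idx
    return positions


def region_after(document: str, positions: Dict[str, int],
                 anchor: str) -> Optional[Tuple[int, int]]:
    """Region determination: span from the end of `anchor` to the next
    anchor found after it, or to the end of the document if there is none."""
    if anchor not in positions:
        return None
    start = positions[anchor] + len(anchor)
    later = [p for p in positions.values() if p >= start]
    end = min(later) if later else len(document)
    return start, end


def extract(document: str, anchors: List[str], target: str) -> Optional[str]:
    """Extraction: return the text in the region that follows `target`,
    or None when the target anchor is not present."""
    positions = find_anchors(document, anchors)
    region = region_after(document, positions, target)
    if region is None:
        return None
    start, end = region
    return document[start:end].strip()


# Made-up sample document, used only to exercise the sketch.
doc = "Party A: Acme Corp\nParty B: Globex Ltd\nTerm: 12 months\n"
print(extract(doc, ["Party A:", "Party B:", "Term:"], "Party B:"))  # Globex Ltd
```

Keeping anchor search, region determination, and extraction as separate functions mirrors the step order in the abstract and lets each stage be swapped out, for instance replacing literal string matching with a learned anchor detector.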
-