-
1.
Publication No.: US20220139096A1
Publication date: 2022-05-05
Application No.: US17578735
Application date: 2022-01-19
Inventor: Pengyuan Lv, Chengquan Zhang, Kun Yao, Junyu Han
Abstract: A character recognition method, a model training method, a related apparatus and an electronic device are provided. The specific solution is: obtaining a target picture; performing feature encoding on the target picture to obtain a visual feature of the target picture; performing feature mapping on the visual feature to obtain a first target feature of the target picture, where the first target feature is a feature that has a matching space with a feature of character semantic information of the target picture; and inputting the first target feature into a character recognition model for character recognition to obtain a first character recognition result of the target picture.
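The three steps in this abstract (feature encoding, feature mapping, recognition) can be read as a simple data flow. The sketch below is only an illustration of that flow under strong assumptions: the function names, the column-statistics "encoder", the linear mapping and the per-step argmax "recognizer" are all hypothetical stand-ins and are not taken from the patent.

```python
from typing import List

Vector = List[float]


def encode_visual(picture: List[List[float]]) -> List[Vector]:
    """Hypothetical encoder: one 2-d feature (mean, max) per pixel column."""
    columns = list(zip(*picture))
    return [[sum(col) / len(col), max(col)] for col in columns]


def map_feature(visual: List[Vector], weights: List[Vector]) -> List[Vector]:
    """Hypothetical linear mapping into the space shared with semantic features."""
    return [[sum(w * x for w, x in zip(row, vec)) for row in weights]
            for vec in visual]


def recognize(mapped: List[Vector], charset: str) -> str:
    """Hypothetical recognizer: highest-scoring character at each step."""
    return "".join(charset[max(range(len(vec)), key=vec.__getitem__)]
                   for vec in mapped)


def recognize_picture(picture: List[List[float]],
                      weights: List[Vector],
                      charset: str) -> str:
    visual = encode_visual(picture)        # visual feature
    target = map_feature(visual, weights)  # "first target feature"
    return recognize(target, charset)      # first recognition result


if __name__ == "__main__":
    picture = [[0.0, 1.0], [0.0, 0.5]]                  # toy 2x2 "image"
    weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]    # one row per character
    print(recognize_picture(picture, weights, "abc"))
```

In a real system each stage would presumably be a learned network; the point here is only the order in which the features are produced and consumed.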
-
2.
Publication No.: US11854283B2
Publication date: 2023-12-26
Application No.: US17169112
Application date: 2021-02-05
Inventor: Pengyuan Lv, Xiaoqiang Zhang, Shanshan Liu, Chengquan Zhang, Qiming Peng, Sijin Wu, Hua Lu, Yongfeng Chen
IPC: G06V30/262, G06T7/70, G06V30/413, G06V20/62, G06F16/33, G06V30/19, G06V10/82, G06V30/416
CPC classification number: G06V30/274, G06F16/3344, G06T7/70, G06V10/82, G06V20/62, G06V30/19173, G06V30/413, G06V30/416, G06T2207/30176
Abstract: The present disclosure provides a method for visual question answering, which relates to fields of computer vision and natural language processing. The method includes: acquiring an input image and an input question; detecting visual information and position information of each of at least one text region in the input image; determining semantic information and attribute information of each of the at least one text region based on the visual information and the position information; determining a global feature of the input image based on the visual information, the position information, the semantic information, and the attribute information; determining a question feature based on the input question; and generating a predicted answer for the input image and the input question based on the global feature and the question feature. The present disclosure further provides a device for visual question answering, a computer device and a medium.
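The steps listed in this abstract can be laid out as a small pipeline. The following is a toy sketch, not the patented method: the `TextRegion` type, the digit-based attribute rule and the keyword-based answer selection are all hypothetical placeholders chosen only to make the data flow concrete.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class TextRegion:
    visual: str                     # stand-in for visual information (raw crop text)
    box: Tuple[int, int, int, int]  # position information (x0, y0, x1, y1)
    semantic: str = ""              # filled in by annotate()
    attribute: str = ""             # filled in by annotate()


def annotate(region: TextRegion) -> TextRegion:
    """Hypothetical rule: derive semantic and attribute information."""
    region.semantic = region.visual.strip().lower()
    is_number = region.semantic.replace(".", "", 1).isdigit()
    region.attribute = "number" if is_number else "word"
    return region


def global_feature(regions: List[TextRegion]):
    """Combine per-region information into one image-level description."""
    return [(r.semantic, r.attribute, r.box) for r in regions]


def question_feature(question: str) -> Set[str]:
    """Hypothetical question feature: a bag of lowercase tokens."""
    return set(question.lower().replace("?", "").split())


def predict_answer(gfeat, qfeat: Set[str]) -> str:
    """Hypothetical answer step: return the first region of the wanted type."""
    wanted = "number" if qfeat & {"how", "much", "many", "price"} else "word"
    for semantic, attribute, _box in gfeat:
        if attribute == wanted:
            return semantic
    return ""


if __name__ == "__main__":
    regions = [annotate(TextRegion("SALE", (0, 0, 10, 5))),
               annotate(TextRegion("9.99", (0, 6, 10, 10)))]
    print(predict_answer(global_feature(regions),
                         question_feature("What is the price?")))
```

The abstract describes learned features at every stage; the rules above only mark where each of those features would enter.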
-
3.
Publication No.: US20230123327A1
Publication date: 2023-04-20
Application No.: US18068149
Application date: 2022-12-19
Inventor: Chengquan Zhang, Pengyuan Lv, Kun Yao, Junyu Han, Jingtuo Liu
Abstract: A method for recognizing text includes: obtaining an image sequence feature of an image to be recognized; obtaining a full text string of the image to be recognized by decoding the image sequence feature; obtaining a text sequence feature by performing a semantic enhancement process on the full text string, in which the image sequence feature, the full text string and the text sequence feature are of the same length; and determining text content of the image to be recognized based on the full text string and the text sequence feature.
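The key constraint in this abstract is that the image sequence feature, the full text string and the text sequence feature all have the same length. The sketch below illustrates that constraint with toy stand-ins: the blank symbol, the `CONFUSABLE` table and the neighbour-based "semantic enhancement" rule are assumptions made for illustration and do not come from the patent.

```python
from typing import List

BLANK = "-"                          # hypothetical placeholder symbol
CONFUSABLE = {"1": "l", "0": "o"}    # hypothetical digit/letter confusions


def decode(image_seq: List[List[float]], charset: str) -> str:
    """Decode one character per step, keeping blanks so the length is preserved."""
    return "".join(charset[max(range(len(step)), key=step.__getitem__)]
                   for step in image_seq)


def enhance(full_text: str) -> List[str]:
    """Toy semantic enhancement: fix a confusable digit next to a letter."""
    out = []
    for i, ch in enumerate(full_text):
        neighbours = full_text[max(i - 1, 0):i] + full_text[i + 1:i + 2]
        if ch in CONFUSABLE and any(n.isalpha() for n in neighbours):
            out.append(CONFUSABLE[ch])
        else:
            out.append(ch)
    return out


def final_text(full_text: str, text_seq: List[str]) -> str:
    """Combine both sequences position by position, then drop blanks."""
    assert len(full_text) == len(text_seq)
    return "".join(s for f, s in zip(full_text, text_seq) if f != BLANK)


if __name__ == "__main__":
    charset = "-helo1"
    # One-hot scores standing in for an image sequence feature of length 6.
    image_seq = [[1.0 if c == t else 0.0 for c in charset] for t in "he1-lo"]
    full = decode(image_seq, charset)
    print(final_text(full, enhance(full)))
```

Because blanks are kept until the last step, all three sequences stay aligned, which is what lets the final step combine them position by position.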
-
-