-
1.
Publication No.: EP3839818A3
Publication date: 2021-10-06
Application No.: EP21162002.6
Filing date: 2021-03-11
Inventors: LI, Yulin; QIN, Xiameng; ZHANG, Chengquan; HAN, Junyu; DING, Errui; WU, Tian; WANG, Haifeng
Abstract: Embodiments of the present disclosure disclose a method and apparatus for performing a structured extraction on a text, a device and a storage medium, and relate to the field of artificial intelligence such as computer vision, deep learning, and natural language processing. A specific implementation of the method includes: performing a text detection on an entity text image to obtain a position and content of a text line of the entity text image; extracting multivariate information of the text line based on the position and the content of the text line; performing a feature fusion on the multivariate information of the text line to obtain a multimodal fusion feature of the text line; performing category and relationship reasoning based on the multimodal fusion feature of the text line to obtain a category and a relationship probability matrix of the text line; and constructing structured information of the entity text image based on the category and the relationship probability matrix of the text line. According to the implementation, a method for performing a structured extraction on a text based on category and relationship reasoning is provided, which is suitable for large-scale and automated processing and has a wide application range and a strong versatility.
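The last step of this abstract (building structured information from per-line categories and a relationship probability matrix) can be illustrated with a minimal sketch. It is not the patented implementation: the category names "key" and "value", the 0.5 threshold, and the function name `build_structured_info` are assumptions made only for illustration.

```python
from typing import Dict, List


def build_structured_info(
    texts: List[str],
    categories: List[str],
    rel_prob: List[List[float]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the abstract
) -> Dict[str, List[str]]:
    """Pair each 'key' text line with the 'value' lines it is likely related to.

    rel_prob[i][j] is the predicted probability that lines i and j are related.
    """
    result: Dict[str, List[str]] = {}
    for i, cat_i in enumerate(categories):
        if cat_i != "key":
            continue
        # Keep every value line whose relationship probability reaches the threshold.
        result[texts[i]] = [
            texts[j]
            for j, cat_j in enumerate(categories)
            if cat_j == "value" and rel_prob[i][j] >= threshold
        ]
    return result


# Hypothetical detection output for a four-line document.
texts = ["Name", "Alice", "Date", "2021-03-11"]
categories = ["key", "value", "key", "value"]
rel_prob = [
    [0.0, 0.9, 0.1, 0.2],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.8],
    [0.2, 0.1, 0.8, 0.0],
]
structured = build_structured_info(texts, categories, rel_prob)
print(structured)
```

Thresholding the matrix keeps the sketch short; a real system might instead pick the single most probable partner for each line.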
-
2.
Publication No.: EP3885935A1
Publication date: 2021-09-29
Application No.: EP21275029.3
Filing date: 2021-03-16
Inventors: QIN, Xiameng; LI, Yulin; HUANG, Ju; XIE, Qunyi; HAN, Junyu
IPC classification: G06F16/583, G06F16/332
Abstract: The present application discloses an image questioning and answering method, apparatus, device and storage medium, relating to the technical field of image processing, computer vision, deep learning and natural language processing. The specific implementation solution is as follows: constructing a question graph with a topological structure and extracting a question feature of a query sentence, according to the query sentence; constructing a visual graph with a topological structure and a text graph with a topological structure according to a target image corresponding to the query sentence; performing fusion on the visual graph, the text graph and the question graph by using a fusion model, to obtain a final fusion graph; and determining reply information of the query sentence according to a reasoning feature extracted from the final fusion graph and the question feature.
-
3.
Publication No.: EP3968287A2
Publication date: 2022-03-16
Application No.: EP22151884.8
Filing date: 2022-01-17
Inventors: QIN, Xiameng; LI, Yulin; HUANG, Ju; XIE, Qunyi; ZHANG, Chengquan; YAO, Kun; LIU, Jingtuo; HAN, Junyu
IPC classification: G06V30/41
Abstract: Provided are a method and apparatus for extracting information about a negotiable instrument, an electronic device and a storage medium. The method includes inputting (S101) a to-be-recognized negotiable instrument into a pretrained deep learning network and obtaining a visual image corresponding to the to-be-recognized negotiable instrument through the deep learning network; matching (S102) the visual image corresponding to the to-be-recognized negotiable instrument with a visual image corresponding to each negotiable-instrument template in a preconstructed base template library; and in response to the visual image corresponding to the to-be-recognized negotiable instrument successfully matching a visual image corresponding to one negotiable-instrument template in the base template library, extracting (S103) structured information of the to-be-recognized negotiable instrument by using the negotiable-instrument template.
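The matching step (S102) and the conditional extraction (S103) can be pictured with a small sketch. The abstract does not say how the match is computed, so everything here is an assumption for illustration: each instrument is reduced to a plain feature vector, similarity is cosine similarity, a match succeeds at a threshold of 0.9, and the names `cosine` and `match_template` are hypothetical.

```python
import math
from typing import Dict, List, Optional


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_template(
    query: List[float],
    library: Dict[str, List[float]],
    threshold: float = 0.9,  # assumed success criterion
) -> Optional[str]:
    """Return the name of the best-matching template, or None if no match succeeds."""
    best_name, best_score = None, threshold
    for name, vec in library.items():
        score = cosine(query, vec)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical base template library with two templates.
library = {"invoice": [1.0, 0.0, 0.0], "cheque": [0.0, 1.0, 0.0]}
matched = match_template([0.9, 0.1, 0.0], library)
unmatched = match_template([1.0, 1.0, 0.0], library)
```

When `match_template` returns a name, the corresponding template would drive the extraction; when it returns `None`, the instrument matched no template in the library.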
-
4.
Publication No.: EP3882817A3
Publication date: 2022-01-05
Application No.: EP21180801.9
Filing date: 2021-06-22
Inventors: HUANG, Ju; XIE, Qunyi; LI, Yulin; QIN, Xiameng; YAO, Kun; HAN, Junyu
Abstract: The present disclosure discloses a method, apparatus and device for recognizing a bill, and a storage medium. The method comprises: acquiring a bill image; inputting the bill image into a feature extraction network layer of a pre-trained bill recognition model, to obtain a bill key field feature map and a bill key field value feature map of the bill image; inputting the bill key field feature map into a first head network layer of the bill recognition model, to obtain a bill key field; processing the bill key field value feature map by a second head network layer of the bill recognition model, to obtain a bill key field value, the feature extraction network layer being respectively connected with the first head network layer and the second head network layer; and generating structured information of the bill image based on the bill key field and the bill key field value.
-
5.
Publication No.: EP3882817A2
Publication date: 2021-09-22
Application No.: EP21180801.9
Filing date: 2021-06-22
Inventors: HUANG, Ju; XIE, Qunyi; LI, Yulin; QIN, Xiameng; YAO, Kun; HAN, Junyu
Abstract: The present disclosure discloses a method, apparatus and device for recognizing a bill, and a storage medium. The method comprises: acquiring a bill image; inputting the bill image into a feature extraction network layer of a pre-trained bill recognition model, to obtain a bill key field feature map and a bill key field value feature map of the bill image; inputting the bill key field feature map into a first head network layer of the bill recognition model, to obtain a bill key field; processing the bill key field value feature map by a second head network layer of the bill recognition model, to obtain a bill key field value, the feature extraction network layer being respectively connected with the first head network layer and the second head network layer; and generating structured information of the bill image based on the bill key field and the bill key field value.
-
6.
Publication No.: EP3869398A2
Publication date: 2021-08-25
Application No.: EP21180877.9
Filing date: 2021-06-22
Inventors: ZHANG, Chengquan; EN, Mengyi; HUANG, Ju; XIE, Qunyi; QIN, Xiameng; YAO, Kun; HAN, Junyu; LIU, Jingtuo; DING, Errui
Abstract: A method and apparatus for processing an image, a device and a storage medium. An implementation of the method includes: acquiring a template image, the template image including at least one region of interest; determining a first feature map corresponding to each region of interest in the template image; acquiring a target image; determining a second feature map of the target image; and determining at least one region of interest in the target image according to the first feature map and the second feature map.
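One plausible way to locate a region of interest from the two feature maps is to slide the template feature map over the target feature map and keep the position with the strongest response. The abstract does not state how the two maps are combined, so the plain dot-product correlation below, the single-channel 2D maps, and the function name `locate_roi` are assumptions for illustration only.

```python
from typing import List, Tuple

FeatureMap = List[List[float]]


def locate_roi(target: FeatureMap, template: FeatureMap) -> Tuple[Tuple[int, int], float]:
    """Slide `template` over `target`; return the top-left (row, col) of the
    window with the highest dot-product response, plus that response."""
    t_h, t_w = len(template), len(template[0])
    best_score, best_pos = float("-inf"), (0, 0)
    for r in range(len(target) - t_h + 1):
        for c in range(len(target[0]) - t_w + 1):
            # Dot product between the template and the current window.
            score = sum(
                target[r + i][c + j] * template[i][j]
                for i in range(t_h)
                for j in range(t_w)
            )
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score


# Hypothetical 2x2 template feature map embedded in a 4x4 target feature map.
template = [[1.0, 2.0], [3.0, 4.0]]
target = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 3.0, 4.0],
    [0.0, 0.0, 0.0, 0.0],
]
position, response = locate_roi(target, template)
```

An unnormalised dot product favours windows with large activations, so a practical system would likely normalise the response or learn the comparison.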
-
7.
Publication No.: EP3839818A2
Publication date: 2021-06-23
Application No.: EP21162002.6
Filing date: 2021-03-11
Inventors: LI, Yulin; QIN, Xiameng; ZHANG, Chengquan; HAN, Junyu; DING, Errui; WU, Tian; WANG, Haifeng
Abstract: Embodiments of the present disclosure disclose a method and apparatus for performing a structured extraction on a text, a device and a storage medium, and relate to the field of artificial intelligence such as computer vision, deep learning, and natural language processing. A specific implementation of the method includes: performing a text detection on an entity text image to obtain a position and content of a text line of the entity text image; extracting multivariate information of the text line based on the position and the content of the text line; performing a feature fusion on the multivariate information of the text line to obtain a multimodal fusion feature of the text line; performing category and relationship reasoning based on the multimodal fusion feature of the text line to obtain a category and a relationship probability matrix of the text line; and constructing structured information of the entity text image based on the category and the relationship probability matrix of the text line. According to the implementation, a method for performing a structured extraction on a text based on category and relationship reasoning is provided, which is suitable for large-scale and automated processing and has a wide application range and a strong versatility.
-
8.
Publication No.: EP4040401A1
Publication date: 2022-08-10
Application No.: EP21197863.0
Filing date: 2021-09-21
Inventors: LI, Yulin; HUANG, Ju; XIE, Qunyi; QIN, Xiameng; ZHANG, Chengquan; LIU, Jingtuo
IPC classification: G06V10/82, G06V30/413
Abstract: The present disclosure discloses an image processing method and apparatus, a device and a storage medium, and relates to the field of artificial intelligence technologies, and particularly to the fields of computer vision technologies, deep learning technologies, or the like. The image processing method includes: acquiring a multi-modal feature of each of at least one text region in an image, the multi-modal feature including features in plural dimensions; performing a global attention processing operation on the multi-modal feature of each text region to obtain a global attention feature of each text region; determining a category of each text region based on the global attention feature of each text region; and constructing structured information based on text content and the category of each text region. The present disclosure may provide a more universal construction scheme for structured information in an image.
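The "global attention processing operation" can be pictured as every text region attending to every other region. The sketch below uses unparameterised scaled dot-product self-attention as a stand-in; the abstract does not specify the attention form, and the function name `global_attention` and the toy feature vectors are assumptions for illustration.

```python
import math
from typing import List


def global_attention(features: List[List[float]]) -> List[List[float]]:
    """Scaled dot-product self-attention with queries = keys = values = features.

    Each output row is a softmax-weighted average of all input rows, so every
    text region's feature is updated with information from all regions.
    """
    dim = len(features[0])
    attended = []
    for query in features:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in features
        ]
        # Numerically stable softmax over the scores.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, features)) for i in range(dim)]
        )
    return attended


# Hypothetical multi-modal features for three text regions.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = global_attention(regions)
```

A trained model would add learned projections for queries, keys and values, followed by a classifier over each attended feature to assign the region's category.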
-
9.
Publication No.: EP3968287A3
Publication date: 2022-07-13
Application No.: EP22151884.8
Filing date: 2022-01-17
Inventors: QIN, Xiameng; LI, Yulin; HUANG, Ju; XIE, Qunyi; ZHANG, Chengquan; YAO, Kun; LIU, Jingtuo; HAN, Junyu
Abstract: Provided are a method and apparatus for extracting information about a negotiable instrument, an electronic device and a storage medium. The method includes inputting (S101) a to-be-recognized negotiable instrument into a pretrained deep learning network and obtaining a visual image corresponding to the to-be-recognized negotiable instrument through the deep learning network; matching (S102) the visual image corresponding to the to-be-recognized negotiable instrument with a visual image corresponding to each negotiable-instrument template in a preconstructed base template library; and in response to the visual image corresponding to the to-be-recognized negotiable instrument successfully matching a visual image corresponding to one negotiable-instrument template in the base template library, extracting (S103) structured information of the to-be-recognized negotiable instrument by using the negotiable-instrument template.
-
10.
Publication No.: EP3923185A3
Publication date: 2022-04-27
Application No.: EP21202754.4
Filing date: 2021-10-14
Inventors: YU, Yuechen; ZHANG, Chengquan; LI, Yulin; ZHANG, Xiaoqiang; HUANG, Ju; QIN, Xiameng; YAO, Kun; LIU, Jingtuo; HAN, Junyu; DING, Errui
Abstract: Provided are an image classification method and apparatus, an electronic device and a storage medium, relating to the field of artificial intelligence and, in particular, to computer vision and deep learning. The method includes inputting (S101, S201, S301) a to-be-classified document image into a pretrained neural network and obtaining a feature submap of each text box of the to-be-classified document image by use of the neural network; inputting (S102, S202, S302) the feature submap of each text box, a semantic feature corresponding to preobtained text information of each text box and a position feature corresponding to preobtained position information of each text box into a pretrained multimodal feature fusion model and fusing, by use of the multimodal feature fusion model, the three into a multimodal feature corresponding to each text box; and classifying (S103) the to-be-classified document image based on the multimodal feature corresponding to each text box. The semantic feature and position feature in the document image are well used so that the object of improving the classification accuracy of the document image is achieved.