IMAGE PROCESSING METHOD AND APPARATUS, DEVICE AND STORAGE MEDIUM

Invention Publication

EP4040401A1 IMAGE PROCESSING METHOD AND APPARATUS, DEVICE AND STORAGE MEDIUM (Under examination - Published)


Patent title: IMAGE PROCESSING METHOD AND APPARATUS, DEVICE AND STORAGE MEDIUM
Application number: EP21197863.0

Filing date: 2021-09-21
Publication number: EP4040401A1

Publication date: 2022-08-10
Inventors: LI, Yulin; HUANG, Ju; XIE, Qunyi; QIN, Xiameng; ZHANG, Chengquan; LIU, Jingtuo
Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO. LTD.
Applicant address: CN 100085 Beijing 2/F Baidu Campus, No. 10, Shangdi 10th, Haidian District
Representative: dompatent von Kreisler Selting Werner - Partnerschaft von Patent- und Rechtsanwälten mbB
Priority: CN202110156565 20210204
Main classification: G06V10/82
IPC classification: G06V10/82; G06V30/413


Abstract:

The present disclosure discloses an image processing method and apparatus, a device and a storage medium, and relates to the field of artificial intelligence technologies, and particularly to the fields of computer vision technologies, deep learning technologies, or the like. The image processing method includes: acquiring a multi-modal feature of each of at least one text region in an image, the multi-modal feature including features in plural dimensions; performing a global attention processing operation on the multi-modal feature of each text region to obtain a global attention feature of each text region; determining a category of each text region based on the global attention feature of each text region; and constructing structured information based on text content and the category of each text region. The present disclosure may provide a more universal construction scheme for structured information in an image.
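The four steps named in the abstract (multi-modal feature per text region, global attention, per-region category, structured information) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the disclosure: the flat two-element vectors standing in for multi-modal features, the use of scaled dot-product self-attention with identity projections as the global attention operation, the nearest-prototype classifier, and the hypothetical category names `key` and `value`.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_attention(features):
    """Scaled dot-product self-attention across ALL text regions.

    Assumption: identity query/key/value projections, so each output is a
    weighted average of every region's feature vector.
    """
    dim = len(features[0])
    attended = []
    for query in features:
        weights = softmax([dot(query, key) / math.sqrt(dim) for key in features])
        attended.append(
            [sum(w * f[j] for w, f in zip(weights, features)) for j in range(dim)]
        )
    return attended


def classify(feature, prototypes):
    """Hypothetical classifier: pick the category whose prototype scores highest."""
    return max(prototypes, key=lambda name: dot(feature, prototypes[name]))


def build_structured_info(regions, prototypes):
    """Group the text content of each region under its predicted category."""
    attended = global_attention([r["feature"] for r in regions])
    info = {}
    for region, feat in zip(regions, attended):
        info.setdefault(classify(feat, prototypes), []).append(region["text"])
    return info


# Toy input: each "feature" stands in for a concatenated multi-modal feature
# (for example spatial, semantic and visual parts); the values are made up.
REGIONS = [
    {"text": "Name", "feature": [3.0, 0.0]},
    {"text": "Alice", "feature": [0.0, 3.0]},
    {"text": "Date", "feature": [3.0, 0.0]},
    {"text": "2021-09-21", "feature": [0.0, 3.0]},
]
PROTOTYPES = {"key": [1.0, 0.0], "value": [0.0, 1.0]}  # hypothetical categories

if __name__ == "__main__":
    print(build_structured_info(REGIONS, PROTOTYPES))
```

In a trained system the projections and the classifier would be learned; the sketch only shows how attention over all regions feeds a per-region category decision, which in turn drives the grouping of text content.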


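IPC classification:

G	Physics
G06	Computing; calculating or counting
G06V	Image or video recognition or understanding
G06V10/00	Arrangements for image or video recognition or understanding (character recognition in images or video G06V30/10)
G06V10/70	. using pattern recognition or machine learning (optical pattern recognition or electronic computing G06V10/88)
G06V10/82	.. using neural networks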