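Scene text end-to-end recognition method based on boundary point detection

Invention publication

CN110837835A Scene text end-to-end recognition method based on boundary point detection (patent right in force)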


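Patent title: 一种基于边界点检测的场景文本端到端识别方法 (Scene text end-to-end recognition method based on boundary point detection)
Patent title (English): Scene text end-to-end identification method based on boundary point detection
Application number: CN201911038568.1

Filing date: 2019-10-29
Publication number: CN110837835A

Publication date: 2020-02-25
Inventors: Liu Wenyu, Bai Xiang, Xu Yongchao, Wang Hao, Lu Pu, Zhang Hui, Yang Mingkun, He Mengchao, Wang Yongpan
Applicant: Huazhong University of Science and Technology
Applicant address: No. 1037 Luoyu Road, Hongshan District, Wuhan, Hubei Province
Patentee: Huazhong University of Science and Technology
Current patentee: Huazhong University of Science and Technology
Current patentee address: No. 1037 Luoyu Road, Hongshan District, Wuhan, Hubei Province
Agency: Shenzhen Liujia Intellectual Property Agency Co., Ltd. (深圳市六加知识产权代理有限公司)
Agent: Xiang Bin
Main classification: G06K9/34
IPC classification: G06K9/34 ; G06K9/46 ; G06K9/62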

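Abstract:

The invention discloses an end-to-end scene text recognition method based on boundary point detection. A feature pyramid network extracts text features, which a region proposal network uses to generate candidate text boxes. A multi-oriented rectangle detection network then detects a more accurate multi-oriented bounding box for each text instance. Next, two boundary point sequences, one along the upper edge and one along the lower edge of the text, are detected inside the multi-oriented bounding box. The detected boundary point sequences are used to transform text of arbitrary shape into horizontal text, which an attention-based sequence recognition network then recognizes. Finally, a beam search algorithm finds the word in a given lexicon that best matches the predicted sequence, yielding the final recognition result. The method detects and recognizes scene text of arbitrary shape in natural images, including horizontal, multi-oriented, and curved text, without requiring character-level annotations, and it can be trained fully end to end.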
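The step that turns arbitrarily shaped text into horizontal text can be illustrated with a small sketch. The abstract does not specify how the transformation is computed, so the following is only one plausible reading: it assumes the upper and lower boundary point sequences have the same length (at least two points each) and builds a sampling grid by linear interpolation between them. The names `interpolate_polyline` and `rectification_grid` are hypothetical, not taken from the patent.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def interpolate_polyline(points: List[Point], t: float) -> Point:
    """Return the point at fraction t in [0, 1] along a polyline.

    Assumption: every segment gets an equal share of t, regardless of its
    geometric length. The polyline needs at least two points.
    """
    n_seg = len(points) - 1
    pos = min(max(t, 0.0), 1.0) * n_seg
    i = min(int(pos), n_seg - 1)  # segment index, clamped so t = 1 stays valid
    frac = pos - i
    (x0, y0), (x1, y1) = points[i], points[i + 1]
    return (x0 + frac * (x1 - x0), y0 + frac * (y1 - y0))


def rectification_grid(
    top: List[Point], bottom: List[Point], out_h: int, out_w: int
) -> List[List[Point]]:
    """Map each pixel of an out_h x out_w horizontal patch to source coordinates.

    Column c picks matching points on the upper and lower boundaries; row r
    blends linearly between them. Sampling the source image at these
    coordinates (for example bilinearly) would give the rectified patch.
    """
    grid: List[List[Point]] = []
    for r in range(out_h):
        v = r / (out_h - 1) if out_h > 1 else 0.0
        row: List[Point] = []
        for c in range(out_w):
            u = c / (out_w - 1) if out_w > 1 else 0.0
            tx, ty = interpolate_polyline(top, u)
            bx, by = interpolate_polyline(bottom, u)
            row.append((tx + v * (bx - tx), ty + v * (by - ty)))
        grid.append(row)
    return grid
```

For a text instance that is already horizontal, the grid reduces to a regular lattice over the box; for curved text, the columns follow the bend described by the two boundary sequences.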

Abstract (English):

The invention discloses a scene text end-to-end recognition method based on boundary point detection. The method comprises the following steps: extracting text features through a feature pyramid network and generating candidate text boxes through a region proposal network; detecting a more accurate multi-directional bounding box of each text instance through a multi-directional rectangle detection network; detecting an upper boundary point sequence and a lower boundary point sequence of the text within the multi-directional bounding box; converting text of any shape into horizontal text by using the detected boundary point sequences, so that the subsequent attention-based sequence recognition network can recognize it; and finally finding the word in the given dictionary that best matches the predicted sequence by using a beam search algorithm to obtain the final text recognition result. With this method, scene text of any shape in natural images, including horizontal, multi-directional, and curved text, can be detected and recognized simultaneously without character-level labeling, and the whole model can be trained end to end.
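The final dictionary-matching step can be sketched as a beam search restricted to dictionary words. The abstract does not describe the scoring or the search details, so this is a minimal sketch under stated assumptions: the recognizer is assumed to emit one character probability table per decoding step, `"$"` is a hypothetical end-of-sequence symbol, and `lexicon_beam_search` is an illustrative name rather than anything defined in the patent.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

EOS = "$"  # hypothetical end-of-sequence symbol


def lexicon_beam_search(
    step_probs: Sequence[Dict[str, float]],
    lexicon: Sequence[str],
    beam_width: int = 3,
) -> Optional[str]:
    """Return the dictionary word with the highest log-probability found.

    Only prefixes of dictionary words are kept in the beam, so the search
    can never drift to a string outside the dictionary.
    """
    prefixes = {w[:i] for w in lexicon for i in range(len(w) + 1)}
    words = set(lexicon)
    beams: List[Tuple[float, str]] = [(0.0, "")]
    finished: List[Tuple[float, str]] = []

    for probs in step_probs:
        candidates: List[Tuple[float, str]] = []
        for score, prefix in beams:
            for ch, p in probs.items():
                if p <= 0.0:
                    continue  # log(0) is undefined; skip impossible symbols
                new_score = score + math.log(p)
                if ch == EOS:
                    if prefix in words:  # a complete dictionary word ends here
                        finished.append((new_score, prefix))
                elif prefix + ch in prefixes:
                    candidates.append((new_score, prefix + ch))
        # Keep only the beam_width best partial hypotheses.
        candidates.sort(key=lambda item: item[0], reverse=True)
        beams = candidates[:beam_width]
        if not beams:
            break

    if not finished:
        return None
    return max(finished, key=lambda item: item[0])[1]
```

A short usage example, with made-up probabilities: the most likely unconstrained reading ends in "r", but no dictionary word does, so the search settles on the best dictionary word instead.

```python
steps = [
    {"c": 0.6, "e": 0.4},
    {"a": 0.5, "o": 0.5},
    {"t": 0.4, "r": 0.6},
    {"$": 1.0},
]
print(lexicon_beam_search(steps, ["cat", "eat", "dog"]))  # prints "cat"
```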

Publication/grant documents

CN110837835B Scene text end-to-end recognition method based on boundary point detection. Publication/grant date: 2022-11-08
