Patent search: ap:("Tsinghua University" OR "HYUNDAI MOTOR COMPANY" OR "Kia Corporation") AND inv:"Jong Ub Suk" (page 1)

1.

Granted invention patent
Scene text detection method and system based on sequential deformation [Status: in force]

Publication No.: US12080084B2

Publication date: 2024-09-03

Application No.: US17407549

Filing date: 2021-08-20

Applicants: Tsinghua University, HYUNDAI MOTOR COMPANY, Kia Corporation

Inventors: Liangrui Peng, Shanyu Xiao, Ruijie Yan, Gang Yao, Shengjin Wang, Jaesik Min, Jong Ub Suk

IPC classification: G06V20/62, G06F18/214, G06N3/04, G06N3/084, G06V10/22, G06V10/40, G06V30/10

CPC classification: G06V20/63, G06F18/214, G06N3/04, G06N3/084, G06V10/225, G06V10/40, G06V30/10

Abstract: A method and a system for detecting a scene text may include extracting a first feature map for a scene image input based on a convolutional neural network, and delivering the first feature map to a sequential deformation module; obtaining sampled feature maps corresponding to sampling positions by performing iterative sampling for the first feature map, obtaining a second feature map by performing a concatenation operation in deep learning according to a channel dimension for the first feature map and the sampled feature maps; obtaining a third feature map by performing a feature aggregation operation for the second feature map in the channel dimension, and delivering the third feature map to the object detection baseline network; and performing text area candidate box extraction for the third feature map and obtaining a text area prediction result as a scene text detection result through regression fitting.
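
Editorial illustration (not part of the patent record): the three feature-map steps named in the abstract above (iterative sampling, channel-wise concatenation, channel-wise aggregation) can be sketched roughly as follows. This is a minimal NumPy sketch under stated assumptions, not the patented implementation: the function name `sequential_deformation_sketch`, the fixed integer `offset`, the nearest-neighbour sampling with border clipping, and the use of a plain weight matrix as a 1x1 convolution are all hypothetical simplifications. The abstract does not say how offsets are obtained or how sampling is interpolated, and the detection head is omitted.

```python
import numpy as np

def sequential_deformation_sketch(feat, offset, steps, weight):
    """Toy version of sample -> concatenate -> aggregate.

    feat:   first feature map, shape (C, H, W)
    offset: (dy, dx) integer step; a stand-in (assumption) for whatever
            offsets the real module would use
    steps:  number of sampling iterations
    weight: (C_out, C * (steps + 1)) matrix applied like a 1x1 convolution
    """
    c, h, w = feat.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    maps = [feat]
    for _ in range(steps):
        # Move every sampling position one step further, clipped to the map.
        ys = np.clip(ys + offset[0], 0, h - 1)
        xs = np.clip(xs + offset[1], 0, w - 1)
        # Nearest-neighbour sampling at the current positions: (C, H, W).
        maps.append(feat[:, ys, xs])
    # "Second feature map": concatenation along the channel dimension.
    second = np.concatenate(maps, axis=0)
    # "Third feature map": aggregation across channels at each pixel.
    third = np.einsum("oc,chw->ohw", weight, second)
    return second, third

feat = np.arange(24, dtype=float).reshape(2, 3, 4)
second, third = sequential_deformation_sketch(
    feat, offset=(0, 1), steps=2, weight=np.full((1, 6), 1 / 6)
)
print(second.shape, third.shape)
```

With a horizontal offset, each iteration looks one pixel further along the row, which is one simple way to picture how the sampled maps could follow the extent of a text line.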

2.

Invention patent application
MULTI-DIRECTIONAL SCENE TEXT RECOGNITION METHOD AND SYSTEM BASED ON MULTI-ELEMENT ATTENTION MECHANISM [Status: in force]

Publication No.: US20220121871A1

Publication date: 2022-04-21

Application No.: US17502533

Filing date: 2021-10-15

Applicants: Tsinghua University, Hyundai Motor Company, Kia Corporation

Inventors: Liangrui Peng, Ruijie Yan, Shanyu Xiao, Gang Yao, Shengjin Wang, Jaesik Min, Jong Ub Suk

IPC classification: G06K9/32, G06K9/46, G06K9/00, G06K9/62, G06N3/04

Abstract: A method and a system of multi-directional scene text recognition based on multi-element attention mechanism are provided. The method includes: performing normalization processing for a text row/column image I output from an external text detection module by a feature extractor, extracting a feature for the normalized image by using a deep convolutional neural network to acquire an initial feature map F0, and adding a 2-dimensional directional positional encoding P to an initial feature map F0 in order to output a multi-channel feature map F; converting the multi-channel feature map F output from a feature extractor by an encoder into a hidden representation H; and converting the hidden representation H output from the encoder into a recognized text by a decoder and using the recognized text as the output result. The method and the system of multi-directional scene text recognition based on multi-element attention mechanism provided by the present invention are applied to multi-oriented scene text images including horizontal text, vertical text, and curved text etc., and have achieved high applicability.
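
Editorial illustration (not part of the patent record): the step "adding a 2-dimensional directional positional encoding P to an initial feature map F0" has the form F = F0 + P. The abstract does not define P, so the sketch below substitutes a generic 2-D sinusoidal encoding, which is purely an assumption. Half the channels encode the row index and half the column index. The helper names `sinusoid_1d` and `add_2d_positional_encoding` are hypothetical, and the encoder and decoder stages are not shown.

```python
import numpy as np

def sinusoid_1d(length, dim):
    """Standard sinusoidal encoding of positions 0..length-1; dim must be even."""
    pos = np.arange(length)[:, None]
    freq = np.exp(-np.log(10000.0) * np.arange(0, dim, 2) / dim)
    enc = np.zeros((length, dim))
    enc[:, 0::2] = np.sin(pos * freq)
    enc[:, 1::2] = np.cos(pos * freq)
    return enc

def add_2d_positional_encoding(f0):
    """f0: feature map of shape (H, W, D), D divisible by 4. Returns F = F0 + P.

    P here is a generic stand-in (assumption), not the encoding in the patent.
    """
    h, w, d = f0.shape
    half = d // 2
    p = np.zeros((h, w, d))
    p[:, :, :half] = sinusoid_1d(h, half)[:, None, :]  # row (vertical) part
    p[:, :, half:] = sinusoid_1d(w, half)[None, :, :]  # column (horizontal) part
    return f0 + p

f = add_2d_positional_encoding(np.zeros((2, 3, 8)))
print(f.shape)
```

Encoding rows and columns in separate channel groups keeps both axes distinguishable after the feature map is flattened for an attention-based encoder, which matters when text may run horizontally or vertically.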

3.

Granted invention patent
Multi-directional scene text recognition method and system based on multi-element attention mechanism [Status: in force]

Publication No.: US11881038B2

Publication date: 2024-01-23

Application No.: US17502533

Filing date: 2021-10-15

Applicants: Tsinghua University, Hyundai Motor Company, Kia Corporation

Inventors: Liangrui Peng, Ruijie Yan, Shanyu Xiao, Gang Yao, Shengjin Wang, Jaesik Min, Jong Ub Suk

IPC classification: G06V20/00, G06V20/62, G06V10/40, G06V10/94, G06F18/213, G06N3/045, G06V30/10

CPC classification: G06V20/62, G06F18/213, G06N3/045, G06V10/40, G06V10/95, G06V30/10

Abstract: A method of multi-directional scene text recognition based on multi-element attention mechanism includes: performing normalization processing for a text row/column image output from an external text detection module by a feature extractor, extracting a feature for the normalized image by using a deep convolutional neural network to acquire an initial feature map, adding a 2-dimensional directional positional encoding P to the initial feature map in order to output a multi-channel feature map, converting the multi-channel feature map output from a feature extractor by an encoder into a hidden representation, and converting the hidden representation output from the encoder into a recognized text by a decoder and using the recognized text as the output result.
