-
Publication No.: US12080084B2
Publication Date: 2024-09-03
Application No.: US17407549
Filing Date: 2021-08-20
Inventors: Liangrui Peng, Shanyu Xiao, Ruijie Yan, Gang Yao, Shengjin Wang, Jaesik Min, Jong Ub Suk
CPC Classification: G06V20/63, G06F18/214, G06N3/04, G06N3/084, G06V10/225, G06V10/40, G06V30/10
Abstract: A method and a system for detecting scene text may include: extracting a first feature map from an input scene image using a convolutional neural network, and delivering the first feature map to a sequential deformation module; obtaining sampled feature maps corresponding to sampling positions by iteratively sampling the first feature map, and obtaining a second feature map by concatenating the first feature map and the sampled feature maps along the channel dimension; obtaining a third feature map by performing a feature aggregation operation on the second feature map in the channel dimension, and delivering the third feature map to an object detection baseline network; and extracting text area candidate boxes from the third feature map and obtaining, through regression fitting, a text area prediction result as the scene text detection result.
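The three feature-map steps in this abstract (iterative sampling, channel-wise concatenation, channel-wise aggregation) can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the fixed integer `offsets`, the nearest-position clamping at the borders, and the hand-picked `weights` are all assumptions made for illustration, whereas a real module would presumably learn its offsets and aggregation weights. The helper names `iterative_sampling`, `concat_channels`, and `aggregate` are hypothetical.

```python
from typing import List, Tuple

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][col]


def iterative_sampling(fmap: FeatureMap, offsets: List[Tuple[int, int]]) -> List[FeatureMap]:
    """Return one sampled map per step. Each step moves every sampling
    position by that step's (dy, dx) offset, starting from where the
    previous step ended. Offsets are fixed here (assumption)."""
    h, w = len(fmap[0]), len(fmap[0][0])
    pos = [[(y, x) for x in range(w)] for y in range(h)]
    sampled = []
    for dy, dx in offsets:
        # Accumulate the offset and clamp to the map borders (assumption).
        pos = [[(min(max(py + dy, 0), h - 1), min(max(px + dx, 0), w - 1))
                for (py, px) in row] for row in pos]
        sampled.append([[[ch[py][px] for (py, px) in row] for row in pos]
                        for ch in fmap])
    return sampled


def concat_channels(maps: List[FeatureMap]) -> FeatureMap:
    """Concatenate feature maps along the channel dimension."""
    return [ch for m in maps for ch in m]


def aggregate(fmap: FeatureMap, weights: List[List[float]]) -> FeatureMap:
    """1x1-convolution-style aggregation: every output channel is a
    weighted sum of the input channels at the same spatial position."""
    h, w = len(fmap[0]), len(fmap[0][0])
    return [[[sum(wt * fmap[c][y][x] for c, wt in enumerate(w_row))
              for x in range(w)] for y in range(h)] for w_row in weights]


# Toy first feature map: 1 channel, 2 rows, 3 columns.
first = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]]
sampled = iterative_sampling(first, offsets=[(0, 1), (0, 1)])  # two steps to the right
second = concat_channels([first] + sampled)                    # 3 channels
third = aggregate(second, weights=[[1.0, 1.0, 1.0]])           # back to 1 channel
```

In this toy run the second feature map has three channels (the original plus two sampled maps), and the third feature map sums them back into a single channel that would then be passed to the detection network.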
-
Publication No.: US20220121871A1
Publication Date: 2022-04-21
Application No.: US17502533
Filing Date: 2021-10-15
Inventors: Liangrui Peng, Ruijie Yan, Shanyu Xiao, Gang Yao, Shengjin Wang, Jaesik Min, Jong Ub Suk
Abstract: A method and a system for multi-directional scene text recognition based on a multi-element attention mechanism are provided. The method includes: normalizing, by a feature extractor, a text row/column image I output from an external text detection module, extracting features from the normalized image using a deep convolutional neural network to acquire an initial feature map F0, and adding a 2-dimensional directional positional encoding P to the initial feature map F0 to output a multi-channel feature map F; converting, by an encoder, the multi-channel feature map F output from the feature extractor into a hidden representation H; and converting, by a decoder, the hidden representation H output from the encoder into recognized text, which is used as the output result. The method and the system provided by the present invention apply to multi-oriented scene text images, including horizontal, vertical, and curved text, and are therefore broadly applicable.
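The feature-extractor step F = F0 + P can be sketched as follows. The abstract does not give the formula for the directional positional encoding P, so this sketch substitutes the standard Transformer sinusoidal encoding, with the first half of the channels encoding the row index and the second half the column index. That choice, and the helper names `sinusoid`, `positional_encoding_2d`, and `add_maps`, are assumptions for illustration only.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][col]


def sinusoid(pos: int, dim: int) -> List[float]:
    """Standard sinusoidal vector of length dim for one integer position:
    even indices use sin, odd indices use cos at the same frequency."""
    return [math.sin(pos / 10000 ** (i / dim)) if i % 2 == 0
            else math.cos(pos / 10000 ** ((i - 1) / dim))
            for i in range(dim)]


def positional_encoding_2d(channels: int, height: int, width: int) -> FeatureMap:
    """Assumed 2-D encoding: first half of the channels depends only on
    the row index, second half only on the column index."""
    if channels % 2 != 0:
        raise ValueError("channels must be even")
    half = channels // 2
    enc = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for y in range(height):
        row_vec = sinusoid(y, half)
        for x in range(width):
            col_vec = sinusoid(x, half)
            for c in range(half):
                enc[c][y][x] = row_vec[c]
                enc[half + c][y][x] = col_vec[c]
    return enc


def add_maps(a: FeatureMap, b: FeatureMap) -> FeatureMap:
    """Element-wise sum of two feature maps with identical shapes."""
    return [[[va + vb for va, vb in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]


# Toy initial feature map F0: 4 channels, 2 rows, 3 columns, all zeros,
# so that F equals the encoding P and its structure is easy to inspect.
f0 = [[[0.0] * 3 for _ in range(2)] for _ in range(4)]
p = positional_encoding_2d(channels=4, height=2, width=3)
f = add_maps(f0, p)
```

Splitting the channels between the two axes lets the encoder distinguish positions along both rows and columns, which is one plausible way to support horizontal and vertical text with the same model.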
-
Publication No.: US11881038B2
Publication Date: 2024-01-23
Application No.: US17502533
Filing Date: 2021-10-15
Inventors: Liangrui Peng, Ruijie Yan, Shanyu Xiao, Gang Yao, Shengjin Wang, Jaesik Min, Jong Ub Suk
Abstract: A method of multi-directional scene text recognition based on a multi-element attention mechanism includes: normalizing, by a feature extractor, a text row/column image output from an external text detection module; extracting features from the normalized image using a deep convolutional neural network to acquire an initial feature map; adding a 2-dimensional directional positional encoding P to the initial feature map to output a multi-channel feature map; converting, by an encoder, the multi-channel feature map output from the feature extractor into a hidden representation; and converting, by a decoder, the hidden representation output from the encoder into recognized text, which is used as the output result.
-