TEXT DETECTION METHOD, TEXT RECOGNITION METHOD AND APPARATUS

    Publication number: US20230045715A1

    Publication date: 2023-02-09

    Application number: US17966112

    Filing date: 2022-10-14

    Abstract: The present disclosure provides a text detection method, a text recognition method and an apparatus, which relate to the field of artificial intelligence technology, in particular to the fields of deep learning and computer vision, and can be applied to scenarios such as optical character recognition. The text detection method includes: acquiring an image feature of a text strip in a to-be-recognized image; performing visual enhancement processing on the to-be-recognized image to obtain an enhanced feature map of the to-be-recognized image; and comparing the image feature of the text strip with the enhanced feature map for similarity to obtain a target bounding box of the text strip on the enhanced feature map.
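
The similarity comparison described in this abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes cosine similarity, a fixed threshold, and a feature map stored as nested lists of per-cell vectors. The names `cosine`, `locate_text_strip` and `threshold` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def locate_text_strip(strip_feature, feature_map, threshold=0.9):
    """Return (row_min, col_min, row_max, col_max) covering every cell whose
    feature is similar enough to the strip feature, or None if no cell matches.

    The threshold value is an assumption made for this sketch.
    """
    hits = [
        (r, c)
        for r, row in enumerate(feature_map)
        for c, cell in enumerate(row)
        if cosine(strip_feature, cell) >= threshold
    ]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows), max(cols))


# Toy 3x4 "enhanced feature map" with 2-dimensional features per cell.
feature_map = [
    [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.1], [1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]],
]
box = locate_text_strip([1.0, 0.0], feature_map)
print(box)
```

In a real system the box would be mapped from feature-map coordinates back to image coordinates; that step is omitted here.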

    IMAGE CLASSIFICATION METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM

    Publication number: US20220027611A1

    Publication date: 2022-01-27

    Application number: US17498226

    Filing date: 2021-10-11

    Abstract: Provided are an image classification method and apparatus, an electronic device and a storage medium, relating to the field of artificial intelligence and, in particular, to computer vision and deep learning. The method includes inputting a to-be-classified document image into a pretrained neural network and obtaining a feature submap of each text box of the to-be-classified document image by use of the neural network; inputting the feature submap of each text box, a semantic feature corresponding to preobtained text information of each text box and a position feature corresponding to preobtained position information of each text box into a pretrained multimodal feature fusion model and fusing, by use of the multimodal feature fusion model, the three into a multimodal feature corresponding to each text box; and classifying the to-be-classified document image based on the multimodal feature corresponding to each text box.
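
As a rough illustration of the per-text-box fusion step, the sketch below concatenates the three features of each text box and classifies the document from the averaged result. The abstract refers to a pretrained multimodal feature fusion model; plain concatenation, mean pooling and a linear scorer are stand-ins assumed here for brevity, and `fuse`, `classify_document` and `class_weights` are hypothetical names.

```python
def fuse(visual, semantic, position):
    """Stand-in fusion: concatenate the three per-box features into one vector."""
    return list(visual) + list(semantic) + list(position)


def classify_document(boxes, class_weights):
    """Average the fused per-box vectors, score each class with a dot product,
    and return the label with the highest score."""
    fused = [fuse(b["visual"], b["semantic"], b["position"]) for b in boxes]
    dim = len(fused[0])
    pooled = [sum(vec[i] for vec in fused) / len(fused) for i in range(dim)]
    scores = {
        label: sum(w * x for w, x in zip(weights, pooled))
        for label, weights in class_weights.items()
    }
    return max(scores, key=scores.get)


# Toy input: two text boxes, each with a visual feature (1 value),
# a semantic feature (2 values) and a position feature (x, y, w, h).
boxes = [
    {"visual": [0.2], "semantic": [1.0, 0.0], "position": [0.1, 0.1, 0.5, 0.1]},
    {"visual": [0.4], "semantic": [1.0, 0.0], "position": [0.1, 0.3, 0.5, 0.1]},
]
# Hypothetical linear weights, one vector per document class.
class_weights = {
    "invoice": [0, 1, 0, 0, 0, 0, 0],
    "receipt": [0, 0, 1, 0, 0, 0, 0],
}
label = classify_document(boxes, class_weights)
print(label)
```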

    METHOD FOR TRAINING IMAGE RECOGNITION MODEL BASED ON SEMANTIC ENHANCEMENT

    Publication number: US20220392205A1

    Publication date: 2022-12-08

    Application number: US17892669

    Filing date: 2022-08-22

    Abstract: Embodiments of the present disclosure provide a method and apparatus for training an image recognition model based on semantic enhancement, a method and apparatus for recognizing an image, an electronic device, and a computer readable storage medium. The method for training an image recognition model based on semantic enhancement comprises: extracting, from an inputted first image that is unannotated and has no textual description, a first feature representation of the first image; calculating a first loss function based on the first feature representation; extracting, from an inputted second image that is unannotated and has an original textual description, a second feature representation of the second image; calculating a second loss function based on the second feature representation; and training an image recognition model based on a fusion of the first loss function and the second loss function.
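
The abstract does not say how the two loss functions are fused. One simple possibility, shown purely as an assumption, is a convex combination of the two loss values; `fuse_losses` and `text_weight` are hypothetical names, and the loss values in the example are made up.

```python
def fuse_losses(first_loss, second_loss, text_weight=0.5):
    """Convex combination of the two loss values (an assumed fusion rule).

    first_loss:  loss from the image that has no textual description
    second_loss: loss from the image that has an original textual description
    text_weight: share of the total given to second_loss, in [0, 1]
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be in [0, 1]")
    return (1.0 - text_weight) * first_loss + text_weight * second_loss


# Illustrative values only.
total = fuse_losses(0.8, 0.4, text_weight=0.25)
print(round(total, 6))
```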

    TABLE GENERATING METHOD AND APPARATUS, ELECTRONIC DEVICE, STORAGE MEDIUM AND PRODUCT

    Publication number: US20220301334A1

    Publication date: 2022-09-22

    Application number: US17832735

    Filing date: 2022-06-06

    Abstract: The present disclosure provides a table generating method and apparatus, an electronic device, a storage medium and a product. A specific implementation includes: recognizing at least one table object in a to-be-recognized image and obtaining a table property respectively corresponding to the at least one table object, where the table property of any table object includes a cell property or a non-cell property; determining at least one target object with the cell property in the at least one table object; determining a cell region respectively corresponding to the at least one target object to obtain cell position information respectively corresponding to the at least one target object; and generating a spreadsheet corresponding to the to-be-recognized image according to the cell position information respectively corresponding to the at least one target object.

    METHOD OF TRAINING TEXT DETECTION MODEL, METHOD OF DETECTING TEXT, AND DEVICE

    Publication number: US20240265718A1

    Publication date: 2024-08-08

    Application number: US18041370

    Filing date: 2022-04-22

    CPC classification number: G06V30/19127 G06V10/7715 G06V10/82

    Abstract: A method of training a text detection model and a method of detecting text are provided. The training method includes: inputting a sample image into a text feature extraction sub-model of a text detection model to obtain a text feature of a text in the sample image, the sample image having a label indicating actual position information and an actual category; inputting a predetermined text vector into a text encoding sub-model of the text detection model to obtain a text reference feature; inputting the text feature and the text reference feature into a decoding sub-model of the text detection model to obtain a text sequence vector; inputting the text sequence vector into an output sub-model of the text detection model to obtain predicted position information and a predicted category; and training the text detection model based on the predicted and actual categories and the predicted and actual position information.
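
The data flow through the four sub-models named in this abstract can be sketched as below. Only the wiring follows the abstract; the sub-models themselves are toy lambdas standing in for neural networks, and the class name `TextDetectionModel`, its field names and the `forward` method are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Vector = Sequence[float]


@dataclass
class TextDetectionModel:
    extract_text_feature: Callable[[object], Vector]  # text feature extraction sub-model
    encode_text: Callable[[Vector], Vector]           # text encoding sub-model
    decode: Callable[[Vector, Vector], Vector]        # decoding sub-model
    output: Callable[[Vector], Tuple[Vector, int]]    # output sub-model

    def forward(self, sample_image, predetermined_text_vector):
        """Run one sample through the four sub-models in the order the abstract lists."""
        text_feature = self.extract_text_feature(sample_image)
        text_reference = self.encode_text(predetermined_text_vector)
        sequence_vector = self.decode(text_feature, text_reference)
        # Returns (predicted position information, predicted category).
        return self.output(sequence_vector)


# Toy stand-ins so the wiring can be executed; none of these are real sub-models.
toy = TextDetectionModel(
    extract_text_feature=lambda image: [float(sum(row)) for row in image],
    encode_text=lambda vec: [2.0 * v for v in vec],
    decode=lambda feat, ref: [f + r for f, r in zip(feat, ref)],
    output=lambda seq: (list(seq), int(sum(seq) > 0)),
)
position, category = toy.forward([[1, 2], [3, 4]], [0.5, 0.5])
print(position, category)
```

During training, the predicted position and category would be compared with the labeled actual position and category to compute a loss; the form of that loss is not given in the abstract and is left out here.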
