Rotation and scaling for optical character recognition using end-to-end deep learning

    Publication No.: US11302108B2

    Publication date: 2022-04-12

    Application No.: US16565614

    Filing date: 2019-09-10

    Applicant: SAP SE

    Abstract: Disclosed herein are system, method, and computer program product embodiments for optical character recognition (OCR) pre-processing using machine learning. In an embodiment, a neural network may be trained to identify a standardized document rotation and scale expected by an OCR service performing character recognition. The neural network may then analyze a received document image to identify a corresponding rotation and scale of the document image relative to the expected standardized values. In response to this identification, the document image may be modified in the inverse to standardize the rotation and scale of the document image to match the format expected by the OCR service. In some embodiments, a neural network may perform the standardization as well as the character recognition using a shared computation graph.
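
The "modified in the inverse" step in the abstract above can be illustrated with a minimal sketch. It assumes the rotation (in degrees) and scale of the incoming image have already been estimated, for instance by the neural network, and only computes and applies the inverse correction to a point. The function names `inverse_correction` and `apply_to_point` and the default expected values (0 degrees, scale 1.0) are hypothetical and do not come from the patent.

```python
import math


def inverse_correction(detected_rotation_deg, detected_scale,
                       expected_rotation_deg=0.0, expected_scale=1.0):
    """Return (rotation_deg, scale) that maps the detected pose back to the expected one.

    The expected values stand in for the standardized format an OCR
    service is assumed to want; they are illustrative defaults.
    """
    if detected_scale <= 0:
        raise ValueError("detected_scale must be positive")
    rotation = (expected_rotation_deg - detected_rotation_deg) % 360.0
    if rotation > 180.0:
        # Prefer the shorter turn, e.g. -90 instead of 270 degrees.
        rotation -= 360.0
    scale = expected_scale / detected_scale
    return rotation, scale


def apply_to_point(x, y, rotation_deg, scale):
    """Rotate a point about the origin (counter-clockwise) and scale it."""
    theta = math.radians(rotation_deg)
    c, s = math.cos(theta), math.sin(theta)
    return scale * (c * x - s * y), scale * (s * x + c * y)


# A point at (1, 0) that was rotated by 90 degrees and doubled in size
# lands at (0, 2); the inverse correction brings it back to (1, 0).
rotation, scale = inverse_correction(detected_rotation_deg=90.0, detected_scale=2.0)
restored = apply_to_point(0.0, 2.0, rotation, scale)
```

In practice the same rotation and scale would be applied to the whole image with an image library rather than point by point.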

    ROTATION AND SCALING FOR OPTICAL CHARACTER RECOGNITION USING END-TO-END DEEP LEARNING

    Publication No.: US20210073566A1

    Publication date: 2021-03-11

    Application No.: US16565614

    Filing date: 2019-09-10

    Applicant: SAP SE

    Abstract: Disclosed herein are system, method, and computer program product embodiments for optical character recognition (OCR) pre-processing using machine learning. In an embodiment, a neural network may be trained to identify a standardized document rotation and scale expected by an OCR service performing character recognition. The neural network may then analyze a received document image to identify a corresponding rotation and scale of the document image relative to the expected standardized values. In response to this identification, the document image may be modified in the inverse to standardize the rotation and scale of the document image to match the format expected by the OCR service. In some embodiments, a neural network may perform the standardization as well as the character recognition using a shared computation graph.

    TWO-DIMENSIONAL DOCUMENT PROCESSING
    Invention application

    Publication No.: US20190354818A1

    Publication date: 2019-11-21

    Application No.: US15983489

    Filing date: 2018-05-18

    Applicant: SAP SE

    Abstract: Disclosed herein are system, method, and computer program product embodiments for processing a document. In an embodiment, a document processing system may receive a document. The document processing system may perform optical character recognition to obtain character information and positioning information for the characters. The document processing system may generate a down-sampled two-dimensional character grid for the document. The document processing system may apply a convolutional neural network to the character grid to obtain semantic meaning for the document. The convolutional neural network may produce a segmentation mask and bounding boxes to correspond to the document.
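
One way to picture the down-sampled two-dimensional character grid is the sketch below. It assumes OCR output arrives as `(character, (x0, y0, x1, y1))` tuples with integer pixel coordinates, and that each grid cell stores the Unicode code point of the character covering it, with 0 as background. The function name `build_character_grid`, the `cell_size` parameter and the encoding choice are assumptions for illustration, not details taken from the patent.

```python
def build_character_grid(chars, page_width, page_height, cell_size):
    """Rasterize OCR characters into a coarse grid of character codes.

    chars: iterable of (character, (x0, y0, x1, y1)) with integer pixel boxes.
    cell_size: down-sampling factor in pixels per grid cell (hypothetical).
    """
    cols = -(-page_width // cell_size)   # ceiling division
    rows = -(-page_height // cell_size)
    grid = [[0] * cols for _ in range(rows)]  # 0 marks background
    for ch, (x0, y0, x1, y1) in chars:
        code = ord(ch)
        row_end = min(rows, -(-y1 // cell_size))
        col_end = min(cols, -(-x1 // cell_size))
        for r in range(y0 // cell_size, row_end):
            for c in range(x0 // cell_size, col_end):
                grid[r][c] = code
    return grid


# Two 10x10 pixel characters side by side on a 20x10 page, 5 pixels per cell.
ocr_output = [("A", (0, 0, 10, 10)), ("B", (10, 0, 20, 10))]
grid = build_character_grid(ocr_output, page_width=20, page_height=10, cell_size=5)
```

A grid like this keeps the page layout while being far smaller than the original image, which is what makes it a convenient input for a convolutional network.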

    CLASSIFICATION OF DANGEROUS GOODS VIA MACHINE LEARNING

    Publication No.: US20230274324A1

    Publication date: 2023-08-31

    Application No.: US18136394

    Filing date: 2023-04-19

    Applicant: SAP SE

    Abstract: Provided is a system and method that can identify whether an item is a dangerous good. The system can determine whether a product belongs in any of a number of different classes of dangerous goods from among a plurality of different regulations based on a machine learning algorithm which performs a text-based classification. In one example, the method may include receiving an identification of an object, retrieving a plurality of descriptive attributes of the object from a data store and converting the plurality of descriptive attributes into an input string, predicting whether the object is a dangerous object via execution of a text-based machine learning algorithm that receives the input string as an input, and outputting information about the prediction of the object for display via a user interface.
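
The step of converting descriptive attributes into an input string could look roughly like the sketch below. The attribute names, the `key: value` layout and the ` | ` separator are all assumptions chosen for illustration; the abstract does not specify a format, and the downstream text classifier is not modeled here.

```python
def attributes_to_input_string(attributes):
    """Flatten a dict of product attributes into one string for a text classifier.

    Keys are sorted so the same product always yields the same string;
    empty or missing values are skipped.
    """
    parts = []
    for key in sorted(attributes):
        value = attributes[key]
        if value is None or str(value).strip() == "":
            continue
        parts.append(f"{key}: {str(value).strip()}")
    return " | ".join(parts)


# Hypothetical product record retrieved from a data store.
product = {"name": "Acetone ", "flash_point_c": -20, "notes": ""}
input_string = attributes_to_input_string(product)
```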

    Contextualized character recognition system

    Publication No.: US11301627B2

    Publication date: 2022-04-12

    Application No.: US16734880

    Filing date: 2020-01-06

    Applicant: SAP SE

    Abstract: System, method, and various embodiments for providing a contextualized character recognition system are described herein. An embodiment operates by determining a plurality of predicted words of an image. An accuracy measure for each of the plurality of predicted words is identified and a replaceable word with an accuracy measure below a threshold is identified. A plurality of candidate words associated with the replaceable word are identified and a probability for each of the candidate words is calculated based on a contextual analysis. One of the candidate words with a highest probability is selected. The plurality of predicted words including the selected candidate word with the highest probability replacing the replaceable word is output.
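
The replace-by-context flow described above can be sketched as follows. This is a toy illustration: the candidate lookup and the bigram table stand in for whatever candidate generation and contextual analysis a real system would use, and every name here (`replace_low_confidence_words`, `BIGRAMS`, `CANDIDATES`) is hypothetical. Unlike the abstract, which speaks of a single replaceable word, the sketch treats every word below the threshold the same way.

```python
def replace_low_confidence_words(predictions, threshold, candidates_for, context_probability):
    """Swap each low-accuracy word for its most probable candidate in context.

    predictions: list of (word, accuracy) pairs from a recognizer.
    candidates_for: callable returning candidate replacements for a word.
    context_probability: callable (words, index, candidate) -> probability.
    """
    words = [word for word, _ in predictions]
    output = list(words)
    for i, (word, accuracy) in enumerate(predictions):
        if accuracy >= threshold:
            continue  # confident enough, keep the recognized word
        candidates = candidates_for(word)
        if not candidates:
            continue  # nothing to substitute, keep the original
        output[i] = max(candidates, key=lambda c: context_probability(words, i, c))
    return output


# Toy stand-ins for candidate generation and contextual analysis.
CANDIDATES = {"am0unt": ["amount", "amulet"]}
BIGRAMS = {("total", "amount"): 0.9, ("total", "amulet"): 0.01}


def bigram_probability(words, index, candidate):
    previous = words[index - 1] if index > 0 else ""
    return BIGRAMS.get((previous, candidate), 0.0)


predictions = [("total", 0.98), ("am0unt", 0.41), ("due", 0.95)]
corrected = replace_low_confidence_words(
    predictions, 0.5, lambda w: CANDIDATES.get(w, []), bigram_probability
)
```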

    VISUALLY-AWARE ENCODINGS FOR CHARACTERS

    Publication No.: US20210174141A1

    Publication date: 2021-06-10

    Application No.: US16704940

    Filing date: 2019-12-05

    Applicant: SAP SE

    Abstract: In some embodiments, a method inputs a set of images into a network and trains the network based on a classification of the set of images to one or more characters in a set of characters. The method obtains a set of encodings for the one or more characters based on a layer of the network that restricts the output of the layer to a number of values. Then, the method stores the set of encodings for the one or more characters, wherein an encoding in the set of encodings is retrievable when a corresponding character is determined.

    Visually similar scene retrieval using coordinate data

    Publication No.: US10783377B2

    Publication date: 2020-09-22

    Application No.: US16218067

    Filing date: 2018-12-12

    Applicant: SAP SE

    Abstract: Aspects of the present disclosure therefore involve systems and methods for identifying a set of visually similar scenes to a target scene selected or otherwise identified by a match analyst. A scene retrieval platform performs operations for: receiving an input that comprises an identification of a scene; retrieving a set of coordinates based on the scene identified by the input, where the set of coordinates identify positions of the entities depicted within the frames; generating a set of vector values based on the coordinates of the entities depicted within each of the frames; concatenating the set of vector values to generate a concatenated vector value that represents the scene; generating a visual representation of the concatenated vector value; and identifying one or more similar scenes to the scene identified by the input based on the visual representation of the concatenated vector value.
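
A rough sketch of the coordinate-to-vector idea follows. It flattens the entity coordinates of every frame into one concatenated vector per scene and ranks stored scenes against a target. Note the substitution: the abstract compares scenes through a visual representation of the concatenated vector, whereas this sketch compares the vectors directly with cosine similarity, which is an assumption made to keep the example small. All function names and the sample data are hypothetical.

```python
import math


def scene_vector(frames):
    """Concatenate the (x, y) coordinates of every entity in every frame."""
    vector = []
    for frame in frames:
        for x, y in frame:
            vector.extend((float(x), float(y)))
    return vector


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(target_frames, library, top_k=1):
    """Return the names of the top_k stored scenes closest to the target."""
    target = scene_vector(target_frames)
    scored = sorted(
        ((cosine_similarity(target, scene_vector(frames)), name)
         for name, frames in library.items()),
        reverse=True,
    )
    return [name for _, name in scored[:top_k]]


# One frame with two entities per scene; "a" is a scaled copy of the target.
target = [[(1, 0), (0, 1)]]
library = {"a": [[(2, 0), (0, 2)]], "b": [[(0, 1), (1, 0)]]}
best = most_similar(target, library)
```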

    Two-dimensional document processing

    Publication No.: US10540579B2

    Publication date: 2020-01-21

    Application No.: US15983489

    Filing date: 2018-05-18

    Applicant: SAP SE

    Abstract: Disclosed herein are system, method, and computer program product embodiments for processing a document. In an embodiment, a document processing system may receive a document. The document processing system may perform optical character recognition to obtain character information and positioning information for the characters. The document processing system may generate a down-sampled two-dimensional character grid for the document. The document processing system may apply a convolutional neural network to the character grid to obtain semantic meaning for the document. The convolutional neural network may produce a segmentation mask and bounding boxes to correspond to the document.
