-
Publication No.: US11244208B2
Publication Date: 2022-02-08
Application No.: US16711978
Application Date: 2019-12-12
Applicant: SAP SE
Inventor: Christian Reisswig, Anoop Raveendra Katti, Steffen Bickel, Johannes Hoehne, Jean Baptiste Faddoul
Abstract: Disclosed herein are system, method, and computer program product embodiments for processing a document. In an embodiment, a document processing system may receive a document. The document processing system may perform optical character recognition to obtain character information and positioning information for the characters. The document processing system may generate a down-sampled two-dimensional character grid for the document. The document processing system may apply a convolutional neural network to the character grid to obtain semantic meaning for the document. The convolutional neural network may produce a segmentation mask and bounding boxes to correspond to the document.
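The character-grid step in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: the OCR output format `(character, x0, y0, x1, y1)`, the down-sampling `factor`, the function name `build_char_grid`, and the use of 0 for background are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical OCR output format: (character, x0, y0, x1, y1) in pixel coordinates.
CharBox = Tuple[str, int, int, int, int]


def build_char_grid(chars: List[CharBox], width: int, height: int,
                    factor: int = 4) -> Tuple[List[List[int]], Dict[str, int]]:
    """Rasterise OCR character boxes into a down-sampled grid of integer indices."""
    grid_w = (width + factor - 1) // factor    # ceiling division
    grid_h = (height + factor - 1) // factor
    grid = [[0] * grid_w for _ in range(grid_h)]  # 0 marks background (assumption)
    vocab: Dict[str, int] = {}
    for ch, x0, y0, x1, y1 in chars:
        # Assign each distinct character the next free index, starting at 1.
        index = vocab.setdefault(ch, len(vocab) + 1)
        # Map the pixel box to grid cells; keep at least one cell per character.
        gx0, gy0 = x0 // factor, y0 // factor
        gx1 = max(gx0 + 1, x1 // factor)
        gy1 = max(gy0 + 1, y1 // factor)
        for gy in range(gy0, min(gy1, grid_h)):
            for gx in range(gx0, min(gx1, grid_w)):
                grid[gy][gx] = index
    return grid, vocab


# Two 8x8-pixel characters side by side on a 16x8 page, down-sampled by 4.
boxes = [("A", 0, 0, 8, 8), ("B", 8, 0, 16, 8)]
grid, vocab = build_char_grid(boxes, width=16, height=8, factor=4)
```

A grid of this kind could then be passed to a convolutional network as a 2D input, much like an image whose pixel values are character indices rather than colours.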
-
Publication No.: US10915788B2
Publication Date: 2021-02-09
Application No.: US16123177
Application Date: 2018-09-06
Applicant: SAP SE
Inventor: Johannes Hoehne, Anoop Raveendra Katti, Christian Reisswig
IPC: G06K9/34, G06K9/62, G06N3/08, G06F40/279
Abstract: Disclosed herein are system, method, and computer program product embodiments for optical character recognition using end-to-end deep learning. In an embodiment, an optical character recognition system may train a neural network to identify characters of pixel images and to assign index values to the characters. The neural network may also be trained to identify groups of characters and to generate bounding boxes to group these characters. The optical character recognition system may then analyze documents to identify character information based on the pixel data and produce a segmentation mask and one or more bounding box masks. The optical character recognition system may supply these masks as an output or may combine the masks to generate a version of the received document having optically recognized characters.
-
Publication No.: US20200279128A1
Publication Date: 2020-09-03
Application No.: US16288357
Application Date: 2019-02-28
Applicant: SAP SE
Inventor: Johannes Hoehne, Anoop Raveendra Katti, Christian Reisswig, Marco Spinaci
IPC: G06K9/62
Abstract: Disclosed herein are system, method, and computer program product embodiments for providing object detection and filtering operations. An embodiment operates by receiving an image comprising a plurality of pixels and pixel information for each pixel. The pixel information indicates a bounding box corresponding to an object within the image associated with a respective pixel and a confidence score associated with the bounding box for the respective pixel. Pixels that do not correspond to a center of at least one of the bounding boxes are iteratively removed from the plurality of pixels until a subset of pixels, each of which corresponds to a center of at least one of the bounding boxes, remains. Based on the subset, a final bounding box associated with each object of the image is determined based on an overlapping of the bounding boxes of the subset of pixels and the corresponding confidence scores.
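The filtering idea in the abstract above can be sketched roughly as follows. This is a simplified illustration, not the claimed method: it filters non-center pixels in a single pass rather than iteratively, groups boxes by an intersection-over-union threshold, and merges each group by a confidence-weighted average. The names `is_center` and `merge_boxes`, the tolerance `tol`, and the threshold `iou_thresh` are assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x0, y0, x1, y1)
Pred = Tuple[int, int, Box, float]        # (pixel_x, pixel_y, predicted box, confidence)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def is_center(px: int, py: int, box: Box, tol: float = 1.0) -> bool:
    """True if the pixel lies within `tol` of the center of its predicted box."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    return abs(px - cx) <= tol and abs(py - cy) <= tol


def merge_boxes(preds: List[Pred], iou_thresh: float = 0.5) -> List[Box]:
    # Simplification: one filtering pass instead of iterative removal.
    centers = [(box, conf) for px, py, box, conf in preds if is_center(px, py, box)]
    centers.sort(key=lambda item: item[1], reverse=True)
    groups: List[List[Tuple[Box, float]]] = []
    for box, conf in centers:
        for group in groups:
            # Compare against the highest-confidence box of the group.
            if iou(group[0][0], box) >= iou_thresh:
                group.append((box, conf))
                break
        else:
            groups.append([(box, conf)])
    merged: List[Box] = []
    for group in groups:
        total = sum(conf for _, conf in group)
        if total <= 0:
            continue  # skip groups that carry no confidence mass
        x0, y0, x1, y1 = (sum(b[i] * c for b, c in group) / total for i in range(4))
        merged.append((x0, y0, x1, y1))
    return merged


preds = [
    (5, 5, (0.0, 0.0, 10.0, 10.0), 0.9),       # at the center of its box
    (5, 6, (0.0, 0.0, 10.0, 12.0), 0.1),       # at the center, overlaps the first box
    (1, 1, (0.0, 0.0, 10.0, 10.0), 0.8),       # far from the center: filtered out
    (25, 25, (20.0, 20.0, 30.0, 30.0), 0.7),   # a separate object
]
merged = merge_boxes(preds)
```

In this toy input, the first two predictions merge into one box whose lower edge is pulled slightly toward the lower-confidence prediction, and the fourth prediction yields a second box.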
-
Publication No.: US20200258498A1
Publication Date: 2020-08-13
Application No.: US16270328
Application Date: 2019-02-07
Applicant: SAP SE
Inventor: Christian Reisswig, Darko Velkoski, Sohyeong Kim, Hung Tu Dinh, Faisal El Hussein
IPC: G10L15/06, G10L15/22, G10L15/183
Abstract: Various examples described herein are directed to systems and methods for analyzing text. A computing device may train an autoencoder language model using a plurality of language model training samples. The autoencoder language model may comprise a first convolutional layer. Also, a first language model training sample of the plurality of language model training samples may comprise a first set of ordered strings comprising a masked string, a first string preceding the masked string in the first set of ordered strings, and a second string after the masked string in the first set of ordered strings. The computing device may generate a first feature vector using an input sample and the autoencoder language model. The computing device may also generate a descriptor of the input sample using a target model, the input sample, and the first feature vector.
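The structure of a training sample described in the abstract above (a masked string with at least one string before it and one after it) can be sketched as follows. The placeholder token `MASK` and the helper `make_masked_samples` are hypothetical; the abstract does not say how masked positions are chosen, so masking every interior position once is an assumption.

```python
from typing import List, Tuple

MASK = "<mask>"  # hypothetical placeholder token


def make_masked_samples(strings: List[str]) -> List[Tuple[List[str], str]]:
    """Return (masked sequence, target string) pairs, masking each interior string once."""
    samples = []
    # Only interior positions are masked, so every sample keeps a string
    # before and a string after the masked one.
    for i in range(1, len(strings) - 1):
        masked = strings[:i] + [MASK] + strings[i + 1:]
        samples.append((masked, strings[i]))
    return samples


samples = make_masked_samples(["pay", "the", "invoice", "today"])
```

Each pair gives a model the surrounding context as input and the hidden string as the target to reconstruct.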
-
Publication No.: US12204860B2
Publication Date: 2025-01-21
Application No.: US18112969
Application Date: 2023-02-22
Applicant: SAP SE
Inventor: Christian Reisswig
IPC: G06F40/295, G06F16/35, G06F40/14, G06F40/284, G06N3/04, G06N20/20
Abstract: Methods and apparatus are disclosed for extracting structured content, as graphs, from text documents. Graph vertices and edges correspond to document tokens and pairwise relationships between tokens. Undirected peer relationships and directed relationships (e.g. key-value or composition) are supported. Vertices can be identified with predefined fields, and thence mapped to database columns for automated storage of document content in a database. A trained neural network classifier determines relationship classifications for all pairwise combinations of input tokens. The relationship classification can differentiate multiple relationship types. A multi-level classifier extracts multi-level graph structure from a document. Disclosed embodiments support arbitrary graph structures with hierarchical and planar relationships. Relationships are not restricted by spatial proximity or document layout. Composite tokens can be identified interspersed with other content. A single token can belong to multiple higher level structures according to its various relationships. Examples and variations are disclosed.
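The pairwise classification idea in the abstract above can be sketched as a loop over all token pairs that turns relationship labels into graph edges. The trained neural network classifier is replaced here by a hypothetical lookup table (`TOY_LABELS`), and the label names `key_value`, `peer`, and `none` are assumptions chosen for illustration.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

Edge = Tuple[int, int, str]  # (source token index, target token index, label)

# Hypothetical stand-in for the output of a trained pairwise classifier.
TOY_LABELS: Dict[Tuple[str, str], str] = {
    ("Name:", "Alice"): "key_value",
    ("City:", "Berlin"): "key_value",
    ("Alice", "Berlin"): "peer",
}


def toy_classify(a: str, b: str) -> str:
    """Look up the relationship label for an ordered token pair."""
    return TOY_LABELS.get((a, b), "none")


def extract_graph(tokens: List[str],
                  classify: Callable[[str, str], str]) -> Tuple[List[Edge], List[Edge]]:
    """Classify every token pair and collect directed and undirected edges."""
    directed: List[Edge] = []
    undirected: List[Edge] = []
    for i, j in combinations(range(len(tokens)), 2):
        label = classify(tokens[i], tokens[j])
        if label == "key_value":
            directed.append((i, j, label))    # edge points from key to value
        elif label == "peer":
            undirected.append((i, j, label))  # order of endpoints carries no meaning
    return directed, undirected


tokens = ["Name:", "Alice", "City:", "Berlin"]
directed, undirected = extract_graph(tokens, toy_classify)
```

Because every pair is examined, edges such as the peer link between "Alice" and "Berlin" can connect tokens that are not adjacent, which matches the abstract's point that relationships are not restricted by spatial proximity.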
-
Publication No.: US20230206000A1
Publication Date: 2023-06-29
Application No.: US18112969
Application Date: 2023-02-22
Applicant: SAP SE
Inventor: Christian Reisswig
IPC: G06F40/295, G06N20/20, G06F16/35, G06F40/284, G06N3/04, G06F40/14
CPC classification number: G06F40/295, G06N20/20, G06F16/355, G06F40/284, G06N3/04, G06F40/14
Abstract: Methods and apparatus are disclosed for extracting structured content, as graphs, from text documents. Graph vertices and edges correspond to document tokens and pairwise relationships between tokens. Undirected peer relationships and directed relationships (e.g. key-value or composition) are supported. Vertices can be identified with predefined fields, and thence mapped to database columns for automated storage of document content in a database. A trained neural network classifier determines relationship classifications for all pairwise combinations of input tokens. The relationship classification can differentiate multiple relationship types. A multi-level classifier extracts multi-level graph structure from a document. Disclosed embodiments support arbitrary graph structures with hierarchical and planar relationships. Relationships are not restricted by spatial proximity or document layout. Composite tokens can be identified interspersed with other content. A single token can belong to multiple higher level structures according to its various relationships. Examples and variations are disclosed.
-
Publication No.: US11615246B2
Publication Date: 2023-03-28
Application No.: US16891819
Application Date: 2020-06-03
Applicant: SAP SE
Inventor: Christian Reisswig
IPC: G06F40/295, G06N20/20, G06F16/35, G06F40/284, G06N3/04, G06F40/14
Abstract: Methods and apparatus are disclosed for extracting structured content, as graphs, from text documents. Graph vertices and edges correspond to document tokens and pairwise relationships between tokens. Undirected peer relationships and directed relationships (e.g. key-value or composition) are supported. Vertices can be identified with predefined fields, and thence mapped to database columns for automated storage of document content in a database. A trained neural network classifier determines relationship classifications for all pairwise combinations of input tokens. The relationship classification can differentiate multiple relationship types. A multi-level classifier extracts multi-level graph structure from a document. Disclosed embodiments support arbitrary graph structures with hierarchical and planar relationships. Relationships are not restricted by spatial proximity or document layout. Composite tokens can be identified interspersed with other content. A single token can belong to multiple higher level structures according to its various relationships. Examples and variations are disclosed.
-
Publication No.: US20220366301A1
Publication Date: 2022-11-17
Application No.: US17354202
Application Date: 2021-06-22
Applicant: SAP SE
Inventor: Nurzat Rakhmanberdieva, Alexey Streltsov, Christian Reisswig
Abstract: In an example embodiment, a confidence score is computed for a predicted label (from a first model) for information extracted from a document. The confidence score is computed using a machine-learned model, different from the first model, that is based on a Sliding-Window method. The Sliding-Window method may be based on convolutional neural network classification using sliding windows. The method receives as input (1) the string of extracted information from an independent, previous information extraction step (the “input text”), (2) the string's predicted class label, (3) the string's coordinate location in the document, and (4) the text of the document (for additional context information). The Sliding-Window method's task is to predict a confidence score that indicates whether the predicted label for the information is correct.
-
Publication No.: US20210383067A1
Publication Date: 2021-12-09
Application No.: US16891819
Application Date: 2020-06-03
Applicant: SAP SE
Inventor: Christian Reisswig
IPC: G06F40/295, G06F40/284, G06F16/35, G06N20/20, G06N3/04
Abstract: Methods and apparatus are disclosed for extracting structured content, as graphs, from text documents. Graph vertices and edges correspond to document tokens and pairwise relationships between tokens. Undirected peer relationships and directed relationships (e.g. key-value or composition) are supported. Vertices can be identified with predefined fields, and thence mapped to database columns for automated storage of document content in a database. A trained neural network classifier determines relationship classifications for all pairwise combinations of input tokens. The relationship classification can differentiate multiple relationship types. A multi-level classifier extracts multi-level graph structure from a document. Disclosed embodiments support arbitrary graph structures with hierarchical and planar relationships. Relationships are not restricted by spatial proximity or document layout. Composite tokens can be identified interspersed with other content. A single token can belong to multiple higher level structures according to its various relationships. Examples and variations are disclosed.
-
Publication No.: US11003861B2
Publication Date: 2021-05-11
Application No.: US16275025
Application Date: 2019-02-13
Applicant: SAP SE
Inventor: Christian Reisswig, Darko Velkoski, Sohyeong Kim, Hung Tu Dinh
IPC: G06F40/00, G06F40/30, G06F40/211, G06F40/284
Abstract: Various examples are directed to systems and methods for classifying text. A computing device may access, from a database, an input sample comprising a first set of ordered words. The computing device may generate a first language model feature vector for the input sample using a word level language model and a second language model feature vector for the input sample using a partial word level language model. The computing device may generate a descriptor of the input sample using a target model, the input sample, the first language model feature vector, and the second language model feature vector and write the descriptor of the input sample to the database.