SELF-SUPERVISED PRETRAINING THROUGH TEXT ALIGNMENT

    Publication number: US20220215287A1

    Publication date: 2022-07-07

    Application number: US17140815

    Application date: 2021-01-04

    Applicant: SAP SE

    Abstract: Machine learning models, trained on labeled training data, may be used to categorize documents. To convert data from human-readable text to a form usable by a machine-learning model, a mapping of words to vectors is performed. Learning the mapping to be used is often part of training a machine learning model that operates on text input. A self-supervised pretraining step is performed that aligns the vectors for two or more fields of each document. In this way, when training on the labeled data begins, the vectors used for transforming the text will already be pretrained to give similar values for the two fields. In applications where the two fields are expected to have similar meanings, this pretraining can improve the quality of the resulting model, reduce the amount of training needed, or both.
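The alignment step described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the method claimed in the application: each field is represented by the mean of its word vectors, the loss is the squared distance between the two field vectors, and plain gradient descent updates the word vectors. The names `mean_vector`, `alignment_loss`, `pretrain`, and `init_embeddings` are hypothetical. An objective this simple can also be minimized by collapsing all vectors together, so a practical system would likely need additional terms.

```python
import random


def mean_vector(tokens, emb):
    """Average the word vectors of one field."""
    dim = len(emb[tokens[0]])
    total = [0.0] * dim
    for tok in tokens:
        for i, value in enumerate(emb[tok]):
            total[i] += value
    return [value / len(tokens) for value in total]


def alignment_loss(docs, emb):
    """Mean squared distance between the two field vectors of each document."""
    loss = 0.0
    for field_a, field_b in docs:
        va, vb = mean_vector(field_a, emb), mean_vector(field_b, emb)
        loss += sum((x - y) ** 2 for x, y in zip(va, vb))
    return loss / len(docs)


def pretrain(docs, emb, lr=0.1, epochs=50):
    """Self-supervised step: pull the two fields of each document together.

    No labels are used; the only signal is that both fields belong to the
    same document (assumed objective, see lead-in).
    """
    for _ in range(epochs):
        for field_a, field_b in docs:
            va, vb = mean_vector(field_a, emb), mean_vector(field_b, emb)
            diff = [x - y for x, y in zip(va, vb)]
            for i, d in enumerate(diff):
                # Gradient of ||va - vb||^2 with respect to each token occurrence.
                for tok in field_a:
                    emb[tok][i] -= lr * 2 * d / len(field_a)
                for tok in field_b:
                    emb[tok][i] += lr * 2 * d / len(field_b)
    return emb


def init_embeddings(docs, dim=4, seed=0):
    """Random starting vectors for every word that occurs in any field."""
    rng = random.Random(seed)
    vocab = {tok for doc in docs for field in doc for tok in field}
    return {tok: [rng.uniform(-1, 1) for _ in range(dim)] for tok in sorted(vocab)}
```

After `pretrain` runs, the resulting vectors could serve as the starting point for supervised training on labeled documents.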

    Document Information Extraction Without Additional Annotations

    Publication number: US20220129671A1

    Publication date: 2022-04-28

    Application number: US17077568

    Application date: 2020-10-22

    Applicant: SAP SE

    Abstract: Disclosed herein are system, method, and computer program product embodiments for document information extraction without additional annotations. An embodiment operates by receiving an input representing a document and a key. The embodiment processes the input using a convolutional neural network to obtain a feature map. The embodiment combines the feature map with positional information to obtain a spatial-aware feature map. The embodiment then repeatedly performs the following decoding process: generate attention weights, generate a context vector based on the spatial-aware feature map and the generated attention weights using an attention layer, process the context vector, the key, and an input vector using a recurrent neural network (RNN) to obtain an RNN state, and generate an output vector based on the RNN state and the context vector using a projection layer. The embodiment then extracts a field based on the result of the decoding process.
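Two steps of this abstract, building a spatial-aware feature map and computing a context vector from attention weights, can be sketched as follows. The abstract does not say how positions are combined or how attention is scored, so this sketch assumes that normalized row and column coordinates are appended to each feature vector and that scores are dot products with a query vector standing in for the decoder state. The function names are hypothetical, and the CNN, RNN, and projection layer are omitted.

```python
import math


def add_positions(feature_map):
    """Flatten an H x W x C feature map and append normalized (row, col)."""
    height, width = len(feature_map), len(feature_map[0])
    cells = []
    for r, row in enumerate(feature_map):
        for c, feat in enumerate(row):
            position = [r / max(height - 1, 1), c / max(width - 1, 1)]
            cells.append(list(feat) + position)
    return cells


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_step(cells, query):
    """One attention step: weights over cells, then their weighted sum.

    Dot-product scoring is an assumption; `query` must have the same
    length as each cell vector.
    """
    scores = [sum(f * q for f, q in zip(cell, query)) for cell in cells]
    weights = softmax(scores)
    dim = len(cells[0])
    context = [sum(w * cell[i] for w, cell in zip(weights, cells))
               for i in range(dim)]
    return weights, context
```

In a full decoder, the context vector returned here would be passed to the RNN together with the key and the previous output, and the loop would repeat until the field is fully decoded.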

    MULTI-LANGUAGE DOCUMENT FIELD EXTRACTION

    Publication number: US20240273290A1

    Publication date: 2024-08-15

    Application number: US18168450

    Application date: 2023-02-13

    Applicant: SAP SE

    CPC classification number: G06F40/279 G06F40/126 G06F40/263 G06N3/088

    Abstract: A method for multi-language document field extraction may include determining, based on a received document including a plurality of key fields and a plurality of value fields, a plurality of key-value pairs. The method also includes determining whether an encoding of a key field is within a threshold distance from a predetermined encoding of a predefined key field associated with a predefined field type. The method further includes assigning, based on determining the encoding of the key field is within the threshold distance, the predefined field type to the corresponding key-value pair. The method also includes performing a document processing operation based on each key-value pair and the predefined field type assigned to each key-value pair. Related systems and methods are provided.
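The threshold test in this abstract can be sketched as a nearest-neighbor lookup. The abstract does not name the encoder or the distance metric, so this sketch assumes cosine distance and uses small hand-written vectors in place of real multilingual encodings. `cosine_distance` and `assign_field_type` are hypothetical names.

```python
import math


def cosine_distance(u, v):
    """1 minus cosine similarity; 0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def assign_field_type(key_encoding, predefined, threshold):
    """Return the closest predefined field type, or None if none is near enough.

    `predefined` maps a field type name to its reference encoding.
    """
    best_type, best_distance = None, float("inf")
    for field_type, reference in predefined.items():
        distance = cosine_distance(key_encoding, reference)
        if distance < best_distance:
            best_type, best_distance = field_type, distance
    return best_type if best_distance <= threshold else None


# Toy reference encodings (assumed values, not produced by a real encoder).
PREDEFINED = {
    "invoice_number": [1.0, 0.0, 0.0],
    "total_amount": [0.0, 1.0, 0.0],
}

# A key such as "Rechnungsnummer" would ideally encode close to "invoice_number".
german_key = [0.9, 0.1, 0.0]
field_type = assign_field_type(german_key, PREDEFINED, threshold=0.2)
```

Because the comparison happens in encoding space rather than on the raw key string, the same reference encodings can in principle serve keys written in different languages.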

    MULTI-MODE IDENTIFICATION OF DOCUMENT LAYOUTS

    Publication number: US20240193979A1

    Publication date: 2024-06-13

    Application number: US18064710

    Application date: 2022-12-12

    Applicant: SAP SE

    CPC classification number: G06V30/414 G06V30/416 G06V30/418 G06V2201/09

    Abstract: A method is provided for multi-mode identification of document layouts. The method may include determining, based on a received document, a plurality of layout characteristics including a spatial position of one or more document features included in the received document and/or a numeric representation of the one or more document features included in the received document. The method may include generating an aggregated similarity score by at least comparing the plurality of layout characteristics to a first plurality of predefined layout characteristics of a first predefined layout of a plurality of predefined layouts. The method may further include identifying a layout of the received document as the first predefined layout of the plurality of predefined layouts based on the aggregated similarity score meeting a threshold score. The method may also include performing a document processing operation based on the identified layout. Related systems and methods are provided.
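The aggregated similarity score in this abstract can be sketched with two modes: spatial positions of named features and a numeric representation of the document. The metrics, the equal weighting, and the choice to return the best-scoring layout above the threshold are all assumptions for illustration. Positions are taken to be normalized page coordinates in [0, 1], and all function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def position_similarity(doc_positions, ref_positions):
    """Average closeness of matching features; missing features score 0."""
    if not ref_positions:
        return 0.0
    total = 0.0
    for name, ref_xy in ref_positions.items():
        if name in doc_positions:
            # sqrt(2) is the largest distance possible on a unit page.
            distance = math.dist(doc_positions[name], ref_xy) / math.sqrt(2)
            total += 1.0 - min(distance, 1.0)
    return total / len(ref_positions)


def aggregated_score(doc, ref, weights=(0.5, 0.5)):
    """Weighted sum of the spatial score and the embedding score."""
    w_pos, w_emb = weights
    return (w_pos * position_similarity(doc["positions"], ref["positions"])
            + w_emb * cosine_similarity(doc["embedding"], ref["embedding"]))


def identify_layout(doc, layouts, threshold=0.8):
    """Name of the best predefined layout meeting the threshold, else None."""
    best_name, best_score = None, threshold
    for name, ref in layouts.items():
        score = aggregated_score(doc, ref)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

A document whose layout is identified this way could then be routed to extraction rules specific to that layout, which corresponds to the document processing operation mentioned in the abstract.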

    Densely connected convolutional neural network for service ticket classification

    Publication number: US11551053B2

    Publication date: 2023-01-10

    Application number: US16541963

    Application date: 2019-08-15

    Applicant: SAP SE

    Abstract: A method may include classifying a text by applying a dense convolutional neural network trained to classify the text. The dense convolutional neural network may include one or more dense convolution blocks, each of which includes a plurality of convolution layers. Each dense convolution block may be configured to operate on a different quantity of consecutive tokens from the text. Moreover, each of the plurality of convolution layers in a dense convolution block may operate on an input to the dense convolution block as well as an output from all preceding convolution layers in the dense convolution block. The text may correspond to an issue associated with a service ticket system. A response for addressing the issue associated with the text may be determined based on the classification of the text. Related systems and articles of manufacture are also provided.
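The dense connectivity described in this abstract, where every convolution layer sees the block input plus the outputs of all earlier layers, can be sketched structurally in plain Python. The weights below are random and untrained, and the zero padding, ReLU activation, and `growth` parameter (output channels added per layer) are assumptions; only the wiring pattern is the point. `kernel_width` plays the role of the number of consecutive tokens a block operates on.

```python
import random


def conv1d_relu(x, kernel):
    """1D convolution over the token axis with zero padding, then ReLU.

    x: list of token vectors (seq_len x in_channels).
    kernel: out_channels x kernel_width x in_channels weights.
    """
    width = len(kernel[0])
    left = (width - 1) // 2
    out = []
    for t in range(len(x)):
        row = []
        for filt in kernel:
            acc = 0.0
            for j in range(width):
                src = t + j - left
                if 0 <= src < len(x):  # positions outside the text count as zeros
                    acc += sum(w * v for w, v in zip(filt[j], x[src]))
            row.append(max(acc, 0.0))
        out.append(row)
    return out


def dense_block(x, kernel_width, num_layers, growth, rng):
    """Each layer consumes the block input and all preceding layer outputs."""
    features = [list(vec) for vec in x]
    for _ in range(num_layers):
        in_channels = len(features[0])
        kernel = [[[rng.uniform(-0.1, 0.1) for _ in range(in_channels)]
                   for _ in range(kernel_width)]
                  for _ in range(growth)]
        new = conv1d_relu(features, kernel)
        # Concatenate along the channel axis so later layers see everything.
        features = [old + added for old, added in zip(features, new)]
    return features
```

A classifier along the lines of the abstract might run several such blocks with different `kernel_width` values over the same token embeddings and pool their outputs before a final classification layer.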

    DENSELY CONNECTED CONVOLUTIONAL NEURAL NETWORK FOR SERVICE TICKET CLASSIFICATION

    Publication number: US20210049443A1

    Publication date: 2021-02-18

    Application number: US16541963

    Application date: 2019-08-15

    Applicant: SAP SE

    Abstract: A method may include classifying a text by applying a dense convolutional neural network trained to classify the text. The dense convolutional neural network may include one or more dense convolution blocks, each of which includes a plurality of convolution layers. Each dense convolution block may be configured to operate on a different quantity of consecutive tokens from the text. Moreover, each of the plurality of convolution layers in a dense convolution block may operate on an input to the dense convolution block as well as an output from all preceding convolution layers in the dense convolution block. The text may correspond to an issue associated with a service ticket system. A response for addressing the issue associated with the text may be determined based on the classification of the text. Related systems and articles of manufacture are also provided.

    Visually similar scene retrieval using coordinate data

    Publication number: US10783377B2

    Publication date: 2020-09-22

    Application number: US16218067

    Application date: 2018-12-12

    Applicant: SAP SE

    Abstract: Aspects of the present disclosure involve systems and methods for identifying a set of scenes that are visually similar to a target scene selected or otherwise identified by a match analyst. A scene retrieval platform performs operations for: receiving an input that comprises an identification of a scene; retrieving a set of coordinates based on the scene identified by the input, where the set of coordinates identify positions of the entities depicted within the frames; generating a set of vector values based on the coordinates of the entities depicted within each of the frames; concatenating the set of vector values to generate a concatenated vector value that represents the scene; generating a visual representation of the concatenated vector value; and identifying one or more similar scenes to the scene identified by the input based on the visual representation of the concatenated vector value.
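The concatenation step in this abstract can be sketched as flattening per-frame entity coordinates into one vector per scene. This sketch departs from the abstract in one respect: it skips the visual representation and ranks stored scenes by Euclidean distance between the concatenated vectors directly. It also assumes every scene has the same number of frames and entities, listed in a consistent order. `scene_vector` and `most_similar` are hypothetical names.

```python
import math


def scene_vector(frames):
    """Concatenate the (x, y) coordinates of every entity in every frame."""
    vector = []
    for frame in frames:
        for x, y in frame:
            vector.extend((x, y))
    return vector


def most_similar(target_frames, library, top_k=1):
    """Names of the stored scenes closest to the target scene.

    `library` maps a scene name to its list of frames. Scenes whose
    concatenated vector has a different length are skipped.
    """
    target = scene_vector(target_frames)
    scored = []
    for name, frames in library.items():
        candidate = scene_vector(frames)
        if len(candidate) == len(target):
            scored.append((math.dist(target, candidate), name))
    scored.sort()
    return [name for _, name in scored[:top_k]]
```

Comparing raw concatenated vectors is sensitive to entity ordering, which may be one reason the abstract works with a derived representation rather than the coordinates themselves.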
