Patent search: ap:("SAP SE") AND inv:"Alexey Streltsov"

1.

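Invention application
GENERATING DATA REGULATION COMPLIANT DATA FROM APPLICATION INTERFACE DATA (status: in force)

Publication (announcement) number: US20230053109A1

Publication (announcement) date: 2023-02-16

Application number: US17402940

Application date: 2021-08-16

Applicant: SAP SE

Inventor: Igor Schukovets, Alexey Streltsov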

IPC: G06F16/93

Abstract: The present disclosure involves systems, software, and computer-implemented methods for generating data regulation-compliant data from application interface data. One example method includes receiving a request for creation of document data. The request includes personal data of a user. Document data, including at least some of the personal data, is created based on the request. The document data is encoded into an encoded document that does not include any personal data of the user and includes structural information that describes the structure of the document data. A request to use the encoded document is received and the encoded document is decoded. A synthetic document is generated using the structural information included in the encoded document. Generation of the synthetic document includes insertion of synthetic user data into the synthetic document at positions in the synthetic document that correspond to positions of personal data within the document data.
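
The encode and regenerate steps described in this abstract can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the patented method: the `{{field}}` placeholder syntax, the `EncodedDocument` type, and the `encode` and `generate_synthetic` helpers are all hypothetical names, and plain string replacement stands in for whatever encoding the patent actually claims.

```python
import re
from dataclasses import dataclass


@dataclass
class EncodedDocument:
    # Document text with every personal value replaced by a placeholder.
    # The placeholders carry the structural information: they mark where
    # personal data sat in the original document.
    template: str
    # Names of the personal-data fields that were found and replaced.
    fields: list[str]


def encode(document: str, personal_data: dict[str, str]) -> EncodedDocument:
    """Replace each personal value with a hypothetical {{field}} placeholder."""
    template = document
    fields = []
    for field, value in personal_data.items():
        if value and value in template:
            template = template.replace(value, "{{" + field + "}}")
            fields.append(field)
    return EncodedDocument(template, fields)


def generate_synthetic(encoded: EncodedDocument, synthetic_data: dict[str, str]) -> str:
    """Insert synthetic values at the positions the placeholders mark."""
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda match: synthetic_data[match.group(1)],
        encoded.template,
    )
```

For example, encoding `"Invoice for Jane Doe, jane@example.com."` with `{"name": "Jane Doe", "email": "jane@example.com"}` yields a template that contains no personal values, and `generate_synthetic` then fills the same positions with made-up values. The sketch assumes that no personal value is a substring of another and that field names consist of word characters only.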

2.

Invention publication
AUGMENTING ELECTRONIC DOCUMENTS TO GENERATE SYNTHETIC TRAINING DATA SETS (status: under examination, published)

Publication (announcement) number: US20230334309A1

Publication (announcement) date: 2023-10-19

Application number: US17720658

Application date: 2022-04-14

Applicant: SAP SE

Inventor: Alexey Streltsov, Monit Shah Singh, Dhananjay Tomar, Christian Reisswig, Minh Duc Bui

IPC: G06N3/08

CPC classification number: G06N3/08

Abstract: Systems, methods, and computer-readable media for generating a synthetic training data set from an original unstructured electronic document are disclosed. The synthetic training data set may be used to train a deep learning model to extract data from the original electronic document. The original electronic document may comprise annotated data fields. Each annotated data field may comprise a bounding box and a label. The original electronic document may comprise a header, a table, and a footer. Macro augmentation operations may be applied to the original electronic document to create sub-templates representative of distinct page layouts in the original electronic document. The synthetic training data set may be generated by applying geometric and semantic data augmentations to the sub-templates and the original electronic documents. The synthetic training data set may then be provided to the deep learning model for training.
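
As a rough illustration of the geometric and semantic augmentations mentioned in this abstract, the following Python sketch perturbs annotated fields. Everything in it is an assumption made for illustration: the `Field` type, the uniform scale-and-shift transform, the `vocab` lookup of replacement values, and the parameter ranges do not come from the patent, and the macro augmentation step that builds sub-templates is not covered.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Field:
    label: str                               # annotation label, e.g. "total"
    text: str                                # field content
    box: tuple[float, float, float, float]   # (x0, y0, x1, y1) page coordinates


def geometric_augment(fields, scale, dx, dy):
    """Scale every bounding box about the origin, then shift it by (dx, dy)."""
    out = []
    for f in fields:
        x0, y0, x1, y1 = f.box
        new_box = (x0 * scale + dx, y0 * scale + dy,
                   x1 * scale + dx, y1 * scale + dy)
        out.append(replace(f, box=new_box))
    return out


def semantic_augment(fields, vocab, rng):
    """Swap each field's text for a random value listed under the same label."""
    return [
        replace(f, text=rng.choice(vocab[f.label])) if f.label in vocab else f
        for f in fields
    ]


def synthesize(fields, vocab, n, seed=0):
    """Produce n augmented copies of one annotated page (hypothetical ranges)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        scale = rng.uniform(0.9, 1.1)
        dx, dy = rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0)
        moved = geometric_augment(fields, scale, dx, dy)
        samples.append(semantic_augment(moved, vocab, rng))
    return samples
```

Labels are left untouched by both steps, so each synthetic sample keeps the annotation scheme of the original page while its layout and content vary.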

3.

Invention application
MODEL-INDEPENDENT CONFIDENCE VALUE PREDICTION MACHINE LEARNED MODEL (status: in force)

Publication (announcement) number: US20220366301A1

Publication (announcement) date: 2022-11-17

Application number: US17354202

Application date: 2021-06-22

Applicant: SAP SE

Inventor: Nurzat Rakhmanberdieva, Alexey Streltsov, Christian Reisswig

IPC: G06N20/00, G06K9/62, G06K9/00, G06N3/02

Abstract: In an example embodiment, a confidence score is computed for a predicted label (from a first model) for information extracted from a document. The confidence score is computed using a machine learned model that is different from the first model and is based on a Sliding-Window method. The Sliding-Window method may be based on convolutional neural network classification using sliding windows. The method receives as input (1) the string of extracted information from an independent previous information extraction step (the “input text”), (2) the string's predicted class label, (3) the string's coordinate location in the document, and (4) the text of the document (for additional context information). The Sliding-Window method's task is to predict a confidence score that indicates whether the predicted label for the information is correct.
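
The four inputs listed in this abstract, and the general idea of a sliding window over the document text, can be sketched in Python as follows. This is a hedged illustration only: the `ConfidenceInput` type and the `sliding_windows` helper are hypothetical names, token-level windows are an assumption, and the convolutional classifier that would score the windows is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class ConfidenceInput:
    input_text: str                               # (1) extracted string
    predicted_label: str                          # (2) label predicted by the first model
    location: tuple[float, float, float, float]   # (3) coordinates in the document
    document_text: str                            # (4) full document text for context


def sliding_windows(tokens, size, stride=1):
    """Return consecutive windows of `size` tokens, advancing by `stride`.

    A sequence shorter than `size` yields one window holding all tokens.
    With stride > 1, trailing tokens that do not fill a window are dropped.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    if len(tokens) <= size:
        return [list(tokens)]
    return [list(tokens[i:i + size])
            for i in range(0, len(tokens) - size + 1, stride)]
```

In such a setup, each window over `document_text.split()` would be passed, together with the other three inputs, to a separately trained classifier that outputs the confidence score.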

4.

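Granted invention patent
Generating data regulation compliant data from application interface data (status: in force)

Publication (announcement) number: US12079284B2

Publication (announcement) date: 2024-09-03

Application number: US17402940

Application date: 2021-08-16

Applicant: SAP SE

Inventor: Igor Schukovets, Alexey Streltsov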

IPC: G06F16/93

CPC classification number: G06F16/93

Abstract: The present disclosure involves systems, software, and computer-implemented methods for generating data regulation-compliant data from application interface data. One example method includes receiving a request for creation of document data. The request includes personal data of a user. Document data, including at least some of the personal data, is created based on the request. The document data is encoded into an encoded document that does not include any personal data of the user and includes structural information that describes the structure of the document data. A request to use the encoded document is received and the encoded document is decoded. A synthetic document is generated using the structural information included in the encoded document. Generation of the synthetic document includes insertion of synthetic user data into the synthetic document at positions in the synthetic document that correspond to positions of personal data within the document data.

5.

Invention application
DEEP NEURAL NETWORK FOR MATCHING ENTITIES IN SEMI-STRUCTURED DATA (status: in force)

Publication (announcement) number: US20220092405A1

Publication (announcement) date: 2022-03-24

Application number: US17025845

Application date: 2020-09-18

Applicant: SAP SE

Inventor: Matthias Frank, Hoang-Vu Nguyen, Stefan Klaus Baur, Alexey Streltsov, Jasmin Mankad, Cordula Guder, Konrad Schenk, Philipp Lukas Jamscikov, Rohit Kumar Gupta

IPC: G06N3/08, G06N3/04, G06F16/81

Abstract: In an example embodiment, a deep neural network may be utilized to determine matches between candidate pairs of entities, as well as confidence scores that reflect how certain the deep neural network is about the corresponding match. The deep neural network is also able to find these matches without requiring domain knowledge that would be required if features for a machine-learned model were handcrafted, which is a drawback of prior art machine-learned models used to match entities in multiple tables. Thus, the deep neural network improves on the functioning of prior art machine learned models designed to perform the same tasks. Specifically, the deep neural network learns the relationships of tabular fields and the patterns that define a match from historical data alone, making this approach generic and applicable independent of the context.
