Patent search ap:("SAP SE") AND inv:"Manuel Zeise" Page 1

1.

Granted invention patent
Hyper-parameter space optimization for machine learning data processing pipeline [Active]

Publication (announcement) number: US11544136B1

Publication (announcement) date: 2023-01-03

Application number: US17395094

Application date: 2021-08-05

Applicant: SAP SE

Inventor: Isil Pekel, Steven Jaeger, Manuel Zeise

IPC: G06F11/07, G06N20/00, G06F11/36, G06N20/20

Abstract: A data processing pipeline may be generated to include an orchestrator node, a preparator node, and an executor node. The preparator node may generate a training dataset. The executor node may execute machine learning trials by applying, to the training dataset, a machine learning model and/or a different set of trial parameters. The orchestrator node may identify, based on a result of the machine learning trials, a machine learning model for performing a task. Data associated with the execution of the data processing pipeline may be collected for storage in a tracking database. A report including de-normalized and enriched data from the tracking database may be generated. The hyper-parameter space of the machine learning model may be analyzed based on the report. A root cause of at least one fault associated with the execution of the data processing pipeline may be identified based on the analysis.
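
The reporting step in this abstract can be illustrated with a minimal sketch. It assumes a tracking database with two hypothetical tables (pipeline runs and trials); the table layout, field names, and the failure-rate heuristic are illustrative assumptions, not details taken from the patent.

```python
from collections import Counter

# Hypothetical tracking-database tables; all names and values are illustrative.
RUNS = {
    1: {"dataset": "sales.csv", "rows": 1000},
    2: {"dataset": "churn.csv", "rows": 50},
}
TRIALS = [
    {"run_id": 1, "model": "gbt", "max_depth": 3, "status": "ok"},
    {"run_id": 1, "model": "gbt", "max_depth": 12, "status": "failed"},
    {"run_id": 2, "model": "gbt", "max_depth": 12, "status": "failed"},
    {"run_id": 2, "model": "linear", "max_depth": None, "status": "ok"},
]


def build_report(runs, trials):
    """De-normalize and enrich: copy run-level fields into every trial row."""
    return [{**trial, **runs[trial["run_id"]]} for trial in trials]


def failure_rate_by(report, key):
    """Share of failed trials for each observed value of one hyper-parameter."""
    total = Counter(row[key] for row in report)
    failed = Counter(row[key] for row in report if row["status"] == "failed")
    return {value: failed[value] / total[value] for value in total}


report = build_report(RUNS, TRIALS)
rates = failure_rate_by(report, "max_depth")
```

In this toy data, every trial with `max_depth` 12 failed, which would point an analyst at that region of the hyper-parameter space as a candidate root cause.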

2.

Invention application
RUNTIME ESTIMATION FOR MACHINE LEARNING DATA PROCESSING PIPELINE [Active]

Publication (announcement) number: US20220092470A1

Publication (announcement) date: 2022-03-24

Application number: US17031661

Application date: 2020-09-24

Applicant: SAP SE

Inventor: Steven Jaeger, Isil Pekel, Manuel Zeise

IPC: G06N20/00

Abstract: Inputs may be received for constructing a data processing pipeline configured to implement a process to generate a machine learning model for performing a task associated with an input dataset. The process may include a plurality of machine learning trials, each of which applies, to a training dataset and/or a validation dataset generated based on the input dataset, a different type of machine learning model and/or a different set of trial parameters. The machine learning model may be generated based on a result of the plurality of machine learning trials. A runtime estimate for the process to generate the machine learning model may be determined. The runtime estimate may enable the allocation of a sufficient time budget for the process. Moreover, the process may be executed if the runtime of the process does not exceed the available time budget.
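
The budget check described here can be sketched as follows. The cost model (a fixed number of milliseconds per row for each model type) and all names are hypothetical; the abstract does not say how the estimate is computed.

```python
# Hypothetical per-row cost of one trial, in milliseconds, by model type.
MS_PER_ROW = {"linear": 1, "gbt": 10}


def estimate_runtime_ms(n_rows, trial_models, ms_per_row=MS_PER_ROW):
    """Sum a simple per-trial cost over all planned trials."""
    return sum(ms_per_row[model] * n_rows for model in trial_models)


def run_if_within_budget(estimate_ms, budget_ms, process):
    """Execute the process only when the estimate fits the time budget."""
    if estimate_ms > budget_ms:
        return None
    return process()


estimate = estimate_runtime_ms(1000, ["linear", "gbt"])
result = run_if_within_budget(estimate, 20_000, lambda: "model generated")
skipped = run_if_within_budget(estimate, 5_000, lambda: "model generated")
```

With these assumed costs the estimate is 11,000 ms, so the process runs under a 20-second budget and is skipped under a 5-second one.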

3.

Invention publication
MULTI-LANGUAGE DOCUMENT FIELD EXTRACTION [Pending (published)]

Publication (announcement) number: US20240273290A1

Publication (announcement) date: 2024-08-15

Application number: US18168450

Application date: 2023-02-13

Applicant: SAP SE

Inventor: Manuel Zeise, Marius Lehne

IPC: G06F40/279, G06F40/126, G06F40/263

CPC classification number: G06F40/279, G06F40/126, G06F40/263, G06N3/088

Abstract: A method for multi-language document field extraction may include determining, based on a received document including a plurality of key fields and a plurality of value fields, a plurality of key-value pairs. The method also includes determining whether an encoding of a key field is within a threshold distance from a predetermined encoding of a predefined key field associated with a predefined field type. The method further includes assigning, based on determining the encoding of the key field is within the threshold distance, the predefined field type to the corresponding key-value pair. The method also includes performing a document processing operation based on each key-value pair and the predefined field type assigned to each key-value pair. Related systems and methods are provided.
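
The threshold test on key-field encodings can be sketched as below. The three-dimensional vectors stand in for encodings that a real system would obtain from some multilingual encoder; the vectors, the cosine distance metric, and the threshold value are assumptions made for illustration.

```python
import math

# Hypothetical predetermined encodings of predefined key fields.
PREDEFINED = {
    "invoice_number": [1.0, 0.0, 0.0],
    "total_amount": [0.0, 1.0, 0.0],
}


def cosine_distance(a, b):
    """1 minus cosine similarity; 0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))


def assign_field_type(key_encoding, threshold=0.2):
    """Return the closest predefined field type within the threshold, else None."""
    best_type, best_distance = None, threshold
    for field_type, reference in PREDEFINED.items():
        distance = cosine_distance(key_encoding, reference)
        if distance <= best_distance:
            best_type, best_distance = field_type, distance
    return best_type


# E.g. the encoding of "Rechnungsnummer" is assumed to land near "invoice_number".
matched = assign_field_type([0.9, 0.1, 0.0])
unmatched = assign_field_type([0.0, 0.0, 1.0])
```

Because matching happens in the encoding space rather than on the literal key text, the same predefined field type can be assigned to keys written in different languages.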

4.

Granted invention patent
Preparing data for machine learning processing [Active]

Publication (announcement) number: US11886961B2

Publication (announcement) date: 2024-01-30

Application number: US16582950

Application date: 2019-09-25

Applicant: SAP SE

Inventor: Manuel Zeise, Isil Pekel, Steven Jaeger

IPC: G06N20/00, G06N20/10, G06N3/04, G06N20/20, G06F18/214, G06F18/2413, G06N3/08

CPC classification number: G06N20/00, G06F18/214, G06F18/2414, G06N3/04, G06N3/08, G06N20/10, G06N20/20

Abstract: Data for processing by a machine learning model may be prepared by encoding a first portion of the data including spatial data. The spatial data may include a spatial coordinate including one or more values identifying a geographical location. The encoding of the first portion of the data may include mapping, to a cell in a grid system, the spatial coordinate such that the spatial coordinate is represented by an identifier of the cell instead of the one or more values. The data may be further prepared by embedding a second portion of the data including textual data, preparing a third portion of the data including hierarchical data, and/or preparing a fourth portion of the data including numerical data. The machine learning model may be applied to the prepared data in order to train, validate, test, and/or deploy the machine learning model to perform a cognitive task.
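
A minimal sketch of the spatial encoding step, assuming a plain latitude/longitude grid with square cells. The abstract does not name the grid system, so the cell size and numbering scheme here are illustrative only.

```python
import math


def cell_id(lat, lon, cell_deg=1.0):
    """Map a (lat, lon) coordinate to the integer identifier of its grid cell.

    Assumes -90 <= lat < 90 and -180 <= lon < 180; cells are numbered
    row by row starting at the south-west corner.
    """
    row = math.floor((lat + 90.0) / cell_deg)
    col = math.floor((lon + 180.0) / cell_deg)
    cols_per_row = math.ceil(360.0 / cell_deg)
    return row * cols_per_row + col


same_cell_a = cell_id(0.5, 0.5)
same_cell_b = cell_id(0.9, 0.1)
other_cell = cell_id(1.5, 0.5)
```

Two nearby coordinates collapse to the same identifier, so the model sees one categorical value in place of two continuous ones.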

5.

Invention publication
MULTI-MODE IDENTIFICATION OF DOCUMENT LAYOUTS [Pending (published)]

Publication (announcement) number: US20240193979A1

Publication (announcement) date: 2024-06-13

Application number: US18064710

Application date: 2022-12-12

Applicant: SAP SE

Inventor: Manuel Zeise, Marius Lehne

IPC: G06V30/414, G06V30/416, G06V30/418

CPC classification number: G06V30/414, G06V30/416, G06V30/418, G06V2201/09

Abstract: A method is provided for multi-mode identification of document layouts. The method may include determining, based on a received document, a plurality of layout characteristics including a spatial position of one or more document features included in the received document and/or a numeric representation of the one or more document features included in the received document. The method may include generating an aggregated similarity score by at least comparing the plurality of layout characteristics to a first plurality of predefined layout characteristics of a first predefined layout of a plurality of predefined layouts. The method may further include identifying a layout of the received document as the first predefined layout of the plurality of predefined layouts based on the aggregated similarity score meeting a threshold score. The method may also include performing a document processing operation based on the identified layout. Related systems and methods are provided.
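
The aggregated similarity score can be sketched as a weighted combination of two modes. The feature names, the equal weighting, the distance-based spatial score, and the threshold are all hypothetical choices; the abstract only states that the characteristics are compared and the score is checked against a threshold.

```python
import math

# Hypothetical predefined layouts: one feature position plus a numeric representation.
LAYOUTS = {
    "invoice_a": {"position": (0.1, 0.1), "embedding": [1.0, 0.0]},
    "invoice_b": {"position": (0.9, 0.9), "embedding": [0.0, 1.0]},
}


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def aggregated_score(doc, layout, spatial_weight=0.5):
    """Combine a spatial-position score and a numeric-representation score."""
    spatial = max(0.0, 1.0 - math.dist(doc["position"], layout["position"]))
    numeric = cosine_similarity(doc["embedding"], layout["embedding"])
    return spatial_weight * spatial + (1.0 - spatial_weight) * numeric


def identify_layout(doc, layouts=LAYOUTS, threshold=0.8):
    """Return the first predefined layout whose score meets the threshold."""
    for name, layout in layouts.items():
        if aggregated_score(doc, layout) >= threshold:
            return name
    return None


exact = identify_layout({"position": (0.1, 0.1), "embedding": [1.0, 0.0]})
close = identify_layout({"position": (0.9, 0.85), "embedding": [0.1, 0.9]})
unknown = identify_layout({"position": (0.5, 0.5), "embedding": [1.0, 1.0]})
```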

6.

Granted invention patent
Machine learning data processing pipeline [Active]

Publication (announcement) number: US11443234B2

Publication (announcement) date: 2022-09-13

Application number: US16582946

Application date: 2019-09-25

Applicant: SAP SE

Inventor: Manuel Zeise, Isil Pekel, Steven Jaeger

IPC: G06F9/44, G06N20/00, G06F16/901, G06F11/34

Abstract: A user interface may be generated to receive inputs for constructing a data processing pipeline that includes an orchestrator node, a preparator node, and an executor node. The preparator node may generate a training dataset and a validation dataset for a machine learning model. The executor node may execute machine learning trials by applying, to the training dataset and the validation dataset, machine learning models having different sets of trial parameters. The orchestrator node may identify, based on a result of the machine learning trials, an optimal machine learning model for performing a task. The data processing pipeline may be adapted dynamically based on the input dataset and/or computational resource budget. The optimal machine learning model for performing the task may be generated by executing, based on the graph, the data processing pipeline including the orchestrator node, the preparator node, and the executor node.
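
The division of labor between the three nodes can be sketched with plain functions. The "model" here is a one-parameter regularized regression chosen only to keep the example short; the function names, the split ratio, and the trial parameter are assumptions, not the patented implementation.

```python
def preparator(dataset, validation_fraction=0.25):
    """Split (x, y) pairs into a training and a validation dataset."""
    split = int(len(dataset) * (1 - validation_fraction))
    return dataset[:split], dataset[split:]


def executor(train, validation, regularization):
    """Run one trial: fit y = slope * x on train, return validation error."""
    sum_xy = sum(x * y for x, y in train)
    sum_xx = sum(x * x for x, _ in train)
    slope = sum_xy / (sum_xx + regularization)
    return sum((y - slope * x) ** 2 for x, y in validation) / len(validation)


def orchestrator(dataset, trial_parameters):
    """Dispatch trials and pick the parameter with the lowest validation error."""
    train, validation = preparator(dataset)
    results = {p: executor(train, validation, p) for p in trial_parameters}
    return min(results, key=results.get)


data = [(1, 2), (2, 4), (3, 6), (4, 8)]  # toy dataset following y = 2x
best_parameter = orchestrator(data, [0.0, 10.0])
```

On this noise-free toy data the unregularized trial recovers the slope exactly, so the orchestrator selects the parameter 0.0.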

7.

Invention application
MACHINE LEARNING DATA PROCESSING PIPELINE [Active]

Publication (announcement) number: US20210089961A1

Publication (announcement) date: 2021-03-25

Application number: US16582946

Application date: 2019-09-25

Applicant: SAP SE

Inventor: Manuel Zeise, Isil Pekel, Steven Jaeger

IPC: G06N20/00, G06F11/34, G06F16/901

Abstract: A user interface may be generated to receive inputs for constructing a data processing pipeline that includes an orchestrator node, a preparator node, and an executor node. The preparator node may generate a training dataset and a validation dataset for a machine learning model. The executor node may execute machine learning trials by applying, to the training dataset and the validation dataset, machine learning models having different sets of trial parameters. The orchestrator node may identify, based on a result of the machine learning trials, an optimal machine learning model for performing a task. The data processing pipeline may be adapted dynamically based on the input dataset and/or computational resource budget. The optimal machine learning model for performing the task may be generated by executing, based on the graph, the data processing pipeline including the orchestrator node, the preparator node, and the executor node.

8.

Invention publication
Self-Attentive Key-Value Extraction [Pending (published)]

Publication (announcement) number: US20240289557A1

Publication (announcement) date: 2024-08-29

Application number: US18113903

Application date: 2023-02-24

Applicant: SAP SE

Inventor: Eduardo Vellasques, Xiang Yu, Stefan Klaus Baur, Manuel Zeise

IPC: G06F40/40, G06F16/33, G06F40/284

CPC classification number: G06F40/40, G06F16/3347, G06F40/284

Abstract: Systems and methods are provided for automated identification of key-value pairs in documents. A document including readable text is received. The document is processed to determine, from the readable text, a plurality of tokens. Pairs of vectors corresponding to the plurality of tokens are determined, each pair of vectors comprising a query vector and a key vector. Attention scores are determined for the plurality of tokens by using the pairs of vectors. The attention scores are normalized to generate normalized attention scores. Connected tokens are identified in the plurality of tokens using the normalized attention scores.
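
The scoring steps in this abstract can be sketched as follows. The sketch assumes dot-product attention scores and softmax normalization, and links two tokens when the normalized score passes a threshold; the abstract names none of these specifics, and the hand-written vectors stand in for learned ones.

```python
import math


def softmax(row):
    """Normalize one row of attention scores so it sums to 1."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def connected_tokens(tokens, queries, keys, threshold=0.6):
    """Link token i to token j when i's normalized attention on j is high."""
    links = []
    for i, query in enumerate(queries):
        scores = [sum(a * b for a, b in zip(query, key)) for key in keys]
        weights = softmax(scores)
        for j, weight in enumerate(weights):
            if i != j and weight >= threshold:
                links.append((tokens[i], tokens[j]))
    return links


tokens = ["Total:", "42.00", "Date:"]
queries = [[4.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # illustrative query vectors
keys = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]     # illustrative key vectors
links = connected_tokens(tokens, queries, keys)
```

Only the "Total:" token attends strongly to another token here, so the single link pairs the key "Total:" with the value "42.00".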

9.

Granted invention patent
Optimizations for machine learning data processing pipeline [Active]

Publication (announcement) number: US11797885B2

Publication (announcement) date: 2023-10-24

Application number: US17031665

Application date: 2020-09-24

Applicant: SAP SE

Inventor: Steven Jaeger, Isil Pekel, Manuel Zeise

IPC: G06N20/00

CPC classification number: G06N20/00

Abstract: A data processing pipeline may be generated to include an orchestrator node, a preparator node, and an executor node. The preparator node may generate a training dataset. The executor node may execute machine learning trials by applying, to the training dataset, a machine learning model and/or a different set of trial parameters. The orchestrator node may identify, based on a result of the machine learning trials, a machine learning model for performing a task. The execution of the data processing pipeline may be optimized. Examples of optimizations include pooling multiple machine learning trials for execution at a single executor node, executing at least some machine learning trials using a sub-sample of the training dataset, and adjusting a proportion of trial parameters sampled from a uniform distribution to avoid a premature convergence to a local minimum within the hyper-parameter space for generating the machine learning model.
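
The third optimization, mixing uniformly sampled trial parameters with samples drawn near the best value found so far, can be sketched as below. The Gaussian proposal around the current best, the clamping, and all parameter names are assumptions; the abstract only states that the proportion of uniform samples is adjusted.

```python
import random


def sample_trial_parameters(rng, count, low, high, best, uniform_fraction, spread=0.1):
    """Draw trial parameters, a fraction uniformly and the rest near `best`.

    A higher uniform_fraction means more exploration of the whole range,
    which lowers the risk of settling early on a local minimum.
    """
    samples = []
    for _ in range(count):
        if rng.random() < uniform_fraction:
            value = rng.uniform(low, high)  # explore the full range
        else:
            value = rng.gauss(best, spread * (high - low))  # exploit near best
            value = min(high, max(low, value))  # clamp to the valid range
        samples.append(value)
    return samples


rng = random.Random(0)
explore = sample_trial_parameters(rng, 20, 0.0, 1.0, best=0.5, uniform_fraction=1.0)
exploit = sample_trial_parameters(rng, 20, 0.0, 1.0, best=0.5, uniform_fraction=0.0, spread=0.0)
mixed = sample_trial_parameters(rng, 20, 0.0, 1.0, best=0.5, uniform_fraction=0.3)
```

Setting `uniform_fraction` to 1.0 reduces this to plain random search, while 0.0 samples only around the current best value.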

10.

Invention application
OPTIMIZATIONS FOR MACHINE LEARNING DATA PROCESSING PIPELINE [Active]

Publication (announcement) number: US20220092471A1

Publication (announcement) date: 2022-03-24

Application number: US17031665

Application date: 2020-09-24

Applicant: SAP SE

Inventor: Steven Jaeger, Isil Pekel, Manuel Zeise

IPC: G06N20/00

Abstract: A data processing pipeline may be generated to include an orchestrator node, a preparator node, and an executor node. The preparator node may generate a training dataset. The executor node may execute machine learning trials by applying, to the training dataset, a machine learning model and/or a different set of trial parameters. The orchestrator node may identify, based on a result of the machine learning trials, a machine learning model for performing a task. The execution of the data processing pipeline may be optimized. Examples of optimizations include pooling multiple machine learning trials for execution at a single executor node, executing at least some machine learning trials using a sub-sample of the training dataset, and adjusting a proportion of trial parameters sampled from a uniform distribution to avoid a premature convergence to a local minimum within the hyper-parameter space for generating the machine learning model.
