Database Calculation Engine With Nested Multiprovider Merging
    Invention application; status: under examination, published

    Publication number: US20160350374A1

    Publication date: 2016-12-01

    Application number: US14723205

    Filing date: 2015-05-27

    Applicant: SAP SE

    CPC classification number: G06F17/30466

    Abstract: A query is received by a database server from a remote application server that is associated with a calculation scenario that defines a data flow model including one or more calculation nodes including stacked multiproviders. Subsequently, the database server instantiates the calculation scenario and afterwards optimizes the calculation scenario. As part of the optimization, the calculation scenario is optimized by merging the two multiproviders. Thereafter, the operations defined by the calculation nodes of the optimized calculation scenario can be executed to result in a responsive data set. Next, the data set is provided to the application server by the database server.
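
The merging step in this abstract can be illustrated with a minimal sketch. It assumes a multiprovider behaves like a union over its inputs, so a multiprovider stacked directly on another multiprovider can absorb the inner node's inputs. The `Node` class, the `kind` labels, and `merge_stacked` are hypothetical names chosen for illustration; the abstract does not describe the engine's data structures, and a real engine would also have to reconcile filters, mappings, and aggregations that this sketch ignores.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical calculation node: a data source or a multiprovider."""
    name: str
    kind: str  # "source" or "multiprovider" (labels assumed for this sketch)
    inputs: list[Node] = field(default_factory=list)


def merge_stacked(node: Node) -> Node:
    """Return a copy of the tree in which nested multiproviders are flattened."""
    merged_inputs = []
    for child in node.inputs:
        child = merge_stacked(child)  # flatten deeper levels first
        if node.kind == "multiprovider" and child.kind == "multiprovider":
            # The outer multiprovider absorbs the inner one's inputs.
            merged_inputs.extend(child.inputs)
        else:
            merged_inputs.append(child)
    return Node(node.name, node.kind, merged_inputs)


inner = Node("mp_inner", "multiprovider", [Node("a", "source"), Node("b", "source")])
outer = Node("mp_outer", "multiprovider", [inner, Node("c", "source")])
flat = merge_stacked(outer)
print([n.name for n in flat.inputs])
```

After merging, the outer node reads the three sources directly, so the executed plan has one union-like step instead of two.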

    Preparing data for machine learning processing

    Publication number: US11886961B2

    Publication date: 2024-01-30

    Application number: US16582950

    Filing date: 2019-09-25

    Applicant: SAP SE

    Abstract: Data for processing by a machine learning model may be prepared by encoding a first portion of the data including spatial data. The spatial data may include a spatial coordinate including one or more values identifying a geographical location. The encoding of the first portion of the data may include mapping, to a cell in a grid system, the spatial coordinate such that the spatial coordinate is represented by an identifier of the cell instead of the one or more values. The data may be further prepared by embedding a second portion of the data including textual data, preparing a third portion of the data including hierarchical data, and/or preparing a fourth portion of the data including numerical data. The machine learning model may be applied to the prepared data in order to train, validate, test, and/or deploy the machine learning model to perform a cognitive task.
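
The spatial encoding described in this abstract, replacing a coordinate with the identifier of the grid cell that contains it, might look like the sketch below. The fixed-size latitude/longitude grid, the `cell_deg` parameter, and the row-major numbering are assumptions made for illustration; the abstract does not say which grid system is used.

```python
import math


def cell_id(lat: float, lon: float, cell_deg: float = 0.5) -> int:
    """Map a (lat, lon) pair to the integer id of its cell in a regular grid.

    The grid layout here is an assumption: square cells of `cell_deg` degrees,
    numbered row by row starting at the south-west corner.
    """
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("coordinate out of range")
    rows = math.ceil(180 / cell_deg)
    cols = math.ceil(360 / cell_deg)
    # Clamp so that the northern and eastern edges fall into the last cell.
    row = min(int((lat + 90) // cell_deg), rows - 1)
    col = min(int((lon + 180) // cell_deg), cols - 1)
    return row * cols + col


# Two nearby points share a cell, so a model sees one categorical value.
print(cell_id(52.52, 13.40) == cell_id(52.53, 13.41))
```

The cell identifier can then be treated as a categorical feature, which is the effect the abstract describes: the model receives the cell rather than the raw coordinate values.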

    Database calculation engine with nested multiprovider merging

    Publication number: US10324930B2

    Publication date: 2019-06-18

    Application number: US14723205

    Filing date: 2015-05-27

    Applicant: SAP SE

    Abstract: A query is received by a database server from a remote application server that is associated with a calculation scenario that defines a data flow model including one or more calculation nodes including stacked multiproviders. Subsequently, the database server instantiates the calculation scenario and afterwards optimizes the calculation scenario. As part of the optimization, the calculation scenario is optimized by merging the two multiproviders. Thereafter, the operations defined by the calculation nodes of the optimized calculation scenario can be executed to result in a responsive data set. Next, the data set is provided to the application server by the database server.

    Data-driven union pruning in a database semantic layer

    Publication number: US10324927B2

    Publication date: 2019-06-18

    Application number: US14946658

    Filing date: 2015-11-19

    Applicant: SAP SE

    Abstract: Methods and apparatus, including computer program products, are provided for union node pruning. In one aspect, there is provided a method, which may include receiving, by a calculation engine, a query; processing a calculation scenario including a union node; accessing a pruning table associated with the union node, wherein the pruning table includes semantic information describing the first input from the first data source node and the second input from the second data source node; determining whether the first data source node and the second data source node can be pruned by at least comparing the semantic information to at least one filter of the query; and pruning, based on a result of the determining, at least one of the first data source node or the second data source node. Related apparatus, systems, methods, and articles are also described.
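
The comparison of a pruning table against a query filter can be sketched as follows. The table layout (one value range per column and data source), the filter format, and the function name `prune_union_inputs` are hypothetical; the abstract only says that the pruning table holds semantic information about each union input.

```python
def prune_union_inputs(pruning_table, query_filter):
    """Return the union inputs that can still contribute rows.

    pruning_table: {source_name: {column: (low, high)}}, assumed layout
    query_filter:  {column: (low, high)}, inclusive range filter on the query
    """
    kept = []
    for source, ranges in pruning_table.items():
        can_match = True
        for column, (f_low, f_high) in query_filter.items():
            if column not in ranges:
                continue  # nothing known about this column: cannot prune on it
            s_low, s_high = ranges[column]
            if s_high < f_low or s_low > f_high:
                can_match = False  # ranges do not overlap: source is prunable
                break
        if can_match:
            kept.append(source)
    return kept


table = {
    "sales_2014": {"year": (2014, 2014)},
    "sales_2015": {"year": (2015, 2015)},
}
print(prune_union_inputs(table, {"year": (2015, 2015)}))
```

A source is dropped only when the table proves it cannot satisfy the filter; a filter on a column the table does not describe leaves every input in place.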

    Hyper-parameter space optimization for machine learning data processing pipeline

    Publication number: US11544136B1

    Publication date: 2023-01-03

    Application number: US17395094

    Filing date: 2021-08-05

    Applicant: SAP SE

    Abstract: A data processing pipeline may be generated to include an orchestrator node, a preparator node, and an executor node. The preparator node may generate a training dataset. The executor node may execute machine learning trials by applying, to the training dataset, a machine learning model and/or a different set of trial parameters. The orchestrator node may identify, based on a result of the machine learning trials, a machine learning model for performing a task. Data associated with the execution of the data processing pipeline may be collected for storage in a tracking database. A report including de-normalized and enriched data from the tracking database may be generated. The hyper-parameter space of the machine learning model may be analyzed based on the report. A root cause of at least one fault associated with the execution of the data processing pipeline may be identified based on the analysis.

    Runtime estimation for machine learning data processing pipeline

    Publication number: US20220092470A1

    Publication date: 2022-03-24

    Application number: US17031661

    Filing date: 2020-09-24

    Applicant: SAP SE

    Abstract: Inputs may be received for constructing a data processing pipeline configured to implement a process to generate a machine learning model for performing a task associated with an input dataset. The process may include a plurality of machine learning trials, each of which applies, to a training dataset and/or a validation dataset generated based on the input dataset, a different type of machine learning model and/or a different set of trial parameters. The machine learning model may be generated based on a result of the plurality of machine learning trials. A runtime estimate for the process to generate the machine learning model may be determined. The runtime estimate may enable the allocation of a sufficient time budget for the process. Moreover, the process may be executed if the runtime of the process does not exceed the available time budget.
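
The budget check in this abstract can be illustrated with a small sketch. The linear cost model (seconds per thousand rows for each trial), the `Trial` class, and both function names are assumptions made for illustration; the abstract does not say how the runtime estimate is computed.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    model_type: str
    seconds_per_1k_rows: float  # assumed cost figure, e.g. from past runs


def estimate_runtime(trials, n_rows):
    """Sum the assumed per-trial cost, scaled linearly by dataset size."""
    return sum(t.seconds_per_1k_rows * n_rows / 1000 for t in trials)


def run_if_within_budget(trials, n_rows, budget_seconds, run):
    """Call `run(trials)` only when the estimate fits the time budget."""
    if estimate_runtime(trials, n_rows) > budget_seconds:
        return None  # estimate exceeds the budget: do not start the process
    return run(trials)


trials = [Trial("linear", 0.5), Trial("tree", 2.0)]
print(estimate_runtime(trials, 10_000))
```

The same estimate can be used in the other direction described in the abstract: reported before execution so that a sufficient time budget can be allocated.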

    Machine learning data processing pipeline

    Publication number: US11443234B2

    Publication date: 2022-09-13

    Application number: US16582946

    Filing date: 2019-09-25

    Applicant: SAP SE

    Abstract: A user interface may be generated to receive inputs for constructing a data processing pipeline that includes an orchestrator node, a preparator node, and an executor node. The preparator node may generate a training dataset and a validation dataset for a machine learning model. The executor node may execute machine learning trials by applying, to the training dataset and the validation dataset, machine learning models having different sets of trial parameters. The orchestrator node may identify, based on a result of the machine learning trials, an optimal machine learning model for performing a task. The data processing pipeline may be adapted dynamically based on the input dataset and/or computational resource budget. The optimal machine learning model for performing the task may be generated by executing, based on the graph, the data processing pipeline including the orchestrator node, the preparator node, and the executor node.
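
The division of work among the three nodes can be sketched with plain functions. Everything below is a toy under stated assumptions: the function names mirror the node names in the abstract, the "model" is a line with a fixed slope (the trial parameter) and an intercept fitted on the training data, and the score is the mean squared error on the validation data. None of these choices comes from the patent.

```python
def preparator(dataset, validation_fraction=0.25):
    """Split (x, y) pairs into a training and a validation dataset."""
    split = int(len(dataset) * (1 - validation_fraction))
    return dataset[:split], dataset[split:]


def executor(train, validation, trial_params):
    """Run one trial per parameter set and return (params, error) pairs."""
    results = []
    for params in trial_params:
        slope = params["slope"]
        # Toy training step: fit only the intercept for the given slope.
        intercept = sum(y - slope * x for x, y in train) / len(train)
        error = sum((y - (slope * x + intercept)) ** 2
                    for x, y in validation) / len(validation)
        results.append((params, error))
    return results


def orchestrator(dataset, trial_params):
    """Drive the pipeline and pick the trial with the lowest validation error."""
    train, validation = preparator(dataset)
    results = executor(train, validation, trial_params)
    best_params, _ = min(results, key=lambda r: r[1])
    return best_params


data = [(x, 2 * x + 1) for x in range(8)]
print(orchestrator(data, [{"slope": 1}, {"slope": 2}, {"slope": 3}]))
```

Keeping the three roles separate means the set of trials can change, for example to fit a resource budget, without touching data preparation or model selection.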

    Machine learning data processing pipeline

    Publication number: US20210089961A1

    Publication date: 2021-03-25

    Application number: US16582946

    Filing date: 2019-09-25

    Applicant: SAP SE

    Abstract: A user interface may be generated to receive inputs for constructing a data processing pipeline that includes an orchestrator node, a preparator node, and an executor node. The preparator node may generate a training dataset and a validation dataset for a machine learning model. The executor node may execute machine learning trials by applying, to the training dataset and the validation dataset, machine learning models having different sets of trial parameters. The orchestrator node may identify, based on a result of the machine learning trials, an optimal machine learning model for performing a task. The data processing pipeline may be adapted dynamically based on the input dataset and/or computational resource budget. The optimal machine learning model for performing the task may be generated by executing, based on the graph, the data processing pipeline including the orchestrator node, the preparator node, and the executor node.
