DATA CURATION WITH SYNTHETIC DATA GENERATION

    Publication No.: US20220413905A1

    Publication date: 2022-12-29

    Application No.: US17358979

    Filing date: 2021-06-25

    Applicant: SAP SE

    Abstract: A method may include identifying an identifier field included in a first datatype of a seed data sample associated with a source system. The identifier field may store a first value that enables a differentiation between different instances of the first datatype. A relationship field, which stores a second value that defines a relationship between the first datatype and a second datatype, may be identified. A synthetic data sample may be generated by populating the identifier field of the synthetic data sample with a synthetically generated value and the relationship field of the synthetic data sample with the second value. The synthetic data sample may be sent to a target system to enable performance of a task at the target system. The synthetic data sample may supplement a volume and/or a diversity of the data that occurs organically at the source system.

    Data curation with synthetic data generation

    Publication No.: US12073246B2

    Publication date: 2024-08-27

    Application No.: US17358979

    Filing date: 2021-06-25

    Applicant: SAP SE

    CPC classification number: G06F9/4881 G06F16/2365

    Abstract: A method may include identifying an identifier field included in a first datatype of a seed data sample associated with a source system. The identifier field may store a first value that enables a differentiation between different instances of the first datatype. A relationship field, which stores a second value that defines a relationship between the first datatype and a second datatype, may be identified. A synthetic data sample may be generated by populating the identifier field of the synthetic data sample with a synthetically generated value and the relationship field of the synthetic data sample with the second value. The synthetic data sample may be sent to a target system to enable performance of a task at the target system. The synthetic data sample may supplement a volume and/or a diversity of the data that occurs organically at the source system.
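
The generation step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the seed sample is a flat dictionary and that the names of the identifier field and relationship field are already known and passed in by the caller; the abstract does not say how they are detected. The function name `generate_synthetic_sample`, the example field names, and the use of `uuid4` for the synthetic identifier are all hypothetical choices made for illustration.

```python
import copy
import uuid


def generate_synthetic_sample(seed, identifier_field, relationship_field):
    """Return a copy of `seed` with a fresh identifier and the seed's relationship value."""
    synthetic = copy.deepcopy(seed)
    # Synthetically generated identifier. Using uuid4 is an assumption;
    # the abstract only requires that the value be synthetically generated.
    synthetic[identifier_field] = uuid.uuid4().hex
    # Carry the relationship value over unchanged so the link from the first
    # datatype to the second datatype is preserved in the synthetic sample.
    synthetic[relationship_field] = seed[relationship_field]
    return synthetic


# Hypothetical seed sample: an order (first datatype) that refers to a
# customer (second datatype) through the "customer_id" field.
seed_order = {"order_id": "SO-1001", "customer_id": "C-42", "amount": 250.0}
synthetic_order = generate_synthetic_sample(seed_order, "order_id", "customer_id")
```

In this sketch the synthetic order receives a new `order_id` but keeps `customer_id` equal to `"C-42"`, so it still refers to the same customer record while adding volume to the data available at the target system.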
