Supporting Database Constraints in Synthetic Data Generation Based on Generative Adversarial Networks

    Publication No.: US20220374682A1

    Publication Date: 2022-11-24

    Application No.: US17321709

    Filing Date: 2021-05-17

    Applicant: SAP SE

    Abstract: Disclosed herein are system, method, and computer program product embodiments for generating synthetic data records with database constraints using generative adversarial networks (GAN). The method can include training, by using a generator loss function, a generator neural network of a generator model of the GAN to generate a scaling factor and a cluster vector for a datum of a continuous variable of a continuous column of a data table, and a datum for a categorical variable of a categorical column of the data table. The generator loss function includes a penalty component determined based on a set of data constraints related to the continuous column or the categorical column.
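
The abstract's central idea, a generator loss that adds a constraint-based penalty to the usual adversarial term, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the function names, the non-saturating adversarial term, the `weight` parameter, the example columns `age` and `status`, and the way a continuous datum is decoded from a scaling factor and a cluster vector (a Gaussian-mixture style decoding is assumed).

```python
import math

def decode_continuous(scaling, cluster_vector, means, stds):
    """Assumed decoding: pick the most likely cluster, then rescale.

    The abstract only says the generator emits a scaling factor and a
    cluster vector; this mixture-style formula is an illustrative guess.
    """
    k = max(range(len(cluster_vector)), key=lambda i: cluster_vector[i])
    return scaling * stds[k] + means[k]

def adversarial_loss(disc_scores):
    """Non-saturating generator term: -mean(log D(G(z))) (an assumed choice)."""
    eps = 1e-12  # avoids log(0)
    return -sum(math.log(s + eps) for s in disc_scores) / len(disc_scores)

def constraint_penalty(rows, constraints):
    """Mean total violation per generated row.

    Each constraint is a function row -> non-negative violation amount,
    returning 0.0 when the row satisfies the constraint.
    """
    total = sum(c(row) for row in rows for c in constraints)
    return total / len(rows)

def generator_loss(disc_scores, rows, constraints, weight=1.0):
    """Adversarial term plus a weighted constraint penalty (hypothetical form)."""
    return adversarial_loss(disc_scores) + weight * constraint_penalty(rows, constraints)

# Hypothetical constraints on one continuous and one categorical column.
def age_non_negative(row):
    return max(0.0, -row["age"])

def status_allowed(row):
    return 0.0 if row["status"] in {"active", "inactive"} else 1.0

rows = [
    {"age": 30.0, "status": "active"},    # satisfies both constraints
    {"age": -5.0, "status": "unknown"},   # violates both constraints
]
loss = generator_loss([1.0, 1.0], rows, [age_non_negative, status_allowed])
```

In this sketch a generated row that breaks a constraint raises the loss in proportion to the size of the violation, so a gradient-based training loop built on a differentiable version of the same idea would push the generator toward rows that satisfy the constraints.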
