SYNTHETIC DATASET GENERATOR
    Invention Publication

    Publication No.: US20240127075A1

    Publication Date: 2024-04-18

    Application No.: US18212629

    Filing Date: 2023-06-21

    CPC classification number: G06N3/0985

    Abstract: Machine learning is a process that learns a model from a given dataset, where the model can then be used to make a prediction about new data. In order to reduce the costs associated with collecting and labeling real-world datasets for use in training the model, computer processes can synthetically generate datasets which simulate real-world data. The present disclosure improves the effectiveness of such synthetic datasets for training machine learning models used in real-world applications, in particular by generating a synthetic dataset that is specifically targeted to a specified downstream task (e.g., a particular computer vision task, a particular natural language processing task, etc.).
