DATA LABELING FOR TRAINING ARTIFICIAL INTELLIGENCE SYSTEMS

    Publication (Announcement) No.: WO2022221488A2

    Publication (Announcement) Date: 2022-10-20

    Application No.: PCT/US2022/024750

    Filing Date: 2022-04-14

    Abstract: Systems, apparatuses, and methods are described for data labeling for training artificial intelligence systems. A candidate dataset comprising data samples and corresponding labels may be used to update an incumbent dataset that likewise comprises data samples and corresponding labels. The integrity of a data sample-label pair in the candidate dataset may be determined before that pair is added to the incumbent dataset. To determine labeling integrity, a plurality of machine classifiers may be trained based on the incumbent dataset and portions of the candidate dataset. The trained machine classifiers may then be used to generate predicted labels for data samples in the candidate dataset. The integrity of a data sample-label pair in the candidate dataset may be measured based on the labels predicted for that data sample.
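
    The procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the implementation claimed in the application; several details are assumptions made only for illustration: the nearest-centroid classifier, the round-robin split of the candidate dataset into portions, the choice to have each classifier predict only for candidate samples it was not trained on, and the use of the agreement fraction as the integrity measure. All function names and the toy data are hypothetical.

```python
from collections import Counter


def train_centroid_classifier(pairs):
    """Assumed stand-in classifier: one mean feature vector per label."""
    sums, counts = {}, Counter()
    for x, y in pairs:
        prev = sums.get(y, [0.0] * len(x))
        sums[y] = [s + v for s, v in zip(prev, x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in vec] for y, vec in sums.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], x)),
    )


def integrity_scores(incumbent, candidate, n_classifiers=3):
    """Fraction of classifiers whose predicted label matches the given label.

    Classifier j is trained on the incumbent dataset plus portion j of the
    candidate dataset (assumption: portions are assigned round-robin), and
    predicts labels only for candidate samples outside its own portion.
    """
    votes = [[] for _ in candidate]
    for j in range(n_classifiers):
        portion = [p for i, p in enumerate(candidate) if i % n_classifiers == j]
        model = train_centroid_classifier(incumbent + portion)
        for i, (x, _) in enumerate(candidate):
            if i % n_classifiers != j:
                votes[i].append(predict(model, x))
    return [
        sum(p == y for p in v) / len(v) if v else None
        for (_, y), v in zip(candidate, votes)
    ]


# Toy data: label "a" clusters near (0, 0), label "b" near (10, 10).
incumbent = [
    ((0, 0), "a"), ((1, 0), "a"), ((0, 1), "a"), ((1, 1), "a"),
    ((10, 10), "b"), ((11, 10), "b"), ((10, 11), "b"), ((11, 11), "b"),
]
candidate = [
    ((0.5, 0.5), "a"), ((10.5, 10.5), "b"),
    ((0.2, 0.8), "b"),  # deliberately mislabeled pair
    ((10.2, 10.8), "b"), ((0.8, 0.2), "a"), ((10.8, 10.2), "b"),
]

scores = integrity_scores(incumbent, candidate)
# Only pairs scoring at or above an assumed threshold join the incumbent dataset.
accepted = [pair for pair, s in zip(candidate, scores) if s is not None and s >= 0.5]
print(scores)  # [1.0, 1.0, 0.0, 1.0, 1.0, 1.0]
```

    In this toy run the mislabeled pair receives a score of 0.0 because no classifier that held it out predicts its given label, so it would be withheld from the incumbent dataset under the assumed 0.5 threshold.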
