MIXED INTELLIGENCE DATA LABELING SYSTEM FOR MACHINE LEARNING

    Publication No.: US20200327374A1

    Publication Date: 2020-10-15

    Application No.: US16381843

    Filing Date: 2019-04-11

    IPC Classification: G06K9/62

    Abstract: A method of hybrid data labeling for machine learning, including receiving multiple unlabeled objects forming an unlabeled data set; pre-labeling the unlabeled data set by a machine learning system to output a pending label data pool; bifurcating the pending label data pool by the machine learning system into a high confidence set and a low confidence set; dispatching the high confidence set to a machine labeler; dispatching the low confidence set to a human labeler; merging the resulting label sets to return a pre-review label data pool; determining a difference between the pending label data pool and the pre-review label data pool; review labeling the data objects if the determined difference is greater than a predefined error threshold; and storing the data objects to a reviewed pool if the determined difference is less than or equal to the predefined error threshold.
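
    The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the claimed implementation: the names `Pending`, `bifurcate`, `label_difference`, and `hybrid_label`, the confidence threshold of 0.8, and the error threshold of 0.25 are all hypothetical. The abstract does not say how the difference is measured, so the sketch assumes it is the fraction of objects whose merged label differs from the pre-label.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Pending:
    """One pre-labeled object in the pending label data pool (hypothetical type)."""
    obj_id: str
    label: str
    confidence: float


def bifurcate(pool: List[Pending], confidence_threshold: float
              ) -> Tuple[List[Pending], List[Pending]]:
    """Split the pending pool into high and low confidence sets."""
    high = [p for p in pool if p.confidence >= confidence_threshold]
    low = [p for p in pool if p.confidence < confidence_threshold]
    return high, low


def label_difference(pool: List[Pending], pre_review: Dict[str, str]) -> float:
    """Assumed metric: fraction of objects whose merged label differs from the pre-label."""
    if not pool:
        return 0.0
    changed = sum(1 for p in pool if pre_review[p.obj_id] != p.label)
    return changed / len(pool)


def hybrid_label(pool: List[Pending],
                 machine_labeler: Callable[[Pending], str],
                 human_labeler: Callable[[Pending], str],
                 confidence_threshold: float = 0.8,   # assumed value
                 error_threshold: float = 0.25        # assumed value
                 ) -> Tuple[Dict[str, str], float, str]:
    """Dispatch, merge, compare, and decide where the data objects go next."""
    high, low = bifurcate(pool, confidence_threshold)
    # High confidence objects go to the machine labeler, low confidence to a human.
    pre_review = {p.obj_id: machine_labeler(p) for p in high}
    pre_review.update({p.obj_id: human_labeler(p) for p in low})
    diff = label_difference(pool, pre_review)
    # Greater than the threshold: review labeling; otherwise store as reviewed.
    route = "review" if diff > error_threshold else "reviewed_pool"
    return pre_review, diff, route


if __name__ == "__main__":
    demo_pool = [
        Pending("a", "cat", 0.95),
        Pending("b", "dog", 0.90),
        Pending("c", "cat", 0.40),
        Pending("d", "dog", 0.85),
    ]
    labels, diff, route = hybrid_label(
        demo_pool,
        machine_labeler=lambda p: p.label,   # stand-in: machine keeps the pre-label
        human_labeler=lambda p: "dog",       # stand-in: human corrects the label
    )
    print(labels, diff, route)
```

    In this sketch the comparison is made once per pool, so the whole batch is routed either to review or to the reviewed pool. A per-object comparison would be an equally plausible reading of the abstract.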