Active learning loop-based data labeling service

    Publication No.: US11048979B1

    Publication date: 2021-06-29

    Application No.: US16370706

    Application date: 2019-03-29

    Abstract: Techniques for active learning-based data labeling are described. An active learning-based data labeling service enables a user to build and manage large, high-accuracy datasets for use in various machine learning systems. Machine learning may be used to automate annotation and management of the datasets, increasing the efficiency of labeling tasks and reducing the time required to perform labeling. Embodiments utilize active learning techniques to reduce the amount of a dataset that requires manual labeling. As subsets of the dataset are labeled, the label data is used to train a model, which can then identify additional objects in the dataset without manual intervention. The process may continue iteratively until the model converges. This enables a dataset to be labeled without requiring each item in the dataset to be individually and manually labeled by human labelers.
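The loop described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the 1-D threshold model, the uncertainty-sampling rule, the convergence test, and all names (`train_threshold`, `active_learning_loop`, `oracle`, `seed`, `tol`) are assumptions chosen only to keep the example small and runnable.

```python
from typing import Callable, Dict, List, Tuple


def train_threshold(labeled: Dict[int, int]) -> float:
    """Toy model (assumption): midpoint between the largest 0 and the smallest 1."""
    zeros = [x for x, y in labeled.items() if y == 0]
    ones = [x for x, y in labeled.items() if y == 1]
    return (max(zeros) + min(ones)) / 2.0


def active_learning_loop(
    items: List[int],
    oracle: Callable[[int], int],  # stands in for a human labeler
    seed: List[int],               # must contain at least one 0 and one 1
    batch_size: int = 1,
    max_rounds: int = 20,
    tol: float = 1e-9,
) -> Tuple[Dict[int, int], Dict[int, int], float]:
    # Manually label an initial subset and train a first model.
    labeled = {x: oracle(x) for x in seed}
    threshold = train_threshold(labeled)

    for _ in range(max_rounds):
        unlabeled = [x for x in items if x not in labeled]
        if not unlabeled:
            break
        # Uncertainty sampling: ask for labels nearest the decision boundary.
        unlabeled.sort(key=lambda x: abs(x - threshold))
        for x in unlabeled[:batch_size]:
            labeled[x] = oracle(x)
        # Retrain; stop once the model no longer changes (assumed criterion).
        new_threshold = train_threshold(labeled)
        converged = abs(new_threshold - threshold) < tol
        threshold = new_threshold
        if converged:
            break

    # Remaining items are labeled by the model, without manual intervention.
    auto = {x: int(x >= threshold) for x in items if x not in labeled}
    return labeled, auto, threshold


def example_oracle(x: int) -> int:
    """Hypothetical ground truth used only for this demonstration."""
    return int(x >= 8)


manual, auto, threshold = active_learning_loop(
    list(range(20)), example_oracle, seed=[0, 19]
)
print(f"manual: {len(manual)}, automatic: {len(auto)}, threshold: {threshold}")
```

In this toy setting only the items near the decision boundary are sent to the simulated labeler; the rest receive model-generated labels, which mirrors the reduction in manual labeling that the abstract describes.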

    Active learning-based data labeling service using an augmented manifest

    Publication No.: US11443232B1

    Publication date: 2022-09-13

    Application No.: US16370733

    Application date: 2019-03-29

    Abstract: Techniques for active learning-based data labeling are described. An active learning-based data labeling service enables a user to build and manage large, high-accuracy datasets for use in various machine learning systems. Machine learning may be used to automate annotation and management of the datasets, increasing the efficiency of labeling tasks and reducing the time required to perform labeling. Embodiments utilize active learning techniques to reduce the amount of a dataset that requires manual labeling. As subsets of the dataset are labeled, the label data is used to train a model, which can then identify additional objects in the dataset without manual intervention. The label data can be added to an augmented manifest, and the augmented manifest can be used to filter the dataset to perform further labeling jobs on the same or different subsets of the dataset.
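The augmented-manifest idea can be sketched as follows. The abstract does not specify a manifest format, so the JSON Lines layout, the `source` and `label` field names, and the helper functions `augment_manifest` and `filter_manifest` are all hypothetical, chosen only to show label data being attached to a manifest and the result being filtered for a follow-up labeling job.

```python
import json
from typing import Callable, Dict, List


def augment_manifest(
    manifest_lines: List[str],
    labels: Dict[str, str],
    attribute: str = "label",  # hypothetical field name
) -> List[str]:
    """Return a copy of the manifest with available label data attached."""
    augmented = []
    for line in manifest_lines:
        entry = json.loads(line)
        if entry["source"] in labels:
            entry[attribute] = labels[entry["source"]]
        augmented.append(json.dumps(entry, sort_keys=True))
    return augmented


def filter_manifest(
    manifest_lines: List[str],
    predicate: Callable[[dict], bool],
) -> List[str]:
    """Keep only the manifest entries that satisfy the predicate."""
    return [line for line in manifest_lines if predicate(json.loads(line))]


# One JSON object per dataset item (assumed layout).
manifest = [json.dumps({"source": name}) for name in ("a.jpg", "b.jpg", "c.jpg")]

# Label data produced by an earlier labeling job.
augmented = augment_manifest(manifest, {"a.jpg": "cat", "b.jpg": "dog"})

# Follow-up job on a different subset: items that still have no label.
still_unlabeled = filter_manifest(augmented, lambda e: "label" not in e)

# Follow-up job on the same subset: e.g. refine everything labeled "cat".
cats = filter_manifest(augmented, lambda e: e.get("label") == "cat")

print(still_unlabeled)
print(cats)
```

Keeping labels alongside the item references in one file means a later job can select its inputs with a simple predicate rather than a separate lookup, which is one plausible reading of how the manifest is used to filter the dataset.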
