DATA ANALYZER
    Invention application

    Publication No.: US20210350283A1

    Publication date: 2021-11-11

    Application No.: US17273762

    Filing date: 2018-09-13

    Abstract: Given labeled teacher data is divided into model construction data and model verification data, a machine learning model is constructed from the model construction data, and the model is applied to the model verification data to identify (label) each sample; this series of processes is repeated multiple times (S2 to S5). Although the constructed model changes whenever the model construction data changes, a correctly labeled sample is still identified accurately with high probability. For a mislabeled sample, by contrast, the original label and the identification result are likely to disagree, and each such disagreement is counted as a misidentification. The number of misidentifications is counted for each sample to obtain a misidentification rate, and because this rate is relatively high for mislabeled samples, they are identified on the basis of it (S6 to S7). Detecting, with high accuracy, the samples in the teacher data that are highly likely to be mislabeled in this way allows the identification performance of the machine learning model to be improved.
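
    The procedure in the abstract can be illustrated with a minimal sketch. The abstract does not name a classifier, a split ratio, a number of repetitions, or a threshold, so everything concrete here is an assumption made for illustration: a nearest-centroid classifier, a random 70/30 split, 200 repetitions, a 0.5 threshold, and toy two-cluster data with one deliberately mislabeled sample. The function names are hypothetical and are not taken from the patent.

```python
import random


def fit_centroids(X, y):
    """Build a nearest-centroid model: {label: mean feature vector}."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the sample."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))


def misidentification_rates(X, y, n_rounds=200, verify_fraction=0.3, seed=0):
    """Repeat split / construct / verify and return one rate per sample."""
    rng = random.Random(seed)
    n = len(X)
    n_verify = max(1, int(n * verify_fraction))
    verified = [0] * n  # times each sample was in the verification data
    missed = [0] * n    # times its prediction disagreed with its label
    for _ in range(n_rounds):
        # Divide the teacher data into construction and verification data.
        indices = list(range(n))
        rng.shuffle(indices)
        verify_idx, build_idx = indices[:n_verify], indices[n_verify:]
        # Construct a model from the construction data only.
        model = fit_centroids([X[i] for i in build_idx],
                              [y[i] for i in build_idx])
        # Identify each verification sample and compare with its label.
        for i in verify_idx:
            verified[i] += 1
            if predict(model, X[i]) != y[i]:
                missed[i] += 1
    # Misidentification rate per sample (0.0 if it was never verified).
    return [m / v if v else 0.0 for m, v in zip(missed, verified)]


# Toy teacher data (assumed): two well-separated clusters of 10 points each.
X = [(0.1 * k, 0.1 * (9 - k)) for k in range(10)] + \
    [(10 + 0.1 * k, 10 + 0.1 * (9 - k)) for k in range(10)]
y = [0] * 10 + [1] * 10
y[3] = 1  # deliberately mislabel one sample from the first cluster

rates = misidentification_rates(X, y)
THRESHOLD = 0.5  # assumed cut-off; the abstract does not specify one
flagged = [i for i, rate in enumerate(rates) if rate > THRESHOLD]
print(flagged)  # indices of samples suspected of being mislabeled
```

    On this toy data only the deliberately mislabeled sample (index 3) ends up with a high rate, because every model built without it assigns it to the cluster it actually belongs to. Real data would be noisier, and the threshold would need to be chosen with that in mind.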
