LEARNING DATA PROCESSING DEVICE, LEARNING DATA PROCESSING METHOD AND NON-TRANSITORY COMPUTER-READABLE MEDIUM

    Publication No.: US20220036235A1

    Publication date: 2022-02-03

    Application No.: US17206731

    Filing date: 2021-03-19

    Inventor: Yoshiyuki Jinguu

    IPC classification: G06N20/00

    Abstract: A learning data processing device includes a data processing unit configured to generate learning data used by a learning device that generates a learning model on the basis of time-series data including at least one kind of measured value. The data processing unit executes at least one of a first removal process or a second removal process. In the first removal process, a statistical value of the measured values included in one or multiple predetermined periods of the time-series data is calculated, together with at least one of an outlier determination upper limit value or an outlier determination lower limit value based on the statistical value; of the measured values included in the one or multiple predetermined periods, those that are greater than or equal to the outlier determination upper limit value, less than or equal to the outlier determination lower limit value, or both, are removed from the time-series data. In the second removal process, of the measured values included in the time-series data, those satisfying a predetermined condition are removed from the time-series data.
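
The two removal processes described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which statistical value is used or how the limits are derived from it, so the choices here are assumptions made only for illustration: the mean and population standard deviation as the statistics, limits of mean ± k·σ, half-open periods `(start, end)`, and the hypothetical function names `first_removal` and `second_removal`.

```python
from statistics import mean, pstdev


def in_any_period(t, periods):
    """Return True if timestamp t falls in any half-open period [start, end)."""
    return any(start <= t < end for start, end in periods)


def first_removal(series, periods, k=2.0):
    """Remove outliers found inside the given periods.

    series  : list of (timestamp, measured_value) pairs
    periods : list of (start, end) pairs (assumed half-open)
    k       : assumed multiplier; limits are mean +/- k * standard deviation
    """
    selected = [v for t, v in series if in_any_period(t, periods)]
    if len(selected) < 2:
        return list(series)  # too little data to compute a spread
    mu = mean(selected)
    sigma = pstdev(selected)
    if sigma == 0:
        return list(series)  # all values identical, nothing to remove
    upper = mu + k * sigma   # outlier determination upper limit (assumed form)
    lower = mu - k * sigma   # outlier determination lower limit (assumed form)
    # Values outside the periods are kept untouched.
    return [
        (t, v)
        for t, v in series
        if not (in_any_period(t, periods) and (v >= upper or v <= lower))
    ]


def second_removal(series, condition):
    """Remove every measured value that satisfies a caller-supplied condition."""
    return [(t, v) for t, v in series if not condition(t, v)]


if __name__ == "__main__":
    series = [(t, 10.0) for t in range(9)] + [(9, 100.0), (10, 500.0)]
    cleaned = first_removal(series, periods=[(0, 10)])
    cleaned = second_removal(cleaned, lambda t, v: v > 400)
    print(cleaned)
```

In this example the value 100.0 at timestamp 9 lies inside the period and above the upper limit, so the first process removes it; the value 500.0 lies outside the period and is removed only by the second process, whose condition (`v > 400`) is likewise an arbitrary stand-in for the "predetermined condition".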

NON-TRANSITORY COMPUTER READABLE MEDIUM, INFORMATION PROCESSING APPARATUS, AND METHOD OF GENERATING A LEARNING MODEL

    Publication No.: US20220309406A1

    Publication date: 2022-09-29

    Application No.: US17654333

    Filing date: 2022-03-10

    Inventor: Yoshiyuki Jinguu

    IPC classification: G06N20/20 G06K9/62

    Abstract: A program causes an information processing apparatus to execute operations including: determining whether, in a training data set including a plurality of pieces of training data, the count of a first label and the count of a second label are imbalanced; generating, by dividing the training data set, a plurality of subsets, each including first training data characterized by the first label and at least a portion of second training data characterized by the second label, the first training data having a count balanced with the count of the second label; generating a plurality of first learning models based on each of the generated subsets; and saving the plurality of first learning models when it is determined that the value of a first evaluation index for the generated plurality of first learning models is higher than the value of a second evaluation index.
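
The imbalance check and subset generation described in the abstract can be sketched roughly as follows. This is only one possible reading, under stated assumptions: the first label is taken to be the majority class, "imbalanced" is taken to mean a count ratio at or above a chosen threshold, and each subset pairs one chunk of first-label data with all of the second-label data. The names `is_imbalanced`, `balanced_subsets`, and `should_save` are hypothetical, and model training and the evaluation indices are left to the caller because the abstract does not specify them.

```python
from collections import Counter


def is_imbalanced(labels, ratio=2.0):
    """Return True if the largest label count is at least `ratio` times the smallest.

    The threshold `ratio` is an assumption; the abstract gives no criterion.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        return False
    return max(counts.values()) / min(counts.values()) >= ratio


def balanced_subsets(samples, first_label, second_label):
    """Divide (features, label) samples into roughly balanced subsets.

    First-label samples are cut into chunks whose size equals the number of
    second-label samples; each subset is one chunk plus all second-label
    samples. The last chunk may be smaller than the others.
    """
    first = [s for s in samples if s[1] == first_label]
    second = [s for s in samples if s[1] == second_label]
    if not second:
        raise ValueError("no samples carry the second label")
    size = len(second)
    return [first[i:i + size] + second for i in range(0, len(first), size)]


def should_save(first_index, second_index):
    """Save the models only when the first evaluation index is strictly higher."""
    return first_index > second_index


if __name__ == "__main__":
    samples = [((i,), 0) for i in range(9)] + [((i,), 1) for i in range(3)]
    labels = [label for _, label in samples]
    if is_imbalanced(labels):
        subsets = balanced_subsets(samples, first_label=0, second_label=1)
        print(len(subsets), [len(s) for s in subsets])
```

With nine samples of label 0 and three of label 1, the sketch produces three subsets of six samples each, and one model would then be trained per subset. What the second evaluation index measures is not stated in the abstract, so `should_save` simply compares two numbers supplied by the caller.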