IDENTIFICATION OF FEATURES FOR PREDICTION OF MISSING ATTRIBUTE VALUES

    Publication No.: US20220358432A1

    Publication date: 2022-11-10

    Application No.: US17316058

    Filing date: 2021-05-10

    Applicant: SAP SE

    Abstract: Technologies are described for identifying features that can be used to predict missing attribute values. For example, a set of structured data can be received comprising a plurality of features and one or more labels. The set of structured data can be pre-processed, which comprises applying one or more cleaning policies to produce a set of pre-processed features. The set of pre-processed features can be filtered using correlation-based filtering that uses one or more correlation estimation techniques to remove at least some highly correlated features. The correlation-based filtering can produce a set of filtered features. Feature subset selection can be performed by applying machine learning algorithms to the set of filtered features to determine relative importance among the set of filtered features. Based on the relative importance, a subset of the set of filtered features can be determined. The subset of the set of filtered features can be output.

    Identification of features for prediction of missing attribute values

    Publication No.: US11983652B2

    Publication date: 2024-05-14

    Application No.: US17316058

    Filing date: 2021-05-10

    Applicant: SAP SE

    CPC classification number: G06Q10/06313 G06N5/022

    Abstract: Technologies are described for identifying features that can be used to predict missing attribute values. For example, a set of structured data can be received comprising a plurality of features and one or more labels. The set of structured data can be pre-processed, which comprises applying one or more cleaning policies to produce a set of pre-processed features. The set of pre-processed features can be filtered using correlation-based filtering that uses one or more correlation estimation techniques to remove at least some highly correlated features. The correlation-based filtering can produce a set of filtered features. Feature subset selection can be performed by applying machine learning algorithms to the set of filtered features to determine relative importance among the set of filtered features. Based on the relative importance, a subset of the set of filtered features can be determined. The subset of the set of filtered features can be output.
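
    The three stages named in the abstract (cleaning, correlation-based filtering, feature subset selection) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the cleaning policy (drop columns with missing or constant values), the use of Pearson correlation with a 0.95 threshold, the function names, and the sample data. In particular, absolute correlation with the label is used as a simple stand-in for the relative importance that the abstract says is determined by machine learning algorithms.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def clean(columns):
    """Assumed cleaning policy: drop columns with missing or constant values."""
    return {name: vals for name, vals in columns.items()
            if None not in vals and len(set(vals)) > 1}

def correlation_filter(columns, threshold=0.95):
    """Keep a column only if it is not highly correlated with one already kept."""
    kept = {}
    for name, vals in columns.items():
        if all(abs(pearson(vals, other)) < threshold for other in kept.values()):
            kept[name] = vals
    return kept

def select_subset(columns, label, k=2):
    """Rank columns by a stand-in importance score and return the top k names."""
    importance = {name: abs(pearson(vals, label)) for name, vals in columns.items()}
    return sorted(importance, key=importance.get, reverse=True)[:k]

# Hypothetical structured data: feature columns plus one label column.
features = {
    "weight":    [1.0, 2.0, 3.0, 4.0, 5.0],
    "weight_lb": [2.2, 4.4, 6.6, 8.8, 11.0],   # redundant copy of "weight"
    "height":    [5.0, 3.0, 4.0, 1.0, 2.0],
    "noise":     [1.0, 0.0, 1.0, 0.0, 1.0],
    "constant":  [7.0, 7.0, 7.0, 7.0, 7.0],
    "sparse":    [1.0, None, 3.0, None, 5.0],
}
label = [1.1, 1.9, 3.2, 3.9, 5.1]

pre_processed = clean(features)
filtered = correlation_filter(pre_processed)
subset = select_subset(filtered, label, k=2)
print(subset)
```

    In this sketch the filter is order-dependent: the first column of a highly correlated pair is the one that survives. A fuller implementation would likely replace `select_subset` with importances taken from one or more trained models.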

    Maintaining master data using hierarchical classification

    Publication No.: US11836612B2

    Publication date: 2023-12-05

    Application No.: US16444222

    Filing date: 2019-06-18

    Applicant: SAP SE

    CPC classification number: G06N3/08 G06F16/258 G06F16/285

    Abstract: Disclosed herein are system, method, and computer program product embodiments for classifying data objects using machine learning. In an embodiment, an artificial neural network may be trained to identify explained variable values corresponding to data object attributes. For example, the explained variables may be a category and a subcategory with the subcategory having a hierarchical relationship to the category. The artificial neural network may then receive a data record having one or more attribute values. The neural network may then identify a first and second explained variable value corresponding to the one or more attribute values based on the trained neural network model. The first and second explained variable values may then be associated with the data record. For example, if the data record is stored in a database, the record may be updated to include the first and second explained variable values.
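
    The inference and record-update step described in this abstract can be sketched as follows. This is a hedged illustration only: `stub_model` is a hypothetical keyword scorer standing in for the trained artificial neural network, and `HIERARCHY`, `classify`, and the sample record are invented for the example. Restricting the subcategory choice to children of the predicted category is one assumed way to respect the hierarchical relationship; the abstract does not state how that relationship is enforced.

```python
# Hypothetical two-level hierarchy: category -> subcategories.
HIERARCHY = {
    "Beverages": ["Coffee", "Tea"],
    "Snacks": ["Chips", "Cookies"],
}

# Hypothetical keyword -> (category, subcategory) table used by the stub model.
KEYWORDS = {
    "coffee": ("Beverages", "Coffee"),
    "tea": ("Beverages", "Tea"),
    "chips": ("Snacks", "Chips"),
    "cookie": ("Snacks", "Cookies"),
}

def stub_model(record):
    """Placeholder for a trained network: returns (category scores, subcategory scores)."""
    text = record["description"].lower()
    cat_scores = {c: 0.0 for c in HIERARCHY}
    sub_scores = {s: 0.0 for subs in HIERARCHY.values() for s in subs}
    for word, (cat, sub) in KEYWORDS.items():
        if word in text:
            cat_scores[cat] += 1.0
            sub_scores[sub] += 1.0
    return cat_scores, sub_scores

def classify(record, model=stub_model):
    """Return a copy of the record with the two explained variable values attached."""
    cat_scores, sub_scores = model(record)
    category = max(cat_scores, key=cat_scores.get)
    # Only subcategories that belong to the chosen category are considered.
    subcategory = max(HIERARCHY[category], key=lambda s: sub_scores[s])
    return {**record, "category": category, "subcategory": subcategory}

record = {"id": 42, "description": "Dark roast coffee beans, 1 kg"}
updated = classify(record)
print(updated)
```

    The original record is left unchanged and an updated copy is returned, which mirrors the abstract's example of updating a stored record to include both values.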
