DISCOVERING A SEMANTIC MEANING OF DATA FIELDS FROM PROFILE DATA OF THE DATA FIELDS

    Publication No.: US20200380212A1

    Publication Date: 2020-12-03

    Application No.: US16794361

    Filing Date: 2020-02-19

    Abstract: A data processing system for discovering the semantic meaning of a field included in one or more data sets is configured to identify the field, which has an identifier. For that field, the system profiles the field's data values to generate a data profile, accesses a plurality of label proposal tests, and generates a set of label proposals by applying those tests to the data profile. The system determines a similarity among the label proposals and selects a classification. The system then identifies one of the label proposals as the one that identifies the semantic meaning, and stores the identifier of the field together with that label proposal.
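
    The steps in the abstract (profile, apply label proposal tests, measure similarity, classify, store) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the application: the function names, the three example tests, the 0.8 score threshold, the agreement-ratio definition of similarity, and the classification names "match", "recommend", "investigate" and "no_match".

```python
import re
from collections import Counter
from datetime import date


def profile_field(values):
    """Build a simple data profile: the non-empty values plus basic counts."""
    non_null = [v for v in values if v not in (None, "")]
    return {"total": len(values), "values": non_null, "distinct": len(set(non_null))}


def fraction(profile, predicate):
    """Share of profiled values that satisfy a predicate (0.0 if no values)."""
    vals = profile["values"]
    return sum(1 for v in vals if predicate(v)) / len(vals) if vals else 0.0


def parses_as_date(value):
    """True if the value can be parsed as an ISO calendar date."""
    try:
        date.fromisoformat(value)
        return True
    except (ValueError, TypeError):
        return False


# Hypothetical label proposal tests: (proposed label, scoring function on the profile).
LABEL_PROPOSAL_TESTS = [
    ("date", lambda p: fraction(p, lambda v: re.fullmatch(r"\d{4}-\d{2}-\d{2}", v) is not None)),
    ("date", lambda p: fraction(p, parses_as_date)),
    ("us_zip_code", lambda p: fraction(p, lambda v: re.fullmatch(r"\d{5}", v) is not None)),
]


def propose_labels(profile, threshold=0.8):
    """Apply every test to the profile; keep proposals scoring at or above the threshold."""
    proposals = []
    for label, score_fn in LABEL_PROPOSAL_TESTS:
        score = score_fn(profile)
        if score >= threshold:
            proposals.append((label, score))
    return proposals


def classify(proposals):
    """Assumed similarity measure: share of proposals agreeing on the most common label."""
    if not proposals:
        return "no_match", None
    label, votes = Counter(lbl for lbl, _ in proposals).most_common(1)[0]
    similarity = votes / len(proposals)
    if similarity == 1.0:
        return "match", label
    if similarity > 0.5:
        return "recommend", label
    return "investigate", label


def discover_label(field_id, values, store):
    """Run the whole pipeline and store the label under the field identifier."""
    proposals = propose_labels(profile_field(values))
    classification, label = classify(proposals)
    if classification in ("match", "recommend"):
        store[field_id] = label
    return classification, label
```

    In this sketch, a field is labeled only when the proposals agree fully or by majority; conflicting or absent proposals leave the store unchanged so that the field can be reviewed separately. That policy is a design choice of the example, not a statement of how the described system resolves each classification.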