METADATA CLASSIFICATION
    1.
    Invention publication

    Publication number: US20230222142A1

    Publication date: 2023-07-13

    Application number: US18124415

    Filing date: 2023-03-21

    Applicant: SNOWFLAKE INC.

    CPC classification number: G06F16/285 G06N5/01 G06F16/221

    Abstract: Systems and methods are disclosed that retrieve data from a data set organized in a plurality of columns. For each column in the plurality of columns, the systems and methods generate one or more candidate semantic categories for the column, where each of the one or more candidate semantic categories has a corresponding probability. The systems and methods create a feature vector for the column from the one or more candidate semantic categories and the corresponding probabilities. The systems and methods determine a semantic category type of the column based on the feature vector. The systems and methods anonymize the data in the column based on the semantic category type, which includes replacing more specific data in the column with less specific data based on a data hierarchy that relates the more specific data to the less specific data.
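
The last step in this abstract, replacing more specific data with less specific data through a data hierarchy, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `PARENT` mapping, the `"city"` category label, and the `levels` parameter do not come from the publication, which does not state how the hierarchy is stored or traversed.

```python
# Hypothetical hierarchy: each value maps to its less specific parent.
PARENT = {
    "Lyon": "France",
    "Paris": "France",
    "Berlin": "Germany",
    "France": "Europe",
    "Germany": "Europe",
}


def generalize(value, levels=1):
    """Walk up the hierarchy `levels` steps, stopping early at the root."""
    for _ in range(levels):
        if value not in PARENT:
            break
        value = PARENT[value]
    return value


def anonymize_column(values, semantic_category, levels=1):
    """Generalize only columns whose category has a hierarchy (assumed: "city")."""
    if semantic_category != "city":
        return list(values)
    return [generalize(v, levels) for v in values]


column = ["Lyon", "Paris", "Berlin"]
one_level = anonymize_column(column, "city")
two_levels = anonymize_column(column, "city", levels=2)
```

In this sketch a larger `levels` value trades utility for privacy: one step keeps the country, two steps keep only the continent.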

    COLUMN DATA ANONYMIZATION BASED ON PRIVACY CATEGORY CLASSIFICATION
    2.

    Publication number: US20240078253A1

    Publication date: 2024-03-07

    Application number: US18498599

    Filing date: 2023-10-31

    Applicant: SNOWFLAKE INC.

    CPC classification number: G06F16/285 G06F16/221 G06N5/01

    Abstract: An approach is disclosed herein that retrieves data from a data set that includes first column data comprising a first data type and a second data type. The approach structures the first column data into second column data and third column data based on the first data type and the second data type. The approach determines a first semantic category and a second semantic category for the first data type and the second data type, and then determines a first privacy category and a second privacy category based on the first semantic category and the second semantic category. The approach anonymizes the second column data and the third column data to produce anonymized data based on the first privacy category and the second privacy category, respectively. In turn, the approach generates an anonymized view of the data set using the anonymized data.
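
The flow in this abstract (split a mixed column by data type, assign a semantic category, map it to a privacy category, anonymize, then build a view) can be sketched as follows. This is only an illustration under stated assumptions: the category names, the type-to-category lookup tables, and the masking and banding rules are all hypothetical, since the abstract does not specify them.

```python
# All category names and rules below are illustrative assumptions.
SEMANTIC_BY_TYPE = {str: "email", int: "age"}
PRIVACY_BY_SEMANTIC = {"email": "identifier", "age": "quasi_identifier"}


def split_by_type(cells):
    """Structure one mixed column into one column per Python type."""
    columns = {}
    for cell in cells:
        for value in cell:
            columns.setdefault(type(value), []).append(value)
    return columns


def anonymize(values, privacy_category):
    """Mask identifiers; generalize quasi-identifiers into ten-year bands."""
    if privacy_category == "identifier":
        return ["***" for _ in values]
    if privacy_category == "quasi_identifier":
        return [f"{v // 10 * 10}-{v // 10 * 10 + 9}" for v in values]
    return list(values)


def anonymized_view(cells):
    """Return a mapping from semantic category to anonymized column data."""
    view = {}
    for data_type, values in split_by_type(cells).items():
        semantic = SEMANTIC_BY_TYPE[data_type]
        privacy = PRIVACY_BY_SEMANTIC[semantic]
        view[semantic] = anonymize(values, privacy)
    return view


# Each cell of the first column holds a string and an integer.
mixed_column = [("ann@example.com", 34), ("bob@example.com", 58)]
view = anonymized_view(mixed_column)
```

Keeping the semantic-to-privacy mapping in its own table means the anonymization rule depends only on the privacy category, which mirrors the two-stage classification the abstract describes.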

    METADATA CLASSIFICATION
    3.
    Invention application

    Publication number: US20220245175A1

    Publication date: 2022-08-04

    Application number: US17163156

    Filing date: 2021-01-29

    Applicant: Snowflake Inc.

    Abstract: Generating semantic names for a data set is described. An example method can include retrieving data from a data set, the data organized in a plurality of columns. The method may also include generating, for each of the columns, one or more candidate semantic categories, wherein each of the one or more candidate semantic categories has a corresponding probability. The method may further include creating a feature vector for each column from the one or more candidate semantic categories and the corresponding probabilities. Additionally, the method may include selecting, for each column, a column semantic category from the one or more candidate semantic categories using at least the feature vector and a trained machine learning model.
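
As a rough illustration of the selection step, the sketch below builds a fixed-length feature vector from candidate categories and their probabilities, then picks a category with a small trained model. The abstract does not name the model type; a nearest-centroid classifier is swapped in here purely as a stand-in, and the category vocabulary and training examples are hypothetical.

```python
CATEGORIES = ["email", "phone", "name"]  # hypothetical vocabulary


def feature_vector(candidates):
    """Place each candidate's probability at its category's fixed position."""
    return [candidates.get(c, 0.0) for c in CATEGORIES]


def train_centroids(examples):
    """Average the feature vectors seen for each label (stand-in for training)."""
    sums, counts = {}, {}
    for vector, label in examples:
        acc = sums.setdefault(label, [0.0] * len(vector))
        for i, x in enumerate(vector):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in acc] for label, acc in sums.items()}


def select_category(candidates, centroids):
    """Choose, among the column's candidates, the label with the nearest centroid."""
    vector = feature_vector(candidates)

    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(vector, centroids[label]))

    return min((c for c in candidates if c in centroids), key=distance)


# Hypothetical labeled columns used to fit the stand-in model.
examples = [
    (feature_vector({"email": 0.9, "name": 0.1}), "email"),
    (feature_vector({"phone": 0.8, "name": 0.2}), "phone"),
    (feature_vector({"name": 0.7, "email": 0.3}), "name"),
]
centroids = train_centroids(examples)
chosen = select_category({"email": 0.45, "name": 0.55}, centroids)
```

Restricting the choice to the column's own candidates follows the abstract's wording that the category is selected "from the one or more candidate semantic categories".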
