Horizontally-scalable data de-identification

    Publication number: US12086287B2

    Publication date: 2024-09-10

    Application number: US17980371

    Application date: 2022-11-03

    Applicant: SNOWFLAKE INC.

    CPC classification number: G06F21/6254 G06F16/221 G06F16/282 G06F21/6227

    Abstract: A method receives data from a data source. The method generates a plurality of generalizations of the data. The method sends the plurality of generalizations of the data to a plurality of execution nodes, wherein each of the plurality of execution nodes includes computational resources to compute a candidate generalization using an information loss scoring function. The method receives a candidate generalization from each of the plurality of execution nodes. The method selects a preferred generalization from the plurality of candidate generalizations. The method generates an anonymized view of the data set using the preferred generalization.
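The fan-out and selection flow in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the patent: the toy rows, the two generalization hierarchies (`AGE_WIDTH`, `ZIP_MASKED`), the k-anonymity check inside `information_loss`, and the use of a thread pool as a stand-in for separate execution nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical toy data: (age, zip_code) quasi-identifiers.
ROWS = [(34, "94105"), (36, "94107"), (35, "94105"),
        (52, "10001"), (58, "10002"), (55, "10001")]
AGE_WIDTH = (1, 10, 100)   # age bucket width at generalization levels 0, 1, 2
ZIP_MASKED = (0, 1, 5)     # trailing zip digits masked at levels 0, 1, 2


def generalize(row, levels):
    """Apply one generalization, given as (age_level, zip_level), to a row."""
    age, zip_code = row
    width = AGE_WIDTH[levels[0]]
    low = age // width * width
    age_out = str(age) if width == 1 else f"{low}-{low + width - 1}"
    masked = ZIP_MASKED[levels[1]]
    zip_out = zip_code[: len(zip_code) - masked] + "*" * masked
    return (age_out, zip_out)


def information_loss(levels, k=2):
    """Assumed scoring: infinite if the view is not k-anonymous,
    otherwise the sum of normalized generalization levels."""
    view = [generalize(row, levels) for row in ROWS]
    if min(Counter(view).values()) < k:
        return float("inf")
    return levels[0] / 2 + levels[1] / 2


def best_in_chunk(chunk):
    """Work done by one 'execution node': return its lowest-loss candidate."""
    return min(chunk, key=information_loss)


def select_preferred(n_nodes=3):
    candidates = list(product(range(3), range(3)))
    chunks = [candidates[i::n_nodes] for i in range(n_nodes)]
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        node_best = list(pool.map(best_in_chunk, chunks))
    # Choose the preferred generalization among the per-node candidates.
    return min(node_best, key=information_loss)


preferred = select_preferred()
anonymized_view = [generalize(row, preferred) for row in ROWS]
```

In this toy setup the preferred generalization is the least lossy one that still groups every row with at least one other row; a real deployment would distribute the candidates across separate machines rather than threads.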

    Metadata classification
    Type: Granted invention patent

    Publication number: US11630853B2

    Publication date: 2023-04-18

    Application number: US17163156

    Application date: 2021-01-29

    Applicant: Snowflake Inc.

    Abstract: Generating semantic names for a data set is described. An example method can include retrieving data from a data set, the data organized in a plurality of columns. The method may also include generating one or more candidate semantic categories for that column, wherein each of the one or more candidate semantic categories has a corresponding probability for each of the columns. The method may also further include creating a feature vector for each column from the one or more column candidate semantic categories and the corresponding probabilities. Additionally, the method may also include selecting, for each column, a column semantic category from the one or more candidate semantic categories using at least the feature vector and a trained machine learning model.
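As a rough illustration of the pipeline in this abstract (candidate categories with probabilities, a per-column feature vector, then a model-based selection), the sketch below uses only the Python standard library. The category names, the regex-based candidate generator, and the hand-set `WEIGHTS` table are all hypothetical; in particular, `WEIGHTS` merely stands in for the trained machine learning model, whose form the abstract does not specify.

```python
import re

# Hypothetical semantic categories and patterns used to propose candidates.
CATEGORIES = ["email", "phone", "zip_code"]
PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[a-z]+"),
    "phone": re.compile(r"\d{3}-\d{3}-\d{4}"),
    "zip_code": re.compile(r"\d{5}"),
}


def candidate_categories(values):
    """Return {category: probability}, where the probability is the
    fraction of column values that fully match the category pattern."""
    candidates = {}
    for name, pattern in PATTERNS.items():
        p = sum(bool(pattern.fullmatch(v)) for v in values) / len(values)
        if p > 0:
            candidates[name] = p
    return candidates


def feature_vector(candidates):
    """One probability per known category, in a fixed order."""
    return [candidates.get(c, 0.0) for c in CATEGORIES]


# Stand-in for a trained model: one linear scorer per category.
# These weights are invented for the example, not learned.
WEIGHTS = {
    "email": [1.0, 0.0, 0.0],
    "phone": [0.0, 1.0, 0.0],
    "zip_code": [0.0, 0.0, 1.0],
}


def select_category(vector, threshold=0.5):
    """Pick the highest-scoring category, or None if no score is convincing."""
    scores = {c: sum(w * x for w, x in zip(WEIGHTS[c], vector)) for c in CATEGORIES}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


columns = {
    "contact": ["a@x.com", "b@y.org", "n/a"],
    "postal": ["94105", "10001", "30301"],
}
selected = {
    name: select_category(feature_vector(candidate_categories(values)))
    for name, values in columns.items()
}
```

The point of routing the probabilities through a feature vector is that the final decision can weigh all candidate categories together instead of trusting a single detector.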

    Horizontally-scalable data de-identification

    Publication number: US11755778B2

    Publication date: 2023-09-12

    Application number: US17352218

    Application date: 2021-06-18

    Applicant: SNOWFLAKE INC.

    CPC classification number: G06F21/6254 G06F16/221 G06F16/282 G06F21/6227

    Abstract: Generating an anonymized view for a data set is described. An example method can include receiving data from a data set, wherein the data is organized in a plurality of columns. The method may also include generating a plurality of generalizations of the data. The method may also further include selecting a generalization from the plurality of generalizations using an information loss scoring function based on at least a generalization information loss. Additionally, the method may also include generating an anonymized view of the data set from the selected generalization.

    Horizontally-scalable data de-identification

    Publication number: US11501021B1

    Publication date: 2022-11-15

    Application number: US17352217

    Application date: 2021-06-18

    Applicant: SNOWFLAKE INC.

    Abstract: Generating an anonymized view for a data set is described. An example method can include receiving data from a data set, wherein the data is organized in a plurality of columns. The method may also include generating a plurality of generalizations of the data. The method may also further include selecting a generalization from the plurality of generalizations using an information loss scoring function based on at least a generalization information loss. Additionally, the method may also include generating an anonymized view of the data set from the selected generalization.

    Metadata classification
    Type: Granted invention patent

    Publication number: US11853329B2

    Publication date: 2023-12-26

    Application number: US18124415

    Application date: 2023-03-21

    Applicant: SNOWFLAKE INC.

    CPC classification number: G06F16/285 G06F16/221 G06N5/01

    Abstract: Systems and method are disclosed that retrieve data from a data set organized in a plurality of columns. For each column in the plurality of columns, the systems and method generate one or more candidate semantic categories for the column, where each of the one or more candidate semantic categories has a corresponding probability. The systems and method create a feature vector for the column from the one or more candidate semantic categories and the corresponding probabilities. The systems and method determine a semantic category type of the column based on the feature vector. The systems and method anonymize the data in the column based on the semantic category type, which includes replacing more specific data in the column with less specific data based on a data hierarchy that relates the more specific data to the less specific data.
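The final step of this abstract, replacing more specific values with less specific ones through a data hierarchy chosen by semantic category type, might look like the following sketch. The `HIERARCHIES` table, the `"city"` category name, and the `steps` parameter are assumptions made up for the example.

```python
# Hypothetical data hierarchy: each entry maps a value to its less specific parent.
HIERARCHIES = {
    "city": {
        "San Mateo": "California",
        "Bozeman": "Montana",
        "California": "United States",
        "Montana": "United States",
    },
}


def generalize_value(value, hierarchy, steps=1):
    """Walk up the hierarchy `steps` times; values without a parent stay unchanged."""
    for _ in range(steps):
        value = hierarchy.get(value, value)
    return value


def anonymize_column(values, category, steps=1):
    """Generalize a column according to the hierarchy for its semantic category.
    Columns whose category has no hierarchy are returned unmodified."""
    hierarchy = HIERARCHIES.get(category)
    if hierarchy is None:
        return list(values)
    return [generalize_value(v, hierarchy, steps) for v in values]


one_step = anonymize_column(["San Mateo", "Bozeman"], "city")
two_steps = anonymize_column(["San Mateo", "Bozeman"], "city", steps=2)
```

Here one step maps each city to its state and two steps map it to the country, so the caller can trade precision for privacy by choosing how far up the hierarchy to go.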

    HORIZONTALLY-SCALABLE DATA DE-IDENTIFICATION

    Publication number: US20230050290A1

    Publication date: 2023-02-16

    Application number: US17980371

    Application date: 2022-11-03

    Applicant: SNOWFLAKE INC.

    Abstract: A method receives data from a data source. The method generates a plurality of generalizations of the data. The method sends the plurality of generalizations of the data to a plurality of execution nodes, wherein each of the plurality of execution nodes includes computational resources to compute a candidate generalization using an information loss scoring function. The method receives a candidate generalization from each of the plurality of execution nodes. The method selects a preferred generalization from the plurality of candidate generalizations. The method generates an anonymized view of the data set using the preferred generalization.

    HORIZONTALLY-SCALABLE DATA DE-IDENTIFICATION

    Publication number: US20220343019A1

    Publication date: 2022-10-27

    Application number: US17352217

    Application date: 2021-06-18

    Applicant: SNOWFLAKE INC.

    Abstract: Generating an anonymized view for a data set is described. An example method can include receiving data from a data set, wherein the data is organized in a plurality of columns. The method may also include generating a plurality of generalizations of the data. The method may also further include selecting a generalization from the plurality of generalizations using an information loss scoring function based on at least a generalization information loss. Additionally, the method may also include generating an anonymized view of the data set from the selected generalization.

    HORIZONTALLY-SCALABLE DATA DE-IDENTIFICATION

    Publication number: US20220343012A1

    Publication date: 2022-10-27

    Application number: US17352218

    Application date: 2021-06-18

    Applicant: SNOWFLAKE INC.

    Abstract: Generating an anonymized view for a data set is described. An example method can include receiving data from a data set, wherein the data is organized in a plurality of columns. The method may also include generating a plurality of generalizations of the data. The method may also further include selecting a generalization from the plurality of generalizations using an information loss scoring function based on at least a generalization information loss. Additionally, the method may also include generating an anonymized view of the data set from the selected generalization.
