System and method for association of data elements within a document

    Publication No.: US11775592B2

    Publication date: 2023-10-03

    Application No.: US17089762

    Filing date: 2020-11-05

    Applicant: SECURITI, Inc.

    Abstract: A system for association of data elements within a document is disclosed. An input data receiving subsystem receives an input data source of the document. A feature generation subsystem obtains one or more lists of personal data and generates one or more personal data features representing a relationship between one or more personal data elements. An affinity computation subsystem assesses each of the one or more personal data features, computes an affinity score between the one or more personal data elements, and generates one or more affinities. A personal data relationship identification subsystem assigns the one or more personal data elements to one or more corresponding identification stages and derives a set of identities corresponding to the one or more personal data elements. An identity filtration subsystem receives the one or more affinities and the set of identities, determines a validation of the set of identities, and filters out the set of identities.
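
The pipeline described in this abstract (features, affinity scores, identities, filtration) can be illustrated with a minimal Python sketch. The abstract does not disclose the actual features or scoring rule, so everything here is an assumption for illustration: the `Element` type, the use of token distance as the only feature, the `1 / (1 + distance)` affinity, anchoring identities on names, and the `threshold` value are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Element:
    kind: str       # hypothetical label such as "name", "email", "phone"
    value: str
    position: int   # token offset within the document (assumed feature)


def affinity(a, b):
    # Assumed scoring rule: affinity decays with token distance.
    return 1.0 / (1.0 + abs(a.position - b.position))


def compute_affinities(elements):
    # Score every pair of elements whose kinds differ.
    return {
        frozenset((a, b)): affinity(a, b)
        for a, b in combinations(elements, 2)
        if a.kind != b.kind
    }


def derive_identities(elements, affinities, threshold=0.2):
    # Anchor each identity on a name, attach every other element to the
    # name it scores highest with, and drop links below the threshold
    # (a stand-in for the filtration step).
    names = [e for e in elements if e.kind == "name"]
    identities = {n.value: [] for n in names}
    for e in elements:
        if e.kind == "name" or not names:
            continue
        best = max(names, key=lambda n: affinities.get(frozenset((n, e)), 0.0))
        if affinities.get(frozenset((best, e)), 0.0) >= threshold:
            identities[best.value].append(e.value)
    return identities


elements = [
    Element("name", "Alice Smith", 0),
    Element("email", "alice@example.com", 3),
    Element("email", "info@example.com", 20),
    Element("name", "Bob Jones", 40),
    Element("phone", "555-0100", 43),
]
affinities = compute_affinities(elements)
identities = derive_identities(elements, affinities)
print(identities)
# {'Alice Smith': ['alice@example.com'], 'Bob Jones': ['555-0100']}
```

In this toy input, `info@example.com` sits far from both names, so its best affinity falls below the threshold and it is not attached to any identity.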

SYSTEM AND A METHOD FOR THE CLASSIFICATION OF SENSITIVE DATA ELEMENTS IN A FILE

    Publication No.: US20230237018A1

    Publication date: 2023-07-27

    Application No.: US18156439

    Filing date: 2023-01-19

    Applicant: SECURITI, Inc.

    Abstract: A system and a method for classifying sensitive data elements in a file are provided. The method includes receiving an unstructured data file, converting it into a machine-readable format, and generating a plurality of sensitive data features. The plurality of sensitive data features represents a single element of the sensitive data. The method includes generating a plurality of adjacent elements corresponding to the single elements of the sensitive data and generating a plurality of feature categories. The method includes aggregating the plurality of adjacent node features and the plurality of edge features. The method includes calculating and concatenating the plurality of aggregated adjacent node features and the plurality of aggregated edge features. The method includes comparing the distance of the sensitive data from all of the adjacent sensitive data. The method includes classifying the sensitive data and predicting, by using machine learning, whether it is a true positive or a false positive.
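
The aggregate-then-concatenate step in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed method: the `Candidate` type, the two-value node features, the inverse-distance edge feature, mean aggregation, and the logistic model with hand-picked `WEIGHTS` and `BIAS` are all hypothetical stand-ins for details the abstract leaves unspecified.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    position: int         # token offset in the converted file (assumed)
    node_features: list   # hypothetical, e.g. [pattern_match, keyword_nearby]


def mean_vector(vectors, width):
    # Column-wise mean; a zero vector when there are no neighbours.
    if not vectors:
        return [0.0] * width
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def build_input(target, neighbours):
    # Concatenate the target's own features with the aggregated adjacent
    # node features and the aggregated edge features. The edge feature
    # used here (inverse token distance) is an assumption.
    node_agg = mean_vector(
        [n.node_features for n in neighbours], len(target.node_features)
    )
    edge_agg = mean_vector(
        [[1.0 / (1 + abs(target.position - n.position))] for n in neighbours], 1
    )
    return target.node_features + node_agg + edge_agg


def predict_true_positive(features, weights, bias):
    # Logistic model standing in for the unspecified ML classifier.
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


WEIGHTS = [1.0, 1.0, 0.5, 0.5, 2.0]  # illustrative values, not trained
BIAS = -2.0

target = Candidate("123-45-6789", 10, [1.0, 1.0])
neighbours = [
    Candidate("Jane Doe", 9, [1.0, 0.0]),
    Candidate("1990-01-01", 13, [1.0, 1.0]),
]
isolated = Candidate("987-65-4321", 500, [1.0, 0.0])

features = build_input(target, neighbours)
score = predict_true_positive(features, WEIGHTS, BIAS)
isolated_score = predict_true_positive(build_input(isolated, []), WEIGHTS, BIAS)
print(features, score > 0.5, isolated_score > 0.5)
```

With these illustrative weights, a candidate surrounded by nearby sensitive elements scores above 0.5 (treated as a true positive), while an isolated candidate with no neighbours scores below it.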