Abstract:
A computer-implemented method for applying data-loss-prevention policies. The method may include (1) maintaining a list of applications whose access to sensitive data is controlled by data-loss-prevention (DLP) policies, (2) detecting an attempt by a process to access sensitive data, (3) determining that the process has a parent-child relationship with an application within the list of applications, and (4) applying, based at least in part on the determination that the process has the parent-child relationship with the application, a DLP policy associated with the application to the process in order to prevent loss of sensitive data. Various other methods, systems, and computer-readable media are also disclosed.
Abstract:
A computing device receives a training data set that includes a plurality of positive examples of sensitive data and a plurality of negative examples of sensitive data via a user interface. The computing device analyzes the training data set using machine learning to generate a machine learning-based detection (MLD) profile that can be used to classify new data as sensitive data or as non-sensitive data. The computing device displays a quality metric for the MLD profile in the user interface.
Abstract:
A computing device receives a training data set that comprises a plurality of sensitive documents and a plurality of non-sensitive documents. The computing device determines a quality of the training data set. The quality may be determined using k-fold cross validation and/or latent semantic indexing. In response to determining that the training data set has a satisfactory quality, the computing device then analyzes the training data set using machine learning to train a machine learning-based detection (MLD) profile, the MLD profile to be used by a data loss prevention (DLP) system to classify new documents as sensitive documents or as non-sensitive documents.
Abstract:
A computer-implemented method may include (1) identifying a plurality of specific categories of sensitive information to be protected by a DLP system, (2) obtaining a training data set for each specific category of sensitive information that includes a plurality of positive and a plurality of negative examples of the specific category of sensitive information, (3) using machine learning to train, based on an analysis of the training data sets, at least one machine learning-based classifier that is capable of detecting items of data that contain one or more of the plurality of specific categories of sensitive information, and then (4) deploying the machine learning-based classifier within the DLP system to enable the DLP system to detect and protect items of data that contain one or more of the plurality of specific categories of sensitive information in accordance with at least one DLP policy of the DLP system.
Abstract:
A computer-implemented method for data-loss prevention may include: 1) identifying data associated with a user, 2) determining that the data is subject to a data-loss-prevention scan, 3) identifying a data-loss-prevention reputation associated with the user, and then 4) performing a data-loss-prevention operation based at least in part on the data-loss-prevention reputation associated with the user. Various other methods, systems, and computer-readable media are also disclosed.
Abstract:
A data loss prevention (DLP) manager running on a security virtual machine manages DLP policies for a plurality of guest virtual machines. The DLP manager identifies a startup event of a guest virtual machine, and installs a DLP component in the guest virtual machine. The DLP component communicates with the DLP manager operating within the security virtual machine. The DLP manager also receives file system events from the DLP component, and enforces a response rule associated with the guest virtual machine if the file system event violates a DLP policy.
Abstract:
A data loss prevention (DLP) manager running on a security virtual machine manages DLP policies for a plurality of guest virtual machines. The DLP manager identifies a startup event of a guest virtual machine, and installs a DLP component in the guest virtual machine. The DLP component communicates with the DLP manager operating within the security virtual machine. The DLP manager also receives file system events from the DLP component, and enforces a response rule associated with the guest virtual machine if the file system event violates a DLP policy.
Abstract:
A computing device receives a training data set that includes a plurality of positive examples of sensitive data and a plurality of negative examples of sensitive data via a user interface. The computing device analyzes the training data set using machine learning to generate a machine learning-based detection (MLD) profile that can be used to classify new data as sensitive data or as non-sensitive data. The computing device displays a quality metric for the MLD profile in the user interface.
Abstract:
A method and apparatus for identifying information as protected information using a structure is described. A DLP system, incorporating a structure analyzer, monitors outbound data transfers performed by the computing system for violations of a DLP policy. The DLP system analyzes a structure of information contained in an outbound data transfer against a protected structure defined in a DLP policy. The DLP system identifies the information as protected information to be protected by the DLP policy based on the analysis, and, when the information is identified as protected, the DLP system detects a violation of the DLP policy. The protected structure may be derived from document templates, document forms, or from a set of training documents.
Abstract:
A computing device receives a document that was incorrectly classified as sensitive data based on a machine learning-based detection (MLD) profile. The computing device modifies a training data set that was used to generate the MLD profile by adding the document to the training data set as a negative example of sensitive data to generate a modified training data set. The computing device then analyzes the modified training data set using machine learning to generate an updated MLD profile.