Abstract:
Systems and methods are disclosed for securing an enterprise environment by detecting suspicious software. A global program lineage graph is constructed. Construction of the global program lineage graph includes creating a node for each version of a program having been installed on a set of user machines. Additionally, at least two nodes are linked with a directional edge. For each version of the program, a prevalence number of the set of user machines on which each version of the program had been installed is determined; and the prevalence number is recorded to the metadata associated with the respective node. Anomalous behavior is identified based on structures formed by the at least two nodes and associated directional edge in the global program lineage graph. An alarm is displayed on a graphical user interface for each suspicious software based on the identified anomalous behavior.
Abstract:
A computer-implemented method for analyzing operations of privilege changes is presented. The computer-implemented method includes inputting a program and performing source code analysis on the program by generating a privilege control flow graph (PCFG), generating a privilege data flow graph (PDFG), and generating a privilege call context graph (PCCG). The computer-implemented method further includes, based on the source code analysis results, instrumenting the program to perform inspections on execution states at privilege change operations, and performing runtime inspection and anomaly prevention.
Abstract:
A system and computer-implemented method are provided for host level detection of malicious Domain Name System (DNS) activities in a network environment having multiple end-hosts. The system includes a set of DNS resolver agents configured to (i) gather DNS activities from each of the multiple end-hosts by recording DNS queries and DNS responses corresponding to the DNS queries, and (ii) associate the DNS activities with Program Identifiers (PIDs) that identify programs that issued the DNS queries. The system further includes a backend server configured to detect one or more of the malicious DNS activities based on the gathered DNS activities and the PIDs.
Abstract:
Systems and methods for identifying similarities in program binaries, including extracting program binary features from one or more input program binaries to generate corresponding hybrid features. The hybrid features include a reference feature, a resource feature, an abstract control flow feature, and a structural feature. Combinations of a plurality of pairs of binaries are generated from the extracted hybrid features, and a similarity score is determined for each of the pairs of binaries. A hybrid difference score is generated based on the similarity score for each of the binaries combined with input hybrid feature parameters. A likelihood of malware in the input program is identified based on the hybrid difference score.
Abstract:
Methods and systems for process constraint include collecting system call information for a process. It is detected whether the process is idle based on the system call information and then whether the process is repeating using autocorrelation to determine whether the process issues system calls in a periodic fashion. The process is constrained if it is idle or repeating to limit an attack surface presented by the process.
Abstract:
The present invention enables capturing API level calls using a combination of dynamic instrumentation and library overriding. The invention allows event level tracing of API function calls and returns, and is able to generate an execution trace. The instrumentation is lightweight and relies on dynamic library/shared library linking mechanisms in most operating systems. Hence we need no source code modification or binary injection. The tool can be used to capture parameter values, and return values, which can be used to correlate traces across API function calls to generate transaction flow logic.
Abstract:
A computer implemented method provides efficient monitoring and analysis of a program's memory objects in the operation stage. The invention can visualize and analyze a monitored program's data status with improved semantic information without requiring source code at runtime. The invention can provide higher quality of system management, performance debugging, and root-cause error analysis of enterprise software in the production stage.
Abstract:
Methods and systems for detecting and responding to an intrusion in a computer network include generating an adversarial training data set that includes original samples and adversarial samples, by perturbing one or more of the original samples with an integrated gradient attack to generate the adversarial samples. The original and adversarial samples are encoded to generate respective original and adversarial graph representations, based on node neighborhood aggregation. A graph-based neural network is trained to detect anomalous activity in a computer network, using the adversarial training data set. A security action is performed responsive to the detected anomalous activity.
Abstract:
A method for implementing confidential machine learning with program compartmentalization includes implementing a development stage to design an ML program, including annotating source code of the ML program to generate an ML program annotation, performing program analysis based on the development stage, including compiling the source code of the ML program based on the ML program annotation, inserting binary code based on the program analysis, including inserting run-time code into a confidential part of the ML program and a non-confidential part of the ML program, and generating an ML model by executing the ML program with the inserted binary code to protect the confidentiality of the ML model and the ML program from attack.
Abstract:
A computer-implemented method for securing software installation through deep graph learning includes extracting a new software installation graph (SIG) corresponding to a new software installation based on installation data associated with the new software installation, using at least two node embedding models to generate a first vector representation by embedding the nodes of the new SIG and inferring any embeddings for out-of-vocabulary (OOV) words corresponding to unseen pathnames, utilizing a deep graph autoencoder to reconstruct nodes of the new SIG from latent vector representations encoded by the graph LSTM, wherein reconstruction losses resulting from a difference of a second vector representation generated by the deep graph autoencoder and the first vector representation represent anomaly scores for each node, and performing anomaly detection by comparing an overall anomaly score of the anomaly scores to a threshold of normal software installation.