Abstract:
Systems and methods for preventing cyberattacks using a Density Estimation Network (DEN) for unsupervised anomaly detection, including constructing the DEN using acquired network traffic data by performing end-to-end training. The training includes generating low-dimensional vector representations of the network traffic data by performing dimensionality reduction of the network traffic data, predicting mixture membership distribution parameters for each of the low-dimensional representations by performing density estimation using a Gaussian Mixture Model (GMM) framework, and formulating an objective function to estimate an energy and determine a density level of the low-dimensional representations for anomaly detection, with an anomaly being identified when the energy exceeds a pre-defined threshold. Cyberattacks are prevented by blocking transmission of network flows with identified anomalies by directly filtering out the flows using a network traffic monitor.
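The energy-threshold test described above can be illustrated with a minimal sketch. It assumes the mixture parameters are already known and that covariances are diagonal; the function names `gmm_energy` and `is_anomaly`, the parameter layout, and the threshold value are hypothetical and not taken from the abstract. Energy is taken here to be the negative log-likelihood of a low-dimensional representation under the mixture.

```python
import math

def gmm_energy(z, weights, means, variances):
    """Negative log-likelihood of point z under a diagonal-covariance GMM.

    Assumption: diagonal covariances, supplied as per-dimension variances.
    """
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        # Log density of one Gaussian component, summed over dimensions.
        log_pdf = 0.0
        for zi, mi, vi in zip(z, mu, var):
            log_pdf += -0.5 * (math.log(2 * math.pi * vi) + (zi - mi) ** 2 / vi)
        log_terms.append(math.log(w) + log_pdf)
    # Log-sum-exp for numerical stability when components are far apart.
    m = max(log_terms)
    return -(m + math.log(sum(math.exp(t - m) for t in log_terms)))

def is_anomaly(z, weights, means, variances, threshold):
    """Flag z as anomalous when its energy exceeds the threshold."""
    return gmm_energy(z, weights, means, variances) > threshold

# Hypothetical two-component mixture in a 2-D latent space.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [5.0, 5.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
```

Under these illustrative parameters, a point near either mean has low energy, while a point far from both has high energy and would be flagged.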
Abstract:
Endpoint security systems and methods include a distance estimation module configured to calculate a travel distance between a source Internet Protocol (IP) address and an IP address for a target network endpoint system, based on time-to-live (TTL) information from a packet received by the target network endpoint system. A machine learning model is configured to estimate an expected travel distance between the source IP address and the target network endpoint system IP address based on a sparse set of known source/target distances. A spoof detection module is configured to determine that the received packet has a spoofed source IP address based on a comparison between the calculated travel distance and the expected travel distance. A security module is configured to perform a security action at the target network endpoint system responsive to the determination that the received packet has a spoofed source IP address.
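One way to picture the TTL-based comparison is the sketch below. It assumes the sender used one of a few common initial TTL values (32, 64, 128, 255), which is a widely used heuristic and not something the abstract specifies. The expected hop count would come from the machine learning model; here it is simply passed in. The names `hops_from_ttl` and `looks_spoofed` and the tolerance of 3 hops are hypothetical.

```python
# Assumption: senders start from one of these common default TTL values.
COMMON_INITIAL_TTLS = (32, 64, 128, 255)

def hops_from_ttl(observed_ttl):
    """Estimate hop count as (smallest plausible initial TTL) - (observed TTL)."""
    if observed_ttl < 0:
        raise ValueError("TTL must be non-negative")
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial - observed_ttl
    raise ValueError("TTL must be at most 255")

def looks_spoofed(observed_ttl, expected_hops, tolerance=3):
    """Compare the calculated distance with the expected distance.

    expected_hops stands in for the model's estimate; tolerance is a
    hypothetical allowance for normal routing variation.
    """
    return abs(hops_from_ttl(observed_ttl) - expected_hops) > tolerance
```

For example, a packet arriving with TTL 52 is taken to have travelled 12 hops from an initial TTL of 64; if the expected distance for its claimed source is also about 12 hops, it is not flagged.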
Abstract:
Methods and systems for network management include performing path regression to determine an end-to-end path across physical links for each data flow in a network. A per-flow utilization of each physical link in the network is estimated based on the determined end-to-end paths. A management action is performed in the network based on the estimated per-flow utilization.
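The utilization step can be sketched as follows, assuming the end-to-end path of each flow has already been determined (the path regression itself is not shown). The function name `per_flow_link_utilization`, the data layout, and the example rates and capacities are hypothetical.

```python
from collections import defaultdict

def per_flow_link_utilization(flow_paths, flow_rates, link_capacity):
    """Return {link: {flow: fraction of link capacity used by that flow}}.

    flow_paths maps a flow to its ordered list of nodes; each consecutive
    node pair is treated as one physical link.
    """
    utilization = defaultdict(dict)
    for flow, path in flow_paths.items():
        for link in zip(path, path[1:]):
            utilization[link][flow] = flow_rates[flow] / link_capacity[link]
    return dict(utilization)

# Hypothetical inputs: two flows sharing the link ("b", "c").
flow_paths = {"f1": ["a", "b", "c"], "f2": ["b", "c"]}
flow_rates = {"f1": 2.0, "f2": 3.0}
link_capacity = {("a", "b"): 10.0, ("b", "c"): 10.0}
util = per_flow_link_utilization(flow_paths, flow_rates, link_capacity)
```

A management action could then be keyed on the per-link totals, for instance rerouting the largest contributor on a link whose summed utilization is too high.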
Abstract:
A method implemented in a network apparatus used in a network is disclosed. The method includes sensing network topology and network utilization, receiving a request from an application, deciding a path setup requirement using network state information obtained from the network topology and the network utilization, and translating the path setup requirement into a rule to be installed. Other methods, apparatuses, and systems also are disclosed.
Abstract:
A method for policy-aware mapping of an enterprise virtual tenant network (VTN) includes receiving inputs from a hosting network and tenants; translating resource demand and policies of the tenants into a network topology and a bandwidth demand on each link in the network; pre-arranging a physical resource of a physical topology for clustering servers on the network to form an allocation unit before a VTN allocation; allocating resources of the hosting network to satisfy demand of the tenants in response to a VTN demand request; and conducting a policy-aware VTN mapping for enumerating all feasible resource mappings, bounded by a predetermined counter, for outputting an optimal mapping with policy-compliant routing paths in the hosting network.
Abstract:
A method classifies missing labels. The method computes, using a neural network model trained on training data, rank-based statistics of a feature of a time series segment to attempt to select two candidate labels from the training data that the segment most likely belongs to. The method classifies the segment using k-NN-based classification applied to the training data, responsive to the two candidate labels being present in the training data. The method classifies the segment by hypothesis testing, responsive to only one candidate label being present in the training data. The method classifies the segment into a class with higher values of the rank-based statistics from among a plurality of classes with different values of the rank-based statistics, responsive to no candidate labels being present in the training data. The method corrects a prediction by an applicable one of the classifying steps by majority voting with time windows.
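The final correction step, majority voting with time windows, might look like the sketch below. It assumes a centered sliding window over per-segment predictions and keeps the original label on ties; both choices, the window size, and the name `smooth_by_majority` are hypothetical, since the abstract does not define the window.

```python
from collections import Counter

def smooth_by_majority(predictions, window=3):
    """Replace each prediction with the majority label in a centered window."""
    half = window // 2
    smoothed = []
    for i, original in enumerate(predictions):
        lo = max(0, i - half)
        hi = min(len(predictions), i + half + 1)
        counts = Counter(predictions[lo:hi])
        top = max(counts.values())
        winners = [label for label, c in counts.items() if c == top]
        # On a tie, keep the original label if it is among the winners.
        smoothed.append(original if original in winners else winners[0])
    return smoothed
```

With a window of 3, an isolated "B" inside a run of "A" predictions is corrected to "A", while a genuine change of class that persists for several segments is preserved.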
Abstract:
A system for cross-modal data retrieval is provided that includes a neural network having a time series encoder and a text encoder, which are jointly trained using an unsupervised training method based on a loss function. The loss function jointly evaluates a similarity of feature vectors of training sets of two different modalities, time series and free-form text comments, and a compatibility of the time series and the free-form text comments with a word-overlap-based spectral clustering method configured to compute pseudo labels for the unsupervised training method. The system further includes a database for storing the training sets with feature vectors extracted from encodings of the training sets. The encodings are obtained by encoding a training set of the time series using the time series encoder and encoding a training set of the free-form text comments using the text encoder.
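As a rough illustration of the word-overlap input to spectral clustering, the sketch below builds a pairwise affinity matrix over comments. Jaccard similarity on whitespace-separated, lowercased words is used as a stand-in, since the abstract does not say how overlap is measured; the clustering step that would turn this matrix into pseudo labels is not shown. The names `word_overlap` and `affinity_matrix` and the sample comments are hypothetical.

```python
def word_overlap(a, b):
    """Jaccard similarity between the word sets of two comments (assumed measure)."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)

def affinity_matrix(comments):
    """Symmetric matrix of pairwise word overlap, usable as clustering input."""
    return [[word_overlap(a, b) for b in comments] for a in comments]

# Hypothetical free-form comments attached to time series segments.
comments = ["pump pressure spike", "pressure spike detected", "normal operation"]
affinity = affinity_matrix(comments)
```

Comments that share vocabulary receive a high affinity and would tend to fall into the same cluster, which is what lets the clusters act as pseudo labels.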
Abstract:
A method for explaining sensor time series data in natural language is presented. The method includes training a neural network model with text-annotated time series data, the neural network model including a time series encoder and a text generator, allowing a human operator to select a time series segment from the text-annotated time series data, the time series segment processed by the time series encoder, outputting, from the time series encoder, a sequence of hidden state vectors, one for each timestep, and generating readable explanatory texts for the human operator based on the selected time series segment, the readable explanatory texts being a set of comment texts explaining and interpreting the selected time series segment in a plurality of different ways.
Abstract:
Systems and methods for retrieving similar multivariate time series segments are provided. The systems and methods include extracting a long feature vector and a short feature vector from a time series segment, converting the long feature vector into a long binary code, and converting the short feature vector into a short binary code. The systems and methods further include obtaining a subset of long binary codes from a binary dictionary storing dictionary long codes based on the short binary code, and calculating a similarity measure for each pair of the long feature vector and a dictionary long code. The systems and methods further include identifying a predetermined number of dictionary long codes having the similarity measures indicating a closest relationship between the long binary codes and dictionary long codes, and retrieving a predetermined number of time series segments associated with the predetermined number of dictionary long codes.
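The two-stage lookup can be sketched as follows. It assumes sign-based binarization and Hamming distance between binary codes as the similarity measure, neither of which is specified in the abstract, and it takes the feature vectors as given inputs. The names `binarize`, `hamming`, and `retrieve`, the dictionary layout, and the sample data are hypothetical.

```python
def binarize(vector):
    """Assumed conversion: 1 where the feature is positive, else 0."""
    return tuple(1 if x > 0 else 0 for x in vector)

def hamming(a, b):
    """Number of positions where two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

def retrieve(long_vec, short_vec, dictionary, k=2):
    """Use the short code to pick a bucket, then rank its long codes.

    dictionary maps a short code to a list of (long_code, segment_id) pairs.
    """
    candidates = dictionary.get(binarize(short_vec), [])
    query = binarize(long_vec)
    ranked = sorted(candidates, key=lambda item: hamming(query, item[0]))
    return [segment_id for _, segment_id in ranked[:k]]

# Hypothetical dictionary with one bucket of three stored segments.
dictionary = {
    (1, 0): [((1, 1, 0, 0), "seg1"), ((0, 0, 1, 1), "seg2"), ((1, 1, 0, 1), "seg3")],
}
result = retrieve([0.5, 0.2, -0.1, -0.3], [0.9, -0.4], dictionary, k=2)
```

The short code keeps the candidate set small, so the more expensive comparison against long codes runs only over one bucket instead of the whole dictionary.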
Abstract:
A system for cross-modal data retrieval is provided. The system includes a database for storing training sets of two different modalities, time series and free-form text comments, as pairs of mixed modality data. The system further includes a neural network having a time series encoder and a text encoder, which are jointly trained using a canonical correlation analysis that finds transformations of feature vectors from among the pairs of mixed modality data such that correlated mixed modality data is emphasized in the two different modalities and uncorrelated mixed modality data is minimized. The feature vectors are obtained by encoding a training set of the time series using the time series encoder and encoding a training set of the free-form text comments using the text encoder.