-
Publication Number: US20220012535A1
Publication Date: 2022-01-13
Application Number: US16924009
Filing Date: 2020-07-08
Applicant: VMware, Inc.
Inventor: Yaniv Ben-Itzhak , Shay Vargaftik
Abstract: Techniques for augmenting training data sets for machine learning (ML) classifiers using classification metadata are provided. In one set of embodiments, a computer system can train a first ML classifier using a training data set, where the training data set comprises a plurality of data instances, where each data instance includes a set of features, and where the training results in a trained version of the first ML classifier. The computer system can further classify each data instance in the plurality of data instances using the trained version of the first ML classifier, the classifications generating classification metadata for each data instance, and augment the training data set with the classification metadata to create an augmented version of the training data set. The computer system can then train a second ML classifier using the augmented version of the training data set.
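A minimal Python sketch of the two-stage flow described above, assuming scikit-learn; the particular estimators, and the use of the first classifier's predicted class probabilities as the "classification metadata", are illustrative choices rather than details taken from the patent:

```python
# Sketch: augment a training set with classification metadata from a first
# classifier, then train a second classifier on the augmented set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# 1. Train the first ML classifier on the original feature set.
first_clf = RandomForestClassifier(n_estimators=100, random_state=0)
first_clf.fit(X, y)

# 2. Classify every training instance; the per-class probabilities serve
#    as the classification metadata for that instance.
metadata = first_clf.predict_proba(X)

# 3. Augment each data instance's features with its classification metadata.
X_augmented = np.hstack([X, metadata])

# 4. Train the second ML classifier on the augmented training data set.
second_clf = GradientBoostingClassifier(random_state=0)
second_clf.fit(X_augmented, y)
```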
-
Publication Number: US20210216831A1
Publication Date: 2021-07-15
Application Number: US16743865
Filing Date: 2020-01-15
Applicant: VMware, Inc.
Inventor: Yaniv Ben-Itzhak
Abstract: Techniques for implementing an efficient machine learning (ML) model for classification are provided. In one set of embodiments, a computer system can receive a query data instance to be classified. The computer system can then generate a first classification result for the query data instance using a first (i.e., primary) ML model, where the first classification result includes a predicted class for the query data instance and a confidence level indicating a likelihood that the predicted class is correct, and compare the confidence level with a classification confidence threshold. If the confidence level is greater than or equal to the classification confidence threshold, the computer system can output the first classification result as a final classification result for the query data instance. However, if the confidence level is less than the classification confidence threshold, the computer system can forward the query data instance to one of a plurality of second (i.e., secondary) ML models for further classification.
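A hedged sketch of the threshold-based cascade, assuming scikit-learn-style models that expose `predict_proba`; the threshold value and the rule for choosing among the secondary models (here, by the primary's predicted class) are assumptions not specified in the abstract:

```python
# Sketch: two-tier classification with a classification confidence threshold.
import numpy as np

def classify(x, primary, secondaries, threshold=0.9):
    """Return a (predicted_class, confidence) pair for query instance x."""
    proba = primary.predict_proba(x.reshape(1, -1))[0]
    predicted = int(np.argmax(proba))
    confidence = float(proba[predicted])

    # Confident enough: the primary result is the final classification result.
    if confidence >= threshold:
        return predicted, confidence

    # Otherwise forward the query instance to one of the secondary models
    # (here: the one associated with the primary's predicted class).
    secondary = secondaries[predicted]
    proba2 = secondary.predict_proba(x.reshape(1, -1))[0]
    predicted2 = int(np.argmax(proba2))
    return predicted2, float(proba2[predicted2])
```

Any fitted estimators exposing `predict_proba` can be passed as `primary` and `secondaries`.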
-
Publication Number: US20230409243A1
Publication Date: 2023-12-21
Application Number: US17845740
Filing Date: 2022-06-21
Applicant: VMware, Inc.
Inventor: Alex Markuze , Shay Vargaftik , Igor Golikov , Yaniv Ben-Itzhak , Avishay Yanai
IPC: G06F3/06
CPC classification number: G06F3/067 , G06F3/0655 , G06F3/0604
Abstract: Some embodiments provide a method for accessing data in a network, performed at a network interface controller (NIC) of a computer. The method receives, from the computer, a request to access data stored at a logical memory address. The method translates the logical memory address into a memory address of a particular network device storing the requested data. The method sends a data message to the particular network device to retrieve the requested data.
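A simplified, host-side Python sketch of the address-translation step; the page-granular mapping table and the `send_request` placeholder are hypothetical stand-ins for what the NIC would do in hardware:

```python
# Sketch: translate a logical memory address into (network device, device-local
# address) and issue a retrieval request for the data stored there.
from dataclasses import dataclass

PAGE_SIZE = 4096

@dataclass
class Mapping:
    device: str        # network address of the device holding this page
    base_address: int  # device-local address of the page

# Logical page number -> location of that page on some network device.
translation_table = {
    0: Mapping(device="10.0.0.7", base_address=0x10000),
    1: Mapping(device="10.0.0.9", base_address=0x28000),
}

def send_request(device, address, length):
    """Placeholder for the data message the NIC sends to the remote device."""
    print(f"requesting {length} bytes at {hex(address)} from {device}")
    return b""

def translate(logical_address: int):
    """Map a logical address to (device, device-local address)."""
    page, offset = divmod(logical_address, PAGE_SIZE)
    m = translation_table[page]
    return m.device, m.base_address + offset

def read(logical_address: int, length: int):
    device, remote_address = translate(logical_address)
    return send_request(device, remote_address, length)

read(PAGE_SIZE + 128, 64)   # resolves to page 1 on 10.0.0.9
```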
-
Publication Number: US11687824B2
Publication Date: 2023-06-27
Application Number: US16248622
Filing Date: 2019-01-15
Applicant: VMware, Inc.
Inventor: Yaniv Ben-Itzhak , Shay Vargaftik
CPC classification number: G06N20/00 , G06F16/285 , G06N5/045
Abstract: Techniques for implementing intelligent data partitioning for a distributed machine learning (ML) system are provided. In one set of embodiments, a computer system implementing a data partition module can receive a training data instance for a ML task and identify, using a clustering algorithm, a cluster to which the training data instance belongs, the cluster being one of a plurality of clusters determined via the clustering algorithm that partition a data space of the ML task. The computer system can then transmit the training data instance to a ML worker of the distributed ML system that is assigned to the cluster, where the ML worker is configured to build or update a ML model using the training data instance.
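A minimal sketch of the partitioning step, assuming k-means as the clustering algorithm (the patent does not mandate a specific one) and one cluster per ML worker:

```python
# Sketch: partition training data across ML workers by cluster membership.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

NUM_WORKERS = 4
X, y = make_blobs(n_samples=2000, centers=NUM_WORKERS, random_state=0)

# Learn a partition of the ML task's data space, one cluster per worker.
partitioner = KMeans(n_clusters=NUM_WORKERS, random_state=0).fit(X)

def route(instance):
    """Return the id of the ML worker that should train on this instance."""
    cluster = int(partitioner.predict(instance.reshape(1, -1))[0])
    return cluster  # worker `cluster` builds/updates its local model

worker_id = route(X[0])   # e.g., transmit (X[0], y[0]) to this worker
```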
-
Publication Number: US20220335300A1
Publication Date: 2022-10-20
Application Number: US17231476
Filing Date: 2021-04-15
Applicant: VMware, Inc.
Inventor: Yaniv Ben-Itzhak , Shay Vargaftik , Ayal Taitler
IPC: G06N3/08 , G06N5/00 , G06F16/901
Abstract: In one set of embodiments, a deep reinforcement learning (RL) system can train an agent to construct an efficient decision tree for classifying network packets according to a rule set, where the training includes: identifying, by an environment of the deep RL system, a leaf node in a decision tree; computing, by the environment, a graph structure representing a state of the leaf node, the graph structure including information regarding how one or more rules in the rule set that are contained in the leaf node are distributed in a hypercube of the leaf node; communicating, by the environment, the graph structure to the agent; providing, by the agent, the graph structure as input to a graph neural network; and generating, by the graph neural network based on the graph structure, an action to be taken on the leaf node for extending the decision tree.
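A heavily simplified sketch of the environment/agent loop sketched in the abstract; the graph encoding, the random `select_action` stand-in (a graph neural network in the patent), and the rule-halving split are illustrative placeholders only:

```python
# Sketch: environment computes a graph state for a leaf, agent picks an
# action, environment extends the decision tree accordingly.
import random
from dataclasses import dataclass, field

@dataclass
class LeafNode:
    rules: list                                  # rules contained in this leaf
    children: list = field(default_factory=list)

def build_graph_state(leaf):
    """Environment: encode how the leaf's rules relate to one another as a
    graph (here, nodes = rules and edges = all rule pairs, for illustration)."""
    n = len(leaf.rules)
    return {"nodes": list(range(n)),
            "edges": [(i, j) for i in range(n) for j in range(i + 1, n)]}

def select_action(graph_state):
    """Agent: stand-in for the graph neural network's action choice."""
    return random.choice(["cut_dim_0", "cut_dim_1"])

def apply_action(leaf, action, min_rules=2):
    """Environment: extend the tree by splitting the leaf. A real
    implementation cuts along the chosen dimension of the leaf's hypercube;
    here the rules are simply halved."""
    if len(leaf.rules) <= min_rules:
        return []
    mid = len(leaf.rules) // 2
    leaf.children = [LeafNode(leaf.rules[:mid]), LeafNode(leaf.rules[mid:])]
    return leaf.children

root = LeafNode(rules=list(range(16)))
frontier = [root]
while frontier:
    leaf = frontier.pop()
    state = build_graph_state(leaf)        # environment -> agent
    action = select_action(state)          # agent (GNN in the patent)
    frontier.extend(apply_action(leaf, action))
```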
-
Publication Number: US20220012567A1
Publication Date: 2022-01-13
Application Number: US16924015
Filing Date: 2020-07-08
Applicant: VMware, Inc.
Inventor: Yaniv Ben-Itzhak , Shay Vargaftik
Abstract: Techniques for training a neural network classifier using classification metadata from another, non-neural network (non-NN) classifier are provided. In one set of embodiments, a computer system can train the non-NN classifier using a training data set comprising a plurality of data instances, where the training results in a trained version of the non-NN classifier. The computer system can further classify a data instance in the plurality of data instances using the trained non-NN classifier, the classifying generating a first class distribution for the data instance, and provide the data instance's feature set as input to the neural network classifier, the providing causing the neural network classifier to generate a second class distribution for the data instance. The computer system can then compute a loss value indicating a degree of divergence between the first and second class distributions and provide the loss value as feedback to the neural network classifier, which can cause the neural network classifier to adjust one or more internal edge weights in a manner that reduces the degree of divergence.
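A minimal PyTorch sketch of the distillation-style training loop described above; the random-forest teacher, the network architecture, and the hyperparameters are illustrative assumptions:

```python
# Sketch: train a neural network against the class distributions produced by
# a non-NN classifier, using KL divergence as the loss.
import torch
import torch.nn as nn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, n_classes=3,
                           n_informative=5, random_state=0)

# 1. Train the non-NN classifier and record its per-instance class distribution.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
proba = torch.tensor(rf.predict_proba(X), dtype=torch.float32)
target_dist = (proba + 1e-6) / (proba + 1e-6).sum(dim=1, keepdim=True)  # avoid log(0)

# 2. Neural network classifier producing its own class distribution.
net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))
features = torch.tensor(X, dtype=torch.float32)

# 3. Minimize the divergence between the two distributions.
loss_fn = nn.KLDivLoss(reduction="batchmean")   # expects log-probabilities
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

for epoch in range(50):
    log_probs = torch.log_softmax(net(features), dim=1)
    loss = loss_fn(log_probs, target_dist)  # divergence between distributions
    optimizer.zero_grad()
    loss.backward()                         # feedback adjusts the edge weights
    optimizer.step()
```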
-
Publication Number: US20200341789A1
Publication Date: 2020-10-29
Application Number: US16394663
Filing Date: 2019-04-25
Applicant: VMware, Inc.
Inventor: Aditi Ghag , Pranshu Jain , Yaniv Ben-Itzhak , Sujata Banerjee , Yongzhe Fan
Abstract: A method for containerized workload scheduling can include monitoring network traffic between a first containerized workload deployed on a node in a virtual computing environment and other containerized workloads in the virtual computing environment to determine affinities between the first containerized workload and the other containerized workloads. The method can further include scheduling, based at least in part on the determined affinities between the first containerized workload and the other containerized workloads, execution of a second containerized workload on the node on which the first containerized workload is deployed.
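A toy Python sketch of the affinity-based placement decision; the traffic table, workload names, and node map are illustrative stand-ins for the monitored network traffic:

```python
# Sketch: schedule a workload on the node hosting its most traffic-affine peer.
from collections import defaultdict

# Observed bytes exchanged between pairs of containerized workloads.
traffic_bytes = {
    ("web", "cache"): 5_000_000,
    ("web", "db"): 1_200_000,
    ("batch", "db"): 300_000,
}

# Node on which each already-deployed workload runs.
placement = {"web": "node-1", "db": "node-2", "batch": "node-3"}

def affinity(a, b):
    """Affinity between two workloads, derived from monitored traffic."""
    return traffic_bytes.get((a, b), 0) + traffic_bytes.get((b, a), 0)

def schedule(new_workload):
    """Pick the node whose resident workloads have the highest total affinity."""
    scores = defaultdict(int)
    for peer, node in placement.items():
        scores[node] += affinity(new_workload, peer)
    return max(scores, key=scores.get)

print(schedule("cache"))   # -> "node-1", because cache mostly talks to web
```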
-
Publication Number: US20200226491A1
Publication Date: 2020-07-16
Application Number: US16248622
Filing Date: 2019-01-15
Applicant: VMware, Inc.
Inventor: Yaniv Ben-Itzhak , Shay Vargaftik
Abstract: Techniques for implementing intelligent data partitioning for a distributed machine learning (ML) system are provided. In one set of embodiments, a computer system implementing a data partition module can receive a training data instance for a ML task and identify, using a clustering algorithm, a cluster to which the training data instance belongs, the cluster being one of a plurality of clusters determined via the clustering algorithm that partition a data space of the ML task. The computer system can then transmit the training data instance to a ML worker of the distributed ML system that is assigned to the cluster, where the ML worker is configured to build or update a ML model using the training data instance.
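This published application shares its abstract with granted patent US11687824B2 above; to complement the partitioner sketch there, the following sketch shows the worker side, where each worker incrementally updates its own local model from the instances routed to its cluster (the use of `SGDClassifier.partial_fit` is an illustrative choice):

```python
# Sketch: each ML worker owns one cluster and incrementally builds/updates its
# own model from the training instances routed to it.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import SGDClassifier

NUM_WORKERS = 4
X, y = make_blobs(n_samples=2000, centers=NUM_WORKERS, random_state=0)
y = y % 2   # an arbitrary binary label, for illustration only

partitioner = KMeans(n_clusters=NUM_WORKERS, random_state=0).fit(X)
workers = [SGDClassifier() for _ in range(NUM_WORKERS)]

# Route each instance to its cluster's worker, which updates its local model.
clusters = partitioner.predict(X)
for worker_id in range(NUM_WORKERS):
    idx = np.where(clusters == worker_id)[0]
    workers[worker_id].partial_fit(X[idx], y[idx], classes=np.array([0, 1]))
```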
-
Publication Number: US11928857B2
Publication Date: 2024-03-12
Application Number: US16924048
Filing Date: 2020-07-08
Applicant: VMware, Inc.
Inventor: Yaniv Ben-Itzhak , Shay Vargaftik
IPC: G06V10/774 , G06F18/214 , G06F18/22 , G06N20/00 , G06V10/77 , G06V10/778
CPC classification number: G06V10/7753 , G06F18/214 , G06F18/22 , G06N20/00 , G06V10/77 , G06V10/778
Abstract: Techniques for implementing unsupervised anomaly detection by self-prediction are provided. In one set of embodiments, a computer system can receive an unlabeled training data set comprising a plurality of unlabeled data instances, where each unlabeled data instance includes values for a plurality of features. The computer system can further train, for each feature in the plurality of features, a supervised machine learning (ML) model using a labeled training data set derived from the unlabeled training data set, receive a query data instance, and generate a self-prediction vector using at least a portion of the trained supervised ML models and the query data instance, where the self-prediction vector indicates what the query data instance should look like if it were normal. The computer system can then generate an anomaly score for the query data instance based on the self-prediction vector and the query data instance.
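A minimal sketch of self-prediction scoring: one model per feature predicts that feature from all the others, and the anomaly score is the distance between a query and its self-prediction vector (treating each feature as a regression target, and using Euclidean distance, are assumptions):

```python
# Sketch: unsupervised anomaly detection by self-prediction.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))          # unlabeled training data

# One supervised model per feature: predict feature j from the other features.
models = []
for j in range(X.shape[1]):
    others = np.delete(X, j, axis=1)
    models.append(RandomForestRegressor(n_estimators=50, random_state=0)
                  .fit(others, X[:, j]))

def anomaly_score(query):
    """Distance between the query and what it 'should' look like if normal."""
    self_prediction = np.array([
        models[j].predict(np.delete(query, j).reshape(1, -1))[0]
        for j in range(len(query))
    ])
    return float(np.linalg.norm(query - self_prediction))

print(anomaly_score(rng.normal(size=5)))   # typical instance: low score
print(anomaly_score(np.full(5, 10.0)))     # far-off instance: high score
```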
-
Publication Number: US20230409484A1
Publication Date: 2023-12-21
Application Number: US17845661
Filing Date: 2022-06-21
Applicant: VMware, Inc.
Inventor: Shay Vargaftik , Alex Markuze , Yaniv Ben-Itzhak , Igor Golikov , Avishay Yanai
IPC: G06F12/0891 , G06F13/16
CPC classification number: G06F12/0891 , G06F13/1668 , G06F2213/3808 , G06F2213/0026
Abstract: Some embodiments provide a method for performing data message processing at a smart NIC of a computer that executes a software forwarding element (SFE). The method determines whether a received data message matches an entry in a data message classification cache stored on the smart NIC based on data message classification results of the SFE. When the data message matches an entry, the method determines whether the matched entry is valid by comparing a timestamp of the entry to a set of rules stored on the smart NIC. When the matched entry is valid, the method processes the data message according to the matched entry without providing the data message to the SFE executing on the computer.
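A simplified Python sketch of the lookup-and-validate flow; the flow key, the rule-timestamp check, and the slow-path placeholder are illustrative stand-ins for the smart NIC and software forwarding element (SFE) behavior:

```python
# Sketch: check the smart NIC's classification cache, validate the entry
# against rule timestamps, and fall back to the SFE on a miss or stale entry.
import time

# Rules stored on the smart NIC, with the time each was last updated.
rules_last_updated = {"acl": 100.0, "nat": 250.0}

# flow key -> (action, timestamp of the SFE classification that produced it)
classification_cache = {
    ("10.0.0.1", "10.0.0.2", 443): ("forward:port2", 300.0),
}

def flow_key(msg):
    return (msg["src"], msg["dst"], msg["dport"])

def entry_is_valid(entry_timestamp):
    """An entry is valid only if it postdates every relevant rule update."""
    return entry_timestamp >= max(rules_last_updated.values())

def handle(msg):
    entry = classification_cache.get(flow_key(msg))
    if entry is not None:
        action, ts = entry
        if entry_is_valid(ts):
            return action                 # fast path: handled on the NIC
    # Miss or stale entry: hand the data message to the SFE on the host,
    # which reclassifies it and refreshes the cache (placeholder here).
    action = "sfe:slow-path"
    classification_cache[flow_key(msg)] = (action, time.time())
    return action

print(handle({"src": "10.0.0.1", "dst": "10.0.0.2", "dport": 443}))
```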
-