-
61.
Publication number: US10181983B2
Publication date: 2019-01-15
Application number: US15174017
Filing date: 2016-06-06
Applicant: VMware, Inc.
Inventor: Ashot Nshan Harutyunyan , Naira Movses Grigoryan , Arnak Poghosyan
Abstract: Methods recommend to data center customers those attributes of a data center infrastructure and application program that are associated with service-level objective (“SLO”) metric degradation and may be recorded in problem definitions. In other words, a data center customer is invited to “codify” problems primarily with atomic abnormality conditions on indicated attributes that decrease the SLO by a degree of which the data center customer would like to be aware. As a result, the data center customer is warned of a potentially significant SLO decline in order to prevent unwanted loss and take any necessary actions to prevent active anomalies. Methods also generate patterns of attributes that constitute core structures highly associated with degradation of the SLO metric.
-
Publication number: US20180341566A1
Publication date: 2018-11-29
Application number: US15604460
Filing date: 2017-05-24
Applicant: VMware, Inc.
Inventor: Ashot Nshan Harutyunyan , Vardan Movsisyan , Arnak Poghosyan , Naira Movses Grigoryan
Abstract: Methods and systems are directed to quantifying and prioritizing the impact of problems or changes in a computer system. Resources of a computer system are monitored by management tools. When a change occurs at a resource of a computer system or in log data generated by event sources of the computer system, one or more of the management tools generates an alert. The alert may be an alert that indicates a problem with the computer system resource or the alert may be an alert trigger identified in an event message of the log data. Methods described herein compute an impact factor that serves as a measure of the difference between event messages generated before the alert and event messages generated after the alert. The value of the impact factor associated with an alert may be used to quantitatively prioritize the alert and generate appropriate recommendations for responding to the alert.
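Illustrative sketch (not taken from the patent): the abstract above describes the impact factor only as a measure of the difference between event messages generated before and after an alert. One way to make that concrete, assuming each event message has already been reduced to an event-type label, is to compare the two event-type distributions with the Jensen-Shannon divergence. The choice of divergence and the function names `event_type_distribution`, `js_divergence`, and `impact_factor` are assumptions made for this example.

```python
import math
from collections import Counter


def event_type_distribution(event_types, vocabulary):
    """Relative frequency of each event type, ordered by `vocabulary`."""
    counts = Counter(event_types)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one event message is required")
    return [counts[t] / total for t in vocabulary]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2), bounded between 0 and 1."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def impact_factor(before, after):
    """Hypothetical impact factor: divergence between the event-type
    distributions observed before and after an alert."""
    vocabulary = sorted(set(before) | set(after))
    p = event_type_distribution(before, vocabulary)
    q = event_type_distribution(after, vocabulary)
    return js_divergence(p, q)


# Alerts could then be ranked by their impact factor, largest first.
alerts = {
    "alert-1": (["info", "info", "warn"], ["info", "info", "warn"]),
    "alert-2": (["info", "info", "warn"], ["error", "error", "warn"]),
}
ranked = sorted(alerts, key=lambda a: impact_factor(*alerts[a]), reverse=True)
```

Under this assumption, an alert whose surrounding event messages do not change scores 0, and an alert followed by entirely new event types scores 1, which gives a bounded value for prioritization.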
-
Publication number: US20170364581A1
Publication date: 2017-12-21
Application number: US15184862
Filing date: 2016-06-16
Applicant: VMware, Inc.
CPC classification number: G06F16/287 , G06F11/3006 , G06F11/3452 , G06F16/24578 , G06F2201/815
Abstract: Methods and systems to evaluate the importance of metrics generated in a data center and rank the metrics in order of relevance to data center performance are described. Methods collect sets of metric data generated in a data center over a period of time and categorize each set of metric data as being of high importance, medium importance, or low importance. Methods also calculate a rank ordering of each set of high importance and medium importance metric data. By determining importance of data center metrics, an optimal usage and distribution of computational and storage resources of the data center may be determined.
-
64.
Publication number: US20150379110A1
Publication date: 2015-12-31
Application number: US14314490
Filing date: 2014-06-25
Applicant: VMware, Inc.
IPC: G06F17/30
CPC classification number: G06F11/3452 , G05B23/0227 , G06F11/0706 , G06F11/0754 , G06F11/3409 , G06F2201/81
Abstract: This disclosure is directed to automated methods and systems for calculating hard thresholds used to monitor time-series data generated by a data-generating entity. The methods are based on determining a cumulative distribution that characterizes the probability that data values of time-series data generated by the data-generating entity violate a hard threshold. The hard threshold is calculated as an inverse of the cumulative distribution based on a user defined risk confidence level. The hard threshold may then be used to generate alerts when time-series data generated later by the data-generating entity violate the hard threshold.
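Illustrative sketch (not taken from the patent): the abstract describes the hard threshold as the inverse of a cumulative distribution evaluated at a user-defined risk confidence level. The abstract does not say how the cumulative distribution is modeled, so the example below assumes the simplest case, an empirical distribution built from historical values. The function names `hard_threshold` and `violations` are hypothetical.

```python
import math


def hard_threshold(values, confidence):
    """Empirical inverse CDF: the smallest observed value x such that the
    fraction of historical values <= x is at least `confidence`."""
    if not values:
        raise ValueError("historical values are required")
    if not 0 < confidence <= 1:
        raise ValueError("confidence must be in (0, 1]")
    ordered = sorted(values)
    k = math.ceil(confidence * len(ordered))  # rank of the quantile
    return ordered[k - 1]


def violations(later_values, threshold):
    """Values observed later that exceed the threshold and would
    trigger an alert."""
    return [v for v in later_values if v > threshold]


history = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
threshold = hard_threshold(history, 0.5)   # median-level threshold
alerts = violations([4, 5, 6, 12], threshold)
```

A higher confidence level pushes the threshold toward the largest historical value, so fewer later values trigger alerts.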
-
65.
Publication number: US20250111251A1
Publication date: 2025-04-03
Application number: US18376378
Filing date: 2023-10-03
Applicant: VMware, Inc.
Inventor: Arnak Poghosyan , Ashot Nshan Harutyunyan , Eduard Amirkhanyan , Tigran Mkrtchyan , Avetik Havhannisyan , Vahe Minasyan , Hakob Arakelyan
IPC: G06N5/025
Abstract: Automated computer-implemented methods and systems for troubleshooting and resolving problems with objects of a cloud infrastructure are described herein. In response to detecting abnormal behavior of an object running in the cloud infrastructure based on a key performance indicator (“KPI”) of the object, a graphical user interface (“GUI”) is displayed to enable a user to select KPIs of components of the object. For each of the components, a separate rule learning engine is deployed to generate rules for detecting a problem with the component based on the KPI of the object and the KPIs of the component. The rules are subsequently used to detect a runtime problem with the object and display in the GUI remedial measures for resolving the problem. Remedial measures are automatically executed to resolve the problem with the object via the GUI.
-
Publication number: US11803440B2
Publication date: 2023-10-31
Application number: US17490340
Filing date: 2021-09-30
Applicant: VMware, Inc.
Inventor: Ashot Nshan Harutyunyan , Arnak Poghosyan
CPC classification number: G06F11/079 , G06F11/3447 , G06F11/3612 , G06N5/04
Abstract: Automated processes and systems troubleshoot and optimize performance of applications running in distributed computing systems. An automated computer-implemented process trains an inference model for an application based on metrics associated with the application and a key performance indicator (“KPI”) of the application. When a run-time performance problem is detected in run-time KPI values of the KPI, the trained inference model is applied to run-time metrics and run-time KPI values to identify relevant run-time metrics that can be used to identify the root cause of the performance problem. The root cause of the performance problem can be used to generate a recommendation for correcting the performance problem. An alert identifying the root cause of the performance problem and the recommendation for correcting the performance problem are displayed on an interface of a display, thereby enabling correction of the performance problem and optimization of the application.
-
Publication number: US20230229537A1
Publication date: 2023-07-20
Application number: US17577329
Filing date: 2022-01-17
Applicant: VMware, Inc.
Inventor: Ashot Nshan Harutyunyan , Nelli Aghajanyan , Lilit Harutyunyan , Arnak Poghosyan , Tigran Bunarjyan
CPC classification number: G06F11/0757 , G06N5/022 , G06F11/0766 , G06F11/0709
Abstract: The current document is directed to methods and systems that automatically generate training data for machine-learning-based components used by a metric-data processing-and-analysis component of a distributed computer system, a subsystem within a distributed computer system, or a standalone metric-data processing-and-analysis system. The training data sets are labeled using categorical KPI values. The machine-learning-based components are applied to metric data both for predicting anomalous operational behaviors and problems within the distributed computer system and for determination of potential causes of anomalous operational behaviors and problems within the distributed computer system. Training of machine-learning-based components is carried out concurrently and asynchronously with respect to other metric-data collection, aggregation, processing, storage, and analysis tasks.
-
68.
Publication number: US11481300B2
Publication date: 2022-10-25
Application number: US16391668
Filing date: 2019-04-23
Applicant: VMware, Inc.
Abstract: Automated processes and systems for detecting abnormally behaving objects of a distributed computing system are described. Processes and systems obtain metrics that are generated in a historical time window and are associated with an object of the distributed computing system. Processes and systems use the metrics to compute a time-dependent system indicator over the historical time window. Each value of the system indicator corresponds to a point in time of the historical time window when the object was in a normal or an abnormal state. Processes and systems use the normal and abnormal states of the system indicator in the historical time window to train a state classifier that is used to detect run-time abnormal behavior of the object. When the state classifier identifies abnormal behavior of the object, an alert is generated, indicating the abnormal behavior of the object.
-
Publication number: US20220291982A1
Publication date: 2022-09-15
Application number: US17374682
Filing date: 2021-07-13
Applicant: VMware, Inc.
Inventor: Arnak Poghosyan , Ashot Nshan Harutyunyan , Naira Movses Grigoryan , Clement Pang , George Oganesyan , Karen Avagyan
IPC: G06F11/07
Abstract: Computer-implemented methods and systems described herein perform intelligent sampling of application traces generated by an application. Computer-implemented methods and systems determine different sampling rates based on frequency of occurrence of normal traces and erroneous traces of the application. The sampling rates for low frequency normal and erroneous traces are larger than the sampling rates for high frequency normal and erroneous traces. The relatively larger sampling rates for low frequency traces ensure that low frequency traces are sampled in sufficient numbers and are not passed over during sampling of the application traces. The sampled normal and erroneous traces are stored in a data storage device.
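Illustrative sketch (not taken from the patent): the abstract states only that low frequency traces receive larger sampling rates than high frequency traces. A minimal way to obtain that property, assuming each trace can be mapped to a trace type (for example, a normal or erroneous path through the application), is to aim for roughly the same number of samples per type. The rate formula, the `target_per_type` parameter, and the function names are assumptions made for this example.

```python
import random
from collections import Counter


def sampling_rates(trace_types, target_per_type):
    """Per-type sampling rate: rare types are kept at up to 100%, while
    frequent types are down-sampled toward `target_per_type` traces."""
    counts = Counter(trace_types)
    return {t: min(1.0, target_per_type / n) for t, n in counts.items()}


def sample_traces(traces, type_of, target_per_type, seed=0):
    """Keep each trace with the sampling rate of its type."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    rates = sampling_rates([type_of(t) for t in traces], target_per_type)
    return [t for t in traces if rng.random() < rates[type_of(t)]]


# 1000 frequent normal traces and 5 rare erroneous traces.
traces = [("normal", i) for i in range(1000)] + [("error", i) for i in range(5)]
rates = sampling_rates([t[0] for t in traces], target_per_type=10)
sampled = sample_traces(traces, type_of=lambda t: t[0], target_per_type=10)
```

With these inputs the rare erroneous traces have a rate of 1.0 and are all retained, while the frequent normal traces have a rate of 0.01.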
-
Publication number: US11294758B2
Publication date: 2022-04-05
Application number: US15828133
Filing date: 2017-11-30
Applicant: VMware, Inc.
Inventor: Ashot Nshan Harutyunyan , Arnak Poghosyan , Naira Movses Grigoryan
IPC: G06F11/07
Abstract: Automated computational methods and systems to classify and troubleshoot problems in information technology (“IT”) systems or services provided by a distributed computing system are described. Each IT system of the distributed computing system or IT service provided by the distributed computing system has an associated key performance indicator (“KPI”) used to monitor performance of the IT system or service. When real-time KPI data violates a KPI threshold, a real-time event-type distribution is computed from event messages generated by event sources associated with the IT system or service following the threshold violation. The real-time event-type distribution is compared with historical event-type distributions recorded for the KPI data in order to identify the problem and execute remedial action to resolve the problem.
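Illustrative sketch (not taken from the patent): the abstract says a real-time event-type distribution is compared with historical event-type distributions to identify the problem, without naming the comparison. The example below assumes each historical distribution is stored under a problem label and uses total variation distance to find the closest match. The distance measure, the labels, and the function names are hypothetical.

```python
def total_variation(p, q):
    """Total variation distance between two event-type distributions,
    each given as a dict mapping event type to probability."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def closest_problem(realtime, historical):
    """Return (label, distance) of the historical distribution that is
    nearest to the real-time distribution."""
    if not historical:
        raise ValueError("no historical distributions recorded")
    scored = ((label, total_variation(realtime, dist))
              for label, dist in historical.items())
    return min(scored, key=lambda pair: pair[1])


# Hypothetical historical distributions recorded after past KPI violations.
historical = {
    "disk_full": {"io_error": 0.8, "timeout": 0.2},
    "network_partition": {"timeout": 0.9, "io_error": 0.1},
}
realtime = {"io_error": 0.7, "timeout": 0.3}
label, distance = closest_problem(realtime, historical)
```

The matched label could then be used to look up the remedial action that resolved the same problem in the past; a distance cutoff for treating a problem as previously unseen would be a further assumption.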
-