Methods and systems that sample log/event messages in a distributed log-analytics system

    Publication number: US11650868B2

    Publication date: 2023-05-16

    Application number: US17143203

    Filing date: 2021-01-07

    Applicant: VMWARE, INC.

    CPC classification number: G06F9/546 G06F9/542

    Abstract: The current document is directed to methods and systems that sample log/event messages for downstream processing by log/event-message systems incorporated within distributed computer facilities. The data-collection, data-storage, and data-querying functionalities of log/event-message systems provide a basis for distributed log-analytics systems which, in turn, provide a basis for automated and semi-automated system-administration-and-management systems. By sampling log/event-messages, rather than processing and storing every log/event-message generated within a distributed computer system, a log/event-message system significantly decreases data-storage-capacity, computational-bandwidth, and networking-bandwidth overheads involved in processing and retaining large numbers of log/event messages that do not provide sufficient useful information to justify these costs. Increase in efficiencies of log/event-message systems obtained by sampling translate directly into increases in bandwidths of distributed computer systems, in general, and to increases in time periods during which useful log/event messages can be stored.

    AUTOMATED LOG/EVENT-MESSAGE MASKING IN A DISTRIBUTED LOG-ANALYTICS SYSTEM

    Publication number: US20220179991A1

    Publication date: 2022-06-09

    Application number: US17115197

    Filing date: 2020-12-08

    Applicant: VMware, Inc.

    Abstract: The current document is directed to methods and systems that efficiently and accurately process log/event messages generated within distributed computer facilities. Various different types of initial processing steps may be applied to a stream of log/event messages received by a message-collector system and/or a message-ingestion-and-processing system, including masking sensitive fields to prevent exposure of confidential and sensitive information contained in log/event messages. Rule-based identification and masking of sensitive fields in log/event messages is currently provided by certain automated log/event-message systems, but current approaches suffer numerous deficiencies. The methods and systems to which the current document is directed automatically create sensitive-field dictionaries and associated logic and/or train machine-learning components to automatically identify and mask fields within log/event messages in order to address the deficiencies of traditional rule-based sensitive-field identification and masking.
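The abstract above mentions sensitive-field dictionaries used to mask fields in log/event messages. As a rough illustration only, the following Python sketch applies an already-built dictionary to a message. It does not show how the dictionary is created automatically or how a machine-learning component would be trained. The `key=value` message layout, the function name `mask_message`, and the mask string are assumptions made for this example, not details taken from the patent application.

```python
import re
from typing import Set

# Assumed message layout: whitespace-separated key=value tokens.
FIELD = re.compile(r"(?P<key>[A-Za-z_][\w.-]*)=(?P<value>\S+)")


def mask_message(message: str, sensitive_fields: Set[str], mask: str = "****") -> str:
    """Replace the value of every field whose name is in the dictionary."""

    def _replace(match: "re.Match[str]") -> str:
        key = match.group("key")
        if key.lower() in sensitive_fields:
            return f"{key}={mask}"
        # Fields not listed in the dictionary pass through unchanged.
        return match.group(0)

    return FIELD.sub(_replace, message)


if __name__ == "__main__":
    # Hypothetical dictionary and message, for illustration only.
    print(mask_message("login user=alice password=hunter2 status=ok",
                       {"user", "password"}))
```

A lookup keyed on field names is the simplest possible form of dictionary-driven masking; a real system would also need to handle structured formats and sensitive values that appear without a field name.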

    ENHANCED LEARNING WITH FEEDBACK LOOP FOR MACHINE READING COMPREHENSION MODELS

    Publication number: US20200320429A1

    Publication date: 2020-10-08

    Application number: US16423201

    Filing date: 2019-05-28

    Applicant: VMWARE, INC.

    Abstract: The present disclosure provides an approach for training a machine learning model by first training the model on a generic dataset and then iteratively training the model on “easy” domain specific training data before moving on to “difficult” domain specific training data. Inputs of a domain-specific dataset are run on the generically-trained model to determine which inputs generate an accuracy score above a threshold. The inputs with an accuracy score above a threshold are used to retrain the model, along with the corresponding outputs. The retraining continues until all domain specific dataset has been used to train the model, or until no remaining inputs of the domain specific dataset generate an accuracy score, when run on the model, that is above a threshold.
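The retraining loop described in the abstract above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the method claimed in the patent: `ToyModel`, its word-overlap accuracy score, and its retraining rule are hypothetical stand-ins chosen only so that the loop runs without a machine-learning library. The part that follows the abstract is the control flow: score the remaining inputs, retrain on those above the threshold together with their outputs, and stop when the data is exhausted or no remaining input scores above the threshold.

```python
from typing import List, Sequence, Tuple

Example = Tuple[str, str]  # (input text, expected output)


class ToyModel:
    """Hypothetical stand-in for a model: it 'knows' a set of words."""

    def __init__(self, generic_vocabulary: Sequence[str]) -> None:
        # Stands in for the initial training on a generic dataset.
        self.vocabulary = set(generic_vocabulary)

    def accuracy(self, example: Example) -> float:
        # Assumed scoring rule: fraction of input words already known.
        words = example[0].split()
        if not words:
            return 0.0
        return sum(w in self.vocabulary for w in words) / len(words)

    def retrain(self, examples: List[Example]) -> None:
        # Assumed training rule: learn every word in the given inputs.
        for text, _expected in examples:
            self.vocabulary.update(text.split())


def train_with_feedback_loop(
    model: ToyModel, domain_data: Sequence[Example], threshold: float
) -> Tuple[List[List[Example]], List[Example]]:
    """Return the batch used in each round and the examples never used."""
    remaining = list(domain_data)
    rounds: List[List[Example]] = []
    while remaining:
        easy: List[Example] = []
        hard: List[Example] = []
        for example in remaining:
            (easy if model.accuracy(example) > threshold else hard).append(example)
        if not easy:
            break  # No remaining input scores above the threshold.
        model.retrain(easy)  # Inputs are used together with their outputs.
        rounds.append(easy)
        remaining = hard
    return rounds, remaining
```

Each round can make previously "difficult" inputs "easy", because the model has been retrained on the prior batch before the remaining inputs are scored again.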

    Methods and systems that rank and display log/event messages and transactions

    Publication number: US11500713B2

    Publication date: 2022-11-15

    Application number: US17133479

    Filing date: 2020-12-23

    Applicant: VMWARE, INC.

    Abstract: Methods and systems that automatically rank log/event messages and log/event-message transactions to facilitate analysis of log/event-messages generated within distributed-computer systems are disclosed. A base-window dataset and current-window dataset are selected for diagnosis of a particular error or failure and processed to generate a transaction sequence for each dataset corresponding to log/event-message traces identified in the datasets. Then, frequencies of occurrence of log/event-message types relative to transaction types are generated for each dataset. From these two sets of relative frequencies of occurrence, changes in the relative frequency of occurrence for each log/event-message-type/transaction-type pair are generated. Normalized scores for log/event-message-type/transaction-type pairs and scores for transaction types are then generated from the changes in the relative frequency of occurrence. The generated scores reflect the relevance of log/event-messages in traces corresponding to particular transaction as well as the relevance of transaction types to the error or failure.
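The scoring pipeline in the abstract above can be illustrated with a short Python sketch. The abstract does not give the formulas, so the ones used here are assumptions: the relative frequency of a message type is its count divided by the number of traces of the transaction type, the change is the absolute difference between the two windows, pair scores are normalized by the largest change, and a transaction-type score is the mean of its pair scores. The names `relative_frequencies` and `rank` are likewise hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Trace = Tuple[str, List[str]]  # (transaction type, message types in the trace)
Pair = Tuple[str, str]         # (message type, transaction type)


def relative_frequencies(traces: List[Trace]) -> Dict[Pair, float]:
    """Occurrences of each message type per trace of its transaction type."""
    txn_counts = Counter(txn for txn, _ in traces)
    pair_counts = Counter((msg, txn) for txn, msgs in traces for msg in msgs)
    return {pair: n / txn_counts[pair[1]] for pair, n in pair_counts.items()}


def rank(
    base: List[Trace], current: List[Trace]
) -> Tuple[Dict[Pair, float], Dict[str, float]]:
    """Score pairs and transaction types by change between the two windows."""
    base_f = relative_frequencies(base)
    cur_f = relative_frequencies(current)
    # Assumed change measure: absolute difference in relative frequency.
    changes = {
        pair: abs(cur_f.get(pair, 0.0) - base_f.get(pair, 0.0))
        for pair in set(base_f) | set(cur_f)
    }
    # Assumed normalization: divide by the largest change observed.
    top = max(changes.values(), default=0.0)
    pair_scores = {p: (c / top if top else 0.0) for p, c in changes.items()}
    # Assumed transaction score: mean of the scores of its pairs.
    by_txn: Dict[str, List[float]] = defaultdict(list)
    for (_msg, txn), score in pair_scores.items():
        by_txn[txn].append(score)
    txn_scores = {txn: sum(v) / len(v) for txn, v in by_txn.items()}
    return pair_scores, txn_scores
```

With these assumptions, a message type that appears only in the current window for a given transaction type receives a high score, while pairs whose relative frequency is unchanged score zero.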

    METHOD AND SUBSYSTEM OF A DISTRIBUTED LOG-ANALYTICS SYSTEM THAT AUTOMATICALLY DETERMINE THE SOURCE OF LOG/EVENT MESSAGES

    Publication number: US20220318202A1

    Publication date: 2022-10-06

    Application number: US17222050

    Filing date: 2021-04-05

    Applicant: VMware, Inc.

    Abstract: The current document is directed to methods and subsystems within distributed log-analytics systems that automatically and autonomously generate indications of log sources for log/event messages received by the distributed log-analytics systems. The log-source indications can be incorporated in tags associated with received log/event messages to facilitate use of log/event-message information and log/event-message-processing tools contained in content packs provided by designers, manufacturers, and vendors of computational entities by log/event-message systems that collect, process, and store large volumes of log/event messages generated by many different types of computational entities within distributed computer systems. Log-source indications are generated by a combination of using currently available log-source indications associated with log/event messages, event-type-clustering based event-type-to-log source mapping, and machine-learning-based event-type-to-log source mapping.

    Enhanced learning with feedback loop for machine reading comprehension models

    Publication number: US11151478B2

    Publication date: 2021-10-19

    Application number: US16423201

    Filing date: 2019-05-28

    Applicant: VMWARE, INC.

    Abstract: The present disclosure provides an approach for training a machine learning model by first training the model on a generic dataset and then iteratively training the model on “easy” domain specific training data before moving on to “difficult” domain specific training data. Inputs of a domain-specific dataset are run on the generically-trained model to determine which inputs generate an accuracy score above a threshold. The inputs with an accuracy score above a threshold are used to retrain the model, along with the corresponding outputs. The retraining continues until all domain specific dataset has been used to train the model, or until no remaining inputs of the domain specific dataset generate an accuracy score, when run on the model, that is above a threshold.
