High volume message classification and distribution
摘要:
A log message classifier employs machine learning for identifying a corresponding parser for interpreting the incoming log message and for retraining a classification logic model processing the incoming log messages. Voluminous log messages generate a large amount of data, typically in a text form. Data fields are parseable from the message by a parser that knows a format of the message. The classification logic is trained by a set of messages having a known format for defining groups of messages recognizable by a corresponding parser. The classification logic is defined by a random forest that outputs a corresponding group and confidence value for each incoming message. Groups may be split to define new groups based on a recurring matching tail (latter portion) of the incoming messages. A trend of decreased confidence scores triggers a periodic retraining of the random forest, and may also generate an alert to operators.
信息查询
0/0