Invention Grant
- Patent Title: Parsing of unstructured log data into structured data and creation of schema
- Application No.: US16246765
- Application Date: 2019-01-14
- Publication No.: US11372868B2
- Publication Date: 2022-06-28
- Inventor: Rod Reddekopp, Andrew Brownsword, Manel Fernandez Gomez, Juan Fernandez Peinador
- Applicant: Oracle International Corporation
- Applicant Address: US CA Redwood Shores
- Assignee: Oracle International Corporation
- Current Assignee: Oracle International Corporation
- Current Assignee Address: US CA Redwood Shores
- Agency: Hickman Becker Bingham Ledesma LLP
- Agent: Brian N. Miller
- Main IPC: G06F16/2458
- IPC: G06F16/2458; G06F16/21; G06F16/17; G06N3/04; G06N3/08; G06F40/284; G06N20/20

Abstract:
Herein are techniques for training a parser by categorizing and generalizing messages and abstracting message templates for parsing after training. In an embodiment, a computer generates a message signature based on a message sequence of tokens that were extracted from a training message. The message signature is matched to a cluster signature that represents messages of one of many clusters that have distinct signatures. The training message is added to the cluster. Based on a data type of the cluster signature, a value is extracted from a second message, such as a live message after training. Fuzzy signatures may be probabilistically matched to select a best matching cluster for a message. The value range of a token may be broadened or narrowed by adding or removing candidate data types, by adding or removing literals to a data type, and/or by promoting a narrow data type to a broader data type.
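The flow described in the abstract (tokenize a training message, derive a signature, match it to a cluster signature, then use the signature's data types to extract values from a later message) can be illustrated with a minimal sketch. This is not the patented implementation: the names `classify`, `message_signature`, `Cluster`, `train`, and `extract_values`, the three data types, whitespace tokenization, and exact (non-fuzzy) signature matching are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical casts for the typed (variable) slots of a signature.
CASTS = {"INT": int, "FLOAT": float}


def classify(token):
    """Map one token to a (kind, literal) pair; the kinds are illustrative."""
    if re.fullmatch(r"\d+", token):
        return ("INT", None)
    if re.fullmatch(r"\d+\.\d+", token):
        return ("FLOAT", None)
    return ("LITERAL", token)


def message_signature(message):
    """Tokenize on whitespace and replace variable tokens with their type."""
    return tuple(classify(token) for token in message.split())


@dataclass
class Cluster:
    signature: tuple
    messages: list = field(default_factory=list)


def train(messages):
    """Group training messages by signature (exact match only in this sketch)."""
    clusters = {}
    for message in messages:
        signature = message_signature(message)
        cluster = clusters.setdefault(signature, Cluster(signature))
        cluster.messages.append(message)
    return clusters


def extract_values(cluster, message):
    """Use the cluster signature's data types to pull typed values from a message.

    Returns None when the message does not fit the cluster's signature.
    """
    tokens = message.split()
    if len(tokens) != len(cluster.signature):
        return None
    values = []
    for (kind, literal), token in zip(cluster.signature, tokens):
        if kind == "LITERAL":
            if token != literal:
                return None
        else:
            if classify(token)[0] != kind:
                return None
            values.append(CASTS[kind](token))
    return values


clusters = train([
    "connected to port 8080",
    "connected to port 443",
    "load average 0.75",
])
port_cluster = clusters[message_signature("connected to port 1")]
live_values = extract_values(port_cluster, "connected to port 22")
```

In this sketch the two "connected to port" messages land in one cluster, and the cluster's `INT` slot drives extraction from the later message. The probabilistic fuzzy matching and the broadening or narrowing of token data types mentioned in the abstract are deliberately left out.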
Public/Granted literature
- US20200226214A1 PARSING OF UNSTRUCTURED LOG DATA INTO STRUCTURED DATA AND CREATION OF SCHEMA Public/Granted day: 2020-07-16