Abstract:
The technology disclosed relates to formulating and refining field extraction rules that are used at query time on raw data with a late-binding schema. The field extraction rules identify portions of the raw data, as well as their data types and hierarchical relationships. These extraction rules are executed against very large data sets not organized into relational structures that have not been processed by standard extraction or transformation methods. By using sample events, a focus on primary and secondary example events help formulate either a single extraction rule spanning multiple data formats, or multiple rules directed to distinct formats. Selection tools mark up the example events to indicate positive examples for the extraction rules, and to identify negative examples to avoid mistaken value selection. The extraction rules can be saved for query-time use, and can be incorporated into a data model for sets and subsets of event data.
Abstract:
Embodiments are directed towards real time display of event records and extracted values based on at least one extraction rule, such as a regular expression. A user interface may be employed to enable a user to have an extraction rule automatically generate and/or to manually enter an extraction rule. The user may be enabled to manually edit a previously provided extraction rule, which may result in real time display of updated extracted values. The extraction rule may be utilized to extract values from each of a plurality of records, including event records of unstructured machine data. Statistics may be determined for each unique extracted value, and may be displayed to the user in real time. The user interface may also enable the user to select at least one unique extracted value to display those event records that include an extracted value that matches the selected value.
Abstract:
The technology disclosed relates to formulating and refining field extraction rules that are used at query time on raw data with a late-binding schema. The field extraction rules identify portions of the raw data, as well as their data types and hierarchical relationships. These extraction rules are executed against very large data sets not organized into relational structures that have not been processed by standard extraction or transformation methods. By using sample events, a focus on primary and secondary example events help formulate either a single extraction rule spanning multiple data formats, or multiple rules directed to distinct formats. Selection tools mark up the example events to indicate positive examples for the extraction rules, and to identify negative examples to avoid mistaken value selection. The extraction rules can be saved for query-time use, and can be incorporated into a data model for sets and subsets of event data.
Abstract:
Embodiments are directed towards real time display of event records and extracted values based on at least one extraction rule, such as a regular expression. A user interface may be employed to enable a user to have an extraction rule automatically generate and/or to manually enter an extraction rule. The user may be enabled to manually edit a previously provided extraction rule, which may result in real time display of updated extracted values. The extraction rule may be utilized to extract values from each of a plurality of records, including event records of unstructured machine data. Statistics may be determined for each unique extracted value, and may be displayed to the user in real time. The user interface may also enable the user to select at least one unique extracted value to display those event records that include an extracted value that matches the selected value.