Generation of a data model for searching machine data
    31.
    发明授权
    Generation of a data model for searching machine data 有权
    生成用于搜索机器数据的数据模型

    公开(公告)号:US08983994B2

    公开(公告)日:2015-03-17

    申请号:US14067203

    申请日:2013-10-30

    Applicant: Splunk Inc.

    Abstract: Embodiments include generating data models that may give semantic meaning for unstructured or structured data that may include data generated and/or received by search engines, including a time series engine. A method includes generating a data model for data stored in a repository. Generating the data model includes generating an initial query string, executing the initial query string on the data, generating an initial result set based on the initial query string being executed on the data, determining one or more candidate fields from one or results of the initial result set, generating a candidate data model based on the one or more candidate fields, iteratively modifying the candidate data model until the candidate data model models the data, and using the candidate data model as the data model. The method further includes generating a new query string using the data model, executing the new query string on the data, and generating a new result set based on the new query string being executed on the data.

    Abstract translation: 实施例包括生成可以给非结构化或结构化数据赋予语义意义的数据模型,其可以包括由搜索引擎(包括时间序列引擎)生成和/或接收的数据。 一种方法包括为存储在存储库中的数据生成数据模型。 生成数据模型包括生成初始查询字符串,对数据执行初始查询字符串,基于对数据执行的初始查询字符串生成初始结果集,从一个或多个初始查询字符串的结果确定一个或多个候选字段 生成基于一个或多个候选字段的候选数据模型,迭代地修改候选数据模型,直到候选数据模型对数据建模,并使用候选数据模型作为数据模型。 该方法还包括使用数据模型生成新的查询字符串,对数据执行新的查询字符串,并且基于对数据执行的新查询字符串生成新的结果集。

    METADATA TRACKING FOR A PIPELINED SEARCH LANGUAGE (DATA MODELING FOR FIELDS)
    32.
    发明申请
    METADATA TRACKING FOR A PIPELINED SEARCH LANGUAGE (DATA MODELING FOR FIELDS) 审中-公开
    用于管道搜索语言的元数据跟踪(数据建模)

    公开(公告)号:US20140214807A1

    公开(公告)日:2014-07-31

    申请号:US14068651

    申请日:2013-10-31

    Applicant: Splunk Inc.

    Abstract: Embodiments are directed towards determining and tracking metadata for the generation of visualizations of requested data. A user may request data by providing a query that may be employed to search for the requested data. The query may include a plurality of commands, which may be employed in a pipeline to perform the search and to generate a table of the requested data. In some embodiments, each command may be executed to perform an action on a set of data. The execution of a command may generate one or more columns to append and/or insert into the table of requested data. Metadata for each generated column may be determined based on the actions performed by executing the commands. The table of requested data and the column metadata may be employed to generate and display a visualization of at least a portion of the requested data to a user.

    Abstract translation: 实施例旨在确定和跟踪用于生成所请求数据的可视化的元数据。 用户可以通过提供可用于搜索所请求的数据的查询来请求数据。 该查询可以包括多个命令,其可以在流水线中用于执行搜索并生成所请求的数据的表。 在一些实施例中,可以执行每个命令以对一组数据执行动作。 命令的执行可以生成一个或多个列来附加和/或插入到所请求的数据的表中。 可以基于通过执行命令执行的动作来确定每个生成的列的元数据。 可以使用所请求的数据和列元数据的表来生成并向用户显示所请求的数据的至少一部分的可视化。

    Parallelization of collection queries

    公开(公告)号:US11163738B2

    公开(公告)日:2021-11-02

    申请号:US16451450

    申请日:2019-06-25

    Applicant: SPLUNK INC.

    Abstract: Embodiments are directed are towards the parallelization of collection queries. A method of parallelizing collection queries comprises providing a field searchable data store comprising a plurality of field searchable time stamped event records. The method further comprises receiving, at a search head, a collection query that references a field name that identifies portions of one or more event records to be summarized. Further, the method comprises determining if the collection query can be concurrently executed on a first plurality of indexers, wherein the search head is configured to communicate with the first plurality of indexers, and wherein each indexer of the first plurality of indexers comprises one or more field searchable time stamped event records. Responsive to an affirmative determination, the method also comprises determining a second plurality of indexers relevant to the collection query and executing the collection query to generate a respective summarization table at each indexer.

    Identifying similar field sets using related source types

    公开(公告)号:US10949420B2

    公开(公告)日:2021-03-16

    申请号:US16050487

    申请日:2018-07-31

    Applicant: SPLUNK INC.

    Abstract: Embodiments of the present invention are directed to identifying related data, in particular, data associated with different source types. In embodiments, a first source type related to a second source type associated with a search query is identified. Field set pairs are identified from a first data set associated with the first source type and a second data set associated with the second source type. Each field set pair can include one field set associated with the first source type and another field set associated with the second source type. For each field set pair, an extent of similarity is determined between the corresponding field sets. Based on the extent of similarities between the corresponding field sets, at least one pair of related field sets is identified. An indication of the at least one pair of related field sets is provided, for example, for presentation to a user.

    PROVIDING SIMILAR FIELD SETS BASED ON RELATED SOURCE TYPES

    公开(公告)号:US20200042651A1

    公开(公告)日:2020-02-06

    申请号:US16050616

    申请日:2018-07-31

    Applicant: SPLUNK INC.

    Abstract: Embodiments of the present invention are directed to identifying and providing related data field sets. In one embodiment, a first portion of a graphical user interface (GUI) configured to receive a search query is displayed. The GUI enables user interaction to specify a source type in association with the search query. In accordance with a first source type specified in the search query, a first field set associated with the first source type is identified as related to a second field set associated with a second source type. A second portion of the GUI is displayed that includes a relationship indication that indicates the first field set associated with the first source type is related to the second field set associated with a second source type. Further, a third portion of the GUI is displayed that includes an explanation or recommendation associated with the relationship indication.

    PARALLELIZATION OF COLLECTION QUERIES
    38.
    发明申请

    公开(公告)号:US20190332590A1

    公开(公告)日:2019-10-31

    申请号:US16451450

    申请日:2019-06-25

    Applicant: SPLUNK INC.

    Abstract: Embodiments are directed are towards the parallelization of collection queries. A method of parallelizing collection queries comprises providing a field searchable data store comprising a plurality of field searchable time stamped event records. The method further comprises receiving, at a search head, a collection query that references a field name that identifies portions of one or more event records to be summarized. Further, the method comprises determining if the collection query can be concurrently executed on a first plurality of indexers, wherein the search head is configured to communicate with the first plurality of indexers, and wherein each indexer of the first plurality of indexers comprises one or more field searchable time stamped event records. Responsive to an affirmative determination, the method also comprises determining a second plurality of indexers relevant to the collection query and executing the collection query to generate a respective summarization table at each indexer.

    Collection query driven generation of summarization information for raw machine data

    公开(公告)号:US10387396B2

    公开(公告)日:2019-08-20

    申请号:US15705875

    申请日:2017-09-15

    Applicant: SPLUNK INC.

    Abstract: Embodiments are directed are towards the transparent summarization of events. Queries directed towards summarizing and reporting on event records may be received at a search head. Search heads may be associated with one more indexers containing event records. The search head may forward the query to the indexers the can resolve the query for concurrent execution. If a query is a collection query, indexers may generate summarization information based on event records located on the indexers. Event record fields included in the summarization information may be determined based on terms included in the collection query. If a query is a stats query, each indexer may generate a partial result set from previously generated summarization information, returning the partial result sets to the search head. Collection queries may be saved and scheduled to run and periodically update the summarization information.

    Generating and storing summarization tables for sets of searchable events

    公开(公告)号:US09990386B2

    公开(公告)日:2018-06-05

    申请号:US14815973

    申请日:2015-08-01

    Applicant: Splunk Inc.

    Abstract: Embodiments are directed are towards the transparent summarization of events. Queries directed towards summarizing and reporting on event records may be received at a search head. Search heads may be associated with one more indexers containing event records. The search head may forward the query to the indexers the can resolve the query for concurrent execution. If a query is a collection query, indexers may generate summarization information based on event records located on the indexers. Event record fields included in the summarization information may be determined based on terms included in the collection query. If a query is a stats query, each indexer may generate a partial result set from previously generated summarization information, returning the partial result sets to the search head. Collection queries may be saved and scheduled to run and periodically update the summarization information.

Patent Agency Ranking