PARTITIONING AND REDUCING RECORDS AT INGEST OF A WORKER NODE

    Publication No.: US20190258637A1

    Publication Date: 2019-08-22

    Application No.: US16397970

    Application Date: 2019-04-29

    Applicant: Splunk Inc.

    Abstract: Systems and methods are described for partitioning and reducing records at ingest of a worker node. The worker node receives chunks of data from one or more indexers of a data intake and query system based on the execution of a query by the data intake and query system. The worker node assigns records to different record groups based on the content of the records. The system also assigns the record to a partition of a group of partitions. Record data of the records in a particular partition is combined. The system processes the partitions based on the query.
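
The grouping, partitioning, and combining steps in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name, the field names, the CRC32-based partition choice, and the use of summation as the combining step are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict


def partition_and_reduce(records, key_field, value_field, num_partitions):
    """Group records by content, assign each group to a partition, and combine values.

    Hypothetical helper: hashing the group key and summing the values are
    illustrative choices, not details taken from the abstract.
    """
    partitions = [defaultdict(int) for _ in range(num_partitions)]
    for record in records:
        # The record group is determined by the record's content (one field here).
        group_key = str(record[key_field])
        # A stable hash keeps every record of a group in the same partition.
        index = zlib.crc32(group_key.encode("utf-8")) % num_partitions
        # Combine record data inside the partition as records arrive.
        partitions[index][group_key] += record[value_field]
    return [dict(partition) for partition in partitions]


records = [
    {"host": "a", "count": 1},
    {"host": "b", "count": 5},
    {"host": "a", "count": 2},
]
reduced = partition_and_reduce(records, "host", "count", num_partitions=4)
```

Because the reduction happens as each record is ingested, a worker in this sketch holds one combined entry per group rather than every raw record.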

    Determining Records Generated by a Processing Task of a Query

    Publication No.: US20190258635A1

    Publication Date: 2019-08-22

    Application No.: US16398044

    Application Date: 2019-04-29

    Applicant: Splunk Inc.

    Abstract: Systems and methods are described for determining a quantity of records generated by a processing task of a query executed in a data intake and query system. The system receives a query and identifies a processing task of the query and a quantity of records to be processed according to the query. The system determines the number of records generated by the processing task based on the number of records to be processed and a record generation estimate. The system can allocate compute resources or determine a query execution time for at least a portion of the query based on the determined quantity of records generated.
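
As a rough illustration of the estimate described above, the sketch below treats the record generation estimate as a simple ratio of output records to input records and sizes a worker pool from the result. The function names, the ratio interpretation, and the per-worker capacity parameter are assumptions, not details from the abstract.

```python
import math


def estimate_generated_records(records_to_process, generation_ratio):
    """Estimate how many records a processing task emits.

    generation_ratio is a hypothetical output-per-input estimate for the task
    (for example, 0.25 for a filter that keeps about a quarter of its input).
    """
    return math.ceil(records_to_process * generation_ratio)


def workers_needed(generated_records, records_per_worker):
    """Size a worker pool from the estimate; always request at least one worker."""
    return max(1, math.ceil(generated_records / records_per_worker))


generated = estimate_generated_records(1_000_000, 0.25)
workers = workers_needed(generated, records_per_worker=100_000)
```

The same estimate could feed an execution-time prediction in place of a worker count, for instance by dividing it by an assumed throughput.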

    GENERATING A SUBQUERY FOR AN EXTERNAL DATA SYSTEM USING A CONFIGURATION FILE

    Publication No.: US20190147086A1

    Publication Date: 2019-05-16

    Application No.: US16147165

    Application Date: 2018-09-28

    Applicant: Splunk Inc.

    Abstract: Systems and methods are disclosed for receiving, at a data intake and query system, a query that includes an indication to process data managed by a third-party data storage and processing system that supports a different query language than the data intake and query system. The data intake and query system identifies a third-party data storage and processing system that manages the data to be processed and generates a subquery for execution by the third-party data storage and processing system, generates instructions for one or more worker nodes to receive and process results of the subquery from the third-party data storage and processing system, and instructs the worker nodes to provide results of the processing to the data intake and query system.
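
One way to picture a configuration-driven subquery is a per-system template that is filled in with values taken from the incoming query. The sketch below assumes exactly that. The configuration layout, the system name `example_sql_store`, and the template fields are hypothetical, and the values are assumed to be trusted.

```python
from string import Template

# Hypothetical configuration: one entry per external system, each holding a
# subquery template written in that system's own query language.
CONFIG = {
    "example_sql_store": {
        "subquery_template": "SELECT $fields FROM $table WHERE $condition",
    },
}


def generate_subquery(config, system_name, **params):
    """Fill the external system's template with values derived from the query."""
    entry = config[system_name]
    # substitute() raises KeyError if a placeholder has no value, so an
    # incomplete subquery is never produced silently.
    return Template(entry["subquery_template"]).substitute(**params)


subquery = generate_subquery(
    CONFIG,
    "example_sql_store",
    fields="host, count",
    table="events",
    condition="count > 10",
)
```

Keeping the template in configuration means support for another external system can be added in this sketch without changing the generation code.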

    EXECUTION OF A QUERY RECEIVED FROM A DATA INTAKE AND QUERY SYSTEM

    Publication No.: US20190138642A1

    Publication Date: 2019-05-09

    Application No.: US16051310

    Application Date: 2018-07-31

    Applicant: Splunk Inc.

    Abstract: Systems and methods are disclosed for receiving and executing a query received from a data intake and query system and providing results to a first group of worker nodes in a distributed execution environment. The query identifies a set of data to be processed and a manner of processing the set of data. Based on the query, the system defines a query processing scheme, and generates instructions for a second group of worker nodes to obtain the set of data from one or more dataset sources and to process the set of data. The system communicates results of the query to the first group of worker nodes.

    MULTI-PARTITION OPERATION IN COMBINATION OPERATIONS

    Publication No.: US20190095493A1

    Publication Date: 2019-03-28

    Application No.: US15713976

    Application Date: 2017-09-25

    Applicant: Splunk Inc.

    Abstract: In an environment where multiple datasets are to be combined, systems and methods are disclosed for allocating a group of data entries from at least one dataset into multiple partitions. For a particular partition, the subgroup in the partition can be combined with data entries from the other dataset. In some cases, groups of data entries from each dataset are assigned to different partitions. For a particular partition, a subgroup is duplicated, some of the data entries of the subgroup are reassigned to other partitions, the subgroup is reformed to include data entries from other partitions, and the reformed subgroup is combined with the subgroup from the other dataset(s).
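
The simpler case in this abstract, where entries from each dataset are assigned to partitions and combined partition by partition, resembles a hash-partitioned join. The sketch below shows only that case and does not attempt the duplication and reassignment steps. The function name, the CRC32 partitioning, and the dictionary-merge combine step are assumptions for illustration.

```python
import zlib
from collections import defaultdict


def partitioned_join(left, right, key, num_partitions):
    """Combine two datasets partition by partition on a shared key (a sketch)."""

    def partition_of(value):
        # A stable hash sends equal keys from both datasets to the same partition.
        return zlib.crc32(str(value).encode("utf-8")) % num_partitions

    left_parts = defaultdict(list)
    right_parts = defaultdict(list)
    for row in left:
        left_parts[partition_of(row[key])].append(row)
    for row in right:
        right_parts[partition_of(row[key])].append(row)

    joined = []
    for index in range(num_partitions):
        # Inside one partition, index the right-hand rows by key for fast matching.
        lookup = defaultdict(list)
        for row in right_parts[index]:
            lookup[row[key]].append(row)
        for row in left_parts[index]:
            for match in lookup.get(row[key], []):
                joined.append({**match, **row})
    return joined


left = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}]
right = [{"id": 1, "b": "p"}, {"id": 3, "b": "q"}]
result = partitioned_join(left, right, "id", num_partitions=4)
```

Each partition can be processed independently of the others, which is what would allow the combine step to be spread across workers.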

    GENERATING A DISTRIBUTED EXECUTION MODEL WITH UNTRUSTED COMMANDS

    Publication No.: US20190095491A1

    Publication Date: 2019-03-28

    Application No.: US15714424

    Application Date: 2017-09-25

    Applicant: Splunk Inc.

    Abstract: Systems and methods are disclosed for generating a distributed execution model with untrusted commands. The system can receive a query, and process the query to identify the untrusted commands. The system can use data associated with the untrusted command to identify one or more files associated with the untrusted command. Based on the files, the system can generate a data structure and include one or more identifiers associated with the data structure in the distributed execution model. The system can distribute the distributed execution model to one or more nodes in a distributed computing environment for execution.
