-
公开(公告)号:US11461334B2
公开(公告)日:2022-10-04
申请号:US15665197
申请日:2017-07-31
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee , Sourav Pal , Alexander Douglas James , Christopher Pride
IPC: G06F16/2455 , H04L43/08 , G06F11/20 , H04L43/12 , H04L69/22 , H04L67/1097 , G06F16/27 , G06F16/951 , G06F16/2458 , G06F16/903 , H04L43/028 , H04L43/00 , G06F3/06 , G06F11/34
Abstract: Systems and methods are disclosed for processing queries against one or more dataset sources utilizing dynamically allocated partitions operating on one or more worker nodes. The results of the processing are stored in a dataset destination. The queries can identify data in the one or more dataset sources for processing and a manner for processing the data. In addition, the queries can identify the dataset destination for storing results of the query. To process the query, a query coordinator can dynamically allocate partitions operating on worker nodes to retrieve data for processing, process the data, and communicate the data to the dataset sources. In addition, the query coordinator can dynamically allocate partitions based on an identification of the dataset destination.
-
公开(公告)号:US11341131B2
公开(公告)日:2022-05-24
申请号:US16398031
申请日:2019-04-29
Applicant: Splunk Inc.
Inventor: Sourav Pal , Arindam Bhattacharjee , Nikhil Roy
IPC: G06F16/00 , G06F16/2453 , G06F16/242 , G06F16/2458 , G06F16/17
Abstract: Systems and methods are described for scheduling a query for execution. The system receives and parses a query to identify one or more portions of the query. The system determines a resource allocation for each portion of the query, and determines an availability of compute resources for the different portions of the query. Based on the resource allocation and the availability of compute resources, the system schedules the query.
-
公开(公告)号:US11294941B1
公开(公告)日:2022-04-05
申请号:US16000688
申请日:2018-06-05
Applicant: Splunk Inc.
Inventor: Eric Sammer , Sourav Pal , Joseph Gabriel Echeverria
IPC: G06F16/00 , G06F16/31 , G06F16/38 , G06F16/951 , G06F16/33
Abstract: Systems and methods are described for preprocessing data later ingested into an indexing system. The preprocessing can include receiving messages published to a first publish-subscribe messaging system, the messages containing raw machine data generated by one or more components in an information technology environment, performing one or more processing operations on at least some of the messages to generate preprocessed messages, republishing the preprocessed messages to a second publish-subscribe messaging system, and providing to the indexing system, a subset of the messages from the second publish-subscribe messaging system.
-
公开(公告)号:US11243963B2
公开(公告)日:2022-02-08
申请号:US16051223
申请日:2018-07-31
Applicant: Splunk Inc.
Inventor: Sourav Pal , Arindam Bhattacharjee
IPC: G06F17/00 , G06F16/2458 , H04L29/08 , G06F16/25 , G06F16/9032 , G06F16/903
Abstract: Systems and methods are disclosed for executing a query that includes an indication to process data managed by an external data system. The system identifies the external data system that manages the data to be processed, and generates a subquery for the external data system indicating that the results of the subquery are to be sent to multiple worker nodes. The system also generates instructions for multiple worker nodes to receive and process results of the subquery from the external data system.
-
公开(公告)号:US11232100B2
公开(公告)日:2022-01-25
申请号:US15665187
申请日:2017-07-31
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee , Sourav Pal , Christopher Pride
IPC: G06F16/242 , G06F16/22 , G06F16/2453
Abstract: Systems and methods are disclosed for processing queries against multiple dataset sources. One dataset source can include indexers that index and store data. The system can receive a query that identifies a set of data to be processed and a manner of processing the set of data. The set of data can include a first dataset that is accessible by one or more indexers and a second dataset that is accessible by one or more other dataset sources. A query coordinator can define a query processing scheme for obtaining and processing the set of data that includes a dynamic allocation of multiple layers of partitions. The partitions can operate on multiple worker nodes. The query can then be executed based on the query processing scheme.
-
76.
公开(公告)号:US11222066B1
公开(公告)日:2022-01-11
申请号:US15967588
申请日:2018-04-30
Applicant: Splunk Inc.
Inventor: Alexandros Batsakis , Sourav Pal , Sai Krishna , Igor Stojanovski , Tameem Anwar , Paul J. Lucas , Eric Woo , Steve Wong
IPC: G06F16/901 , G06F16/903 , G06F3/06 , G06F16/23 , G06F16/27
Abstract: Systems and methods are disclosed for processing and executing queries in a data intake and query system. The data intake and query system receives raw machine data at an indexing system, and stores at least a portion of the raw machine data in buckets using containerized indexing nodes instantiated in a containerized environment. The data intake and query system stores the buckets in a shared storage system.
-
公开(公告)号:US11093564B1
公开(公告)日:2021-08-17
申请号:US16147129
申请日:2018-09-28
Applicant: Splunk Inc.
Inventor: Alexander Douglas James , Manu Jose , Sourav Pal , Christopher Madden Pride , Nicholas Robert Romito , Igor Braylovskiy , Arun Ramani , Ankit Jain
IPC: G06F16/00 , G06F16/9535 , G06F9/54 , G06F16/242 , G06F40/205
Abstract: Systems and methods are disclosed for processing and executing queries in a data intake and query system. The data intake and query system receives a query identifying a set of data to be processed and a manner of processing the set of data. The data intake and query system parses the query and uses a metadata catalog to dynamically identify configuration parameters of datasets and/or rules associated with the query. The identified configuration parameters are communicated to a query processing component of the data intake and query system for use in executing the query.
-
公开(公告)号:US10977260B2
公开(公告)日:2021-04-13
申请号:US16051300
申请日:2018-07-31
Applicant: Splunk Inc.
Inventor: Sourav Pal , Arindam Bhattacharjee
Abstract: Systems and methods are disclosed for processing data chunks from different data sources at an execution node in a distributed execution environment. The execution node receives data chunks from different sources and combines data from groups of data chunks into partitions based on an associated data source. The execution node executes the partitions using one or more processors.
-
公开(公告)号:US20210049177A1
公开(公告)日:2021-02-18
申请号:US17086043
申请日:2020-10-30
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee , Sourav Pal , Christopher Pride
IPC: G06F16/2455 , G06F11/30 , G06F7/53 , G06F16/27 , G06F11/34
Abstract: Systems and methods are disclosed for processing and executing queries against one or more dataset. As part of processing the query, the system determines whether the query is susceptible to a significantly imbalanced partition. In the event, the query is susceptible to an imbalanced partition, the system monitors the query and determines whether to perform a multi-partitioning determination to avoid a significantly imbalanced partition.
-
公开(公告)号:US20200257691A1
公开(公告)日:2020-08-13
申请号:US16851979
申请日:2020-04-17
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee , Sourav Pal , Alexander Douglas James
IPC: G06F16/2455 , G06F16/13 , G06F16/23 , G06F16/242 , G06F16/903 , G06F16/901
Abstract: Systems and methods are disclosed for generating a distributed execution model with untrusted commands. The system can receive a query, and process the query to identify the untrusted commands. The system can use data associated with the untrusted command to identify one or more files associated with the untrusted command. Based on the files, the system can generate a data structure and include one or more identifiers associated with the data structure in the distributed execution model. The system can distribute the distributed execution model to one or more nodes in a distributed computing environment for execution.
-
-
-
-
-
-
-
-
-