-
公开(公告)号:US11243963B2
公开(公告)日:2022-02-08
申请号:US16051223
申请日:2018-07-31
Applicant: Splunk Inc.
Inventor: Sourav Pal , Arindam Bhattacharjee
IPC: G06F17/00 , G06F16/2458 , H04L29/08 , G06F16/25 , G06F16/9032 , G06F16/903
Abstract: Systems and methods are disclosed for executing a query that includes an indication to process data managed by an external data system. The system identifies the external data system that manages the data to be processed, and generates a subquery for the external data system indicating that the results of the subquery are to be sent to multiple worker nodes. The system also generates instructions for multiple worker nodes to receive and process results of the subquery from the external data system.
-
公开(公告)号:US11232100B2
公开(公告)日:2022-01-25
申请号:US15665187
申请日:2017-07-31
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee , Sourav Pal , Christopher Pride
IPC: G06F16/242 , G06F16/22 , G06F16/2453
Abstract: Systems and methods are disclosed for processing queries against multiple dataset sources. One dataset source can include indexers that index and store data. The system can receive a query that identifies a set of data to be processed and a manner of processing the set of data. The set of data can include a first dataset that is accessible by one or more indexers and a second dataset that is accessible by one or more other dataset sources. A query coordinator can define a query processing scheme for obtaining and processing the set of data that includes a dynamic allocation of multiple layers of partitions. The partitions can operate on multiple worker nodes. The query can then be executed based on the query processing scheme.
-
公开(公告)号:US10977260B2
公开(公告)日:2021-04-13
申请号:US16051300
申请日:2018-07-31
Applicant: Splunk Inc.
Inventor: Sourav Pal , Arindam Bhattacharjee
Abstract: Systems and methods are disclosed for processing data chunks from different data sources at an execution node in a distributed execution environment. The execution node receives data chunks from different sources and combines data from groups of data chunks into partitions based on an associated data source. The execution node executes the partitions using one or more processors.
-
公开(公告)号:US20210049177A1
公开(公告)日:2021-02-18
申请号:US17086043
申请日:2020-10-30
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee , Sourav Pal , Christopher Pride
IPC: G06F16/2455 , G06F11/30 , G06F7/53 , G06F16/27 , G06F11/34
Abstract: Systems and methods are disclosed for processing and executing queries against one or more dataset. As part of processing the query, the system determines whether the query is susceptible to a significantly imbalanced partition. In the event, the query is susceptible to an imbalanced partition, the system monitors the query and determines whether to perform a multi-partitioning determination to avoid a significantly imbalanced partition.
-
公开(公告)号:US20200257691A1
公开(公告)日:2020-08-13
申请号:US16851979
申请日:2020-04-17
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee , Sourav Pal , Alexander Douglas James
IPC: G06F16/2455 , G06F16/13 , G06F16/23 , G06F16/242 , G06F16/903 , G06F16/901
Abstract: Systems and methods are disclosed for generating a distributed execution model with untrusted commands. The system can receive a query, and process the query to identify the untrusted commands. The system can use data associated with the untrusted command to identify one or more files associated with the untrusted command. Based on the files, the system can generate a data structure and include one or more identifiers associated with the data structure in the distributed execution model. The system can distribute the distributed execution model to one or more nodes in a distributed computing environment for execution.
-
公开(公告)号:US20200167395A1
公开(公告)日:2020-05-28
申请号:US16777602
申请日:2020-01-30
Applicant: Splunk Inc.
Inventor: Sourav Pal , Christopher Madden Pride , Arindam Bhattacharjee , Xiaowei Wang , James Alasdair Robert Hodge , Mustafa Ahamed
IPC: G06F16/951 , G06F16/25 , G06F16/2455 , G06F16/27 , G06F16/2458 , G06F16/248 , G06F16/21 , G06F16/903 , G06F16/9038 , G06F16/901 , G06F16/904
Abstract: Disclosed is a data fabric service system that can be implemented in a distributed computer network, such as a data intake and query system. The data index and query system can receive a search query and define a search scheme for applying the search query on distributed data storage systems including internal data storage and external data storage. The data index and query system may provide a portion of the search scheme to a search service of the data fabric service system, which can cause worker nodes of the data fabric service system to perform various functions—including applying the search query to the external data storage based on the portion of the search scheme in order to obtain search results.
-
公开(公告)号:US10599724B2
公开(公告)日:2020-03-24
申请号:US15339840
申请日:2016-10-31
Applicant: Splunk Inc.
Inventor: Sourav Pal , Arindam Bhattacharjee , Christopher Pride
IPC: G06F16/00 , G06F16/951 , G06F16/21 , G06F16/25 , G06F16/904 , G06F16/901 , G06F16/9038 , G06F16/903 , G06F16/248 , G06F16/2458 , G06F16/27 , G06F16/2455
Abstract: The disclosed embodiments include techniques for organizing and presenting search results obtained from within a big data ecosystem via a data intake and query system. In particular, a data intake and query system may cause output of the search results or data indicative of the search results on a display device.
-
公开(公告)号:US10599723B2
公开(公告)日:2020-03-24
申请号:US15339835
申请日:2016-10-31
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee , Sourav Pal , Xiaowei Wang , Christopher Pride , James Alasdair Robert Hodge
IPC: G06F16/00 , G06F16/951 , G06F16/21 , G06F16/25 , G06F16/904 , G06F16/901 , G06F16/9038 , G06F16/903 , G06F16/248 , G06F16/2458 , G06F16/27 , G06F16/2455
Abstract: The disclosed embodiments include techniques for exporting partial search results in parallel from peer indexers of a data intake and query system to the worker nodes. In particular, partial search results (e.g., time-indexed events) obtained from peer indexers can be exported in parallel from the peer indexers to worker nodes. Exporting the partial search results from the peer indexers in parallel can improve the rate at which the partial search results are transferred to the worker nodes for subsequent combination with partial search results of the external data systems. As such, the rate at which the search results of a search query can be obtained from the distributed data system can be improved by implementing parallel export techniques.
-
公开(公告)号:US10592561B2
公开(公告)日:2020-03-17
申请号:US15339845
申请日:2016-10-31
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee , Sourav Pal
IPC: G06F16/00 , G06F16/951 , G06F16/21 , G06F16/25 , G06F16/904 , G06F16/901 , G06F16/9038 , G06F16/903 , G06F16/248 , G06F16/2458 , G06F16/27 , G06F16/2455
Abstract: The capabilities of a data intake and query system can be improved by implementing the data fabric service (DFS) system in a co-located deployment with the data intake and query system. The DFS system can extend the capabilities of a data intake and query system by leveraging computing assets from anywhere in a big data ecosystem to collectively execute search queries on diverse data systems regardless of whether data stores are internal of the data intake and query system and/or external data stores that are communicatively coupled to the data intake and query system over a network.
-
公开(公告)号:US20200050612A1
公开(公告)日:2020-02-13
申请号:US16657916
申请日:2019-10-18
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee , Sourav Pal , Timothy Tully
IPC: G06F16/28 , G06F16/2452
Abstract: Systems and methods are described for distributed processing a query in a first query language utilizing a query execution engine intended for single-device execution. While distributed processing provides numerous benefits over single-device processing, distributed query execution engines can be significantly more difficult to develop that single-device engines. Embodiments of this disclosure enable the use of a single-device engine to support distributed processing, by dividing a query into multiple stages, each of which can be executed by multiple, concurrent executions of a single-device engine. Between stages, data can be shuffled between executions of the engine, such that individual executions of the engine are provided with a complete set of records needed to implement an individual stage. Because single-device engines can be significantly less difficult to develop, use of the techniques described herein can enable a distributed system to rapidly support multiple query languages.
-
-
-
-
-
-
-
-
-