-
Publication number: US20200004794A1
Publication date: 2020-01-02
Application number: US16570545
Filing date: 2019-09-13
Applicant: Splunk Inc.
Inventor: Sourav Pal, Christopher Madden Pride, Arindam Bhattacharjee, Xiaowei Wang, James Alasdair Robert Hodge, Mustafa Ahamed
IPC: G06F16/951, G06F16/21, G06F16/25, G06F16/904, G06F16/901, G06F16/9038, G06F16/903, G06F16/248, G06F16/2458, G06F16/27, G06F16/2455
Abstract: Disclosed is a technique that can be performed in a distributed computer network. The technique can include a data index and query system that receives a search query and defines a search scheme for applying the search query on distributed data storage systems, including an internal data storage system of the data index and query system and an external data storage system. The internal data storage system stores data as time-indexed events including respective segments of raw machine data. The data index and query system can transfer a portion of the search scheme to a search service, which can return search results obtained by application of the search scheme to the distributed data storage systems including the internal data storage system and the external data storage system. Lastly, the search results or data indicative of the search results can be output on a display device to the user.
-
Publication number: US20190147084A1
Publication date: 2019-05-16
Application number: US16051304
Filing date: 2018-07-31
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee
Abstract: Systems and methods are disclosed for executing a query that includes an indication to process data managed by an external data system. The system identifies the external data system that manages the data to be processed and generates a subquery for the external data system indicating that the results of the subquery are to be sent to one worker node of multiple worker nodes. The system instructs the one worker node to distribute the results received from the external data system to multiple worker nodes for processing.
-
Publication number: US20180089258A1
Publication date: 2018-03-29
Application number: US15665187
Filing date: 2017-07-31
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee, Sourav Pal, Christopher Pride
IPC: G06F17/30
CPC classification number: G06F16/2425, G06F16/2272, G06F16/24535
Abstract: Systems and methods are disclosed for processing queries against multiple dataset sources. One dataset source can include indexers that index and store data. The system can receive a query that identifies a set of data to be processed and a manner of processing the set of data. The set of data can include a first dataset that is accessible by one or more indexers and a second dataset that is accessible by one or more other dataset sources. A query coordinator can define a query processing scheme for obtaining and processing the set of data that includes a dynamic allocation of multiple layers of partitions. The partitions can operate on multiple worker nodes. The query can then be executed based on the query processing scheme.
-
Publication number: US12204536B2
Publication date: 2025-01-21
Application number: US17658792
Filing date: 2022-04-11
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee, Nikhil Roy
IPC: G06F16/00, G06F16/17, G06F16/22, G06F16/242, G06F16/2453, G06F16/2458, G06F16/25
Abstract: Systems and methods are described for scheduling a query for execution. The system receives and parses a query to identify one or more portions of the query. The system determines a resource allocation for each portion of the query, and determines an availability of compute resources for the different portions of the query. Based on the resource allocation and the availability of compute resources, the system schedules the query.
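The scheduling idea in this abstract can be illustrated with a minimal sketch. It is not the patented method: the `QueryPortion` type, the notion of a "slot" as the unit of compute, and the greedy first-fit policy are all assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class QueryPortion:
    """One parsed portion of a query (hypothetical structure)."""
    name: str
    required_slots: int  # assumed unit of compute, not from the source


def schedule(portions, available_slots):
    """Greedy first-fit: run a portion now if enough slots remain, else defer it."""
    run_now, deferred = [], []
    remaining = available_slots
    for portion in portions:
        if portion.required_slots <= remaining:
            run_now.append(portion.name)
            remaining -= portion.required_slots
        else:
            deferred.append(portion.name)
    return run_now, deferred


portions = [
    QueryPortion("scan", 4),
    QueryPortion("join", 3),
    QueryPortion("aggregate", 2),
]
run_now, deferred = schedule(portions, available_slots=6)
print(run_now, deferred)
```

With six slots available, the sketch admits the portions that fit in order and defers the rest; a real scheduler would also need to respect dependencies between portions.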
-
Publication number: US11921672B2
Publication date: 2024-03-05
Application number: US16657872
Filing date: 2019-10-18
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee, Timothy Tully
CPC classification number: G06F16/148, G06F16/13, G06F16/1734
Abstract: Systems and methods are described for executing a query of raw machine data that is stored at a remote data store that may store heterogeneous data. The system can determine the directories or file types that may store event data and may instruct one or more worker nodes to access files that may store events based on the determined directories or file types. Further, the system may exclude files at the remote data store that are not identified as potentially storing events, enabling a query that implicates a heterogeneous data store to be executed efficiently.
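As a rough illustration of the file-selection step, the sketch below keeps only the files whose directory or file type suggests they may hold events. The directory prefixes, suffixes, and function names are hypothetical and are not taken from the patent.

```python
from pathlib import PurePosixPath

# Assumed examples of directories and file types that may store event data.
EVENT_DIRS = ("logs/", "events/")
EVENT_SUFFIXES = {".log", ".json"}


def may_store_events(path):
    """Return True if the path is under an event directory or has an event file type."""
    in_event_dir = any(path.startswith(prefix) for prefix in EVENT_DIRS)
    has_event_suffix = PurePosixPath(path).suffix in EVENT_SUFFIXES
    return in_event_dir or has_event_suffix


def select_files(paths):
    """Exclude files that are not identified as potentially storing events."""
    return [p for p in paths if may_store_events(p)]


listing = [
    "logs/app/2019-10-18.txt",
    "images/logo.png",
    "exports/audit.json",
    "events/stream.bin",
]
selected = select_files(listing)
print(selected)
```

Only the selected paths would then be handed to worker nodes, so files such as images are never read during the query.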
-
Publication number: US11604795B2
Publication date: 2023-03-14
Application number: US16051304
Filing date: 2018-07-31
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee
IPC: G06F17/00, G06F16/2453, G06F16/25, G06F16/21, G06F16/28, G06F16/2455, G06F16/2458, G06F40/205
Abstract: Systems and methods are disclosed for executing a query that includes an indication to process data managed by an external data system. The system identifies the external data system that manages the data to be processed and generates a subquery for the external data system indicating that the results of the subquery are to be sent to one worker node of multiple worker nodes. The system instructs the one worker node to distribute the results received from the external data system to multiple worker nodes for processing.
-
Publication number: US11500875B2
Publication date: 2022-11-15
Application number: US17086043
Filing date: 2020-10-30
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee, Sourav Pal, Christopher Pride
Abstract: Systems and methods are disclosed for processing and executing queries against one or more datasets. As part of processing the query, the system determines whether the query is susceptible to a significantly imbalanced partition. In the event the query is susceptible to an imbalanced partition, the system monitors the query and determines whether to perform a multi-partitioning determination to avoid a significantly imbalanced partition.
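One simple way to picture the monitoring check is a skew test over observed partition sizes. The sketch below is an assumption-laden illustration, not the patented determination: the function name, the use of the mean as a baseline, and the `ratio_threshold` parameter are all hypothetical.

```python
def is_significantly_imbalanced(partition_sizes, ratio_threshold=2.0):
    """Flag skew when the largest partition exceeds ratio_threshold times the mean size."""
    total = sum(partition_sizes)
    if total == 0:
        return False  # nothing processed yet, so no skew to report
    mean_size = total / len(partition_sizes)
    return max(partition_sizes) > ratio_threshold * mean_size


balanced = is_significantly_imbalanced([100, 110, 90, 105])
skewed = is_significantly_imbalanced([1000, 10, 10, 10])
print(balanced, skewed)
```

A coordinator could run a check like this periodically while the query executes and trigger repartitioning only when it fires.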
-
Publication number: US20220156335A1
Publication date: 2022-05-19
Application number: US17589764
Filing date: 2022-01-31
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee, Alexander Douglas James, Sourav Pal
IPC: G06F16/9535, G06F9/54, G06F9/50, G06F16/903, G06F16/2458
Abstract: Systems and methods are disclosed for processing streaming data. The data can come from various sources. Worker nodes can be configured to process the streaming data, without delays that may be caused by indexing the data. The data can be filtered and/or transformed as it is processed. In some cases, data can be stored in a data store without transformation. The data in the data store can be accessed and processed at a later time.
-
Publication number: US11314753B2
Publication date: 2022-04-26
Application number: US16051310
Filing date: 2018-07-31
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee
IPC: G06F7/00, G06F16/2458, G06F16/27, G06F16/21, G06F16/22
Abstract: Systems and methods are disclosed for receiving and executing a query received from a data intake and query system and providing results to a first group of worker nodes in a distributed execution environment. The query identifies a set of data to be processed and a manner of processing the set of data. Based on the query, the system defines a query processing scheme, and generates instructions for a second group of worker nodes to obtain the set of data from one or more dataset sources and to process the set of data. The system communicates results of the query to the first group of worker nodes.
-
Publication number: US11238112B2
Publication date: 2022-02-01
Application number: US16675026
Filing date: 2019-11-05
Applicant: Splunk Inc.
Inventor: James Alasdair Robert Hodge, Sourav Pal, Arindam Bhattacharjee, Mustafa Ahamed
IPC: G06F16/00, G06F16/951, G06F16/21, G06F16/25, G06F16/904, G06F16/901, G06F16/9038, G06F16/903, G06F16/248, G06F16/2458, G06F16/27, G06F16/2455
Abstract: The disclosed embodiments also include monitoring and metering services of the data fabric service (DFS) system. Specifically, these services can include techniques for monitoring and metering metrics of the DFS system. The metrics are standards for measuring use or misuse of the DFS system. Examples of the metrics include data or components of the DFS system. For example, a metric can include data stored or communicated by the DFS system or components of the DFS system that are used or reserved for exclusive use by customers. The metrics can be measured with respect to time or computing resources (e.g., CPU utilization, memory usage) of the DFS system. For example, a DFS service can include metering the usage of particular worker nodes by a customer over a threshold period of time.