-
Publication No.: US20190163821A1
Publication date: 2019-05-30
Application No.: US15276717
Filing date: 2016-09-26
Applicant: Splunk Inc.
Inventor: Sourav Pal, Christopher Pride, Arindam Bhattacharjee, Xiaowei Wang, James Alasdair Robert Hodge, Mustafa Ahamed
IPC: G06F17/30
CPC classification number: G06F16/951, G06F16/211, G06F16/212, G06F16/2455, G06F16/2471, G06F16/248, G06F16/252, G06F16/258, G06F16/27, G06F16/9024, G06F16/90335, G06F16/9038, G06F16/904
Abstract: Disclosed is a technique that can be performed in a distributed computer network. The technique can include a data index and query system that receives a search query and defines a search scheme for applying the search query to distributed data storage systems, including an internal data storage system of the data index and query system and an external data storage system. The internal data storage system stores data as time-indexed events including respective segments of raw machine data. The data index and query system can transfer a portion of the search scheme to a search service, which can return search results obtained by applying the search scheme to the distributed data storage systems, including the internal data storage system and the external data storage system. Lastly, the search results or data indicative of the search results can be output on a display device to a user.
-
Publication No.: US20190147086A1
Publication date: 2019-05-16
Application No.: US16147165
Filing date: 2018-09-28
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee
IPC: G06F17/30
Abstract: Systems and methods are disclosed for receiving, at a data intake and query system, a query that includes an indication to process data managed by a third-party data storage and processing system that supports a different query language than the data intake and query system. The data intake and query system identifies a third-party data storage and processing system that manages the data to be processed and generates a subquery for execution by the third-party data storage and processing system, generates instructions for one or more worker nodes to receive and process results of the subquery from the third-party data storage and processing system, and instructs the worker nodes to provide results of the processing to the data intake and query system.
-
Publication No.: US20190138642A1
Publication date: 2019-05-09
Application No.: US16051310
Filing date: 2018-07-31
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee
IPC: G06F17/30
Abstract: Systems and methods are disclosed for receiving and executing a query received from a data intake and query system and providing results to a first group of worker nodes in a distributed execution environment. The query identifies a set of data to be processed and a manner of processing the set of data. Based on the query, the system defines a query processing scheme, and generates instructions for a second group of worker nodes to obtain the set of data from one or more dataset sources and to process the set of data. The system communicates results of the query to the first group of worker nodes.
-
Publication No.: US20190095493A1
Publication date: 2019-03-28
Application No.: US15713976
Filing date: 2017-09-25
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee, Sourav Pal, Christopher Pride
Abstract: In an environment where multiple datasets are to be combined, systems and methods are disclosed for allocating a group of data entries from at least one dataset into multiple partitions. For a particular partition, the subgroup in the partition can be combined with data entries from the other dataset. In some cases, groups of data entries from each dataset are assigned to different partitions. For a particular partition, a subgroup is duplicated, some of the data entries of the subgroup are reassigned to other partitions, the subgroup is reformed to include data entries from other partitions, and the reformed subgroup is combined with the subgroup from the other dataset(s).
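The first case in this abstract (split the entries into partitions, then combine each partition's subgroup with entries from the other dataset) resembles a partitioned join. The following is a minimal sketch of that general idea only, not the patented method. The hash-based assignment and the names `partition_by_key` and `partitioned_join` are assumptions made for illustration.

```python
from collections import defaultdict


def partition_by_key(rows, key, num_partitions):
    """Assign each row to a partition by hashing its join key (assumed scheme)."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions


def partitioned_join(left, right, key, num_partitions=4):
    """Join two lists of dicts on `key`, one partition at a time."""
    left_parts = partition_by_key(left, key, num_partitions)
    right_parts = partition_by_key(right, key, num_partitions)
    joined = []
    # Rows with equal keys always land in the same partition index,
    # so each pair of partitions can be combined independently.
    for left_part, right_part in zip(left_parts, right_parts):
        index = defaultdict(list)
        for r in right_part:
            index[r[key]].append(r)
        for l in left_part:
            for r in index.get(l[key], []):
                joined.append({**l, **r})
    return joined
```

Because each partition pair is independent, the loop body could in principle run on separate workers. The duplication and reassignment of subgroups described in the second half of the abstract is not modeled here.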
-
Publication No.: US20190095491A1
Publication date: 2019-03-28
Application No.: US15714424
Filing date: 2017-09-25
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee, Sourav Pal, Alexander Douglas James
IPC: G06F17/30
Abstract: Systems and methods are disclosed for generating a distributed execution model with untrusted commands. The system can receive a query, and process the query to identify the untrusted commands. The system can use data associated with the untrusted command to identify one or more files associated with the untrusted command. Based on the files, the system can generate a data structure and include one or more identifiers associated with the data structure in the distributed execution model. The system can distribute the distributed execution model to one or more nodes in a distributed computing environment for execution.
-
Publication No.: US20180089324A1
Publication date: 2018-03-29
Application No.: US15665339
Filing date: 2017-07-31
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee, Alexander Douglas James
CPC classification number: G06F16/9535, G06F9/5011, G06F9/546, G06F16/2471, G06F16/90335
Abstract: Systems and methods are disclosed for utilizing an ingested data buffer operating according to a publish-subscribe messaging model as an intake mechanism for a query system. Data from various sources can be placed into the data buffer according to different topics. Indexers can subscribe to these topics in order to ingest the data into the system for long-term storage and later search. In addition, worker nodes may directly subscribe to the topics to enable continuous or streaming searching of the data, without delays that may be caused by ingestion of the data at an indexer. When a request for a streaming search is received, a query coordinator can determine a number of message queues on the data buffer that contain potentially relevant messages. The query coordinator can then dynamically allocate partitions operating on worker nodes to retrieve and intake messages from the message queues into a phased search process.
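The publish-subscribe intake described in this abstract can be pictured with a small in-memory sketch. This is illustrative only: the class `DataBuffer` and its methods are hypothetical names, and a real deployment would use a message broker rather than in-process callbacks.

```python
from collections import defaultdict


class DataBuffer:
    """Toy topic-based buffer: publishers push messages, subscribers get callbacks."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback (e.g. an indexer or a worker node) for a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver a message to every subscriber of the topic."""
        for callback in self._subscribers.get(topic, []):
            callback(message)
```

In this picture an indexer and a streaming-search worker would both subscribe to the same topic. The indexer stores every message for later search, while the worker filters messages as they arrive, without waiting for indexing.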
-
Publication No.: US20180089269A1
Publication date: 2018-03-29
Application No.: US15665148
Filing date: 2017-07-31
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee, Christopher Pride
IPC: G06F17/30
CPC classification number: G06F16/24542, G06F16/24554, G06F16/258
Abstract: Systems and methods are disclosed for processing queries against one or more dataset sources. The system tracks query-resource usage data and node resource utilization data. The query-resource usage data can indicate resources used to execute queries. The node resource utilization data can indicate current utilization of nodes in the system. Upon receipt of a query that identifies a set of data to be processed and a manner of processing the set of data, the system can use the query-resource usage data and the node resource utilization data to define a query processing scheme. The query can then be executed using the query processing scheme. In some cases, a query coordinator can dynamically allocate partitions operating on worker nodes to execute the query.
-
Publication No.: US20180089262A1
Publication date: 2018-03-29
Application No.: US15665302
Filing date: 2017-07-31
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee, Sourav Pal, Ramkumar Chandrasekharan
IPC: G06F17/30
CPC classification number: G06F16/24532, G06F16/24535, G06F16/24554, G06F16/2465, G06F16/3334, G06F16/3349
Abstract: Systems and methods are disclosed for processing queries against a common storage utilizing dynamically allocated partitions operating on one or more worker nodes. The common storage can include one or more data stores, which collectively contain a data set divided across multiple buckets of data. To query the common storage, a query coordinator can retrieve metadata regarding the multiple buckets, in order to determine a subset of buckets that are potentially relevant to a query. The query coordinator can then dynamically allocate partitions operating on worker nodes to retrieve and intake individual buckets of the subset into a phased search process. The dynamic allocation can be selected to maximize parallelization of the buckets across partitions, thus increasing a speed at which the common storage can be searched.
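The two steps in this abstract (prune buckets using metadata, then spread the surviving buckets across partitions) can be sketched as follows. This is a rough illustration, not the patented implementation. It assumes the bucket metadata is a time range and that allocation is round-robin; the abstract specifies neither, and `BucketMeta`, `relevant_buckets`, and `allocate_partitions` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BucketMeta:
    bucket_id: str
    earliest: int  # assumed metadata: earliest event time in the bucket
    latest: int    # assumed metadata: latest event time in the bucket


def relevant_buckets(buckets, query_start, query_end):
    """Keep only buckets whose time range overlaps the query window."""
    return [b for b in buckets
            if b.earliest <= query_end and b.latest >= query_start]


def allocate_partitions(buckets, max_partitions):
    """Distribute bucket IDs round-robin over at most `max_partitions` partitions."""
    if max_partitions < 1:
        raise ValueError("max_partitions must be at least 1")
    count = min(len(buckets), max_partitions)
    assignments = [[] for _ in range(count)]
    for i, bucket in enumerate(buckets):
        assignments[i % count].append(bucket.bucket_id)
    return assignments
```

Using no more partitions than there are relevant buckets avoids idle partitions, while using as many as allowed keeps the per-partition load low, which is the parallelization goal the abstract states.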
-
Publication No.: US20180089259A1
Publication date: 2018-03-29
Application No.: US15665248
Filing date: 2017-07-31
Applicant: Splunk Inc.
Inventor: Alexander Douglas James, Sourav Pal, Arindam Bhattacharjee, Christopher Pride
IPC: G06F17/30
CPC classification number: G06F16/2425, G06F16/2282
Abstract: Systems and methods are disclosed for processing queries against an external data source utilizing dynamically allocated partitions operating on one or more worker nodes. The external data source can include data that has not been processed by the system. To query the external data source, a query coordinator can generate a subquery for the external data source based on determined functionality of the data source. The subquery can identify data in the external data source for processing and a manner for processing the data. In addition, the query coordinator can dynamically allocate partitions operating on worker nodes to retrieve and intake results of the subquery. In some cases, the number of partitions allocated can be based on a number of partitions supported by the external data source.
-
Publication No.: US12248484B2
Publication date: 2025-03-11
Application No.: US16657894
Filing date: 2019-10-18
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee, Wayne Patterson
IPC: G06F16/2458, G06F16/2453
Abstract: Systems and methods are described for reducing execution time of a query that references external data systems. The system can determine that an external data system is capable of processing one or more map or reduce phases of a map-reduce operation. When it is determined that the external data system can process a map or reduce phase, associated operations may be reassigned from the system to the external data system, reducing the processing resources used by the system to respond to the query and, in some cases, speeding up performance of the query.
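The reassignment decision in this abstract can be pictured as splitting an ordered list of phases between the external system and the local system. The sketch below is an assumption-laden illustration, not the patented logic: `plan_phases` is a hypothetical name, capabilities are modeled as a plain set of phase names, and the rule that only a leading run of phases may be pushed down is a simplification introduced here.

```python
def plan_phases(phases, external_capabilities):
    """Split ordered phases into (pushed_to_external, kept_local).

    Assumed rule: a phase is pushed down only if the external system
    supports it and every earlier phase was also pushed down.
    """
    pushed, local = [], []
    for phase in phases:
        if not local and phase in external_capabilities:
            pushed.append(phase)
        else:
            local.append(phase)
    return pushed, local
```

For example, with phases `["map", "reduce"]` and an external system that supports only `"map"`, the map phase is pushed down and the reduce phase stays local.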