-
公开(公告)号:US20190258637A1
公开(公告)日:2019-08-22
申请号:US16397970
申请日:2019-04-29
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee , Sourav Pal , Wayne Patterson , Srinivas Bobba
IPC: G06F16/2453 , G06F16/22 , G06F16/27
Abstract: Systems and methods are described for partitioning and reducing records at ingest of a worker node. The worker node receives chunks of data from one or more indexers of a data intake and query system based on the execution of a query by the data intake and query system. The worker node assigns records to different record groups based on the content of the records. The system also assigns the record to a partition of a group of partitions. Record data of the records in a particular partition is combined. The system processes the partitions based on the query.
-
公开(公告)号:US20190258635A1
公开(公告)日:2019-08-22
申请号:US16398044
申请日:2019-04-29
Applicant: Splunk Inc.
Inventor: Sourav Pal , Arindam Bhattacharjee , Asha Andrade
IPC: G06F16/2453 , G06F16/242 , G06F16/22 , G06F16/901 , G06F16/2458
Abstract: Systems and methods are described for determining a quantity of records generated by a processing task of a query executed in a data intake and query. The system receives a query and identifies a processing task of the query and a quantity of records to be processed according to the query. The system determines the number of records generated by the processing task based on the number of records to be processed and a record generation estimate. The system can allocate compute resources or determine a query execution time for at least a portion of the query based on the determined quantity of records generated.
-
公开(公告)号:US20190171676A1
公开(公告)日:2019-06-06
申请号:US16264430
申请日:2019-01-31
Applicant: Splunk Inc.
Inventor: Sourav Pal , Christopher Pride , Arindam Bhattacharjee , Xiaowei Wang , James Alasdair Robert Hodge , Mustafa Ahamed
IPC: G06F16/951 , G06F16/21 , G06F16/903 , G06F16/9038 , G06F16/904 , G06F16/25 , G06F16/901
Abstract: Disclosed is a technique that can be performed in a distributed computer network. The technique can include a data index and query system that receives a search query and defines a search scheme for applying the search query on distributed data storage systems including an internal data storage system of the data intake and query system and an external data storage system communicatively coupled to the data intake and query system over a network. The data index and query system communicates at least a portion of the search scheme to a search service for application on behalf of the data intake and query system, receives from the search service a search result of the search query obtained by application of the search scheme to the distributed data storage systems, and causes the search result or data indicative thereof to be displayed on a display device.
-
公开(公告)号:US20190163842A1
公开(公告)日:2019-05-30
申请号:US15339847
申请日:2016-10-31
Applicant: Splunk Inc.
Inventor: Sourav Pal , Arindam Bhattacharjee
IPC: G06F17/30
CPC classification number: G06F16/951 , G06F16/211 , G06F16/212 , G06F16/2455 , G06F16/2471 , G06F16/248 , G06F16/252 , G06F16/258 , G06F16/27 , G06F16/9024 , G06F16/90335 , G06F16/9038 , G06F16/904
Abstract: The performance and flexibility of a data intake and query system having capabilities extended by a fabric service (DFS) system can be improved with deployment on a cloud computing platform. The DFS system can extend the capabilities of a data intake and query system by leveraging computing assets from anywhere in a big data ecosystem to collectively execute search queries on diverse data systems regardless of whether data stores are internal of the data intake and query system and/or external data stores that are communicatively coupled to the data intake and query system over a network.
-
公开(公告)号:US20190163841A1
公开(公告)日:2019-05-30
申请号:US15339845
申请日:2016-10-31
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee , Sourav Pal
IPC: G06F17/30
CPC classification number: G06F16/951 , G06F16/211 , G06F16/212 , G06F16/2455 , G06F16/2471 , G06F16/248 , G06F16/252 , G06F16/258 , G06F16/27 , G06F16/9024 , G06F16/90335 , G06F16/9038 , G06F16/904
Abstract: The capabilities of a data intake and query system can be improved by implementing the data fabric service (DFS) system in a co-located deployment with the data intake and query system. The DFS system can extend the capabilities of a data intake and query system by leveraging computing assets from anywhere in a big data ecosystem to collectively execute search queries on diverse data systems regardless of whether data stores are internal of the data intake and query system and/or external data stores that are communicatively coupled to the data intake and query system over a network.
-
公开(公告)号:US20190163821A1
公开(公告)日:2019-05-30
申请号:US15276717
申请日:2016-09-26
Applicant: Splunk Inc.
Inventor: Sourav Pal , Christopher Pride , Arindam Bhattacharjee , Xiaowei Wang , James Alasdair Robert Hodge , Mustafa Ahamed
IPC: G06F17/30
CPC classification number: G06F16/951 , G06F16/211 , G06F16/212 , G06F16/2455 , G06F16/2471 , G06F16/248 , G06F16/252 , G06F16/258 , G06F16/27 , G06F16/9024 , G06F16/90335 , G06F16/9038 , G06F16/904
Abstract: Disclosed is a technique that can be performed in a distributed computer network. The technique can include a data index and query system that receives search query, defines a search scheme for applying the search query on distributed data storage systems including an internal data storage system of the data index and query system and an external data storage system. The internal data storage system stores data as time-indexed events including respective segments of raw machine data. The data index and query system can transfer a portion of the search scheme to a search service, which can return search results obtained by application of the search scheme to the distributed data storage systems including the internal data storage system and the external data storage system. Lastly, the search results or data indicative of the search results can be output on a display device to the user.
-
公开(公告)号:US20190147086A1
公开(公告)日:2019-05-16
申请号:US16147165
申请日:2018-09-28
Applicant: Splunk Inc.
Inventor: Sourav Pal , Arindam Bhattacharjee
IPC: G06F17/30
Abstract: Systems and methods are disclosed for receiving, at a data intake and query system, a query that includes an indication to process data managed by a third-party data storage and processing system that supports a different query language than the data intake and query system. The data intake and query system identifies a third-party data storage and processing system that manages the data to be processed and generates a subquery for execution by the third-party data storage and processing system, generates instructions for one or more worker nodes to receive and process results of the subquery from the third-party data storage and processing system, and instructs the worker nodes to provide results of the processing to the data intake and query system.
-
公开(公告)号:US20190138642A1
公开(公告)日:2019-05-09
申请号:US16051310
申请日:2018-07-31
Applicant: Splunk Inc.
Inventor: Sourav Pal , Arindam Bhattacharjee
IPC: G06F17/30
Abstract: Systems and methods are disclosed for receiving and executing a query received from a data intake and query system and providing results to a first group of worker nodes in a distributed execution environment. The query identifies a set of data to be processed and a manner of processing the set of data. Based on the query, the system defines a query processing scheme, and generates instructions for a second group of worker nodes to obtain the set of data from one or more dataset sources and to process the set of data. The system communicates results of the query to the first group of worker nodes.
-
公开(公告)号:US20190095493A1
公开(公告)日:2019-03-28
申请号:US15713976
申请日:2017-09-25
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee , Sourav Pal , Christopher Pride
Abstract: In an environment where multiple datasets are to be combined, systems and methods are disclosed for allocating a group of data entries from at least one dataset into multiple partitions. For a particular partition, the subgroup in the partition can be combined with data entries from the other dataset. In some cases, groups of data entries from each dataset are assigned to different partitions. For a particular partition, a subgroup is duplicated, some of the data entries of the subgroup are reassigned to other partitions, the subgroup is reformed to include data entries from other partitions, and the reformed subgroup is combined with the subgroup from the other dataset(s).
-
公开(公告)号:US20190095491A1
公开(公告)日:2019-03-28
申请号:US15714424
申请日:2017-09-25
Applicant: Splunk Inc.
Inventor: Arindam Bhattacharjee , Sourav Pal , Alexander Douglas James
IPC: G06F17/30
Abstract: Systems and methods are disclosed for generating a distributed execution model with untrusted commands. The system can receive a query, and process the query to identify the untrusted commands. The system can use data associated with the untrusted command to identify one or more files associated with the untrusted command. Based on the files, the system can generate a data structure and include one or more identifiers associated with the data structure in the distributed execution model. The system can distribute the distributed execution model to one or more nodes in a distributed computing environment for execution.
-
-
-
-
-
-
-
-
-