Abstract:
Multi-thread processing of search responses is disclosed. An example method may include transmitting, by a computer system, a search request to a plurality of search peers of a data aggregation and analysis system; receiving a plurality of data packets from the plurality of search peers; parsing, by a first processing thread of the computer system, one or more data packets of the plurality of data packets, to produce a partial response to the search request; and processing, by a second processing thread of the computer system, the partial response to produce a memory data structure representing an aggregated response to the search request.
Abstract:
Systems and methods are disclosed for utilizing an ingested data buffer operating according to a publish-subscribe messaging model as an intake mechanism for a query system. Data from various sources can be placed into the data buffer according to different topics. Indexers can subscribe to these topics in order to ingest the data into the system for long-term storage and later search. In addition, worker nodes may directly subscribe to the topics to enable continuous or streaming searching of the data, without delays that may be caused by ingestion of the data at an indexer. When a request for a streaming search is received, a query coordinator can determine a number of message queues on the data buffer that contain potentially relevant messages. The query coordinator can then dynamically allocate partitions operating on worker nodes to retrieve and intake messages from the message queues into a phased search process.
Abstract:
Systems and methods are disclosed for processing queries against one or more dataset sources. The system tracks query resource data and resource utilization data. The query-resource usage data can indicate resources used to execute queries. The node resource utilization data can indicate current utilization of nodes in the system. Upon receipt of a query that identifies a set of data to be processed and a manner of processing the set of data, the system can use the query-resource usage data and the resource utilization data to define a query processing scheme. The query can then be executed using the query processing scheme. In some cases, the query coordinator can dynamically allocate partitions operating on worker nodes to execute the query.
Abstract:
Systems and methods are disclosed for processing queries against a common storage utilizing dynamically allocated partitions operating on one or more worker nodes. The common storage can include one or more data stores, which collectively contain a data set divided across multiple buckets of data. To query the common storage, a query coordinator can retrieve metadata regarding the multiple buckets, in order to determine a subset of buckets that are potentially relevant to a query. The query coordinator can then dynamically allocate partitions operating on worker nodes to retrieve and intake individual buckets of the subset into a phased search process. The dynamic allocation can be selected to maximize parallelization of the buckets across partitions, thus increasing a speed at which the common storage can be searched.
Abstract:
Systems and methods are disclosed for processing queries against an external data source utilizing dynamically allocated partitions operating on one or more worker nodes. The external data source can include data that has not been processed by the system. To query the external data source, a query coordinator can generate a subquery for the external data source based on determined functionality of the data source. The subquery can identify data in the external data source for processing and a manner for processing the data. In addition, the query coordinator can dynamically allocate partitions operating on worker nodes to retrieve and intake results of the subquery. In some cases, number of partitions allocated can be based on a number of partitions supported by the external data source.
Abstract:
Systems and methods for asynchronous processing of messages that are received from multiple servers. An example method may comprise: receiving, by a first processing thread, in a non-blocking mode, a plurality of sub-application layer protocol packets from a plurality of servers; processing one or more sub-application layer protocol packets received from a first server of the plurality of servers, to produce a first application layer message; writing the first application layer message to a message queue; processing one or more sub-application layer protocol packets received from a second server of the plurality of servers, to produce a second application layer message; writing the second application layer message to the message queue; and reading, by two or more processing threads of a processing thread pool, two or more application layer messages including the first application layer message and the second application layer message from the message queue, to produce two or more memory data structures based on the read application layer messages.
Abstract:
The disclosed embodiments include a method performed by a data intake and query system. The method includes receiving a search query by a search head, defining a search process for applying the search query to indexers, delegating a first portion of the search process to indexers and a second portion of the search process to intermediary node(s) communicatively coupled to the search head and the indexers. The first portion can define a search scope for obtaining partial search results of the indexers and the second portion can define operations for combining the partial search results by the intermediary node(s) to produce a combination of the partial search results. The search head then receives the combination of the partial search results, and outputs final search results for the search query, where the final search results are based on the combination of the partial search results.
Abstract:
Systems and methods are disclosed for processing and executing queries in a data intake and query system. The data intake and query system receives raw machine data at an indexing system, and stores at least a portion of the raw machine data in buckets using containerized indexing nodes instantiated in a containerized environment. The data intake and query system stores the buckets in a shared storage system.
Abstract:
Systems and methods are described for distributed processing a query in a first query language utilizing a query execution engine intended for single-device execution. While distributed processing provides numerous benefits over single-device processing, distributed query execution engines can be significantly more difficult to develop that single-device engines. Embodiments of this disclosure enable the use of a single-device engine to support distributed processing, by dividing a query into multiple stages, each of which can be executed by multiple, concurrent executions of a single-device engine. Between stages, data can be shuffled between executions of the engine, such that individual executions of the engine are provided with a complete set of records needed to implement an individual stage. Because single-device engines can be significantly less difficult to develop, use of the techniques described herein can enable a distributed system to rapidly support multiple query languages.
Abstract:
Multi-threaded processing of search responses returned by search peers is disclosed. An example method may include transmitting, by a computer system, a search request to a plurality of search peers of a data aggregation and analysis system; receiving, by a first processing thread, a plurality of data packets from the plurality of search peers; parsing, by a second processing thread, one or more data packets of the plurality of data packets to produce a first partial response to the search request; parsing, by a third processing thread, the one or more data packets to produce a second partial response to the search request; and generating, based on the first partial response and the second partial response, an aggregated response to the search request.