Abstract:
Techniques are described for managing data within a multi-site clustered data intake and query system. A data intake and query system as described herein generally refers to a system for collecting, retrieving, and analyzing data. In this context, a clustered data intake and query system generally refers to a system environment that is configured to provide data redundancy and other features that improve the availability of data stored by the system. For example, a clustered data intake and query system may be configured to store multiple copies of data stored by the system across multiple components such that recovery from a failure of one or more of the components is possible by using copies of the data stored elsewhere in the cluster.
Abstract:
According to various embodiments, techniques are described for managing data within a multi-site clustered data intake and query system. A data intake and query system as described herein generally refers to a system for collecting, retrieving, and analyzing data. In this context, a clustered data intake and query system generally refers to a system environment that is configured to provide data redundancy and other features that improve the availability of data stored by the system. For example, a clustered data intake and query system may be configured to store multiple copies of data stored by the system across multiple components such that recovery from a failure of one or more of the components is possible by using copies of the data stored elsewhere in the cluster.
Abstract:
Techniques and mechanisms are disclosed to increase the availability of summary data within a clustered data intake and query system by replicating the summary data within the cluster. In general, summary data may store “pre-computed” results for one or more search queries and can be used by indexers of a cluster to process subsequent instances of the same search queries. At a high level, replication of summary data within a cluster may include ensuring that each instance of summary data created by an indexer of a cluster is replicated to other indexers within the cluster that store copies of the same grouped subset(s) of data to which the summary data relates. In this manner, if one or more indexers of an indexer cluster fail, other indexers of the cluster can make immediate use of replicated copies of the summary data without re-creating it.
Abstract:
A method for performing disaster recovery in a clustered environment comprises identifying, at a master device, a first indexer from a set of indexers to serve as a primary indexer for responding to queries pertaining to a subset of data. The method also comprises assigning, at the master device, a generation identifier indicating that the first indexer is the primary indexer for the subset of data. Responsive to an event prompting a change in a primary indexer designation for the subset of data, the method comprises identifying, at the master device, a second indexer from the set of indexers to serve as the primary indexer for responding to queries pertaining to the subset of data. Further, the method comprises assigning, at the master device, a new generation identifier indicating that the second indexer is the primary indexer for the subset of data.
Abstract:
Techniques and mechanisms are disclosed to increase the availability of summary data within a clustered data intake and query system by replicating the summary data within the cluster. In general, summary data may store “pre-computed” results for one or more search queries and can be used by indexers of a cluster to process subsequent instances of the same search queries. At a high level, replication of summary data within a cluster may include ensuring that each instance of summary data created by an indexer of a cluster is replicated to other indexers within the cluster that store copies of the same grouped subset(s) of data to which the summary data relates. In this manner, if one or more indexers of an indexer cluster fail, other indexers of the cluster can make immediate use of replicated copies of the summary data without re-creating it.
Abstract:
Techniques are described for managing data within a multi-site clustered data intake and query system. A data intake and query system as described herein generally refers to a system for collecting, retrieving, and analyzing data. In this context, a clustered data intake and query system generally refers to a system environment that is configured to provide data redundancy and other features that improve the availability of data stored by the system. For example, a clustered data intake and query system may be configured to store multiple copies of data stored by the system across multiple components such that recovery from a failure of one or more of the components is possible by using copies of the data stored elsewhere in the cluster.
Abstract:
A method for performing disaster recovery in a clustered environment comprises identifying, at a master device, a first indexer from a set of indexers to serve as a primary indexer for responding to queries pertaining to a subset of data. The method also comprises assigning, at the master device, a generation identifier indicating that the first indexer is the primary indexer for the subset of data. Responsive to an event prompting a change in a primary indexer designation for the subset of data, the method comprises identifying, at the master device, a second indexer from the set of indexers to serve as the primary indexer for responding to queries pertaining to the subset of data. Further, the method comprises assigning, at the master device, a new generation identifier indicating that the second indexer is the primary indexer for the subset of data.
Abstract:
A method for performing disaster recovery in a clustered environment comprises identifying, at a master device, a first indexer from a set of indexers to serve as a primary indexer for responding to queries pertaining to a subset of data. The method also comprises assigning, at the master device, a generation identifier indicating that the first indexer is the primary indexer for the subset of data. Responsive to an event prompting a change in a primary indexer designation for the subset of data, the method comprises identifying, at the master device, a second indexer from the set of indexers to serve as the primary indexer for responding to queries pertaining to the subset of data. Further, the method comprises assigning, at the master device, a new generation identifier indicating that the second indexer is the primary indexer for the subset of data.
Abstract:
Techniques and mechanisms are disclosed to execute data searches using generation identifiers. In general, a method of executing the searches comprises broadcasting, from a search head, a first query to a plurality of indexers in a cluster, wherein a portion of the first query is directed to a set of data, and wherein the set of data comprises time-stamps within a particular time frame. The method further comprises providing, with the first query, a first generation identifier for the set of data, wherein the first generation identifier identifies a first indexer from the plurality of indexers to serve as a primary indexer for responding to queries that comprise the first generation identifier and that pertain to the set of data, wherein one or more indexers in the cluster other than the first indexer are designated as secondary indexers, wherein the secondary indexers are configured to ignore queries that pertain to the set of data and that comprise the first generation identifier. Subsequently, the method comprises receiving a response to the first query from the plurality of indexers.
Abstract:
Techniques and mechanisms are disclosed to execute data searches using generation identifiers. In general, a method of executing the searches comprises broadcasting, from a search head, a first query to a plurality of indexers in a cluster, wherein a portion of the first query is directed to a set of data, and wherein the set of data comprises time-stamps within a particular time frame. The method further comprises providing, with the first query, a first generation identifier for the set of data, wherein the first generation identifier identifies a first indexer from the plurality of indexers to serve as a primary indexer for responding to queries that comprise the first generation identifier and that pertain to the set of data, wherein one or more indexers in the cluster other than the first indexer are designated as secondary indexers, wherein the secondary indexers are configured to ignore queries that pertain to the set of data and that comprise the first generation identifier. Subsequently, the method comprises receiving a response to the first query from the plurality of indexers.