Optimizing index file sizes based on indexed data storage conditions

    Publication No.: US10235431B2

    Publication date: 2019-03-19

    Application No.: US15011473

    Filing date: 2016-01-29

    Applicant: Splunk Inc.

    Abstract: Techniques and mechanisms are disclosed to optimize the size of index files to improve use of storage space available to indexers and other components of a data intake and query system. Index files of a data intake and query system may include, among other data, a keyword portion containing mappings between keywords and location references to event data containing the keywords. Optimizing an amount of storage space used by index files may include removing, modifying and/or recreating various components of index files in response to detecting one or more storage conditions related to the event data indexed by the index files. The optimization of index files generally may attempt to manage a tradeoff between an efficiency with which search requests can be processed using the index files and an amount of storage space occupied by the index files.
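The tradeoff described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `IndexFile` class, its methods, and the age-based storage condition (`max_age_days`) are hypothetical names and assumptions chosen only for illustration. The sketch keeps a keyword portion that maps keywords to event locations, drops that portion once the assumed condition is met, and then answers searches with a slower full scan.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class IndexFile:
    """Hypothetical index file: raw events plus an optional keyword portion."""

    events: list[str]
    # keyword -> offsets of the events containing it; None once removed.
    keywords: dict[str, list[int]] | None = None

    def build_keywords(self) -> None:
        """(Re)create the keyword portion from the indexed events."""
        self.keywords = {}
        for offset, event in enumerate(self.events):
            for token in set(event.lower().split()):
                self.keywords.setdefault(token, []).append(offset)

    def optimize(self, age_days: int, max_age_days: int = 30) -> None:
        """Drop the keyword portion when the assumed storage condition holds.

        The condition used here (event data older than max_age_days) is an
        assumption; the abstract does not name a specific condition.
        """
        if age_days > max_age_days:
            self.keywords = None

    def search(self, keyword: str) -> list[int]:
        """Return offsets of events containing the keyword."""
        keyword = keyword.lower()
        if self.keywords is not None:
            # Fast path: direct lookup in the keyword portion.
            return self.keywords.get(keyword, [])
        # Slow path: the keyword portion was removed, so scan every event.
        return [
            offset
            for offset, event in enumerate(self.events)
            if keyword in event.lower().split()
        ]
```

Searches return the same offsets before and after `optimize`; only the lookup cost and the space held by the keyword portion change, which is the tradeoff the abstract describes.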

    Replication of summary data in a clustered computing environment

    Publication No.: US10387448B2

    Publication date: 2019-08-20

    Application No.: US14929089

    Filing date: 2015-10-30

    Applicant: Splunk Inc.

    Abstract: Techniques and mechanisms are disclosed to increase the availability of summary data within a clustered data intake and query system by replicating the summary data within the cluster. In general, summary data may store “pre-computed” results for one or more search queries and can be used by indexers of a cluster to process subsequent instances of the same search queries. At a high level, replication of summary data within a cluster may include ensuring that each instance of summary data created by an indexer of a cluster is replicated to other indexers within the cluster that store copies of the same grouped subset(s) of data to which the summary data relates. In this manner, if one or more indexers of an indexer cluster fail, other indexers of the cluster can make immediate use of replicated copies of the summary data without re-creating it.
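As a rough illustration of the replication rule in this abstract, the Python sketch below copies each new summary to every indexer that stores a copy of the same grouped subset of data. The `Cluster` class, the term "bucket" for a grouped subset, and all method names are hypothetical; real replication would involve network transfer and failure handling that this sketch omits.

```python
from __future__ import annotations


class Cluster:
    """Hypothetical cluster model tracking bucket copies and summaries."""

    def __init__(self, bucket_copies: dict[str, set[str]]) -> None:
        # bucket -> names of the indexers that store a copy of that bucket
        self.bucket_copies = bucket_copies
        # indexer -> {(bucket, query): pre-computed result}
        self.summaries: dict[str, dict[tuple[str, str], object]] = {}

    def create_summary(self, indexer: str, bucket: str, query: str, result: object) -> None:
        """Store a summary on its creator and replicate it to bucket peers."""
        holders = self.bucket_copies[bucket]
        if indexer not in holders:
            raise ValueError(f"{indexer} holds no copy of {bucket}")
        # Replicate only to indexers holding a copy of the same bucket.
        for peer in holders:
            self.summaries.setdefault(peer, {})[(bucket, query)] = result

    def lookup(self, indexer: str, bucket: str, query: str) -> object | None:
        """Return the pre-computed result available on an indexer, if any."""
        return self.summaries.get(indexer, {}).get((bucket, query))
```

If the creating indexer later fails, any peer holding the bucket can answer `lookup` from its replicated copy instead of recomputing the summary.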

    DISASTER RECOVERY IN A CLUSTERED ENVIRONMENT USING GENERATION IDENTIFIERS

    Publication No.: US20210279251A1

    Publication date: 2021-09-09

    Application No.: US17228429

    Filing date: 2021-04-12

    Applicant: SPLUNK, INC.

    Abstract: A method for performing disaster recovery in a clustered environment comprises identifying, at a master device, a first indexer from a set of indexers to serve as a primary indexer for responding to queries pertaining to a subset of data. The method also comprises assigning, at the master device, a generation identifier indicating that the first indexer is the primary indexer for the subset of data. Responsive to an event prompting a change in a primary indexer designation for the subset of data, the method comprises identifying, at the master device, a second indexer from the set of indexers to serve as the primary indexer for responding to queries pertaining to the subset of data. Further, the method comprises assigning, at the master device, a new generation identifier indicating that the second indexer is the primary indexer for the subset of data.
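The two assignment steps in this abstract can be sketched in Python as follows. This is an assumption-laden illustration rather than the claimed method: the `Master` class, its method names, the use of a monotonically increasing integer as the generation identifier, and the "first surviving copy holder" replacement policy are all hypothetical.

```python
from __future__ import annotations


class Master:
    """Hypothetical master device that tracks primaries per generation."""

    def __init__(self, copies: dict[str, list[str]]) -> None:
        # data subset -> indexers that hold a copy of it
        self.copies = copies
        self.current: dict[str, str] = {}  # subset -> current primary
        self.generation_id = 0
        # generation id -> snapshot of subset -> primary at that generation
        self.generations: dict[int, dict[str, str]] = {}

    def assign_primary(self, subset: str, indexer: str) -> int:
        """Designate a primary and record it under a new generation id."""
        if indexer not in self.copies[subset]:
            raise ValueError(f"{indexer} holds no copy of {subset}")
        self.current[subset] = indexer
        self.generation_id += 1
        self.generations[self.generation_id] = dict(self.current)
        return self.generation_id

    def reassign_on_failure(self, subset: str, failed: str) -> int:
        """Pick another copy holder as primary after the given indexer fails."""
        candidates = [name for name in self.copies[subset] if name != failed]
        if not candidates:
            raise RuntimeError(f"no surviving copy of {subset}")
        # Assumed policy: take the first surviving holder.
        return self.assign_primary(subset, candidates[0])
```

Because each generation keeps its own snapshot, a query tagged with an older generation identifier can still be resolved against the primary designation that was valid when it was issued.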

    REPLICATION OF SUMMARY DATA IN A CLUSTERED COMPUTING ENVIRONMENT

    Type: Invention application

    Status: Under examination (published)

    Publication No.: US20160055225A1

    Publication date: 2016-02-25

    Application No.: US14929089

    Filing date: 2015-10-30

    Applicant: Splunk Inc.

    Abstract: Techniques and mechanisms are disclosed to increase the availability of summary data within a clustered data intake and query system by replicating the summary data within the cluster. In general, summary data may store “pre-computed” results for one or more search queries and can be used by indexers of a cluster to process subsequent instances of the same search queries. At a high level, replication of summary data within a cluster may include ensuring that each instance of summary data created by an indexer of a cluster is replicated to other indexers within the cluster that store copies of the same grouped subset(s) of data to which the summary data relates. In this manner, if one or more indexers of an indexer cluster fail, other indexers of the cluster can make immediate use of replicated copies of the summary data without re-creating it.

    Reducing index file size based on event attributes

    Publication No.: US11934418B2

    Publication date: 2024-03-19

    Application No.: US17447620

    Filing date: 2021-09-14

    Applicant: Splunk Inc.

    CPC classification number: G06F16/248 G06F16/2228 G06F16/285 G06F16/21

    Abstract: Techniques and mechanisms are disclosed to optimize the size of index files to improve use of storage space available to indexers and other components of a data intake and query system. Index files of a data intake and query system may include, among other data, a keyword portion containing mappings between keywords and location references to event data containing the keywords. Optimizing an amount of storage space used by index files may include removing, modifying and/or recreating various components of index files in response to detecting one or more storage conditions related to the event data indexed by the index files. The optimization of index files generally may attempt to manage a tradeoff between an efficiency with which search requests can be processed using the index files and an amount of storage space occupied by the index files.

    Reducing index file size based on event attributes

    Publication No.: US11138218B2

    Publication date: 2021-10-05

    Application No.: US16259975

    Filing date: 2019-01-28

    Applicant: Splunk Inc.

    Abstract: Techniques and mechanisms are disclosed to optimize the size of index files to improve use of storage space available to indexers and other components of a data intake and query system. Index files of a data intake and query system may include, among other data, a keyword portion containing mappings between keywords and location references to event data containing the keywords. Optimizing an amount of storage space used by index files may include removing, modifying and/or recreating various components of index files in response to detecting one or more storage conditions related to the event data indexed by the index files. The optimization of index files generally may attempt to manage a tradeoff between an efficiency with which search requests can be processed using the index files and an amount of storage space occupied by the index files.

    Executing data searches using generation identifiers

    Publication No.: US11003687B2

    Publication date: 2021-05-11

    Application No.: US16451582

    Filing date: 2019-06-25

    Applicant: SPLUNK, INC.

    Abstract: Techniques and mechanisms are disclosed to execute data searches using generation identifiers. In general, a method of executing the searches comprises broadcasting, from a search head, a first query to a plurality of indexers in a cluster, wherein a portion of the first query is directed to a set of data, and wherein the set of data comprises time-stamps within a particular time frame. The method further comprises providing, with the first query, a first generation identifier for the set of data, wherein the first generation identifier identifies a first indexer from the plurality of indexers to serve as a primary indexer for responding to queries that comprise the first generation identifier and that pertain to the set of data, wherein one or more indexers in the cluster other than the first indexer are designated as secondary indexers, wherein the secondary indexers are configured to ignore queries that pertain to the set of data and that comprise the first generation identifier. Subsequently, the method comprises receiving a response to the first query from the plurality of indexers.
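The query flow in this abstract can be sketched in Python under simplifying assumptions. The `Indexer` dataclass, the `broadcast` function, and the keyword-substring matching are hypothetical and stand in for real query processing. The sketch shows only the core idea: every indexer receives the query together with a generation identifier, and only the indexer designated primary for that data set in that generation answers.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Indexer:
    """Hypothetical indexer holding copies of one or more data sets."""

    name: str
    events: dict[str, list[str]]      # data set -> locally stored events
    primary_for: dict[int, set[str]]  # generation id -> data sets served as primary

    def handle(self, data_set: str, keyword: str, generation_id: int) -> list[str]:
        # A secondary indexer for this generation ignores the query.
        if data_set not in self.primary_for.get(generation_id, set()):
            return []
        return [e for e in self.events.get(data_set, []) if keyword in e]


def broadcast(indexers: list[Indexer], data_set: str, keyword: str, generation_id: int) -> list[str]:
    """Search-head side: send the query to every indexer and merge responses."""
    results: list[str] = []
    for indexer in indexers:
        results.extend(indexer.handle(data_set, keyword, generation_id))
    return results
```

A short usage example, with two indexers that both store the same data set but are primary in different generations:

```python
shared = ["error a", "ok b"]
idx1 = Indexer("idx1", {"bucket_a": list(shared)}, {1: {"bucket_a"}})
idx2 = Indexer("idx2", {"bucket_a": list(shared)}, {2: {"bucket_a"}})
hits = broadcast([idx1, idx2], "bucket_a", "error", generation_id=1)
```

Only `idx1` responds for generation 1, so each matching event is returned once even though both indexers hold a copy.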

    EXECUTING DATA SEARCHES USING GENERATION IDENTIFIERS

    Publication No.: US20190317947A1

    Publication date: 2019-10-17

    Application No.: US16451582

    Filing date: 2019-06-25

    Applicant: SPLUNK, INC.

    Abstract: Techniques and mechanisms are disclosed to execute data searches using generation identifiers. In general, a method of executing the searches comprises broadcasting, from a search head, a first query to a plurality of indexers in a cluster, wherein a portion of the first query is directed to a set of data, and wherein the set of data comprises time-stamps within a particular time frame. The method further comprises providing, with the first query, a first generation identifier for the set of data, wherein the first generation identifier identifies a first indexer from the plurality of indexers to serve as a primary indexer for responding to queries that comprise the first generation identifier and that pertain to the set of data, wherein one or more indexers in the cluster other than the first indexer are designated as secondary indexers, wherein the secondary indexers are configured to ignore queries that pertain to the set of data and that comprise the first generation identifier. Subsequently, the method comprises receiving a response to the first query from the plurality of indexers.
