FACILITATING GENERATION OF DATA MODEL SUMMARIES

    Publication No.: US20220245091A1

    Publication date: 2022-08-04

    Application No.: US17163039

    Filing date: 2021-01-29

    Applicant: SPLUNK INC.

    IPC classification: G06F16/13 G06F16/182

    Abstract: Embodiments described herein facilitate enhancement of data model acceleration, including generating data model summaries and performing searches in an accelerated manner. In one implementation, a set of events is indexed, each of the events having a corresponding index time representing the time at which the event was indexed in an indexer. Index time parameters are obtained, including an index earliest time indicating a first index time at which to begin generating a data model summary and an index latest time indicating a second index time at which to complete generating the data model summary. Thereafter, a data model summary is generated. The data model summary summarizes events having corresponding index times between the index earliest time and the index latest time. The data model summary is provided to a remote data store that is separate from the indexer at which at least a portion of the events were indexed.
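
The index-time window described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Event` type, the `host` field, the per-host count used as the "summary", and the half-open interval are all assumptions made for illustration, since the abstract only says that summarized events have index times between the two bounds.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    index_time: int  # epoch seconds at which the indexer indexed the event
    host: str        # hypothetical field; the abstract names no fields


def summarize_by_index_time(events, index_earliest, index_latest):
    """Count events per host whose index time lies in the window.

    Assumption: the window is half-open, [index_earliest, index_latest).
    """
    in_window = (
        e for e in events if index_earliest <= e.index_time < index_latest
    )
    return dict(Counter(e.host for e in in_window))


events = [
    Event(100, "a"),
    Event(150, "b"),
    Event(200, "a"),
    Event(250, "a"),  # falls outside the window used below
]
summary = summarize_by_index_time(events, index_earliest=100, index_latest=250)
```

Selecting on index time rather than event time means events that arrive late still land in the summary pass that covers the moment they were indexed.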

    BUCKET MERGING FOR A DATA INTAKE AND QUERY SYSTEM USING SIZE THRESHOLDS

    Publication No.: US20220261385A1

    Publication date: 2022-08-18

    Application No.: US17661510

    Filing date: 2022-04-29

    Applicant: Splunk Inc.

    IPC classification: G06F16/22

    Abstract: Systems and methods are disclosed for scalable bucket merging in a data intake and query system. Various components of a bucket manager can be used to monitor recently created buckets of data in common storage that are associated with a particular tenant and a particular index, apply a comprehensive bucket merge policy to determine groups of buckets that qualify for merging, merge those groups of buckets into merged buckets to be stored in the common storage, and update any information associated with the merged buckets and pre-merged buckets. These components may be shared across multiple tenants, and some of these components may be dynamically scalable based on need. This approach may also provide many additional benefits, including improved search performance from merged buckets, efficient resource utilization associated with discriminate merging, and redundancy in case of component failure.
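
The grouping step in this abstract can be sketched as follows. The abstract does not spell out the merge policy, so everything here is an assumption for illustration: the `Bucket` type, the single `max_merged_bytes` threshold, and the greedy packing of buckets in arrival order. Only the partitioning by tenant and index comes from the text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Bucket:
    bucket_id: str
    tenant: str
    index: str
    size_bytes: int


def plan_merges(buckets, max_merged_bytes):
    """Return groups of two or more buckets that could be merged together.

    Hypothetical policy: buckets are partitioned by (tenant, index) and
    packed greedily, in input order, until adding the next bucket would
    push the merged size past max_merged_bytes.
    """
    by_key = defaultdict(list)
    for bucket in buckets:
        by_key[(bucket.tenant, bucket.index)].append(bucket)

    groups = []
    for key_buckets in by_key.values():
        current, current_size = [], 0
        for bucket in key_buckets:
            if current and current_size + bucket.size_bytes > max_merged_bytes:
                if len(current) > 1:  # a lone bucket has nothing to merge with
                    groups.append(current)
                current, current_size = [], 0
            current.append(bucket)
            current_size += bucket.size_bytes
        if len(current) > 1:
            groups.append(current)
    return groups


buckets = [
    Bucket("b1", "t1", "main", 40),
    Bucket("b2", "t1", "main", 50),
    Bucket("b3", "t1", "main", 30),
    Bucket("b4", "t2", "main", 10),
]
plan = [[b.bucket_id for b in group] for group in plan_merges(buckets, 100)]
```

Keeping each group within one tenant and one index mirrors the abstract's statement that the monitored buckets are associated with a particular tenant and a particular index.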

    Pre-fetching files from buckets in remote storage for a cache based on file usage history

    Publication No.: US10678696B2

    Publication date: 2020-06-09

    Application No.: US16049357

    Filing date: 2018-07-30

    Applicant: Splunk, Inc.

    Abstract: Embodiments are disclosed for a prefetching method that may include copying, in response to a search query, a first bucket from a remote storage to a cache. The first bucket may include first data associated with the search query. The method may further include identifying a first file type associated with a first file in the first bucket. The first file may be associated with a usage status. The method may further include accessing, based on the search query, a second bucket from the remote storage. The second bucket may include second data associated with the search query. The method may further include identifying a second file in the second bucket having the first file type, and copying, in response to the usage status indicating that the first file was used in processing the search query, the second file from the remote storage to the cache.
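
The selection rule in this abstract (prefetch a file from the second bucket when a file of the same type in the first bucket was used by the query) can be sketched as below. This is an illustration under stated assumptions, not the patented method: file type is taken to be the filename suffix, the usage status is modeled as a boolean, and the function and file names are hypothetical.

```python
from pathlib import PurePosixPath


def file_type(name):
    """Assumption: a file's type is its final filename suffix."""
    return PurePosixPath(name).suffix


def files_to_prefetch(first_bucket_usage, second_bucket_files):
    """Pick files in the second bucket worth copying to the cache.

    first_bucket_usage maps each file name in the first bucket to a
    boolean usage status (True if the file was used while processing
    the search query). A file in the second bucket is selected when a
    file of the same type was used in the first bucket.
    """
    used_types = {
        file_type(name) for name, used in first_bucket_usage.items() if used
    }
    return [name for name in second_bucket_files if file_type(name) in used_types]


first_usage = {"terms.idx": True, "raw.gz": False}  # hypothetical file names
second_files = ["terms2.idx", "raw2.gz", "meta.dat"]
selected = files_to_prefetch(first_usage, second_files)
```

Files whose type went unused in the first bucket stay in remote storage, so the cache is not filled with data the query is unlikely to read.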