Approximate order statistics of real numbers in generic data

    公开(公告)号:US09645975B2

    公开(公告)日:2017-05-09

    申请号:US14255981

    申请日:2014-04-18

    Applicant: Splunk Inc.

    Inventor: Steve Yu Zhang

    Abstract: A method, system, and processor-readable storage medium are directed towards calculating approximate order statistics on a collection of real numbers. In one embodiment, the collection of real numbers is processed to create a digest comprising hierarchy of buckets. Each bucket is assigned a real number N having P digits of precision and ordinality O. The hierarchy is defined by grouping buckets into levels, where each level contains all buckets of a given ordinality. Each individual bucket in the hierarchy defines a range of numbers—all numbers that, after being truncated to that bucket's P digits of precision, are equal to that bucket's N. Each bucket additionally maintains a count of how many numbers have fallen within that bucket's range. Approximate order statistics may then be calculated by traversing the hierarchy and performing an operation on some or all of the ranges and counts associated with each bucket.

    Generating and Storing Summarization Tables for Sets of Searchable Events
    43.
    发明申请
    Generating and Storing Summarization Tables for Sets of Searchable Events 有权
    生成和存储可搜索事件集合的汇总表

    公开(公告)号:US20160004750A1

    公开(公告)日:2016-01-07

    申请号:US14815973

    申请日:2015-08-01

    Applicant: Splunk Inc.

    Abstract: Embodiments are directed are towards the transparent summarization of events. Queries directed towards summarizing and reporting on event records may be received at a search head. Search heads may be associated with one more indexers containing event records. The search head may forward the query to the indexers the can resolve the query for concurrent execution. If a query is a collection query, indexers may generate summarization information based on event records located on the indexers. Event record fields included in the summarization information may be determined based on terms included in the collection query. If a query is a stats query, each indexer may generate a partial result set from previously generated summarization information, returning the partial result sets to the search head. Collection queries may be saved and scheduled to run and periodically update the summarization information.

    Abstract translation: 实施例针对事件的透明总结。 可以在搜索头收到针对事件记录的总结和报告的查询。 搜索头可能与一个包含事件记录的索引器相关联。 搜索头可以将查询转发给索引器,可以解析用于并发执行的查询。 如果查询是集合查询,则索引器可以基于位于索引器上的事件记录生成摘要信息。 包含在汇总信息中的事件记录字段可以基于收集查询中包含的项来确定。 如果查询是统计查询,则每个索引器可以从先前生成的摘要信息生成部分结果集,将部分结果集返回到搜索头。 收集查询可以保存并计划运行,并定期更新摘要信息。

    Supplementing a high performance analytics store with evaluation of individual events to respond to an event query
    44.
    发明授权
    Supplementing a high performance analytics store with evaluation of individual events to respond to an event query 有权
    补充高性能分析商店,评估各种事件以响应事件查询

    公开(公告)号:US09128985B2

    公开(公告)日:2015-09-08

    申请号:US14170159

    申请日:2014-01-31

    Applicant: SPLUNK INC.

    Abstract: Embodiments are directed are towards the transparent summarization of events. Queries directed towards summarizing and reporting on event records may be received at a search head. Search heads may be associated with one more indexers containing event records. The search head may forward the query to the indexers the can resolve the query for concurrent execution. If a query is a collection query, indexers may generate summarization information based on event records located on the indexers. Event record fields included in the summarization information may be determined based on terms included in the collection query. If a query is a stats query, each indexer may generate a partial result set from previously generated summarization information, returning the partial result sets to the search head. Collection queries may be saved and scheduled to run and periodically update the summarization information.

    Abstract translation: 实施例针对事件的透明总结。 可以在搜索头收到针对事件记录的总结和报告的查询。 搜索头可能与一个包含事件记录的索引器相关联。 搜索头可以将查询转发给索引器,可以解析用于并发执行的查询。 如果查询是集合查询,则索引器可以基于位于索引器上的事件记录生成摘要信息。 包含在汇总信息中的事件记录字段可以基于收集查询中包含的项来确定。 如果查询是统计查询,则每个索引器可以从先前生成的摘要信息生成部分结果集,将部分结果集返回到搜索头。 收集查询可以保存并计划运行,并定期更新摘要信息。

    Interactive Display of Aggregated Search Result Information
    45.
    发明申请
    Interactive Display of Aggregated Search Result Information 审中-公开
    综合搜索结果信息的交互式显示

    公开(公告)号:US20150058326A1

    公开(公告)日:2015-02-26

    申请号:US14530692

    申请日:2014-11-01

    Applicant: Splunk Inc.

    Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

    Abstract translation: 方法,系统和处理器可读存储介质被引导为生成从存储在多个分布式节点上的诸如事件数据的数据导出的报告。 在一个实施例中,使用“分割和征服”算法生成分析,使得每个分布式节点分析本地存储的事件数据,而聚合节点组合这些分析结果以生成报告。 在一个实施例中,每个分布式节点还将与分析结果相关联的事件数据引用的列表发送到聚合节点。 然后,聚合节点可以基于从每个分布式节点接收的事件数据参考的列表来生成数据引用的全局有序列表。 随后,响应于用户选择一系列全局事件数据,报告可以动态地从一个或多个分布式节点检索事件数据,以便根据全局顺序进行显示。

    Scalable interactive display of distributed data
    47.
    发明授权
    Scalable interactive display of distributed data 有权
    分布式数据的可扩展交互式显示

    公开(公告)号:US08751529B2

    公开(公告)日:2014-06-10

    申请号:US13660845

    申请日:2012-10-25

    Applicant: Splunk Inc.

    Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.

    Abstract translation: 方法,系统和处理器可读存储介质被引导为生成从存储在多个分布式节点上的诸如事件数据的数据导出的报告。 在一个实施例中,使用“分割和征服”算法生成分析,使得每个分布式节点分析本地存储的事件数据,而聚合节点组合这些分析结果以生成报告。 在一个实施例中,每个分布式节点还将与分析结果相关联的事件数据引用的列表发送到聚合节点。 然后,聚合节点可以基于从每个分布式节点接收的事件数据参考的列表来生成数据引用的全局有序列表。 随后,响应于用户选择一系列全局事件数据,报告可以动态地从一个或多个分布式节点检索事件数据,以便根据全局顺序进行显示。

    GENERATION OF A DATA MODEL FOR SEARCHING MACHINE DATA
    48.
    发明申请
    GENERATION OF A DATA MODEL FOR SEARCHING MACHINE DATA 有权
    用于搜索机器数据的数据模型的生成

    公开(公告)号:US20140074889A1

    公开(公告)日:2014-03-13

    申请号:US14067203

    申请日:2013-10-30

    Applicant: Splunk Inc.

    Abstract: Embodiments include generating data models that may give semantic meaning for unstructured or structured data that may include data generated and/or received by search engines, including a time series engine. A method includes generating a data model for data stored in a repository. Generating the data model includes generating an initial query string, executing the initial query string on the data, generating an initial result set based on the initial query string being executed on the data, determining one or more candidate fields from one or results of the initial result set, generating a candidate data model based on the one or more candidate fields, iteratively modifying the candidate data model until the candidate data model models the data, and using the candidate data model as the data model. The method further includes generating a new query string using the data model, executing the new query string on the data, and generating a new result set based on the new query string being executed on the data.

    Abstract translation: 实施例包括生成可以给非结构化或结构化数据赋予语义意义的数据模型,其可以包括由搜索引擎(包括时间序列引擎)生成和/或接收的数据。 一种方法包括为存储在存储库中的数据生成数据模型。 生成数据模型包括生成初始查询字符串,对数据执行初始查询字符串,基于对数据执行的初始查询字符串生成初始结果集,从一个或多个初始查询字符串的结果确定一个或多个候选字段 生成基于一个或多个候选字段的候选数据模型,迭代地修改候选数据模型,直到候选数据模型对数据建模,并使用候选数据模型作为数据模型。 该方法还包括使用数据模型生成新的查询字符串,对数据执行新的查询字符串,并且基于对数据执行的新查询字符串生成新的结果集。

    Metadata tracking for a pipelined search language (data modeling for fields)
    49.
    发明授权
    Metadata tracking for a pipelined search language (data modeling for fields) 有权
    流水线搜索语言的元数据跟踪(字段的数据建模)

    公开(公告)号:US08583631B1

    公开(公告)日:2013-11-12

    申请号:US13756105

    申请日:2013-01-31

    Applicant: Splunk Inc.

    Abstract: Embodiments are directed towards determining and tracking metadata for the generation of visualizations of requested data. A user may request data by providing a query that may be employed to search for the requested data. The query may include a plurality of commands, which may be employed in a pipeline to perform the search and to generate a table of the requested data. In some embodiments, each command may be executed to perform an action on a set of data. The execution of a command may generate one or more columns to append and/or insert into the table of requested data. Metadata for each generated column may be determined based on the actions performed by executing the commands. The table of requested data and the column metadata may be employed to generate and display a visualization of at least a portion of the requested data to a user.

    Abstract translation: 实施例旨在确定和跟踪用于生成所请求数据的可视化的元数据。 用户可以通过提供可用于搜索所请求的数据的查询来请求数据。 该查询可以包括多个命令,其可以在流水线中用于执行搜索并生成所请求的数据的表。 在一些实施例中,可以执行每个命令以对一组数据执行动作。 命令的执行可以生成一个或多个列来附加和/或插入到所请求的数据的表中。 可以基于通过执行命令执行的动作来确定每个生成的列的元数据。 可以使用所请求的数据和列元数据的表来生成并向用户显示所请求的数据的至少一部分的可视化。

Patent Agency Ranking