Run time prediction for data queries

    Publication No.: US10133775B1

    Publication date: 2018-11-20

    Application No.: US14219802

    Filing date: 2014-03-19

    Abstract: Techniques are described for modeling data query execution time based on a cost of data queries, where the cost provides a measure of the processing resources used by the data query while executing. Using regression analysis or other statistical methods, a model may be generated that enables the prediction of the query execution time based on the query cost. In some cases, the model may be generated based on a linear regression analysis of previously measured execution times and previously determined data query costs. The model may be stored and employed prior to, or during, the subsequent execution of a data query, to predict the execution time of the data query. Data queries that execute substantially longer than the predicted execution time may be terminated.
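
The modeling step in this abstract can be illustrated with a minimal sketch. It assumes a simple one-variable least-squares fit of execution time against query cost; the function names, the sample numbers, and the `tolerance` factor used to decide when a query runs "substantially longer" than predicted are hypothetical and do not come from the patent.

```python
from statistics import mean


def fit_linear_model(costs, times):
    """Least-squares fit of: time = slope * cost + intercept."""
    if len(costs) != len(times) or len(costs) < 2:
        raise ValueError("need at least two (cost, time) observations")
    mean_cost = mean(costs)
    mean_time = mean(times)
    sxx = sum((c - mean_cost) ** 2 for c in costs)
    if sxx == 0:
        raise ValueError("all costs are identical; slope is undefined")
    sxy = sum((c - mean_cost) * (t - mean_time) for c, t in zip(costs, times))
    slope = sxy / sxx
    intercept = mean_time - slope * mean_cost
    return slope, intercept


def predict_time(model, cost):
    """Predicted execution time for a query with the given cost."""
    slope, intercept = model
    return slope * cost + intercept


def should_terminate(model, cost, elapsed, tolerance=2.0):
    """True when elapsed time exceeds the prediction by the assumed factor."""
    return elapsed > tolerance * predict_time(model, cost)


# Hypothetical history: query costs and measured execution times (seconds).
history_costs = [10, 20, 30, 40]
history_times = [25, 45, 65, 85]
model = fit_linear_model(history_costs, history_times)
```

A stored `model` can then be consulted before a query starts (`predict_time`) or while it runs (`should_terminate`). The abstract also allows other statistical methods, so the linear fit is only one option.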

    Storage configuration in data warehouses
    2.
    Granted invention patent (status: in force)

    Publication No.: US09563687B1

    Publication date: 2017-02-07

    Application No.: US14540648

    Filing date: 2014-11-13

    CPC classification number: G06F17/30306 G06F17/30339

    Abstract: Techniques are described for employing a graph-based analysis to determine a configuration of datasets to be stored on data storage systems in a data warehouse environment. Associations between datasets may be determined based on the parsing of join statements or other types of statements in jobs that are executed on the data storage systems. A graph may be generated that describes the associations among datasets. A greedy breadth-first traversal of the graph may be performed to determine sets of associated datasets. A utilization metric describing a weight of storing the datasets may be determined and employed to identify a data storage system on which to store a set of associated datasets, given the storage and processing capacity of the data storage system.
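
The graph-based grouping described in this abstract might look roughly like the sketch below. It is an illustration under several assumptions: jobs are plain SQL strings, associations are counted from `FROM`/`JOIN` clauses with a simple regular expression, and a per-dataset size in `sizes` stands in for the utilization metric. The names `build_graph`, `group_datasets`, `sizes`, and `capacity` are hypothetical.

```python
import re
from collections import defaultdict, deque

# Assumed, simplified pattern: the table name follows FROM or JOIN.
TABLE_PATTERN = re.compile(r"\bFROM\s+(\w+)|\bJOIN\s+(\w+)", re.IGNORECASE)


def build_graph(jobs):
    """Count how often each pair of datasets appears in the same job."""
    graph = defaultdict(lambda: defaultdict(int))
    for sql in jobs:
        tables = sorted({a or b for a, b in TABLE_PATTERN.findall(sql)})
        for i, left in enumerate(tables):
            graph[left]  # make sure isolated datasets are still nodes
            for right in tables[i + 1:]:
                graph[left][right] += 1
                graph[right][left] += 1
    return graph


def group_datasets(graph, sizes, capacity):
    """Greedy breadth-first grouping of associated datasets.

    Seeds are taken in order of total association weight. From each seed,
    neighbors are visited most-associated first and added while the group
    still fits within `capacity`. Datasets larger than `capacity` are left
    unassigned.
    """
    unassigned = set(graph)
    groups = []
    seeds = sorted(graph, key=lambda n: (-sum(graph[n].values()), n))
    for seed in seeds:
        if seed not in unassigned:
            continue
        group, used = [], 0
        queue = deque([seed])
        while queue:
            node = queue.popleft()
            if node not in unassigned or used + sizes[node] > capacity:
                continue
            unassigned.discard(node)
            group.append(node)
            used += sizes[node]
            neighbors = sorted(graph[node], key=lambda n: (-graph[node][n], n))
            queue.extend(n for n in neighbors if n in unassigned)
        if group:
            groups.append(group)
    return groups
```

Each resulting group is a candidate for placement on a single data storage system. A fuller version would also weigh processing capacity, which the abstract mentions and this sketch omits.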


    Query data acquisition and analysis
    3.
    Granted invention patent (status: in force)

    Publication No.: US09489423B1

    Publication date: 2016-11-08

    Application No.: US13973324

    Filing date: 2013-08-22

    Abstract: Described in this disclosure are systems and techniques for acquiring query data which includes an execution plan descriptive of how queries used to access a database are processed. In one implementation, an inquiry analysis system uses a copy of a production system to generate execution plan information. The copy includes tables, relationships, metadata, and so forth, but may omit data in the tables, allowing for a compact installation. By analyzing the query data, usage trends, inefficient queries, unused fields, and so forth may be determined and used for maintenance or performance improvements.
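
The idea of generating execution plans against a data-free copy can be sketched with SQLite, chosen here only because it ships with Python; the patent does not name a database engine. The helper names `clone_schema`, `explain`, and `find_full_scans` are hypothetical, and treating a plan step that begins with `SCAN` as a sign of a possibly inefficient query is an assumption of this sketch.

```python
import sqlite3


def clone_schema(production):
    """Build an in-memory copy holding tables and indexes but no rows."""
    copy = sqlite3.connect(":memory:")
    rows = production.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE sql IS NOT NULL AND name NOT LIKE 'sqlite_%' "
        "ORDER BY CASE type WHEN 'table' THEN 0 ELSE 1 END"
    ).fetchall()
    for (ddl,) in rows:
        copy.execute(ddl)  # replay the original CREATE statements
    return copy


def explain(copy, query):
    """Return the plan steps SQLite reports for `query` on the copy."""
    rows = copy.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return [row[3] for row in rows]  # fourth column holds the description


def find_full_scans(copy, queries):
    """Queries whose plan contains a full table scan step."""
    return [
        q for q in queries
        if any(step.startswith("SCAN") for step in explain(copy, q))
    ]
```

Because the copy holds no table data, it stays compact, and the planner still reports which indexes each query would use. The exact wording of the plan steps varies between SQLite versions.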

