KEYWORD SEARCH OVER HEAVY-TAILED DATA AND MULTI-KEYWORD QUERIES
    52.
    Patent application
    KEYWORD SEARCH OVER HEAVY-TAILED DATA AND MULTI-KEYWORD QUERIES (pending, published)

    Publication number: US20090083214A1

    Publication date: 2009-03-26

    Application number: US11858920

    Filing date: 2007-09-21

    CPC classification number: G06F16/3331 G06F16/313

    Abstract: Index structures and a query processing framework that enforce a given threshold on the overhead of computing conjunctive keyword queries. This includes a keyword processing algorithm, logic to determine which indexes to materialize, and a probabilistic approach to reducing the overhead of determining which indexes to build. The index structures leverage the fact that the frequency distribution of natural-language text follows a power law. Given a document collection, a set of indexes is proposed for materialization so that the time for intersecting keywords does not exceed a given threshold Δ. With respect to the associated space requirement, the number of additional indexes is limited. Materializing such a set of indexes is feasible for reasonable values of Δ (e.g., the time required to scan 20% of the largest inverted index), at least for a collection of short documents whose word frequencies are distributed according to the power law.
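
    The idea of bounding intersection time by a threshold Δ can be illustrated with a small sketch. This is not the patented algorithm: it assumes a simplified cost model in which intersecting two posting lists costs as much as scanning the shorter one, sets Δ to a fraction of the longest posting list, and precomputes a pair index for every keyword pair that would exceed Δ. The function names and the cost model are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_inverted_index(docs):
    """Map each keyword to the sorted ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def pairs_to_materialize(index, fraction=0.2):
    """Precompute intersections for keyword pairs that exceed the threshold.

    Assumed cost model: intersecting two lists costs the length of the
    shorter list, so a pair exceeds delta only if both lists exceed it.
    """
    delta = fraction * max(len(postings) for postings in index.values())
    heavy = sorted(w for w, postings in index.items() if len(postings) > delta)
    pair_index = {
        (a, b): sorted(set(index[a]) & set(index[b]))
        for a, b in combinations(heavy, 2)
    }
    return delta, pair_index
```

    Because keyword frequencies are heavy-tailed, only a few keywords have posting lists longer than Δ, which is why the number of pair indexes stays small in this toy model.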

    Dynamic physical database design
    54.
    Granted patent
    Dynamic physical database design (in force)

    Publication number: US07483918B2

    Publication date: 2009-01-27

    Application number: US10914901

    Filing date: 2004-08-10

    Abstract: A monitoring component of a database server collects a subset of a query workload along with related statistics. A remote index tuning component uses the workload subset and related statistics to determine a physical design that minimizes the cost of executing queries in the workload subset while ensuring that queries omitted from the subset do not degrade in performance.

    Integrating horizontal partitioning into physical database design
    55.
    Granted patent
    Integrating horizontal partitioning into physical database design (in force)

    Publication number: US07472107B2

    Publication date: 2008-12-30

    Application number: US10601416

    Filing date: 2003-06-23

    Abstract: Integrating the partitioning of physical design structures with the physical design process can result in more efficient query execution. When candidate structures are evaluated for their relative benefit, one or more partitioning methods are associated with each structure so that the benefits of various partitioning methods are taken into consideration when the structures are selected for use by the database. A pool of partitioned candidate structures is formed by proposing and evaluating the benefit of candidate structures with associated partitioning on a per-query basis. The selected partitioned candidates are then used to construct generalized structures with associated partitioning methods, which are evaluated for their benefit over the workload. Those generalized structures are added to the pool of partitioned candidate structures. From this augmented pool of partitioned candidate structures, an optimal set of partitioned structures is enumerated for use by the database system.

    LIGHTWEIGHT PHYSICAL DESIGN ALERTER
    56.
    Patent application
    LIGHTWEIGHT PHYSICAL DESIGN ALERTER (in force)

    Publication number: US20080183644A1

    Publication date: 2008-07-31

    Application number: US11669782

    Filing date: 2007-01-31

    CPC classification number: G06F17/30306

    Abstract: A lightweight physical design alerter can analyze a workload and determine whether a comprehensive tuning session would result in a configuration improvement over the current configuration. The alerter provides a low-overhead procedure that can run during normal operation of a database management system and produce a notification if a current configuration is less than optimal. The alerter can report lower and upper bounds on the improvements that could be obtained if a comprehensive tuning tool is launched. A lower bound can be justified by generating feasible configurations. The disclosed embodiments can be extended to query updates, materialized views, and other physical design features (e.g., partitioning).

    Method and apparatus for exploiting statistics on query expressions for optimization
    57.
    Granted patent
    Method and apparatus for exploiting statistics on query expressions for optimization (in force)

    Publication number: US07363289B2

    Publication date: 2008-04-22

    Application number: US11177598

    Filing date: 2005-07-07

    Abstract: A method for evaluating a user query on a relational database having records stored therein, a workload made up of a set of queries that have been executed on the database, and a query optimizer that generates a query execution plan for the user query. Each query plan includes a plurality of intermediate query plan components that verify a subset of records from the database meeting query criteria. The method accesses the query plan and a set of stored intermediate statistics for records verified by query components, such as histograms that summarize the cardinality of the records that verify the query component. The method forms a transformed query plan based on the selected intermediate statistics (possibly by rewriting the query plan) and estimates the cardinality of the transformed query plan to arrive at a more accurate cardinality estimate for the query. If additional intermediate statistics are necessary, a pool of intermediate statistics may be generated based on the queries in the workload by evaluating the benefit of a given statistic over the workload and adding intermediate statistics to the pool that provide relatively great benefit.

    Automated layout of relational databases
    58.
    Granted patent
    Automated layout of relational databases (in force)

    Publication number: US07249141B2

    Publication date: 2007-07-24

    Application number: US10426235

    Filing date: 2003-04-30

    Abstract: Layout in a database system is performed using workload information. Execution information for a workload is obtained. Cumulative access and co-access information for database objects is then assembled. A cost model is developed for quantitatively capturing the value of different layouts, and a search is performed for a recommended database layout. In one embodiment, a greedy search is performed which initially attempts to provide a layout that minimizes co-location of objects on storage objects, and then attempts to improve that layout via a greedy search.
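
    The initial placement step can be pictured with a minimal sketch. It is not the patented method: it assumes co-access information is given as a dictionary of pairwise weights, treats each storage object as a plain list, and places every database object on the disk where it adds the least co-access weight. The later improvement pass mentioned in the abstract is not shown, and all names are hypothetical.

```python
def greedy_layout(objects, co_access, num_disks):
    """Place each object on the disk where it adds the least co-access weight.

    co_access maps an (object, object) pair to an assumed weight describing
    how often the two objects are accessed together by the workload.
    """
    disks = [[] for _ in range(num_disks)]

    def weight(a, b):
        # Look the pair up in either order; missing pairs count as zero.
        return co_access.get((a, b), 0) + co_access.get((b, a), 0)

    for obj in objects:
        # Prefer the disk with the lowest added weight, then the emptier one.
        best = min(
            disks,
            key=lambda disk: (sum(weight(obj, other) for other in disk), len(disk)),
        )
        best.append(obj)
    return disks
```

    With two disks and a heavily co-accessed pair of tables, the sketch separates the pair so that both can be read in parallel, which is the intuition behind minimizing co-location.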
