Primitives for workload summarization
    101.
    Patent application
    Status: in force

    Publication number: US20050223026A1

    Publication date: 2005-10-06

    Application number: US10815061

    Filing date: 2004-03-31

    Abstract: A database object summarization tool is provided that selects a subset of database objects subject to filtering constraints such as a partial order or optimization of some attribute. A dominance primitive filters out tuples that are dominated by another tuple according to a partial order constraint. A representation primitive selects a representative subset of tuples such that an optimization criterion is met.
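
    The dominance primitive can be illustrated with a minimal sketch. It assumes a componentwise partial order over numeric tuples (larger is better on every attribute); the abstract does not fix a particular order, and the names `dominates` and `dominance_filter` are hypothetical.

```python
from typing import List, Sequence, Tuple

Row = Tuple[float, ...]


def dominates(a: Row, b: Row) -> bool:
    """Assumed partial order: a dominates b when a is at least as large on
    every attribute and strictly larger on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def dominance_filter(rows: Sequence[Row]) -> List[Row]:
    """Keep only the rows that no other row dominates."""
    return [r for r in rows if not any(dominates(other, r) for other in rows)]


rows = [(1, 5), (2, 2), (3, 1), (1, 1), (2, 5)]
survivors = dominance_filter(rows)
```

    This quadratic scan is only meant to show the filtering semantics; equal tuples do not dominate each other, so exact duplicates both survive.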

    Method and apparatus for exploiting statistics on query expressions for optimization

    Publication number: US06947927B2

    Publication date: 2005-09-20

    Application number: US10191822

    Filing date: 2002-07-09

    Abstract: A method for evaluating a user query on a relational database having records stored therein, a workload made up of a set of queries that have been executed on the database, and a query optimizer that generates a query execution plan for the user query. Each query plan includes a plurality of intermediate query plan components that verify a subset of records from the database meeting query criteria. The method accesses the query plan and a set of stored intermediate statistics for records verified by query components, such as histograms that summarize the cardinality of the records that verify the query component. The method forms a transformed query plan based on the selected intermediate statistics (possibly by rewriting the query plan) and estimates the cardinality of the transformed query plan to arrive at a more accurate cardinality estimate for the query. If additional intermediate statistics are necessary, a pool of intermediate statistics may be generated based on the queries in the workload by evaluating the benefit of a given statistic over the workload and adding intermediate statistics to the pool that provide relatively great benefit.

    Database monitoring system
    103.
    Patent application
    Status: in force

    Publication number: US20050192921A1

    Publication date: 2005-09-01

    Application number: US10788077

    Filing date: 2004-02-26

    Abstract: A framework is provided within a database system for specifying database monitoring rules that will be evaluated as part of the execution code path of database events being monitored. The occurrence of a selected database event triggers a rule that evaluates some parameter of an object related to the event against a condition in the rule. If the condition is met, a specified action is taken that can alter the execution of the database event or database system performance. Lightweight aggregation tables are utilized to enable aggregation of object parameter values so that presently occurring events can be compared to a summary of the object parameter values from previously occurring database events. Signatures are assigned to queries based on the structure of the query plan so that information in the lightweight aggregation tables can be grouped according to query signature.
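
    The rule mechanism described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the `Monitor` class, the choice of query duration as the monitored parameter, the "more than twice the running mean" condition, and recording an alert as the action. Query signatures are represented as plain strings.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, List


@dataclass
class Aggregate:
    """One row of a lightweight aggregation table (count and running total)."""
    count: int = 0
    total: float = 0.0

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


class Monitor:
    def __init__(self, slowdown_factor: float = 2.0) -> None:
        # Aggregates are grouped by query signature.
        self.by_signature: DefaultDict[str, Aggregate] = defaultdict(Aggregate)
        self.slowdown_factor = slowdown_factor
        self.alerts: List[str] = []

    def on_query_end(self, signature: str, duration: float) -> None:
        """Event hook: compare the current event to the summary of past events."""
        agg = self.by_signature[signature]
        # Condition: this run is much slower than earlier runs with the same signature.
        if agg.count and duration > self.slowdown_factor * agg.mean:
            self.alerts.append(signature)  # Action: record an alert.
        agg.count += 1
        agg.total += duration


monitor = Monitor()
for seconds in (1.0, 1.0, 5.0):
    monitor.on_query_end("scan(orders)", seconds)
```

    The hook checks the condition before folding the new value into the aggregate, so an event is always compared against previously occurring events only.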

    Compressing database workloads
    104.
    Granted patent
    Status: in force

    Publication number: US06912547B2

    Publication date: 2005-06-28

    Application number: US10180667

    Filing date: 2002-06-26

    Abstract: Relational database applications such as index selection, histogram tuning, approximate query processing, and statistics selection have recognized the importance of leveraging workloads. Often these applications are presented with large workloads, i.e., a set of SQL DML statements, as input. A key factor affecting the scalability of such applications is the size of the workload. The invention concerns workload compression which helps improve the scalability of such applications. The exemplary embodiment is broadly applicable to a variety of workload-driven applications, while allowing for incorporation of application specific knowledge. The process is described in detail in the context of two workload-driven applications: index selection and approximate query processing.

    Generalized keyword matching for keyword based searching over relational databases
    105.
    Granted patent
    Status: in force

    Publication number: US06792414B2

    Publication date: 2004-09-14

    Application number: US10036348

    Filing date: 2001-10-19

    Abstract: Searching by keywords and providing generalized matching capabilities on a relational database is enabled by performing preprocessing operations to construct inverted list lookup tables based on data record components at an interim level of granularity, such as column location. Prefix information is stored in the inverted list for each keyword, keyword sub-string, or stemmed version of the keyword. A keyword search is performed on the lookup tables rather than the database tables to determine database column locations of the keyword. The lookup tables are scanned to identify each prefix associated with the search term. Schema information about the database is used to link the column locations to form database subgraphs that span the keywords. Join tables are generated based on the subgraphs consisting of columns containing the keywords. A query on the database is generated to join the tables and retrieve database rows that contain the keyword and the prefixes associated with the keyword. The retrieved rows are ranked in order of relevance before being output. By preprocessing a relational database to form lookup tables, and initially searching the lookup tables to obtain a targeted subset of the database upon which SQL queries can be performed to collect data records, keyword searching on a relational database is made efficient.
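
    The preprocessing and lookup steps can be sketched in a simplified form. The sketch assumes whitespace tokenization and an in-memory dictionary standing in for the lookup tables, and it omits the prefix, sub-string, and stemming information as well as the join and ranking steps; `build_lookup` and `find_columns` are hypothetical names.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Set, Tuple

ColumnLoc = Tuple[str, str]  # (table name, column name)


def build_lookup(tables: Dict[str, List[Dict[str, str]]]) -> Dict[str, Set[ColumnLoc]]:
    """Preprocess rows into an inverted list at column-level granularity."""
    lookup: DefaultDict[str, Set[ColumnLoc]] = defaultdict(set)
    for table, rows in tables.items():
        for row in rows:
            for column, value in row.items():
                for token in str(value).lower().split():
                    lookup[token].add((table, column))
    return dict(lookup)


def find_columns(lookup: Dict[str, Set[ColumnLoc]], keyword: str) -> Set[ColumnLoc]:
    """Search the lookup table, not the base tables, for a keyword's columns."""
    return lookup.get(keyword.lower(), set())


tables = {
    "authors": [{"name": "Jim Gray", "city": "Seattle"}],
    "books": [{"title": "Gray Codes"}],
}
lookup = build_lookup(tables)
locations = find_columns(lookup, "gray")
```

    The returned column locations are the starting point for the schema-driven linking step: only tables that own one of these columns need to appear in the generated query.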

    Self-tuning histogram and database modeling
    106.
    Granted patent
    Status: in force

    Publication number: US06460045B1

    Publication date: 2002-10-01

    Application number: US09268589

    Filing date: 1999-03-15

    Abstract: Building histograms by using feedback information about the execution of query workload rather than by examining the data helps reduce the cost of building and maintaining histograms. A method of maintaining self-tuning histograms updates histograms based on feedback about the execution of a user query. A histogram may be initialized using an assumption of uniform distribution of data or by combining existing histograms. A histogram tuner accesses an estimated result in response to a user query generated by using the histogram. The histogram tuner calculates an estimation error based on the result of the user query and the estimated result. The frequencies of histogram buckets are refined based on the estimation error. The bucket bounds of the histogram are restructured based on the refined frequencies. The method may be performed on-line after a user query or off-line by accessing a workload log. By updating a histogram without accessing the database, the cost of building and maintaining histograms is significantly reduced.
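
    The frequency-refinement step can be sketched as below. The abstract does not give an update formula, so this is one plausible scheme chosen for illustration: the error of a range query is spread over the overlapping buckets in proportion to their share of the estimate, scaled by an assumed `damping` factor. Restructuring of bucket bounds is not shown, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bucket:
    lo: float
    hi: float
    freq: float


def overlap_fraction(b: Bucket, ql: float, qh: float) -> float:
    """Fraction of bucket b covered by the query range [ql, qh)."""
    width = b.hi - b.lo
    overlap = max(0.0, min(b.hi, qh) - max(b.lo, ql))
    return overlap / width if width > 0 else 0.0


def estimate(buckets: List[Bucket], ql: float, qh: float) -> float:
    """Estimated row count, assuming uniform distribution inside each bucket."""
    return sum(b.freq * overlap_fraction(b, ql, qh) for b in buckets)


def refine(buckets: List[Bucket], ql: float, qh: float,
           actual: float, damping: float = 0.5) -> float:
    """Adjust bucket frequencies from query feedback; returns the estimation error."""
    est = estimate(buckets, ql, qh)
    error = actual - est
    if est > 0:
        for b in buckets:
            share = b.freq * overlap_fraction(b, ql, qh)
            if share > 0:
                b.freq = max(0.0, b.freq + damping * error * share / est)
    return error


histogram = [Bucket(0, 10, 100.0), Bucket(10, 20, 100.0)]
err = refine(histogram, 0, 10, actual=60)
```

    Only the observed result size of the query is used, so the histogram is updated without reading the underlying table.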

    What-if index analysis utility for database systems
    107.
    Granted patent
    Status: in force

    Publication number: US06223171B1

    Publication date: 2001-04-24

    Application number: US09139843

    Filing date: 1998-08-25

    Abstract: A what-if index analysis utility provides the ability to analyze the performance of the existing configuration of a database system with respect to one or more workloads of queries and to propose a hypothetical configuration for the database system to analyze its potential impact on the performance of the database system. The utility may be used, for example, to perform an impact analysis of the set of indexes selected by an index selection tool with respect to a workload of queries. It may also be used to explore what-if scenarios for the database system by analyzing the impact of hypothetical sets of indexes with respect to the execution of various workloads over projected sizes of a database. The utility may be used to perform summarizations of workloads, configurations, and the performance of workloads with respect to the existing configuration and hypothetical configurations. The what-if index analysis utility may be used, for example, by a database administrator or a physical database design tool to help improve performance of a database system.

Method and apparatus for query optimization in a relational database system having foreign functions
    108.
    Granted patent
    Status: lapsed

    Publication number: US5544355A

    Publication date: 1996-08-06

    Application number: US77227

    Filing date: 1993-06-14

    CPC classification number: G06F17/30463 Y10S707/99932

    Abstract: Database applications typically need to invoke foreign functions or to access data that is not stored in the database. The invention provides a comprehensive approach to cost-based optimization of relational queries in the presence of such foreign functions. The optimization takes into account semantic information about foreign functions using a declarative rule language (e.g., SQL) to express such semantics. Procedures for applying the rewrite rules and for generating the execution space of equivalent queries are described. Procedures to obtain an optimal plan from this enriched execution space are also described. Moreover, extensions to the cost model that are needed in the presence of foreign functions are described.

    Tagging entities with descriptive phrases
    109.
    Granted patent
    Status: in force

    Publication number: US09298825B2

    Publication date: 2016-03-29

    Application number: US13298349

    Filing date: 2011-11-17

    CPC classification number: G06F17/30864 G06F17/30277

    Abstract: A plurality of description phrases associated with a first domain may be determined, based on an analysis of a first plurality of documents to determine co-occurrences of the description phrases with one or more name labels associated with the first domain. An entity associated with the first domain may be obtained. An analysis of a second plurality of documents may be initiated to identify co-occurrences of mentions of the obtained entity and one or more of the plurality of description phrases, and contexts associated with each of the co-occurrences of the mentions and description phrases, in each one of the second plurality of documents. A description tag association between the obtained entity and one of the description phrases may be determined, based on an analysis of the identified contexts.

    Scalable lookup-driven entity extraction from indexed document collections
    110.
    Granted patent
    Status: in force

    Publication number: US08782061B2

    Publication date: 2014-07-15

    Application number: US12144675

    Filing date: 2008-06-24

    CPC classification number: G06F17/30011 G06F17/278

    Abstract: A set of documents is filtered for entity extraction. A list of entity strings is received. A set of token sets that covers the entity strings in the list is determined. An inverted index generated on a first set of documents is queried using the set of token sets to determine a set of document identifiers for a subset of the documents in the first set. A second set of documents identified by the set of document identifiers is retrieved from the first set of documents. The second set of documents is filtered to include one or more documents of the second set that each includes a match with at least one entity string of the list of entity strings. Entity recognition may be performed on the filtered second set of documents.
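
    The filtering pipeline can be sketched as follows, under simplifying assumptions: the covering token sets are taken to be the tokens of each entity string (the abstract leaves the choice of cover open), tokenization is by whitespace, and the final match is a case-insensitive substring test. All function names are hypothetical.

```python
from typing import Dict, List, Set


def build_inverted_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each lowercase token to the ids of the documents containing it."""
    index: Dict[str, Set[int]] = {}
    for doc_id, text in docs.items():
        for token in set(text.lower().split()):
            index.setdefault(token, set()).add(doc_id)
    return index


def candidate_docs(index: Dict[str, Set[int]], token_sets: List[List[str]]) -> Set[int]:
    """Documents that contain every token of at least one token set."""
    ids: Set[int] = set()
    for tokens in token_sets:
        postings = [index.get(t, set()) for t in tokens]
        if postings:
            ids |= set.intersection(*postings)
    return ids


def filter_docs(docs: Dict[int, str], entities: List[str]) -> List[int]:
    """Return ids of documents that actually contain one of the entity strings."""
    token_sets = [e.lower().split() for e in entities]
    candidates = candidate_docs(build_inverted_index(docs), token_sets)
    return sorted(
        d for d in candidates
        if any(e.lower() in docs[d].lower() for e in entities)
    )


docs = {
    1: "I flew to New York today",
    2: "York is a new experience",
    3: "Seattle is rainy",
}
matched = filter_docs(docs, ["New York", "Boston"])
```

    The index lookup narrows the collection to documents 1 and 2, and the second pass discards document 2 because its tokens appear without forming the entity string. A real matcher would respect token boundaries rather than use a raw substring test.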
