Efficient type annotation of XML schema-validated XML documents without schema validation
    1.
    Patent application
    Status: under examination (published)

    Publication No.: US20050177578A1

    Publication date: 2005-08-11

    Application No.: US10774584

    Filing date: 2004-02-10

    IPC classification: G06F7/00

    Abstract: The storage of type annotation record information for annotated automaton encoding, used for high-performance XML schema validation, is optimized for space efficiency. Once the type annotation record information has been organized, type annotation records are used to annotate validated XML documents with types, either by implementing only the annotation records and the type annotation part of an algorithm, or by skipping one or more validation steps in a full validation implementation. Given a schema context, type annotation may be performed on a validated XML fragment rather than on an entire document. In addition, default features such as attribute and type are supported.

    Efficient XML schema validation of XML fragments using annotated automaton encoding
    2.
    Patent application
    Status: lapsed

    Publication No.: US20050177543A1

    Publication date: 2005-08-11

    Application No.: US10774594

    Filing date: 2004-02-10

    IPC classification: G06F7/00 G06F17/22 G06F17/27

    Abstract: A method and system for Extensible Markup Language (XML) schema validation includes: loading an XML document into a runtime validation engine, where the runtime validation engine includes an XML schema validation parser; loading an annotated automaton encoding (AAE) for an XML schema definition into the XML schema validation parser; and validating the XML document against the XML schema definition by the XML schema validation parser utilizing the annotated automaton encoding. Each XML schema definition is compiled once into the AAE format, rather than being compiled each time an XML document is validated, and thus significant time is saved. The code for the runtime validation engine is fixed rather than varying with each XML schema definition, and thus space overhead is minimized. Flexibility in the validation process is provided without compromising performance.
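
    As an illustration of the compile-once idea in this abstract, the following is a minimal Python sketch, not the patented encoding. It assumes a content model that is a plain sequence of required or optional child elements; `compile_sequence`, `validate`, and the table layout are hypothetical names chosen for this example, and the annotations that an AAE carries are not modeled.

```python
def compile_sequence(particles):
    """Compile [(element_name, required), ...] once into (transitions, accepting)."""
    n = len(particles)
    transitions = {}
    for state in range(n + 1):
        # From each state, optional particles may be skipped up to the next required one.
        for nxt in range(state, n):
            name, required = particles[nxt]
            transitions.setdefault((state, name), nxt + 1)
            if required:
                break
    # A state is accepting when no required particle remains after it.
    accepting = {s for s in range(n + 1)
                 if not any(required for _, required in particles[s:])}
    return transitions, accepting


def validate(encoding, child_names):
    """Schema-independent loop: walk the precompiled table over the child names."""
    transitions, accepting = encoding
    state = 0
    for name in child_names:
        state = transitions.get((state, name))
        if state is None:
            return False
    return state in accepting


# Compiled once, then reused for every document.
book = compile_sequence([("title", True), ("author", False), ("year", True)])
ok = validate(book, ["title", "year"])
bad = validate(book, ["author", "year"])
```

    The validation loop is identical for every schema; only the compiled table changes, which mirrors the fixed-engine property the abstract describes.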

    Packing nodes into records to store XML XQuery data model and other hierarchically structured data
    3.
    Patent application
    Status: lapsed

    Publication No.: US20070043743A1

    Publication date: 2007-02-22

    Application No.: US11209997

    Filing date: 2005-08-22

    IPC classification: G06F7/00

    CPC classification: G06F17/30917 G06F17/30911

    Abstract: A storage scheme for the nodes of hierarchically structured data uses logical node identifiers to reference the nodes stored within and across record data structures. A node identifier index is used to map each logical node identifier to a record identifier for the record that contains the node. When a sub-tree is stored in a separate record, a proxy node is used to represent the sub-tree in the parent record. The mapping in the node identifier index reflects the storage of the sub-tree nodes in the separate record. Since the references between the records are through logical node identifiers, records can be moved across pages without restriction, as long as the indices are updated or rebuilt to maintain synchronization with the resulting data pages. This approach is highly scalable and consumes much less storage than approaches that use explicit references between nodes.
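
    The indirection described in this abstract can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the patented storage layout: a Python dict stands in for the node identifier index, tuples stand in for logical node identifiers, and `NodeStore`, `PROXY`, and the record IDs are hypothetical names.

```python
PROXY = object()  # hypothetical marker: the sub-tree lives in another record


class NodeStore:
    def __init__(self):
        self.records = {}     # record_id -> {logical node_id: payload or PROXY}
        self.node_index = {}  # logical node_id -> record_id holding the real node

    def put(self, record_id, node_id, payload):
        self.records.setdefault(record_id, {})[node_id] = payload
        if payload is not PROXY:
            self.node_index[node_id] = record_id

    def get(self, node_id):
        # Nodes are always reached through the index, never by a physical pointer.
        return self.records[self.node_index[node_id]][node_id]

    def move_record(self, old_id, new_id):
        # Moving a record only requires refreshing the index entries for its nodes.
        nodes = self.records.pop(old_id)
        self.records[new_id] = nodes
        for node_id, payload in nodes.items():
            if payload is not PROXY:
                self.node_index[node_id] = new_id


store = NodeStore()
store.put("r1", (1,), "<book>")
store.put("r1", (1, 2), PROXY)        # parent record keeps only a proxy
store.put("r2", (1, 2), "<chapter>")  # sub-tree root stored separately
store.put("r2", (1, 2, 1), "<para>")
store.move_record("r2", "r9")
chapter = store.get((1, 2))
```

    Because the parent record refers to the sub-tree only by its logical identifier, relocating the sub-tree record leaves the parent record untouched.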

    Streaming XPath algorithm for XPath value index key generation
    4.
    Patent application
    Status: lapsed

    Publication No.: US20060106758A1

    Publication date: 2006-05-18

    Application No.: US10990834

    Filing date: 2004-11-16

    IPC classification: G06F17/00

    Abstract: A method generates hierarchical path index keys for single and multiple indexes with one scan of a document. Each data node of the document is scanned and matches to query nodes are identified. A data node matches a query node if three conditions hold: if the step is not the root step, there is a match for the query node in the previous step of the query; the data node matches the query node of the current step; and the edges of the data and query nodes match. A sub-tree of a data node can be skipped if the data node is not matched and its level is less than the fixed levels of the query. The matched data node is then placed in the match stacks corresponding to the matched query nodes. The method uses transitivity properties among matching units to reduce the number of states that need to be tracked and to improve the evaluation of path expressions significantly.

    Scalable storage schemes for native XML column data of relational tables
    5.
    Patent application
    Status: in force

    Publication No.: US20070043751A1

    Publication date: 2007-02-22

    Application No.: US11209598

    Filing date: 2005-08-22

    IPC classification: G06F7/00

    CPC classification: G06F17/30595 G06F17/30923

    Abstract: A method and system for providing a scalable storage scheme for native hierarchically structured data of relational tables includes a base table with indicator columns with information pertaining to hierarchically structured data of a document, data tables for storing the hierarchically structured data corresponding to the indicator columns, and node identifier indexes corresponding to the data tables for mapping between the indicator columns and the hierarchically structured data in the data tables. In an embodiment, actual data for each hierarchically structured data (such as XML) column is stored in a separate data table, and each data table has a separate node identifier index. The node identifier index is searched with a key containing the document identifier and a logical node identifier, and the record identifier of the record in the data table containing the node assigned that logical node identifier is retrieved.

    Query transformation for union all view join queries using join predicates for pruning and distribution
    6.
    Patent application
    Status: in force

    Publication No.: US20050065926A1

    Publication date: 2005-03-24

    Application No.: US10669749

    Filing date: 2003-09-24

    IPC classification: G06F17/30

    Abstract: A method, apparatus, and article of manufacture for optimizing a query in a computer system, wherein the query is performed by the computer system to retrieve data from a database stored on the computer system. The optimization includes: (a) combining join predicates from a query with local predicates from each branch of one or more UNION ALL views referenced by the query; (b) analyzing the combined predicates; and (c) not generating the join when the analysis step indicates that the combined predicates lead to an empty result.
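
    Steps (a) to (c) can be illustrated with a small Python sketch. It assumes, purely for the example, an equality join on one key between two UNION ALL views whose branches each carry a local predicate of the form `low <= key <= high`; `intersect`, `plan_joins`, and the branch names are hypothetical and do not come from the patent.

```python
from itertools import product


def intersect(a, b):
    """Combine two inclusive ranges; return None when no key can satisfy both."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None


def plan_joins(left_branches, right_branches):
    """Keep only the branch pairs whose combined predicates can yield rows."""
    plan = []
    for (lname, lrange), (rname, rrange) in product(
        left_branches.items(), right_branches.items()
    ):
        # (a) the join predicate left.key = right.key lets both local ranges apply to one key
        # (b) analyze the combined predicate
        if intersect(lrange, rrange) is not None:
            plan.append((lname, rname))
        # (c) otherwise the join for this pair is never generated
    return plan


orders = {"orders_q1": (1, 3), "orders_q2": (4, 6)}        # month ranges per branch
shipments = {"shipments_q1": (1, 3), "shipments_q2": (4, 6)}
plan = plan_joins(orders, shipments)
```

    In this example two of the four possible branch joins are dropped before execution, because their combined month ranges are empty.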

    Method, system, and program for a join operation on a multi-column table and satellite tables including duplicate values
    7.
    Granted patent
    Status: in force

    Publication No.: US06374235B1

    Publication date: 2002-04-16

    Application No.: US09344731

    Filing date: 1999-06-25

    IPC classification: G06F17/30

    Abstract: Disclosed is a method, system, and program for performing a join operation on a multi-column table and at least two satellite tables having a join condition. Each satellite table is comprised of multiple rows and at least one join column. The multi-column table is comprised of multiple rows and at least one column corresponding to the join column in each satellite table. A join operation is performed on the rows of the satellite tables to generate concatenated rows of the satellite tables. One of the concatenated rows is joined to the multi-column table and a returned entry from the multi-column table is received. A determination is then made as to whether the returned entry matches the search criteria. If so, a determination is made as to whether one of the satellite tables has duplicates of values in the join column of the returned matching entry or the multi-column table has duplicate entries in the join columns. Returned matching entries are generated for each duplicate value in the satellite tables and duplicate entry in the multi-column table.

    Order-preserving encoding formats of floating-point decimal numbers for efficient value comparison
    8.
    Patent application
    Status: in force

    Publication No.: US20070050436A1

    Publication date: 2007-03-01

    Application No.: US11213551

    Filing date: 2005-08-26

    IPC classification: G06F7/00

    Abstract: A method for conversion between a decimal floating-point number and an order-preserving format is disclosed. The method encodes numbers in the decimal floating-point format into a format which preserves value ordering. This encoding allows for fast and direct string comparison of two values. Such an encoding provides normalized representations for decimal floating-point numbers and supports type-insensitive comparisons. Type-insensitive comparisons are often used in database management systems, where the data type is not specified for the values to be compared. In addition, the original decimal floating-point format can be recovered from the order-preserving format.
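
    To make the idea of an order-preserving encoding concrete, here is a minimal Python sketch. It is one possible byte layout chosen for illustration, not the format claimed in the patent: `encode` and the class bytes `0x40`, `0x80`, `0xC0` are assumptions, only finite values whose adjusted exponent fits in 16 bits are handled, and trailing zeros are dropped, so unlike the patented format this sketch cannot recover the original representation.

```python
from decimal import Decimal
import struct


def encode(value: Decimal) -> bytes:
    """Map a finite Decimal to bytes whose bytewise order matches numeric order."""
    if not value.is_finite():
        raise ValueError("this sketch handles finite values only")
    sign, digits, _ = value.as_tuple()
    if not any(digits):
        return b"\x80"                      # zero sorts between negatives and positives
    digits = list(digits)
    while digits[-1] == 0:                  # normalize: 1.20 and 1.2 encode identically
        digits.pop()
    # Biased exponent of the leading digit; assumes -32768 <= adjusted() <= 32767.
    body = struct.pack(">H", value.adjusted() + 0x8000)
    # Digits are shifted by one so the 0x00 terminator sorts below every digit.
    body += bytes(d + 1 for d in digits) + b"\x00"
    if sign:
        # Negative values: invert every byte so larger magnitudes sort first.
        return b"\x40" + bytes(0xFF - b for b in body)
    return b"\xc0" + body


values = [Decimal(s) for s in ["99", "-1.2", "0.001", "-100", "1.25", "0", "1.2", "-1.25"]]
by_bytes = sorted(values, key=encode)       # plain byte comparison
by_value = sorted(values)                   # numeric comparison
```

    Sorting by the encoded bytes gives the same order as sorting numerically, which is the property that allows direct string comparison of two encoded values.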

    Dynamic selection of optimal grouping sequence at runtime for grouping sets, rollup and cube operations in SQL query processing
    9.
    Patent application
    Status: under examination (published)

    Publication No.: US20050027690A1

    Publication date: 2005-02-03

    Application No.: US10629459

    Filing date: 2003-07-29

    IPC classification: G06F17/30 G06F7/00

    CPC classification: G06F16/24537

    Abstract: A method, apparatus, and article of manufacture for optimizing a query in a computer system. During compilation of the query, a GROUP BY clause with one or more GROUPING SETS, ROLLUP or CUBE operations is maintained in its original form until after query rewrite. The GROUP BY clause with the GROUPING SETS, ROLLUP or CUBE operations is then translated into a plurality of levels having one or more grouping sets. After compilation of the query, a grouping sets sequence is dynamically determined for the GROUP BY clause with the GROUPING SETS, ROLLUP or CUBE operations based on intermediate grouping sets, in order to optimize the grouping sets sequence. The execution of the grouping sets sequence is optimized by selecting a smallest grouping set from a previous one of the levels as an input to a grouping set on a next one of the levels. Finally, a UNION ALL operation is performed on the grouping sets.
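
    The level-by-level evaluation described in this abstract can be sketched in Python for a CUBE over a summed measure. This is an illustration under stated assumptions rather than the patented procedure: `group` and `cube` are hypothetical names, the aggregate is a plain sum (which can be re-aggregated from coarser results), and a previous-level grouping set is treated as a usable input only when it contains all the columns of the target set.

```python
from collections import defaultdict
from itertools import combinations


def group(pairs, in_cols, out_cols):
    """Sum values from (key, value) pairs keyed by in_cols, regrouped by out_cols."""
    idx = [in_cols.index(c) for c in out_cols]
    out = defaultdict(int)
    for key, value in pairs:
        out[tuple(key[i] for i in idx)] += value
    return dict(out)


def cube(base_pairs, cols):
    cols = tuple(cols)
    # Level 0 holds the finest grouping set; each later level has one column fewer.
    levels = [[cols]] + [list(combinations(cols, k)) for k in range(len(cols) - 1, -1, -1)]
    results = {cols: group(base_pairs, cols, cols)}
    for prev, cur in zip(levels, levels[1:]):
        for gset in cur:
            # Runtime choice: among previous-level sets covering gset, take the one with fewest rows.
            parents = [p for p in prev if set(gset) <= set(p)]
            best = min(parents, key=lambda p: len(results[p]))
            results[gset] = group(results[best].items(), best, gset)
    # UNION ALL: concatenate the rows of every grouping set.
    return [(gset, key, total) for gset, rows in results.items() for key, total in rows.items()]


sales = [(("east", "a"), 10), (("east", "b"), 5), (("west", "a"), 7)]
rows = cube(sales, ["region", "product"])
```

    The choice of input is made from the sizes of the intermediate results, so it can only happen at run time, after the previous level has been computed.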
