Indexing Mechanism for Efficient Node-Aware Full-Text Search Over XML
    1.
    Type: Patent application
    Status: Active

    Publication number: US20100169354A1

    Publication date: 2010-07-01

    Application number: US12346327

    Filing date: 2008-12-30

    IPC classification: G06F7/06 G06F17/30

    CPC classification: G06F17/30911

    Abstract: Techniques are provided for searching within a collection of XML documents. A relational table in an XML index stores an entry for each node of a set of nodes in the collection. Each entry of the relational table stores an order key and a path identifier along with the atomized value of the node. An index on the atomized value provides a mechanism to perform a node-aware full-text search. Instead of storing the atomized value in the table, a virtual column may be created to represent, for each node, the atomized value of the node. Alternatively, each entry of the relational table stores an order key and a path identifier along with, for simple nodes, the atomized value, and for complex nodes, a null value. For a complex node with a descendant text node, a separate entry is stored for the descendant text node in the relational table.
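
The table layout described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`build_node_table`, `build_text_index`, `search`), the dotted order-key format, and the whitespace tokenizer are all assumptions made for illustration. Text fragments are joined with a space so that words from adjacent elements do not merge, which simplifies the strict XQuery string value.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_node_table(xml_text):
    """Return (rows, path_ids); each row is (order_key, path_id, atomized_value)."""
    root = ET.fromstring(xml_text)
    path_ids = {}  # path string -> small integer path identifier
    rows = []

    def visit(elem, order_key, parent_path):
        path = parent_path + "/" + elem.tag
        path_id = path_ids.setdefault(path, len(path_ids) + 1)
        # Atomized value (simplified): all descendant text in document order.
        value = " ".join(t.strip() for t in elem.itertext() if t.strip())
        rows.append((order_key, path_id, value))
        for position, child in enumerate(elem, start=1):
            visit(child, order_key + "." + str(position), path)

    visit(root, "1", "")
    return rows, path_ids


def build_text_index(rows):
    """Inverted index on the atomized value: token -> set of order keys."""
    index = defaultdict(set)
    for order_key, _path_id, value in rows:
        for token in value.lower().split():
            index[token].add(order_key)
    return index


def search(rows, path_ids, index, path, word):
    """Node-aware search: nodes on `path` whose atomized value contains `word`."""
    wanted = path_ids.get(path)
    hits = index.get(word.lower(), set())
    return sorted(key for key, pid, _value in rows if pid == wanted and key in hits)
```

Because every node has its own row, a query can restrict a keyword match to nodes on one path (for example `/book/chapter`) rather than to whole documents.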


    Indexing mechanism for efficient node-aware full-text search over XML
    2.
    Type: Granted patent
    Status: Active

    Publication number: US08219563B2

    Publication date: 2012-07-10

    Application number: US12346327

    Filing date: 2008-12-30

    IPC classification: G06F17/30

    CPC classification: G06F17/30911

    Abstract: Techniques are provided for searching within a collection of XML documents. A relational table in an XML index stores an entry for each node of a set of nodes in the collection. Each entry of the relational table stores an order key and a path identifier along with the atomized value of the node. An index on the atomized value provides a mechanism to perform a node-aware full-text search. Instead of storing the atomized value in the table, a virtual column may be created to represent, for each node, the atomized value of the node. Alternatively, each entry of the relational table stores an order key and a path identifier along with, for simple nodes, the atomized value, and for complex nodes, a null value. For a complex node with a descendant text node, a separate entry is stored for the descendant text node in the relational table.


    Indexing strategy with improved DML performance and space usage for node-aware full-text search over XML
    3.
    Type: Granted patent
    Status: Active

    Publication number: US08126932B2

    Publication date: 2012-02-28

    Application number: US12346393

    Filing date: 2008-12-30

    IPC classification: G06F7/00

    CPC classification: G06F17/30911

    Abstract: Techniques are provided for searching within a collection of XML documents. A relational table stores an entry for each node of a set of nodes in a collection of XML documents. Each entry of the relational table stores an order key and a path identifier along with the atomized value of the node. Instead of storing the atomized value in a full-text index, a virtual column can be created to represent, for each node, the atomized value of the node. Alternatively, each entry of the relational table stores an order key and a path identifier along with, for simple nodes, the atomized value, and for complex nodes, a null value. For a complex node with a descendant text node, a separate entry is stored for the descendant text node in the relational table.
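
The space-saving alternative in this abstract (a null value for complex nodes, plus separate rows for their text nodes) might look like the following Python sketch. The names `build_sparse_table` and `atomized_value`, the `/text()` path suffix, and the dotted order keys are hypothetical. The `atomized_value` helper is only loosely analogous to the virtual column mentioned in the abstract: it derives a complex node's value on demand from the rows beneath it instead of storing it.

```python
import xml.etree.ElementTree as ET


def build_sparse_table(xml_text):
    """Return rows of (order_key, path, value); value is None for complex nodes."""
    root = ET.fromstring(xml_text)
    rows = []

    def visit(elem, key, parent_path):
        path = parent_path + "/" + elem.tag
        children = list(elem)
        if not children:
            # Simple node: store its atomized value directly.
            rows.append((key, path, elem.text or ""))
            return
        # Complex node: store a null value, then one row per text node below it.
        rows.append((key, path, None))
        position = 1
        if elem.text and elem.text.strip():
            rows.append((key + "." + str(position), path + "/text()", elem.text))
            position += 1
        for child in children:
            visit(child, key + "." + str(position), path)
            position += 1
            if child.tail and child.tail.strip():
                rows.append((key + "." + str(position), path + "/text()", child.tail))
                position += 1

    visit(root, "1", "")
    return rows


def atomized_value(rows, key):
    """Compute a node's atomized value on demand from the stored rows."""
    for row_key, _path, value in rows:
        if row_key == key and value is not None:
            return value
    prefix = key + "."
    parts = [(k, v) for k, _p, v in rows if k.startswith(prefix) and v is not None]
    # Comparing order keys component by component as integers gives document order.
    parts.sort(key=lambda kv: [int(part) for part in kv[0].split(".")])
    return "".join(v for _k, v in parts)
```

In this layout each piece of text is stored once, so an update touches one row instead of every ancestor row that would otherwise repeat the text.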


    Indexing Strategy With Improved DML Performance and Space Usage for Node-Aware Full-Text Search Over XML
    4.
    Type: Patent application
    Status: Active

    Publication number: US20100185683A1

    Publication date: 2010-07-22

    Application number: US12346393

    Filing date: 2008-12-30

    IPC classification: G06F7/06 G06F17/30

    CPC classification: G06F17/30911

    Abstract: Techniques are provided for searching within a collection of XML documents. A relational table stores an entry for each node of a set of nodes in a collection of XML documents. Each entry of the relational table stores an order key and a path identifier along with the atomized value of the node. Instead of storing the atomized value in a full-text index, a virtual column can be created to represent, for each node, the atomized value of the node. Alternatively, each entry of the relational table stores an order key and a path identifier along with, for simple nodes, the atomized value, and for complex nodes, a null value. For a complex node with a descendant text node, a separate entry is stored for the descendant text node in the relational table.


    Searching backward to speed up query
    5.
    Type: Granted patent
    Status: Active

    Publication number: US08566343B2

    Publication date: 2013-10-22

    Application number: US12871869

    Filing date: 2010-08-30

    IPC classification: G06F17/30

    CPC classification: G06F17/30539

    Abstract: A method, a computing device, and a non-transitory computer-readable medium are provided for performing a context-aware search by finding a set of nodes that are mapped to a given text or other value and, for each node in the set of nodes, performing a reverse path lookup to determine whether the node satisfies a given context. The query processor performs the reverse path lookup for a node by traversing up a node tree away from the node, using a stored mapping from the node to a parent of the node. Using mappings from nodes to parent nodes, the node tree is traversed backwards from the node up to distant ancestor nodes through parent nodes. An optimizer instructs the query processor to perform a value-based portion of the search before a path-based portion of the search based on value distribution statistics and path distribution statistics.
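
The reverse path lookup in this abstract can be sketched in Python as follows. It handles only absolute paths made of child steps. The dictionaries `parent_of`, `name_of`, and `value_index`, and the `choose_plan` heuristic, are assumptions for illustration rather than anything specified in the patent.

```python
def reverse_path_matches(node, path_steps, parent_of, name_of):
    """Walk from `node` up through parents, matching `path_steps` right to left."""
    current = node
    for step in reversed(path_steps):
        if current is None or name_of[current] != step:
            return False
        current = parent_of.get(current)  # stored node -> parent mapping
    # Absolute path: after the leftmost step we must have walked past the root.
    return current is None


def context_aware_search(value, path_steps, value_index, parent_of, name_of):
    """Value-based portion first, then a reverse path lookup for each candidate."""
    candidates = value_index.get(value, [])
    return [n for n in candidates
            if reverse_path_matches(n, path_steps, parent_of, name_of)]


def choose_plan(value_match_estimate, path_match_estimate):
    """Toy optimizer rule: start with whichever portion is estimated to match fewer nodes."""
    if value_match_estimate <= path_match_estimate:
        return "value-first"
    return "path-first"
```

Starting from the value matches pays off when the value is rare and the path is common, because only a handful of short upward walks are needed.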


    Technique and Framework to Provide Diagnosability for XML Query/DML Rewrite and XML Index Selection
    6.
    Type: Patent application
    Status: Active

    Publication number: US20130006964A1

    Publication date: 2013-01-03

    Application number: US13172573

    Filing date: 2011-06-29

    IPC classification: G06F17/30

    CPC classification: G06F17/30929

    Abstract: A method and apparatus are provided for automatically analyzing and providing feedback regarding the optimizability of a relational database query. A query developer's primary goal is to ensure that queries and DML operations are rewritten for the most efficient execution. Rewrite diagnosability captures metadata for each attempted query optimization including success or failure and the reasons for failure. The metadata is stored in association with the operators that were not removed through rewriting. Once all optimizations have been attempted and rewriting is complete, the metadata is selectively displayed based on the cost to perform the associated operation. The context of performing the operation may affect the cost. The cost may be based at least on the type of operation and where within the query tree the operation is located. A query developer may configure the database system not to execute the resulting query plan based on one or more criteria.
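
As a rough illustration of the diagnosability idea, the Python sketch below attaches rewrite-attempt metadata to the operators that survive rewriting and reports only the failures on costly operators. Everything here is hypothetical: the class names, the `BASE_COST` table, the operator names used as keys, and the cost formula (base cost by operator type, scaled by depth in the query tree) are assumptions, not the patent's method.

```python
from dataclasses import dataclass, field


@dataclass
class RewriteAttempt:
    rule: str          # name of the attempted optimization
    succeeded: bool
    reason: str = ""   # why the rewrite failed, if it did


@dataclass
class Operator:
    name: str          # operator left in the plan after rewriting
    depth: int         # position in the query tree (0 = root)
    attempts: list = field(default_factory=list)


# Assumed per-type base costs; unknown operator types default to 1.
BASE_COST = {"XMLQUERY": 10, "XMLEXISTS": 5}


def operator_cost(op):
    """Assumed cost model: operator type matters, and so does depth in the tree."""
    return BASE_COST.get(op.name, 1) * (op.depth + 1)


def report(operators, min_cost):
    """List failed rewrites for operators whose cost is at least `min_cost`."""
    lines = []
    for op in sorted(operators, key=operator_cost, reverse=True):
        cost = operator_cost(op)
        if cost < min_cost:
            continue
        for attempt in op.attempts:
            if not attempt.succeeded:
                lines.append(f"{op.name} (cost {cost}): {attempt.rule} failed: {attempt.reason}")
    return lines
```

Filtering by cost keeps the report focused on the unrewritten operators most likely to hurt execution time.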


    CREATING STORAGE FOR XML SCHEMAS WITH LIMITED NUMBERS OF COLUMNS PER TABLE
    7.
    Type: Patent application
    Status: Active

    Publication number: US20090287719A1

    Publication date: 2009-11-19

    Application number: US12122589

    Filing date: 2008-05-16

    IPC classification: G06F17/30

    CPC classification: G06F17/30917

    Abstract: Techniques are described herein for automatically generating multiple interrelated database tables to store XML data, while ensuring that each such table has no more than the maximum DBMS-allowed number of columns. In response to the registration of an XML schema with a database server, the server determines whether any of the elements specified in the XML schema are complex elements that have more than a threshold number of descendant elements. If a complex element has more than the threshold number of descendant elements, then the server automatically generates one or more separate "out-of-line" database tables for storing at least some of those descendant elements, so that the table created to store the complex element will have no more than the permitted number of columns. Each of the out-of-line database tables is similarly generated so as to have no more than the permitted number of columns.
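
One simple way to satisfy a per-table column limit is sketched below in Python. It chains overflow tables, each linked from the previous one by a single reference column. This is a simplification chosen for illustration, since the abstract does not say how descendants are assigned to out-of-line tables; the `plan_tables` function and the `_OOL` and `REF_` naming are hypothetical.

```python
def plan_tables(base_name, columns, max_columns):
    """Split `columns` across tables so that no table exceeds `max_columns`.

    Returns {table_name: [column names]}. When a table overflows, its last
    column is a reference to an out-of-line table holding the remainder.
    """
    if max_columns < 2:
        raise ValueError("need room for one data column plus one reference column")
    tables = {}
    name, remaining, counter = base_name, list(columns), 0
    while len(remaining) > max_columns:
        counter += 1
        next_name = f"{base_name}_OOL{counter}"
        # Keep max_columns - 1 data columns inline and spend one on the reference.
        tables[name] = remaining[:max_columns - 1] + [f"REF_{next_name}"]
        remaining = remaining[max_columns - 1:]
        name = next_name
    tables[name] = remaining
    return tables
```

For example, seven descendant elements with a limit of three columns yield three tables, each within the limit.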


    EFFICIENT XML TREE INDEXING STRUCTURE OVER XML CONTENT
    8.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20140067819A1

    Publication date: 2014-03-06

    Application number: US13604402

    Filing date: 2012-09-05

    IPC classification: G06F17/30

    CPC classification: G06F16/83

    Abstract: A method and apparatus are provided for building and using a persistent XML tree index for navigating an XML document. The XML tree index is stored separately from the XML document content, and thus is able to optimize performance through the use of fixed-sized index entries. The XML document hierarchy need not be constructed in volatile memory, so creating and using the XML tree index scales even for large documents. To evaluate a path expression including descendant or ancestral syntax, navigation links can be read from persistent storage and used directly to find the nodes specified in the path expression. The use of an abstract navigational interface allows applications to be written that are independent of the storage implementation of the index and the content. Thus, the XML tree index can index documents stored at least in a database, a persistent file system, or as a sequence of in memory.
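
The idea of fixed-size index entries with navigation links, kept apart from the content, can be sketched in Python with the standard `struct` module. The five-integer entry layout, the `NONE` sentinel, and all function names are assumptions. For brevity the builder parses the document with `ElementTree`, so it does not demonstrate building the index without materializing the tree in memory; only the navigation functions work purely from the packed bytes.

```python
import struct
import xml.etree.ElementTree as ET

# Assumed entry layout: parent, first_child, next_sibling, name_offset, name_length.
ENTRY = struct.Struct("<iiiii")
NONE = -1


def build_index(xml_text):
    """Return (index_bytes, names): fixed-size entries plus a separate name area."""
    root = ET.fromstring(xml_text)
    entries = []
    names = bytearray()

    def add(elem, parent):
        me = len(entries)
        encoded = elem.tag.encode("utf-8")
        entries.append([parent, NONE, NONE, len(names), len(encoded)])
        names.extend(encoded)
        previous = NONE
        for child in elem:
            child_id = add(child, me)
            if previous == NONE:
                entries[me][1] = child_id        # first-child link
            else:
                entries[previous][2] = child_id  # next-sibling link
            previous = child_id
        return me

    add(root, NONE)
    return b"".join(ENTRY.pack(*e) for e in entries), bytes(names)


def read_entry(index_bytes, node_id):
    """Fixed-size entries let a node be located by simple offset arithmetic."""
    return ENTRY.unpack_from(index_bytes, node_id * ENTRY.size)


def name_of(index_bytes, names, node_id):
    _parent, _child, _sibling, offset, length = read_entry(index_bytes, node_id)
    return names[offset:offset + length].decode("utf-8")


def descendants(index_bytes, node_id):
    """Yield descendant node ids in document order using only the stored links."""
    child = read_entry(index_bytes, node_id)[1]
    while child != NONE:
        yield child
        yield from descendants(index_bytes, child)
        child = read_entry(index_bytes, child)[2]
```

The same `read_entry` call would work on bytes read from a file or a database column, which is the sense in which navigation is independent of where the index is stored.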


    EFFICIENT EVALUATION OF XQUERY AND XPATH FULL TEXT EXTENSION
    9.
    Type: Patent application
    Status: Active

    Publication number: US20100211560A1

    Publication date: 2010-08-19

    Application number: US12388249

    Filing date: 2009-02-18

    IPC classification: G06F17/30 G06F12/02 G06F7/00

    CPC classification: G06F17/30929

    Abstract: Techniques are provided for efficiently evaluating XML queries that conform to an extension of an XML language (e.g., XQuery or XPath). The extension allows XML queries to have full-text search capabilities. Such an XML query is compiled to generate a tree of nodes that correspond to one or more conditions in the full-text portion of the query. In one technique, the amount of memory for the execution state of the tree is determined at compile time and allocated only once throughout execution of the query. In another technique, to ensure at most a single scan of a document, all the words or phrases in the full-text portion of an XML query are located before any of the other conditions in the full-text portion are evaluated. In another technique, the elements of the full-text portion of an XML query are analyzed to determine, based at least in part on cost, which evaluation strategy, of a plurality of evaluation strategies, should be employed.
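
The single-scan technique in this abstract can be illustrated with a short Python sketch: one pass records the positions of every query term, and the remaining full-text conditions are then evaluated against those positions alone. The function names and the two example conditions (all terms present, two terms within a word window) are assumptions; phrases, stemming, and the other techniques in the abstract are not modeled.

```python
def locate_terms(tokens, terms):
    """Single pass over the document tokens; returns {term: [positions]}."""
    wanted = {t.lower() for t in terms}
    positions = {t: [] for t in wanted}
    for i, token in enumerate(tokens):
        lowered = token.lower()
        if lowered in wanted:
            positions[lowered].append(i)
    return positions


def contains_all(positions, terms):
    """Conjunction condition, evaluated without rescanning the document."""
    return all(positions[t.lower()] for t in terms)


def within_window(positions, first, second, window):
    """True if some occurrences of the two terms are at most `window` words apart."""
    return any(abs(i - j) <= window
               for i in positions[first.lower()]
               for j in positions[second.lower()])
```

Because `contains_all` and `within_window` read only the recorded positions, any number of such conditions can be checked after exactly one scan of the tokens.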


    Efficient XML tree indexing structure over XML content

    Publication number: US10698953B2

    Publication date: 2020-06-30

    Application number: US13604402

    Filing date: 2012-09-05

    IPC classification: G06F7/00 G06F17/30 G06F16/83

    Abstract: A method and apparatus are provided for building and using a persistent XML tree index for navigating an XML document. The XML tree index is stored separately from the XML document content, and thus is able to optimize performance through the use of fixed-sized index entries. The XML document hierarchy need not be constructed in volatile memory, so creating and using the XML tree index scales even for large documents. To evaluate a path expression including descendant or ancestral syntax, navigation links can be read from persistent storage and used directly to find the nodes specified in the path expression. The use of an abstract navigational interface allows applications to be written that are independent of the storage implementation of the index and the content. Thus, the XML tree index can index documents stored at least in a database, a persistent file system, or as a sequence of in memory.