SYSTEM AND METHOD FOR PARALLEL SEARCHING OF A DOCUMENT STREAM
    1.
    Patent application
    Legal status: In force

    Publication number: US20120166440A1

    Publication date: 2012-06-28

    Application number: US13016199

    Filing date: 2011-01-28

    IPC classification: G06F17/30

    CPC classification: G06F17/30625

    Abstract: A system and method for searching a document for a query pattern. A plurality of streams may be stored each including a linear sequence of nodes. Each stream may be associated with nodes having a common label in a data tree of the document. A query pattern may be searched for in the streams by executing a plurality of threads. Each of two or more of the threads may be used to search different sub-streams of the plurality of streams. Each of the different sub-streams searched for by each thread in each stream may be uniquely correlated with one or more disjoint sub-trees of a partition of the tree into a plurality of sub-trees. The two or more of the plurality of threads may be executed in parallel. A result of the query pattern search may be generated using at least one of the threads.
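
The idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: it assumes a region (interval) encoding of tree nodes, a two-node ancestor//descendant query pattern, and a partition given as a list of sub-tree intervals. All names (`NODES`, `PARTITION`, `build_streams`, `sub_stream`, `search_partition`, `parallel_search`) are hypothetical.

```python
from bisect import bisect_left, bisect_right
from concurrent.futures import ThreadPoolExecutor

# Assumed encoding: each node is (start, end, label); node a is an ancestor
# of node d exactly when a.start < d.start and d.end < a.end.
NODES = [
    (0, 19, "lib"),
    (1, 8, "book"), (2, 3, "title"), (4, 7, "chapter"), (5, 6, "title"),
    (9, 18, "book"), (10, 11, "title"), (12, 13, "author"),
    (14, 17, "chapter"), (15, 16, "title"),
]

# Assumed partition: two disjoint sub-trees, one per "book" element.
# The root falls outside both, so patterns anchored at the root are not covered here.
PARTITION = [(1, 8), (9, 18)]


def build_streams(nodes):
    """One stream per label: a linear sequence of (start, end) in document order."""
    streams = {}
    for start, end, label in sorted(nodes):
        streams.setdefault(label, []).append((start, end))
    return streams


def sub_stream(stream, lo, hi):
    """The slice of a stream whose nodes lie inside the sub-tree interval [lo, hi]."""
    starts = [s for s, _ in stream]
    return stream[bisect_left(starts, lo):bisect_right(starts, hi)]


def search_partition(streams, anc, desc, part):
    """Match anc//desc using only the sub-streams that belong to one sub-tree."""
    lo, hi = part
    ancestors = sub_stream(streams.get(anc, []), lo, hi)
    descendants = sub_stream(streams.get(desc, []), lo, hi)
    return [(a, d) for a in ancestors for d in descendants
            if a[0] < d[0] and d[1] < a[1]]


def parallel_search(streams, anc, desc, partition):
    """Run one search task per sub-tree and concatenate the partial results."""
    with ThreadPoolExecutor(max_workers=len(partition)) as pool:
        partial = pool.map(lambda p: search_partition(streams, anc, desc, p), partition)
        return [match for matches in partial for match in matches]


STREAMS = build_streams(NODES)
matches = parallel_search(STREAMS, "book", "title", PARTITION)
```

Because the sub-trees are disjoint, each task reads its own slice of every stream and no match is produced twice. In CPython the threads interleave rather than run on separate cores, so the sketch shows the decomposition, not a speed-up.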

    Incremental clustering of indexed XML data
    2.
    Granted patent
    Legal status: In force

    Publication number: US08930407B2

    Publication date: 2015-01-06

    Application number: US13000022

    Filing date: 2009-06-18

    IPC classification: G06F17/30

    Abstract: In a data storage and retrieval system wherein data is stored and retrieved in pages, said data comprising connected nodes arranged such that each page stores only complete nodes, said connected nodes being connected via a plurality of overlapping tree structures, a method of minimizing page retrieval in the face of changing relationships between nodes comprising: selecting at least two of said overlapping tree structures; incrementally adjusting a page node structure dynamically based on real time workload, separately according to each selected tree structure, to form modified partitions for each tree structure, each modified partition being so as to minimize page faults; for each modified partition calculating a modification gain to indicate which partition has provided a greater minimization of page faults; and selecting the tree structure and modified partition corresponding to the best modification gain.
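
The final selection step in this abstract can be sketched as follows. This is only an illustration under strong assumptions: the candidate partitions are taken as given rather than computed incrementally, a page fault is approximated as a traversal between two nodes stored on different pages, and the names `page_faults`, `best_modification`, `parent_child_tree`, and `sibling_tree` are hypothetical.

```python
def page_faults(assignment, workload):
    """Approximate faults as traversals whose two endpoints sit on different pages."""
    return sum(1 for a, b in workload if assignment[a] != assignment[b])


def best_modification(current, candidates, workload):
    """Return the candidate with the largest modification gain and that gain.

    Gain is defined here (an assumption) as faults under the current
    partition minus faults under the candidate partition.
    """
    base = page_faults(current, workload)
    gains = {name: base - page_faults(cand, workload)
             for name, cand in candidates.items()}
    best = max(gains, key=gains.get)
    return best, gains[best]


# node -> page number for the current layout and for one modified partition
# per tree structure (both candidates are made-up examples).
current = {"a": 0, "b": 0, "c": 1, "d": 1}
candidates = {
    "parent_child_tree": {"a": 0, "c": 0, "b": 1, "d": 1},
    "sibling_tree": {"a": 0, "d": 0, "b": 1, "c": 1},
}
# Observed traversals between pairs of nodes.
workload = [("a", "c"), ("a", "c"), ("b", "d"), ("a", "b")]

choice = best_modification(current, candidates, workload)
```

With this workload the current layout incurs 3 cross-page traversals, the `parent_child_tree` candidate 1, and the `sibling_tree` candidate 4, so the first candidate is selected with a gain of 2.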

    APPARATUS AND METHOD FOR INCREMENTAL PHYSICAL DATA CLUSTERING
    3.
    Patent application
    Legal status: In force

    Publication number: US20110208737A1

    Publication date: 2011-08-25

    Application number: US12989664

    Filing date: 2009-05-19

    IPC classification: G06F17/30

    Abstract: In a data storage and retrieval system wherein data arranged in nodes is stored and retrieved in pages, each page comprising a cluster of nodes, a method comprising: monitoring ongoing data retrieval to find retrieval patterns of nodes which are retrieved together and to identify changes in said retrieval patterns over time; and periodically reclustering the data nodes among said pages dynamically during usage of the data to reflect said changes, so that nodes more often retrieved together are migrated to cluster together and nodes more often required separately are migrated to cluster separately, thereby to keep small an overall number of page accesses of said data storage and retrieval system during data retrieval despite dynamic changes in patterns of data retrieval.
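
A single reclustering pass of the kind this abstract describes might look like the sketch below. It is an assumption-laden illustration, not the claimed method: monitoring is reduced to counting how often two nodes appear in the same retrieval, and clustering is a simple greedy placement into fixed-size pages. The names `co_access_counts`, `recluster`, and `page_size` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_access_counts(retrievals):
    """Count how often each pair of nodes is retrieved together."""
    counts = Counter()
    for nodes in retrievals:
        for pair in combinations(sorted(set(nodes)), 2):
            counts[pair] += 1
    return counts


def recluster(nodes, retrievals, page_size):
    """Greedily place nodes so that frequently co-retrieved pairs share a page."""
    counts = co_access_counts(retrievals)
    pages, page_of = [], {}

    def place(node, index):
        pages[index].append(node)
        page_of[node] = index

    # Visit pairs from most to least often retrieved together.
    for (a, b), _ in counts.most_common():
        if a not in page_of and b not in page_of and page_size >= 2:
            pages.append([])
            place(a, len(pages) - 1)
            place(b, len(pages) - 1)
        elif (a in page_of) != (b in page_of):
            placed, new = (a, b) if a in page_of else (b, a)
            if len(pages[page_of[placed]]) < page_size:
                place(new, page_of[placed])

    # Nodes never co-retrieved (or left over) go to any page with room.
    for node in nodes:
        if node not in page_of:
            index = next((i for i, p in enumerate(pages) if len(p) < page_size), None)
            if index is None:
                pages.append([])
                index = len(pages) - 1
            place(node, index)
    return pages


log = [["a", "c"], ["a", "c"], ["a", "c"], ["b", "d"], ["b", "d"], ["a", "b"]]
layout = recluster(["a", "b", "c", "d"], log, page_size=2)
```

Running the pass again on a newer log would move nodes as retrieval patterns shift; scheduling those passes and migrating nodes between pages while the data is in use is outside this sketch.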

    Apparatus and method for incremental physical data clustering
    4.
    Granted patent
    Legal status: In force

    Publication number: US08572085B2

    Publication date: 2013-10-29

    Application number: US12989664

    Filing date: 2009-05-19

    IPC classification: G06F17/30

    Abstract: In a data storage and retrieval system wherein data arranged in nodes is stored and retrieved in pages, each page comprising a cluster of nodes, a method comprising: monitoring ongoing data retrieval to find retrieval patterns of nodes which are retrieved together and to identify changes in said retrieval patterns over time; and periodically reclustering the data nodes among said pages dynamically during usage of the data to reflect said changes, so that nodes more often retrieved together are migrated to cluster together and nodes more often required separately are migrated to cluster separately, thereby to keep small an overall number of page accesses of said data storage and retrieval system during data retrieval despite dynamic changes in patterns of data retrieval.

    System and method for parallel searching of a document stream
    5.
    Granted patent
    Legal status: In force

    Publication number: US09405820B2

    Publication date: 2016-08-02

    Application number: US13016199

    Filing date: 2011-01-28

    IPC classification: G06F17/30

    CPC classification: G06F17/30625

    Abstract: A system and method for searching a document for a query pattern. A plurality of streams may be stored each including a linear sequence of nodes. Each stream may be associated with nodes having a common label in a data tree of the document. A query pattern may be searched for in the streams by executing a plurality of threads. Each of two or more of the threads may be used to search different sub-streams of the plurality of streams. Each of the different sub-streams searched for by each thread in each stream may be uniquely correlated with one or more disjoint sub-trees of a partition of the tree into a plurality of sub-trees. The two or more of the plurality of threads may be executed in parallel. A result of the query pattern search may be generated using at least one of the threads.

    INCREMENTAL CLUSTERING OF INDEXED XML DATA
    6.
    Patent application
    Legal status: In force

    Publication number: US20110099205A1

    Publication date: 2011-04-28

    Application number: US13000022

    Filing date: 2009-06-18

    IPC classification: G06F7/00

    Abstract: In a data storage and retrieval system wherein data is stored and retrieved in pages, said data comprising connected nodes arranged such that each page stores only complete nodes, said connected nodes being connected via a plurality of overlapping tree structures, a method of minimizing page retrieval in the face of changing relationships between nodes comprising: selecting at least two of said overlapping tree structures; incrementally adjusting a page node structure dynamically based on real time workload, separately according to each selected tree structure, to form modified partitions for each tree structure, each modified partition being so as to minimize page faults; for each modified partition calculating a modification gain to indicate which partition has provided a greater minimization of page faults; and selecting the tree structure and modified partition corresponding to the best modification gain.
