TECHNIQUES FOR EFFICIENT LOADING OF BINARY XML DATA
    1.
    Patent application
    Status: In force

    Publication No.: US20080098001A1

    Publication date: 2008-04-24

    Application No.: US11743563

    Filing date: 2007-05-02

    IPC classification: G06F17/30

    Abstract: Various techniques are described hereafter for improving the efficiency of binary XML encoding and loading operations. In particular, techniques are described for incrementally encoding XML in response to amount-based requests. After encoding enough binary XML to satisfy an amount-based request, the encoder stops encoding the XML until a subsequent request is received. The incremental encoding may take place on the client-side or the server-side. Techniques are also described for reducing the character set conversion operations by having a parser convert tokens in text XML into one character set while converting non-token text in the text XML into another character set. Techniques are also described for generating self-contained binary XML documents, and for improving remap operations by providing a binary XML document on a chunk-by-chunk basis.
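
The amount-based, stop-and-resume behaviour described in this abstract can be illustrated with a small sketch. It is not the patented encoder: the opcodes, the `IncrementalEncoder` class, and the `events_encoded` counter are hypothetical names chosen for illustration, and the byte layout is a toy format rather than any real binary XML encoding. The sketch only shows that a `read(amount)` request encodes just enough parse events to satisfy the request and then stops until the next request.

```python
import io
import struct
import xml.etree.ElementTree as ET

START, END, TEXT = 1, 2, 3  # hypothetical opcodes, not a real binary XML format


class IncrementalEncoder:
    """Encodes XML lazily: work happens only when read() needs more bytes."""

    def __init__(self, xml_text):
        source = io.BytesIO(xml_text.encode("utf-8"))
        self._events = ET.iterparse(source, events=("start", "end"))
        self._buffer = bytearray()
        self.events_encoded = 0  # exposed so callers can observe the laziness

    @staticmethod
    def _encode(event, elem):
        if event == "start":
            name = elem.tag.encode("utf-8")
            return struct.pack(">BH", START, len(name)) + name
        # On "end", emit the element's leading text (if any), then the end marker.
        text = (elem.text or "").strip().encode("utf-8")
        out = struct.pack(">BH", TEXT, len(text)) + text if text else b""
        return out + struct.pack(">B", END)

    def read(self, amount):
        """Return up to `amount` bytes, encoding only as many events as needed."""
        while len(self._buffer) < amount:
            try:
                event, elem = next(self._events)
            except StopIteration:
                break  # document exhausted; return whatever is buffered
            self._buffer += self._encode(event, elem)
            self.events_encoded += 1
        chunk = bytes(self._buffer[:amount])
        del self._buffer[:amount]
        return chunk
```

A caller that asks for 4 bytes from `IncrementalEncoder("<a><b>hi</b><c>yo</c></a>")` triggers the encoding of one start event only; the remaining events are encoded when later requests arrive.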

    Techniques for efficient loading of binary XML data
    2.
    Granted patent
    Status: In force

    Publication No.: US08010889B2

    Publication date: 2011-08-30

    Application No.: US11743563

    Filing date: 2007-05-02

    IPC classification: G06N3/00

    Abstract: Various techniques are described hereafter for improving the efficiency of binary XML encoding and loading operations. In particular, techniques are described for incrementally encoding XML in response to amount-based requests. After encoding enough binary XML to satisfy an amount-based request, the encoder stops encoding the XML until a subsequent request is received. The incremental encoding may take place on the client-side or the server-side. Techniques are also described for reducing the character set conversion operations by having a parser convert tokens in text XML into one character set while converting non-token text in the text XML into another character set. Techniques are also described for generating self-contained binary XML documents, and for improving remap operations by providing a binary XML document on a chunk-by-chunk basis.

    Incremental maintenance of an XML index on binary XML data
    3.
    Patent application
    Status: In force

    Publication No.: US20080098020A1

    Publication date: 2008-04-24

    Application No.: US11715603

    Filing date: 2007-03-07

    IPC classification: G06F17/30

    CPC classification: G06F17/30911 G06F17/30569

    Abstract: Techniques are provided for incrementally maintaining an XML index built to access XML data that is encoded in binary XML form. Rather than delete and reinsert index entries of all the nodes of a modified XML document, only the index entries of the affected nodes are modified. Consequently, the order key values stored in the index may become inconsistent with the current hierarchical locations of the nodes to which the order key values correspond. Techniques are described for resolving the inconsistencies, and for addressing additional problems that result when the XML index is path-subsetted.

    Incremental maintenance of an XML index on binary XML data
    4.
    Granted patent
    Status: In force

    Publication No.: US07739251B2

    Publication date: 2010-06-15

    Application No.: US11715603

    Filing date: 2007-03-07

    IPC classification: G06F17/00

    CPC classification: G06F17/30911 G06F17/30569

    Abstract: Techniques are provided for incrementally maintaining an XML index built to access XML data that is encoded in binary XML form. Rather than delete and reinsert index entries of all the nodes of a modified XML document, only the index entries of the affected nodes are modified. Consequently, the order key values stored in the index may become inconsistent with the current hierarchical locations of the nodes to which the order key values correspond. Techniques are described for resolving the inconsistencies, and for addressing additional problems that result when the XML index is path-subsetted.

    Efficient piece-wise updates of binary encoded XML data
    5.
    Patent application
    Status: In force

    Publication No.: US20070271305A1

    Publication date: 2007-11-22

    Application No.: US11437512

    Filing date: 2006-05-18

    IPC classification: G06F17/30 G06F7/00 G06F17/00

    Abstract: An XML document can be represented in a compact binary form that maintains all of the features of XML data in a useable form. In response to a request for a modification (e.g., insert, delete or update a node) to an XML document that is stored in the compact binary form, a certain representation of the requested modification is computed for application directly to the binary form of the document. Thus, the requested modification is applied directly to the persistently stored binary form without constructing an object tree or materializing the XML document into a corresponding textual form. Taking into account the nature of the binary form in which the document is encoded, the bytes that actually require change are identified, including identifying where in the binary representation the corresponding actual changes need to be made.
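
As a rough illustration of applying a modification directly to the binary form, the sketch below locates the bytes of one text token by scanning token headers only, computes a patch, and splices it into the stored bytes. No object tree or textual XML is built. The token layout (1-byte opcode, 2-byte length, payload) and the function names `encode_tokens`, `plan_text_update`, and `apply_patch` are assumptions made for this example; the abstract does not specify the actual encoding.

```python
import struct

START, END, TEXT = 1, 2, 3  # hypothetical opcodes for a toy binary form
HEADER = 3                  # 1-byte opcode + 2-byte big-endian length


def encode_tokens(tokens):
    """Encode (opcode, payload) pairs as header + payload, back to back."""
    out = bytearray()
    for opcode, payload in tokens:
        out += struct.pack(">BH", opcode, len(payload)) + payload
    return out


def plan_text_update(doc, text_index, new_payload):
    """Find the text_index-th TEXT token and describe the byte-level change.

    Returns (offset, old_length, replacement_bytes). Only headers are read;
    payloads are skipped using their recorded lengths.
    """
    offset, seen = 0, 0
    while offset < len(doc):
        opcode, length = struct.unpack_from(">BH", doc, offset)
        if opcode == TEXT:
            if seen == text_index:
                replacement = struct.pack(">BH", TEXT, len(new_payload)) + new_payload
                return offset, HEADER + length, replacement
            seen += 1
        offset += HEADER + length
    raise IndexError("no such text token")


def apply_patch(doc, patch):
    """Splice the replacement bytes into the stored binary form in place."""
    offset, old_length, replacement = patch
    doc[offset:offset + old_length] = replacement
```

Separating `plan_text_update` from `apply_patch` mirrors the abstract's two steps: first compute a representation of the change and identify where it lands, then apply it to the stored bytes.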

    Efficient piece-wise updates of binary encoded XML data
    6.
    Granted patent
    Status: In force

    Publication No.: US09460064B2

    Publication date: 2016-10-04

    Application No.: US11437512

    Filing date: 2006-05-18

    Abstract: An XML document can be represented in a compact binary form that maintains all of the features of XML data in a useable form. In response to a request for a modification (e.g., insert, delete or update a node) to an XML document that is stored in the compact binary form, a certain representation of the requested modification is computed for application directly to the binary form of the document. Thus, the requested modification is applied directly to the persistently stored binary form without constructing an object tree or materializing the XML document into a corresponding textual form. Taking into account the nature of the binary form in which the document is encoded, the bytes that actually require change are identified, including identifying where in the binary representation the corresponding actual changes need to be made.

    Name Disambiguation Using Context Terms
    9.
    Patent application
    Status: In force

    Publication No.: US20140214840A1

    Publication date: 2014-07-31

    Application No.: US12955253

    Filing date: 2010-11-29

    IPC classification: G06F17/30

    CPC classification: G06F17/3064

    Abstract: Methods, systems and apparatus, including computer programs encoded on a computer storage medium, for disambiguating names in a document corpus. In an aspect, a method includes generating context term lists for a person name, each context term list being a list of context terms from a resource for the person name; clustering the context term lists into a plurality of clusters, each of the clusters of context term lists including context term lists that are most similar to the cluster relative to other clusters; for each of the clusters, selecting a representative term for the cluster; receiving the person name as a search query; and generating a plurality of query suggestions from the search query and the representative terms for the clusters, each query suggestion being a combination of the person name and one representative term.
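
The pipeline in this abstract (context term lists, clustering, one representative term per cluster, query suggestions) can be sketched in a few lines. The abstract does not name a similarity measure, a clustering algorithm, or a rule for picking the representative term, so the choices below are assumptions: Jaccard similarity, a greedy single-pass clustering with a hypothetical `threshold` parameter, and the most frequent term as the representative.

```python
from collections import Counter


def jaccard(a, b):
    """Set overlap in [0, 1]; an assumed similarity measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_term_lists(term_lists, threshold=0.2):
    """Greedy clustering: join the most similar cluster, or start a new one."""
    clusters = []  # each cluster is a list of term sets
    for terms in map(set, term_lists):
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = jaccard(terms, set().union(*cluster))
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append([terms])
        else:
            best.append(terms)
    return clusters


def representative_term(cluster):
    """Most frequent term in the cluster; ties are broken alphabetically."""
    counts = Counter(term for terms in cluster for term in terms)
    return min(counts, key=lambda term: (-counts[term], term))


def query_suggestions(name, term_lists):
    """Combine the person name with one representative term per cluster."""
    clusters = cluster_term_lists(term_lists)
    return [f"{name} {representative_term(c)}" for c in clusters]
```

For a name shared by a musician and a politician, term lists drawn from pages about each person fall into two clusters, and the function returns one suggestion per cluster.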

    Automatic framing selection
    10.
    Granted patent
    Status: In force

    Publication No.: US08792493B2

    Publication date: 2014-07-29

    Application No.: US13491501

    Filing date: 2012-06-07

    IPC classification: H04L12/28 H04L12/26 H04L12/56

    Abstract: Network traffic is monitored and an optimal framing heuristic is automatically determined and applied. Framing heuristics specify different rules for framing network traffic. While a framing heuristic is applied to the network traffic, alternative framing heuristics are speculatively evaluated for the network traffic. The results of these evaluations are used to rank the framing heuristics. The framing heuristic with the best rank is selected for framing subsequent network traffic. Each client/server traffic flow may have a separate framing heuristic. The framing heuristics may be deterministic based on byte count and/or time or based on traffic characteristics that indicate a plausible point for framing to occur. The choice of available framing heuristics may be determined partly by manual configuration, which specifies which framing heuristics are available, and partly by automatic processes, which determine the best framing heuristic to apply to the current network traffic from the set of available framing heuristics.
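
The evaluate, rank, and select loop from this abstract might look like the sketch below. Two framing heuristics are applied to the same captured bytes: one based on byte count and one based on a traffic characteristic (a delimiter). The abstract does not say how heuristics are scored, so `repeat_ratio` (the share of frames already seen earlier in the flow) is a purely hypothetical metric, and all function names are illustrative.

```python
def frame_by_count(data, size=8):
    """Deterministic heuristic: cut a frame every `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def frame_by_delimiter(data, delimiter=b"\n"):
    """Characteristic-based heuristic: cut a frame after each delimiter."""
    return [part + delimiter for part in data.split(delimiter) if part]


def repeat_ratio(frames):
    """Hypothetical score: fraction of frames that repeat an earlier frame."""
    seen, repeats = set(), 0
    for frame in frames:
        if frame in seen:
            repeats += 1
        seen.add(frame)
    return repeats / len(frames) if frames else 0.0


def rank_heuristics(data, heuristics):
    """Evaluate every available heuristic on `data`; best score first."""
    scored = [(repeat_ratio(fn(data)), name) for name, fn in heuristics.items()]
    return [name for score, name in sorted(scored, key=lambda p: (-p[0], p[1]))]


def select_heuristic(data, heuristics):
    """Pick the top-ranked heuristic for framing subsequent traffic."""
    return rank_heuristics(data, heuristics)[0]
```

The `heuristics` dictionary stands in for the manually configured set of available heuristics. Keeping one selection per client/server flow would amount to calling `select_heuristic` on each flow's recent traffic separately.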
