System and method for tree structure indexing that provides at least one constraint sequence to preserve query-equivalence between xml document structure match and subsequence match
    11.
    Granted invention patent
    Status: lapsed

    Publication No.: US07475070B2

    Publication date: 2009-01-06

    Application No.: US11035889

    Filing date: 2005-01-14

    IPC classification: G06F17/30, G06F17/00

    Abstract: Sequence-based XML indexing aims at avoiding expensive join operations in query processing. It transforms structured XML data into sequences so that a structured query can be answered holistically through subsequence matching. Herein, there is addressed the problem of query equivalence with respect to this transformation, and there is introduced a performance-oriented principle for sequencing tree structures. With query equivalence, XML queries can be performed through subsequence matching without join operations, post-processing, or other special handling for problems such as false alarms. There is identified a class of sequencing methods for this purpose, and there is presented a novel subsequence matching algorithm that observes query equivalence. For any given XML dataset, the performance-oriented principle finds an optimal sequencing strategy according to the dataset's schema and data distribution; there is thus presented herein a novel method that realizes this principle.

    System and method for continuous diagnosis of data streams
    12.
    Granted invention patent
    Status: lapsed

    Publication No.: US07464068B2

    Publication date: 2008-12-09

    Application No.: US10880913

    Filing date: 2004-06-30

    IPC classification: G06F17/30

    Abstract: In connection with the mining of time-evolving data streams, there is provided a general framework that mines changes and reconstructs models from a data stream with unlabeled instances or a limited number of labeled instances. In particular, there are defined herein statistical profiling methods that extend a classification tree in order to estimate the percentage of drifts in the data stream without any labeled data. The exact error can be estimated by actively sampling a small number of true labels. If the estimated error is significantly higher than empirical expectations, a small number of true labels are preferably re-sampled to reconstruct the decision tree from the leaf node level.

    METHOD AND APPARATUS FOR ADAPTIVE IN-OPERATOR LOAD SHEDDING
    13.
    Invention patent application
    Status: lapsed

    Publication No.: US20080270640A1

    Publication date: 2008-10-30

    Application No.: US12164671

    Filing date: 2008-06-30

    IPC classification: G06F3/00

    Abstract: One embodiment of the present method and apparatus for adaptive in-operator load shedding includes receiving at least two data streams (each comprising a plurality of tuples, or data items) into respective sliding windows of memory. A throttling fraction is then calculated based on input rates associated with the data streams and on currently available processing resources. Tuples are then selected for processing from the data streams in accordance with the throttling fraction, where the selected tuples represent a subset of all tuples contained within the sliding windows.
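The throttling step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the cost model (a fixed `cost_per_tuple`), the function names `throttle_fraction` and `select_tuples`, and the choice to keep the most recent tuples of each window are all assumptions made for illustration.

```python
def throttle_fraction(input_rates, cost_per_tuple, capacity):
    """Fraction of tuples that can be processed with the available capacity.

    input_rates    -- tuples per second for each stream (assumed known)
    cost_per_tuple -- processing cost per tuple (hypothetical, uniform cost model)
    capacity       -- processing budget per second, in the same cost units
    """
    demand = sum(input_rates) * cost_per_tuple
    if demand <= 0:
        return 1.0  # nothing to process, so no shedding is needed
    return min(1.0, capacity / demand)


def select_tuples(window, fraction):
    """Keep a subset of a sliding window sized by the throttling fraction.

    Keeping the newest tuples is an assumption of this sketch; the abstract
    only says that a subset of the window is selected.
    """
    keep = int(len(window) * fraction)
    return window[-keep:] if keep > 0 else []


if __name__ == "__main__":
    z = throttle_fraction([100, 300], cost_per_tuple=0.5, capacity=100)
    print(z, select_tuples(list(range(10)), z))
```

Recomputing the fraction whenever the measured input rates or the available capacity change is what would make such a scheme adaptive.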

    PRESERVING PRIVACY OF ONE-DIMENSIONAL DATA STREAMS USING DYNAMIC AUTOCORRELATION
    14.
    Invention patent application
    Status: lapsed

    Publication No.: US20080205641A1

    Publication date: 2008-08-28

    Application No.: US11678808

    Filing date: 2007-02-26

    IPC classification: H04L9/18

    CPC classification: G06F21/755

    Abstract: A method, information processing system, and computer readable medium are provided for preserving privacy of one-dimensional nonstationary data streams. The method includes receiving a one-dimensional nonstationary data stream. A set of first-moment statistical values is calculated for the data, for a given instant of sub-space of time. The first-moment statistical values include a principal component for the sub-space of time. The data is perturbed with noise along the principal component in proportion to the first-moment statistical values so that at least part of a set of second-moment statistical values for the data is perturbed by the noise only within a predetermined variance.

    Systems and methods for structural clustering of time sequences
    15.
    Granted invention patent
    Status: in force

    Publication No.: US07369961B2

    Publication date: 2008-05-06

    Application No.: US11096485

    Filing date: 2005-03-31

    IPC classification: G06F15/00

    Abstract: Arrangements and methods for performing structural clustering between different time series. Time series data relating to a plurality of time series is accepted, structural features relating to the time series data are ascertained, and at least one distance between different time series is determined by employing the structural features. The different time series may be partitioned into clusters based on the at least one distance, and/or the k closest matches to a given time series query based on the at least one distance may be returned.

    Method and apparatus for web farm traffic control
    16.
    Granted invention patent
    Status: lapsed

    Publication No.: US07356592B2

    Publication date: 2008-04-08

    Application No.: US10057516

    Filing date: 2002-01-24

    IPC classification: G06F15/173, G06F15/16

    Abstract: Disclosed is a method for controlling a web farm having a plurality of websites and servers, the method comprising categorizing customer requests received from said websites into a plurality of categories, said categories comprising shareable customer requests and unshareable customer requests, routing said shareable customer requests such that any of said servers may process shareable customer requests received from different said websites, and routing said unshareable customer requests from specific said websites only to specific servers to which said specific websites have been assigned.
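The routing rule in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only: the class name `WebFarmRouter`, the `shareable` flag supplied by the caller, the example host names, and the round-robin server choice are assumptions, since the abstract does not say how requests are categorized or how a server is picked within the allowed set.

```python
from itertools import cycle


class WebFarmRouter:
    """Route shareable requests to any server and unshareable ones to assigned servers."""

    def __init__(self, servers, site_assignments):
        # servers: every server in the farm
        # site_assignments: website -> servers assigned to that website
        self._shared = cycle(list(servers))
        self._assigned = {
            site: cycle(list(assigned))
            for site, assigned in site_assignments.items()
        }

    def route(self, site, shareable):
        """Return the server that should handle one request from `site`."""
        if shareable:
            # Any server may process a shareable request, whatever its website.
            return next(self._shared)
        # Unshareable requests stay on the servers assigned to the website.
        return next(self._assigned[site])


if __name__ == "__main__":
    router = WebFarmRouter(
        ["s1", "s2", "s3"],
        {"a.example": ["s1"], "b.example": ["s2", "s3"]},
    )
    print(router.route("a.example", shareable=True))
    print(router.route("b.example", shareable=False))
```

Round-robin is used here only to keep the sketch short; a load-aware policy could replace it without changing the shareable/unshareable split.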

    Systems and methods for sequential modeling in less than one sequential scan
    17.
    Granted invention patent
    Status: lapsed

    Publication No.: US07337161B2

    Publication date: 2008-02-26

    Application No.: US10903336

    Filing date: 2004-07-30

    IPC classification: G06F17/30

    CPC classification: G06N99/005, Y10S707/99931

    Abstract: Most recent research on scalable inductive learning over very large streaming datasets focuses on eliminating memory constraints and reducing the number of sequential data scans. However, state-of-the-art algorithms still require multiple scans over the dataset and use sophisticated control mechanisms and data structures. There is discussed herein a general inductive learning framework that scans the dataset exactly once. Then, there is proposed an extension based on Hoeffding's inequality that scans the dataset less than once. The proposed frameworks are applicable to a wide range of inductive learners.
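How Hoeffding's inequality can justify stopping before a full scan is shown in the following minimal Python sketch. It is not the framework of the patent: it only estimates the mean of a bounded quantity, it uses the one-sided form of the bound, and the names `hoeffding_epsilon`, `samples_needed` and `estimate_mean_early_stop` are hypothetical.

```python
import math


def hoeffding_epsilon(value_range, n, delta):
    """One-sided Hoeffding bound: with probability at least 1 - delta, the
    true mean exceeds the mean of n independent samples by at most epsilon,
    where every sample lies in an interval of width value_range."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def samples_needed(value_range, epsilon, delta):
    """Smallest n for which hoeffding_epsilon(...) <= epsilon."""
    return math.ceil(value_range ** 2 * math.log(1.0 / delta) / (2.0 * epsilon ** 2))


def estimate_mean_early_stop(values, value_range, delta, tolerance):
    """Scan values in order and stop once the bound is within tolerance.

    Returns (running mean, number of values actually read). Assumes the
    values arrive in random order, so that a prefix is a fair sample.
    """
    total = 0.0
    n = 0
    for v in values:
        total += v
        n += 1
        if hoeffding_epsilon(value_range, n, delta) <= tolerance:
            break  # the rest of the data is never read
    return (total / n if n else 0.0), n


if __name__ == "__main__":
    data = [0.5] * 10_000
    print(estimate_mean_early_stop(data, value_range=1.0, delta=0.05, tolerance=0.05))
```

Because the bound depends only on the value range, the confidence level and the number of samples read, the stopping point is independent of the dataset size, which is what allows the scan to end before the data is exhausted.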

    User-defined online interaction method and device
    20.
    Granted invention patent
    Status: in force

    Publication No.: US06944655B1

    Publication date: 2005-09-13

    Application No.: US09541804

    Filing date: 2000-04-03

    IPC classification: G06F15/173

    CPC classification: G06Q10/10

    Abstract: The present invention provides a method, system and apparatus enabling user-defined, genre-structured interaction online. The present invention enables users to define their own genres, including rules of interaction as well as rules of enforcement. Genre definitions can also include the specification of roles, parameters, and states. The present invention also enables a given user to modify a given genre definition. Allowable modifications include addition, modification, and deletion of parameters and of interaction and enforcement rules. The present invention also provides dynamically updated graphical representations of the state of genre instances, these graphical representations being definable by the users.