Cross-feature analysis
    31.
    Invention application
    Status: under examination (published)

    Publication No.: US20050283511A1

    Publication date: 2005-12-22

    Application No.: US10658623

    Filing date: 2003-09-09

    Applicant: Wei Fan, Philip Yu

    Inventor: Wei Fan, Philip Yu

    IPC classification: G06F7/38, G06F15/173

    CPC classification: G06F11/008

    Abstract: Disclosed is a method of automatically identifying anomalous situations during computerized system operations. The method records actions performed by the computerized system as features in a history file, automatically creates a model for each feature only from normal data in the history file, performs training by calculating anomaly scores of the features, establishes a threshold to evaluate whether features are abnormal, automatically identifies abnormal actions of the computerized system based on the anomaly scores and said threshold, and periodically repeats the training process.
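
    Illustrative sketch (not taken from the application): the steps in the abstract can be approximated as follows. The lookup-table model, the `match_score` definition (lower means more anomalous), the choice of threshold as the minimum score on normal data, and the sample records are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def train_models(normal_rows):
    """Build one lookup model per feature from normal data only.

    Each model maps the values of all *other* features to a Counter of the
    values this feature took in that context.
    """
    n_features = len(normal_rows[0])
    models = [defaultdict(Counter) for _ in range(n_features)]
    for row in normal_rows:
        for i in range(n_features):
            context = row[:i] + row[i + 1:]
            models[i][context][row[i]] += 1
    return models

def match_score(models, row):
    """Fraction of features whose value agrees with its model's prediction."""
    hits = 0
    for i, model in enumerate(models):
        context = row[:i] + row[i + 1:]
        counts = model.get(context)  # None if this context was never seen
        if counts and counts.most_common(1)[0][0] == row[i]:
            hits += 1
    return hits / len(models)

# Hypothetical history file: (protocol, port, status) records known to be normal.
normal_history = [
    ("http", 80, "ok"),
    ("http", 80, "ok"),
    ("ssh", 22, "ok"),
    ("dns", 53, "ok"),
]
models = train_models(normal_history)

# Assumed threshold rule: the lowest score seen on normal data.
threshold = min(match_score(models, r) for r in normal_history)

def is_abnormal(row):
    """Flag a record whose score falls below the trained threshold."""
    return match_score(models, row) < threshold
```

    Periodic retraining, as the abstract describes, would amount to calling `train_models` again on a refreshed history and recomputing `threshold`.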

    System and method for searching using a temporal dimension
    32.
    Invention application
    Status: under examination (published)

    Publication No.: US20050234877A1

    Publication date: 2005-10-20

    Application No.: US10820888

    Filing date: 2004-04-08

    Applicant: Philip Yu

    Inventor: Philip Yu

    IPC classification: G06F7/00

    CPC classification: G06F7/00

    Abstract: The present invention is directed to a system and a method for generating a temporally ranked set of search results in response to a query. Each result in the set of search results can be ranked temporally or based on the reputation associated with authors of each result and the reputation associated with the repository where each result is located. Temporal ranking takes into account a present importance weight and a future importance weight assigned to each result. The present importance of each result uses creation date, publication date, in-link dates and search frequency, and the future importance uses an aging factor based on the elapsed time from publication for each search result and a rate at which each search result decreases in importance. Temporal ranking can be applied as a modification of existing and common search engine algorithms, including PageRank and HITS.
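
    Illustrative sketch (not taken from the application): one way to picture an aging factor that depends on elapsed time and a per-result decay rate. The exponential form, the function name `aged_score`, and the sample values are assumptions; the abstract does not state the actual formula.

```python
import math

def aged_score(present_weight, age_days, decay_rate):
    """Scale a present-importance weight by an assumed exponential aging factor."""
    return present_weight * math.exp(-decay_rate * age_days)

# Hypothetical results: (label, present weight, days since publication, decay rate).
results = [
    ("old-but-popular", 0.9, 400, 0.005),
    ("recent", 0.6, 30, 0.005),
    ("slow-aging-reference", 0.7, 400, 0.0005),
]

# Rank by the aged score, highest first.
ranked = sorted(results, key=lambda r: aged_score(r[1], r[2], r[3]), reverse=True)
ranked_labels = [r[0] for r in ranked]
```

    With these made-up numbers, a result that ages slowly can outrank a newer one, while a once-popular result with a fast decay rate drops to the bottom.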

    System and method for scalable cost-sensitive learning
    33.
    Invention application
    Status: under examination (published)

    Publication No.: US20050125434A1

    Publication date: 2005-06-09

    Application No.: US10725378

    Filing date: 2003-12-03

    IPC classification: G06F7/00

    CPC classification: G06N20/00

    Abstract: A method (and structure) for processing an inductive learning model for a dataset of examples includes dividing the dataset into N subsets of data and developing an estimated learning model for the dataset by developing a learning model for a first subset of the N subsets.

    Methods and Apparatus for Performing Structural Joins for Answering Containment Queries
    34.
    Invention application
    Status: lapsed

    Publication No.: US20080104038A1

    Publication date: 2008-05-01

    Application No.: US11966537

    Filing date: 2007-12-28

    IPC classification: G06F17/30

    Abstract: Techniques are provided for performing structural joins for answering containment queries. Such inventive techniques may be used to perform efficient structural joins of two interval lists which are neither sorted nor pre-indexed. For example, in an illustrative aspect of the invention, a technique for performing structural joins of two element sets of a tree-structured document, wherein one of the two element sets is an ancestor element set and the other of the two element sets is a descendant element set, and further wherein each element is represented as an interval representing a start position and an end position of the element in the document, comprises the following steps/operations. An index is dynamically built for the ancestor element set. Then, one or more structural joins are performed by searching the index with the interval start position of each element in the descendant element set.
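
    Illustrative sketch (not taken from the application): the two steps in the abstract, building an index over the ancestor intervals on the fly and probing it with each descendant's start position, can be pictured as below. A sorted list with binary search stands in for the index; the actual index structure is not described in the abstract, and the function name and sample intervals are assumptions. The sketch relies on the tree-document property that intervals are either nested or disjoint.

```python
from bisect import bisect_left

def structural_join(ancestors, descendants):
    """Return (ancestor, descendant) pairs where the ancestor interval contains the descendant.

    Intervals are (start, end) tuples. Because intervals in a tree-structured
    document never partially overlap, an ancestor that contains a descendant's
    start position contains the whole descendant.
    """
    index = sorted(ancestors)              # built dynamically, ordered by start
    starts = [a[0] for a in index]
    pairs = []
    for d in descendants:
        hi = bisect_left(starts, d[0])     # ancestors whose start is before d's start
        for a in index[:hi]:
            if a[1] > d[0]:                # a spans d's start position
                pairs.append((a, d))
    return pairs

# Hypothetical unsorted inputs.
ancestors = [(1, 20), (21, 40), (2, 9)]
descendants = [(3, 4), (22, 23), (10, 11), (50, 51)]
joined = structural_join(ancestors, descendants)
```

    The linear filter after the binary search keeps the sketch short; a production index would avoid scanning ancestors that end before the probe position.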

    SYSTEMS AND METHODS FOR SEQUENTIAL MODELING IN LESS THAN ONE SEQUENTIAL SCAN
    35.
    Invention application
    Status: lapsed

    Publication No.: US20080052255A1

    Publication date: 2008-02-28

    Application No.: US11931129

    Filing date: 2007-10-31

    IPC classification: G06F15/18, G06N7/00

    CPC classification: G06N99/005, Y10S707/99931

    Abstract: Most recent research on scalable inductive learning over very large streaming datasets focuses on eliminating memory constraints and reducing the number of sequential data scans. However, state-of-the-art algorithms still require multiple scans over the data set and use sophisticated control mechanisms and data structures. There is discussed herein a general inductive learning framework that scans the dataset exactly once. Then, there is proposed an extension based on Hoeffding's inequality that scans the dataset less than once. The proposed frameworks are applicable to a wide range of inductive learners.
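
    Illustrative sketch (not taken from the application): Hoeffding's inequality bounds how far a sample mean of n bounded observations can stray from the true mean, which is what makes stopping before the end of a scan possible. The example below applies the standard bound to a running mean; the function names, the stopping rule, and the sample stream are assumptions and do not reproduce the learning framework in the application.

```python
import math

def hoeffding_epsilon(value_range, delta, n):
    """Half-width of the interval that holds the true mean with probability 1 - delta."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

def scan_until_confident(stream, value_range, delta, tolerance):
    """Read the stream only until the Hoeffding bound is within the tolerance.

    Returns (estimated mean, number of items actually read).
    """
    total = 0.0
    n = 0
    for n, x in enumerate(stream, start=1):
        total += x
        if hoeffding_epsilon(value_range, delta, n) <= tolerance:
            break
    if n == 0:
        raise ValueError("stream is empty")
    return total / n, n

# Hypothetical stream of 0/1 values; only a small prefix needs to be read.
stream = [i % 2 for i in range(100_000)]
estimate, items_read = scan_until_confident(stream, value_range=1.0, delta=0.05, tolerance=0.05)
```

    Here the scan stops after a few hundred of the 100,000 items, which is the sense in which a dataset can be scanned "less than once".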

    System and method for peer-to-peer multi-party voice-over-IP services
    36.
    Invention application
    Status: granted (in force)

    Publication No.: US20070211703A1

    Publication date: 2007-09-13

    Application No.: US11372634

    Filing date: 2006-03-10

    IPC classification: H04L12/66

    Abstract: A system, method, and computer program product for establishing multi-party VoIP conference audio calls in a distributed, peer-to-peer network where any number of nodes are able to arbitrarily and asynchronously start or stop producing audio output to be mixed into a single composite audio stream that is distributed to all nodes. A single distribution tree is used that has optimal communications characteristics to distribute the composite audio signal to all nodes. An audio mixing tree is established and maintained by adaptively and dynamically adding and merging intermediate mixing nodes operating between user nodes and the root of the single distribution tree. The intermediate mixing nodes and the root of the single distribution tree are all hosted, in an exemplary embodiment, on user nodes that are endpoints of the distribution tree.

    Method and apparatus for providing load diffusion in data stream correlations
    37.
    Invention application
    Status: lapsed

    Publication No.: US20070016560A1

    Publication date: 2007-01-18

    Application No.: US11183149

    Filing date: 2005-07-15

    Applicant: Xiaohui Gu, Philip Yu

    Inventor: Xiaohui Gu, Philip Yu

    IPC classification: G06F17/30

    Abstract: A computer-implemented method, apparatus, and computer-usable program code for performing load diffusion to process data stream pairs. A data stream pair is received for correlation. The data stream pair is partitioned into portions to meet correlation constraints for correlating data in the data stream pair to form a partitioned data stream pair. The partitioned data stream pair is sent to a set of nodes for correlation processing to perform the load diffusion.

    System and method for efficiently performing similarity searches of structural data
    38.
    Invention application
    Status: granted (in force)

    Publication No.: US20060224562A1

    Publication date: 2006-10-05

    Application No.: US11096165

    Filing date: 2005-03-31

    Applicant: Xifeng Yan, Philip Yu

    Inventor: Xifeng Yan, Philip Yu

    IPC classification: G06F17/30

    CPC classification: G06F17/30536, G06F19/705

    Abstract: Techniques for similarity searching are provided. In one aspect, a method of searching structural data in a database against one or more structural queries comprises the following steps. A desired minimum degree of similarity between the one or more queries and the structural data in the database is first specified. One or more indices are then used to exclude from consideration any structural data in the database that does not share the minimum degree of similarity with one or more of the queries.

    Systems and methods for optimal component composition in a stream processing system

    Publication No.: US20060200251A1

    Publication date: 2006-09-07

    Application No.: US11068785

    Filing date: 2005-03-01

    Applicant: Xiaohui Gu, Philip Yu

    Inventor: Xiaohui Gu, Philip Yu

    IPC classification: C10G9/16

    CPC classification: H04L12/4641

    Abstract: A system and method are provided for optimizing component composition in a distributed stream-processing environment having a plurality of nodes capable of being associated with one or more of a plurality of stream processing components. The system includes an adaptive composition probing (ACP) module and a hierarchical state manager. The ACP module probes a subset of the plurality of stream processing components to determine the optimal component composition in response to a stream processing request. The hierarchical state manager manages local and global information for use by said ACP module in determining the optimal component composition.

    Systems and methods for maintaining closed frequent itemsets over a data stream sliding window

    Publication No.: US20060174024A1

    Publication date: 2006-08-03

    Application No.: US11046926

    Filing date: 2005-01-31

    IPC classification: G06F15/16

    Abstract: To mine closed frequent itemsets over a sliding window using limited memory space, a synopsis data structure is used to monitor transactions in the sliding window so that one can output the current closed frequent itemsets at any time. Due to time and memory constraints, the synopsis data structure cannot monitor all possible itemsets, but monitoring only frequent itemsets makes it difficult to detect new itemsets when they become frequent. Herein, there is introduced a compact data structure, the closed enumeration tree (CET), to maintain a dynamically selected set of itemsets over a sliding window. The selected itemsets include a boundary between closed frequent itemsets and the rest of the itemsets. Because the boundary is relatively stable, the cost of mining closed frequent itemsets over a sliding window is dramatically reduced to that of mining transactions that can possibly cause boundary movements in the CET.
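
    Illustrative sketch (not taken from the application): the code below only shows what "closed frequent itemsets over a sliding window" means, by recomputing them from scratch each time the window moves. It is a brute-force stand-in and does not implement the closed enumeration tree (CET); the function name, window size, and transactions are assumptions.

```python
from collections import deque
from itertools import combinations

def closed_frequent_itemsets(window, min_support):
    """Return {itemset: support} for itemsets that are frequent and closed.

    An itemset is closed when no proper superset has the same support.
    """
    items = sorted(set().union(*window))
    support = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            itemset = frozenset(candidate)
            count = sum(1 for transaction in window if itemset <= transaction)
            if count >= min_support:
                support[itemset] = count
    return {
        s: c for s, c in support.items()
        if not any(s < other and c == support[other] for other in support)
    }

# Hypothetical sliding window holding the three most recent transactions.
window = deque([{"a", "b"}, {"a", "b", "c"}, {"a", "c"}], maxlen=3)
before = closed_frequent_itemsets(window, min_support=2)

window.append({"b", "c"})  # the oldest transaction {"a", "b"} slides out
after = closed_frequent_itemsets(window, min_support=2)
```

    Recomputing everything on each slide is exponential in the number of items; the abstract's point is that tracking only the boundary itemsets avoids this cost.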