Interactive data exploration apparatus and methods
    1.
    Granted patent
    Interactive data exploration apparatus and methods (status: lapsed)

    Publication No.: US5999192A

    Publication date: 1999-12-07

    Application No.: US640411

    Filing date: 1996-04-30

    IPC classification: G06F17/30 G06F15/00

    CPC classification: G06F17/30572

    Abstract: A data exploration tool which has a graphical user interface that employs directed graphs to provide histories of the data exploration operations. Nodes in the directed graphs represent operations on data; the edges represent relationships between the operations. One type of directed graph is the derivation graph, in which the root of the graph is a node representing a data set and an edge leading from a first node to a second node indicates that the operation represented by the second node is performed on the result of the operation represented by the first node. Operations include query, segmentation, aggregation, and data view operations. A user may edit the derivation graph and may select a node for execution. When that is done, all of the operations represented by the nodes between the root node and the selected node are performed as indicated in the graph. The operations are performed using lazy evaluation, with results cached at the nodes. Another type of directed graph is the subsumption graph, in which an edge leading from a first node to a second node indicates that the second node stands in a subsumption relationship to the first node. If a result of the operation represented by the first node has been computed, the result is available to calculate the result of the operation represented by the second node.
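
The derivation graph described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Node` class, its `runs` counter, and the sample rows are hypothetical names introduced here. It only shows the two ideas the abstract names, namely that selecting a node runs the operations on the path from the root (lazy evaluation) and that each node keeps its result so it is not recomputed (caching).

```python
class Node:
    """One operation in a derivation graph; the root node holds the data set."""

    def __init__(self, name, op, parent=None):
        self.name = name
        self.op = op            # function applied to the parent's result
        self.parent = parent    # edge: this op runs on the parent's result
        self._result = None
        self._cached = False
        self.runs = 0           # how many times op was actually executed

    def execute(self):
        # Lazy evaluation: nothing is computed until a node is selected.
        if not self._cached:
            upstream = self.parent.execute() if self.parent else None
            self._result = self.op(upstream)
            self._cached = True   # result is cached with the node
            self.runs += 1
        return self._result


rows = [
    {"region": "east", "sales": 10},
    {"region": "west", "sales": 7},
    {"region": "east", "sales": 5},
]

root = Node("dataset", lambda _: rows)
query = Node("query: region = east",
             lambda data: [r for r in data if r["region"] == "east"], root)
total = Node("aggregate: sum(sales)",
             lambda data: sum(r["sales"] for r in data), query)
count = Node("aggregate: count",
             lambda data: len(data), query)

print(total.execute())  # 15
print(count.execute())  # 2
print(query.runs)       # 1: the query result was reused by both aggregates
```

Both aggregate nodes hang off the same query node, so the second selection reuses the cached query result instead of filtering the data set again.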

    User-powered recommendation system
    2.
    Granted patent
    User-powered recommendation system (status: in force)

    Publication No.: US08943081B2

    Publication date: 2015-01-27

    Application No.: US12616892

    Filing date: 2009-11-12

    Abstract: Recommendation systems are widely used in Internet applications. In current recommendation systems, users only play a passive role and have limited control over the recommendation generation process. As a result, there is often considerable mismatch between the recommendations made by these systems and the actual user interests, which are fine-grained and constantly evolving. With a user-powered distributed recommendation architecture, individual users can flexibly define fine-grained communities of interest in a declarative fashion and obtain recommendations accurately tailored to their interests by aggregating opinions of users in such communities. By combining a progressive sampling technique with data perturbation methods, the recommendation system is both scalable and privacy-preserving.
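
As a rough illustration of the idea in this abstract, the Python sketch below lets a user define a community with a predicate, samples its members, perturbs each rating with zero-mean noise, and averages the result. Everything here is an assumption made for illustration: the `recommend` function, the user record layout, the uniform noise, and the one-shot sample (a simplification of the progressive sampling the abstract mentions) are not taken from the patent.

```python
import random


def recommend(users, in_community, item, noise_scale=0.5,
              sample_size=None, rng=None):
    """Estimate a community's average rating of `item`.

    users        -- list of dicts with "profile" and "ratings" keys (hypothetical)
    in_community -- predicate over a user record; the declarative community
    noise_scale  -- half-width of the uniform noise added to each rating
    sample_size  -- if given, aggregate over a random sample of members only
    """
    rng = rng or random.Random()
    members = [u for u in users if in_community(u) and item in u["ratings"]]
    if sample_size is not None and sample_size < len(members):
        members = rng.sample(members, sample_size)
    if not members:
        return None
    # Each rating is perturbed before aggregation; zero-mean noise tends to
    # cancel out in the average while masking individual values.
    perturbed = [u["ratings"][item] + rng.uniform(-noise_scale, noise_scale)
                 for u in members]
    return sum(perturbed) / len(perturbed)


users = [
    {"profile": {"city": "Austin", "likes": "jazz"}, "ratings": {"album-1": 4}},
    {"profile": {"city": "Austin", "likes": "jazz"}, "ratings": {"album-1": 2}},
    {"profile": {"city": "Boston", "likes": "rock"}, "ratings": {"album-1": 5}},
]


def austin_jazz(user):
    return user["profile"]["likes"] == "jazz" and user["profile"]["city"] == "Austin"


estimate = recommend(users, austin_jazz, "album-1", rng=random.Random(0))
print(estimate)  # close to 3.0, off by at most noise_scale
```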

    Processing data using sequential dependencies
    3.
    Granted patent
    Processing data using sequential dependencies (status: in force)

    Publication No.: US08645309B2

    Publication date: 2014-02-04

    Application No.: US12592586

    Filing date: 2009-11-30

    CPC classification: G06N7/00 G06N5/00

    Abstract: The specification describes data processes for analyzing large data streams for target anomalies. “Sequential dependencies” (SDs) are chosen for ordered data, and a framework is presented for discovering which subsets of the data obey a given sequential dependency. Given an interval G, an SD on attributes X and Y, written as X →_G Y, denotes that the distance between the Y-values of any two consecutive records, when sorted on X, is within G. SDs may be extended to Conditional Sequential Dependencies (CSDs), consisting of an underlying SD plus a representation of the subsets of the data that satisfy the SD. The conditional approximate sequential dependencies may be expressed as pattern tableaux, i.e., compact representations of the subsets of the data that satisfy the underlying dependency.
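
The definition of X →_G Y lends itself to a short Python sketch. The function names are hypothetical, and two simplifying assumptions are made: the "distance" is taken as the signed difference between consecutive Y-values, and the subsets reported are maximal runs that satisfy the SD exactly, whereas the abstract also allows approximate satisfaction.

```python
def satisfies_sd(records, lo, hi):
    """Check X ->_G Y with G = [lo, hi] on a list of (x, y) records.

    After sorting on x, every consecutive difference in y must lie in G.
    """
    ordered = sorted(records, key=lambda r: r[0])
    return all(lo <= b[1] - a[1] <= hi for a, b in zip(ordered, ordered[1:]))


def satisfying_runs(records, lo, hi):
    """Return maximal x-ranges (x_start, x_end) whose records obey the SD.

    This is a crude stand-in for a pattern tableau: a compact list of the
    subsets of the data on which the dependency holds.
    """
    ordered = sorted(records, key=lambda r: r[0])
    runs, start = [], 0
    for i in range(1, len(ordered)):
        gap = ordered[i][1] - ordered[i - 1][1]
        if not (lo <= gap <= hi):
            if i - start >= 2:  # a run needs at least two records
                runs.append((ordered[start][0], ordered[i - 1][0]))
            start = i
    if len(ordered) - start >= 2:
        runs.append((ordered[start][0], ordered[-1][0]))
    return runs


# Hypothetical readings: (sequence number, counter value); expected gap is 4..5.
readings = [(1, 10), (2, 14), (3, 19), (4, 40), (5, 44), (6, 49)]

print(satisfies_sd(readings, 4, 5))     # False: the jump from 19 to 40 violates G
print(satisfying_runs(readings, 4, 5))  # [(1, 3), (4, 6)]
```

The break between the two reported ranges marks the anomaly, which is the kind of target the abstract describes looking for in ordered data.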

    Online Data Fusion
    4.
    Patent application
    Online Data Fusion (status: in force)

    Publication No.: US20130144843A1

    Publication date: 2013-06-06

    Application No.: US13311034

    Filing date: 2011-12-05

    IPC classification: G06F17/30

    CPC classification: G06F17/30634

    Abstract: An online data fusion system receives a query, probes a first source for an answer to the query, returns the answer from the first source, refreshes the answer while probing an additional source, and applies fusion techniques on data associated with an answer that is retrieved from the additional source. For each retrieved answer, the online data fusion system computes the probability that the answer is correct and stops retrieving data for the answer after gaining enough confidence that data retrieved from the unprocessed sources are unlikely to change the answer. The online data fusion system returns correct answers and terminates probing additional sources in an expeditious manner without sacrificing the quality of the answers.
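
The probe, refresh, and early-termination loop in this abstract can be sketched in Python as follows. Note the substitution: the abstract speaks of computing the probability that an answer is correct, while this sketch uses plain weighted voting and stops once the unprobed sources could no longer overturn the leader. The `fuse_online` function, the integer trust weights, and the sample sources are all hypothetical.

```python
def fuse_online(sources):
    """Probe sources in order and stop once the answer can no longer change.

    sources -- list of (weight, answer) pairs in probe order; weight is a
               hypothetical trust score for the source.
    Returns (best_answer, sources_probed, history), where history holds the
    answer that would have been shown to the user after each probe.
    """
    if not sources:
        raise ValueError("at least one source is required")
    remaining = sum(weight for weight, _ in sources)
    votes = {}
    history = []
    best, probed = None, 0
    for probed, (weight, answer) in enumerate(sources, start=1):
        remaining -= weight
        votes[answer] = votes.get(answer, 0) + weight
        best = max(votes, key=votes.get)
        history.append(best)            # refresh the answer after each probe
        ranked = sorted(votes.values(), reverse=True)
        runner_up = ranked[1] if len(ranked) > 1 else 0
        # Even if every unprobed source backed the runner-up (or a brand-new
        # answer), the leader would still win, so probing can stop.
        if ranked[0] - runner_up > remaining:
            break
    return best, probed, history


sources = [
    (3, "Trenton"),
    (2, "Newark"),
    (3, "Trenton"),
    (1, "Newark"),
    (1, "Princeton"),
]

print(fuse_online(sources))  # ('Trenton', 3, ['Trenton', 'Trenton', 'Trenton'])
```

In this run the last two sources are never contacted, because their combined weight of 2 cannot close the lead of 4.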

    Methods and systems to store state used to forward multicast traffic
    5.
    Granted patent
    Methods and systems to store state used to forward multicast traffic (status: in force)

    Publication No.: US08295203B2

    Publication date: 2012-10-23

    Application No.: US12060709

    Filing date: 2008-04-01

    IPC classification: H04L12/28

    Abstract: Methods and systems are described to store state used to forward multicast traffic. The system includes a receiving module to receive a request to add a first node to a membership tree. The membership tree includes a first plurality of nodes associated with a multicast group. The system further includes a processing module to identify a second node in the first plurality of nodes and to communicate a node identifier that identifies the first node over a network to the second node. The node identifier is to be stored at the second node to add the first node to the membership tree. The node identifier is further to be stored in the membership tree exclusively at the second node to enable the second node to forward the multicast traffic to the first node.
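
A minimal Python sketch of the membership tree follows. It assumes a fixed fanout and a breadth-first choice of the "second node", neither of which the abstract specifies; `MemberNode`, `add_member`, `forward`, and `holders_of` are hypothetical names. The point it illustrates is that a new member's identifier is stored at exactly one existing node, which then forwards traffic to it.

```python
from collections import deque


class MemberNode:
    """A multicast group member that stores state only for its own children."""

    def __init__(self, node_id, fanout=2):
        self.node_id = node_id
        self.fanout = fanout
        self.children = {}  # child identifier -> MemberNode (stands in for an address)


def add_member(root, new_id):
    """Add new_id to the tree and return the id of the node that stores it."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if len(node.children) < node.fanout:
            # The identifier is stored at this node and nowhere else.
            node.children[new_id] = MemberNode(new_id, node.fanout)
            return node.node_id
        queue.extend(node.children.values())


def forward(node, packet, delivered):
    """Deliver packet to node, then forward it using only local child state."""
    delivered.append(node.node_id)
    for child in node.children.values():
        forward(child, packet, delivered)


def holders_of(root, node_id):
    """List the ids of all nodes whose stored state contains node_id."""
    found, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        if node_id in node.children:
            found.append(node.node_id)
        queue.extend(node.children.values())
    return found


root = MemberNode("A")
parents = [add_member(root, name) for name in ["B", "C", "D", "E"]]
print(parents)                # ['A', 'A', 'B', 'B']

delivered = []
forward(root, "hello", delivered)
print(delivered)              # ['A', 'B', 'D', 'E', 'C']
print(holders_of(root, "D"))  # ['B']
```

Because no node holds the full membership list, per-node state stays bounded by the fanout regardless of group size.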

    Selectivity estimation of set similarity selection queries
    6.
    Granted patent
    Selectivity estimation of set similarity selection queries (status: lapsed)

    Publication No.: US08161046B2

    Publication date: 2012-04-17

    Application No.: US12274546

    Filing date: 2008-11-20

    IPC classification: G06F17/30 G06F7/00

    CPC classification: G06F17/30469

    Abstract: The invention relates to a system and/or methodology for selectivity estimation of set similarity queries. More specifically, the invention relates to a selectivity estimation technique employing hashed sampling. The invention provides for samples, constructed a priori, that can efficiently and quickly provide accurate estimates for arbitrary queries and that can also be updated efficiently.
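
The abstract gives no detail on how the hashed samples are built, so the Python sketch below is only one plausible reading, under stated assumptions: a set is kept in the sample when a stable hash of its identifier falls below the sampling rate, similarity is measured with the Jaccard coefficient, and the estimate is the number of matching sampled sets scaled by the inverse rate. All function names are hypothetical.

```python
import hashlib


def unit_hash(key):
    """Map a string key to a stable value in [0, 1)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def build_sample(sets_by_id, rate):
    """Build the sample ahead of time, independent of any query."""
    return {sid: s for sid, s in sets_by_id.items() if unit_hash(sid) < rate}


def update_sample(sample, set_id, new_set, rate):
    """Incremental update: a new set is tested with the same hash rule."""
    if unit_hash(set_id) < rate:
        sample[set_id] = new_set


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def estimate_selectivity(sample, rate, query, threshold):
    """Estimate how many stored sets have Jaccard similarity >= threshold."""
    hits = sum(1 for s in sample.values() if jaccard(s, query) >= threshold)
    return hits / rate


database = {"a": {1, 2, 3}, "b": {1, 2, 4}, "c": {7, 8}}

# With rate 1.0 every set is sampled, so the estimate equals the exact count.
full = build_sample(database, 1.0)
print(estimate_selectivity(full, 1.0, {1, 2, 3}, 0.5))  # 2.0
```

Because membership in the sample depends only on the hash of the identifier, the same sample serves arbitrary queries and thresholds, and inserting a set never requires rebuilding it.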

    FORWARD DECAY TEMPORAL DATA ANALYSIS
    9.
    Patent application
    FORWARD DECAY TEMPORAL DATA ANALYSIS (status: in force)

    Publication No.: US20110066600A1

    Publication date: 2011-03-17

    Application No.: US12560214

    Filing date: 2009-09-15

    IPC classification: G06F17/30

    Abstract: A disclosed method for implementing time decay in the analysis of streaming data objects is based on the age, referred to herein as the forward age, of a data object measured from a landmark time in the past to a time associated with the occurrence of the data object, e.g., an object's timestamp. A forward time decay function is parameterized on the forward age. Because a data object's forward age does not depend on the current time, a value of the forward time decay function is determined just once for each data object. A scaling factor or weight associated with a data object may be weighted according to its decay function value. Forward time decay functions are beneficial in determining decayed aggregates, including decayed counts, sums, and averages, decayed minimums and maximums, and for drawing decay-influenced samples.
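
Forward decay can be sketched in a few lines of Python. The sketch assumes an exponential decay function g(age) = exp(alpha * age) and normalizes by g(now - landmark) at query time; the abstract itself does not fix a particular function or normalization, and the class and method names are hypothetical. The key property from the abstract is visible in `add`: each item's decay value depends only on its own timestamp and the landmark, so it is computed once and never revisited.

```python
import math


class ForwardDecaySum:
    """Decayed sum, count, and average over a stream using forward decay."""

    def __init__(self, landmark, alpha):
        self.landmark = landmark   # fixed reference time in the past
        self.alpha = alpha         # decay rate (assumed exponential form)
        self.weighted_sum = 0.0
        self.weight_total = 0.0

    def g(self, age):
        return math.exp(self.alpha * age)

    def add(self, timestamp, value):
        # Forward age is measured from the landmark to the item's timestamp,
        # so this weight is computed exactly once per item.
        w = self.g(timestamp - self.landmark)
        self.weighted_sum += w * value
        self.weight_total += w

    def decayed_sum(self, now):
        # The current time enters only through a single normalizing factor.
        return self.weighted_sum / self.g(now - self.landmark)

    def decayed_count(self, now):
        return self.weight_total / self.g(now - self.landmark)

    def decayed_average(self):
        # The normalizing factor cancels, so no current time is needed.
        if self.weight_total == 0.0:
            return None
        return self.weighted_sum / self.weight_total


agg = ForwardDecaySum(landmark=0.0, alpha=math.log(2))  # weight doubles per time unit
agg.add(1.0, 10.0)   # weight 2
agg.add(3.0, 20.0)   # weight 8

print(agg.decayed_sum(now=3.0))   # approximately 22.5  (180 / 8)
print(agg.decayed_sum(now=4.0))   # approximately 11.25 (180 / 16)
print(agg.decayed_average())      # approximately 18.0  (180 / 10)
```

Compared with backward decay, where every stored weight would have to be rescaled as time advances, only two running totals are kept here.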

    SELECTIVITY ESTIMATION OF SET SIMILARITY SELECTION QUERIES
    10.
    Patent application
    SELECTIVITY ESTIMATION OF SET SIMILARITY SELECTION QUERIES (status: lapsed)

    Publication No.: US20100125559A1

    Publication date: 2010-05-20

    Application No.: US12274546

    Filing date: 2008-11-20

    IPC classification: G06F7/06 G06F17/30

    CPC classification: G06F17/30469

    Abstract: The invention relates to a system and/or methodology for selectivity estimation of set similarity queries. More specifically, the invention relates to a selectivity estimation technique employing hashed sampling. The invention provides for samples, constructed a priori, that can efficiently and quickly provide accurate estimates for arbitrary queries and that can also be updated efficiently.