Algorithms and estimators for summarization of unaggregated data streams
    1.
    Type: Granted patent
    Status: Lapsed

    Publication number: US07746808B2

    Publication date: 2010-06-29

    Application number: US12136725

    Filing date: 2008-06-10

    CPC classification number: H04L43/024

    Abstract: The invention relates to streaming algorithms for obtaining summaries over unaggregated packet streams and for providing unbiased estimators of characteristics such as the amount of traffic that belongs to a specified subpopulation of flows. Packets are sampled from a packet stream, aggregated into flows, and counted using Adaptive Sample-and-Hold (ASH) or Adaptive NetFlow (ANF). The sampling rate is adjusted in steps, based on the number of flows, to obtain a sketch of a predetermined size. Following each adjustment of the sampling rate, the count of aggregated packets is transferred from SRAM to DRAM and the count in SRAM is initialized.
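    The counting scheme in the abstract can be illustrated with a minimal sketch of sample-and-hold with a stepwise-decreasing sampling rate. This is not the patented implementation: the function name, the `shrink` factor, and the resampling rule are assumptions for illustration, and the SRAM-to-DRAM transfer and the unbiased estimators are not modeled.

```python
import random


def adaptive_sample_and_hold(packets, k, rng, shrink=0.5):
    """Count packets per flow while keeping at most k flows cached.

    Hypothetical sketch: `packets` is an iterable of flow keys and
    `shrink` is an assumed step factor for lowering the sampling rate.
    """
    p = 1.0          # current sampling rate
    counts = {}      # flow key -> packets counted so far
    for flow in packets:
        if flow in counts:
            counts[flow] += 1            # flow is held: count every packet
        elif rng.random() < p:
            counts[flow] = 1             # start holding a newly sampled flow
        while len(counts) > k:           # sketch too large: lower the rate
            new_p = p * shrink
            for f in list(counts):
                c = counts[f]
                # The first counted packet survives with probability new_p / p.
                if rng.random() >= new_p / p:
                    c -= 1
                    # Later packets are re-sampled at new_p until one is held.
                    while c > 0 and rng.random() >= new_p:
                        c -= 1
                if c == 0:
                    del counts[f]
                else:
                    counts[f] = c
            p = new_p
    return counts, p
```

    Calling `adaptive_sample_and_hold(keys, 10, random.Random(1))` returns the cached flow counts together with the final sampling rate; every reported count is at most the true packet count of that flow.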


    Method and apparatus for improving end to end performance of a data network
    2.
    Type: Granted patent
    Status: Lapsed

    Publication number: US06330561B1

    Publication date: 2001-12-11

    Application number: US09105018

    Filing date: 1998-06-26

    Abstract: A method and apparatus provide improved cache coherency and more effective caching operations without placing an undue burden on network links. A proxy receives a request for a resource and then, depending on information in the proxy cache, generates a resource request for transmission to a resource server. The proxy appends a proxy filter to the request. The resource server maintains one or more volumes of resources based on a predetermined criterion that can be either static or dynamic. Upon receipt of the request and the proxy filter, the resource server generates a request response and a piggybacked list of additional resources selected from the volume with which the requested resource is associated.


    Retrieval system and method
    3.
    Type: Granted patent
    Status: Lapsed

    Publication number: US5950189A

    Publication date: 1999-09-07

    Application number: US775913

    Filing date: 1997-01-02

    Abstract: The invention is an improved retrieval system and method. Many pattern recognition tasks, including estimation, classification, and the finding of similar objects, make use of linear models. For example, many text retrieval systems represent queries as linear functions, and retrieve documents whose vector representation has a high dot product with the query. The fundamental operation in such tasks is the computation of the dot product between a query vector and a large database of instance vectors. Often instance vectors which have high dot products with the query are of interest. The invention relates to a random sampling based retrieval system that can identify, for any given query vector, those instance vectors which have large dot products, while avoiding explicit computation of all dot products.
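    One way to picture sampling-based retrieval of high dot products is the sketch below. It assumes all vector entries are non-negative and uses a hypothetical function name; it is an illustration of the general idea, not the method claimed in the patent. Each draw picks a coordinate and then an instance so that instance j is hit with probability proportional to its dot product with the query, so frequently hit instances are the likely high scorers.

```python
import random
from collections import Counter


def sample_high_dot_products(query, instances, n_samples, rng):
    """Return hit counts whose expectation is proportional to query . x_j.

    Assumes non-negative entries in `query` and in every instance vector.
    """
    dims = range(len(query))
    # Column sums let us pick a coordinate i with probability
    # proportional to query[i] * sum_j instances[j][i].
    col_sums = [sum(x[i] for x in instances) for i in dims]
    coord_weights = [query[i] * col_sums[i] for i in dims]
    hits = Counter()
    for _ in range(n_samples):
        i = rng.choices(dims, weights=coord_weights)[0]
        # Within coordinate i, pick instance j proportionally to x_j[i].
        column = [x[i] for x in instances]
        j = rng.choices(range(len(instances)), weights=column)[0]
        hits[j] += 1
    return hits
```

    The per-sample loop here still scans a whole column, so a practical system would precompute per-coordinate sampling structures; the sketch only shows why the hit counts track the dot products.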


    Method for summarizing data in unaggregated data streams
    5.
    Type: Patent application
    Status: In force

    Publication number: US20110153554A1

    Publication date: 2011-06-23

    Application number: US12653831

    Filing date: 2009-12-18

    CPC classification number: H04L43/028 H04L43/04

    Abstract: A method for producing a summary A of data points in an unaggregated data stream wherein the data points are in the form of weighted keys (a, w) where a is a key and w is a weight, and the summary is a sample of k keys a with adjusted weights wa. A first reservoir L includes keys whose adjusted weights are the sums of the weights of the individual data points of the included keys, and a second reservoir T includes keys whose adjusted weights are each equal to a threshold value τ, which is adjusted based upon tests of new data points arriving in the data stream. The summary combines the keys and adjusted weights of the first reservoir L with the keys and adjusted weights of the second reservoir T to form the sample representing the data stream, upon which further analysis may be performed. The method proceeds by first merging new data points in the stream into the reservoir L until the reservoir contains k different keys, and thereafter applying a series of tests to newly arriving data points to determine what keys and weights are to be added to or removed from the reservoirs L and T, to provide a summary with a variance that approaches the minimum possible for aggregated data sets. The method is composable, can be applied to high speed data streams such as those found on the Internet, and can be implemented efficiently.


    Variance-Optimal Sampling-Based Estimation of Subset Sums
    6.
    Type: Patent application
    Status: Lapsed

    Publication number: US20100138529A1

    Publication date: 2010-06-03

    Application number: US12325340

    Filing date: 2008-12-01

    CPC classification number: G06F17/18 H04L41/142 H04L43/024 H04L43/16

    Abstract: The present invention relates to a method of obtaining a generic sample of an input stream. The method is designated as VAROPTk. The method comprises receiving an input stream of items arriving one at a time, and maintaining a sample S of items i. The sample S has a capacity for at most k items i. The sample S is filled with k items i. An nth item i is received. It is determined whether the nth item i should be included in sample S. If the nth item i is included in sample S, then a previously included item i is dropped from sample S. The determination is made based on weights of items without distinguishing between previously included items i and the nth item i. The determination is implemented thereby updating weights of items i in sample S. The method is repeated until no more items are received.
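    The update loop described in the abstract can be sketched as follows. This is a simple, unoptimized rendering of a VarOpt-style step under stated assumptions (positive weights, k of at least 1, a hypothetical function name); it is not claimed to match the patented method in detail. After each arrival the k+1 candidates are treated alike: a threshold `tau` is chosen so the keep probabilities min(1, w/tau) sum to k, one item is dropped, and the surviving small items take adjusted weight `tau`.

```python
import random


def varopt_sample(stream, k, rng):
    """Keep k weighted items; `stream` yields (item, weight) pairs."""
    sample = []  # entries are [item, adjusted_weight]
    for item, weight in stream:
        sample.append([item, weight])
        if len(sample) <= k:
            continue                      # still filling the sample
        sample.sort(key=lambda e: e[1])   # smallest adjusted weights first
        # Grow the set of t "small" items until tau = (their sum) / (t - 1)
        # does not exceed the next larger weight.
        t, total = 1, sample[0][1]
        while True:
            t += 1
            total += sample[t - 1][1]
            tau = total / (t - 1)
            if t == len(sample) or sample[t][1] >= tau:
                break
        # Drop exactly one small item; drop probabilities 1 - w/tau sum to 1.
        drop = [max(0.0, 1.0 - e[1] / tau) for e in sample[:t]]
        victim = rng.choices(range(t), weights=drop)[0]
        del sample[victim]
        for entry in sample[:t - 1]:
            entry[1] = tau                # surviving small items get tau
    return sample
```

    A useful sanity check on this sketch is that the adjusted weights in the sample always sum to the total weight of the items seen so far, because the t small items with total weight (t - 1) * tau are replaced by t - 1 items of weight tau each.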


    METHOD AND APPARATUS FOR EFFICIENT ROUTING OF VARIABLE TRAFFIC
    8.
    Type: Patent application
    Status: In force

    Publication number: US20080239991A1

    Publication date: 2008-10-02

    Application number: US12132532

    Filing date: 2008-06-03

    CPC classification number: H04L41/5009 H04L43/0882 H04L45/12 H04L45/14

    Abstract: A method and apparatus provide highly efficient traffic routing for a wide range of possible traffic matrices (TM) in an intra-domain network. The routing optimally balances the traffic loads over a range of traffic matrices so as to minimize the deviation for any particular traffic matrix from the optimal routing. Such a routing provides a guaranteed performance ratio against the best possible network routing. The invention utilizes a method of optimally configuring a traffic network based on solving a linear program to obtain the optimal routing, and then configuring the routing on the network accordingly.


    Methods and systems to estimate query responses based on data set sketches
    10.
    Type: Granted patent
    Status: In force

    Publication number: US08738618B2

    Publication date: 2014-05-27

    Application number: US12334152

    Filing date: 2008-12-12

    CPC classification number: G06F17/3053 G06F17/30979

    Abstract: Methods and systems for estimate derivation are described. In one embodiment, a query may be received with a predicate for sets over a collection of items. Associated samples associated with the query may be accessed. Items of an associated sample may be accessed from the collection of items. A determination of whether the predicate is an attribute-based selection from a union of at least some sets may be made. Available items of the particular associated sample may be selected from the items. Identified items may be identified among the available items in the associated sample that satisfy the predicate. An adjusted weight may be assigned to an item based on a weight of the item and a distribution of the associated samples. An estimate may be generated based on the adjusted weight of the identified items of the associated samples that satisfy the predicate. Additional methods and systems are disclosed.
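    The final estimation step in the abstract, summing adjusted weights over sampled items that satisfy the predicate, might look like the sketch below. The abstract does not say how the adjusted weight is derived from the sample distribution, so the sketch assumes a Horvitz-Thompson style adjustment (weight divided by inclusion probability); the function name and the tuple layout are also assumptions.

```python
def estimate_subset_sum(sample, predicate):
    """Estimate the total weight of all items matching `predicate`.

    `sample` holds (item, weight, inclusion_probability) tuples for the
    sampled items only. Assumed adjustment: weight / inclusion_probability.
    """
    estimate = 0.0
    for item, weight, prob in sample:
        if not 0.0 < prob <= 1.0:
            raise ValueError("inclusion probability must be in (0, 1]")
        if predicate(item):
            estimate += weight / prob   # adjusted weight of a matching item
    return estimate


# Example: estimate the traffic on port 80 from three sampled records.
records = [
    ({"port": 80}, 10.0, 0.5),
    ({"port": 22}, 4.0, 1.0),
    ({"port": 80}, 3.0, 0.25),
]
port_80_total = estimate_subset_sum(records, lambda r: r["port"] == 80)
```

    Items that were not sampled contribute nothing, while sampled items are scaled up to stand in for them, which is what makes this kind of estimate unbiased when the inclusion probabilities are correct.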

