-
公开(公告)号:US08005949B2
公开(公告)日:2011-08-23
申请号:US12325340
申请日:2008-12-01
申请人: Nicholas Duffield , Carsten Lund , Mikkel Thorup , Edith Cohen , Haim Kaplan
发明人: Nicholas Duffield , Carsten Lund , Mikkel Thorup , Edith Cohen , Haim Kaplan
IPC分类号: G06F15/173
CPC分类号: G06F17/18 , H04L41/142 , H04L43/024 , H04L43/16
摘要: The present invention relates to a method of obtaining a generic sample of an input stream. The method is designated as VAROPTk. The method comprises receiving an input stream of items arriving one at a time, and maintaining a sample S of items i. The sample S has a capacity for at most k items i. The sample S is filled with k items i. An nth item i is received. It is determined whether the nth item i should be included in sample S. If the nth item i is included in sample S, then a previously included item i is dropped from sample S. The determination is made based on weights of items without distinguishing between previously included items i and the nth item i. The determination is implemented thereby updating weights of items i in sample S. The method is repeated until no more items are received.
摘要翻译: 本发明涉及一种获得输入流的通用样本的方法。 该方法被指定为VAROPTk。 该方法包括一次接收一个物品的输入流,并且保持项目i的样本S. 样本S具有最多k个项目i的容量。 样本S填充有k个项目i。 收到第n项。 确定第n个项目i是否应该包含在样本S中。如果第n个项目i包括在样本S中,则先前包括的项目i从样本S中丢弃。根据项目的权重进行确定,而不区分 以前包括项目i和第n项目i。 由此实现确定,从而更新样本S中的项目i的权重。重复该方法,直到不再收到项目。
-
2.
公开(公告)号:US07764625B2
公开(公告)日:2010-07-27
申请号:US12136705
申请日:2008-06-10
申请人: Nicholas Duffield , Edith Cohen , Haim Kaplan , Carsten Lund , Mikkel Thorup
发明人: Nicholas Duffield , Edith Cohen , Haim Kaplan , Carsten Lund , Mikkel Thorup
IPC分类号: H04L12/26
CPC分类号: H04L47/10 , H04L43/024 , H04L43/026 , H04L43/045 , H04L47/22 , H04L47/41 , Y02D50/30
摘要: The invention relates to streaming algorithms useful for obtaining summaries over unaggregated packet streams and for providing unbiased estimators for characteristics, such as, the amount of traffic that belongs to a specified subpopulation of flows. Packets are sampled from a packet stream and aggregated into flows and counted by implementation of: (a) Adaptive Sampled NetFlow (ANF), and adjusted weight (AANF) of a flow (f) is calculated as follows: AANF(f)=i(f)/p′; i(f) being the number of packets counted for a flow f, and p′ being the sampling rate at end of a measurement period; or (b) Adaptive Sample-and-Hold (ASH), and adjusted weight (AASH) of a flow (f) is calculated as follows: AASH(f)=i(f)+(1−p′)/p′; i(f) being the number of packets counted for a flow f, and p′ being the sampling rate at end of a measurement period.
摘要翻译: 本发明涉及用于在未分组的分组流上获得摘要的用于提供用于特征的无偏估计器的流式传输算法,例如属于指定的流量子群的业务量。 分组从分组流中采样并聚合成流,并通过实现计算:(a)自适应采样NetFlow(ANF)和流(f)的调整权重(AANF)计算如下:AANF(f)= i (f)/ p'; i(f)是流f计数的分组数,p'是测量周期结束时的采样率; 或(b)自适应采样保持(ASH)和流(f)的调整权重(AASH)如下计算:AASH(f)= i(f)+(1-p')/ p' ; i(f)是流f计数的分组数,p'是测量周期结束时的采样率。
-
3.
公开(公告)号:US07746808B2
公开(公告)日:2010-06-29
申请号:US12136725
申请日:2008-06-10
申请人: Nicholas Duffield , Edith Cohen , Haim Kaplan , Carsten Lund , Mikkel Thorup
发明人: Nicholas Duffield , Edith Cohen , Haim Kaplan , Carsten Lund , Mikkel Thorup
IPC分类号: H04L12/26
CPC分类号: H04L43/024
摘要: The invention relates to streaming algorithms useful for obtaining summaries over unaggregated packet streams and for providing unbiased estimators for characteristics, such as, the amount of traffic that belongs to a specified subpopulation of flows. Packets are sampled from a packet stream and aggregated into flows and counted by implementation of Adaptive Sample-and-Hold (ASH) or Adaptive NetFlow (ANF), adjusting the sampling rate based on a quantity of flows to obtain a sketch having a predetermined size, the sampling rate being adjusted in steps; and transferring the count of aggregated packets from SRAM to DRAM and initializing the count in SRAM following adjustment of the sampling rate.
摘要翻译: 本发明涉及用于在未分组的分组流上获得摘要的用于提供用于特征的无偏估计器的流式传输算法,例如属于指定的流量子群的业务量。 分组从分组流中采样并聚合成流,并通过实施自适应采样保持(ASH)或自适应净流(ANF)进行计数,根据流量调整采样率,以获得具有预定尺寸的草图 采样率逐步调整; 并将汇总数据包从SRAM传输到DRAM,并在采样率调整后初始化SRAM中的计数。
-
公开(公告)号:US20110153554A1
公开(公告)日:2011-06-23
申请号:US12653831
申请日:2009-12-18
申请人: Edith Cohen , Nicholas Duffield , Haim Kaplan , Carsten Lund , Mikkel Thorup
发明人: Edith Cohen , Nicholas Duffield , Haim Kaplan , Carsten Lund , Mikkel Thorup
IPC分类号: G06F17/30
CPC分类号: H04L43/028 , H04L43/04
摘要: A method for producing a summary A of data points in an unaggregated data stream wherein the data points are in the form of weighted keys (a, w) where a is a key and w is a weight, and the summary is a sample of k keys a with adjusted weights wa. A first reservoir L includes keys having adjusted weights which are additions of weights of individual data points of included keys and a second reservoir T includes keys having adjusted weights which are each equal to a threshold value τ whose value is adjusted based upon tests of new data points arriving in the data stream. The summary combines the keys and adjusted weights of the first reservoir L with the keys and adjusted weights of the second reservoir T to form the sample representing the data stream upon which further analysis may be performed. The method proceeds by first merging new data points in the stream into the reservoir L until the reservoir contains k different keys and thereafter applying a series of tests to new arriving data points to determine what keys and weights are to be added to or removed the reservoirs L and T to provide a summary with a variance that approaches the minimum possible for aggregated data sets. The method is composable, can be applied to high speed data streams such as those found on the Internet, and can be implemented efficiently.
摘要翻译: 一种用于产生未聚集数据流中的数据点的摘要A的方法,其中数据点是加权密钥(a,w)的形式,其中a是密钥,w是权重,并且摘要是k的样本 键a与调整权重wa。 第一储存器L包括具有调整权重的密钥,这些密钥是附加密钥的各个数据点的加权的加法,而第二储存器T包括具有调整的权重的密钥,其各自等于基于新数据的测试来调整其值的阈值τ 到达数据流的点。 总结将第一储层L的密钥和调整的权重与密钥和第二储存器T的调整权重组合,以形成表示可以进行进一步分析的数据流的样本。 该方法通过首先将流中的新数据点合并到储存器L中,直到储存器包含k个不同的密钥,然后对新的到达数据点应用一系列测试,以确定要添加到或移除存储器的哪些密钥和权重 L和T提供一个总结,其方差接近汇总数据集的最小可能性。 该方法是可组合的,可以应用于诸如在因特网上发现的高速数据流,并且可以有效地实现。
-
公开(公告)号:US20100138529A1
公开(公告)日:2010-06-03
申请号:US12325340
申请日:2008-12-01
申请人: Nicholas Duffield , Carsten Lund , Mikkel Thorup , Edith Cohen , Haim Kaplan
发明人: Nicholas Duffield , Carsten Lund , Mikkel Thorup , Edith Cohen , Haim Kaplan
CPC分类号: G06F17/18 , H04L41/142 , H04L43/024 , H04L43/16
摘要: The present invention relates to a method of obtaining a generic sample of an input stream. The method is designated as VAROPTk. The method comprises receiving an input stream of items arriving one at a time, and maintaining a sample S of items i. The sample S has a capacity for at most k items i. The sample S is filled with k items i. An nth item i is received. It is determined whether the nth item i should be included in sample S. If the nth item i is included in sample S, then a previously included item i is dropped from sample S. The determination is made based on weights of items without distinguishing between previously included items i and the nth item i. The determination is implemented thereby updating weights of items i in sample S. The method is repeated until no more items are received.
摘要翻译: 本发明涉及一种获得输入流的通用样本的方法。 该方法被指定为VAROPTk。 该方法包括一次接收一个物品的输入流,并且保持项目i的样本S. 样本S具有最多k个项目i的容量。 样本S填充有k个项目i。 收到第n项。 确定第n个项目i是否应该包含在样本S中。如果第n个项目i包括在样本S中,则先前包括的项目i从样本S中丢弃。根据项目的权重进行确定,而不区分 以前包括项目i和第n项目i。 由此实现确定,从而更新样本S中的项目i的权重。重复该方法,直到不再收到项目。
-
公开(公告)号:US08195710B2
公开(公告)日:2012-06-05
申请号:US12653831
申请日:2009-12-18
申请人: Edith Cohen , Nicholas Duffield , Haim Kaplan , Carsten Lund , Mikkel Thorup
发明人: Edith Cohen , Nicholas Duffield , Haim Kaplan , Carsten Lund , Mikkel Thorup
IPC分类号: G06F17/00
CPC分类号: H04L43/028 , H04L43/04
摘要: A method for producing a summary A of data points in an unaggregated data stream wherein the data points are in the form of weighted keys (a, w) where a is a key and w is a weight, and the summary is a sample of k keys a with adjusted weights wa. A first reservoir L includes keys having adjusted weights which are additions of weights of individual data points of included keys and a second reservoir T includes keys having adjusted weights which are each equal to a threshold value τ whose value is adjusted based upon tests of new data points arriving in the data stream. The summary combines the keys and adjusted weights of the first reservoir L with the keys and adjusted weights of the second reservoir T to form the sample representing the data stream upon which further analysis may be performed. The method proceeds by first merging new data points in the stream into the reservoir L until the reservoir contains k different keys and thereafter applying a series of tests to new arriving data points to determine what keys and weights are to be added to or removed the reservoirs L and T to provide a summary with a variance that approaches the minimum possible for aggregated data sets. The method is composable, can be applied to high speed data streams such as those found on the Internet, and can be implemented efficiently.
摘要翻译: 一种用于产生未聚集数据流中的数据点的摘要A的方法,其中数据点是加权密钥(a,w)的形式,其中a是密钥,w是权重,并且摘要是k的样本 键a与调整权重wa。 第一储存器L包括具有调整权重的密钥,这些密钥是附加密钥的各个数据点的加权的加法,而第二储存器T包括具有调整的权重的密钥,其各自等于基于新数据的测试来调整其值的阈值τ 到达数据流的点。 总结将第一储层L的密钥和调整的权重与密钥和第二储存器T的调整权重组合,以形成表示可以进行进一步分析的数据流的样本。 该方法通过首先将流中的新数据点合并到储存器L中,直到储存器包含k个不同的密钥,然后对新的到达数据点应用一系列测试,以确定要添加到或移除存储器的哪些密钥和权重 L和T提供一个总结,其方差接近汇总数据集的最小可能性。 该方法是可组合的,可以应用于诸如在因特网上发现的高速数据流,并且可以有效地实现。
-
7.
公开(公告)号:US20090303901A1
公开(公告)日:2009-12-10
申请号:US12136725
申请日:2008-06-10
申请人: Nicholas Duffield , Edith Cohen , Haim Kaplan , Carsten Lund , Mikkel Thorup
发明人: Nicholas Duffield , Edith Cohen , Haim Kaplan , Carsten Lund , Mikkel Thorup
IPC分类号: H04L12/26
CPC分类号: H04L43/024
摘要: The invention relates to streaming algorithms useful for obtaining summaries over unaggregated packet streams and for providing unbiased estimators for characteristics, such as, the amount of traffic that belongs to a specified subpopulation of flows. Packets are sampled from a packet stream and aggregated into flows and counted by implementation of Adaptive Sample-and-Hold (ASH) or Adaptive NetFlow (ANF), adjusting the sampling rate based on a quantity of flows to obtain a sketch having a predetermined size, the sampling rate being adjusted in steps; and transferring the count of aggregated packets from SRAM to DRAM and initializing the count in SRAM following adjustment of the sampling rate.
摘要翻译: 本发明涉及用于在未分组的分组流上获得摘要的用于提供用于特征的无偏估计器的流式传输算法,例如属于指定的流量子群的业务量。 分组从分组流中采样并聚合成流,并通过实施自适应采样保持(ASH)或自适应净流(ANF)进行计数,根据流量调整采样率,以获得具有预定尺寸的草图 采样率逐步调整; 并将汇总数据包从SRAM传输到DRAM,并在采样率调整后初始化SRAM中的计数。
-
8.
公开(公告)号:US20090303879A1
公开(公告)日:2009-12-10
申请号:US12136705
申请日:2008-06-10
申请人: Nicholas Duffield , Edith Cohen , Haim Kaplan , Carsten Lund , Mikkel Thorup
发明人: Nicholas Duffield , Edith Cohen , Haim Kaplan , Carsten Lund , Mikkel Thorup
IPC分类号: H04L12/56
CPC分类号: H04L47/10 , H04L43/024 , H04L43/026 , H04L43/045 , H04L47/22 , H04L47/41 , Y02D50/30
摘要: The invention relates to streaming algorithms useful for obtaining summaries over unaggregated packet streams and for providing unbiased estimators for characteristics, such as, the amount of traffic that belongs to a specified subpopulation of flows. Packets are sampled from a packet stream and aggregated into flows and counted by implementation of: (a) Adaptive Sampled NetFlow (ANF), and adjusted weight (AANF) of a flow (f) is calculated as follows: AANF(f)=i(f)/p′; i(f) being the number of packets counted for a flow f, and p′ being the sampling rate at end of a measurement period; or (b) Adaptive Sample-and-Hold (ASH), and adjusted weight (AASH) of a flow (f) is calculated as follows: AASH(f)=i(f)+(1−p′)/p′; i(f) being the number of packets counted for a flow f, and p′ being the sampling rate at end of a measurement period.
摘要翻译: 本发明涉及用于在未分组的分组流上获得摘要的用于提供用于特征的无偏估计器的流式传输算法,例如属于指定的流量子群的业务量。 分组从分组流中采样并聚合成流,并通过实现计算:(a)自适应采样NetFlow(ANF)和流(f)的调整权重(AANF)计算如下:AANF(f)= i (f)/ p'; i(f)是流f计数的分组数,p'是测量周期结束时的采样率; 或(b)自适应采样保持(ASH)和流(f)的调整权重(AASH)如下计算:AASH(f)= i(f)+(1-p')/ p' ; i(f)是流f计数的分组数,p'是测量周期结束时的采样率。
-
公开(公告)号:US07852785B2
公开(公告)日:2010-12-14
申请号:US12119939
申请日:2008-05-13
申请人: Carsten Lund , Edith Cohen , Nicholas Duffield , Alexandre Gerber , Adam Hersh , Oliver Spatscheck , Mikkel Thorup , Frederick True
发明人: Carsten Lund , Edith Cohen , Nicholas Duffield , Alexandre Gerber , Adam Hersh , Oliver Spatscheck , Mikkel Thorup , Frederick True
IPC分类号: G01R31/08
CPC分类号: H04L43/026 , H04L43/022
摘要: The preferred embodiments of the present invention can include sampling packets transmitted over a network based on the content of the packets. If a packet is sampled, the sampling unit can add one or more fields to the sampled packet that can include a field for a number of bytes contained in the packet, a packet count, a flow count, a sampling type, and the like. The sampled packets can be analyzed to discern desired information from the packets. The additional fields that are added to the sampled packets can be used during the analysis.
摘要翻译: 本发明的优选实施例可以包括基于分组的内容对通过网络传输的分组进行抽样。 如果分组被采样,则采样单元可以将一个或多个字段添加到采样分组,该分组可以包括分组中包含的多个字节的字段,分组计数,流量计数,采样类型等。 可以分析采样的分组以从分组中辨别所需的信息。 在分析过程中可以使用添加到采样数据包的附加字段。
-
公开(公告)号:US20090316590A1
公开(公告)日:2009-12-24
申请号:US12119939
申请日:2008-05-13
申请人: Carsten Lund , Edith Cohen , Nicholas Duffield , Alexandre Gerber , Adam Hersh , Oliver Spatscheck , Mikkel Thorup , Frederick True
发明人: Carsten Lund , Edith Cohen , Nicholas Duffield , Alexandre Gerber , Adam Hersh , Oliver Spatscheck , Mikkel Thorup , Frederick True
IPC分类号: H04L12/26
CPC分类号: H04L43/026 , H04L43/022
摘要: The preferred embodiments of the present invention can include sampling packets transmitted over a network based on the content of the packets. If a packet is sampled, the sampling unit can add one or more fields to the sampled packet that can include a field for a number of bytes contained in the packet, a packet count, a flow count, a sampling type, and the like. The sampled packets can be analyzed to discern desired information from the packets. The additional fields that are added to the sampled packets can be used during the analysis.
摘要翻译: 本发明的优选实施例可以包括基于分组的内容对通过网络传输的分组进行抽样。 如果分组被采样,则采样单元可以将一个或多个字段添加到采样分组,该分组可以包括分组中包含的多个字节的字段,分组计数,流量计数,采样类型等。 可以分析采样的分组以从分组中辨别所需的信息。 在分析过程中可以使用添加到采样数据包的附加字段。
-
-
-
-
-
-
-
-
-