Single pass space efficent system and method for generating approximate
quantiles satisfying an apriori user-defined approximation error
    1.
    发明授权
    Single pass space efficent system and method for generating approximate quantiles satisfying an apriori user-defined approximation error 失效
    单通道空间效率系统和方法,用于生成满足先验用户定义的近似误差的近似分位数

    公开(公告)号:US6108658A

    公开(公告)日:2000-08-22

    申请号:US50434

    申请日:1998-03-30

    IPC分类号: G06F7/22 G06F17/30

    摘要: A system and method for finding an .epsilon.-approximate .phi.-quantile data element of a data set with N data elements in a single pass over the data set. The .epsilon.-approximate .phi.-quantile data element is guaranteed to lie within a user-specified approximation error .epsilon. of a true .phi.-quantile data element being sought. B buffers, each having a capacity of k elements, initially are filled with sorted data elements from the data set, with the values of b and k depending on .epsilon. and N. The buffers are then collapsed into an output buffer, with the remaining buffers then being refilled with data elements, collapsed (along with the previous output buffer), and so on until the entire data set has been processed and a single output buffer remains. A data element of the output buffer corresponding to the .epsilon.-approximate .phi.-quantile is then output as the approximate .phi.-quantile data element. If desired, the system and method can be practiced with sampling to even further reduce the amount of space required to find a desired .epsilon.-approximate .phi.-quantile data element.

    摘要翻译: 一种用于在数据集中的单次传递中找到具有N个数据元素的数据集的ε-近似phi-量子数据元素的系统和方法。 ε-近似phi - 数量数据元素被保证位于正在寻找的真实phi - 数量数据元素的用户指定的近似误差ε。 每个具有k个元素的容量的B缓冲器最初由来自数据集的排序数据元素填充,其中b和k的值取决于epsilon和N.然后,缓冲器被折叠成输出缓冲器,其余的缓冲器 然后使用数据元素重新填充数据元素,并与之前的输出缓冲区一起折叠,等等,直到整个数据集被处理完毕,并保留单个输出缓冲区。 然后,输出对应于ε-近似phi - 数量的输出缓冲器的数据元素作为近似phi - 数量数据元素。 如果需要,可以采用系统和方法来实施,以进一步减少找到所需的ε-近似phi - 数量数据元素所需的空间量。