Methods and apparatus for representing probabilistic data using a probabilistic histogram
    1.
    Granted invention patent
    Methods and apparatus for representing probabilistic data using a probabilistic histogram [Status: Lapsed]

    Publication No.: US08145669B2

    Publication date: 2012-03-27

    Application No.: US12636544

    Filing date: 2009-12-11

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30536

    Abstract: Methods and apparatus for representing probabilistic data using a probabilistic histogram are disclosed. An example method comprises partitioning a plurality of ordered data items into a plurality of buckets, each of the data items capable of having a data value from a plurality of possible data values with a probability characterized by a respective individual probability distribution function (PDF), each bucket associated with a respective subset of the ordered data items bounded by a respective beginning data item and a respective ending data item, and determining a first representative PDF for a first bucket associated with a first subset of the ordered data items by partitioning the plurality of possible data values into a first plurality of representative data ranges and respective representative probabilities based on an error between the first representative PDF and a first plurality of individual PDFs characterizing the first subset of the ordered data items.
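The abstract's second step, fitting one representative PDF to all the individual PDFs in a bucket, can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: the error is taken to be the sum of squared differences, the representative PDF is piecewise constant over contiguous value ranges, and the bucket's items are already chosen. The function name `representative_pdf` and its parameters are hypothetical.

```python
from itertools import accumulate

def representative_pdf(pdfs, n_ranges):
    """Fit a piecewise-constant PDF to one bucket's individual PDFs.

    pdfs: list of per-item PDFs, each a list of probabilities over the
          same ordered set of possible values.
    n_ranges: number of representative value ranges (<= number of values).
    Returns ([(first_value_idx, last_value_idx, probability), ...], error),
    where error is the total squared error (an assumed metric).
    """
    n_items, n_values = len(pdfs), len(pdfs[0])
    # Prefix sums along the value axis, aggregated over all items.
    s1 = [0.0] + list(accumulate(sum(p[v] for p in pdfs) for v in range(n_values)))
    s2 = [0.0] + list(accumulate(sum(p[v] ** 2 for p in pdfs) for v in range(n_values)))

    def cost(a, b):
        # Squared error when values a..b-1 share one probability (their mean).
        count = n_items * (b - a)
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / count

    inf = float("inf")
    # best[t][b]: minimal error covering values 0..b-1 with t ranges.
    best = [[inf] * (n_values + 1) for _ in range(n_ranges + 1)]
    cut = [[0] * (n_values + 1) for _ in range(n_ranges + 1)]
    best[0][0] = 0.0
    for t in range(1, n_ranges + 1):
        for b in range(t, n_values + 1):
            for a in range(t - 1, b):
                candidate = best[t - 1][a] + cost(a, b)
                if candidate < best[t][b]:
                    best[t][b], cut[t][b] = candidate, a

    # Walk the cut points backwards to recover the ranges.
    ranges, b = [], n_values
    for t in range(n_ranges, 0, -1):
        a = cut[t][b]
        mean = (s1[b] - s1[a]) / (n_items * (b - a))
        ranges.append((a, b - 1, mean))
        b = a
    ranges.reverse()
    return ranges, best[n_ranges][n_values]

# One bucket holding two items, each with a PDF over four possible values.
bucket = [[0.5, 0.5, 0.0, 0.0], [0.4, 0.6, 0.0, 0.0]]
ranges, error = representative_pdf(bucket, n_ranges=2)
```

Under these assumptions the dynamic program tries every way of cutting the value domain into `n_ranges` contiguous ranges and keeps the cheapest one. Partitioning the ordered items into buckets, the abstract's first step, would need a similar search over item boundaries and is not shown.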


    METHODS AND APPARATUS FOR REPRESENTING PROBABILISTIC DATA USING A PROBABILISTIC HISTOGRAM
    2.
    Invention patent application
    METHODS AND APPARATUS FOR REPRESENTING PROBABILISTIC DATA USING A PROBABILISTIC HISTOGRAM [Status: Lapsed]

    Publication No.: US20110145223A1

    Publication date: 2011-06-16

    Application No.: US12636544

    Filing date: 2009-12-11

    IPC classification: G06F17/30

    CPC classification: G06F17/30536

    Abstract: Methods and apparatus for representing probabilistic data using a probabilistic histogram are disclosed. An example method comprises partitioning a plurality of ordered data items into a plurality of buckets, each of the data items capable of having a data value from a plurality of possible data values with a probability characterized by a respective individual probability distribution function (PDF), each bucket associated with a respective subset of the ordered data items bounded by a respective beginning data item and a respective ending data item, and determining a first representative PDF for a first bucket associated with a first subset of the ordered data items by partitioning the plurality of possible data values into a first plurality of representative data ranges and respective representative probabilities based on an error between the first representative PDF and a first plurality of individual PDFs characterizing the first subset of the ordered data items.


    Probabilistic wavelet synopses for multiple measures
    3.
    Invention patent application
    Probabilistic wavelet synopses for multiple measures [Status: Under examination (published)]

    Publication No.: US20070058871A1

    Publication date: 2007-03-15

    Application No.: US11225539

    Filing date: 2005-09-13

    IPC classification: G06K9/62 G06F17/30

    CPC classification: G06F16/2462 G06F16/283

    Abstract: A technique for building probabilistic wavelet synopses for multi-measure data sets is provided. In the presence of multiple measures, it is demonstrated that the problem of exact probabilistic coefficient thresholding becomes significantly more complex. An algorithmic formulation for probabilistic multi-measure wavelet thresholding based on the idea of partial-order dynamic programming (PODP) is provided. A fast, greedy approximation algorithm for probabilistic multi-measure thresholding based on the idea of marginal error gains is provided. An empirical study with both synthetic and real-life data sets validated the approach, demonstrating that the algorithms outperform naive approaches based on optimizing individual measures independently, and that the greedy thresholding scheme provides near-optimal and, at the same time, fast and scalable solutions to the probabilistic wavelet synopsis construction problem.
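As background for the term "probabilistic coefficient thresholding", the following minimal sketch shows the generic randomized-rounding idea for a single measure: each Haar coefficient is either kept, scaled up by the inverse of its retention probability, or dropped, so that its expected value is unchanged. This is not the PODP or greedy multi-measure algorithm the abstract describes. The function names and the retention probabilities are hypothetical, and the input length is assumed to be a power of two.

```python
import random

def haar_decompose(data):
    """Unnormalized Haar decomposition of a power-of-two-length sequence.

    Returns [overall average, detail coefficients from coarse to fine].
    """
    coeffs = []
    level = list(data)
    while len(level) > 1:
        pairs = range(0, len(level), 2)
        averages = [(level[i] + level[i + 1]) / 2 for i in pairs]
        details = [(level[i] - level[i + 1]) / 2 for i in pairs]
        coeffs = details + coeffs  # coarser details go in front
        level = averages
    return level + coeffs

def probabilistic_threshold(coeffs, probs, rng):
    """Randomized rounding: keep coefficient c as c / y with probability y.

    With y > 0 the expected value of each output equals c. A coefficient
    with y <= 0 is always dropped (and is then no longer unbiased).
    """
    synopsis = []
    for c, y in zip(coeffs, probs):
        if c != 0 and y > 0 and rng.random() < y:
            synopsis.append(c / y)
        else:
            synopsis.append(0.0)
    return synopsis

coeffs = haar_decompose([2, 2, 0, 2, 3, 5, 4, 4])
# Hypothetical retention probabilities; choosing them well is the hard part.
probs = [1.0, 1.0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
synopsis = probabilistic_threshold(coeffs, probs, random.Random(0))
```

In a multi-measure setting, one retained coefficient position would carry values for several measures at once, which is why, per the abstract, optimizing each measure independently is the naive baseline.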
