System and method for generating and using a dynamic bloom filter
    1.
    Granted patent
    System and method for generating and using a dynamic bloom filter (status: lapsed)

    Publication No.: US07937428B2

    Publication date: 2011-05-03

    Application No.: US11614844

    Filing date: 2006-12-21

    IPC classification: G06F17/10

    CPC classification: G06F12/0864

    Abstract: A dynamic Bloom filter comprises a cascaded set of Bloom filters. The system estimates or guesses a cardinality of input items, selects a number of hash functions based on the desired false positive rate, and allocates memory for an initial Bloom filter based on the estimated cardinality and desired false positive rate. The system inserts items into the initial Bloom filter and counts the bits set as they are inserted. If the number of bits set in the current Bloom filter reaches a predetermined target, the system declares the current Bloom filter full. The system recursively generates additional Bloom filters as needed for items remaining after the initial Bloom filter is filled; items are checked to eliminate duplicates. Each of the set of Bloom filters is individually queried to identify a positive or negative in response to a query. When the system is configured such that the false positive rate of each successive Bloom filter is decreased by one half, the system guarantees a false positive rate of at most twice the desired false positive rate.
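The cascade described in this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names `BloomFilter` and `DynamicBloomFilter`, the SHA-256 double-hashing scheme, the textbook sizing formulas, and the choice of "half the bits set" as the predetermined fullness target are all assumptions that the abstract does not specify.

```python
import hashlib
import math


class BloomFilter:
    """Fixed-size Bloom filter that counts how many bits are set."""

    def __init__(self, capacity, fp_rate):
        # Textbook sizing formulas (assumption; the abstract gives none).
        self.k = max(1, math.ceil(math.log2(1.0 / fp_rate)))
        self.m = max(8, math.ceil(-capacity * math.log(fp_rate) / math.log(2) ** 2))
        self.bits = bytearray((self.m + 7) // 8)
        self.bits_set = 0
        # Assumed "predetermined target": half of the bits set.
        self.target = self.m // 2

    def _positions(self, item):
        # Double hashing over one SHA-256 digest (illustrative choice).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            byte, mask = pos >> 3, 1 << (pos & 7)
            if not self.bits[byte] & mask:
                self.bits[byte] |= mask
                self.bits_set += 1  # count bits as they are set

    def __contains__(self, item):
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(item))

    @property
    def full(self):
        return self.bits_set >= self.target


class DynamicBloomFilter:
    """Cascade of Bloom filters; each new filter halves the false positive rate."""

    def __init__(self, estimated_cardinality, fp_rate):
        self.capacity = estimated_cardinality
        self.filters = [BloomFilter(estimated_cardinality, fp_rate)]
        self.next_fp_rate = fp_rate / 2

    def add(self, item):
        if item in self:  # duplicate check against every filter in the cascade
            return
        current = self.filters[-1]
        if current.full:
            # Current filter reached its target: start a new, stricter filter.
            current = BloomFilter(self.capacity, self.next_fp_rate)
            self.next_fp_rate /= 2
            self.filters.append(current)
        current.add(item)

    def __contains__(self, item):
        # Each filter is queried individually; any positive is a positive.
        return any(item in f for f in self.filters)
```

For example, `DynamicBloomFilter(100, 0.01)` loaded with far more than 100 distinct strings grows to several filters while every inserted item still tests positive. An item that is skipped because the duplicate check returned a false positive does not cause a false negative, because a later query hits the same bits.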

    SYSTEM AND METHOD FOR GENERATING AND USING A DYNAMIC BLOOM FILTER
    2.
    Patent application
    SYSTEM AND METHOD FOR GENERATING AND USING A DYNAMIC BLOOM FILTER (status: lapsed)

    Publication No.: US20080154852A1

    Publication date: 2008-06-26

    Application No.: US11614844

    Filing date: 2006-12-21

    IPC classification: G06F12/02 G06F17/30

    CPC classification: G06F12/0864

    Abstract: A dynamic Bloom filter comprises a cascaded set of Bloom filters. The system estimates or guesses a cardinality of input items, selects a number of hash functions based on the desired false positive rate, and allocates memory for an initial Bloom filter based on the estimated cardinality and desired false positive rate. The system inserts items into the initial Bloom filter and counts the bits set as they are inserted. If the number of bits set in the current Bloom filter reaches a predetermined target, the system declares the current Bloom filter full. The system recursively generates additional Bloom filters as needed for items remaining after the initial Bloom filter is filled; items are checked to eliminate duplicates. Each of the set of Bloom filters is individually queried to identify a positive or negative in response to a query. When the system is configured such that the false positive rate of each successive Bloom filter is decreased by one half, the system guarantees a false positive rate of at most twice the desired false positive rate.
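The abstract states that the number of hash functions follows from the desired false positive rate, and that memory follows from the estimated cardinality and that rate, but it gives no formulas. The standard Bloom filter analysis is one plausible reading (an assumption, not necessarily the formulas used in the patent). Here $n$ is the estimated cardinality, $m$ the number of bits, $k$ the number of hash functions, and $p$ the desired false positive rate:

```latex
\begin{align*}
  p &\approx \left(1 - e^{-kn/m}\right)^{k}
      && \text{false positive rate after inserting } n \text{ items} \\
  k &= \frac{m}{n}\ln 2 = \log_2\frac{1}{p}
      && \text{number of hash functions that minimizes } p \\
  m &= -\frac{n \ln p}{(\ln 2)^{2}}
      && \text{bits needed for } n \text{ items at rate } p
\end{align*}
```

With $n = 1000$ and $p = 0.01$ this gives $m \approx 9585$ bits (about 1.2 KB) and $k \approx 6.64$, rounded up to 7 hash functions. At this optimum roughly half of the bits are set once $n$ items have been inserted, which makes "half the bits set" a natural candidate for the predetermined fullness target; the abstract itself does not state the target.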

    Generating and using a dynamic bloom filter
    3.
    Granted patent
    Generating and using a dynamic bloom filter (status: lapsed)

    Publication No.: US08209368B2

    Publication date: 2012-06-26

    Application No.: US12134148

    Filing date: 2008-06-05

    IPC classification: G06F17/10

    CPC classification: G06F12/0864

    Abstract: A dynamic Bloom filter comprises a cascaded set of Bloom filters. The system estimates or guesses a cardinality of input items, selects a number of hash functions based on the desired false positive rate, and allocates memory for an initial Bloom filter based on the estimated cardinality and desired false positive rate. The system inserts items into the initial Bloom filter and counts the bits set as they are inserted. If the number of bits set in the current Bloom filter reaches a predetermined target, the system declares the current Bloom filter full. The system recursively generates additional Bloom filters as needed for items remaining after the initial Bloom filter is filled; items are checked to eliminate duplicates. Each of the set of Bloom filters is individually queried to identify a positive or negative in response to a query. When the system is configured such that the false positive rate of each successive Bloom filter is decreased by one half, the system guarantees a false positive rate of at most twice the desired false positive rate.
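The closing guarantee in this abstract can be checked with a short union-bound argument. The sketch assumes that the initial filter is built at the desired rate $p$ and that filter $i$ (counting from $i = 0$) has rate $p_i = p / 2^{i}$; the abstract does not spell out this indexing. A query for an item that was never inserted returns a positive only if at least one filter in the cascade returns a positive, so:

```latex
\begin{align*}
  \Pr[\text{false positive}]
    &\le \sum_{i=0}^{t-1} p_i
     = p \sum_{i=0}^{t-1} 2^{-i}
     = p \left(2 - 2^{-(t-1)}\right)
     < 2p
\end{align*}
```

The bound holds for any number of filters $t$, which is why the cascade can keep growing without the overall rate exceeding twice the desired rate.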

    System and method for generating a cache-aware bloom filter
    4.
    Granted patent
    System and method for generating a cache-aware bloom filter (status: lapsed)

    Publication No.: US08032732B2

    Publication date: 2011-10-04

    Application No.: US12134125

    Filing date: 2008-06-05

    IPC classification: G06F12/00

    CPC classification: G06F17/10

    Abstract: A cache-aware Bloom filter system segments a bit vector of a cache-aware Bloom filter into fixed-size blocks. The system hashes an item to be inserted into the cache-aware Bloom filter to identify one of the fixed-size blocks as a selected block for receiving the item and hashes the item k times to generate k hashed values for encoding the item for insertion in the selected block. The system sets bits within the selected block with addresses corresponding to the k hashed values such that accessing the item in the cache-aware Bloom filter requires accessing only the selected block to check the k hashed values. The size of the fixed-size block corresponds to a cache-line size of an associated computer architecture on which the cache-aware Bloom filter is installed.
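The block-selection scheme in this abstract can be sketched in Python as follows. It is an illustration of the addressing logic only, under assumptions the abstract does not fix: the class name `BlockedBloomFilter`, the 512-bit (64-byte) block size as a stand-in for a cache line, and the SHA-256 double-hashing scheme are hypothetical choices. A Python `bytearray` also gives no real cache-line alignment, so the performance benefit would only show up in a lower-level implementation.

```python
import hashlib


class BlockedBloomFilter:
    """Bloom filter whose bit vector is split into fixed-size blocks."""

    def __init__(self, num_blocks, k, block_bits=512):
        # 512 bits = 64 bytes, a common cache-line size (assumption).
        self.num_blocks = num_blocks
        self.k = k
        self.block_bits = block_bits
        self.block_bytes = block_bits // 8
        self.bits = bytearray(num_blocks * self.block_bytes)

    def _locate(self, item):
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        # First hash selects the block that receives the item.
        block = int.from_bytes(digest[:8], "big") % self.num_blocks
        # k further hashed values address bits inside that block only.
        h1 = int.from_bytes(digest[8:16], "big")
        h2 = int.from_bytes(digest[16:24], "big") | 1
        offsets = [(h1 + i * h2) % self.block_bits for i in range(self.k)]
        return block, offsets

    def add(self, item):
        block, offsets = self._locate(item)
        base = block * self.block_bytes
        for off in offsets:
            self.bits[base + (off >> 3)] |= 1 << (off & 7)

    def __contains__(self, item):
        # A lookup touches only the selected block.
        block, offsets = self._locate(item)
        base = block * self.block_bytes
        return all(self.bits[base + (off >> 3)] & (1 << (off & 7)) for off in offsets)
```

After `bf = BlockedBloomFilter(num_blocks=16, k=4)` and `bf.add("alpha")`, every nonzero byte of `bf.bits` lies inside the single block returned by `bf._locate("alpha")`, which is the property that lets a membership test be served from one cache line.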

    SYSTEM AND METHOD FOR GENERATING A CACHE-AWARE BLOOM FILTER
    5.
    Patent application
    SYSTEM AND METHOD FOR GENERATING A CACHE-AWARE BLOOM FILTER (status: under examination, published)

    Publication No.: US20080155229A1

    Publication date: 2008-06-26

    Application No.: US11614790

    Filing date: 2006-12-21

    IPC classification: G06F9/34

    CPC classification: G06F17/10

    Abstract: A cache-aware Bloom filter system segments a bit vector of a cache-aware Bloom filter into fixed-size blocks. The system hashes an item to be inserted into the cache-aware Bloom filter to identify one of the fixed-size blocks as a selected block for receiving the item and hashes the item k times to generate k hashed values for encoding the item for insertion in the selected block. The system sets bits within the selected block with addresses corresponding to the k hashed values such that accessing the item in the cache-aware Bloom filter requires accessing only the selected block to check the k hashed values. The size of the fixed-size block corresponds to a cache-line size of an associated computer architecture on which the cache-aware Bloom filter is installed.

    Adaptive evaluation of text search queries with blackbox scoring functions
    6.
    Granted patent
    Adaptive evaluation of text search queries with blackbox scoring functions (status: lapsed)

    Publication No.: US07991771B2

    Publication date: 2011-08-02

    Application No.: US11561949

    Filing date: 2006-11-21

    IPC classification: G06F17/30

    CPC classification: G06F17/30672

    Abstract: Disclosed is an evaluation technique for text search with black-box scoring functions, where it is unnecessary for the evaluation engine to maintain details of the scoring function. Included is a description of a system for dealing with blackbox searching, proofs of correctness, as well as experimental evidence showing that the performance of the technique is comparable in efficiency to those techniques used in custom-built engines.

    SYSTEM AND METHOD FOR GENERATING A CACHE-AWARE BLOOM FILTER

    Publication No.: US20080243941A1

    Publication date: 2008-10-02

    Application No.: US12134125

    Filing date: 2008-06-05

    IPC classification: G06F12/00

    CPC classification: G06F17/10

    Abstract: A cache-aware Bloom filter system segments a bit vector of a cache-aware Bloom filter into fixed-size blocks. The system hashes an item to be inserted into the cache-aware Bloom filter to identify one of the fixed-size blocks as a selected block for receiving the item and hashes the item k times to generate k hashed values for encoding the item for insertion in the selected block. The system sets bits within the selected block with addresses corresponding to the k hashed values such that accessing the item in the cache-aware Bloom filter requires accessing only the selected block to check the k hashed values. The size of the fixed-size block corresponds to a cache-line size of an associated computer architecture on which the cache-aware Bloom filter is installed.

    Adaptive Evaluation of Text Search Queries With Blackbox Scoring Functions
    9.
    Patent application
    Adaptive Evaluation of Text Search Queries With Blackbox Scoring Functions (status: lapsed)

    Publication No.: US20070150467A1

    Publication date: 2007-06-28

    Application No.: US11561949

    Filing date: 2006-11-21

    IPC classification: G06F17/30

    CPC classification: G06F17/30672

    Abstract: Disclosed is an evaluation technique for text search with black-box scoring functions, where it is unnecessary for the evaluation engine to maintain details of the scoring function. Included is a description of a system for dealing with blackbox searching, proofs of correctness, as well as experimental evidence showing that the performance of the technique is comparable in efficiency to those techniques used in custom-built engines.
