Method and system to estimate the cardinality of sets and set operation results from single and multiple HyperLogLog sketches

Invention Grant

US11561954B2 Method and system to estimate the cardinality of sets and set operation results from single and multiple HyperLogLog sketches (in force)

Patent Title: Method and system to estimate the cardinality of sets and set operation results from single and multiple HyperLogLog sketches
Application No.: US17358170

Application Date: 2021-06-25
Publication No.: US11561954B2

Publication Date: 2023-01-24
Inventor: Otmar Ertl
Applicant: Dynatrace LLC
Applicant Address: US MA Waltham
Assignee: Dynatrace LLC
Current Assignee: Dynatrace LLC
Current Assignee Address: US MA Waltham
Agency: Harness, Dickey & Pierce, P.L.C.
Main IPC: G06F16/22
IPC: G06F16/22 ; G06N7/00 ; G06F16/23 ; G06F17/18

Abstract:

A system and method for estimating the cardinality of large sets of transaction trace data is disclosed. The estimation is based on HyperLogLog data sketches, which can store cardinality-relevant data of large sets with low, fixed memory requirements. The disclosure contains improvements to the known analysis methods for HyperLogLog data sketches that improve relative error behavior by eliminating a cardinality-range-dependent bias of the relative error. A new analysis method for HyperLogLog data structures is shown that applies maximum likelihood analysis to a Poisson-based approximate probability model. In addition, a variant of the new analysis model is disclosed that uses multiple HyperLogLog data structures to provide estimation results for set operations such as intersections or relative complements directly from the HyperLogLog input data.
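The idea in the abstract can be illustrated with a minimal sketch. It is not the patented procedure: it assumes a 64-bit hash, 2^10 registers, and the commonly used Poisson approximation in which a register holds a value of at most k with probability exp(-λ/(m·2^k)). The helper names (`new_sketch`, `add`, `merge`, `estimate_ml`), the choice of blake2b as the hash, and the use of plain bisection to maximize the likelihood are all assumptions made for illustration.

```python
import hashlib
import math

P = 10          # assumed precision: M = 2**P registers
M = 1 << P
Q = 64 - P      # hash bits left after the register index


def new_sketch():
    """Return an empty sketch: M registers, all zero."""
    return [0] * M


def add(sketch, item):
    """Record one string item (hypothetical helper; blake2b is an arbitrary hash choice)."""
    h = int.from_bytes(hashlib.blake2b(item.encode(), digest_size=8).digest(), "big")
    idx = h >> Q                      # top P bits select the register
    w = h & ((1 << Q) - 1)            # remaining Q bits
    rank = Q - w.bit_length() + 1     # leading zeros in w plus one; Q + 1 if w == 0
    sketch[idx] = max(sketch[idx], rank)


def merge(a, b):
    """Sketch of the union of two sets: element-wise register maximum."""
    return [max(x, y) for x, y in zip(a, b)]


def estimate_ml(sketch):
    """Maximum likelihood cardinality estimate under the assumed Poisson model."""
    counts = [0] * (Q + 2)            # counts[k] = number of registers equal to k
    for r in sketch:
        counts[r] += 1
    if counts[0] == M:
        return 0.0                    # empty sketch
    if counts[Q + 1] == M:
        return math.inf               # fully saturated sketch

    linear = sum(counts[k] / 2.0 ** k for k in range(Q + 1))

    def score(x):
        # Derivative of the log-likelihood with respect to x = lambda / M.
        # It is strictly decreasing in x, so its root is the unique maximum.
        s = -linear
        for k in range(1, Q + 2):
            if counts[k]:
                scale = 2.0 ** min(k, Q)
                y = x / scale
                s += counts[k] / scale * math.exp(-y) / (-math.expm1(-y))
        return s

    lo, hi = 0.0, 1.0
    while score(hi) > 0.0:            # grow the bracket until the root is inside
        hi *= 2.0
    for _ in range(100):              # plain bisection on the score function
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return M * 0.5 * (lo + hi)


if __name__ == "__main__":
    s = new_sketch()
    for i in range(5000):
        add(s, f"item-{i}")
    print(round(estimate_ml(s)))
```

Because the same likelihood covers every register value, this estimator needs no separate small-range or large-range correction. The sketch covers only a single HyperLogLog structure and unions via `merge`; the multi-sketch estimation of intersections and relative complements mentioned in the abstract is not shown.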

Published/granted documents

US20210319006A1 Method And System To Estimate The Cardinality Of Sets And Set Operation Results From Single And Multiple HyperLogLog Sketches Publication date: 2021-10-14

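IPC classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: G06N)
G06F16/00	Information retrieval; database structures; file system structures
G06F16/20	. Structured data, e.g. relational data
G06F16/22	.. Indexing; data structures; storage structures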