Invention Grant
US08819038B1 System and method for performing set operations with defined sketch accuracy distribution
In force
- Patent Title: System and method for performing set operations with defined sketch accuracy distribution
- Application No.: US14078301
- Application Date: 2013-11-12
- Publication No.: US08819038B1
- Publication Date: 2014-08-26
- Inventor: Lee Rhodes, Anirban Dasgupta, Kevin J. Lang
- Applicant: Yahoo! Inc.
- Applicant Address: US CA Sunnyvale
- Assignee: Yahoo! Inc.
- Current Assignee: Yahoo! Inc.
- Current Assignee Address: US CA Sunnyvale
- Agency: Hickman Palermo Truong Becker Bingham Wong LLP
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
Techniques are provided for improving the speed and accuracy of analytics on big data using theta sketches, by converting fixed-size sketches to theta sketches, and by performing set operations on sketches. In a technique for performing a set operation, two sketches are analyzed to identify the maximum value of each sketch. The maximum values of the two sketches are compared. Based on the comparison, one or more values are removed from the sketch whose maximum value is greater. After the removal, a set operation (e.g., union, intersection, or difference) is performed on the modified sketch and the unmodified sketch. The result of the set operation is a third sketch, which may be used to estimate the cardinality of the larger data sets that are represented by the two input sketches.
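
The set-operation technique summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made for illustration only: a sketch is modeled as a non-empty set of hash values in [0, 1), "removal" means dropping every value above the smaller of the two maxima, and the names `make_sketch`, `set_operation`, and `estimate` as well as the estimator `len(result) / theta` are hypothetical and do not come from the patent text.

```python
import hashlib


def make_sketch(items, k):
    """Assumed sketch model: hash each item to [0, 1), keep the k smallest values."""
    hashes = {
        int.from_bytes(hashlib.sha256(str(x).encode()).digest()[:8], "big") / 2**64
        for x in items
    }
    return set(sorted(hashes)[:k])


def set_operation(sketch_a, sketch_b, op):
    """Trim the sketch with the greater maximum, then apply a set operation.

    Both sketches are assumed to be non-empty sets of floats.
    Returns (result_sketch, theta), where theta is the smaller maximum.
    """
    max_a, max_b = max(sketch_a), max(sketch_b)
    theta = min(max_a, max_b)

    # Only the sketch whose maximum is greater is modified.
    if max_a > max_b:
        sketch_a = {v for v in sketch_a if v <= theta}
    elif max_b > max_a:
        sketch_b = {v for v in sketch_b if v <= theta}

    if op == "union":
        result = sketch_a | sketch_b
    elif op == "intersection":
        result = sketch_a & sketch_b
    elif op == "difference":
        result = sketch_a - sketch_b
    else:
        raise ValueError(f"unknown set operation: {op}")
    return result, theta


def estimate(result, theta):
    """Assumed simple estimator: retained count scaled by the sampling threshold."""
    return len(result) / theta
```

As a usage example under the same assumptions, `set_operation({0.1, 0.2, 0.3, 0.5}, {0.2, 0.3, 0.4}, "intersection")` drops 0.5 from the first sketch, because its maximum is greater, and returns `({0.2, 0.3}, 0.4)`. Trimming first keeps both inputs sampled at the same threshold, so the third sketch covers a consistent slice of the hash space.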