Fast and accurate rule selection for interpretable decision sets

    Publication No.: US11704591B2

    Publication date: 2023-07-18

    Application No.: US16353076

    Filing date: 2019-03-14

    Applicant: Adobe Inc.

    Abstract: An interpretable decision set (IDS) generator determines multiple classes for electronic data items. The IDS generator determines, for each class, a class-specific candidate ruleset. The IDS generator performs a differential analysis of each class-specific candidate ruleset. The differential analysis is based on differences between result values of a scoring objective function. In some cases, the differential analysis determines at least one of the differences based on additional data structures, such as an augmented frequent-pattern tree. A probability function based on the differences is compared to a threshold probability. At least one testing ruleset is modified based on the comparison. The IDS generator determines, for each class, a class-specific optimized ruleset based on the differential analysis of each class-specific candidate ruleset. The IDS generator creates an optimized interpretable decision set based on combined class-specific optimized rulesets for the multiple classes.

    Identifying high value segments in categorical data

    Publication No.: US10929438B2

    Publication date: 2021-02-23

    Application No.: US16008601

    Filing date: 2018-06-14

    Applicant: Adobe Inc.

    Abstract: Systems and techniques for identifying segments in categorical data include receiving multiple transaction ID (TID) lists with univariate values that satisfy a thresholding metric with each TID list representing an occurrence of a single attribute in a set of transactions. The TID lists are stored with the univariate values that satisfy the thresholding metric in a data structure. In a loop, candidate itemsets to form from combinations of TID lists are determined using only the combinations of TID lists that satisfy categorical constraints. In the loop, for the candidate itemsets that satisfy categorical constraints, both the thresholding metric and a similarity metric are applied to the candidate itemsets. Final itemsets are formed from only the candidate itemsets that satisfy both the thresholding metric and the similarity metric.
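
The loop described in this abstract resembles level-wise itemset mining over TID lists. The following is a minimal sketch, not the patented method: it assumes the thresholding metric is a minimum support count, the categorical constraint is "at most one value per attribute", and the similarity metric is a Jaccard cap between a candidate's TID set and each parent's TID set. The names `mine_segments`, `min_support`, and `max_similarity` are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets of transaction IDs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mine_segments(tid_lists, min_support, max_similarity):
    """Level-wise itemset mining over TID lists (illustrative assumptions only).

    tid_lists maps an (attribute, value) pair to the set of transaction IDs
    in which that single attribute value occurs.
    """
    # Keep only univariate values that satisfy the thresholding metric.
    level = {
        frozenset([item]): tids
        for item, tids in tid_lists.items()
        if len(tids) >= min_support
    }
    final = dict(level)
    while level:
        next_level = {}
        for (a, tids_a), (b, tids_b) in combinations(level.items(), 2):
            candidate = a | b
            # Only join itemsets that differ in exactly one item.
            if len(candidate) != len(a) + 1 or candidate in next_level:
                continue
            # Assumed categorical constraint: one value per attribute.
            attributes = [attr for attr, _ in candidate]
            if len(set(attributes)) != len(attributes):
                continue
            tids = tids_a & tids_b  # TID-list intersection
            # Thresholding metric (assumed: minimum support count).
            if len(tids) < min_support:
                continue
            # Assumed similarity metric: drop candidates that nearly
            # duplicate a parent itemset.
            if max(jaccard(tids, tids_a), jaccard(tids, tids_b)) > max_similarity:
                continue
            next_level[candidate] = tids
        final.update(next_level)
        level = next_level
    return final
```

Under these assumptions, combinations that violate the categorical constraint are rejected before any TID-list intersection is computed, which is where the abstract places the constraint check.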

    IDENTIFYING HIGH VALUE SEGMENTS IN CATEGORICAL DATA

    Publication No.: US20190384853A1

    Publication date: 2019-12-19

    Application No.: US16008601

    Filing date: 2018-06-14

    Applicant: Adobe Inc.

    Abstract: Systems and techniques for identifying segments in categorical data include receiving multiple transaction ID (TID) lists with univariate values that satisfy a thresholding metric with each TID list representing an occurrence of a single attribute in a set of transactions. The TID lists are stored with the univariate values that satisfy the thresholding metric in a data structure. In a loop, candidate itemsets to form from combinations of TID lists are determined using only the combinations of TID lists that satisfy categorical constraints. In the loop, for the candidate itemsets that satisfy categorical constraints, both the thresholding metric and a similarity metric are applied to the candidate itemsets. Final itemsets are formed from only the candidate itemsets that satisfy both the thresholding metric and the similarity metric.

    ACCURATE AND INTERPRETABLE RULES FOR USER SEGMENTATION

    Publication No.: US20190180193A1

    Publication date: 2019-06-13

    Application No.: US15837929

    Filing date: 2017-12-11

    Applicant: Adobe Inc.

    Abstract: Various embodiments describe user segmentation. In an example, potential rules are generated by applying a frequency-based analysis to user interaction data points. Each of the potential rules includes a set of attributes of the user interaction data points and indicates that these data points belong to a segment of interest. An objective function is used to select an optimal set of rules from the potential rules for the segment of interest. The potential rules are used as variable inputs to the objective function and this function is optimized based on interpretability and accuracy parameters. Each rule from the optimal set is associated with a group of the segment of interest. The user interaction data points are segmented into the groups by matching attributes of these data points with the rules.
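
The selection step in this abstract can be illustrated with a small greedy sketch. It is not the method claimed in the application: the objective here is an assumed stand-in (number of correctly classified points minus a penalty on the total number of attribute conditions), the candidate rules are assumed to have been mined already, and the names `covers`, `objective`, `select_rules`, `assign_groups`, and `penalty` are hypothetical.

```python
def covers(rule, point):
    """A rule is a dict of attribute -> value; it covers a point that matches all of them."""
    return all(point.get(attr) == value for attr, value in rule.items())


def objective(rules, points, labels, penalty):
    """Assumed objective: accuracy term minus an interpretability penalty."""
    # Accuracy: a point is predicted "in segment" if any selected rule covers it.
    correct = sum(
        any(covers(r, p) for r in rules) == y for p, y in zip(points, labels)
    )
    # Interpretability: fewer and shorter rules score better.
    complexity = sum(len(r) for r in rules)
    return correct - penalty * complexity


def select_rules(candidates, points, labels, penalty=0.5):
    """Greedily add the candidate that most improves the objective, until none does."""
    selected = []
    best = objective(selected, points, labels, penalty)
    while True:
        pick = None
        for rule in candidates:
            if rule in selected:
                continue
            score = objective(selected + [rule], points, labels, penalty)
            if score > best:
                best, pick = score, rule
        if pick is None:
            return selected
        selected.append(pick)


def assign_groups(rules, points):
    """Segment points into groups: index of the first matching rule, or None."""
    return [
        next((i for i, r in enumerate(rules) if covers(r, p)), None)
        for p in points
    ]
```

A greedy search is only one way to optimize such an objective; it is used here because it keeps the trade-off between the accuracy term and the interpretability penalty easy to follow.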
