Trait expansion techniques in binary matrix datasets

    Publication number: US11899693B2

    Publication date: 2024-02-13

    Application number: US17677323

    Filing date: 2022-02-22

    Applicant: Adobe Inc.

    CPC classification number: G06F16/285

    Abstract: A cluster generation system identifies data elements, from a first binary record, that each have a particular value and correspond to respective binary traits. A candidate description function describing the binary traits is generated, the candidate description function including a model factor that describes the data elements. Responsive to determining that a second record has additional data elements having the particular value and corresponding to the respective binary traits, the candidate description function is modified to indicate that the model factor describes the additional elements. The candidate description function is also modified to include a correction factor describing an additional binary trait excluded from the respective binary traits. Based on the modified candidate description function, the cluster generation system generates a data summary cluster, which includes a compact representation of the binary traits of the data elements and additional data elements.
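
One loose reading of the abstract above can be sketched in Python. This is an illustration under stated assumptions, not the claimed method: the names `build_cluster` and `expand`, the dictionary-based record layout, and the choice to store corrections as (record, trait) pairs are all hypothetical. The "model factor" is taken to be the set of traits set to 1 in a seed record together with the records that share them, and the "correction factor" is taken to be the extra traits that fall outside that set.

```python
def build_cluster(records, seed_id):
    """Summarize records that share every 1-valued trait of the seed record.

    records: dict mapping record id -> dict mapping trait name -> 0 or 1.
    Returns a dict with the shared traits (assumed model factor), the
    records it covers, and per-record extra traits (assumed correction factor).
    """
    seed_traits = frozenset(t for t, v in records[seed_id].items() if v == 1)
    rows = [seed_id]
    corrections = set()
    for record_id, record in records.items():
        if record_id == seed_id:
            continue
        ones = {t for t, v in record.items() if v == 1}
        if seed_traits <= ones:
            # The record has all seed traits, so the model factor covers it.
            rows.append(record_id)
            # Traits outside the seed set are kept as corrections.
            corrections |= {(record_id, t) for t in ones - seed_traits}
    return {"traits": seed_traits, "rows": rows, "corrections": corrections}


def expand(cluster):
    """Rebuild the 1-valued traits of every record covered by the cluster."""
    result = {record_id: set(cluster["traits"]) for record_id in cluster["rows"]}
    for record_id, trait in cluster["corrections"]:
        result[record_id].add(trait)
    return result


records = {
    "r1": {"a": 1, "b": 1, "c": 0},
    "r2": {"a": 1, "b": 1, "c": 1},
    "r3": {"a": 0, "b": 1, "c": 0},
}
cluster = build_cluster(records, "r1")
```

In this toy input, `r3` lacks trait `a` and is left out of the cluster, while `r2` is covered by the shared traits plus a single correction for trait `c`.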

    Generating overlap estimations between high-volume digital data sets based on multiple sketch vector similarity estimators

    Publication number: US20220138218A1

    Publication date: 2022-05-05

    Application number: US17090556

    Filing date: 2020-11-05

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that estimate the overlap between sets of data samples. In particular, in one or more embodiments, the disclosed systems utilize a sketch-based sampling routine and a flexible, accurate estimator to determine the overlap (e.g., the intersection) between sets of data samples. For example, in some implementations, the disclosed systems generate a sketch vector—such as a one permutation hashing vector—for each set of data samples. The disclosed systems further compare the sketch vectors to determine an equal bin similarity estimator, a lesser bin similarity estimator, and a greater bin similarity estimator. The disclosed systems utilize one or more of the determined similarity estimators in generating an overlap estimation for the sets of data samples.
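
The sketch-and-compare idea in the abstract above can be illustrated with a minimal Python sketch. Several things here are assumptions rather than details from the source: the function names, the SHA-256 based hash, the bin layout, and the treatment of an empty bin as infinity. The abstract does not say how the equal, lesser, and greater bin counts are combined, so this example counts all three but uses only the equal-bin count, via the standard one permutation hashing Jaccard estimate and the identity |A ∩ B| = J / (1 + J) · (|A| + |B|), which requires the set sizes to be known.

```python
import hashlib
import math


def _hash(item, universe):
    """Map an item to a stable integer in [0, universe)."""
    digest = hashlib.sha256(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % universe


def oph_sketch(items, num_bins=64, universe=1 << 20):
    """One permutation hashing: hash once, keep the minimum offset per bin.

    universe must be divisible by num_bins. Empty bins hold math.inf.
    """
    bin_width = universe // num_bins
    sketch = [math.inf] * num_bins
    for item in items:
        bin_index, offset = divmod(_hash(item, universe), bin_width)
        sketch[bin_index] = min(sketch[bin_index], offset)
    return sketch


def compare_bins(sketch_a, sketch_b):
    """Count bins where a equals b, a is lesser, a is greater, or both are empty."""
    counts = {"equal": 0, "lesser": 0, "greater": 0, "empty": 0}
    for a, b in zip(sketch_a, sketch_b):
        if a == math.inf and b == math.inf:
            counts["empty"] += 1
        elif a == b:
            counts["equal"] += 1
        elif a < b:
            counts["lesser"] += 1
        else:
            counts["greater"] += 1
    return counts


def estimate_overlap(sketch_a, sketch_b, size_a, size_b):
    """Estimate |A ∩ B| from the equal-bin count and the known set sizes."""
    counts = compare_bins(sketch_a, sketch_b)
    usable = len(sketch_a) - counts["empty"]
    if usable == 0:
        return 0.0
    jaccard = counts["equal"] / usable
    return jaccard / (1 + jaccard) * (size_a + size_b)


audience = [f"user{i}" for i in range(500)]
sketch = oph_sketch(audience)
```

Comparing a sketch with itself gives only equal bins, so the estimate recovers the full set size. Sketches of two different sets would produce a mix of equal, lesser, and greater bins.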

    Generating overlap estimations between high-volume digital data sets based on multiple sketch vector similarity estimators

    Publication number: US11720592B2

    Publication date: 2023-08-08

    Application number: US17818974

    Filing date: 2022-08-10

    Applicant: Adobe Inc.

    CPC classification number: G06F16/26 G06F16/285 G06F16/288 G06T11/206

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that estimate the overlap between sets of data samples. In particular, in one or more embodiments, the disclosed systems utilize a sketch-based sampling routine and a flexible, accurate estimator to determine the overlap (e.g., the intersection) between sets of data samples. For example, in some implementations, the disclosed systems generate a sketch vector—such as a one permutation hashing vector—for each set of data samples. The disclosed systems further compare the sketch vectors to determine an equal bin similarity estimator, a lesser bin similarity estimator, and a greater bin similarity estimator. The disclosed systems utilize one or more of the determined similarity estimators in generating an overlap estimation for the sets of data samples.

    Utilizing one hash permutation and populated-value-slot-based densification for generating audience segment trait recommendations

    Publication number: US20200314472A1

    Publication date: 2020-10-01

    Application number: US16367628

    Filing date: 2019-03-28

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to training a recommendation model to generate trait recommendations using one permutation hashing and populated-value-slot-based densification. In particular, the disclosed systems can train the recommendation model by computing sketch vectors corresponding to traits using one permutation hashing. The disclosed systems can then fill in unpopulated value slots of the sketch vectors using populated-value-slot-based densification. The disclosed systems can combine the resulting densified sketches to generate the trained recommendation model. For example, in some embodiments, the disclosed systems can combine the sketches by generating a plurality of locality sensitive hashing tables based on the sketches. In some embodiments, the disclosed systems generate a count sketch matrix based on the sketches and generate trait embeddings based on the count sketch matrix using spectral embedding. Based on the trait embeddings, the disclosed systems can utilize the recommendation model to flexibly and accurately determine the similarity between traits.
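
The densification step mentioned in the abstract above, filling unpopulated value slots of a sketch vector, can be illustrated as follows. The abstract does not describe the populated-value-slot-based scheme itself, so this sketch swaps in a generic hash-probing densification as a stand-in. The names `densify` and `_probe`, the use of `None` for an empty slot, and the probing limit are all assumptions.

```python
import hashlib


def _probe(slot_index, attempt, num_slots):
    """Pick a candidate donor slot from the slot index and attempt number only."""
    key = f"{slot_index}:{attempt}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % num_slots


def densify(sketch, max_attempts=1000):
    """Fill each empty slot (None) with the value of a populated slot.

    The probe sequence depends only on the slot index, not on the data,
    so two sketches that are empty in the same slot probe the same donors.
    """
    populated = [i for i, value in enumerate(sketch) if value is not None]
    if not populated:
        raise ValueError("cannot densify a sketch with no populated slots")
    dense = list(sketch)
    for i, value in enumerate(sketch):
        if value is not None:
            continue
        for attempt in range(max_attempts):
            donor = _probe(i, attempt, len(sketch))
            if sketch[donor] is not None:
                dense[i] = sketch[donor]
                break
        else:
            # Fallback so the function always terminates with a full vector.
            dense[i] = sketch[populated[0]]
    return dense


dense = densify([None, 5, None, 7])
```

Populated slots are left untouched, and every filled slot copies a value that was already present in the original sketch.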

    Tunable algorithmic segments
    Type: Granted invention patent

    Publication number: US10373197B2

    Publication date: 2019-08-06

    Application number: US13726308

    Filing date: 2012-12-24

    Applicant: Adobe Inc.

    Abstract: Tunable algorithmic segment techniques are described. In one or more implementations, a target audience definition is obtained that is input to initiate creation of a look-alike model. The target audience definition indicates traits associated with a baseline group of consumers who have interacted with online resources in a designated manner, such as by buying a product, visiting a website, using a service, and so forth. Tuning parameters designated for the look-alike model are ascertained and the look-alike model is built based on the target audience definition and the tuning parameters. The tuning parameters may include at least a setting selectable to control reach versus accuracy for the look-alike model. Segment data indicative of market segments generated according to the look-alike model may then be exposed for manipulation by a client. The manipulation may include selectable control over the tuning parameters to generate different look-alike groups from the segment data.
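
The reach-versus-accuracy setting described in the abstract above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the patented look-alike model: the scoring rule (trait frequency in the baseline group), the function names, and the single `accuracy` threshold are all hypothetical.

```python
from collections import Counter


def trait_weights(baseline):
    """Weight each trait by the fraction of baseline consumers who have it."""
    counts = Counter(t for traits in baseline for t in set(traits))
    return {trait: count / len(baseline) for trait, count in counts.items()}


def score(traits, weights):
    """Score a candidate in [0, 1] by the share of baseline weight it matches."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    return sum(weights.get(t, 0.0) for t in set(traits)) / total


def look_alike_segment(candidates, weights, accuracy=0.5):
    """Return candidate ids scoring at least `accuracy`.

    A higher threshold favors accuracy; a lower one favors reach.
    """
    return sorted(
        user_id
        for user_id, traits in candidates.items()
        if score(traits, weights) >= accuracy
    )


baseline = [{"buyer", "sports"}, {"buyer", "news"}]
weights = trait_weights(baseline)
candidates = {
    "u1": {"buyer", "sports"},
    "u2": {"news"},
    "u3": {"travel"},
}
narrow = look_alike_segment(candidates, weights, accuracy=0.7)
broad = look_alike_segment(candidates, weights, accuracy=0.2)
```

Lowering the threshold from 0.7 to 0.2 widens the segment from `u1` alone to `u1` and `u2`, which is the kind of trade-off the tuning parameter is meant to expose.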

    Trait Expansion Techniques in Binary Matrix Datasets

    Publication number: US20230267132A1

    Publication date: 2023-08-24

    Application number: US17677323

    Filing date: 2022-02-22

    Applicant: Adobe Inc.

    CPC classification number: G06F16/285

    Abstract: A cluster generation system identifies data elements, from a first binary record, that each have a particular value and correspond to respective binary traits. A candidate description function describing the binary traits is generated, the candidate description function including a model factor that describes the data elements. Responsive to determining that a second record has additional data elements having the particular value and corresponding to the respective binary traits, the candidate description function is modified to indicate that the model factor describes the additional elements. The candidate description function is also modified to include a correction factor describing an additional binary trait excluded from the respective binary traits. Based on the modified candidate description function, the cluster generation system generates a data summary cluster, which includes a compact representation of the binary traits of the data elements and additional data elements.

    Generating overlap estimations between high-volume digital data sets based on multiple sketch vector similarity estimators

    Publication number: US11449523B2

    Publication date: 2022-09-20

    Application number: US17090556

    Filing date: 2020-11-05

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that estimate the overlap between sets of data samples. In particular, in one or more embodiments, the disclosed systems utilize a sketch-based sampling routine and a flexible, accurate estimator to determine the overlap (e.g., the intersection) between sets of data samples. For example, in some implementations, the disclosed systems generate a sketch vector—such as a one permutation hashing vector—for each set of data samples. The disclosed systems further compare the sketch vectors to determine an equal bin similarity estimator, a lesser bin similarity estimator, and a greater bin similarity estimator. The disclosed systems utilize one or more of the determined similarity estimators in generating an overlap estimation for the sets of data samples.

    Utilizing one hash permutation and populated-value-slot-based densification for generating audience segment trait recommendations

    Publication number: US11109085B2

    Publication date: 2021-08-31

    Application number: US16367628

    Filing date: 2019-03-28

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to training a recommendation model to generate trait recommendations using one permutation hashing and populated-value-slot-based densification. In particular, the disclosed systems can train the recommendation model by computing sketch vectors corresponding to traits using one permutation hashing. The disclosed systems can then fill in unpopulated value slots of the sketch vectors using populated-value-slot-based densification. The disclosed systems can combine the resulting densified sketches to generate the trained recommendation model. For example, in some embodiments, the disclosed systems can combine the sketches by generating a plurality of locality sensitive hashing tables based on the sketches. In some embodiments, the disclosed systems generate a count sketch matrix based on the sketches and generate trait embeddings based on the count sketch matrix using spectral embedding. Based on the trait embeddings, the disclosed systems can utilize the recommendation model to flexibly and accurately determine the similarity between traits.
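
The abstract above mentions combining densified sketches into locality sensitive hashing tables. A common way to do this is banding: split each sketch into fixed-size bands and use each band as a key in its own table. The sketch below assumes that approach; the source does not confirm it, and the function names and the `band_size` parameter are hypothetical.

```python
from collections import defaultdict


def build_lsh_tables(sketches, band_size):
    """Build one hash table per band; traits sharing a band land in one bucket.

    sketches: dict mapping trait name -> densified sketch (equal-length lists).
    """
    sketch_length = len(next(iter(sketches.values())))
    num_bands = sketch_length // band_size
    tables = [defaultdict(set) for _ in range(num_bands)]
    for trait, sketch in sketches.items():
        for band in range(num_bands):
            key = tuple(sketch[band * band_size:(band + 1) * band_size])
            tables[band][key].add(trait)
    return tables


def query_similar(tables, sketch, band_size):
    """Return every trait that shares at least one band with the query sketch."""
    matches = set()
    for band, table in enumerate(tables):
        key = tuple(sketch[band * band_size:(band + 1) * band_size])
        matches |= table.get(key, set())
    return matches


sketches = {
    "t1": [1, 2, 3, 4],
    "t2": [1, 2, 9, 9],
    "t3": [7, 7, 7, 7],
}
tables = build_lsh_tables(sketches, band_size=2)
similar_to_t1 = query_similar(tables, sketches["t1"], band_size=2)
```

Traits `t1` and `t2` agree on their first band and so are retrieved together, while `t3` shares no band with `t1` and is not returned.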
