-
Publication number: US11899693B2
Publication date: 2024-02-13
Application number: US17677323
Filing date: 2022-02-22
Applicant: Adobe Inc.
Inventor: Yeuk-yin Chan, Tung Mai, Ryan Rossi, Moumita Sinha, Matvey Kapilevich, Margarita Savova, Fan Du, Charles Menguy, Anup Rao
CPC classification number: G06F16/285
Abstract: A cluster generation system identifies data elements, from a first binary record, that each have a particular value and correspond to respective binary traits. A candidate description function describing the binary traits is generated, the candidate description function including a model factor that describes the data elements. Responsive to determining that a second record has additional data elements having the particular value and corresponding to the respective binary traits, the candidate description function is modified to indicate that the model factor describes the additional elements. The candidate description function is also modified to include a correction factor describing an additional binary trait excluded from the respective binary traits. Based on the modified candidate description function, the cluster generation system generates a data summary cluster, which includes a compact representation of the binary traits of the data elements and additional data elements.
-
Publication number: US11093565B2
Publication date: 2021-08-17
Application number: US16287546
Filing date: 2019-02-27
Applicant: Adobe Inc.
Inventor: Karthik Raman, Nedim Lipka, Matvey Kapilevich
IPC: G06F16/9535, G06F16/28, G06N7/00
Abstract: Systems and methods are disclosed for clustering multiple devices that are associated with particular users by utilizing both probabilistic and deterministic data derived from analytics information on the users. An analytics computing system generates at least one deterministic device cluster that groups a first set of devices associated with a first user. The first set of devices share deterministic user identifiers specific to the first user. The analytics computing system also identifies a probabilistic link between a device in the first set of devices and additional devices. The probabilistic link indicates common usage patterns between two devices. Based on the probabilistic link, the analytics computing system generates a data structure that includes the deterministic device cluster and the additional devices.
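The two-stage grouping described in this abstract can be illustrated with a minimal sketch. It assumes deterministic data arrives as (device, user identifier) pairs and probabilistic links as scored device pairs. The names `cluster_devices`, `logins`, `probabilistic_links`, and the `threshold` parameter are hypothetical and do not come from the patent.

```python
from collections import defaultdict


class UnionFind:
    """Minimal disjoint-set structure for grouping device IDs."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)


def cluster_devices(logins, probabilistic_links, threshold=0.8):
    """Group devices by shared identifiers, then attach probable extras.

    logins: iterable of (device_id, user_identifier) pairs (assumed input shape).
    probabilistic_links: iterable of (device_a, device_b, score) triples.
    threshold: hypothetical cutoff for accepting a probabilistic link.
    """
    uf = UnionFind()
    by_user = defaultdict(list)
    for device, user in logins:
        uf.find(device)  # register the device
        by_user[user].append(device)

    # Devices that share a deterministic identifier join one cluster.
    for devices in by_user.values():
        for other in devices[1:]:
            uf.union(devices[0], other)

    clusters = defaultdict(lambda: {"deterministic": set(), "probabilistic": set()})
    for device in list(uf.parent):
        clusters[uf.find(device)]["deterministic"].add(device)

    # Attach devices that are only reachable through a strong probabilistic link.
    for a, b, score in probabilistic_links:
        if score < threshold:
            continue
        for known, extra in ((a, b), (b, a)):
            if known in uf.parent and extra not in uf.parent:
                clusters[uf.find(known)]["probabilistic"].add(extra)
    return list(clusters.values())
```

Each returned dictionary keeps the deterministic members separate from the probabilistically linked devices, so a caller can still tell how each device entered the combined data structure.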
-
Publication number: US20220138218A1
Publication date: 2022-05-05
Application number: US17090556
Filing date: 2020-11-05
Applicant: Adobe Inc.
Inventor: Anup Rao, Tung Mai, Matvey Kapilevich
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that estimate the overlap between sets of data samples. In particular, in one or more embodiments, the disclosed systems utilize a sketch-based sampling routine and a flexible, accurate estimator to determine the overlap (e.g., the intersection) between sets of data samples. For example, in some implementations, the disclosed systems generate a sketch vector—such as a one permutation hashing vector—for each set of data samples. The disclosed systems further compare the sketch vectors to determine an equal bin similarity estimator, a lesser bin similarity estimator, and a greater bin similarity estimator. The disclosed systems utilize one or more of the determined similarity estimators in generating an overlap estimation for the sets of data samples.
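As a rough illustration of sketch-based overlap estimation, the following builds one permutation hashing sketches and derives an intersection size from them. It is a simplified sketch under stated assumptions, not the estimator claimed in the patent: it uses only the count of equal bins (the standard one permutation hashing Jaccard estimate) and ignores the lesser-bin and greater-bin estimators the abstract mentions. All function names and the `num_bins` and `seed` parameters are hypothetical.

```python
import hashlib


def _hash64(item, seed=0):
    """Deterministic 64-bit hash of an item (stand-in for a random permutation)."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def oph_sketch(items, num_bins=256, seed=0):
    """Hash every item once, route it to a bin, keep the minimum value per bin."""
    sketch = [None] * num_bins  # None marks an empty bin
    for item in items:
        h = _hash64(item, seed)
        b, v = h % num_bins, h // num_bins
        if sketch[b] is None or v < sketch[b]:
            sketch[b] = v
    return sketch


def estimate_jaccard(sk_a, sk_b):
    """Equal bins divided by bins that are non-empty in at least one sketch."""
    equal = both_empty = 0
    for a, b in zip(sk_a, sk_b):
        if a is None and b is None:
            both_empty += 1
        elif a == b:
            equal += 1
    usable = len(sk_a) - both_empty
    return equal / usable if usable else 0.0


def estimate_overlap(sk_a, sk_b, size_a, size_b):
    """Convert a Jaccard estimate J into |A ∩ B| = J * (|A| + |B|) / (1 + J)."""
    j = estimate_jaccard(sk_a, sk_b)
    return j * (size_a + size_b) / (1.0 + j)
```

Both sketches must be built with the same `num_bins` and `seed`, since the estimate compares bins position by position. The set sizes passed to `estimate_overlap` are assumed to be known exactly.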
-
Publication number: US11720592B2
Publication date: 2023-08-08
Application number: US17818974
Filing date: 2022-08-10
Applicant: Adobe Inc.
Inventor: Anup Rao, Tung Mai, Matvey Kapilevich
CPC classification number: G06F16/26, G06F16/285, G06F16/288, G06T11/206
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that estimate the overlap between sets of data samples. In particular, in one or more embodiments, the disclosed systems utilize a sketch-based sampling routine and a flexible, accurate estimator to determine the overlap (e.g., the intersection) between sets of data samples. For example, in some implementations, the disclosed systems generate a sketch vector—such as a one permutation hashing vector—for each set of data samples. The disclosed systems further compare the sketch vectors to determine an equal bin similarity estimator, a lesser bin similarity estimator, and a greater bin similarity estimator. The disclosed systems utilize one or more of the determined similarity estimators in generating an overlap estimation for the sets of data samples.
-
Publication number: US20200314472A1
Publication date: 2020-10-01
Application number: US16367628
Filing date: 2019-03-28
Applicant: Adobe Inc.
Inventor: Anup Rao, Yasin Abbasi Yadkori, Tung Mai, Ryan Rossi, Ritwik Sinha, Matvey Kapilevich, Alexandru Ionut Hodorogea
IPC: H04N21/258, H04N21/2668, H04N21/482
Abstract: The present disclosure relates to training a recommendation model to generate trait recommendations using one permutation hashing and populated-value-slot-based densification. In particular, the disclosed systems can train the recommendation model by computing sketch vectors corresponding to traits using one permutation hashing. The disclosed systems can then fill in unpopulated value slots of the sketch vectors using populated-value-slot-based densification. The disclosed systems can combine the resulting densified sketches to generate the trained recommendation model. For example, in some embodiments, the disclosed systems can combine the sketches by generating a plurality of locality sensitive hashing tables based on the sketches. In some embodiments, the disclosed systems generate a count sketch matrix based on the sketches and generate trait embeddings based on the count sketch matrix using spectral embedding. Based on the trait embeddings, the disclosed systems can utilize the recommendation model to flexibly and accurately determine the similarity between traits.
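The densification step can be pictured with a small sketch that fills each unpopulated slot of a sketch vector by borrowing from a populated one. This is an assumption-laden illustration rather than the patent's procedure: the probing scheme, the fallback scan, and the names `densify`, `_probe`, and `max_attempts` are all hypothetical, and empty slots are assumed to be marked with `None`.

```python
import hashlib


def _probe(slot, attempt, num_slots):
    """Deterministic pseudo-random slot index, identical for every sketch."""
    data = f"{slot}:{attempt}".encode("utf-8")
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_slots


def densify(sketch, max_attempts=64):
    """Return a copy of `sketch` in which every None slot holds a borrowed value."""
    if all(v is None for v in sketch):
        raise ValueError("cannot densify a sketch with no populated slots")
    n = len(sketch)
    dense = list(sketch)
    for slot, value in enumerate(sketch):
        if value is not None:
            continue  # populated slots are kept unchanged
        for attempt in range(max_attempts):
            candidate = sketch[_probe(slot, attempt, n)]
            if candidate is not None:
                dense[slot] = candidate
                break
        else:
            # Fallback: walk right (circularly) to the nearest populated slot.
            step = 1
            while sketch[(slot + step) % n] is None:
                step += 1
            dense[slot] = sketch[(slot + step) % n]
    return dense
```

Because the probe sequence depends only on the slot index and the attempt number, two sketches of the same length follow the same borrowing order, which is what lets densified sketches remain comparable slot by slot.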
-
Publication number: US10373197B2
Publication date: 2019-08-06
Application number: US13726308
Filing date: 2012-12-24
Applicant: Adobe Inc.
Inventor: Nicholas M. Jordon, Margarita R. Savova, Matvey Kapilevich, Paul Mackles, David M. Weinstein
IPC: G06Q30/02
Abstract: Tunable algorithmic segment techniques are described. In one or more implementations, a target audience definition is obtained that is input to initiate creation of a look-alike model. The target audience definition indicates traits associated with a baseline group of consumers who have interacted with online resources in a designated manner, such as by buying a product, visiting a website, using a service, and so forth. Tuning parameters designated for the look-alike model are ascertained and the look-alike model is built based on the target audience definition and the tuning parameters. The tuning parameters may include at least a setting selectable to control reach versus accuracy for the look-alike model. Segment data indicative of market segments generated according to the look-alike model may then be exposed for manipulation by a client. The manipulation may include selectable control over the tuning parameters to generate different look-alike groups from the segment data.
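The reach-versus-accuracy tuning can be sketched with a toy look-alike model. This is purely illustrative and assumes each consumer is represented as a set of traits: the scoring rule (mean baseline frequency of a candidate's traits) and the names `build_lookalike_model`, `score`, `lookalike_segment`, and `accuracy` are hypothetical choices, not the patent's method.

```python
from collections import Counter


def build_lookalike_model(baseline_profiles):
    """Map each trait to the fraction of baseline consumers who have it."""
    counts = Counter(t for traits in baseline_profiles for t in traits)
    total = len(baseline_profiles)
    return {trait: n / total for trait, n in counts.items()}


def score(model, traits):
    """Mean baseline frequency over a candidate's traits (0.0 if none)."""
    if not traits:
        return 0.0
    return sum(model.get(t, 0.0) for t in traits) / len(traits)


def lookalike_segment(model, candidates, accuracy=0.5):
    """Keep candidates whose score meets the cutoff.

    A higher `accuracy` yields a smaller, closer-matching group;
    a lower value trades match quality for reach.
    """
    return {cid for cid, traits in candidates.items()
            if score(model, traits) >= accuracy}
```

Calling `lookalike_segment` repeatedly with different `accuracy` values on the same model mirrors the idea of generating different look-alike groups from the same segment data.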
-
Publication number: US20230267158A1
Publication date: 2023-08-24
Application number: US17675290
Filing date: 2022-02-18
Applicant: Adobe Inc.
Inventor: Matvey Kapilevich, Margarita R. Savova, Anup Bandigadi Rao, Tung Thanh Mai, Lakshmi Shivalingaiah, Liron Goren Snai, Charles Menguy, Vijeth Lomada, Moumita Sinha, Harleen Sahni
IPC: G06F16/9538, G06F16/901, G06F16/28
CPC classification number: G06F16/9538, G06F16/9024, G06F16/283, G06N20/00
Abstract: Multi-modal machine-learning model training techniques for search are described that overcome conventional challenges and inefficiencies to support real time output, which is not possible in conventional training techniques. In one example, a search system is configured to support multi-modal machine-learning model training. This includes use of a preview mode and an expanded mode. In the preview mode, a preview segment is generated as part of real time training of a machine learning model. In the expanded mode, the preview segment is persisted as an expanded segment that is used to train and utilize an expanded machine-learning model as part of search.
-
Publication number: US20230267132A1
Publication date: 2023-08-24
Application number: US17677323
Filing date: 2022-02-22
Applicant: Adobe Inc.
Inventor: Yeuk-yin Chan, Tung Mai, Ryan Rossi, Moumita Sinha, Matvey Kapilevich, Margarita Savova, Fan Du, Charles Menguy, Anup Rao
IPC: G06F16/28
CPC classification number: G06F16/285
Abstract: A cluster generation system identifies data elements, from a first binary record, that each have a particular value and correspond to respective binary traits. A candidate description function describing the binary traits is generated, the candidate description function including a model factor that describes the data elements. Responsive to determining that a second record has additional data elements having the particular value and corresponding to the respective binary traits, the candidate description function is modified to indicate that the model factor describes the additional elements. The candidate description function is also modified to include a correction factor describing an additional binary trait excluded from the respective binary traits. Based on the modified candidate description function, the cluster generation system generates a data summary cluster, which includes a compact representation of the binary traits of the data elements and additional data elements.
-
Publication number: US11449523B2
Publication date: 2022-09-20
Application number: US17090556
Filing date: 2020-11-05
Applicant: Adobe Inc.
Inventor: Anup Rao, Tung Mai, Matvey Kapilevich
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that estimate the overlap between sets of data samples. In particular, in one or more embodiments, the disclosed systems utilize a sketch-based sampling routine and a flexible, accurate estimator to determine the overlap (e.g., the intersection) between sets of data samples. For example, in some implementations, the disclosed systems generate a sketch vector—such as a one permutation hashing vector—for each set of data samples. The disclosed systems further compare the sketch vectors to determine an equal bin similarity estimator, a lesser bin similarity estimator, and a greater bin similarity estimator. The disclosed systems utilize one or more of the determined similarity estimators in generating an overlap estimation for the sets of data samples.
-
Publication number: US11109085B2
Publication date: 2021-08-31
Application number: US16367628
Filing date: 2019-03-28
Applicant: Adobe Inc.
Inventor: Anup Rao, Yasin Abbasi Yadkori, Tung Mai, Ryan Rossi, Ritwik Sinha, Matvey Kapilevich, Alexandru Ionut Hodorogea
IPC: G06F7/00, G06F16/00, H04N21/258, H04N21/482, H04N21/2668
Abstract: The present disclosure relates to training a recommendation model to generate trait recommendations using one permutation hashing and populated-value-slot-based densification. In particular, the disclosed systems can train the recommendation model by computing sketch vectors corresponding to traits using one permutation hashing. The disclosed systems can then fill in unpopulated value slots of the sketch vectors using populated-value-slot-based densification. The disclosed systems can combine the resulting densified sketches to generate the trained recommendation model. For example, in some embodiments, the disclosed systems can combine the sketches by generating a plurality of locality sensitive hashing tables based on the sketches. In some embodiments, the disclosed systems generate a count sketch matrix based on the sketches and generate trait embeddings based on the count sketch matrix using spectral embedding. Based on the trait embeddings, the disclosed systems can utilize the recommendation model to flexibly and accurately determine the similarity between traits.