-
Publication No.: US20250124257A1
Publication Date: 2025-04-17
Application No.: US18487460
Filing Date: 2023-10-16
Applicant: YAHOO ASSETS LLC
Inventor: Ariel Raviv , Noa Avigdor-Elgrabli , Stav Yanovsky Daye , Michael Viderman , Guy Horowitz
IPC: G06N3/0455 , G06N3/0895
Abstract: The present teaching relates to content categorization. Supervised training data and unlabeled data clusters are used to generate augmented training data. Each unlabeled data cluster includes data samples with varying features. Weakly labeled training data is created with new data samples generated via generative augmentation based on supervised training data and the unlabeled data clusters. Each new data sample is assigned a label from a corresponding data sample from the supervised training data with generated varying characteristics. Augmented training data is created from the supervised and the weakly labeled training data and is used to train a robust content categorization model via machine learning.
-
Publication No.: US20250124258A1
Publication Date: 2025-04-17
Application No.: US18487487
Filing Date: 2023-10-16
Applicant: YAHOO ASSETS LLC
Inventor: Ariel Raviv , Noa Avigdor-Elgrabli , Stav Yanovsky Daye , Michael Viderman , Guy Horowitz
IPC: G06N3/0455 , G06N3/0895
Abstract: The present teaching relates to content categorization. Supervised training data and unlabeled data clusters are used to generate augmented training data. Each unlabeled data cluster includes data samples with varying features. Weakly labeled training data is created based on supervised training data and the unlabeled data clusters with data samples therein with cluster labels via consistent self-training so that a labeled data sample in the supervised training data and a data sample in the weakly labeled training data with the same label have varying characteristics. Augmented training data is created from the supervised and the weakly labeled training data and is used to train a robust content categorization model via machine learning.
-
Publication No.: US11921846B2
Publication Date: 2024-03-05
Application No.: US16835871
Filing Date: 2020-03-31
Applicant: YAHOO ASSETS LLC
Inventor: Stav Yanovsky Daye , Ran Wolff
IPC: G06F21/55 , G06F18/214 , G06F18/22 , G06F21/62
CPC classification number: G06F21/552 , G06F18/214 , G06F18/22 , G06F21/55 , G06F21/6218 , G06F2221/2141
Abstract: Disclosed are systems and methods for improving interactions with and between computers in distributional similarity identification using randomized observations. In connection with an intrusion detection system monitoring a computing system, a pair of perturbed sample sets is generated using a pair of real sample sets (or real observations) and a pair of random sample sets (of randomly selected observations), and a similarity measure representing a level of consistency in user behavior is determined. The systems improve the quality and accuracy of the similarity determination for use in intrusion detection.
-