-
Publication No.: US12019771B2
Publication Date: 2024-06-25
Application No.: US18539851
Filing Date: 2023-12-14
Applicant: Lemon Inc.
Inventor: Xin Yang, Junyuan Xie, Jiankai Sun, Yuanshun Yao, Chong Wang
Abstract: A method, device, apparatus, and medium for protecting sensitive data are proposed. In the method, to-be-processed data is received from a server device. A processing result of a user for the to-be-processed data is received, the processing result comprising sensitive data of the user for the processing of the to-be-processed data. A gradient for training a server model at the server device is determined based on a comparison between the processing result and a prediction result for the to-be-processed data. The gradient is updated in a change direction associated with the gradient so as to generate an updated gradient to be sent to the server device. Noise is added only in the change direction associated with the gradient. This reduces the overhead of processing noise in a plurality of directions and avoids introducing excessive noise that would interfere with training into the updated gradient.
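The idea of adding noise only along the gradient's own direction can be illustrated with a minimal sketch. This is not the patented method: the function name `perturb_along_gradient`, the Gaussian noise distribution, the `noise_scale` parameter, and the handling of a zero gradient are all assumptions made for illustration.

```python
import math
import random

def perturb_along_gradient(gradient, noise_scale, rng):
    """Add scalar noise along the unit direction of the gradient only.

    Assumption: the noise is a single Gaussian draw z ~ N(0, noise_scale^2),
    so the result is gradient + z * gradient / ||gradient||.
    """
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm == 0.0:
        # No direction is defined for a zero gradient; return it unchanged
        # (an illustrative choice, not taken from the abstract).
        return list(gradient)
    z = rng.gauss(0.0, noise_scale)
    return [g + z * g / norm for g in gradient]

rng = random.Random(0)
grad = [0.3, -0.4, 1.2]
noisy = perturb_along_gradient(grad, noise_scale=0.5, rng=rng)
```

Because only one scalar is sampled, the perturbed gradient stays parallel to the original one; only its length (and possibly its sign) changes.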
-
Publication No.: US20230098656A1
Publication Date: 2023-03-30
Application No.: US18070461
Filing Date: 2022-11-28
Applicant: Lemon Inc.
Inventor: Aonan Zhang, Jiankai Sun, Ruocheng Guo, Taiqing Wang, Xiaohui Chen
IPC: G06N20/20
Abstract: The present disclosure describes techniques for improving data subsampling for recommendation systems. A user-item graph associated with training data may be constructed. An importance of user-item interactions may be estimated via graph conductance based on the user-item graph. An importance of the training data may be measured via sample hardness using a pre-trained pilot model. A subsampling rate may be generated based on the importance estimated from the user-item graph and the importance measured by the pre-trained pilot model.
-
Publication No.: US20240070525A1
Publication Date: 2024-02-29
Application No.: US17897697
Filing Date: 2022-08-29
Applicant: Lemon Inc.
Inventor: Jiankai Sun, Xinlei Xu, Xin Yang, Yuanshun Yao, Chong Wang
CPC classification number: G06N20/00, G06F21/6245
Abstract: The present disclosure describes techniques of performing machine unlearning in a recommendation model. An unlearning process of the recommendation model may be initiated in response to receiving a request for deleting a fraction of user data from any particular user. The recommendation model may be pre-trained to recommend content to users based at least in part on user data. Values of entries in a matrix corresponding to the fraction of user data may be configured as zero. The matrix may comprise entries denoting preferences of users with respect to content items. Confidence values associated with the fraction of user data may be configured as zero to block influence of the fraction of user data on performance of the recommendation model. The unlearning process may be implemented by performing a number of iterations until the recommendation model has converged.
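The unlearning steps in this abstract (zero the preference entries, zero their confidence values, iterate until convergence) can be sketched on a confidence-weighted matrix factorization. Everything below is an assumption for illustration: the abstract does not name the model or optimizer, so the factor matrices `X` and `Y`, the full-batch gradient descent, the hyperparameters, and all function names are hypothetical, and the random factors merely stand in for a pre-trained model.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_loss(P, C, X, Y, lam):
    """Confidence-weighted squared error plus L2 regularization."""
    err = sum(C[u][i] * (P[u][i] - dot(X[u], Y[i])) ** 2
              for u in range(len(P)) for i in range(len(P[0])))
    reg = sum(v * v for row in X + Y for v in row)
    return err + lam * reg

def gradient_step(P, C, X, Y, lam, lr):
    """One full-batch gradient step on user factors X and item factors Y."""
    k = len(X[0])
    gX = [[2 * lam * v for v in row] for row in X]
    gY = [[2 * lam * v for v in row] for row in Y]
    for u in range(len(P)):
        for i in range(len(P[0])):
            if C[u][i] == 0.0:
                continue  # zero confidence: the entry contributes no gradient
            r = -2 * C[u][i] * (P[u][i] - dot(X[u], Y[i]))
            for f in range(k):
                gX[u][f] += r * Y[i][f]
                gY[i][f] += r * X[u][f]
    for row, g in zip(X + Y, gX + gY):
        for f in range(k):
            row[f] -= lr * g[f]

def unlearn(P, C, X, Y, forget, lam=0.1, lr=0.01, tol=1e-9, max_iters=2000):
    """Zero the forgotten entries, then iterate until the loss stops changing."""
    for u, i in forget:
        P[u][i] = 0.0  # preference entry set to zero
        C[u][i] = 0.0  # confidence set to zero to block its influence
    prev = weighted_loss(P, C, X, Y, lam)
    for it in range(1, max_iters + 1):
        gradient_step(P, C, X, Y, lam, lr)
        cur = weighted_loss(P, C, X, Y, lam)
        if abs(prev - cur) < tol:
            return it
        prev = cur
    return max_iters

rng = random.Random(0)
P = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]   # user-by-item preferences
C = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]   # per-entry confidence values
X = [[rng.uniform(-0.1, 0.1) for _ in range(2)] for _ in P]      # stand-in user factors
Y = [[rng.uniform(-0.1, 0.1) for _ in range(2)] for _ in P[0]]   # stand-in item factors
iterations = unlearn(P, C, X, Y, forget=[(0, 2)])
```

With the confidence at zero, the forgotten entry drops out of both the loss and the gradient, so later iterations are driven only by the remaining data.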
-
Publication No.: US20230161899A1
Publication Date: 2023-05-25
Application No.: US17535398
Filing Date: 2021-11-24
Applicant: Lemon Inc.
Inventor: Xin Yang, Yuanshun Yao, Tianyi Liu, Jiankai Sun, Chong Wang, Ruihan Wu
IPC: G06F21/62
CPC classification number: G06F21/6245
Abstract: The present disclosure describes techniques of releasing data while protecting individual privacy. A dataset may be compressed by applying a first random matrix. The dataset may be owned by one party among a plurality of parties, and there may be a plurality of datasets owned by the plurality of parties. Noise may be added by applying a random Gaussian matrix to the compressed dataset to obtain a processed dataset. The processed dataset ensures data privacy protection. The processed dataset may be released to other parties.
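A minimal sketch of the compress-then-add-noise pipeline follows, under several assumptions the abstract does not confirm: the first random matrix is taken to be Gaussian and to project the feature dimension from `d` down to `k`, the Gaussian matrix is applied additively, and the names `release`, `k`, and `sigma` are hypothetical.

```python
import random

def gaussian_matrix(rows, cols, std, rng):
    """Matrix with independent N(0, std^2) entries."""
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]

def matmul(A, B):
    """Plain-Python matrix product A @ B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def release(dataset, k, sigma, rng):
    """Compress an n-by-d dataset to n-by-k, then add Gaussian noise."""
    n, d = len(dataset), len(dataset[0])
    # Assumption: the first random matrix is a d-by-k Gaussian projection.
    projection = gaussian_matrix(d, k, (1.0 / k) ** 0.5, rng)
    compressed = matmul(dataset, projection)
    # Assumption: the random Gaussian matrix is added entrywise as noise.
    noise = gaussian_matrix(n, k, sigma, rng)
    return [[c + z for c, z in zip(c_row, z_row)]
            for c_row, z_row in zip(compressed, noise)]

rng = random.Random(0)
data = [[1.0, 2.0, 3.0, 4.0],
        [0.5, 0.0, 1.5, 2.5],
        [2.0, 1.0, 0.0, 3.0]]
processed = release(data, k=2, sigma=0.1, rng=rng)
```

Each party would run such a step on its own dataset before sharing the result. How `sigma` should be chosen to meet a given privacy guarantee is not stated in the abstract and is not addressed here.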
-
Publication No.: US20230143789A1
Publication Date: 2023-05-11
Application No.: US18149462
Filing Date: 2023-01-03
Applicant: Lemon Inc.
Inventor: Shangyu Xie, Jiankai Sun, Xin Yang, Yuanshun Yao, Tianyi Liu, Taiqing Wang
Abstract: Split learning is provided to train a composite neural network (CNN) model that is split into first and second submodels, including receiving a noise-laden backpropagation gradient, training a surrogate submodel by optimizing a gradient distance loss, and computing an updated dummy label using the first submodel and the trained surrogate submodel to infer label information of the second submodel. Noise can be added to a label of the second submodel or to a shared backpropagation gradient to protect the label information.