-
Publication No.: US20240005210A1
Publication date: 2024-01-04
Application No.: US18252559
Application date: 2021-11-06
Applicant: Lemon Inc.
Inventor: Jiankai SUN , Weihao GAO , Chong WANG , Hongyi ZHANG , Xiaobing LIU , Runliang LI , Xin YANG
Abstract: The present disclosure relates to a data protection method, an apparatus, a medium and a device. The method includes: acquiring gradient association information respectively corresponding to reference samples of a target batch of an active party of a joint training model; according to the proportion occupied respectively by reference samples of positive examples and reference samples of negative examples in all reference samples of the target batch, determining a constraint condition of the data noise to be added; determining information of said data noise according to the gradient association information and the constraint condition corresponding to the reference samples; correcting, according to the information of said data noise, an initial gradient transmission value corresponding to each reference sample, so as to obtain target gradient transmission information; and sending the target gradient transmission information to a passive party of the joint training model.
-
Publication No.: US20240249004A1
Publication date: 2024-07-25
Application No.: US18565000
Application date: 2022-04-28
Applicant: Lemon Inc.
Inventor: Xin YANG , Jiankai SUN , Weihao GAO , Junyuan XIE , Chong WANG
CPC classification number: G06F21/602 , G06N20/20
Abstract: The present disclosure relates to a method and a device for data protection, a readable medium and an electronic apparatus, and the method comprises: acquiring a target identification information union set, wherein the target identification information union set comprises target encryption identification information of a first party of a joint training model and target encryption identification information of a second party of the joint training model, the target encryption identification information in the target identification information union set being obtained by encrypting according to a secret key of the first party and a secret key of the second party; and determining, according to the target identification information union set, a target sample data set for training the joint training model. Therefore, an identification information intersection of the first party and the second party does not need to be determined in advance as in the related technology.
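The union-set idea in this abstract can be illustrated with a minimal sketch. It assumes a commutative cipher built from modular exponentiation over a toy prime modulus; the abstract does not name an encryption scheme, so `P`, `hash_to_group`, `generate_key`, `encrypt`, `double_encrypt`, and the sample identifiers are all hypothetical and not taken from the patent.

```python
import hashlib
import math
import secrets

# Assumption: a Mersenne prime used as a toy modulus, not a vetted group.
P = 2**127 - 1


def hash_to_group(identifier: str) -> int:
    """Map an identifier to an integer in [1, P - 1]."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def generate_key() -> int:
    """Pick a secret exponent coprime to P - 1 so encryption is a bijection."""
    while True:
        key = secrets.randbelow(P - 3) + 2
        if math.gcd(key, P - 1) == 1:
            return key


def encrypt(value: int, key: int) -> int:
    """Commutative encryption: (x^a)^b == (x^b)^a (mod P)."""
    return pow(value, key, P)


def double_encrypt(ids, own_key: int, other_key: int) -> set:
    """Encrypt with the owner's key first, then with the other party's key."""
    return {encrypt(encrypt(hash_to_group(i), own_key), other_key) for i in ids}


key_a, key_b = generate_key(), generate_key()
ids_a = ["u1", "u2", "u3"]  # hypothetical identifiers of the first party
ids_b = ["u2", "u3", "u4"]  # hypothetical identifiers of the second party

enc_a = double_encrypt(ids_a, key_a, key_b)
enc_b = double_encrypt(ids_b, key_b, key_a)

# Union of doubly encrypted identifiers; shared IDs collapse to one entry
# because the order in which the two keys are applied does not matter.
union_set = enc_a | enc_b
```

In this sketch neither party sees the other's plaintext identifiers, and the union is formed directly over ciphertexts rather than over a previously computed intersection.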
-
Publication No.: US20240220641A1
Publication date: 2024-07-04
Application No.: US18565962
Application date: 2022-07-15
Applicant: Lemon Inc.
Inventor: Jiankai SUN , Xin YANG , Aonan ZHANG , Weihao GAO , Junyuan XIE , Chong WANG
CPC classification number: G06F21/602 , G06N20/00
Abstract: The present disclosure relates to a data protection method, apparatus, medium and electronic device. The method comprises: obtaining a specified batch of reference samples of an active participant of a joint training model; determining generation gradient information of the first reference sample; determining target gradient information sent to the passive participant according to the generation gradient information, and sending the target gradient information to the passive participant, to update, by the passive participant, parameters of the joint training model according to the target gradient information. Through the above solution, the influence of the generated data on the training process and model performance of the joint training model is avoided as much as possible, and the privacy and security of data are improved.
-
Publication No.: US20240126899A1
Publication date: 2024-04-18
Application No.: US18539851
Application date: 2023-12-14
Applicant: Lemon Inc.
Inventor: Xin YANG , Junyuan XIE , Jiankai SUN , Yuanshun YAO , Chong WANG
Abstract: There are proposed a method, device, apparatus, and medium for protecting sensitive data. In a method, to-be-processed data is received from a server device. A processing result of a user for the to-be-processed data is received, the processing result comprising sensitive data of the user for the processing of the to-be-processed data. A gradient for training a server model at the server device is determined based on a comparison between the processing result and a prediction result for the to-be-processed data. The gradient is updated in a change direction associated with the gradient so as to generate an updated gradient to be sent to the server device. Noise is added only in the change direction associated with the gradient. The corresponding overhead of processing noise in a plurality of directions can be reduced, and no excessive noise data interfering with training will be introduced to the updated gradient.
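Adding noise only in the direction of the gradient can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the noise is taken to be a single Gaussian scalar applied along the unit gradient, and `add_noise_along_gradient`, `noise_scale`, and the sample values are hypothetical.

```python
import math
import random


def add_noise_along_gradient(gradient, noise_scale, rng):
    """Return the gradient perturbed only along its own direction."""
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm == 0.0:
        # A zero gradient has no direction, so it is returned unchanged.
        return list(gradient)
    # Assumption: one scalar Gaussian draw, instead of one draw per coordinate.
    noise = rng.gauss(0.0, noise_scale)
    return [g + noise * (g / norm) for g in gradient]


rng = random.Random(0)
gradient = [3.0, 4.0]
updated = add_noise_along_gradient(gradient, noise_scale=0.5, rng=rng)
```

Because only one scalar is sampled, the updated gradient stays parallel to the original one, which is the property the abstract relies on to avoid noise in other directions.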
-
Publication No.: US20240119341A1
Publication date: 2024-04-11
Application No.: US17953255
Application date: 2022-09-26
Applicant: Lemon Inc.
Inventor: Xin YANG , Hanlin ZHU , Tianyi LIU , Jiankai SUN , Yuanshun YAO , Aonan ZHANG , Chong WANG
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: The present disclosure describes techniques for determining performance of a classifier. A first machine learning model and a second machine learning model may be trained by aggregating updates to the first machine learning model and the second machine learning model received from a plurality of client computing devices. A cumulative distribution function (CDF) associated with a distribution of the positive samples in the user data may be estimated using the trained first machine learning model. A probability density function (PDF) associated with a distribution of the negative samples in the user data may be estimated using the trained second machine learning model. An integration-based computation of an area under the receiver operating characteristic curve (AUC) of the classifier may be performed using the PDF and the CDF.
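The integration-based AUC can be sketched numerically using the identity AUC = ∫ (1 − F_pos(s)) · f_neg(s) ds, where F_pos is the CDF of positive-sample scores and f_neg is the PDF of negative-sample scores. In the abstract both functions are estimated by trained models; in this sketch closed-form Gaussians stand in for them, and every function name and parameter is an assumption.

```python
import math


def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def auc_by_integration(pos_cdf, neg_pdf, lo, hi, steps=10000):
    """Trapezoidal estimate of the integral of (1 - F_pos(s)) * f_neg(s)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        s = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * (1.0 - pos_cdf(s)) * neg_pdf(s)
    return total * h


# Assumption: positive scores ~ N(1, 1), negative scores ~ N(0, 1).
auc = auc_by_integration(
    lambda s: normal_cdf(s, 1.0, 1.0),
    lambda s: normal_pdf(s, 0.0, 1.0),
    lo=-10.0,
    hi=11.0,
)
```

For these two Gaussians the closed form is Φ(1/√2), so the numerical result can be checked against `0.5 * (1 + math.erf(0.5))`.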