Training Machine-Learned Models with Label Differential Privacy

    Publication number: US20240265294A1

    Publication date: 2024-08-08

    Application number: US18156915

    Application date: 2023-01-19

    Applicant: Google LLC

    CPC classification number: G06N20/00

    Abstract: An example method is provided for conducting differentially private communication of training data for training a machine-learned model. Initial label data can be obtained that corresponds to feature data. A plurality of label bins can be determined to respectively provide representative values for initial label values assigned to the plurality of label bins. Noised label data can be generated, based on a probability distribution over the plurality of label bins, to correspond to the initial label data, the probability distribution characterized by, for a respective noised label corresponding to a respective initial label of the initial label data, a first probability for returning a representative value of a label bin to which the respective initial label is assigned, and a second probability for returning another value. The noised label data can be communicated for training the machine-learned model.
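
The two-probability scheme in the abstract above resembles k-ary randomized response applied to label bins. The following is a minimal sketch under that assumption, not the patented distribution: the function name `noise_label`, the `thresholds` and `bin_values` parameters, and the uniform choice among the other bins are all illustrative.

```python
import bisect
import math
import random


def noise_label(label, thresholds, bin_values, epsilon, rng=None):
    """Return a noised label using k-ary randomized response over bins.

    thresholds: sorted interior bin boundaries (hypothetical parameter).
    bin_values: one representative value per bin, len(thresholds) + 1 items.
    """
    rng = rng or random.Random()
    k = len(bin_values)
    if k != len(thresholds) + 1:
        raise ValueError("need exactly one representative value per bin")

    # Index of the bin that the initial label falls into.
    true_bin = bisect.bisect_right(thresholds, label)
    if k == 1:
        return bin_values[0]

    # First probability: return the representative of the true bin.
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return bin_values[true_bin]

    # Second probability: return another bin's representative, chosen
    # uniformly here (an assumption; the abstract only says "another value").
    other = rng.randrange(k - 1)
    if other >= true_bin:
        other += 1
    return bin_values[other]
```

A smaller `epsilon` lowers `p_keep`, so more labels are replaced and the communicated labels reveal less about the initial ones.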

    Pure Differentially Private Algorithms for Summation in the Shuffled Model

    Publication number: US20240236052A1

    Publication date: 2024-07-11

    Application number: US18403339

    Application date: 2024-01-03

    Applicant: Google LLC

    CPC classification number: H04L63/0428 G06N5/04 G06N20/00

    Abstract: An encoding method for enabling privacy-preserving aggregation of private data can include obtaining private data including a private value, determining a probabilistic status defining one of a first condition and a second condition, producing a multiset including a plurality of multiset values, and providing the multiset for aggregation with a plurality of additional multisets respectively generated for a plurality of additional private values. In response to the probabilistic status having the first condition, the plurality of multiset values is based at least in part on the private value, and in response to the probabilistic status having the second condition, the plurality of multiset values is a noise message. The noise message is produced based at least in part on a noise distribution that comprises a discretization of a continuous unimodal distribution supported on a range from zero to a number of multiset values included in the plurality of multiset values.

    Portion-Specific Model Compression for Optimization of Machine-Learned Models

    Publication number: US20240232686A1

    Publication date: 2024-07-11

    Application number: US18012292

    Application date: 2022-07-29

    Applicant: Google LLC

    CPC classification number: G06N20/00

    Abstract: Systems and methods of the present disclosure are directed to portion-specific compression and optimization of machine-learned models. For example, a method for portion-specific compression and optimization of machine-learned models includes obtaining data descriptive of one or more respective sets of compression schemes for one or more model portions of a plurality of model portions of a machine-learned model. The method includes evaluating a cost function to respectively select one or more candidate compression schemes from the one or more sets of compression schemes. The method includes respectively applying the one or more candidate compression schemes to the one or more model portions to obtain a compressed machine-learned model comprising one or more compressed model portions that correspond to the one or more model portions.
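
The selection step in the abstract above can be pictured as a per-portion minimization of a cost function. The sketch below assumes that reading; the scheme records, the portion names, and the cost `size_bytes + error_weight * error` are hypothetical and do not come from the patent.

```python
def select_schemes(candidates, cost_fn):
    """Pick, for each model portion, the candidate scheme with lowest cost.

    candidates: dict mapping portion name -> list of scheme records.
    cost_fn: callable (portion, scheme) -> float; lower is better.
    """
    selected = {}
    for portion, schemes in candidates.items():
        if not schemes:
            continue  # No candidate schemes: leave this portion uncompressed.
        selected[portion] = min(schemes, key=lambda s: cost_fn(portion, s))
    return selected


def example_cost(portion, scheme, error_weight=1000.0):
    # Hypothetical cost: compressed size plus a penalty for added error.
    return scheme["size_bytes"] + error_weight * scheme["error"]


# Made-up candidate sets for two illustrative model portions.
candidates = {
    "embedding": [
        {"name": "none", "size_bytes": 4000, "error": 0.0},
        {"name": "int8", "size_bytes": 1000, "error": 0.5},
        {"name": "int4", "size_bytes": 500, "error": 3.0},
    ],
    "classifier_head": [
        {"name": "none", "size_bytes": 800, "error": 0.0},
        {"name": "int8", "size_bytes": 200, "error": 2.0},
    ],
}
chosen = select_schemes(candidates, example_cost)
```

Because each portion is evaluated separately, a portion that tolerates quantization can be compressed while a sensitive one is left as is.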

    Systems and Methods for Locally Private Non-Interactive Communications

    Publication number: US20230308422A1

    Publication date: 2023-09-28

    Application number: US18011995

    Application date: 2021-12-20

    Applicant: Google LLC

    CPC classification number: H04L63/0428 G06F21/604

    Abstract: A computer-implemented method for encoding data for communications with improved privacy includes obtaining, by a computing system comprising one or more computing devices, input data including one or more input data points. The method can include constructing, by the computing system, a net tree including potential representatives of the one or more input data points, the potential representatives arranged in a plurality of levels, the net tree including a hierarchical data structure including a plurality of hierarchically organized nodes. The method can include determining, by the computing system, a representative of each of the one or more input data points from the potential representatives of the net tree, the representative including one of the plurality of hierarchically organized nodes. The method can include encoding, by the computing system, the representative of each of the one or more input data points for communication.

    SCHEDULING OPERATIONS ON A COMPUTATION GRAPH

    Publication number: US20210216367A1

    Publication date: 2021-07-15

    Application number: US17214699

    Application date: 2021-03-26

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scheduling operations represented on a computation graph. One of the methods includes receiving, by a computation graph system, a request to generate a schedule for processing a computation graph; obtaining data representing the computation graph; generating a separator of the computation graph; and generating the schedule to perform the operations represented in the computation graph, wherein generating the schedule comprises: initializing the schedule with zero nodes; for each node in the separator: determining whether the node has any predecessor nodes in the computation graph, when the node has any predecessor nodes, adding the predecessor nodes to the schedule, and adding the node to the schedule; and adding to the schedule each node in each subgraph that is not a predecessor to any node in the separator on the computation graph.
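
The schedule construction in the abstract above can be sketched as follows. This assumes the graph is acyclic, that the separator is already given (computing it is out of scope here), and that "predecessor nodes" means all ancestors of a node; the function name and the dict-of-predecessors representation are illustrative choices.

```python
def schedule_from_separator(preds, separator):
    """Build an execution order for a DAG, starting from separator nodes.

    preds: dict mapping every node -> list of its direct predecessors.
    separator: iterable of nodes assumed to form a separator of the graph.
    """
    schedule = []   # Initialize the schedule with zero nodes.
    placed = set()

    def place(node):
        # Add all not-yet-scheduled predecessors first, then the node itself.
        if node in placed:
            return
        for parent in preds.get(node, ()):
            place(parent)
        placed.add(node)
        schedule.append(node)

    # Phase 1: each separator node, preceded by its predecessors.
    for node in separator:
        place(node)

    # Phase 2: every remaining node that is not a predecessor of the separator.
    for node in preds:
        place(node)

    return schedule


# Small diamond-shaped graph: a -> b -> {c, d} -> e, with separator {b}.
preds = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"], "e": ["c", "d"]}
order = schedule_from_separator(preds, ["b"])
```

Every node appears after all of its predecessors, so the returned order is a valid topological order of the graph.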
