MODELING DISJOINT MANIFOLDS
    Invention publications

    Publication number: US20230386190A1

    Publication date: 2023-11-30

    Application number: US18202455

    Filing date: 2023-05-26

    CPC classification number: G06V10/82 G06V10/7625

    Abstract: A computer model is trained to treat data samples in a high-dimensional space as lying on different manifolds, so that the data set as a whole is modeled as a union of manifolds rather than as a single manifold. Data samples expected to belong to the same underlying manifold are identified by grouping the data. For generative models, the trained model may include a sub-model for each group, trained on that group's data samples, so that each sub-model accounts for the manifold of its group. The overall generative model also includes information describing how frequently to sample from each sub-model, so that sampling correctly represents the data set as a whole. Multi-class classification models may also use the grouping to improve classification accuracy by weighting group data samples according to the estimated latent dimensionality of the group.
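
The sampling step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: it assumes group labels are already available (for example from a clustering step), and the names `fit_group_frequencies`, `sample_union`, `sample_circle`, and `sample_segment` are hypothetical. The two toy samplers stand in for trained sub-models, each producing points on a different manifold.

```python
import math
import random
from collections import Counter


def fit_group_frequencies(group_labels):
    """Fraction of training samples assigned to each group."""
    counts = Counter(group_labels)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def sample_union(sub_models, frequencies, n, rng):
    """Draw n samples: pick a group by its frequency, then sample its sub-model."""
    groups = list(frequencies)
    weights = [frequencies[g] for g in groups]
    picks = rng.choices(groups, weights=weights, k=n)
    return [(g, sub_models[g](rng)) for g in picks]


# Hypothetical stand-ins for trained sub-models (assumption, not from the source).
def sample_circle(rng):
    """Point on the unit circle, a 1-D manifold embedded in 2-D."""
    t = rng.uniform(0.0, 2.0 * math.pi)
    return (math.cos(t), math.sin(t))


def sample_segment(rng):
    """Point on a horizontal segment at y = 3, a second, disjoint manifold."""
    return (rng.uniform(-1.0, 1.0), 3.0)


SUB_MODELS = {"circle": sample_circle, "segment": sample_segment}
```

Keeping the group frequencies separate from the sub-models means each sub-model only has to fit one manifold, while the mixture weights restore the proportions of the full data set.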

    CALIBRATED MODEL INTERVENTION WITH CONFORMAL THRESHOLD

    Publication number: US20240330772A1

    Publication date: 2024-10-03

    Application number: US18618757

    Filing date: 2024-03-27

    CPC classification number: G06N20/00

    Abstract: A classification model is calibrated with a conformal threshold so that its classifications have a known error rate. Rather than using the model outputs directly, the classification model outputs are converted to a conformal score that is compared with the conformal threshold to determine whether a data sample is a member of a class. When exactly one class passes the conformal threshold for the data sample, an action associated with that class can be applied confidently with a known error rate. When zero or multiple classes pass, this may indicate substantial uncertainty in the model prediction, and the data sample may be escalated to another decision mechanism, such as manual review or a more complex classification model.
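
A rough sketch of this routing logic, using standard split conformal prediction, is shown below. The abstract does not say which conformal score is used, so the choice of one minus the predicted class probability is an assumption, as are the function names `conformal_threshold` and `route`.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split-conformal threshold from a held-out calibration set.

    Assumed score: 1 - probability the model gives to the true class.
    """
    scores = sorted(1.0 - probs[label] for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        return math.inf  # too few calibration samples: every class passes
    return scores[rank - 1]


def route(probs, threshold):
    """Act only when exactly one class passes the threshold; otherwise escalate."""
    passing = [c for c, p in enumerate(probs) if 1.0 - p <= threshold]
    if len(passing) == 1:
        return ("act", passing[0])
    return ("escalate", passing)
```

With this score, an empty set means no class is plausible enough and a set with several classes means the model cannot separate them; both cases are escalated rather than acted on.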

    LOW-DIMENSIONAL PROBABILISTIC DENSITY OF HIGH-DIMENSIONAL DATA MANIFOLD

    Publication number: US20230004694A1

    Publication date: 2023-01-05

    Application number: US17735884

    Filing date: 2022-05-03

    Abstract: A computer models high-dimensional data with a low-dimensional manifold in conjunction with a low-dimensional base probability density. A first transform (a manifold transform) may be used to transform the high-dimensional data to a low-dimensional manifold, and a second transform (a density transform) may be used to transform the low-dimensional manifold to a low-dimensional probability distribution. To enable the model to tractably learn the manifold transform between the high-dimensional and low-dimensional spaces, the manifold transform includes conformal flows, which simplify the probabilistic volume transform. This may also allow the manifold transform to be learned jointly with the density transform.
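
Why a conformal transform simplifies the volume term can be seen from the general change-of-variables formula for an injective map. The notation here is an assumption introduced for illustration and does not come from the abstract: $g:\mathbb{R}^m \to \mathbb{R}^n$ is the manifold transform with Jacobian $J_g$, $u = g^{-1}(x)$ is the low-dimensional coordinate of a point $x$ on the manifold, $p_U$ is the low-dimensional density, and $\lambda(u) > 0$ is the local scale factor of the conformal map.

```latex
\begin{aligned}
\log p_X(x) &= \log p_U(u) - \tfrac{1}{2}\log\det\!\bigl(J_g(u)^{\top} J_g(u)\bigr),
  \qquad u = g^{-1}(x), \\
J_g(u)^{\top} J_g(u) &= \lambda(u)^{2}\, I_m
  \quad\text{(conformal condition)}, \\
\log p_X(x) &= \log p_U(u) - m \log \lambda(u).
\end{aligned}
```

Under this condition the determinant of an $m \times m$ matrix reduces to a single scalar per point, which is what makes the likelihood cheap enough to optimize directly.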

    DETECTING MODEL DEVIATION WITH INTER-GROUP METRICS

    Publication number: US20250165375A1

    Publication date: 2025-05-22

    Application number: US18949310

    Filing date: 2024-11-15

    Abstract: A computer model is monitored during operation to evaluate performance of the model with respect to different groups evaluated by the model. Performance for each group is evaluated to determine an inter-group performance metric describing how model predictions differ across groups. A threshold for excess inter-group performance differences can be calibrated using withheld training data or out-of-time data to provide a statistical guarantee for detecting meaningful variation in inter-group performance metric differences. When the inter-group performance metric exceeds the threshold, the computer model may be considered to deviate from expected behavior, and the monitoring process can act to correct its operation, for example, by modifying actions that may otherwise occur due to model predictions or by initiating model retraining.
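
One way to picture this monitoring loop is sketched below. The abstract does not specify the metric or the calibration procedure, so this example assumes the metric is the largest gap in per-group accuracy and that the threshold is a conformal-style quantile of that gap over held-out reference windows. All function names are hypothetical.

```python
import math


def group_accuracy(records):
    """Per-group accuracy from (group, prediction, label) records."""
    correct, total = {}, {}
    for group, pred, label in records:
        total[group] = total.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + (pred == label)
    return {g: correct[g] / total[g] for g in total}


def inter_group_gap(records):
    """Assumed inter-group metric: best group accuracy minus worst."""
    acc = group_accuracy(records).values()
    return max(acc) - min(acc)


def calibrate_threshold(reference_windows, alpha):
    """Quantile of the gap over held-out windows (withheld or out-of-time data)."""
    gaps = sorted(inter_group_gap(w) for w in reference_windows)
    n = len(gaps)
    rank = math.ceil((n + 1) * (1.0 - alpha))
    return math.inf if rank > n else gaps[rank - 1]


def deviates(window, threshold):
    """True when the live window's gap exceeds the calibrated threshold."""
    return inter_group_gap(window) > threshold
```

A caller would react to `deviates(...)` returning `True` by, for example, suppressing automated actions or scheduling retraining, as the abstract describes.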

    IDENTIFYING AND MITIGATING DISPARATE GROUP IMPACT IN DIFFERENTIAL-PRIVACY MACHINE-LEARNED MODELS

    Publication number: US20230385443A1

    Publication date: 2023-11-30

    Application number: US18202435

    Filing date: 2023-05-26

    CPC classification number: G06F21/6245

    Abstract: A model evaluation system evaluates the extent to which privacy-aware training processes affect the direction of training gradients for groups. A modified differential-privacy (“DP”) training process provides per-sample gradient adjustments with parameters that may be adaptively modified for different data batches. Per-sample gradients are modified with respect to a reference bound and a clipping bound. A scaling factor may be determined for each per-sample gradient as the higher of the reference bound and the magnitude of the per-sample gradient. Per-sample gradients may then be adjusted based on a ratio of the clipping bound to the scaling factor. A relative privacy cost between groups may be determined as excess training risk based on a difference in group gradient direction relative to an unadjusted batch gradient and the adjusted batch gradient according to the privacy-aware training.
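
The per-sample adjustment described here can be sketched in a few lines. This is an illustration under stated assumptions: the gradient magnitude is taken to be the Euclidean norm, the reference bound is assumed positive, and the function names are hypothetical. Noise addition, which a DP training process would normally include, is left out because the abstract does not describe it.

```python
import math


def adjust_gradient(grad, reference_bound, clipping_bound):
    """Scale one per-sample gradient by clipping_bound / max(reference_bound, norm)."""
    norm = math.sqrt(sum(g * g for g in grad))
    scaling_factor = max(reference_bound, norm)  # assumes reference_bound > 0
    ratio = clipping_bound / scaling_factor
    return [g * ratio for g in grad]


def adjusted_batch_gradient(per_sample_grads, reference_bound, clipping_bound):
    """Average of the adjusted per-sample gradients (no noise added in this sketch)."""
    adjusted = [adjust_gradient(g, reference_bound, clipping_bound) for g in per_sample_grads]
    dim = len(adjusted[0])
    return [sum(g[i] for g in adjusted) / len(adjusted) for i in range(dim)]
```

When the reference bound equals the clipping bound, the ratio becomes `min(1, clipping_bound / norm)`, which is ordinary gradient clipping. A smaller reference bound also rescales gradients whose norm is below the clipping bound.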

    DETECTING MODEL DEVIATION WITH INTER-GROUP METRICS

    Publication number: US20250165866A1

    Publication date: 2025-05-22

    Application number: US18949328

    Filing date: 2024-11-15

    Abstract: A computer model is monitored during operation to evaluate performance of the model with respect to different groups evaluated by the model. Performance for each group is evaluated to determine an inter-group performance metric describing how model predictions differ across groups. A threshold for excess inter-group performance differences can be calibrated using withheld training data or out-of-time data to provide a statistical guarantee for detecting meaningful variation in inter-group performance metric differences. When the inter-group performance metric exceeds the threshold, the computer model may be considered to deviate from expected behavior, and the monitoring process can act to correct its operation, for example, by modifying actions that may otherwise occur due to model predictions or by initiating model retraining.

    OUT-OF-DISTRIBUTION DETECTION WITH GENERATIVE MODELS

    Publication number: US20250103961A1

    Publication date: 2025-03-27

    Application number: US18893616

    Filing date: 2024-09-23

    Abstract: Generative models are used to determine whether a data sample is in-distribution or out-of-distribution with respect to a training data set. To address potential errors in generative models that attribute high likelihoods to known out-of-distribution data samples, the local intrinsic dimensionality is evaluated for a data sample in addition to its likelihood. A data sample is determined to belong to the distribution of the training data when it has both sufficient likelihood and sufficient local intrinsic dimensionality around its region in the generative model. Different actions may then be determined for the data sample with respect to a data application model based on whether the data sample is in-distribution or out-of-distribution.
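
The two-part test in this abstract amounts to a conjunction of two thresholds, sketched below. How the likelihood and the local intrinsic dimensionality (LID) are estimated is not stated in the abstract, so both are treated as precomputed inputs. Setting the cutoffs as low quantiles of the values observed on training data is an assumption made for illustration, and the function names are hypothetical.

```python
def lower_quantile(values, q):
    """Simple empirical quantile (no interpolation) used to set a cutoff."""
    ordered = sorted(values)
    return ordered[int(q * (len(ordered) - 1))]


def is_in_distribution(log_likelihood, lid, min_log_likelihood, min_lid):
    """In-distribution only if BOTH likelihood and LID are high enough."""
    return log_likelihood >= min_log_likelihood and lid >= min_lid


def choose_action(log_likelihood, lid, min_log_likelihood, min_lid):
    """Route the sample: pass it to the application model or flag it."""
    if is_in_distribution(log_likelihood, lid, min_log_likelihood, min_lid):
        return "apply_model"
    return "flag_out_of_distribution"
```

The second condition is what catches the failure case named in the abstract: a sample with high likelihood but low LID is still treated as out-of-distribution.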

    DISTRIBUTED MODEL TRAINING WITH COLLABORATION WEIGHTS FOR PRIVATE DATA SETS

    Publication number: US20230385694A1

    Publication date: 2023-11-30

    Application number: US18202459

    Filing date: 2023-05-26

    CPC classification number: G06N20/00

    Abstract: Model training systems collaborate on model training without revealing their respective private data sets. For each private data set, a set of client weights is learned over a set of computer models that are also learned during training. Inference for a particular private data set is determined as a mixture of the computer model parameters according to the client weights. During training, at each iteration, the client weights are updated in one step based on how well sampled models represent the private data set. In another step, gradients are determined for each sampled model and may be weighted according to the client weight for that model, relatively increasing the gradient contribution of a private data set for model parameters that correspond more closely to that private data set.
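
The mixture and gradient-weighting steps can be illustrated with plain lists standing in for model parameters. This is only a sketch: the abstract does not give the client-weight update rule, so that step is omitted, and the names `mix_parameters` and `weighted_gradient` are hypothetical.

```python
def mix_parameters(models, client_weights):
    """Client-specific parameters as a weighted average of the shared models."""
    total = sum(client_weights)
    dim = len(models[0])
    return [
        sum(w * m[i] for w, m in zip(client_weights, models)) / total
        for i in range(dim)
    ]


def weighted_gradient(grad, client_weight):
    """Scale a client's gradient for one model by its weight for that model."""
    return [client_weight * g for g in grad]
```

For example, a client with weights `[0.25, 0.75]` over two models contributes three times as much gradient to the second model as to the first, and its inference parameters sit correspondingly closer to the second model.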
