DATA DIFFERENCE EVALUATION VIA MODEL COMPARISON

    Publication No.: US20250117443A1

    Publication date: 2025-04-10

    Application No.: US18482975

    Filing date: 2023-10-09

    Abstract: A computer-implemented method for performing data difference evaluation is provided. Aspects include obtaining a first data set and a second data set, creating a first plurality of feature vectors by inputting the first data set into each of a plurality of models, and creating a second plurality of feature vectors by inputting the second data set into each of the plurality of models. Aspects also include identifying a mapping between elements of the first plurality of feature vectors and elements of the second plurality of feature vectors created by a same model of the plurality of models, calculating, for each of the plurality of models based at least in part on the mapping, a model distance between the first data set and the second data set, and calculating, based at least in part on the model distances, an ensemble distance between the first data set and the second data set.
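
The flow in this abstract can be illustrated with a minimal sketch. It is not the patented method: the nearest-neighbour mapping, the mean-based model distance, the unweighted ensemble average, and the toy `models` are all assumptions chosen for illustration, and each model is applied per record rather than to the data set as a whole.

```python
import math
from statistics import mean


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def model_distance(vectors_a, vectors_b):
    # Assumed mapping: each vector from A is paired with its nearest vector in B.
    # Assumed model distance: the mean length of those pairings.
    return mean(min(euclidean(u, v) for v in vectors_b) for u in vectors_a)


def ensemble_distance(data_a, data_b, models):
    # One model distance per model, each computed from vectors made by the same model.
    per_model = [
        model_distance([m(x) for x in data_a], [m(x) for x in data_b])
        for m in models
    ]
    # Assumed aggregation: unweighted mean of the model distances.
    return mean(per_model)


# Hypothetical "models": toy functions mapping a record to a feature vector.
models = [lambda x: (x, x * x), lambda x: (abs(x),)]
first = [1.0, 2.0, 3.0]
second = [11.0, 12.0, 13.0]
print(ensemble_distance(first, first, models))
print(ensemble_distance(first, second, models))
```

Identical data sets give a distance of zero under these assumptions, and the value grows as the feature vectors of the two sets move apart.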

    Web smart exploration and management in browser

    Publication No.: US11748436B2

    Publication date: 2023-09-05

    Application No.: US17483714

    Filing date: 2021-09-23

    CPC classification number: G06F16/9574 G06F16/955 G06F16/9535

    Abstract: In an approach for detecting web browsing subject-oriented event interactions and intelligently organizing web pages based on insights from important interactions for better exploration and efficient management, a processor extracts time series data associated with a plurality of web browsing events based on browsing historical actions of a user. A processor identifies the subject of each web browsing event. A processor determines major events based on the time series data and subjects of the plurality of web browsing events. A processor organizes the plurality of web browsing events based on subject hierarchy and timeline from the time series data. A processor highlights one or more uniform resource locators based on the subject hierarchy and timeline.
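
The organizing step in this abstract might look roughly like the sketch below. The event tuple layout, the flat subject labels (standing in for a subject hierarchy), and the visit-count rule for deciding which URLs to highlight are all hypothetical; the abstract does not specify them.

```python
from collections import defaultdict


def organize(events, major_min_visits=2):
    """events: list of (timestamp, subject, url) tuples (assumed layout)."""
    events = sorted(events)  # timeline order by timestamp
    by_subject = defaultdict(list)
    visits = defaultdict(int)
    for timestamp, subject, url in events:
        by_subject[subject].append((timestamp, url))
        visits[url] += 1
    # Assumed rule: a URL is highlighted once it has been visited often enough.
    highlighted = {url for url, n in visits.items() if n >= major_min_visits}
    return dict(by_subject), highlighted


history = [
    (3, "ml", "a.example/kmeans"),
    (1, "ml", "b.example/intro"),
    (2, "news", "c.example/today"),
    (4, "ml", "a.example/kmeans"),
]
grouped, highlighted = organize(history)
print(grouped["ml"])
print(highlighted)
```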

    INCREMENTAL MACHINE LEARNING FOR A PARAMETRIC MACHINE LEARNING MODEL

    Publication No.: US20230137184A1

    Publication date: 2023-05-04

    Application No.: US17453540

    Filing date: 2021-11-04

    Abstract: A method, system, and computer program product for incremental machine learning for a parametric machine learning model are disclosed. The method may include processing samples comprising historical samples and new samples with an existing parametric machine learning model to obtain at least one prediction residual of each of the samples, wherein the existing parametric machine learning model was trained based on the historical samples. The method may further include clustering the samples based on the at least one prediction residual of each of the samples and features of each of the samples. The method may further include sampling samples in each cluster to ensure that each cluster includes a substantially similar number of sampled samples. The method may further include updating the existing parametric machine learning model to obtain an updated parametric machine learning model based on sampled samples in each cluster.
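
A small sketch of the residual, cluster, sample, update loop follows. It assumes a one-feature linear model stored as `(w, b)`, uses grid cells over (feature, residual) as a stand-in for a real clustering algorithm, and updates the model by an ordinary least-squares refit; none of these choices come from the abstract.

```python
import random
from collections import defaultdict


def residuals(model, samples):
    w, b = model
    return [y - (w * x + b) for x, y in samples]


def cluster(samples, res, x_bin=5.0, r_bin=5.0):
    # Stand-in clustering: bucket each sample by its feature and its residual.
    clusters = defaultdict(list)
    for (x, y), r in zip(samples, res):
        clusters[(int(x // x_bin), int(r // r_bin))].append((x, y))
    return clusters


def balanced_sample(clusters, per_cluster, seed=0):
    # Draw at most per_cluster samples from every cluster.
    rng = random.Random(seed)
    picked = []
    for members in clusters.values():
        picked.extend(rng.sample(members, min(per_cluster, len(members))))
    return picked


def refit(samples):
    # Closed-form simple linear regression on the sampled points.
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples)
    sxy = sum((x - mx) * (y - my) for x, y in samples)
    w = sxy / sxx
    return w, my - w * mx


existing_model = (2.0, 0.0)  # hypothetical model trained on historical samples
samples = [(float(i), 3.0 * i + 1.0) for i in range(20)]  # historical plus new
clusters = cluster(samples, residuals(existing_model, samples))
picked = balanced_sample(clusters, per_cluster=2)
updated_model = refit(picked)
print(updated_model)
```

Capping the draw per cluster keeps densely populated regions from dominating the update, which is the apparent purpose of the balanced sampling step.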

    Feature Generation for Training Data Sets Based on Unlabeled Data

    Publication No.: US20230073137A1

    Publication date: 2023-03-09

    Application No.: US17447258

    Filing date: 2021-09-09

    Abstract: A computer-implemented method for machine learning model training is provided. A number of processor units creates a cluster model comprising labeled samples and unlabeled samples. The number of processor units identifies cluster information for the labeled samples from the cluster model. The number of processor units adds a set of new features to a set of original features for the labeled samples using the cluster information to form an extended set of features for the labeled samples, wherein the labeled samples with the set of original features and the set of new features form a training data set for training a machine learning model.
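
As a rough illustration, the sketch below clusters labeled and unlabeled samples together and appends cluster information to each labeled sample. The tiny k-means with naive initialization, and the choice of cluster id plus distance to centroid as the new features, are assumptions made for this example only.

```python
import math


def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def kmeans(points, k, iters=10):
    centroids = [tuple(p) for p in points[:k]]  # naive deterministic init
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def extend_features(labeled, unlabeled, k=2):
    # The cluster model is built from labeled and unlabeled samples together.
    centroids = kmeans([x for x, _ in labeled] + list(unlabeled), k)
    extended = []
    for x, label in labeled:
        cid = min(range(k), key=lambda i: dist(x, centroids[i]))
        # Assumed new features: cluster id and distance to that cluster's centroid.
        extended.append((tuple(x) + (cid, dist(x, centroids[cid])), label))
    return extended


labeled = [((0.0, 0.0), "a"), ((10.0, 10.0), "b")]
unlabeled = [(0.0, 1.0), (1.0, 0.0), (10.0, 9.0), (9.0, 10.0)]
training_set = extend_features(labeled, unlabeled)
print(training_set)
```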

    ARTIFICIAL INTELLIGENCE MODEL GENERATION USING DATA WITH DESIRED DIAGNOSTIC CONTENT

    Publication No.: US20220101044A1

    Publication date: 2022-03-31

    Application No.: US17035816

    Filing date: 2020-09-29

    Abstract: A computer receives a general predictive model and training data. The computer builds a clustering feature tree model to condense the training data into data groups. The computer applies a leave-one-out evaluation method to determine an impact value for each data group with regard to said general predictive model. The computer identifies a diagnostic category for each data group selected from a list of categories including model-harmful data, model-neutral data, and model-helping data, in accordance with said impact value. The computer removes data in groups labelled as model-harmful from the training data and builds a modified general predictive model based on data in groups labelled as model-neutral or model-helping.
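
The leave-one-out diagnosis can be sketched as follows. The data groups are given directly instead of being produced by a clustering feature tree, the "model" is a toy mean predictor, and the impact value is taken to be the change in validation error when a group is left out; these are illustrative assumptions, and the sketch needs at least two groups.

```python
from statistics import mean


def fit(values):
    # Toy "model": always predicts the mean of its training values.
    return mean(values)


def error(model, validation):
    return mean((v - model) ** 2 for v in validation)


def diagnose(groups, validation, tol=1e-3):
    everything = [v for g in groups.values() for v in g]
    base = error(fit(everything), validation)
    labels = {}
    for name in groups:
        rest = [v for n, g in groups.items() if n != name for v in g]
        # Assumed impact value: error without the group minus error with all data.
        impact = error(fit(rest), validation) - base
        if impact > tol:
            labels[name] = "model-helping"   # removing the group hurts
        elif impact < -tol:
            labels[name] = "model-harmful"   # removing the group helps
        else:
            labels[name] = "model-neutral"
    return labels


def rebuild(groups, labels):
    kept = [v for n, g in groups.items() if labels[n] != "model-harmful" for v in g]
    return fit(kept)


groups = {"g1": [1.0, 1.0], "g2": [1.0, 1.0], "outliers": [9.0]}
validation = [1.0, 1.0]
labels = diagnose(groups, validation)
print(labels)
print(rebuild(groups, labels))
```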

    ARTIFICIAL DATA GENERATION FOR DIFFERENTIAL PRIVACY

    Publication No.: US20250131116A1

    Publication date: 2025-04-24

    Application No.: US18490914

    Filing date: 2023-10-20

    Abstract: An embodiment configures a plurality of parameters, the parameters being usable to generate artificial data from original data, the configuring adjusting a level of privacy in the artificial data. An embodiment fits a distribution type to a variable of the original data. An embodiment adjusts, using a desired level of privacy and the distribution type, a level of noise, wherein the level of noise corresponds to the desired level of privacy. An embodiment generates, using the distribution type and the level of noise, the artificial data, the artificial data achieving the desired level of privacy by including noise data corresponding to the level of noise.
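
One way to picture the fit, adjust, generate sequence is the sketch below. It assumes a normal distribution as the fitted distribution type, a privacy parameter `epsilon`, and Laplace noise with scale `sensitivity / epsilon`; these are borrowed from common differential privacy practice and are not stated in the abstract. The sketch alone does not establish a formal privacy guarantee.

```python
import random
from statistics import mean, stdev


def noise_scale(epsilon, sensitivity=1.0):
    # Assumed rule: a smaller epsilon (more privacy) gives a larger noise scale.
    return sensitivity / epsilon


def generate(original, n, epsilon, seed=0):
    rng = random.Random(seed)
    mu, sigma = mean(original), stdev(original)  # fit a normal distribution
    scale = noise_scale(epsilon)
    artificial = []
    for _ in range(n):
        # The difference of two exponential draws is Laplace noise with this scale.
        noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        artificial.append(rng.gauss(mu, sigma) + noise)
    return artificial


original = [1.0, 2.0, 3.0, 4.0]
print(generate(original, n=5, epsilon=1.0))
```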
