-
Publication No.: US20250117443A1
Publication Date: 2025-04-10
Application No.: US18482975
Filing Date: 2023-10-09
Applicant: International Business Machines Corporation
Inventor: Lei Tian, Han Zhang, Jing James Xu, Xue Ying Zhang, Si Er Han
IPC: G06F18/2325
Abstract: A computer-implemented method for performing data difference evaluation is provided. Aspects include obtaining a first data set and a second data set, creating a first plurality of feature vectors by inputting the first data set into each of a plurality of models, and creating a second plurality of feature vectors by inputting the second data set into each of the plurality of models. Aspects also include identifying a mapping between elements of the first plurality of feature vectors and elements of the second plurality of feature vectors created by a same model of the plurality of models, calculating, for each of the plurality of models based at least in part on the mapping, a model distance between the first data set and the second data set, and calculating, based at least in part on the model distances, an ensemble distance between the first data set and the second data set.
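Illustrative sketch (not taken from the patent): the abstract above describes per-model distances that are combined into an ensemble distance. The Python below shows one way that flow could look. The two "models", the position-based mapping between vector elements, the Euclidean distance, and the simple averaging are all assumptions chosen for brevity.

```python
import math
from statistics import mean

# Hypothetical "models": each maps a data set (a list of numbers) to a feature vector.
def summary_model(data):
    return [mean(data), max(data) - min(data)]

def quantile_model(data):
    ordered = sorted(data)
    return [ordered[0], ordered[len(ordered) // 2], ordered[-1]]

MODELS = [summary_model, quantile_model]

def model_distance(vec_a, vec_b):
    # Assumption: elements are mapped by position and compared with Euclidean distance.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(vec_a, vec_b)))

def ensemble_distance(data_a, data_b, models=MODELS):
    # One distance per model, then an unweighted average (assumed aggregation rule).
    distances = [model_distance(m(data_a), m(data_b)) for m in models]
    return mean(distances)
```

Identical data sets yield an ensemble distance of zero under these assumptions. A weighted average could replace `mean` if some models are considered more reliable than others.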
-
Publication No.: US20250094267A1
Publication Date: 2025-03-20
Application No.: US18368656
Filing Date: 2023-09-15
Applicant: International Business Machines Corporation
Inventor: Jun Wang, Jing Xu, Xiao Ming Ma, Xue Ying Zhang, Si Er Han, Jing James Xu, Wen Pei Yu
IPC: G06F11/07
Abstract: A time series anomaly detection method, system, and computer program product that processes time series data includes absorbing profiles of the time series data and anomaly types of a model as features, optimizing biased ranks to create optimized ranks through merging initial ranks with new ranks generated by real anomalies, and auto-suggesting the optimized ranks for saving a predetermined amount of data operation.
-
Publication No.: US11847539B2
Publication Date: 2023-12-19
Application No.: US17370071
Filing Date: 2021-07-08
Applicant: International Business Machines Corporation
Inventor: Ning Zhang, Yi Shao, Jing Xu, Xue Ying Zhang, Na Zhao
Abstract: An approach is provided in which the approach trains a first machine learning model using a set of features corresponding to a set of build blocks. The set of build blocks include at least one dependency build block and at least one artifact package build block. The approach predicts a set of risk values of the set of build blocks using the trained first machine learning model, and marks at least one of the build blocks as a bottleneck in response to comparing the set of risk values against a risk threshold.
-
Publication No.: US11783177B2
Publication Date: 2023-10-10
Application No.: US16574163
Filing Date: 2019-09-18
Applicant: International Business Machines Corporation
Inventor: Damir Spisic, Jing Xu, Xue Ying Zhang, Xing Wei
IPC: G06N3/08, G06F18/243, G06N3/047, G06N3/048
CPC classification number: G06N3/08, G06F18/24323, G06N3/047, G06N3/048
Abstract: A set of classifiable data containing a plurality of classes is ingested. A target class within the plurality of classes is determined. Using the set of classifiable data, an interactive recall rate chart is generated, and the interactive recall rate chart shows a set of target class recall rates against a set of class recall rates for the remainder of the plurality of classes. The interactive recall rate chart is presented to a user. A target class recall rate selection from the set of target class recall rates is received from the user. The set of classifiable data is reclassified, based on the target class recall rate selection.
-
Publication No.: US11748436B2
Publication Date: 2023-09-05
Application No.: US17483714
Filing Date: 2021-09-23
Applicant: International Business Machines Corporation
Inventor: Jun Wang, Xue Ying Zhang, Song Bo, Dong Hai Yu, Jing James Xu
IPC: G06F16/957, G06F16/955, G06F16/9535
CPC classification number: G06F16/9574, G06F16/955, G06F16/9535
Abstract: In an approach for detecting web browsing subject-oriented event interactions and intelligently organizing web pages based on insights from important interactions for better exploration and efficient management, a processor extracts time series data associated with a plurality of web browsing events based on browsing historical actions of a user. A processor identifies the subject of each web browsing event. A processor determines major events based on the time series data and subjects of the plurality of web browsing events. A processor organizes the plurality of web browsing events based on subject hierarchy and timeline from the time series data. A processor highlights one or more uniform resource locators based on the subject hierarchy and timeline.
-
Publication No.: US20230185879A1
Publication Date: 2023-06-15
Application No.: US17644350
Filing Date: 2021-12-15
Applicant: International Business Machines Corporation
Inventor: Si Er Han, Xue Ying Zhang, Jing Xu, Xiao Ming Ma, Ji Hui Yang
CPC classification number: G06K9/6228, G06K9/6261, G06K9/6262, G06N20/00
Abstract: A computer implemented technique including: splitting data of a historical time series data set into subsets; updating a time series model by backwards data selection to obtain an interim version of the time series model; exploring pattern changes in the new data to obtain new predictors of pattern change; and updating the interim version of the time series model by applying the new predictors of pattern change to obtain an updated version of the time series model.
-
Publication No.: US20230137184A1
Publication Date: 2023-05-04
Application No.: US17453540
Filing Date: 2021-11-04
Applicant: International Business Machines Corporation
Inventor: Si Er Han, Ji Hui Yang, Xiao Ming Ma, Jing Xu, Xue Ying Zhang
Abstract: A method, system, and computer program product for incremental machine learning for a parametric machine learning model are disclosed. The method may include processing samples comprising historical samples and new samples with an existing parametric machine learning model to obtain at least one prediction residual of each of the samples, wherein the existing parametric machine learning model was trained based on the historical samples. The method may further include clustering the samples based on the at least one prediction residual of each of the samples and features of each of the samples. The method may further include sampling samples in each cluster to ensure that each cluster includes a substantially similar number of sampled samples. The method may further include updating the existing parametric machine learning model to obtain an updated parametric machine learning model based on the sampled samples in each cluster.
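Illustrative sketch (not taken from the patent): the residual, cluster, and balanced-sampling steps described above could be arranged as follows. The linear `predict` stand-in, the `cluster_key` bucketing (used here in place of a real clustering algorithm), and the `per_cluster` cap are hypothetical simplifications. The final model update is left out.

```python
import random
from collections import defaultdict

def predict(x, slope=2.0, intercept=0.0):
    # Stand-in for the existing parametric model (hypothetical linear model).
    return slope * x + intercept

def cluster_key(x, residual, width=1.0):
    # Simplification: bucket by feature sign and residual size
    # instead of running a real clustering algorithm.
    return (x >= 0, int(abs(residual) // width))

def balanced_sample(samples, per_cluster, seed=0):
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for x, y in samples:
        residual = y - predict(x)  # prediction residual of each sample
        clusters[cluster_key(x, residual)].append((x, y))
    picked = []
    for members in clusters.values():
        # Cap each cluster so that no cluster dominates the update set.
        k = min(per_cluster, len(members))
        picked.extend(rng.sample(members, k))
    return picked
```

The returned samples would then be used to refit the model; capping each cluster keeps well-predicted historical samples from outnumbering the poorly predicted ones.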
-
Publication No.: US20230073137A1
Publication Date: 2023-03-09
Application No.: US17447258
Filing Date: 2021-09-09
Applicant: International Business Machines Corporation
Inventor: Jing Xu, Si Er Han, Xue Ying Zhang, Steven George Barbee, Ji Hui Yang
Abstract: A computer implemented method for machine learning model training. A number of processor units creates a cluster model comprising labeled samples and unlabeled samples. The number of processor units identifies cluster information for the labeled samples from the cluster model. The number of processor units adds a set of new features to a set of original features for the labeled samples using the cluster information to form an extended set of features for the labeled samples, wherein the labeled samples with the set of original features and the set of new features form a training data set for training a machine learning model.
-
Publication No.: US20220101044A1
Publication Date: 2022-03-31
Application No.: US17035816
Filing Date: 2020-09-29
Applicant: International Business Machines Corporation
Inventor: Jing Xu, Xue Ying Zhang, Si Er Han, Xiao Ming Ma, Ji Hui Yang
Abstract: A computer receives a general predictive model and training data. The computer builds a clustering feature tree model to condense the training data into data groups. The computer applies a leave-one-out evaluation method to determine an impact value for each data group with regard to said general predictive model. The computer identifies a diagnostic category for each data group selected from a list of categories including model-harmful data, model-neutral data, and model-helping data, in accordance with said impact value. The computer removes data in groups labelled as model-harmful from the training data and builds a modified general predictive model based on data in groups labelled as model-neutral or model-helping.
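Illustrative sketch (not taken from the patent): a leave-one-out evaluation over data groups might look like the Python below. It assumes the groups are already formed (the clustering feature tree step is omitted), uses a mean predictor as a stand-in model, and introduces a hypothetical `tolerance` to separate the three categories. At least two groups are required.

```python
from statistics import mean

def fit(values):
    # Stand-in "general predictive model": always predicts the training mean.
    return mean(values)

def error(model, validation):
    # Mean absolute error of the constant prediction on held-out values.
    return mean(abs(v - model) for v in validation)

def diagnose_groups(groups, validation, tolerance=0.05):
    all_values = [v for g in groups.values() for v in g]
    baseline = error(fit(all_values), validation)
    labels = {}
    for name in groups:
        # Leave this group out and refit on the remaining groups.
        rest = [v for other, g in groups.items() if other != name for v in g]
        impact = error(fit(rest), validation) - baseline
        if impact < -tolerance:
            labels[name] = "model-harmful"   # error drops when the group is removed
        elif impact > tolerance:
            labels[name] = "model-helping"   # error rises when the group is removed
        else:
            labels[name] = "model-neutral"
    return labels
```

A modified model would then be rebuilt from every group whose label is not "model-harmful".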
-
Publication No.: US20250131116A1
Publication Date: 2025-04-24
Application No.: US18490914
Filing Date: 2023-10-20
Applicant: International Business Machines Corporation
Inventor: Si Er Han, Jing Xu, Xiao Ming Ma, Jing James Xu, Jiang Bo Kang, Xue Ying Zhang, Jun Wang, Ji Hui Yang
IPC: G06F21/62
Abstract: An embodiment configures a plurality of parameters, the parameters being usable to generate artificial data from original data, the configuring adjusting a level of privacy in the artificial data. An embodiment fits a distribution type to a variable of the original data. An embodiment adjusts, using a desired level of privacy and the distribution type, a level of noise, wherein the level of noise corresponds to the desired level of privacy. An embodiment generates, using the distribution type and the level of noise, the artificial data, the artificial data achieving the desired level of privacy by including noise data corresponding to the level of noise.
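Illustrative sketch (not taken from the patent): the fit, noise-adjust, and generate steps described above can be pictured with the Python below. It assumes a single numeric variable, fits only a normal distribution, and maps a hypothetical `privacy` value in [0, 1] linearly to the noise scale; the actual embodiment may differ on all three points.

```python
import random
from statistics import mean, stdev

def generate_artificial(values, privacy, n, seed=0):
    rng = random.Random(seed)
    # Fit a distribution to the original variable (normal distribution assumed).
    mu, sigma = mean(values), stdev(values)
    # Assumed mapping: higher privacy means proportionally more noise.
    noise_sigma = privacy * sigma
    # Draw from the fitted distribution, then add noise at the chosen level.
    return [rng.gauss(mu, sigma) + rng.gauss(0.0, noise_sigma) for _ in range(n)]
```

With `privacy=0` the output follows the fitted distribution with no added noise; larger values spread the artificial data further from the original distribution.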