Intelligent Identification of an Execution Environment

    Publication No.: US20220326982A1

    Publication date: 2022-10-13

    Application No.: US17225427

    Filing date: 2021-04-08

    Abstract: Mechanisms are provided for intelligently identifying an execution environment to execute a computing job. An execution time of the computing job in each execution environment of a plurality of execution environments is predicted by applying a set of existing machine learning models matching execution context information and key parameters of the computing job and execution environment information of the execution environment. The predicted execution time of the machine learning models is aggregated. The aggregated predicted execution times of the computing job are summarized for the plurality of execution environments. Responsive to a selection of an execution environment from the plurality of execution environments based on the summary of the aggregated predicted execution times of the computing job, the computing job is executed in the selected execution environment. Related data during the execution of the computing job in the selected execution environment is collected.
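
The prediction, aggregation, and summary steps in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the models' predictions are aggregated or how the final selection is made, so the arithmetic mean and the "pick the fastest" rule below are assumptions. The two toy models and the names `aggregate_prediction`, `summarize`, and `pick_fastest` are hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical stand-in for a trained model: maps (job, environment) to seconds.
Model = Callable[[dict, dict], float]


def aggregate_prediction(models: List[Model], job: dict, env: dict) -> float:
    """Aggregate the per-model predictions (assumption: arithmetic mean)."""
    return mean([model(job, env) for model in models])


def summarize(models: List[Model], job: dict, envs: Dict[str, dict]) -> Dict[str, float]:
    """Build the per-environment summary of aggregated predicted times."""
    return {name: aggregate_prediction(models, job, env) for name, env in envs.items()}


def pick_fastest(summary: Dict[str, float]) -> str:
    """Assumed selection rule: the environment with the lowest predicted time."""
    return min(summary, key=summary.get)


# Toy models, invented for illustration only.
def model_a(job: dict, env: dict) -> float:
    return job["rows"] / (env["cores"] * 1000.0)


def model_b(job: dict, env: dict) -> float:
    return job["rows"] / (env["cores"] * 800.0) + 1.0


job = {"rows": 16000}
envs = {"small": {"cores": 2}, "large": {"cores": 8}}
summary = summarize([model_a, model_b], job, envs)
chosen = pick_fastest(summary)
```

In the abstract the selection may equally be made by a user who reads the summary; the automatic rule here is only one possible policy.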

    HIGH DIMENSIONAL CLUSTERS PROFILE GENERATION
    Document type: Patent application

    Publication No.: US20170147675A1

    Publication date: 2017-05-25

    Application No.: US14945853

    Filing date: 2015-11-19

    CPC classification number: G06F16/35

    Abstract: Refining cluster definition: (i) receiving data items, each characterized by values respectively corresponding to a set of dimension(s); (ii) receiving initial cluster identification that divides the set of data items into multiple initial clusters; (iii) determining a distribution curve, with respect to a first dimension, of data items of a first initial cluster; (iv) determining a distribution curve, with respect to the first dimension, of data items of a second initial cluster; and (v) determining a first-dimension-first-cluster-second-cluster cut-off value such that the following two proportions are substantially equal: (a) a proportion of the area under the first distribution curve and below the first-dimension-first-cluster-second-cluster cut-off value to the total area under the first distribution curve, and (b) a proportion of the area under the second distribution curve and above the first-dimension-first-cluster-second-cluster cut-off value to the total area under the second distribution curve.
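
Step (v) of this abstract amounts to finding a cut-off c at which the fraction of the first cluster's distribution below c equals the fraction of the second cluster's distribution above c, that is, F1(c) = 1 - F2(c). The sketch below solves this by bisection. Modeling each distribution curve as a normal distribution is an assumption made only to keep the example self-contained, and `balanced_cutoff` is a hypothetical name.

```python
from statistics import NormalDist


def balanced_cutoff(first: NormalDist, second: NormalDist,
                    lo: float, hi: float, tol: float = 1e-9) -> float:
    """Find c in [lo, hi] where first.cdf(c) == 1 - second.cdf(c)."""

    def gap(c: float) -> float:
        # Proportion of the first curve below c, minus proportion of the second above c.
        return first.cdf(c) - (1.0 - second.cdf(c))

    # gap() is non-decreasing in c, so a sign change brackets the cut-off.
    if gap(lo) > 0 or gap(hi) < 0:
        raise ValueError("search interval does not bracket the cut-off")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Two clusters along one dimension, with different spreads (invented values).
cutoff = balanced_cutoff(NormalDist(0.0, 1.0), NormalDist(6.0, 2.0), lo=-10.0, hi=20.0)
```

For two normal curves the cut-off also has the closed form c = (mu1 * sigma2 + mu2 * sigma1) / (sigma1 + sigma2), which gives 2.0 for the values above; bisection is used because it also works for empirical distribution curves.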

    DATA GENERATION PROCESS FOR MULTI-VARIABLE DATA

    Publication No.: US20250165492A1

    Publication date: 2025-05-22

    Application No.: US18513579

    Filing date: 2023-11-19

    Abstract: An example operation may include one or more of storing an original data set in memory, splitting the original data set into a subset of continuous-type data values and a subset of discrete-type data values based on variable types in the original data set, converting the subset of continuous-type data values into a second subset of discrete-type data values based on a data binning operation, generating a new subset of continuous-type data values based on the subset of continuous-type data values in the original data set, and combining a subset of discrete-type data values from a conditional contingency table within the new subset of continuous-type data values to generate a new data set.
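
The first two steps of this abstract, splitting by variable type and binning the continuous values, can be sketched as follows. The abstract names neither the type test nor the binning method, so treating `float` columns as continuous and using equal-width bins are assumptions. The later steps (generating new continuous values and combining them with the conditional contingency table) are not covered. `split_columns` and `equal_width_bins` are hypothetical names.

```python
from typing import Dict, List, Tuple


def split_columns(rows: List[dict]) -> Tuple[Dict[str, list], Dict[str, list]]:
    """Split columns into continuous (all-float) and discrete (everything else)."""
    continuous: Dict[str, list] = {}
    discrete: Dict[str, list] = {}
    for key in rows[0]:
        column = [row[key] for row in rows]
        is_continuous = all(isinstance(value, float) for value in column)
        (continuous if is_continuous else discrete)[key] = column
    return continuous, discrete


def equal_width_bins(values: List[float], bins: int) -> List[int]:
    """Map each value to a bin index in 0..bins-1 using equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid division by zero for constant columns
    return [min(int((value - lo) / width), bins - 1) for value in values]


# Invented sample data.
rows = [
    {"age": 21.0, "city": "Oslo"},
    {"age": 35.0, "city": "Rome"},
    {"age": 49.0, "city": "Oslo"},
    {"age": 63.0, "city": "Rome"},
]
continuous, discrete = split_columns(rows)
binned_age = equal_width_bins(continuous["age"], bins=3)
```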

    Privacy protection in a search process

    Publication No.: US12099628B2

    Publication date: 2024-09-24

    Application No.: US17661780

    Filing date: 2022-05-03

    CPC classification number: G06F21/6245 G06F16/35 G06F18/23

    Abstract: The present disclosure relates to privacy protection in a search process. According to a method, a target emotion vector is extracted from a search interaction, the target emotion vector representing emotional information in the search interaction. Respective emotion distances between the target emotion vector and respective emotion vectors associated with a plurality of text clusters are determined. The plurality of text clusters is clustered from a dictionary of text elements. A first number of text clusters are selected from the plurality of text clusters based on the determined respective emotion distances. The first number of text clusters have emotion distances larger than at least one unselected text cluster among the plurality of text clusters. A plurality of confused search interactions are constructed for the search interaction based on the first number of text clusters, and the plurality of confused search interactions are performed.
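
The cluster-selection step in this abstract, choosing the text clusters whose emotion vectors lie farthest from the target emotion vector, can be sketched as below. The abstract does not name a distance metric, so Euclidean distance is an assumption. The two-dimensional vectors and the name `farthest_clusters` are hypothetical. Building and issuing the confused search interactions is outside this sketch.

```python
import math
from typing import Dict, List, Sequence


def farthest_clusters(target: Sequence[float],
                      cluster_vectors: Dict[str, Sequence[float]],
                      count: int) -> List[str]:
    """Return the `count` clusters with the largest emotion distance from the target."""
    distances = {name: math.dist(target, vector) for name, vector in cluster_vectors.items()}
    return sorted(distances, key=distances.get, reverse=True)[:count]


# Invented emotion vectors for illustration.
target_emotion = (0.9, 0.1)
clusters = {
    "angry": (0.8, 0.2),
    "calm": (0.1, 0.9),
    "neutral": (0.5, 0.5),
}
selected = farthest_clusters(target_emotion, clusters, count=2)
```

Every selected cluster has a larger distance than the unselected one, which matches the condition stated in the abstract.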

    Identifying Node Importance in Machine Learning Pipelines

    Publication No.: US20230119654A1

    Publication date: 2023-04-20

    Application No.: US17451495

    Filing date: 2021-10-20

    Abstract: Identifying node importance in a machine learning pipeline is provided. Changes in accuracy of the machine learning pipeline are recorded for each respective node setting change in a randomly generated group of node settings inputted into each corresponding node included in the machine learning pipeline. A regression model is generated to determine a relationship between each respective node setting change in the randomly generated group of node settings inputted into each corresponding node and the changes in the accuracy of the machine learning pipeline. A node of importance is identified in the machine learning pipeline using the regression model based on the relationship between each respective node setting change in the randomly generated group of node settings inputted into each corresponding node and the changes in the accuracy of the machine learning pipeline.
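
A minimal sketch of the idea in this abstract: apply random setting changes to every node, record the resulting change in accuracy, fit a regression, and rank nodes by how strongly their setting changes relate to the accuracy changes. The abstract does not specify the regression model, so a separate one-variable least-squares fit per node is used here as a simplifying assumption. The toy pipeline and the names `slope` and `node_importance` are hypothetical.

```python
import random
from typing import Callable, Dict, List


def slope(xs: List[float], ys: List[float]) -> float:
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def node_importance(accuracy: Callable[[Dict[str, float]], float],
                    nodes: List[str], trials: int = 200, seed: int = 0) -> Dict[str, float]:
    """Score each node by |slope| of accuracy change versus its setting change."""
    rng = random.Random(seed)
    baseline = accuracy({node: 0.0 for node in nodes})
    changes: Dict[str, List[float]] = {node: [] for node in nodes}
    deltas: List[float] = []
    for _ in range(trials):
        setting = {node: rng.uniform(-1.0, 1.0) for node in nodes}  # random setting changes
        for node in nodes:
            changes[node].append(setting[node])
        deltas.append(accuracy(setting) - baseline)  # recorded change in accuracy
    return {node: abs(slope(changes[node], deltas)) for node in nodes}


# Toy pipeline, invented for illustration: the "scaler" node matters ten times more.
def toy_accuracy(setting: Dict[str, float]) -> float:
    return 0.8 + 0.1 * setting["scaler"] - 0.01 * setting["imputer"]


scores = node_importance(toy_accuracy, ["scaler", "imputer"])
most_important = max(scores, key=scores.get)
```

A joint multivariate regression would separate correlated effects more cleanly; per-node fits are adequate here only because the setting changes are drawn independently.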

    Explanative analysis for records with missing values

    Publication No.: US11520757B2

    Publication date: 2022-12-06

    Application No.: US17019383

    Filing date: 2020-09-14

    Abstract: Embodiments relate to a system, computer program product, and method for determining missing values in respective data records with an explanatory analysis to provide a context of the determined values. Such method includes receiving a dataset including incomplete data records that are missing predictors and complete data records. A model is trained with the complete data records and candidate predictors for the missing predictors are generated. A predictor importance value is generated for each candidate predictor and the candidate predictors that have a predictor importance value in excess of a first threshold value are promoted. Respective promoted candidate predictors are inserted into the respective incomplete data records, thereby creating tentative data records. The tentative data records are injected into the model, a fit value is determined for each of the tentative data records, and a tentative data record with a fit value exceeding a second threshold value is selected.

    Test case selection
    Document type: Granted patent

    Publication No.: US11288173B1

    Publication date: 2022-03-29

    Application No.: US17027780

    Filing date: 2020-09-22

    Abstract: Test case selection methods are disclosed. A feature of a candidate test case and respective features of a set of test cases are extracted. The set of test cases is clustered into a plurality of clusters based on the respective features of the set of test cases. At least one cluster related to the candidate test case is determined from the plurality of clusters based on the feature of the candidate test case. At least one test case similar to the candidate test case is selected from a plurality of test cases included in the at least one cluster.
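
The flow in this abstract (cluster the existing test cases by feature, find the cluster related to the candidate, then pick the most similar test case inside it) can be sketched as below. The abstract does not fix the feature type, similarity measure, or clustering algorithm, so keyword sets, Jaccard similarity, and a greedy "leader" clustering are all assumptions. The test names and the functions `jaccard`, `leader_cluster`, and `select_similar` are hypothetical.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def leader_cluster(features: Dict[str, Set[str]], threshold: float) -> List[List[str]]:
    """Greedy clustering: join the first cluster whose leader is similar enough."""
    clusters: List[List[str]] = []
    for name, feature in features.items():
        for cluster in clusters:
            if jaccard(feature, features[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])  # this test case becomes a new leader
    return clusters


def select_similar(candidate: Set[str], features: Dict[str, Set[str]],
                   threshold: float = 0.3, top_k: int = 1) -> List[str]:
    """Pick the top_k most similar test cases from the cluster nearest the candidate."""
    clusters = leader_cluster(features, threshold)
    nearest = max(clusters, key=lambda c: jaccard(candidate, features[c[0]]))
    ranked = sorted(nearest, key=lambda n: jaccard(candidate, features[n]), reverse=True)
    return ranked[:top_k]


# Invented test cases and features.
existing = {
    "test_login_ok": {"login", "password", "session"},
    "test_login_bad_password": {"login", "password", "error"},
    "test_export_csv": {"export", "csv", "file"},
    "test_export_pdf": {"export", "pdf", "file"},
}
picked = select_similar({"login", "error", "lockout"}, existing)
```

Restricting the final comparison to one cluster is what keeps the search cheap when the set of existing test cases is large.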

    Predictive maintenance utilizing supervised sequence rule mining

    Publication No.: US11150630B2

    Publication date: 2021-10-19

    Application No.: US15787732

    Filing date: 2017-10-19

    Abstract: Statistically significant event patterns predict the timing for performing entity maintenance. Event patterns are determined based on a target variable having an undesired value for a given entity when the event pattern occurs. Event patterns are filtered based on distributions of the event patterns across multiple entities and distributions of event patterns during desired operation of the entities and undesired operation of the entities. A predictive maintenance process is established having significant event patterns as the basis for maintenance tasks.
