Continuous feature-independent determination of features for deviation analysis

    Publication number: US11720579B2

    Publication date: 2023-08-08

    Application number: US17367882

    Filing date: 2021-07-06

    CPC classification number: G06F16/2462 G06F16/2457 G06F16/283

    Abstract: Systems and methods include determination, for each of a plurality of discrete features, of statistics based on a number of occurrences of each discrete value of the discrete feature in the data, determination of first summary statistics based on the determined statistics, determination of a dissimilarity for each discrete feature based on the first summary statistics and on the statistics determined for the discrete feature, determination of candidate discrete features based on the determined dissimilarities, determination, for each of the candidate discrete features, of second summary statistics based on values of a continuous feature associated with each discrete value of the candidate discrete feature, determination of a deviation score for each of the candidate discrete features based on the second summary statistics, and transmission of the candidate discrete features for display in association with the continuous feature based on the determined deviation scores.
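    The two-stage flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not name the statistics, so the sketch assumes Shannon entropy of the occurrence counts as the per-feature statistic, an absolute z-score against the mean over all features as the dissimilarity, a keep-if-below threshold for candidate selection, and the largest gap between a group mean and the overall mean as the deviation score. All function and parameter names are hypothetical.

```python
from collections import Counter
from math import log
from statistics import mean, pstdev


def count_entropy(values):
    """Shannon entropy (in nats) of the occurrence counts of each discrete value."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * log(c / total) for c in counts.values())


def select_candidates(rows, discrete_features, max_dissimilarity=1.0):
    """Stage 1: uses only occurrence counts, never the continuous feature."""
    # Per-feature statistic (assumed: entropy of the value counts).
    stats = {f: count_entropy([r[f] for r in rows]) for f in discrete_features}
    # First summary statistics, taken across all discrete features.
    mu = mean(list(stats.values()))
    sigma = pstdev(list(stats.values()))
    # Dissimilarity (assumed: absolute z-score of the feature's statistic).
    dissim = {f: (abs(s - mu) / sigma if sigma else 0.0) for f, s in stats.items()}
    # Assumed selection rule: keep features that are not outliers.
    return [f for f in discrete_features if dissim[f] <= max_dissimilarity]


def deviation_score(rows, feature, continuous):
    """Stage 2: largest gap between a per-value mean and the overall mean."""
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[continuous])
    overall = mean(r[continuous] for r in rows)
    return max(abs(mean(vals) - overall) for vals in groups.values())
```

    Under these assumptions, an identifier-like column whose values each occur once tends to receive a high dissimilarity and is dropped before the continuous feature is read at all. That early filtering is the apparent point of a first stage that does not depend on the continuous feature.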

    PREDICTION INTEGRATION FOR DATA MANAGEMENT PLATFORMS

    Publication number: US20200004891A1

    Publication date: 2020-01-02

    Application number: US16134043

    Filing date: 2018-09-18

    Abstract: Techniques are described for integrating prediction capabilities from data management platforms into applications. Implementations employ a data science platform (DSP) that operates in conjunction with a data management solution (e.g., a data hub). The DSP can be used to orchestrate data pipelines using various machine learning (ML) algorithms and/or data preparation functions. The data hub can also provide various orchestration and data pipelining capabilities to receive and handle data from various types of data sources, such as databases, data warehouses, other data storage solutions, internet-of-things (IoT) platforms, social networks, and/or other data sources. In some examples, users such as data engineers and/or others may use the implementations described herein to handle the orchestration of data into a data management platform.

    Determination of candidate features for deviation analysis

    Publication number: US11681715B2

    Publication date: 2023-06-20

    Application number: US17342812

    Filing date: 2021-06-09

    CPC classification number: G06F16/2462 G06F16/2465 G06F16/285

    Abstract: Systems and methods include determination, for each of a plurality of discrete features, of statistics for each discrete value of the discrete feature based on values of a continuous feature associated with the discrete value, determination, for each discrete feature, of first summary statistics based on the statistics determined for each discrete value of the discrete feature, determination, for each discrete feature, of a dissimilarity based on the first summary statistics determined for the discrete feature and on the statistics determined for each discrete value of the discrete feature, determination of candidate discrete features of the discrete features based on the determined dissimilarities, the candidate discrete features comprising less than all of the discrete features, determination, for each of the candidate discrete features, of second summary statistics based on values of the continuous feature associated with each discrete value of the candidate discrete feature, determination of a deviation score for each of the candidate discrete features based on the second summary statistics, and presentation of the candidate discrete features based on the determined deviation scores.
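    Here the first stage already uses the continuous feature. A rough Python sketch of that candidate-selection step follows. The abstract leaves the statistics unspecified, so the sketch assumes the per-value statistic is the mean of the continuous feature, the first summary statistic is the mean of those per-value means, and the dissimilarity is the largest absolute gap between the two. The names `value_means`, `dissimilarity`, `top_candidates`, and the cutoff `k` are hypothetical.

```python
from statistics import mean


def value_means(rows, feature, continuous):
    """Per-value statistic: mean of the continuous feature for each discrete value."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[continuous])
    return {value: mean(vals) for value, vals in groups.items()}


def dissimilarity(rows, feature, continuous):
    """Largest gap between a per-value mean and the feature's mean of means."""
    means = list(value_means(rows, feature, continuous).values())
    center = mean(means)  # first summary statistic for this feature (assumed)
    return max(abs(m - center) for m in means)


def top_candidates(rows, features, continuous, k):
    """Keep the k most dissimilar features; pick k < len(features) so that the
    candidates comprise fewer than all of the discrete features."""
    ranked = sorted(
        features,
        key=lambda f: dissimilarity(rows, f, continuous),
        reverse=True,
    )
    return ranked[:k]
```

    In this sketch, a feature whose values all share the same mean of the continuous feature scores zero and falls to the bottom of the ranking, so the more expensive deviation scoring would run only on the retained candidates.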

    Prediction integration for data management platforms

    Publication number: US11574019B2

    Publication date: 2023-02-07

    Application number: US16134043

    Filing date: 2018-09-18

    Abstract: Techniques are described for integrating prediction capabilities from data management platforms into applications. Implementations employ a data science platform (DSP) that operates in conjunction with a data management solution (e.g., a data hub). The DSP can be used to orchestrate data pipelines using various machine learning (ML) algorithms and/or data preparation functions. The data hub can also provide various orchestration and data pipelining capabilities to receive and handle data from various types of data sources, such as databases, data warehouses, other data storage solutions, internet-of-things (IoT) platforms, social networks, and/or other data sources. In some examples, users such as data engineers and/or others may use the implementations described herein to handle the orchestration of data into a data management platform.

    TOP CONTRIBUTOR RECOMMENDATION FOR CLOUD ANALYTICS

    Publication number: US20220382729A1

    Publication date: 2022-12-01

    Application number: US17329519

    Filing date: 2021-05-25

    Abstract: A system and method including determining, for a specified target measure column of a first dataset including a plurality of records, the metadata of the first dataset, including a probability distribution for the specified target column and dimension scores for the dimensions for the first dataset conditioned on the specified target measure column, where the first dataset comprises a plurality of columns including the at least one target measure column and a plurality of non-numeric, dimension columns for the records of the first dataset; determining, for a subset of data of the first dataset based on one or more specified variables, dimension scores for the dimensions of the subset of data approximately derived from the determined metadata of the first dataset; and providing recommendations of top contributors based on the approximated dimension scores of dimensions of the subset of data.
