Composite relationship discovery framework

    Publication No.: US11693879B2

    Publication date: 2023-07-04

    Application No.: US17324667

    Filing date: 2021-05-19

    CPC classification number: G06F16/26 G06F16/2465

    Abstract: Systems and methods include reception of a set of data including continuous features and a discrete feature, each continuous feature associated with a plurality of values and the discrete feature associated with a plurality of discrete values, determination, for each continuous feature, of a relationship factor representing a relationship between the discrete feature and the continuous feature based on the plurality of values associated with the continuous feature and the plurality of discrete values, identification of one of the continuous features associated with a largest one of the determined relationship factors, generation, for each of the other features, of a correlation factor representing a correlation between the continuous feature and the identified continuous feature, determination, for each of the continuous features other than the identified continuous feature, of a composite relationship score based on the relationship factor and the correlation factor associated with the feature, and presentation of a visualization associated with the discrete feature, the identified continuous feature, and a continuous feature associated with a largest composite relationship score.
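
The steps in this abstract can be sketched as follows. The abstract does not name the statistics involved, so every formula here is an assumption chosen for illustration: the relationship factor is taken to be the correlation ratio (eta squared), the correlation factor is the absolute Pearson correlation, and the composite score is `relationship * (1 - |correlation|)`, which favors features that relate to the discrete feature without duplicating the top feature. The function names are hypothetical.

```python
from statistics import mean

def correlation_ratio(labels, values):
    """Assumed relationship factor: share of variance explained by the labels."""
    overall = mean(values)
    total = sum((v - overall) ** 2 for v in values)
    if total == 0:
        return 0.0
    groups = {}
    for label, value in zip(labels, values):
        groups.setdefault(label, []).append(value)
    between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups.values())
    return between / total

def pearson(xs, ys):
    """Pearson correlation coefficient, 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def rank_features(labels, features):
    """Return (top feature, best partner feature, composite scores)."""
    relationship = {n: correlation_ratio(labels, v) for n, v in features.items()}
    top = max(relationship, key=relationship.get)
    scores = {}
    for name, values in features.items():
        if name == top:
            continue
        correlation = abs(pearson(values, features[top]))
        # Assumed combination rule: reward relevance, penalize redundancy.
        scores[name] = relationship[name] * (1 - correlation)
    partner = max(scores, key=scores.get)
    return top, partner, scores

labels = ["a", "a", "b", "b", "c", "c"]
features = {
    "x": [1, 1, 5, 5, 9, 9],            # fully determined by the label
    "x_copy": [2, 3, 10, 11, 18, 19],   # nearly a rescaled copy of x
    "y": [5, 6, 1, 2, 5, 6],            # label-dependent, uncorrelated with x
}
top, partner, scores = rank_features(labels, features)
```

With this toy data, `x` is identified first, and `y` outranks `x_copy` because `x_copy` carries almost no information beyond `x`.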

    USER INTERFACE DATA ANALYZER HIGHLIGHTER

    Publication No.: US20250013819A1

    Publication date: 2025-01-09

    Application No.: US18348172

    Filing date: 2023-07-06

    Abstract: A data analyzer highlighter highlights elements of a user interface to enable a user to better understand and analyze the data presented. To do this, a first visualization is generated in a user interface. A configuration panel including elements for selecting statistical techniques is also generated in the user interface. Selections are obtained via the user interface of one or more statistical techniques. Then statistics are determined from the dataset using each of the one or more selected statistical techniques. Rows of data or the columns of data are then sorted based on a number of extreme values in the particular row or column, wherein the extreme value is a minimum value, a maximum value, or an outlier value. A second visualization sorted based on the number of extreme values in the particular row or column is then generated in the user interface.
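
The sorting step can be illustrated with a small sketch. It assumes that extremes are judged per column, that a cell counts once even if it is both a maximum and an outlier, and that outliers are found with the common 1.5 × IQR rule; the abstract specifies none of these details, and the function names are hypothetical.

```python
from statistics import quantiles

def is_outlier(value, column):
    """Assumed outlier rule: outside 1.5 * IQR of the column's quartiles."""
    q1, _, q3 = quantiles(column, n=4)
    spread = 1.5 * (q3 - q1)
    return value < q1 - spread or value > q3 + spread

def extreme_counts(rows):
    """Count, for each row, the cells that are a column min, max, or outlier."""
    columns = list(zip(*rows))
    counts = []
    for row in rows:
        count = 0
        for value, column in zip(row, columns):
            if value in (min(column), max(column)) or is_outlier(value, column):
                count += 1
        counts.append(count)
    return counts

def sort_rows_by_extremes(rows):
    """Order rows so those with the most extreme values come first."""
    counts = extreme_counts(rows)
    order = sorted(range(len(rows)), key=lambda i: counts[i], reverse=True)
    return [rows[i] for i in order]

table = [
    [5, 5, 5],
    [1, 9, 5],
    [3, 4, 2],
    [9, 1, 9],
]
counts = extreme_counts(table)
sorted_table = sort_rows_by_extremes(table)
```

Sorting columns instead of rows would follow the same pattern on the transposed table.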

    Continuous feature-independent determination of features for deviation analysis

    Publication No.: US11720579B2

    Publication date: 2023-08-08

    Application No.: US17367882

    Filing date: 2021-07-06

    CPC classification number: G06F16/2462 G06F16/2457 G06F16/283

    Abstract: Systems and methods include determination, for each of a plurality of discrete features, of statistics based on a number of occurrences of each discrete value of the discrete feature in the data, determination of first summary statistics based on the determined statistics, determination of a dissimilarity for each discrete feature based on the first summary statistics and on the statistics determined for the discrete feature, determination of candidate discrete features based on the determined dissimilarities, determination, for each of the candidate discrete features, of second summary statistics based on values of a continuous feature associated with each discrete value of the candidate discrete feature, determination of a deviation score for each of the candidate discrete features based on the second summary statistics, and transmission of the candidate discrete features for display in association with the continuous feature based on the determined deviation scores.

    AUTOMATIC HOT AREA DETECTION IN HEAT MAP VISUALIZATIONS

    Publication No.: US20210349911A1

    Publication date: 2021-11-11

    Application No.: US16867036

    Filing date: 2020-05-05

    Abstract: The present disclosure involves systems, software, and computer implemented methods for automatically detecting hot areas in heat map visualizations. One example method includes identifying a two-dimensional heat map. The identified two-dimensional heat map is converted to a one-dimensional heat map. Cells of the one-dimensional heat map are clustered using a density-based clustering algorithm to generate at least one dense region of cells. A mean value of cells in each dense region is calculated and the dense regions are sorted by mean value in descending order. An approach for identifying hot areas is selected and the selected approach is used to identify at least one dense region as a hot area of the one-dimensional heat map.
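
A rough sketch of this pipeline follows. The abstract calls for a density-based clustering algorithm without naming one; to stay dependency-free, this example substitutes a simplified gap-based grouping of sorted cell values (not a full DBSCAN), and it assumes the selected approach is simply "take the dense region with the highest mean". The parameters `eps` and `min_cells` and all function names are hypothetical.

```python
from statistics import mean

def flatten(heat_map):
    """Convert a 2-D heat map into a 1-D list of ((row, col), value) cells."""
    return [((r, c), v) for r, row in enumerate(heat_map) for c, v in enumerate(row)]

def dense_regions(cells, eps, min_cells):
    """Group cells whose sorted values lie within eps of their neighbor.

    Simplified stand-in for density-based clustering: groups with fewer
    than min_cells members are treated as noise and dropped.
    """
    if not cells:
        return []
    ordered = sorted(cells, key=lambda cell: cell[1])
    regions, current = [], [ordered[0]]
    for cell in ordered[1:]:
        if cell[1] - current[-1][1] <= eps:
            current.append(cell)
        else:
            regions.append(current)
            current = [cell]
    regions.append(current)
    return [region for region in regions if len(region) >= min_cells]

def hot_area(heat_map, eps=1.0, min_cells=2):
    """Return the positions of the dense region with the highest mean value."""
    regions = dense_regions(flatten(heat_map), eps, min_cells)
    # Sort dense regions by mean value in descending order.
    regions.sort(key=lambda region: mean([v for _, v in region]), reverse=True)
    return sorted(pos for pos, _ in regions[0]) if regions else []

heat_map = [
    [1, 2, 1],
    [2, 9, 10],
    [1, 9, 50],
]
hot = hot_area(heat_map)
```

Here the isolated value 50 is discarded as noise, so the hot area is the group of cells holding 9, 9, and 10 rather than the single largest cell.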

    Multiple machine learning model anomaly detection framework

    Publication No.: US12050628B1

    Publication date: 2024-07-30

    Application No.: US18348143

    Filing date: 2023-07-06

    CPC classification number: G06F16/285 G06F16/2365

    Abstract: Anomalies may be detected using a multiple machine learning model anomaly detection framework. A clustering model is trained using an unsupervised machine learning algorithm on a historical anomaly dataset. A plurality of clusters of records are determined by applying the historical anomaly dataset to the clustering model. Then it is determined whether each cluster of the plurality of clusters is an anomaly-type cluster or a normal-type cluster. The plurality of labels for the plurality of records are updated based on the particular record's cluster classification. Non-pure clusters are determined from among the plurality of clusters based on a purity threshold. A supervised machine learning model is trained for each of the non-pure clusters using the records in the given cluster and the labels for each of those records. Then, predictions of an anomaly are made using the clustering model and the supervised machine learning models.

    Histogram Bin Interval Approximation

    Publication No.: US20230133856A1

    Publication date: 2023-05-04

    Application No.: US17514801

    Filing date: 2021-10-29

    Abstract: Using approximated bin intervals to label the histograms provides clarity and allows for the histogram to be more intuitively understood. A dataset may comprise a plurality of records having a plurality of features including one or more continuous features. A selection of a continuous feature may be obtained. A bin width based on a number of bins and feature statistics of the continuous feature may be determined. An approximated bin interval range is determined by applying a bin mask based on the bin width to the feature statistics. An approximated bin width is determined based on the number of bins and the approximated bin interval range. Approximated bin intervals for the histogram are determined based on the approximated bin width. A histogram is generated having bins with intervals based on the approximated bin intervals.
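
One way to read these steps is sketched below. The abstract does not define the bin mask, so this example assumes it is the power of ten at the order of magnitude of the raw bin width, and that the feature statistics are the minimum and maximum; the minimum is rounded down and the maximum rounded up to a multiple of the mask. The function name is hypothetical.

```python
import math

def approximate_bins(minimum, maximum, num_bins):
    """Return num_bins (low, high) intervals with rounded, readable edges."""
    if num_bins < 1 or maximum <= minimum:
        raise ValueError("need num_bins >= 1 and maximum > minimum")
    # Raw bin width from the number of bins and the feature statistics.
    raw_width = (maximum - minimum) / num_bins
    # Assumed bin mask: power of ten matching the raw width's magnitude.
    mask = 10.0 ** math.floor(math.log10(raw_width))
    # Approximated interval range: snap min down and max up to the mask.
    low = math.floor(minimum / mask) * mask
    high = math.ceil(maximum / mask) * mask
    # Approximated bin width over the approximated range.
    width = (high - low) / num_bins
    return [(low + i * width, low + (i + 1) * width) for i in range(num_bins)]

intervals = approximate_bins(3.7, 96.2, 5)
```

For a feature ranging from 3.7 to 96.2 with five bins, the raw width of 18.5 gives a mask of 10, so the labels become 0 to 20, 20 to 40, and so on up to 100 instead of 3.7 to 22.2, 22.2 to 40.7, and so on.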

    MEASURING SUCCESSFUL INSIGHT TOOL INTERACTIONS

    Publication No.: US20210374770A1

    Publication date: 2021-12-02

    Application No.: US16890430

    Filing date: 2020-06-02

    Abstract: The present disclosure involves systems, software, and computer implemented methods for measuring successful interactions with an insight tool. One example method includes receiving a request for insights for a data point of a data visualization. Insights for the data point are identified and presented in an insights interface in a user session. User interactions with the insights interface are tracked during the user session. A determination is made that the user session has completed. At least one insights success rule is identified for determining whether user sessions with the insights interface are successful. The one or more insights success rules are evaluated to determine whether the user session was successful. In response to determining that the user session was successful, a measure of success for the user session is recorded. In response to determining that the user session was unsuccessful, a measure of failure is recorded for the user session.
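
The track-then-evaluate flow can be sketched as below. The event names, the two example rules, and the choice to treat a session as successful when any rule passes are all assumptions for illustration; the abstract does not say what the rules contain or how several rules are combined.

```python
from dataclasses import dataclass, field

@dataclass
class InsightSession:
    """Hypothetical record of one user session with the insights interface."""
    interactions: list = field(default_factory=list)
    completed: bool = False

    def track(self, event):
        self.interactions.append(event)

# Hypothetical success rules: each takes a session and returns a bool.
def expanded_an_insight(session):
    return "expand_insight" in session.interactions

def saved_an_insight(session):
    return "save_insight" in session.interactions

def record_outcome(session, rules, metrics):
    """Evaluate the rules for a completed session and record the result."""
    if not session.completed:
        raise ValueError("session has not completed")
    # Assumption: passing any one rule makes the session successful.
    success = any(rule(session) for rule in rules)
    key = "success" if success else "failure"
    metrics[key] = metrics.get(key, 0) + 1
    return success

rules = [expanded_an_insight, saved_an_insight]
metrics = {}

engaged = InsightSession()
engaged.track("open_panel")
engaged.track("expand_insight")
engaged.completed = True

bounced = InsightSession()
bounced.track("open_panel")
bounced.completed = True

results = [record_outcome(s, rules, metrics) for s in (engaged, bounced)]
```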
