Integrating Data Quality Analyses for Modeling Metrics

    Publication No.: US20220004822A1

    Publication date: 2022-01-06

    Application No.: US16921579

    Application date: 2020-07-06

    Abstract: Techniques for generating a composite score for data quality are disclosed. Univariate analysis is performed on a plurality of data points corresponding to each of a first feature, a second feature, and a third feature of a data set. The univariate analysis includes at least a first type of analysis generating a first score having a first range of possible values, and a second type of analysis generating a second score having a second range of possible values. A first quality score is computed for the data values for the first, second, and third features based on a normalized first score and a normalized second score. Machine learning is performed on the data points corresponding to one or both of the first feature and the second feature having a first quality score above a threshold value to model the third feature.
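
The normalization step in the abstract above can be illustrated with a minimal sketch. The abstract does not say which univariate analyses are used or how the normalized scores are combined, so everything here is an assumption for illustration: `completeness` (range 0 to 1) and `outlier_count` (range 0 to n) stand in for the two analysis types, the composite is a plain average, and the names `quality_score` and `select_features` are hypothetical.

```python
from statistics import mean, pstdev


def completeness(values):
    """First analysis (assumed): fraction of non-missing entries, range [0, 1]."""
    return sum(v is not None for v in values) / len(values)


def outlier_count(values, z=3.0):
    """Second analysis (assumed): count of values beyond z std devs, range [0, n]."""
    present = [v for v in values if v is not None]
    if len(present) < 2:
        return 0
    mu, sigma = mean(present), pstdev(present)
    if sigma == 0:
        return 0
    return sum(abs(v - mu) > z * sigma for v in present)


def quality_score(values):
    """Bring both scores onto [0, 1] (higher is better) and average them."""
    n = len(values)
    s1 = completeness(values)              # already on [0, 1]
    s2 = 1.0 - outlier_count(values) / n   # rescale [0, n] to [0, 1] and invert
    return (s1 + s2) / 2


def select_features(dataset, target, threshold):
    """Keep non-target features whose composite score exceeds the threshold."""
    return [
        name
        for name, vals in dataset.items()
        if name != target and quality_score(vals) > threshold
    ]
```

Features returned by `select_features` would then be the inputs passed to whatever model is trained to predict the target feature.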

    Integrating Data Quality Analyses For Modeling Metrics

    Publication No.: US20250005456A1

    Publication date: 2025-01-02

    Application No.: US18766438

    Application date: 2024-07-08

    Abstract: Techniques for generating a composite score for data quality are disclosed. Univariate analysis is performed on a plurality of data points corresponding to each of a first feature, a second feature, and a third feature of a data set. The univariate analysis includes at least a first type of analysis generating a first score having a first range of possible values, and a second type of analysis generating a second score having a second range of possible values. A first quality score is computed for the data values for the first, second, and third features based on a normalized first score and a normalized second score. Machine learning is performed on the data points corresponding to one or both of the first feature and the second feature having a first quality score above a threshold value to model the third feature.

    Adaptive pattern recognition for a sensor network

    Publication No.: US11762956B2

    Publication date: 2023-09-19

    Application No.: US17245245

    Application date: 2021-04-30

    Abstract: Embodiments match sensor data output by a sensor to a trained pattern. Embodiments form a plurality of windows of an identified pattern from the sensor data, each of the plurality of windows having a substantially equal window length to a length of the trained pattern. For each of the windows, embodiments generate a corresponding first Symbolic Aggregate approximation (“SAX”) word, determine a Hamming distance between the first SAX word and a second SAX word corresponding to the trained pattern, and determine a final distance score based on coefficients between the first SAX word and the second SAX word. For each of the windows, embodiments determine a number of positions in the first SAX word that do not contribute to the final distance score, update the Hamming distance after eliminating the number of positions and determine an average distance based on the final distance score and the updated Hamming distance.
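
As a rough illustration of the first steps in the abstract above, the sketch below builds a SAX word for each window and compares it to a trained word by Hamming distance. It is a sketch under stated assumptions, not the patented method: the alphabet size of 4 with the usual Gaussian breakpoints, the helper names, and the requirement that the window length be divisible by the word length are all choices made here. The coefficient-based final distance and the elimination of non-contributing positions are not specified in the abstract and are left out.

```python
from statistics import mean, pstdev

# Assumed 4-letter alphabet with approximate standard-normal quartile breakpoints.
BREAKPOINTS = (-0.67, 0.0, 0.67)
ALPHABET = "abcd"


def sax_word(series, word_len):
    """Z-normalize, average over equal segments (PAA), then map to letters."""
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma else 0.0 for x in series]
    seg = len(z) // word_len  # assumes len(series) is divisible by word_len
    letters = []
    for i in range(word_len):
        avg = mean(z[i * seg:(i + 1) * seg])
        index = sum(avg > b for b in BREAKPOINTS)  # number of breakpoints below avg
        letters.append(ALPHABET[index])
    return "".join(letters)


def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have equal length")
    return sum(x != y for x, y in zip(a, b))


def sliding_windows(series, length):
    """All contiguous windows whose length equals the trained pattern's length."""
    return [series[i:i + length] for i in range(len(series) - length + 1)]


def window_distances(sensor_data, trained_pattern, word_len):
    """Hamming distance of each window's SAX word to the trained pattern's word."""
    trained_word = sax_word(trained_pattern, word_len)
    return [
        hamming(sax_word(w, word_len), trained_word)
        for w in sliding_windows(sensor_data, len(trained_pattern))
    ]
```

A window with distance 0 has the same coarse shape as the trained pattern at the chosen resolution; larger values indicate more mismatched segments.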

    AUTOMATED CORRELATION ANALYSIS AND SELF-REGULATION OF ATTRIBUTES

    Publication No.: US20240330400A1

    Publication date: 2024-10-03

    Application No.: US18313260

    Application date: 2023-05-05

    CPC classification number: G06F17/15 G06F17/18

    Abstract: Operations associated with determining correlations between various attributes are disclosed. The operations may include: identifying a target attribute and a plurality of influencing attributes, determining a first correlation value representing a first correlation between the target attribute and a first influencing attribute of the plurality of influencing attributes, determining a second correlation value representing a second correlation between the target attribute and a second influencing attribute of the plurality of attributes, and based on the first correlation value and the second correlation value, ranking the first influencing attribute higher than the second influencing attribute in a ranked list of the plurality of influencing attributes representing an influence of each of the plurality of influencing attributes on the target attribute.
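
The ranking operation described above can be sketched as follows. The abstract does not name a correlation measure, so the Pearson coefficient and ranking by absolute value are assumptions made for illustration, as are the names `pearson` and `rank_influencers`.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no linear relationship
    return cov / sqrt(vx * vy)


def rank_influencers(target, influencers):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scored = [(name, pearson(vals, target)) for name, vals in influencers.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)
```

Ranking by absolute value treats a strong negative correlation as just as influential as a strong positive one; a different policy would only change the sort key.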

    Integrating data quality analyses for modeling metrics

    Publication No.: US12050969B2

    Publication date: 2024-07-30

    Application No.: US16921579

    Application date: 2020-07-06

    CPC classification number: G06N20/00 G06F18/2113 G06F18/2193

    Abstract: Techniques for generating a composite score for data quality are disclosed. Univariate analysis is performed on a plurality of data points corresponding to each of a first feature, a second feature, and a third feature of a data set. The univariate analysis includes at least a first type of analysis generating a first score having a first range of possible values, and a second type of analysis generating a second score having a second range of possible values. A first quality score is computed for the data values for the first, second, and third features based on a normalized first score and a normalized second score. Machine learning is performed on the data points corresponding to one or both of the first feature and the second feature having a first quality score above a threshold value to model the third feature.

    Automatic asset anomaly detection in a multi-sensor network

    Publication No.: US11216247B2

    Publication date: 2022-01-04

    Application No.: US16806275

    Application date: 2020-03-02

    Abstract: Embodiments determine anomalies in sensor data generated by a plurality of sensors that correspond to a single asset. Embodiments receive a first time window of clean sensor input data generated by the sensors, the clean sensor data including anomaly-free data comprised of clean data points. Embodiments divide the clean data points into training data points and evaluation data points, and divide the training data points into a pre-defined number of segments of equal length. Embodiments convert each of the plurality of segments into corresponding segment curves using Kernel Density Estimation (“KDE”) and determine a Jensen-Shannon (“JS”) divergence value for each of the plurality of segments using the segment curves to generate a plurality of JS divergence values. Embodiments then assign the maximum value of the plurality of JS divergence values as a threshold value and validate the threshold value using the evaluation data points.
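
The thresholding idea in the abstract above can be sketched in a few functions. Several details are assumptions made here because the abstract leaves them open: a Gaussian kernel with a fixed bandwidth, curves evaluated on a shared grid and normalized to sum to 1, and each segment compared against the curve of the full training set. The validation step on evaluation points is not shown, and all function names are hypothetical.

```python
from math import exp, log2


def kde_curve(points, grid, bandwidth):
    """Gaussian KDE evaluated on a grid, normalized to a discrete distribution."""
    dens = [
        sum(exp(-0.5 * ((g - p) / bandwidth) ** 2) for p in points)
        for g in grid
    ]
    total = sum(dens)
    if total == 0:
        raise ValueError("grid does not cover the data")
    return [d / total for d in dens]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so the result lies in [0, 1])."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(u, v):
        # Terms with u_i == 0 contribute nothing; v_i > 0 whenever u_i > 0.
        return sum(a * log2(a / b) for a, b in zip(u, v) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def js_threshold(training, n_segments, grid, bandwidth):
    """Split training data into equal segments and take the maximum JS value."""
    seg_len = len(training) // n_segments
    reference = kde_curve(training, grid, bandwidth)  # assumed comparison target
    values = []
    for i in range(n_segments):
        segment = training[i * seg_len:(i + 1) * seg_len]
        values.append(js_divergence(kde_curve(segment, grid, bandwidth), reference))
    return max(values), values
```

Under these assumptions, a new window of sensor data whose JS divergence from the reference curve exceeds the returned threshold would be flagged as a candidate anomaly.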

    Action determination using recommendations based on prediction of sensor-based systems

    Publication No.: US12223397B2

    Publication date: 2025-02-11

    Application No.: US17007601

    Application date: 2020-08-31

    Abstract: Techniques for providing actionable recommendations for configuring system parameters are disclosed. A set of environmental constraints and a first set of values for a set of parameters of a target device are applied to a machine learning model to predict a first performance value of the target device. Candidate sets of values for the set of parameters are identified that are within a threshold range from the first set of values in a multi-dimensional space. For each candidate set of values, the machine learning model predicts a performance value of the target device, and a subset of the candidate sets whose predicted performance values meet a performance criterion is identified. That subset is provided as a recommendation.
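
The candidate search described above can be sketched as a neighborhood scan around the current parameter values. This is an illustration under assumptions, not the disclosed implementation: the model is any callable `model(constraints, params)` supplied by the caller, candidates are generated on a fixed-step grid, the "threshold range" is read as a Euclidean radius, and the performance criterion is a simple minimum value. The names `candidates_near` and `recommend` are hypothetical.

```python
from itertools import product
from math import dist


def candidates_near(current, step, radius):
    """Grid neighbors of `current` within a Euclidean radius (assumed metric)."""
    current = tuple(current)
    offsets = (-step, 0.0, step)
    found = []
    for delta in product(offsets, repeat=len(current)):
        cand = tuple(c + d for c, d in zip(current, delta))
        if cand != current and dist(cand, current) <= radius:
            found.append(cand)
    return found


def recommend(model, constraints, current, step, radius, min_perf):
    """Predict baseline performance, then keep candidates meeting the criterion."""
    baseline = model(constraints, tuple(current))
    scored = [
        (cand, model(constraints, cand))
        for cand in candidates_near(current, step, radius)
    ]
    good = [(cand, perf) for cand, perf in scored if perf >= min_perf]
    # Best predicted performance first.
    return baseline, sorted(good, key=lambda pair: pair[1], reverse=True)
```

Keeping candidates close to the current values means each recommendation is a small, incremental change rather than a jump to a distant configuration.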
