DATA SKEW DETECTION IN MACHINE LEARNING ENVIRONMENTS

    Publication (Announcement) No.: US20240312180A1

    Publication (Announcement) Date: 2024-09-19

    Application No.: US18184465

    Application Date: 2023-03-15

    CPC classification number: G06V10/762 G06V10/26 G06V10/42

    Abstract: Systems and methods for preventing prediction performance degradation by detecting and extracting skews in data in both training and production environments are described herein. During the training phase, feature extraction may be performed on training data, followed by pattern analysis that assesses similarities across labeled training data sets. A reference pattern may be derived from the feature extraction and pattern analysis of the training data. During the serving phase, feature extraction and pattern analysis may likewise be performed on production data, and a target pattern may be derived from them. The reference pattern and target pattern may be fed to a discrepancy detection functionality, which uses a sliding window to move the target pattern across the reference pattern and compare the two at each position. The comparison may provide a quantitative measure of the skew between the training and production data.
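
The sliding-window comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how patterns are represented or which distance measure is used, so everything below is an assumption made for illustration: patterns are plain sequences of numbers, the distance is the mean absolute difference, and the names `window_distances` and `quantify_skew` are hypothetical rather than taken from the patent.

```python
from typing import List, Sequence, Tuple


def window_distances(reference: Sequence[float], target: Sequence[float]) -> List[float]:
    """Slide `target` across `reference` and return one distance per offset.

    Assumption: the distance is the mean absolute difference; the source
    does not name a metric.
    """
    n, m = len(reference), len(target)
    if m == 0 or m > n:
        raise ValueError("target must be non-empty and no longer than reference")
    distances = []
    for offset in range(n - m + 1):
        window = reference[offset:offset + m]
        diff = sum(abs(r - t) for r, t in zip(window, target)) / m
        distances.append(diff)
    return distances


def quantify_skew(reference: Sequence[float], target: Sequence[float]) -> Tuple[int, float]:
    """Return (best_offset, skew), where skew is the smallest window distance.

    Assumption: the residual distance at the best alignment is treated as
    the quantitative skew.
    """
    distances = window_distances(reference, target)
    best_offset = min(range(len(distances)), key=distances.__getitem__)
    return best_offset, distances[best_offset]


# Hypothetical reference pattern (from training data) and two target
# patterns (from production data): one identical to part of the reference,
# one shifted upward by 0.5 in every position.
reference_pattern = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
matching_target = [2.0, 3.0, 2.0]
shifted_target = [2.5, 3.5, 2.5]

print(quantify_skew(reference_pattern, matching_target))  # (2, 0.0)
print(quantify_skew(reference_pattern, shifted_target))   # (2, 0.5)
```

In this sketch a skew of 0.0 means the target pattern appears unchanged somewhere in the reference pattern, while a larger value means that even the best alignment leaves a residual difference. A real system would likely compare higher-dimensional feature patterns and flag skew against a chosen threshold.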
