USING STATISTICAL DISPERSION IN DATA PROCESS GENERATION

    Publication No.: WO2023064517A1

    Publication Date: 2023-04-20

    Application No.: PCT/US2022/046642

    Filing Date: 2022-10-14

    Abstract: Methods and systems are described herein for facilitating data integrity processes using measures of statistical dispersion (e.g., Gini impurities) of dataset features. The described mechanism may also be used for selection and dimensionality reduction. Dimensionality reduction may enable storing the dataset using less storage space or performing other operations on the dataset using fewer resources. In some embodiments, the above-described mechanism may be used for supervised categorical clustering and/or categorical classification.
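
    For readers unfamiliar with the measure named in the abstract, the following is a minimal sketch of how a Gini impurity can be computed for a categorical feature, and how a label-weighted version might be used to compare features. The abstract does not specify the computation used in the application, so the function names (gini_impurity, weighted_gini) and the comparison step are illustrative assumptions only.

```python
from collections import Counter


def gini_impurity(values):
    """Gini impurity of a sequence of categorical values: 1 - sum(p_k^2)."""
    n = len(values)
    if n == 0:
        return 0.0
    counts = Counter(values)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def weighted_gini(feature, labels):
    """Size-weighted Gini impurity of the labels within each feature value.

    Hypothetical helper (not taken from the application): a lower result
    means the feature separates the labels more cleanly.
    """
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    return sum(len(g) / n * gini_impurity(g) for g in groups.values())


# Toy data, invented for illustration.
labels = [0, 0, 1, 1]
informative = ["x", "x", "y", "y"]    # perfectly aligned with the labels
uninformative = ["x", "y", "x", "y"]  # unrelated to the labels

scores = {
    "informative": weighted_gini(informative, labels),
    "uninformative": weighted_gini(uninformative, labels),
}
```

    Under these assumptions, a feature with a high weighted impurity carries little information about the labels and would be a candidate for removal when reducing dimensionality.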
