Efficient aggregation of sliding time window features

    Publication No.: US11194812B2

    Publication date: 2021-12-07

    Application No.: US16234331

    Filing date: 2018-12-27

    Abstract: The disclosed embodiments provide a system for processing data. During operation, the system organizes fact data to be aggregated into sliding time window features and observation data associated with the fact data into a set of partitions based on a join key. Next, the system sorts the fact data and the observation data within the set of partitions by the join key and timestamps associated with the fact data and the observation data. For each observation record in the observation data, the system aggregates fact records in the sorted fact data that share a value of the join key with the observation record and that fall within a first time window associated with the observation record to produce a sliding time window feature. The system then stores the sliding time window feature in association with the observation record.
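
The partition, sort, and aggregate steps in this abstract can be illustrated with a small in-memory sketch. It is not the patented implementation: the function name `sliding_window_sums`, the tuple layouts, the choice of a sum as the aggregate, and the inclusive window `[t - window, t]` are all assumptions made for illustration.

```python
from collections import defaultdict


def sliding_window_sums(facts, observations, window):
    """Sum fact values per observation over a trailing time window.

    facts: iterable of (join_key, timestamp, value)        -- assumed layout
    observations: iterable of (join_key, timestamp)        -- assumed layout
    window: window length, in the same units as the timestamps
    Returns a list of (join_key, timestamp, window_sum).
    """
    # Step 1: organize both data sets into partitions by join key.
    fact_parts = defaultdict(list)
    obs_parts = defaultdict(list)
    for key, ts, value in facts:
        fact_parts[key].append((ts, value))
    for key, ts in observations:
        obs_parts[key].append(ts)

    results = []
    for key in sorted(obs_parts):
        # Step 2: sort each partition by timestamp.
        part_facts = sorted(fact_parts.get(key, []))
        part_obs = sorted(obs_parts[key])

        # Step 3: slide the window with two pointers so each fact
        # is added to and removed from the running sum at most once.
        lo = hi = 0
        running = 0
        for obs_ts in part_obs:
            # Admit facts at or before the observation time.
            while hi < len(part_facts) and part_facts[hi][0] <= obs_ts:
                running += part_facts[hi][1]
                hi += 1
            # Evict facts older than the start of the window.
            while lo < hi and part_facts[lo][0] < obs_ts - window:
                running -= part_facts[lo][1]
                lo += 1
            # Step 4: store the feature alongside the observation.
            results.append((key, obs_ts, running))
    return results
```

Because both sides are sorted within a partition, the two pointers only move forward, so the work per partition after sorting is linear in the number of fact and observation records.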

    EFFICIENT AGGREGATION OF SLIDING TIME WINDOW FEATURES

    Publication No.: US20200210430A1

    Publication date: 2020-07-02

    Application No.: US16234331

    Filing date: 2018-12-27

    Abstract: The disclosed embodiments provide a system for processing data. During operation, the system organizes fact data to be aggregated into sliding time window features and observation data associated with the fact data into a set of partitions based on a join key. Next, the system sorts the fact data and the observation data within the set of partitions by the join key and timestamps associated with the fact data and the observation data. For each observation record in the observation data, the system aggregates fact records in the sorted fact data that share a value of the join key with the observation record and that fall within a first time window associated with the observation record to produce a sliding time window feature. The system then stores the sliding time window feature in association with the observation record.

    DECENTRALIZED SHARING OF FEATURES IN FEATURE MANAGEMENT FRAMEWORKS

    Publication No.: US20190324767A1

    Publication date: 2019-10-24

    Application No.: US15959000

    Filing date: 2018-04-20

    IPC classification: G06F9/445 G06F17/30 G06F15/18

    Abstract: The disclosed embodiments provide a system for sharing features in a feature management framework. During operation, the system creates a repository of feature configurations for a set of features that are accessed across multiple environments. Next, the system identifies dependencies of the repository. The system then copies shared feature configurations from other repositories represented by the dependencies. Finally, the system combines the shared feature configurations with existing feature configurations in the repository for use in retrieving feature values for one or more machine learning models.
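
The copy-and-combine step in this abstract might look roughly like the following sketch. The repository layout (a dict with `features` and `dependencies` entries), the function name `combine_feature_configs`, and the rule that a repository's own configuration wins on a name clash are assumptions for illustration; the abstract does not specify any of them.

```python
import copy


def combine_feature_configs(repo_name, repositories):
    """Combine a repository's feature configs with those of its dependencies.

    repositories: dict mapping a repository name to
        {"features": {feature_name: config}, "dependencies": [repo names]}
    (hypothetical layout). Only direct dependencies are resolved here.
    """
    repo = repositories[repo_name]
    combined = {}

    # Copy shared feature configurations from each dependency.
    for dep_name in repo["dependencies"]:
        shared = repositories[dep_name]["features"]
        combined.update(copy.deepcopy(shared))

    # Combine with the repository's existing configurations.
    # Assumption: local definitions override shared ones with the same name.
    combined.update(copy.deepcopy(repo["features"]))
    return combined
```

Deep-copying the shared configurations keeps the combined view independent of the source repositories, so later edits to a dependency do not silently change a resolved configuration.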

    MONITORING AND COMPARING FEATURES ACROSS ENVIRONMENTS

    Publication No.: US20190325351A1

    Publication date: 2019-10-24

    Application No.: US15958999

    Filing date: 2018-04-20

    IPC classification: G06N99/00 G06F17/30

    Abstract: The disclosed embodiments provide a system for processing data. During operation, the system selects a set of entity keys associated with reference feature values used with one or more machine learning models, wherein the reference feature values are generated in a first environment. Next, the system matches the set of entity keys to feature values from a second environment. The system then compares the feature values and the reference feature values to assess a consistency of a feature across the first and second environments. Finally, the system outputs a result of the assessed consistency for use in managing the feature in the first and second environments.
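
A minimal sketch of the select, match, and compare steps described in this abstract follows. The function name `assess_consistency`, the dict-based inputs, the numeric tolerance, and the shape of the returned summary are hypothetical; the abstract does not say how consistency is scored.

```python
import math


def assess_consistency(reference_values, candidate_values, entity_keys,
                       tolerance=1e-9):
    """Compare one feature's values across two environments.

    reference_values: dict entity_key -> value from the first environment
    candidate_values: dict entity_key -> value from the second environment
    entity_keys: list of keys selected from reference_values
    """
    matched = mismatched = missing = 0
    for key in entity_keys:
        if key not in candidate_values:
            # The key could not be matched in the second environment.
            missing += 1
        elif math.isclose(reference_values[key], candidate_values[key],
                          abs_tol=tolerance):
            matched += 1
        else:
            mismatched += 1

    total = len(entity_keys)
    return {
        "matched": matched,
        "mismatched": mismatched,
        "missing": missing,
        "match_rate": matched / total if total else 0.0,
    }
```

Reporting missing keys separately from mismatched values distinguishes coverage gaps in the second environment from genuine differences in how the feature is computed.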

    EARLY FEEDBACK OF SCHEMATIC CORRECTNESS IN FEATURE MANAGEMENT FRAMEWORKS

    Publication No.: US20190325258A1

    Publication date: 2019-10-24

    Application No.: US15958997

    Filing date: 2018-04-20

    IPC classification: G06K9/62 G06F15/18

    Abstract: The disclosed embodiments provide a system for processing data. During operation, the system obtains feature configurations for a set of features and a command for inspecting a data set that is produced using the feature configurations. Next, the system obtains, from the feature configurations, one or more anchors containing metadata for accessing the set of features in an environment and a join configuration for joining a feature with one or more additional features. The system then uses the anchors to retrieve feature values of the features and zips the feature values according to the join configuration without matching entity keys associated with the feature values. Finally, the system outputs the zipped feature values in response to the command.
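
The zipping step in this abstract, which pairs feature values positionally rather than joining on entity keys, can be sketched as below. Modeling an anchor as a zero-argument callable that yields `(entity_key, value)` pairs, the function name `zip_feature_values`, and the `limit` parameter are assumptions for illustration only.

```python
from itertools import islice


def zip_feature_values(anchors, join_features, limit=5):
    """Build a quick preview of joined features without a key-based join.

    anchors: dict feature_name -> callable returning an iterable of
        (entity_key, value) pairs (hypothetical stand-in for anchor metadata)
    join_features: ordered list of feature names from the join configuration
    limit: maximum number of preview rows to return
    """
    # Retrieve each feature's values and drop the entity keys:
    # rows are paired by position, not by matching keys.
    columns = [
        (value for _key, value in anchors[name]())
        for name in join_features
    ]
    rows = zip(*columns)  # stops at the shortest column
    return [dict(zip(join_features, row)) for row in islice(rows, limit)]
```

The resulting rows are not semantically joined, since values from different entities may share a row. The sketch only shows how the shape and types of the output can be inspected without the cost of a key-matching join.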