-
Publication No.: US11194812B2
Publication date: 2021-12-07
Application No.: US16234331
Filing date: 2018-12-27
Inventors: Min Shen, Maneesh Varshney, David J. Stein, Jian Qiao
IPC classification: G06F16/00, G06F16/2455, G06F16/2458, G06F16/22, G06F16/23
Abstract: The disclosed embodiments provide a system for processing data. During operation, the system organizes fact data to be aggregated into sliding time window features and observation data associated with the fact data into a set of partitions based on a join key. Next, the system sorts the fact data and the observation data within the set of partitions by the join key and timestamps associated with the fact data and the observation data. For each observation record in the observation data, the system aggregates fact records in the sorted fact data that share a value of the join key with the observation record and that fall within a first time window associated with the observation record to produce a sliding time window feature. The system then stores the sliding time window feature in association with the observation record.
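The partition, sort, and aggregate steps described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the tuple layouts, the hash-based partitioning, the choice of a sum as the aggregate, the half-open window `(obs_ts - window, obs_ts]`, and the name `sliding_window_sums` are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def sliding_window_sums(facts, observations, window, num_partitions=4):
    """facts: (join_key, timestamp, value); observations: (join_key, timestamp).

    Returns {(join_key, obs_ts): sum of fact values in (obs_ts - window, obs_ts]}.
    All names and layouts here are hypothetical.
    """
    # Step 1: route both data sets into partitions by join key.
    partitions = defaultdict(lambda: ([], []))
    for fact in facts:
        partitions[hash(fact[0]) % num_partitions][0].append(fact)
    for obs in observations:
        partitions[hash(obs[0]) % num_partitions][1].append(obs)

    features = {}
    for part_facts, part_obs in partitions.values():
        # Step 2: sort each side by (join key, timestamp).
        part_facts.sort(key=lambda f: (f[0], f[1]))
        part_obs.sort()

        # Step 3: sweep both sorted lists once, keeping a running window sum.
        i, current_key, in_window, total = 0, None, deque(), 0
        for key, obs_ts in part_obs:
            if key != current_key:
                # New join key: reset the window and skip leftover facts
                # that belong to earlier keys.
                current_key, in_window, total = key, deque(), 0
                while i < len(part_facts) and part_facts[i][0] < key:
                    i += 1
            # Admit facts for this key up to the observation timestamp.
            while (i < len(part_facts) and part_facts[i][0] == key
                   and part_facts[i][1] <= obs_ts):
                _, ts, value = part_facts[i]
                in_window.append((ts, value))
                total += value
                i += 1
            # Evict facts that have fallen out of the window.
            while in_window and in_window[0][0] <= obs_ts - window:
                total -= in_window.popleft()[1]
            # Step 4: store the feature in association with the observation.
            features[(key, obs_ts)] = total
    return features
```

Because both sides are sorted by the same ordering, each fact is admitted and evicted at most once per partition, which is the usual motivation for sorting before a windowed join.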
-
Publication No.: US20190325085A1
Publication date: 2019-10-24
Application No.: US15959005
Filing date: 2018-04-20
Inventors: David J. Stein, Paul T. Ogilvie, Bee-Chung Chen, Shaunak Chatterjee, Priyanka Gariba, Ke Wu, Grace W. Tang, Yangchun Luo, Boyi Chen, Amit Yadav, Ruoyang Wang, Divya Gadde, Wenxuan Gao, Amit Chandak, Varnit Agnihotri, Wei Zhuang, Joel D. Young, Weidong Zhang
Abstract: The disclosed embodiments provide a system for processing data. During operation, the system obtains a feature configuration for a feature. Next, the system obtains, from the feature configuration, an anchor containing metadata for accessing the feature in an environment. The system then uses one or more attributes of the anchor to retrieve one or more feature values of the feature from the environment. Finally, the system provides the one or more feature values for use with one or more machine-learning models.
-
Publication No.: US20190188243A1
Publication date: 2019-06-20
Application No.: US15844861
Filing date: 2017-12-18
Inventors: Chen Sun, David J. Stein, Ke Wu, Joel D. Young
CPC classification: G06F17/18, G06F7/02, G06F16/9024, G06K9/6212, G06K9/6232
Abstract: The disclosed embodiments provide a system for processing data. During operation, the system obtains a set of values and a set of reference values for one or more features used with one or more statistical models. Next, the system applies a hypothesis test to the set of values and the set of reference values to assess a distribution-level consistency in the one or more features. The system then outputs the distribution-level consistency for use in monitoring the distribution of the one or more features. Finally, the system includes, with the outputted distribution-level consistency, one or more factors that contribute to the distribution-level consistency.
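The abstract does not say which hypothesis test is applied. As one possible illustration, the sketch below uses a two-sample Kolmogorov-Smirnov test with the standard large-sample critical value; the choice of test, the function names, and the default significance level are assumptions, and the "contributing factors" step of the abstract is not covered.

```python
import math
from bisect import bisect_right


def ks_statistic(values, reference):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    a, b = sorted(values), sorted(reference)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def distribution_consistency(values, reference, alpha=0.05):
    """Return (is_consistent, statistic, critical_value).

    Uses the asymptotic critical value c(alpha) * sqrt((n + m) / (n * m)),
    with c(alpha) = sqrt(-ln(alpha / 2) / 2). This approximation is intended
    for reasonably large samples.
    """
    n, m = len(values), len(reference)
    d = ks_statistic(values, reference)
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    return d <= critical, d, critical
```

A monitoring job could call `distribution_consistency` on each day's feature values against a fixed reference sample and raise an alert when the first element of the result is `False`.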
-
Publication No.: US20200210430A1
Publication date: 2020-07-02
Application No.: US16234331
Filing date: 2018-12-27
Inventors: Min Shen, Maneesh Varshney, David J. Stein, Jian Qiao
IPC classification: G06F16/2455, G06F16/2458, G06F16/23, G06F16/22
Abstract: The disclosed embodiments provide a system for processing data. During operation, the system organizes fact data to be aggregated into sliding time window features and observation data associated with the fact data into a set of partitions based on a join key. Next, the system sorts the fact data and the observation data within the set of partitions by the join key and timestamps associated with the fact data and the observation data. For each observation record in the observation data, the system aggregates fact records in the sorted fact data that share a value of the join key with the observation record and that fall within a first time window associated with the observation record to produce a sliding time window feature. The system then stores the sliding time window feature in association with the observation record.
-
Publication No.: US11704370B2
Publication date: 2023-07-18
Application No.: US15959005
Filing date: 2018-04-20
Inventors: David J. Stein, Paul T. Ogilvie, Bee-Chung Chen, Shaunak Chatterjee, Priyanka Gariba, Ke Wu, Grace W. Tang, Yangchun Luo, Boyi Chen, Amit Yadav, Ruoyang Wang, Divya Gadde, Wenxuan Gao, Amit Chandak, Varnit Agnihotri, Wei Zhuang, Joel D. Young, Weidong Zhang
IPC classification: G06F16/907, G06N20/00, G06F9/445, G06F18/214
CPC classification: G06F16/907, G06F9/44505, G06F18/214, G06N20/00
Abstract: The disclosed embodiments provide a system for processing data. During operation, the system obtains a feature configuration for a feature. Next, the system obtains, from the feature configuration, an anchor containing metadata for accessing the feature in an environment. The system then uses one or more attributes of the anchor to retrieve one or more feature values of the feature from the environment. Finally, the system provides the one or more feature values for use with one or more machine-learning models.
-
Publication No.: US20190325262A1
Publication date: 2019-10-24
Application No.: US15958990
Filing date: 2018-04-20
Inventors: David J. Stein, Paul T. Ogilvie, Bee-Chung Chen, Ke Wu, Grace W. Tang, Priyanka Gariba, Yangchun Luo, Boyi Chen, Jian Qiao, Benjamin Hoan Le, Joel D. Young, Wei Zhuang
Abstract: The disclosed embodiments provide a system for processing data. During operation, the system obtains feature configurations for a set of features. Next, the system obtains, from the feature configurations, an anchor containing metadata for accessing a first feature in an environment and a feature derivation for generating a second feature from the first feature. The system then uses the anchor to retrieve feature values of the first feature from the environment and uses the feature derivation to generate additional feature values of the second feature from the feature values of the first feature. Finally, the system provides the additional feature values for use with one or more machine learning models.
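One way to picture the relationship between an anchor and a feature derivation is the following Python sketch. The `Anchor` and `Derivation` classes, the in-memory `ENVIRONMENT` dictionary, and the feature names are hypothetical stand-ins; the abstract does not specify how configurations or environments are represented.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Anchor:
    """Metadata for locating a feature in an environment (hypothetical fields)."""
    source: str
    field: str


@dataclass
class Derivation:
    """Rule for computing one feature from another (hypothetical fields)."""
    input_feature: str
    transform: Callable[[Any], Any]


# Stand-in for a data environment, keyed by source name and then entity key.
ENVIRONMENT = {
    "member_profiles": {
        101: {"years_experience": 3},
        102: {"years_experience": 12},
    }
}

FEATURE_CONFIGS = {
    "years_experience": Anchor("member_profiles", "years_experience"),
    "is_senior": Derivation("years_experience", lambda years: years >= 10),
}


def feature_values(name, configs=FEATURE_CONFIGS, environment=ENVIRONMENT):
    """Resolve a feature to {entity_key: value} via its anchor or derivation."""
    config = configs[name]
    if isinstance(config, Anchor):
        table = environment[config.source]
        return {key: record[config.field] for key, record in table.items()}
    # Derived feature: fetch the input feature first, then transform it.
    base = feature_values(config.input_feature, configs, environment)
    return {key: config.transform(value) for key, value in base.items()}
```

Here a model asks only for `is_senior`; the lookup in `member_profiles` and the threshold comparison stay inside the configuration.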
-
Publication No.: US20190324767A1
Publication date: 2019-10-24
Application No.: US15959000
Filing date: 2018-04-20
Inventors: David J. Stein, Lei Li, Ke Wu, Bee-Chung Chen, Priyanka Gariba
Abstract: The disclosed embodiments provide a system for sharing features in a feature management framework. During operation, the system creates a repository of feature configurations for a set of features that are accessed across multiple environments. Next, the system identifies dependencies of the repository. The system then copies shared feature configurations from other repositories represented by the dependencies. Finally, the system combines the shared feature configurations with existing feature configurations in the repository for use in retrieving feature values for one or more machine learning models.
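The dependency-resolution flow in this abstract might look roughly like the sketch below. The repository layout (a dictionary with `dependencies` and `configs` entries), the rule that local configurations override copied ones, and the restriction to direct dependencies are all assumptions; the abstract leaves these details open.

```python
def combined_configs(repo_name, repositories):
    """Merge configs copied from a repository's direct dependencies with its own.

    repositories: {name: {"dependencies": [names], "configs": {feature: config}}}
    (a hypothetical layout). Local configs win on a name clash.
    """
    repo = repositories[repo_name]
    combined = {}
    for dependency in repo["dependencies"]:
        shared = repositories[dependency]["configs"]
        # Copy each shared config so later edits do not touch the source repo.
        combined.update({name: dict(config) for name, config in shared.items()})
    combined.update(repo["configs"])
    return combined


repositories = {
    "common": {
        "dependencies": [],
        "configs": {"member_title": {"source": "profiles", "field": "title"}},
    },
    "jobs_model": {
        "dependencies": ["common"],
        "configs": {"job_views": {"source": "tracking", "field": "views"}},
    },
}
```

With this data, `combined_configs("jobs_model", repositories)` contains both `member_title` (copied from `common`) and `job_views` (defined locally).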
-
Publication No.: US20190325351A1
Publication date: 2019-10-24
Application No.: US15958999
Filing date: 2018-04-20
Inventors: David J. Stein, Ruoyang Wang, Ke Wu, Bee-Chung Chen, Priyanka Gariba
Abstract: The disclosed embodiments provide a system for processing data. During operation, the system selects a set of entity keys associated with reference feature values used with one or more machine learning models, wherein the reference feature values are generated in a first environment. Next, the system matches the set of entity keys to feature values from a second environment. The system then compares the feature values and the reference feature values to assess a consistency of a feature across the first and second environments. Finally, the system outputs a result of the assessed consistency for use in managing the feature in the first and second environments.
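The select, match, compare, and output steps can be sketched as follows. Random sampling of entity keys, the numeric tolerance, and the shape of the returned report are illustrative assumptions rather than details taken from the abstract.

```python
import random


def assess_consistency(reference, candidate, sample_size, seed=0, tolerance=1e-9):
    """Compare numeric feature values across two environments.

    reference: {entity_key: value} from the first environment.
    candidate: {entity_key: value} from the second environment.
    Returns a small report dictionary (hypothetical format).
    """
    # Select a reproducible sample of entity keys from the reference values.
    rng = random.Random(seed)
    keys = rng.sample(sorted(reference), min(sample_size, len(reference)))

    # Match the selected keys to values from the second environment.
    matched = [key for key in keys if key in candidate]
    missing = sorted(key for key in keys if key not in candidate)

    # Compare matched values against the reference values.
    mismatched = sorted(
        key for key in matched
        if abs(reference[key] - candidate[key]) > tolerance
    )
    agreeing = len(matched) - len(mismatched)
    return {
        "sampled": len(keys),
        "missing": missing,
        "mismatched": mismatched,
        "agreement_rate": agreeing / len(keys) if keys else 1.0,
    }
```

Keys that are absent from the second environment are reported separately from keys whose values differ, since the two cases usually point to different problems.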
-
Publication No.: US20190325258A1
Publication date: 2019-10-24
Application No.: US15958997
Filing date: 2018-04-20
Inventors: David J. Stein, Ke Wu, Priyanka Gariba, Grace W. Tang, Yangchun Luo, Songxiang Gu, Bee-Chung Chen
Abstract: The disclosed embodiments provide a system for processing data. During operation, the system obtains feature configurations for a set of features and a command for inspecting a data set that is produced using the feature configurations. Next, the system obtains, from the feature configurations, one or more anchors containing metadata for accessing the set of features in an environment and a join configuration for joining a feature with one or more additional features. The system then uses the anchors to retrieve feature values of the features and zips the feature values according to the join configuration without matching entity keys associated with the feature values. Finally, the system outputs the zipped feature values in response to the command.
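"Zipping" feature values without matching entity keys can be read as a positional pairing, as in the short sketch below. Representing the join configuration as an ordered list of feature names, and each feature as a plain list of values, are assumptions made for illustration.

```python
def zip_features(feature_values, join_order):
    """Pair up feature values by position, ignoring entity keys.

    feature_values: {feature_name: [value, ...]} (hypothetical layout).
    join_order: feature names in the order given by the join configuration.
    Rows stop at the shortest feature list, as with Python's built-in zip.
    """
    columns = [feature_values[name] for name in join_order]
    return [dict(zip(join_order, row)) for row in zip(*columns)]


rows = zip_features(
    {"title": ["Engineer", "Designer", "Analyst"], "job_views": [4, 7]},
    join_order=["title", "job_views"],
)
```

The resulting rows are useful for eyeballing the shape and types of a data set, but because no keys are matched they do not represent real entities.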