UPDATE PROPAGATION IN A DATA STREAM WAREHOUSE
Abstract:
Architectures and techniques are presented that can more efficiently update derived data products in response to updated source data. Source data is typically stored in source tables, whereas a materialized view of a query can generate a derived table based on the state of the source tables at the time the query is executed. When source data changes (e.g., in response to late-arriving input data), rather than recomputing the entire derived table (e.g., by again executing the original query, which can be expensive), an invertible relationship between timestamps can be leveraged to identify only those portions of the derived table that are affected by the update. Therefore, a new defining query can be generated to update only those portions of the derived table that are affected by the source data update.
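As an illustration only, the idea of using an invertible timestamp relationship to limit recomputation can be sketched as follows. The abstract does not specify the mapping, so this sketch assumes a hypothetical derived table that sums source values into fixed 5-minute buckets. The names `BUCKET`, `to_bucket`, `source_range`, `full_recompute`, and `propagate_update` are invented for this example and do not come from the patent.

```python
from collections import defaultdict

# Assumption: the derived table aggregates source rows into 5-minute buckets.
BUCKET = 300  # seconds


def to_bucket(ts):
    """Forward mapping: source timestamp -> derived-table timestamp."""
    return ts - ts % BUCKET


def source_range(bucket):
    """Inverse mapping: derived timestamp -> half-open source interval [lo, hi)."""
    return bucket, bucket + BUCKET


def full_recompute(source):
    """Expensive path: rebuild the whole derived table from all source rows."""
    derived = defaultdict(int)
    for ts, value in source:
        derived[to_bucket(ts)] += value
    return dict(derived)


def propagate_update(source, derived, late_rows):
    """Cheap path: recompute only the derived buckets touched by late rows."""
    source.extend(late_rows)
    # Forward mapping identifies which derived rows are affected.
    affected = {to_bucket(ts) for ts, _ in late_rows}
    for bucket in affected:
        # Inverse mapping restricts the scan to the matching source interval,
        # playing the role of the narrowed "new defining query".
        lo, hi = source_range(bucket)
        derived[bucket] = sum(v for ts, v in source if lo <= ts < hi)
    return affected
```

In this sketch, calling `propagate_update` with late-arriving rows leaves the derived table equal to what `full_recompute` would produce, while rewriting only the affected buckets. A real warehouse would push the interval predicate into an indexed or partitioned scan instead of filtering a Python list.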