Generation and graphical display of data transform provenance metadata

    Publication No.: US11314769B2

    Publication date: 2022-04-26

    Application No.: US16014005

    Application date: 2018-06-21

    Abstract: Techniques for propagation of deletion operations among a plurality of related datasets are described herein. In an embodiment, a data processing method comprises, using a distributed database system that is programmed to manage a plurality of different raw datasets and a plurality of derived datasets that have been derived from the raw datasets based on a plurality of derivation relationships that link the raw datasets to the derived datasets: from a first dataset that is stored in the distributed database system, determining a subset of records that are candidates for propagated deletion of specified data values; determining one or more particular raw datasets that contain the subset of records; deleting the specified data values from the particular raw datasets; based on the plurality of derivation relationships and the particular raw datasets, identifying one or more particular derived datasets that have been derived from the particular raw datasets; generating and executing a build of the one or more particular derived datasets to result in creating and storing the one or more particular derived datasets without the specified data values that were deleted from the particular raw datasets; repeating the generating and executing for all derived datasets that have derivation relationships to the particular raw datasets; wherein the method is performed using one or more processors.
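
The deletion-propagation flow in the abstract can be illustrated with a minimal in-memory sketch. This is not the patented implementation: the names `store`, `parents`, `builders`, and `should_delete` are hypothetical, whole records are dropped rather than individual values, and the candidate-selection step is merged into a direct scan of the raw datasets. It assumes Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter


def propagate_deletion(store, parents, builders, should_delete):
    """Delete matching records from raw datasets, then rebuild dependents.

    store:         dict of dataset name -> list of records (hypothetical layout)
    parents:       dict of derived dataset name -> list of parent dataset names
    builders:      dict of derived dataset name -> function(list of parent record lists)
    should_delete: predicate that marks a record for deletion
    Returns the derived dataset names in the order they were rebuilt.
    """
    # Raw datasets are assumed to be those with no recorded parents.
    raw_names = [name for name in store if name not in parents]

    # Find the raw datasets that contain candidate records and delete them there.
    stale = set()
    for name in raw_names:
        kept = [r for r in store[name] if not should_delete(r)]
        if len(kept) != len(store[name]):
            store[name] = kept
            stale.add(name)

    # Walk the derivation graph parents-first; rebuild anything with a stale parent.
    rebuilt = []
    for name in TopologicalSorter(parents).static_order():
        if name in parents and any(p in stale for p in parents[name]):
            store[name] = builders[name]([store[p] for p in parents[name]])
            stale.add(name)
            rebuilt.append(name)
    return rebuilt


# Example: "doubled" derives from "raw_a"; "total" derives from "doubled" and "raw_b".
store = {"raw_a": [1, 2, 3], "raw_b": [10, 20], "doubled": [2, 4, 6], "total": [42]}
parents = {"doubled": ["raw_a"], "total": ["doubled", "raw_b"]}
builders = {
    "doubled": lambda ps: [x * 2 for x in ps[0]],
    "total": lambda ps: [sum(ps[0]) + sum(ps[1])],
}
rebuilt = propagate_deletion(store, parents, builders, lambda r: r == 2)
```

Processing datasets in topological order means a single pass both discovers every transitively affected derived dataset and rebuilds it only after its parents are up to date.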

    SYSTEMS AND METHODS FOR CREATING AND MANAGING A DATA INTEGRATION WORKSPACE

    Publication No.: US20190147114A1

    Publication date: 2019-05-16

    Application No.: US15956600

    Application date: 2018-04-18

    Abstract: Systems and methods are provided for creating and managing a data integration workspace. The workspace may comprise one or more views of data (or datasets) stored in or accessible by the system. Models may be generated and updated based on the plurality of datasets and presented via a graphical user interface. Feedback received via a graphical user interface presenting a model may be used to annotate an underlying dataset associated with the model. Responsive to a modification of the underlying dataset or the rules for using the underlying dataset to generate the model, other related datasets and/or models may be automatically updated accordingly. Templates associated with one or more types of users may be defined. Each template may comprise one or more specific models related to a specific type of user.
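
As a rough illustration of the update behavior described in this abstract, the sketch below keeps models in sync with their underlying datasets and routes feedback on a model back to the dataset as an annotation. The `Workspace` class and all of its method names are hypothetical and are not taken from the application; each model is assumed to depend on exactly one dataset.

```python
class Workspace:
    """Toy workspace: models are recomputed when their dataset or rule changes."""

    def __init__(self):
        self.datasets = {}     # dataset name -> rows
        self.annotations = {}  # dataset name -> list of feedback notes
        self.rules = {}        # model name -> (dataset name, rule function)
        self.models = {}       # model name -> current model output

    def set_dataset(self, name, rows):
        # Creating or modifying a dataset refreshes every model built on it.
        self.datasets[name] = rows
        self._refresh(name)

    def define_model(self, model, dataset, rule):
        # Defining or changing a rule also triggers a refresh.
        self.rules[model] = (dataset, rule)
        self._refresh(dataset)

    def annotate(self, model, note):
        # Feedback given on a model is stored against its underlying dataset.
        dataset, _ = self.rules[model]
        self.annotations.setdefault(dataset, []).append(note)

    def _refresh(self, dataset):
        for model, (source, rule) in self.rules.items():
            if source == dataset and source in self.datasets:
                self.models[model] = rule(self.datasets[source])


ws = Workspace()
ws.set_dataset("sales", [1, 2, 3])
ws.define_model("total", "sales", sum)
ws.set_dataset("sales", [4, 5])          # "total" is recomputed automatically
ws.annotate("total", "verify these rows")
```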
