Self-correcting pipeline flows for schema drift

    Publication number: US12147395B2

    Publication date: 2024-11-19

    Application number: US16656372

    Filing date: 2019-10-17

    Abstract: Techniques described herein update pipeline flows within data systems to maintain data integrity and consistency without manual curation. In certain embodiments, a data integration system may detect and/or receive indications of a schema change within a source system of the data integration system. One or more data objects affected by the schema change may be identified, and a set of pipeline rules may be invoked for each of the affected schema changes. The pipeline rules may define a single transformation or a multi-step transformation process by which the data in the source system is provided to one or more target systems. After the pipeline rules are applied to the updated source schema, the data received from the source system may be processed using the updated pipeline rules, transformed, and transmitted to the target system(s) to maintain the data integrity of the system.
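The flow in this abstract (detect a schema change, find the affected objects, apply rules, then keep loading) can be illustrated with a minimal Python sketch. This is one possible reading of the abstract, not the patented method: the names `SchemaChange`, `detect_changes`, `apply_rules`, and `run_pipeline`, the two change kinds ("rename" and "drop"), and the explicit rename hints are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SchemaChange:
    """One detected change in the source schema (hypothetical model)."""
    kind: str                      # "rename" or "drop" (assumed change kinds)
    column: str                    # column name in the old schema
    new_name: Optional[str] = None


def detect_changes(old_cols: List[str], new_cols: List[str],
                   renames: Optional[Dict[str, str]] = None) -> List[SchemaChange]:
    """Compare two column lists; `renames` is an assumed hint of old -> new names."""
    renames = renames or {}
    changes = []
    for col in old_cols:
        if col in new_cols:
            continue  # column unchanged
        if col in renames and renames[col] in new_cols:
            changes.append(SchemaChange("rename", col, renames[col]))
        else:
            changes.append(SchemaChange("drop", col))
    return changes


def apply_rules(mapping: Dict[str, str],
                changes: List[SchemaChange]) -> Dict[str, str]:
    """Return a new target-field -> source-column mapping updated for each change."""
    updated = dict(mapping)
    for change in changes:
        # Target fields that read from the changed source column.
        affected = [t for t, s in updated.items() if s == change.column]
        for target in affected:
            if change.kind == "rename":
                updated[target] = change.new_name  # repoint to the new column
            elif change.kind == "drop":
                del updated[target]                # stop loading this field
    return updated


def run_pipeline(mapping: Dict[str, str], rows: List[dict]) -> List[dict]:
    """Transform source rows into target rows using the mapping."""
    return [{t: row[s] for t, s in mapping.items()} for row in rows]


# Example: "name" was renamed to "full_name" and "fax_no" was dropped.
mapping = {"customer_id": "id", "customer_name": "name", "fax": "fax_no"}
changes = detect_changes(["id", "name", "fax_no"], ["id", "full_name"],
                         renames={"name": "full_name"})
new_mapping = apply_rules(mapping, changes)
result = run_pipeline(new_mapping, [{"id": 1, "full_name": "Ada"}])
```

Here the original mapping is left untouched and a corrected copy is returned, so a failed rule application would not corrupt the running pipeline; a real system would also need rules for added columns and type changes, which this sketch omits.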

    TECHNIQUES FOR METADATA VALUE-BASED MAPPING DURING DATA LOAD IN DATA INTEGRATION JOB

    Publication number: US20230289360A1

    Publication date: 2023-09-14

    Application number: US17690495

    Filing date: 2022-03-09

    CPC classification number: G06F16/254

    Abstract: The present embodiments relate to metadata value-based mapping during a data load in a data integration job. A computing device can receive a first data set from a source system and computer-readable instructions to load data into a target system. The device can receive a first metadata set from the target system that describes destinations. The computing device can identify a first data value of the first data set that matches a metadata value of the first metadata set. The device can receive a data integration mapping of a second data value of the first data set to a data field associated with the matching metadata value of the first metadata set. The device can load the second data value of the first data set from the source system into the target system pursuant to the mapping and the computer-readable instructions.
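As a rough illustration of the mapping step in this abstract, the Python sketch below routes a second value from each source row into the target field whose metadata value matches a first value in the same row. It is only an interpretation of the abstract: the function name `load_by_metadata`, the row layout with `attribute` and `value` columns, and the field names `ATTR_1` and `ATTR_2` are hypothetical.

```python
from typing import Dict, List, Tuple


def load_by_metadata(rows: List[dict],
                     target_metadata: Dict[str, str],
                     key_column: str,
                     value_column: str) -> Tuple[Dict[str, str], List[dict]]:
    """Build a target record by matching source values against target metadata.

    target_metadata maps a metadata value (e.g. an attribute name) to the
    target data field associated with it. Both are assumed structures.
    """
    target_record = {}
    unmatched = []
    for row in rows:
        key = row[key_column]              # the "first data value"
        field = target_metadata.get(key)   # field tied to the matching metadata value
        if field is None:
            unmatched.append(row)          # no destination described by the target
            continue
        target_record[field] = row[value_column]  # load the "second data value"
    return target_record, unmatched


# Example: the target describes two destinations; "weight" has none.
rows = [
    {"attribute": "color", "value": "red"},
    {"attribute": "size", "value": "M"},
    {"attribute": "weight", "value": "2kg"},
]
metadata = {"color": "ATTR_1", "size": "ATTR_2"}
record, skipped = load_by_metadata(rows, metadata, "attribute", "value")
```

Rows without a matching metadata value are returned separately rather than dropped silently, so the caller can decide whether to reject the load or log them.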

    DISCOVERY OF SOURCE RANGE PARTITIONING INFORMATION IN DATA EXTRACT JOB

    Publication number: US20240202210A1

    Publication date: 2024-06-20

    Application number: US18084421

    Filing date: 2022-12-19

    CPC classification number: G06F16/27

    Abstract: Techniques are described for the discovery of source range partitioning information. An example method includes a device determining a partition boundary value for the data based at least in part on the following steps. The device can determine a first plurality of bounded value sets and a second plurality of bounded value sets. The device can calculate a first average value of a first value and a second average value. The device can determine a first deviation value of the first average value from the first value and a second deviation value of the second average value from a third value. The device can determine the first partition boundary value based at least in part on the first deviation value and the second deviation value, the first partition boundary value being the first candidate partition boundary value or the second candidate partition boundary value.

    Techniques for metadata value-based mapping during data load in data integration job

    Publication number: US11899680B2

    Publication date: 2024-02-13

    Application number: US17690495

    Filing date: 2022-03-09

    CPC classification number: G06F16/254

    Abstract: The present embodiments relate to metadata value-based mapping during a data load in a data integration job. A computing device can receive a first data set from a source system and computer-readable instructions to load data into a target system. The device can receive a first metadata set from the target system that describes destinations. The computing device can identify a first data value of the first data set that matches a metadata value of the first metadata set. The device can receive a data integration mapping of a second data value of the first data set to a data field associated with the matching metadata value of the first metadata set. The device can load the second data value of the first data set from the source system into the target system pursuant to the mapping and the computer-readable instructions.
