AUTOMATIC DATA LINTING RULES FOR ETL PIPELINES

    Publication number: US20240193174A1

    Publication date: 2024-06-13

    Application number: US18080268

    Filing date: 2022-12-13

    CPC classification number: G06F16/254

    Abstract: The present disclosure describes systems and methods that allow a non-coding user to create a ruleset for transforming a database in an ETL pipeline. Specifically, a user supplies a database and receives a ruleset to apply to that database in an ETL pipeline. The data linting system extracts a schema and a data sample from the database, then uses the schema and the data sample to create two rulesets. The data linting system combines these two rulesets into a final ruleset, which may be validated against the data sample. The data linting system then sends the final ruleset and the validation report to the user. The user therefore only needs to provide a database and receives a ruleset that can be used immediately in an ETL pipeline.
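The flow in the abstract (extract a schema and a sample, derive two rulesets, combine them, validate against the sample) can be illustrated with a minimal Python sketch. This is not the patented implementation. Every name here (`extract_schema_and_sample`, `rules_from_schema`, `rules_from_sample`, `check`, `lint`) and the two rule kinds (`type:<name>` and `not_null`) are hypothetical. The sketch also assumes that one ruleset is derived from the schema and the other from the data sample, which the abstract does not state.

```python
from typing import Any

Row = dict[str, Any]
Rule = tuple[str, str]  # (column, check name), e.g. ("id", "not_null")


def extract_schema_and_sample(rows: list[Row], sample_size: int = 100):
    """Take the first rows as a sample and infer a column -> type-name schema."""
    sample = rows[:sample_size]
    schema: dict[str, str] = {}
    for row in sample:
        for column, value in row.items():
            # The first non-null value seen decides the column type.
            if value is not None and column not in schema:
                schema[column] = type(value).__name__
    return schema, sample


def rules_from_schema(schema: dict[str, str]) -> set[Rule]:
    """Assumed ruleset 1: one type rule per column in the schema."""
    return {(column, f"type:{name}") for column, name in schema.items()}


def rules_from_sample(sample: list[Row]) -> set[Rule]:
    """Assumed ruleset 2: not-null rules for columns with no nulls in the sample."""
    columns = {column for row in sample for column in row}
    return {
        (column, "not_null")
        for column in columns
        if all(row.get(column) is not None for row in sample)
    }


def check(rule: Rule, row: Row) -> bool:
    """Return True if the row satisfies the rule."""
    column, name = rule
    value = row.get(column)
    if name == "not_null":
        return value is not None
    if name.startswith("type:"):
        # Nulls are left to the not_null rule, so they pass the type rule.
        return value is None or type(value).__name__ == name[len("type:"):]
    return True  # unknown rule kinds are ignored in this sketch


def lint(rows: list[Row]):
    """Combine both rulesets and count violations per rule on the sample."""
    schema, sample = extract_schema_and_sample(rows)
    final_rules = rules_from_schema(schema) | rules_from_sample(sample)
    report = {
        rule: sum(not check(rule, row) for row in sample) for rule in final_rules
    }
    return sorted(final_rules), report
```

Calling `lint(rows)` on a list of row dictionaries returns the combined ruleset together with a per-rule violation count, standing in for the final ruleset and validation report that the abstract says are sent to the user.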
