-
Publication No.: US20230100418A1
Publication Date: 2023-03-30
Application No.: US17665109
Filing Date: 2022-02-04
Applicant: Ab Initio Technology LLC
Inventor: Dusan Radivojevic, Robert Parks, Adam Weiss, Maja Jankovic, John Vickery
IPC: G06F3/06
Abstract: An electronic system for increasing the speed of preparing data with a specified data quality for storage by automatically identifying for a user, with minimal user input, common contexts among (i) fields in disparate datasets, and (ii) names the user has specified as potentially describing the fields, and by using those common contexts to govern the disparate datasets prior to storage to ensure the specified data quality.
-
Publication No.: US20220398337A1
Publication Date: 2022-12-15
Application No.: US17834492
Filing Date: 2022-06-07
Applicant: Ab Initio Technology LLC
Inventor: Pierre Franquin, Ken Krigelman, Andy Schon, Justin Voshell
IPC: G06F21/62, G06F16/2455, G06F16/28
Abstract: Some embodiments relate to a method for use in connection with governance of a plurality of data assets managed by a data processing system, the method comprising: using at least one computer hardware processor to perform: accessing a data governance policy comprising a first data standard (e.g., by obtaining information about the first standard stored in a database system); generating a first data asset collection at least in part by automatically selecting, from among the plurality of data assets managed by the data processing system and using at least one data asset criterion, one or more data assets that meet the at least one data asset criterion; associating the first data asset collection with the first data standard; and verifying whether at least one of the one or more data assets in the first data asset collection complies with the first data standard.
-
Publication No.: US20220374413A1
Publication Date: 2022-11-24
Application No.: US17576572
Filing Date: 2022-01-14
Applicant: Ab Initio Technology LLC
Inventor: Joel Gould, Dusan Radivojevic
IPC: G06F16/23, G06F16/28, G06F16/901, G06F11/36
Abstract: A data processing system configured to perform: obtaining a first data lineage representing relationships among physical data elements, the first data lineage being generated at least in part by performing at least one of: (a) analyzing source code of at least one computer program configured to access the physical data elements; and (b) analyzing information obtained during runtime of the at least one computer program; obtaining, based on user input, a second data lineage representing relationships among business data elements; obtaining an association between at least some of the physical data elements of the first data lineage and at least some of the business data elements of the second data lineage; and generating, based on the association between the physical data elements and the business data elements, an indication of agreement or discrepancy between the first data lineage and the second data lineage.
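The comparison step in this abstract can be illustrated with a minimal sketch. It assumes each lineage is a set of directed (source, target) edges and that the association is a plain mapping from physical element names to business element names. The function name `compare_lineages` and all element names are hypothetical and do not come from the patent.

```python
def compare_lineages(physical_edges, business_edges, association):
    """Project physical lineage onto business elements and compare the two.

    Assumption: lineages are sets of (source, target) pairs, and
    `association` maps physical element names to business element names.
    """
    projected = set()
    for source, target in physical_edges:
        # Skip physical elements that have no associated business element.
        if source not in association or target not in association:
            continue
        b_source, b_target = association[source], association[target]
        # Hops between two physical copies of one business element
        # carry no business-level lineage, so they are dropped.
        if b_source != b_target:
            projected.add((b_source, b_target))
    return {
        "agree": sorted(projected & business_edges),
        "only_physical": sorted(projected - business_edges),
        "only_business": sorted(business_edges - projected),
    }
```

Edges listed under `only_physical` or `only_business` are the discrepancies: relationships that one lineage records and the other does not.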
-
Publication No.: US11487534B2
Publication Date: 2022-11-01
Application No.: US17306075
Filing Date: 2021-05-03
Applicant: Ab Initio Technology LLC
Inventor: John Joyce, Marshall A. Isman, Sam Kendall
Abstract: A method for analyzing a computer program ecosystem includes performing a static analysis, including identifying static dependencies among elements of the ecosystem based on values of parameters in one or more parameter sets associated with the ecosystem, the elements of the ecosystem including the computer programs of the ecosystem and data resources associated with the computer programs. The method includes performing a runtime analysis, including identifying elements of the ecosystem that were utilized during execution of the ecosystem to process data records. The method includes performing a schedule analysis, including identifying a computer program of the ecosystem that has a schedule dependency from another computer program of the ecosystem. The method includes identifying a subset of the elements of the ecosystem as an ecosystem unit based on the results of the static, runtime, and schedule analyses. The method includes migrating the ecosystem unit, testing the ecosystem unit, or both.
-
Publication No.: US11455229B2
Publication Date: 2022-09-27
Application No.: US17067020
Filing Date: 2020-10-09
Applicant: Ab Initio Technology LLC
Inventor: Ilya Rozenberg, Adam Weiss
Abstract: A method for displaying differences between a first executable dataflow graph and a second executable dataflow graph includes comparing a specification of the first executable dataflow graph and a specification of the second executable dataflow graph, including at least one of identifying a particular node or link of the first dataflow graph that does not correspond to any node or link of the second dataflow graph; and identifying a first node or link of the first dataflow graph that corresponds to a second node or link of the second dataflow graph, and identifying a difference between the first node or link and the second node or link. The method includes formulating and displaying a graphical representation of at least some of the nodes or links of the first dataflow graph or the second dataflow graph, the graphical representation including a graphical indicator of at least one of the identified particular node or link and the identified difference between the first node or link and the second node or link.
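The comparison described in this abstract can be sketched roughly as follows. The sketch assumes that a graph specification is a set of named nodes with property dictionaries plus a set of directed links, and that two nodes correspond when they share an id. `GraphSpec` and `diff_graphs` are hypothetical names, and the result is a plain dictionary rather than the graphical display the abstract describes.

```python
from dataclasses import dataclass, field


@dataclass
class GraphSpec:
    # Assumed shape: node id -> property dict, plus (from, to) link pairs.
    nodes: dict = field(default_factory=dict)
    links: set = field(default_factory=set)


def diff_graphs(first, second):
    """Report elements of `first` with no counterpart or with differences."""
    # Nodes and links of the first graph that match nothing in the second.
    unmatched_nodes = sorted(set(first.nodes) - set(second.nodes))
    unmatched_links = sorted(first.links - second.links)

    # Corresponding nodes (same id) whose properties differ.
    changed_nodes = {}
    for node_id in sorted(set(first.nodes) & set(second.nodes)):
        a, b = first.nodes[node_id], second.nodes[node_id]
        keys = sorted(k for k in set(a) | set(b) if a.get(k) != b.get(k))
        if keys:
            changed_nodes[node_id] = keys

    return {
        "unmatched_nodes": unmatched_nodes,
        "unmatched_links": unmatched_links,
        "changed_nodes": changed_nodes,
    }
```

A display layer could then highlight every id in `unmatched_nodes` and `changed_nodes` when drawing the first graph.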
-
Publication No.: US11423083B2
Publication Date: 2022-08-23
Application No.: US15795917
Filing Date: 2017-10-27
Applicant: Ab Initio Technology LLC
Inventor: Jonah Egenolf, Marshall A. Isman, Frederic Wild
IPC: G06F16/901, G06F16/26, G06F8/34, G06F8/10, G06F16/25
Abstract: A method performed by a computer system including: accessing a specification that specifies a plurality of modules to be implemented by the computer program for processing the one or more values of the one or more fields in the structured data item; transforming the specification into the computer program that implements the plurality of modules, wherein the transforming includes: for each of one or more first modules of the plurality of modules: identifying one or more second modules of the plurality of modules that each receive input that is at least partly based on an output of the first module; and formatting an output data format of the first module such that the first module outputs only one or more values of one or more fields of the structured data item.
-
Publication No.: US11409545B2
Publication Date: 2022-08-09
Application No.: US17025513
Filing Date: 2020-09-18
Applicant: Ab Initio Technology LLC
Inventor: Oded Ravid, Trevor Murphy, Larry Paul Rossi, Joel Gould
IPC: G06F7/00, G06F9/448, G06F16/901, G06F16/2455, G06F16/178, G06F9/445, G06F8/41
Abstract: A method is described for processing keyed data items that are each associated with a value of a key, the keyed data items being from a plurality of distinct data streams, the processing including collecting the keyed data items, determining, based on contents of at least one of the keyed data items, satisfaction of one or more specified conditions for execution of one or more actions and causing execution of at least one of the one or more actions responsive to the determining.
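As a rough illustration of the collect, test and act sequence in this abstract, the sketch below groups incoming items by key and runs an action once a caller-supplied condition holds for that key. The class name `KeyedCollector`, the `receive` method, and the choice to clear a key after its action fires are all assumptions made for the example, not details from the patent.

```python
from collections import defaultdict


class KeyedCollector:
    """Collect keyed items from several streams and fire an action per key."""

    def __init__(self, condition, action):
        self.items = defaultdict(list)   # key -> [(stream name, payload), ...]
        self.condition = condition       # callable(items for key) -> bool
        self.action = action             # callable(key, items for key)

    def receive(self, stream, key, payload):
        collected = self.items[key]
        collected.append((stream, payload))
        # Evaluate the condition on the contents collected so far for this key.
        if self.condition(collected):
            self.action(key, list(collected))
            # Assumption: collected items are discarded once the action runs.
            del self.items[key]
            return True
        return False
```

For example, a condition such as `lambda items: {s for s, _ in items} >= {"orders", "payments"}` fires the action for a key only after both hypothetical streams have delivered an item with that key.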
-
Publication No.: US20220245154A1
Publication Date: 2022-08-04
Application No.: US17587130
Filing Date: 2022-01-28
Applicant: Ab Initio Technology LLC
Inventor: Halldor Isak Gylfason, Robert Parks, Dusan Radivojevic, Adam Harris Weiss
IPC: G06F16/2455, G06F16/28, G06F16/27
Abstract: Techniques for storing data entities by a data processing system are described herein. The data processing system may store a plurality of data entity instances generated using a plurality of data entities. The plurality of data entity instances may include a first data entity instance generated using a first data entity and a second data entity instance generated using a second data entity. The first data entity instance may include a first attribute that is configured to inherit its value from a second attribute of the second data entity instance. The data processing system may provide the inherited value of the second attribute of the second data entity instance as the value of the first attribute of the first data entity instance.
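The attribute inheritance described in this abstract can be pictured with a small sketch, assuming an in-memory model in which an instance either stores a value for an attribute or holds a reference to another instance's attribute. `EntityInstance` and its methods are hypothetical names chosen for the example.

```python
class EntityInstance:
    """A data entity instance whose attributes may inherit from another instance."""

    def __init__(self, name, values=None):
        self.name = name
        self._values = dict(values or {})
        self._inherits = {}  # attribute -> (source instance, source attribute)

    def inherit(self, attr, source, source_attr):
        # Configure `attr` to take its value from `source.source_attr`.
        self._inherits[attr] = (source, source_attr)

    def set(self, attr, value):
        self._values[attr] = value

    def get(self, attr):
        # An inherited attribute is resolved at read time, so later
        # changes to the source are visible through this instance.
        if attr in self._inherits:
            source, source_attr = self._inherits[attr]
            return source.get(source_attr)
        return self._values[attr]
```

Resolving the value on read rather than copying it is one possible design. It keeps the inheriting instance in step with its source, at the cost of a lookup per read.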
-
Publication No.: US11281693B2
Publication Date: 2022-03-22
Application No.: US16175487
Filing Date: 2018-10-30
Applicant: Ab Initio Technology LLC
Inventor: Craig W. Stanfill, Joseph Skeffington Wholey, III
Abstract: A method for processing tasks in a distributed data processing system includes processing sets of tasks. The method includes maintaining, at a first processing node, a number of counters including a working counter indicating a current time interval of the number of time intervals in the distributed data processing system, and a replication counter indicating a time interval of the number of time intervals for which at least one of (1) all tasks associated with that time interval, or (2) all corresponding results associated with that time interval, are replicated at multiple processing nodes of the number of processing nodes. The method includes providing messages from the first processing node to the other processing nodes of the number of processing nodes, the messages including the working counter and the replication counter.
-
Publication No.: US20210263734A1
Publication Date: 2021-08-26
Application No.: US17306075
Filing Date: 2021-05-03
Applicant: Ab Initio Technology LLC
Inventor: John Joyce, Marshall A. Isman, Sam Kendall
Abstract: A method for analyzing a computer program ecosystem including multiple computer programs includes performing a static analysis of the ecosystem, including identifying static dependencies among elements of the ecosystem based on values of parameters in one or more parameter sets associated with the ecosystem, the elements of the ecosystem including the computer programs of the ecosystem and data resources associated with the computer programs. The method includes performing a runtime analysis of the ecosystem, including identifying elements of the ecosystem that were utilized during execution of the ecosystem to process data records. The method includes performing a schedule analysis of the ecosystem, including identifying a computer program of the ecosystem that has a schedule dependency from another computer program of the ecosystem. The method includes identifying a subset of the elements of the ecosystem as an ecosystem unit based on the results of the static, runtime, and schedule analyses. The method includes migrating the ecosystem unit from a first computer system to a second computer system, testing the ecosystem unit, or both.