-
Publication No.: US11734264B2
Publication date: 2023-08-22
Application No.: US17558097
Filing date: 2021-12-21
Applicant: Ab Initio Technology LLC
Inventor: Jonah Egenolf, Marshall A. Isman, Ian Schechter
IPC: G06F16/2453, G06F16/242, G06F8/34, G06F8/36, G06F16/2452, G06F8/38, G06F16/23, G06F16/21, G06F16/28, G06Q10/10, G06Q30/0242
CPC classification number: G06F16/2423, G06F8/34, G06F8/36, G06F8/38, G06F16/211, G06F16/2365, G06F16/2453, G06F16/24524, G06F16/24526, G06F16/24544, G06F16/24545, G06F16/288, G06Q10/10, G06Q30/0243
Abstract: A method includes accessing a schema that specifies relationships among datasets, computations on the datasets, or transformations of the datasets, selecting a dataset from among the datasets, and identifying, from the schema, other datasets that are related to the selected dataset. Attributes of the datasets are identified, and logical data representing the identified attributes and relationships among the attributes is generated. The logical data is provided to a development environment, which provides access to portions of the logical data representing the identified attributes. A specification that specifies at least one of the identified attributes in performing an operation is received from the development environment. Based on the specification and the relationships among the identified attributes represented by the logical data, a computer program is generated to perform the operation by accessing, from storage, at least one dataset having the at least one of the attributes specified in the specification.
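The flow in this abstract (schema, related datasets, logical data, generated program) can be illustrated with a minimal Python sketch. This is not the patented implementation: the dataset names, the relationship list, and the SQL-like output are all hypothetical, chosen only to show how a specification that names attributes could be turned into a program that touches only the datasets it needs.

```python
from collections import deque

# Hypothetical schema: dataset name -> attribute names (invented for illustration).
DATASETS = {
    "orders": ["order_id", "customer_id", "amount"],
    "customers": ["customer_id", "name", "region_id"],
    "regions": ["region_id", "region_name"],
    "audit_log": ["event_id", "message"],
}
# Hypothetical relationships: (dataset_a, dataset_b, shared key).
RELATIONSHIPS = [
    ("orders", "customers", "customer_id"),
    ("customers", "regions", "region_id"),
]

def related_datasets(selected):
    """Breadth-first walk of the relationships; returns {dataset: (parent, key) or None}."""
    parents = {selected: None}
    queue = deque([selected])
    while queue:
        current = queue.popleft()
        for a, b, key in RELATIONSHIPS:
            for src, dst in ((a, b), (b, a)):
                if src == current and dst not in parents:
                    parents[dst] = (src, key)
                    queue.append(dst)
    return parents

def build_logical_data(selected):
    """Map each attribute to the nearest related dataset that provides it."""
    parents = related_datasets(selected)
    logical = {}
    for dataset in parents:  # dicts keep insertion order, so this is BFS order
        for attr in DATASETS[dataset]:
            logical.setdefault(attr, dataset)
    return logical, parents

def generate_program(selected, wanted_attrs):
    """Emit a SQL-like program that joins only the datasets the specification needs."""
    logical, parents = build_logical_data(selected)
    needed = set()
    for attr in wanted_attrs:
        dataset = logical[attr]
        # Include the dataset and every dataset on the path back to the selected one.
        while dataset is not None and dataset not in needed:
            needed.add(dataset)
            link = parents[dataset]
            dataset = link[0] if link else None
    columns = ", ".join(f"{logical[a]}.{a}" for a in wanted_attrs)
    lines = [f"SELECT {columns}", f"FROM {selected}"]
    for dataset, link in parents.items():
        if link is not None and dataset in needed:
            parent, key = link
            lines.append(f"JOIN {dataset} ON {parent}.{key} = {dataset}.{key}")
    return "\n".join(lines)

print(generate_program("orders", ["amount", "region_name"]))
```

In this sketch the user only names attributes (`amount`, `region_name`); the join path through `customers` is derived from the relationships, and the unrelated `audit_log` dataset is never accessed.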
-
Publication No.: US20220342935A1
Publication date: 2022-10-27
Application No.: US17858605
Filing date: 2022-07-06
Applicant: Ab Initio Technology LLC
Inventor: Jonah Egenolf, Marshall A. Isman, Frederic Wild
IPC: G06F16/901, G06F16/26, G06F8/34, G06F8/10, G06F16/25
Abstract: A method performed by a computer system including: accessing a specification that specifies a plurality of modules to be implemented by the computer program for processing the one or more values of the one or more fields in the structured data item; transforming the specification into the computer program that implements the plurality of modules, wherein the transforming includes: for each of one or more first modules of the plurality of modules: identifying one or more second modules of the plurality of modules that each receive input that is at least partly based on an output of the first module; and formatting an output data format of the first module such that the first module outputs only one or more values of one or more fields of the structured data item.
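As a rough illustration of the output-formatting step described in this abstract, the Python sketch below trims each module's output to the fields that its consumers read. The module names, the `SPEC` layout, and the field names are assumptions made for this example; the sketch considers direct consumers only and does not model fields that are passed through a chain of modules.

```python
# Hypothetical specification: each module lists its upstream modules, the fields
# it reads, and the fields it is able to produce (all names are illustrative).
SPEC = {
    "parse": {"inputs": [], "reads": set(),
              "produces": {"id", "name", "email", "raw_line"}},
    "mask": {"inputs": ["parse"], "reads": {"email"},
             "produces": {"email_masked"}},
    "report": {"inputs": ["parse", "mask"], "reads": {"id", "email_masked"},
               "produces": set()},
}

def downstream(spec, name):
    """Modules whose input is at least partly based on the output of `name`."""
    return [module for module, details in spec.items() if name in details["inputs"]]

def output_formats(spec):
    """For each module, keep only the produced fields that some consumer reads."""
    formats = {}
    for name, details in spec.items():
        needed = set()
        for consumer in downstream(spec, name):
            needed |= spec[consumer]["reads"]
        formats[name] = sorted(details["produces"] & needed)
    return formats

print(output_formats(SPEC))
```

Here `parse` can produce four fields, but its output format is reduced to `email` and `id` because no consumer reads `name` or `raw_line`.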
-
Publication No.: US20220147529A1
Publication date: 2022-05-12
Application No.: US17558097
Filing date: 2021-12-21
Applicant: Ab Initio Technology LLC
Inventor: Jonah Egenolf, Marshall A. Isman, Ian Schechter
IPC: G06F16/2453, G06F16/21, G06F16/2452, G06F16/28
Abstract: A method includes accessing a schema that specifies relationships among datasets, computations on the datasets, or transformations of the datasets, selecting a dataset from among the datasets, and identifying, from the schema, other datasets that are related to the selected dataset. Attributes of the datasets are identified, and logical data representing the identified attributes and relationships among the attributes is generated. The logical data is provided to a development environment, which provides access to portions of the logical data representing the identified attributes. A specification that specifies at least one of the identified attributes in performing an operation is received from the development environment. Based on the specification and the relationships among the identified attributes represented by the logical data, a computer program is generated to perform the operation by accessing, from storage, at least one dataset having the at least one of the attributes specified in the specification.
-
Publication No.: US11210285B2
Publication date: 2021-12-28
Application No.: US17025751
Filing date: 2020-09-18
Applicant: Ab Initio Technology LLC
Inventor: Jonah Egenolf, Marshall A. Isman, Ian Schechter
IPC: G06F9/44, G06F16/242, G06F8/34, G06F8/36, G06F16/2452, G06F8/38, G06F16/23, G06Q10/10, G06Q30/02
Abstract: A method includes accessing a schema that specifies relationships among datasets, computations on the datasets, or transformations of the datasets, selecting a dataset from among the datasets, and identifying, from the schema, other datasets that are related to the selected dataset. Attributes of the datasets are identified, and logical data representing the identified attributes and relationships among the attributes is generated. The logical data is provided to a development environment, which provides access to portions of the logical data representing the identified attributes. A specification that specifies at least one of the identified attributes in performing an operation is received from the development environment. Based on the specification and the relationships among the identified attributes represented by the logical data, a computer program is generated to perform the operation by accessing, from storage, at least one dataset having the at least one of the attributes specified in the specification.
-
Publication No.: US20210263900A1
Publication date: 2021-08-26
Application No.: US17006504
Filing date: 2020-08-28
Applicant: Ab Initio Technology LLC
Inventor: John Joyce, Marshall A. Isman, Sandrick Melbouci
IPC: G06F16/215, G06F16/28, G06F16/22, G06N20/00, G06N5/04
Abstract: Methods and systems are configured to determine a semantic meaning for data and generate data processing rules based on the semantic meaning of the data. The semantic meaning includes syntactical or contextual meaning for the data that is determined, for example, by profiling, by the data processing system, values stored in a field included in data records of one or more datasets; applying, by the data processing system, one or more classifiers to the profiled values; identifying, based on applying the one or more classifiers, one or more attributes indicative of a logical or syntactical characteristic for the values of the field, with each of the one or more attributes having a respective confidence level that is based on an output of each of the one or more classifiers. The attributes are associated with the fields and are used for generating data processing rules and processing the data.
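The profile, classify, and label steps in this abstract can be sketched in Python as follows. This is an assumption-laden illustration, not the method claimed: the classifiers are simple regular expressions, the labels are invented, and the confidence level is taken to be the fraction of non-empty values that a classifier matches.

```python
import re

# Hypothetical classifiers: attribute label -> pattern (illustrative only).
CLASSIFIERS = {
    "us_zip_code": re.compile(r"\d{5}(-\d{4})?"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "iso_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
}

def profile(values):
    """Profile the values stored in one field."""
    non_empty = [v.strip() for v in values if v and v.strip()]
    return {"count": len(values), "non_empty": non_empty,
            "distinct": len(set(non_empty))}

def classify(field_profile):
    """Apply every classifier; confidence is the share of values it matches."""
    values = field_profile["non_empty"]
    if not values:
        return {}
    return {
        label: sum(1 for v in values if pattern.fullmatch(v)) / len(values)
        for label, pattern in CLASSIFIERS.items()
    }

def label_field(values, threshold=0.8):
    """Keep the attributes whose confidence reaches the (assumed) threshold."""
    scores = classify(profile(values))
    return {label: conf for label, conf in scores.items() if conf >= threshold}

print(label_field(["02139", "10001-1234", "94105", "n/a", "60601"]))
```

A downstream rule generator could then key off the returned attribute, for example by attaching a ZIP code validation rule to any field labeled `us_zip_code`.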
-
Publication No.: US20190266075A1
Publication date: 2019-08-29
Application No.: US16362964
Filing date: 2019-03-25
Applicant: Ab Initio Technology LLC
Inventor: Marshall A. Isman, Richard A. Epstein, Ralf Haug, Andrew F. Roberts, John Ralston, John L. Richardson, Justin Pniower
IPC: G06F11/36, G06F16/21, G06F16/9535
Abstract: A computer-implemented method includes accessing a plurality of data records, each data record having a plurality of data fields. The method further includes analyzing values for one or more of the data fields for at least some of the plurality of data records and generating a profile of the plurality of data records based on the analyzing. The method further includes formulating at least one subsetting rule based on the profile; and selecting a subset of data records from the plurality of data records based on the at least one subsetting rule.
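One way to picture profile-driven subsetting is the Python sketch below. The rule it formulates (cover every distinct value of each low-cardinality field at least once) is an assumption chosen for illustration; the abstract does not specify which subsetting rules are derived, and the record layout is hypothetical.

```python
from collections import Counter

def build_profile(records, fields):
    """Profile: for each field, count how often each value occurs."""
    return {field: Counter(record[field] for record in records) for field in fields}

def formulate_rule(profile, max_distinct=10):
    """Assumed rule: fields with few distinct values must be fully covered."""
    return [field for field, counts in profile.items() if len(counts) <= max_distinct]

def select_subset(records, rule_fields):
    """Keep a record only if it contributes a (field, value) pair not yet seen."""
    seen = set()
    subset = []
    for record in records:
        new_pairs = {(field, record[field]) for field in rule_fields} - seen
        if new_pairs:
            subset.append(record)
            seen |= new_pairs
    return subset

records = [
    {"id": 1, "state": "MA", "tier": "gold"},
    {"id": 2, "state": "MA", "tier": "gold"},
    {"id": 3, "state": "NY", "tier": "gold"},
    {"id": 4, "state": "MA", "tier": "basic"},
]
rule = formulate_rule(build_profile(records, ["id", "state", "tier"]), max_distinct=3)
print(rule, [r["id"] for r in select_subset(records, rule)])
```

Record 2 is dropped because it adds no value combination that record 1 did not already cover, which is the kind of reduction that keeps a test dataset small while preserving variety.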
-
Publication No.: US10185641B2
Publication date: 2019-01-22
Application No.: US14573038
Filing date: 2014-12-17
Applicant: Ab Initio Technology LLC
Inventor: Marshall A. Isman, Richard Alan Epstein
Abstract: A method includes receiving data indicative of a number of times each of one or more rules was executed by a data processing application during processing of one or more records; based on the number of times each of the rules was executed by the data processing application, determining a content criterion for each of one or more particular fields; generating content for each of the particular fields based on the content criterion; and populating each of the particular fields with the generated content.
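The idea of deriving field content from rule execution counts can be sketched in Python as below. Everything concrete here is hypothetical: the rule table, the assumption that each rule fires on a numeric range of one field, and the choice to generate the midpoint of that range for any rule that has not executed often enough.

```python
# Hypothetical rules: name -> (field, inclusive lower bound, inclusive upper bound).
RULES = {
    "small_order": ("amount", 0, 99),
    "large_order": ("amount", 100, 999),
    "bulk_order": ("amount", 1000, 5000),
}

def content_criteria(execution_counts, min_executions=1):
    """For each under-executed rule, record the value range that would trigger it."""
    criteria = {}
    for name, (field, low, high) in RULES.items():
        if execution_counts.get(name, 0) < min_executions:
            criteria.setdefault(field, []).append((low, high))
    return criteria

def generate_records(criteria, template):
    """Populate the field of a copied template record with content meeting each criterion."""
    generated = []
    for field, ranges in criteria.items():
        for low, high in ranges:
            record = dict(template)
            record[field] = (low + high) // 2  # any value inside the range would do
            generated.append(record)
    return generated

counts = {"small_order": 42, "large_order": 0}  # bulk_order never reported at all
new_records = generate_records(content_criteria(counts), {"order_id": 0, "amount": 0})
print([r["amount"] for r in new_records])
```

Feeding the generated records back through the application should cause `large_order` and `bulk_order` to execute, which is the coverage gap the execution counts exposed.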
-
Publication No.: US20170351494A1
Publication date: 2017-12-07
Application No.: US15433467
Filing date: 2017-02-15
Applicant: Ab Initio Technology LLC
Inventor: Marshall A. Isman, John Joyce
CPC classification number: G06F8/433, G06F8/70, G06F8/71, G06F9/44526, G06F11/3624, G06F11/3664, G06F16/258, G06F16/9024
Abstract: A method includes analyzing, by a processor, a first version of a computer program, the analyzing including identifying a first process included in the first version of the computer program, the first process configured to perform an operation on data having a first format; and by a processor, generating a second version of at least a portion of the computer program, including omitting the first process and including in the second version of the at least portion of the computer program one or more second processes configured to perform a second operation on data of a second format different from the first format, wherein the second operation is based on the first operation.
-
Publication No.: US12242442B2
Publication date: 2025-03-04
Application No.: US18399522
Filing date: 2023-12-28
Applicant: Ab Initio Technology LLC
Inventor: John Joyce, Marshall A. Isman, Sandrick Melbouci
Abstract: Methods and systems are configured to determine a semantic meaning for data and generate data processing rules based on the semantic meaning of the data. The semantic meaning includes syntactical or contextual meaning for the data that is determined, for example, by profiling, by the data processing system, values stored in a field included in data records of one or more datasets; applying, by the data processing system, one or more classifiers to the profiled values; identifying, based on applying the one or more classifiers, one or more attributes indicative of a logical or syntactical characteristic for the values of the field, with each of the one or more attributes having a respective confidence level that is based on an output of each of the one or more classifiers. The attributes are associated with the fields and are used for generating data processing rules and processing the data.
-
Publication No.: US20240346051A1
Publication date: 2024-10-17
Application No.: US18496543
Filing date: 2023-10-27
Applicant: Ab Initio Technology LLC
Inventor: Marshall A. Isman, Adam Weiss, Jonah Egenolf, Robert Parks, John MacLean, Richard Mellon, Dusan Radivojevic, Paul Veiser, Mazin Woodrow Khader
CPC classification number: G06F16/288, G06F3/048, G06F9/4494, G06F9/451
Abstract: A method implemented by a data processing system for enabling a system to pipeline or otherwise process data in conformance with specified criteria by providing a graphical user interface for selecting data to be processed, determining metadata of selected data, and, based on the metadata, automatically processing the selected data in conformance with the specified criteria.