Patent search aee:"Ab Initio Technology LLC", page 10

91.

Granted invention patent
Generating rules for data processing values of data fields from semantic labels of the data fields [Active]

Publication number: US11886399B2

Publication date: 2024-01-30

Application number: US17006504

Filing date: 2020-08-28

Applicant: Ab Initio Technology LLC

Inventors: John Joyce, Marshall A. Isman, Sandrick Melbouci

IPC classification: G06F16/00, G06F16/215, G06F16/28, G06N5/04, G06N20/00, G06F16/22

CPC classification: G06F16/215, G06F16/2228, G06F16/285, G06N5/04, G06N20/00

Abstract: Methods and systems are configured to determine a semantic meaning for data and generate data processing rules based on the semantic meaning of the data. The semantic meaning includes syntactical or contextual meaning for the data that is determined, for example, by profiling, by the data processing system, values stored in a field included in data records of one or more datasets; applying, by the data processing system, one or more classifiers to the profiled values; identifying, based on applying the one or more classifiers, one or more attributes indicative of a logical or syntactical characteristic for the values of the field, with each of the one or more attributes having a respective confidence level that is based on an output of each of the one or more classifiers. The attributes are associated with the fields and are used for generating data processing rules and processing the data.
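The profile-then-classify flow this abstract describes can be illustrated with a minimal Python sketch. It is not the patented method: the shape-based profile, the two classifiers, the label names (`us_zip_code`, `iso_date`), and the 0.8 threshold are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def profile(values):
    """Summarize a field: value count, distinct count, and character shapes."""
    def shape(value):
        # Hypothetical shape encoding: digits become "9", letters become "A".
        return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", value))

    return {
        "count": len(values),
        "distinct": len(set(values)),
        "shapes": Counter(shape(v) for v in values),
    }


def shape_classifier(label, pattern):
    """Build a classifier scoring the fraction of values whose shape matches."""
    def classify(prof):
        if prof["count"] == 0:
            return label, 0.0
        hits = sum(n for s, n in prof["shapes"].items() if re.fullmatch(pattern, s))
        return label, hits / prof["count"]

    return classify


# Illustrative classifiers only; a real system would use many more signals.
CLASSIFIERS = [
    shape_classifier("us_zip_code", r"99999(-9999)?"),
    shape_classifier("iso_date", r"9999-99-99"),
]


def label_field(values, threshold=0.8):
    """Return {attribute: confidence} for classifiers that meet the threshold."""
    prof = profile(values)
    scored = [classify(prof) for classify in CLASSIFIERS]
    return {label: conf for label, conf in scored if conf >= threshold}
```

In this sketch the confidence is simply the matching fraction of profiled values; the attributes returned by `label_field` would then be the input to whatever rule-generation step follows.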

92.

Published invention application
DISCOVERING A SEMANTIC MEANING OF DATA FIELDS FROM PROFILE DATA OF THE DATA FIELDS [Pending, published]

Publication number: US20230409835A1

Publication date: 2023-12-21

Application number: US18201545

Filing date: 2023-05-24

Applicant: Ab Initio Technology LLC

Inventors: Christopher Thurston Butler, Timothy Spencer Bush

IPC classification: G06F40/30, G06F16/93, G06N20/00, G06F16/908

CPC classification: G06F40/30, G06F16/908, G06N20/00, G06F16/93

Abstract: A data processing system for discovering a semantic meaning of a field included in one or more data sets is configured to identify a field included in one or more data sets, with the field having an identifier. For that field, the system profiles data values of the field to generate a data profile, accesses a plurality of label proposal tests, and generates a set of label proposals by applying the plurality of label proposal tests to the data profile. The system determines a similarity among the label proposals and selects a classification. The system identifies one of the label proposals as identifying the semantic meaning. The system stores the identifier of the field with the identified one of the label proposals that identifies the semantic meaning.

93.

Granted invention patent
Publishing to a data warehouse [Active]

Publication number: US11835994B2

Publication date: 2023-12-05

Application number: US16517320

Filing date: 2019-07-19

Applicant: Ab Initio Technology LLC

Inventors: Andrew Blom, Darren Miller, Marshall A. Isman

IPC classification: G06F7/00, G06F17/00, G06F16/25, G06F16/901, G06F8/34, H04L67/565

CPC classification: G06F16/254, G06F8/34, G06F16/258, G06F16/9024, H04L67/565

Abstract: A method for generating an executable application to transform and load data into a structured dataset includes receiving a metadata file that specifies values for parameters for structuring data feeds, received from a networked data source, into a structured database. The metadata file specifies logical rules for transforming the data feeds. The values of the parameters and the logical rules for transforming the plurality of the data feeds are validated to ensure logical consistency for each data feed. Data rules are generated that specify standards for transforming each data feed in accordance with the validated values of the parameters and logical rules. The executable application is generated that is configured to receive source data comprising a data feed from one or more data sources and transform the source data into structured data that satisfies the one or more standards for the structured data record in compliance with the data rules.

94.

Published invention application
DATAFLOW GRAPH DATASETS [Pending, published]

Publication number: US20230359668A1

Publication date: 2023-11-09

Application number: US18114212

Filing date: 2023-02-24

Applicant: Ab Initio Technology LLC

Inventors: Ian Robert Schechter, Garth Allen Dickie, Jonah Egenolf, Marshall Isman

IPC classification: G06F16/901

CPC classification: G06F16/9024

Abstract: Described herein are techniques, performed by a data processing system, for enabling efficient development of software application programs in a dynamic environment with multiple datasets by generating entries in a dataset catalog to provide a software application program with access to output data dynamically generated by dataflow graphs, the entries associated with respective software application programs developed as dataflow graphs. The techniques include identifying a subgraph, wherein, when the subgraph is executed, the subgraph generates output data by applying one or more data processing operations to data obtained from one or more data sources; creating, in the dataset catalog, a new entry associated with the identified subgraph, the new entry associated with information indicating nodes, links, and configuration parameters of the identified subgraph; and configuring the dataset catalog to enable access to the new entry, in the dataset catalog, associated with the identified subgraph.

95.

Granted invention patent
Debugging an executable control flow graph that specifies control flow [Active]

Publication number: US11782820B2

Publication date: 2023-10-10

Application number: US17029828

Filing date: 2020-09-23

Applicant: Ab Initio Technology LLC

Inventors: Joyce L. Vigneau, Mark Staknis, Xin Li

IPC classification: G06F11/36, G06F11/32

CPC classification: G06F11/3664, G06F11/323, G06F11/362, G06F11/3636, G06F11/3656, G06F11/3696

Abstract: A computer-implemented method for debugging an executable control flow graph that specifies control flow among a plurality of functional modules, with the control flow being represented as transitions among the plurality of functional modules, the computer-implemented method including: specifying a position in the executable control flow graph at which execution of the executable control flow graph is to be interrupted; wherein the specified position represents a transition to a given functional module, a transition to a state in which contents of the given functional module are executed or a transition from the given functional module; starting execution of the executable control flow graph in an execution environment; and at a point of execution representing the specified position, interrupting execution of the executable control flow graph; and providing data representing one or more attributes of the execution environment in which the given functional module is being executed.

96.

Granted invention patent
Workload automation and data lineage analysis [Active]

Publication number: US11748165B2

Publication date: 2023-09-05

Application number: US16906193

Filing date: 2020-06-19

Applicant: Ab Initio Technology LLC

Inventors: Harry Michael Wolfson, Joel Gould, Anthony Yeracaris, Tim Wakeling

IPC classification: G06F9/50

CPC classification: G06F9/5038

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for workload automation and job scheduling information. One of the methods includes obtaining job dependency information, the job dependency information specifying an order of execution of a plurality of jobs. The method also includes obtaining data lineage information that identifies dependency relationships between data stores and transformation, wherein at least one transformation accepts data from a first data store and produces data for a second data store. The method also includes creating links between the job dependency information and the data lineage information. The method also includes determining an impact of a change in a planned execution of an application of the plurality of applications based on the job dependency information, the created links, and the data lineage information.
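As a rough illustration of how job dependencies, lineage, and the links between them might combine into an impact analysis, here is a minimal Python sketch. The data shapes (`job_order`, `lineage` triples, `job_transforms`) and the function names are hypothetical and are not taken from the patent.

```python
from collections import deque


def reachable(start, edges):
    """Return every node reachable from start by following edges."""
    seen, queue = set(), deque([start])
    while queue:
        for nxt in edges.get(queue.popleft(), ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def impact_of_change(job, job_order, lineage, job_transforms):
    """Return (downstream jobs, affected data stores) for a changed job.

    job_order:      job -> jobs scheduled after it (assumed shape)
    lineage:        (source store, transformation, target store) triples
    job_transforms: job -> transformations it runs (the "links")
    """
    jobs = {job} | reachable(job, job_order)

    # Store-to-store edges derived from the lineage triples.
    store_edges = {}
    for src, _transform, dst in lineage:
        store_edges.setdefault(src, []).append(dst)

    # Stores written directly by any transformation of an affected job.
    transforms = {t for j in jobs for t in job_transforms.get(j, ())}
    written = {dst for _src, t, dst in lineage if t in transforms}

    # Add every store derived from those stores.
    stores = set(written)
    for store in written:
        stores |= reachable(store, store_edges)
    return jobs - {job}, stores


JOB_ORDER = {"load_orders": ["build_report"], "build_report": []}
LINEAGE = [
    ("orders_raw", "clean_orders", "orders_clean"),
    ("orders_clean", "summarize", "report"),
]
JOB_TRANSFORMS = {"load_orders": ["clean_orders"], "build_report": ["summarize"]}
```

With this sample data, changing `load_orders` affects the later job `build_report` and both derived stores, while changing `build_report` affects only the `report` store.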

97.

Granted invention patent
Generating, accessing, and displaying lineage metadata [Active]

Publication number: US11741091B2

Publication date: 2023-08-29

Application number: US15829152

Filing date: 2017-12-01

Applicant: Ab Initio Technology LLC

Inventors: David Clemens, Dusan Radivojevic, Neil Galarneau

IPC classification: G06F16/245, G06F16/22, G06F16/248, G06F16/83, G06F40/117

CPC classification: G06F16/245, G06F16/22, G06F16/248, G06F16/83, G06F40/117

Abstract: Among other things, we describe a method of receiving a portion of metadata from a data source, the portion of metadata describing nodes and edges; generating instances of a data structure representing the portion of metadata, at least one instance of the data structure including an identification value that identifies a corresponding node, one or more property values representing respective properties of the corresponding node, and one or more pointers to respective identification values, each pointer representing an edge associated with a node identified by the corresponding respective identification value; storing the instances of the data structure in random access memory; receiving a query that includes an identification of at least one particular element of data; and using at least one instance of the data structure to cause a display of a computer system to display a representation of lineage of the particular element of data.
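The data structure outlined in this abstract (an identification value, property values, and pointers to other identification values) can be sketched in Python as follows. The class name `NodeRecord`, the `upstream_ids` field, and the traversal are assumptions for illustration, not the patent's implementation; the display step is left out.

```python
from dataclasses import dataclass, field


@dataclass
class NodeRecord:
    """In-memory record for one metadata node (hypothetical layout)."""
    node_id: str
    properties: dict = field(default_factory=dict)
    # Each entry points at another node's id and stands for one edge.
    upstream_ids: list = field(default_factory=list)


def build_records(nodes, edges):
    """Build records from {id: properties} and (source_id, target_id) edges."""
    records = {nid: NodeRecord(nid, dict(props)) for nid, props in nodes.items()}
    for src, dst in edges:
        records[dst].upstream_ids.append(src)
    return records


def lineage(records, node_id):
    """Return the ids of all upstream nodes of node_id, nearest first."""
    seen, order, stack = set(), [], [node_id]
    while stack:
        current = stack.pop()
        for up in records[current].upstream_ids:
            if up not in seen:
                seen.add(up)
                order.append(up)
                stack.append(up)
    return order
```

Holding the records in a plain dictionary keyed by id means a lineage query only follows id pointers in memory, which is the property the abstract emphasizes by storing the instances in random access memory.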

98.

Granted invention patent
Processing data from multiple sources [Active]

Publication number: US11720583B2

Publication date: 2023-08-08

Application number: US17878106

Filing date: 2022-08-01

Applicant: Ab Initio Technology LLC

Inventors: Ian Schechter, Tim Wakeling, Ann M. Wollrath

IPC classification: G06F16/24, G06F16/2458, G06F16/13, G06F16/25, G06F16/28, G06F16/17, G06F16/901, G06F9/50

CPC classification: G06F16/2471, G06F9/5066, G06F16/13, G06F16/1734, G06F16/254, G06F16/285, G06F16/9024, G06F16/284

Abstract: In a first aspect, a method includes, at a node of a Hadoop cluster, the node storing a first portion of data in HDFS data storage, executing a first instance of a data processing engine capable of receiving data from a data source external to the Hadoop cluster, receiving a computer-executable program by the data processing engine, executing at least part of the program by the first instance of the data processing engine, receiving, by the data processing engine, a second portion of data from the external data source, storing the second portion of data other than in HDFS storage, and performing, by the data processing engine, a data processing operation identified by the program using at least the first portion of data and the second portion of data.

99.

Granted invention patent
Dynamic execution of parameterized applications for the processing of keyed network data streams [Active]

Publication number: US11669343B2

Publication date: 2023-06-06

Application number: US17477922

Filing date: 2021-09-17

Applicant: Ab Initio Technology LLC

Inventors: Oded Ravid, Trevor Murphy

IPC classification: G06F7/00, G06F9/448, G06F16/901, G06F16/2455, G06F16/178, G06F9/445, G06F8/41

CPC classification: G06F9/4494, G06F9/44505, G06F16/1794, G06F16/24568, G06F16/9024, G06F8/433

Abstract: A method is described for processing keyed data items that are each associated with a value of a key, the keyed data items being from a plurality of distinct data streams, the processing including collecting the keyed data items, determining, based on contents of at least one of the keyed data items, satisfaction of one or more specified conditions for execution of one or more actions and causing execution of at least one of the one or more actions responsive to the determining.
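The collect, check, and act sequence in this abstract can be pictured with a small Python sketch. The `KeyedCollector` class, its fire-once-per-key behavior, and the stream names in the usage note are hypothetical choices made for illustration only.

```python
from collections import defaultdict


class KeyedCollector:
    """Collect items per key and run an action once a condition holds."""

    def __init__(self, condition, action):
        self.items = defaultdict(list)   # key -> [(stream, payload), ...]
        self.condition = condition       # callable(items_for_key) -> bool
        self.action = action             # callable(key, items_for_key)
        self.fired = set()               # keys whose action already ran

    def add(self, stream, key, payload):
        """Record one keyed item, then evaluate the condition for its key."""
        self.items[key].append((stream, payload))
        if key not in self.fired and self.condition(self.items[key]):
            self.fired.add(key)
            self.action(key, self.items[key])


def seen_on_all(streams):
    """Condition: the key has at least one item from each named stream."""
    required = set(streams)
    return lambda items: required <= {stream for stream, _ in items}
```

For example, `KeyedCollector(seen_on_all(["orders", "payments"]), handler)` would call `handler` for a key the first time that key has appeared on both streams, and not again afterwards.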

100.

Invention application
METADATA-DRIVEN DATA INGESTION [Active]

Publication number: US20230100418A1

Publication date: 2023-03-30

Application number: US17665109

Filing date: 2022-02-04

Applicant: Ab Initio Technology LLC

Inventors: Dusan Radivojevic, Robert Parks, Adam Weiss, Maja Jankovic, John Vickery

IPC classification: G06F3/06

Abstract: An electronic system for increasing the speed of preparing data with a specified data quality for storage by automatically identifying for a user, with minimal user input, common contexts among (i) fields in disparate datasets, and (ii) names the user has specified as potentially describing the fields, and by using those common contexts to govern the disparate datasets prior to storage to ensure the specified data quality.
