-
Publication No.: US10860635B2
Publication date: 2020-12-08
Application No.: US15694192
Filing date: 2017-09-01
Applicant: Ab Initio Technology LLC
Inventor: Erik Bator, Joel Gould, Dusan Radivojevic, Tim Wakeling
Abstract: In general, a specification of multiple contexts that are related according to a hierarchy is received. Relationships are determined among three or more metadata objects, and at least some of the metadata objects are grouped into one or more respective groups. Each of at least some of the groups is based on a selected one of the contexts and is represented by a node in a diagram. Relationships among the nodes are determined based on the relationships among the metadata objects in the groups represented by the nodes, and a visual representation is generated of the diagram including the nodes and the relationships among the nodes.
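The grouping step in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the context names ("system", "application"), the object names, and the `build_diagram` helper are all hypothetical, and the sketch assumes each metadata object carries one value per context level.

```python
from collections import defaultdict

# Hypothetical metadata objects, each tagged with a value for every context level.
OBJECTS = {
    "orders.csv": {"system": "sales", "application": "intake"},
    "clean_orders": {"system": "sales", "application": "cleansing"},
    "ledger": {"system": "finance", "application": "posting"},
}

# Hypothetical object-level relationships as (source, target) pairs.
RELATIONSHIPS = [("orders.csv", "clean_orders"), ("clean_orders", "ledger")]


def build_diagram(objects, relationships, context):
    """Group objects by the selected context and lift relationships to the groups."""
    groups = defaultdict(set)
    for name, contexts in objects.items():
        groups[contexts[context]].add(name)

    edges = set()
    for source, target in relationships:
        a = objects[source][context]
        b = objects[target][context]
        if a != b:  # a relationship inside one group collapses into its node
            edges.add((a, b))
    return dict(groups), edges


nodes, edges = build_diagram(OBJECTS, RELATIONSHIPS, "system")
```

Selecting the coarser "system" context yields two nodes joined by one edge, while selecting "application" yields three nodes and two edges from the same underlying relationships.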
-
Publication No.: US20150302075A1
Publication date: 2015-10-22
Application No.: US14255579
Filing date: 2014-04-17
Applicant: Ab Initio Technology LLC
Inventor: Ian Schechter, Tim Wakeling, Ann M. Wollrath
IPC: G06F17/30
CPC classification number: G06F17/30545, G06F9/5066, G06F17/30091, G06F17/30144, G06F17/30563, G06F17/30595, G06F17/30598, G06F17/30958
Abstract: In a first aspect, a method includes, at a node of a Hadoop cluster, the node storing a first portion of data in HDFS data storage, executing a first instance of a data processing engine capable of receiving data from a data source external to the Hadoop cluster, receiving a computer-executable program by the data processing engine, executing at least part of the program by the first instance of the data processing engine, receiving, by the data processing engine, a second portion of data from the external data source, storing the second portion of data other than in HDFS storage, and performing, by the data processing engine, a data processing operation identified by the program using at least the first portion of data and the second portion of data.
-
Publication No.: US20150301861A1
Publication date: 2015-10-22
Application No.: US14690112
Filing date: 2015-04-17
Applicant: Ab Initio Technology LLC
Inventor: Dino LaChiusa, Joyce L. Vigneau, Mark Buxbaum, Brad Lee Miller, Tim Wakeling
CPC classification number: G06F9/4881, G06F9/54, G06F11/3003, G06F11/3031, G06F11/3055, G06F11/3072, G06F11/328, G06F11/3409, G06F11/3433, G06F11/3452, G06F2201/865
Abstract: A method of managing components in a processing environment is provided. The method includes monitoring (i) a status of each of one or more computing devices, (ii) a status of each of one or more applications, each application hosted by at least one of the computing devices, and (iii) a status of each of one or more jobs, each job associated with at least one of the applications; determining that one of the status of one of the computing devices, the status of one of the applications, and the status of one of the jobs is indicative of a performance issue associated with the corresponding computing device, application, or job, the determination being made based on a comparison of a performance of the computing device, application, or job and at least one predetermined criterion; and enabling an action to be performed associated with the performance issue.
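The monitoring method in this abstract compares the status of devices, applications, and jobs against predetermined criteria. A minimal sketch of that comparison follows; the metric names, thresholds, and the `find_issues` and `enable_action` helpers are assumptions made for illustration and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class Status:
    kind: str    # "device", "application", or "job"
    name: str
    metric: str
    value: float


# Hypothetical predetermined criteria: (kind, metric) -> maximum acceptable value.
CRITERIA = {
    ("device", "cpu_percent"): 90.0,
    ("application", "error_rate"): 0.05,
    ("job", "runtime_seconds"): 3600.0,
}


def find_issues(statuses, criteria=CRITERIA):
    """Return (status, limit) pairs whose value exceeds its criterion."""
    issues = []
    for status in statuses:
        limit = criteria.get((status.kind, status.metric))
        if limit is not None and status.value > limit:
            issues.append((status, limit))
    return issues


def enable_action(issue):
    """Produce an alert message; a real system might offer a restart or reschedule."""
    status, limit = issue
    return f"alert: {status.kind} {status.name} {status.metric} exceeds {limit}"


observed = [
    Status("device", "host-a", "cpu_percent", 95.0),
    Status("application", "billing", "error_rate", 0.01),
    Status("job", "nightly-load", "runtime_seconds", 1200.0),
]
issues = find_issues(observed)
alerts = [enable_action(issue) for issue in issues]
```

Keeping the criteria in a lookup table keyed by component kind and metric lets one loop cover all three component types.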
-
Publication No.: US20140143760A1
Publication date: 2014-05-22
Application No.: US13678921
Filing date: 2012-11-16
Applicant: AB INITIO TECHNOLOGY LLC
Inventor: Mark Buxbaum, Michael G. Mulligan, Tim Wakeling, Matthew Darcy Atterbury
IPC: G06F11/36
CPC classification number: G06F11/3041, G06F11/3003, G06F11/3082, G06F11/323, G06F11/3404, G06F11/3419, G06F11/3476, G06F2201/865, G06Q30/0201
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for dynamic graph performance monitoring. One of the methods includes receiving multiple units of work that each include one or more work elements. The method includes determining a characteristic of the first unit of work. The method includes identifying, by a component of the first dataflow graph, a second dataflow graph from multiple available dataflow graphs based on the determined characteristic, the multiple available dataflow graphs being stored in a data storage system. The method includes processing the first unit of work using the second dataflow graph. The method includes determining one or more performance metrics associated with the processing.
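The selection-and-measurement flow in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the in-memory `AVAILABLE_GRAPHS` registry stands in for dataflow graphs kept in a data storage system, the "format" field is a hypothetical characteristic, and plain Python callables stand in for the graphs themselves.

```python
import time

# Hypothetical registry standing in for stored dataflow graphs, keyed by characteristic.
AVAILABLE_GRAPHS = {
    "upper": lambda unit: [element.upper() for element in unit["elements"]],
    "lower": lambda unit: [element.lower() for element in unit["elements"]],
}


def process_unit(unit, graphs=AVAILABLE_GRAPHS):
    """Pick a graph from a characteristic of the unit of work, run it, and time it."""
    characteristic = unit["format"]       # determine the characteristic
    graph = graphs[characteristic]        # identify the second graph
    start = time.perf_counter()
    result = graph(unit)                  # process the unit of work
    elapsed = time.perf_counter() - start
    metrics = {
        "graph": characteristic,
        "elements": len(unit["elements"]),
        "seconds": elapsed,
    }
    return result, metrics


result, metrics = process_unit({"format": "upper", "elements": ["a", "b"]})
```

Returning the metrics alongside the result keeps the measurement tied to the specific graph that was chosen at run time.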
-
Publication No.: US20240126561A1
Publication date: 2024-04-18
Application No.: US18492173
Filing date: 2023-10-23
Applicant: Ab Initio Technology LLC
Inventor: Frank Lynch, Tim Wakeling
CPC classification number: G06F9/445, G06F8/63, G06F9/451, G06F9/45545, G06F9/5072
Abstract: A method implemented by a data processing system including: accessing the container image that includes the first application and a second application; determining, by the data processing system, the number of parallel executions of the given module of the first application; for the given module, generating a plurality of instances of the container image in accordance with the number of parallel executions determined, for each instance, configuring that instance to execute the given module of the first application; causing each of the plurality of configured instances to execute on one or more of the host systems; and for at least one of the plurality of configured instances, causing, by the second application of that configured instance, communication between the data processing system and the one or more of the host systems executing that configured instance.
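The instance-generation step in this abstract creates one configured container instance per parallel execution of a module. The sketch below models only that planning logic with plain dictionaries; it calls no container runtime, and the image name, module names, host names, and the `plan_instances` and `assign_hosts` helpers are all hypothetical.

```python
def plan_instances(image, parallelism):
    """Return one configuration per parallel execution of each module."""
    plan = []
    for module, count in parallelism.items():
        for index in range(count):
            plan.append({"image": image, "module": module, "instance": index})
    return plan


def assign_hosts(plan, hosts):
    """Spread the configured instances across host systems round-robin."""
    return [dict(config, host=hosts[i % len(hosts)]) for i, config in enumerate(plan)]


# Hypothetical inputs: module "sort" runs 3-way parallel, "join" runs 2-way.
plan = plan_instances("example/app:1.0", {"sort": 3, "join": 2})
placed = assign_hosts(plan, ["host-1", "host-2"])
```

Round-robin placement is an assumption chosen for brevity; the abstract only requires that each configured instance execute on one or more of the host systems.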
-
Publication No.: US11836505B2
Publication date: 2023-12-05
Application No.: US16656886
Filing date: 2019-10-18
Applicant: Ab Initio Technology LLC
Inventor: Frank Lynch, Tim Wakeling
CPC classification number: G06F9/445, G06F8/63, G06F9/451, G06F9/45545, G06F9/5072
Abstract: A method implemented by a data processing system including: accessing the container image that includes the first application and a second application; determining, by the data processing system, the number of parallel executions of the given module of the first application; for the given module, generating a plurality of instances of the container image in accordance with the number of parallel executions determined, for each instance, configuring that instance to execute the given module of the first application; causing each of the plurality of configured instances to execute on one or more of the host systems; and for at least one of the plurality of configured instances, causing, by the second application of that configured instance, communication between the data processing system and the one or more of the host systems executing that configured instance.
-
Publication No.: US11341155B2
Publication date: 2022-05-24
Application No.: US16902949
Filing date: 2020-06-16
Applicant: Ab Initio Technology LLC
Inventor: Tim Wakeling, Adam Weiss
IPC: G06F16/25, G06F16/28, G06F16/2457, G06F16/23, G06F16/84, G06F16/27, G06F40/197
Abstract: Mapping data stored in a data storage system for use by a computer system includes processing specifications of dataflow graphs that include nodes representing computations interconnected by links representing flows of data. At least one of the dataflow graphs receives a flow of data from at least one input dataset and at least one of the dataflow graphs provides a flow of data to at least one output dataset. A mapper identifies one or more sets of datasets. Each dataset in a given set matches one or more criteria for identifying different versions of a single dataset. A user interface is provided to receive a mapping between at least two datasets in a given set. The mapping received over the user interface is stored in association with a dataflow graph that provides data to or receives data from the datasets of the mapping.
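The mapper in this abstract identifies sets of datasets that look like different versions of a single dataset. A minimal sketch follows, assuming one hypothetical matching criterion (identical file name under different directories); the abstract does not state the actual criteria, and the paths and the `candidate_sets` helper are invented for illustration.

```python
import posixpath
from collections import defaultdict


def candidate_sets(dataset_paths):
    """Group dataset paths that share a file name; keep groups with 2+ members."""
    by_name = defaultdict(list)
    for path in dataset_paths:
        by_name[posixpath.basename(path)].append(path)
    return [sorted(paths) for _, paths in sorted(by_name.items()) if len(paths) > 1]


# Hypothetical datasets read or written by dataflow graphs.
datasets = ["/dev/customers.dat", "/prod/customers.dat", "/prod/orders.dat"]
sets_found = candidate_sets(datasets)

# A user-confirmed mapping would then be stored alongside the graph that uses it.
stored_mappings = {"load_customers_graph": tuple(sets_found[0])}
```

In this sketch the candidate sets are only suggestions; the mapping itself is whatever the user confirms through the interface.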
-
Publication No.: US10705877B2
Publication date: 2020-07-07
Application No.: US14470501
Filing date: 2014-08-27
Applicant: Ab Initio Technology LLC
Inventor: Harry Michael Wolfson, Joel Gould, Anthony Yeracaris, Tim Wakeling
IPC: G06F9/50
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for workload automation and job scheduling information. One of the methods includes obtaining job dependency information, the job dependency information specifying an order of execution of a plurality of jobs. The method also includes obtaining data lineage information that identifies dependency relationships between data stores and transformation, wherein at least one transformation accepts data from a first data store and produces data for a second data store. The method also includes creating links between the job dependency information and the data lineage information. The method also includes determining an impact of a change in a planned execution of an application of the plurality of applications based on the job dependency information, the created links, and the data lineage information.
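The impact analysis in this abstract combines job dependency information with data lineage through links between the two. One way to picture it is a pair of graph traversals, sketched below; the job names, data store names, the `JOB_WRITES` link table, and both helper functions are hypothetical and are not taken from the patent.

```python
from collections import deque

# Hypothetical job dependency information: job -> jobs that run after it.
JOB_DEPENDENCIES = {"load": ["transform"], "transform": ["report"]}
# Hypothetical data lineage: data store -> data stores derived from it.
DATA_LINEAGE = {"raw": ["clean"], "clean": ["summary"]}
# Hypothetical links between the two: job -> data stores it produces.
JOB_WRITES = {"load": ["raw"], "transform": ["clean"], "report": ["summary"]}


def reachable(start, graph):
    """Breadth-first search for every node downstream of start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for successor in graph.get(node, []):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


def impact_of_change(job):
    """Return the jobs and data stores affected if the given job's schedule changes."""
    jobs = {job} | reachable(job, JOB_DEPENDENCIES)
    stores = set()
    for affected in jobs:
        for store in JOB_WRITES.get(affected, []):
            stores.add(store)
            stores |= reachable(store, DATA_LINEAGE)
    return jobs, stores


jobs, stores = impact_of_change("transform")
```

Here a change to "transform" touches the downstream "report" job and the "clean" and "summary" stores, but not "raw", which is produced upstream.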
-
Publication No.: US10452509B2
Publication date: 2019-10-22
Application No.: US16137822
Filing date: 2018-09-21
Applicant: Ab Initio Technology LLC
Inventor: Mark Buxbaum, Michael G. Mulligan, Tim Wakeling, Matthew Darcy Atterbury
IPC: G06F11/30, G06Q30/02, G06Q40/06, G06F16/2455, G06F11/34
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for dynamic graph performance monitoring. One of the methods includes receiving input data by the data processing system, the input data provided by an application executing on the data processing system. The method includes determining a characteristic of the input data. The method includes identifying, by the application, a dynamic component from multiple available dynamic components based on the determined characteristic, the multiple available dynamic components being stored in a data storage system. The method includes processing the input data using the identified dynamic component. The method also includes determining one or more performance metrics associated with the processing.
-
Publication No.: US10235204B2
Publication date: 2019-03-19
Application No.: US14690112
Filing date: 2015-04-17
Applicant: Ab Initio Technology LLC
Inventor: Dino LaChiusa, Joyce L. Vigneau, Mark Buxbaum, Brad Lee Miller, Tim Wakeling
Abstract: A method of managing components in a processing environment is provided. The method includes monitoring (i) a status of each of one or more computing devices, (ii) a status of each of one or more applications, each application hosted by at least one of the computing devices, and (iii) a status of each of one or more jobs, each job associated with at least one of the applications; determining that one of the status of one of the computing devices, the status of one of the applications, and the status of one of the jobs is indicative of a performance issue associated with the corresponding computing device, application, or job, the determination being made based on a comparison of a performance of the computing device, application, or job and at least one predetermined criterion; and enabling an action to be performed associated with the performance issue.