Fault tolerant batch processing
    1.
    Granted patent
    Status: in force

    Publication No.: US08205113B2

    Publication date: 2012-06-19

    Application No.: US12502851

    Filing date: 2009-07-14

    IPC classification: G06F11/00

    Abstract: Among other aspects disclosed are a method and system for processing a batch of input data in a fault tolerant manner. The method includes reading a batch of input data including a plurality of records from one or more data sources and passing the batch through a dataflow graph. The dataflow graph includes two or more nodes representing components connected by links representing flows of data between the components. At least one but fewer than all of the components includes a checkpoint process for an action performed for each of multiple units of work associated with one or more of the records. The checkpoint process includes opening a checkpoint buffer stored in non-volatile memory at the start of processing for the batch. For each unit of work from the batch, if a result from performing the action for the unit of work was previously saved in the checkpoint buffer, the saved result is used to complete processing of the unit of work without performing the action again. If a result from performing the action for the unit of work is not saved in the checkpoint buffer, the action is performed to complete processing of the unit of work and the result from performing the action is saved in the checkpoint buffer.
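The checkpoint process described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class name `CheckpointBuffer`, the JSON-lines file layout, and the use of the unit's position in the batch as its key are all assumptions made for the example, and torn writes after a crash are not handled.

```python
import json
import os


class CheckpointBuffer:
    """Hypothetical append-only file mapping unit-of-work keys to saved results."""

    def __init__(self, path):
        # Open the buffer at the start of batch processing and load any
        # results saved by an earlier, interrupted run.
        self.saved = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                for line in f:
                    entry = json.loads(line)
                    self.saved[entry["key"]] = entry["result"]
        self._file = open(path, "a", encoding="utf-8")

    def run(self, key, action, unit):
        # A previously saved result completes the unit without repeating the action.
        if key in self.saved:
            return self.saved[key]
        # Otherwise perform the action and persist its result before moving on.
        result = action(unit)
        self._file.write(json.dumps({"key": key, "result": result}) + "\n")
        self._file.flush()
        os.fsync(self._file.fileno())
        self.saved[key] = result
        return result

    def close(self):
        self._file.close()


def process_batch(units, action, path):
    """Run `action` once per unit, reusing checkpointed results on restart."""
    buf = CheckpointBuffer(path)
    try:
        return [buf.run(str(i), action, unit) for i, unit in enumerate(units)]
    finally:
        buf.close()
```

Calling `process_batch` a second time with the same path returns the saved results without invoking `action` again, which matters when the action is expensive or has side effects such as a remote call.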


    FAULT TOLERANT BATCH PROCESSING
    2.
    Patent application
    Status: in force

    Publication No.: US20110016354A1

    Publication date: 2011-01-20

    Application No.: US12502851

    Filing date: 2009-07-14

    IPC classification: G06F11/00 G06F9/46

    Abstract: Among other aspects disclosed are a method and system for processing a batch of input data in a fault tolerant manner. The method includes reading a batch of input data including a plurality of records from one or more data sources and passing the batch through a dataflow graph. The dataflow graph includes two or more nodes representing components connected by links representing flows of data between the components. At least one but fewer than all of the components includes a checkpoint process for an action performed for each of multiple units of work associated with one or more of the records. The checkpoint process includes opening a checkpoint buffer stored in non-volatile memory at the start of processing for the batch. For each unit of work from the batch, if a result from performing the action for the unit of work was previously saved in the checkpoint buffer, the saved result is used to complete processing of the unit of work without performing the action again. If a result from performing the action for the unit of work is not saved in the checkpoint buffer, the action is performed to complete processing of the unit of work and the result from performing the action is saved in the checkpoint buffer.


    MAPPING INSTANCES OF A DATASET WITHIN A DATA MANAGEMENT SYSTEM
    3.
    Patent application
    Status: under examination (published)

    Publication No.: US20100138388A1

    Publication date: 2010-06-03

    Application No.: US12628521

    Filing date: 2009-12-01

    IPC classification: G06F17/00 G06F3/048

    Abstract: Mapping data stored in a data storage system for use by a computer system includes processing specifications of dataflow graphs that include nodes representing computations interconnected by links representing flows of data. At least one of the dataflow graphs receives a flow of data from at least one input dataset and at least one of the dataflow graphs provides a flow of data to at least one output dataset. A mapper identifies one or more sets of datasets. Each dataset in a given set matches one or more criteria for identifying different versions of a single dataset. A user interface is provided to receive a mapping between at least two datasets in a given set. The mapping received over the user interface is stored in association with a dataflow graph that provides data to or receives data from the datasets of the mapping.


    Data Quality Tracking
    4.
    Patent application
    Status: in force

    Publication No.: US20090319566A1

    Publication date: 2009-12-24

    Application No.: US12143362

    Filing date: 2008-06-20

    IPC classification: G06F17/30

    CPC classification: G06F17/30958 G06F17/30303

    Abstract: In general, a method includes determining metric values associated with data quality for one or more child nodes. Metric values are determined for a parent node based on the metric values of at least some of the child nodes, and relationships between one or more parent nodes and one or more child nodes define a hierarchy. The determination of the metric value for the parent node is repeated for multiple instances.
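One way to picture the parent/child metric determination is a recursive roll-up over a tree, repeated once per instance. The abstract does not say how child values are combined, so the arithmetic mean used here is purely an assumption, as are the names `Node`, `rollup`, and `rollup_instances`.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical node in a data-quality hierarchy."""
    name: str
    metric: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


def rollup(node):
    """Set each parent's metric from the children that have one (assumed: mean)."""
    if node.children:
        values = [v for v in (rollup(c) for c in node.children) if v is not None]
        node.metric = mean(values) if values else None
    return node.metric


def rollup_instances(instances):
    """Repeat the parent determination for several instances of the hierarchy."""
    return [rollup(root) for root in instances]
```

A child without a metric is simply skipped, which mirrors the "at least some of the child nodes" wording in the abstract.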


    Visualizing relationships between data elements

    Publication No.: US09767100B2

    Publication date: 2017-09-19

    Application No.: US12629483

    Filing date: 2009-12-02

    IPC classification: G06F17/30

    CPC classification: G06F17/30017 G06F17/30572

    Abstract: In general, a specification of multiple contexts that are related according to a hierarchy is received. Relationships are determined among three or more metadata objects, and at least some of the metadata objects are grouped into one or more respective groups. Each of at least some of the groups is based on a selected one of the contexts and is represented by a node in a diagram. Relationships among the nodes are determined based on the relationships among the metadata objects in the groups represented by the nodes, and a visual representation is generated of the diagram including the nodes and the relationships among the nodes.

    Managing objects using a client-server bridge
    7.
    Granted patent
    Status: in force

    Publication No.: US08661154B2

    Publication date: 2014-02-25

    Application No.: US12967533

    Filing date: 2010-12-14

    IPC classification: G06F15/16

    Abstract: A method for supporting communication between a client and a server includes receiving a first message from a client. The method also includes creating an object in response to the first message. The method also includes sending a response to the first message to the client. The method also includes receiving changes to the object from a server. The method also includes storing the changes to the object. The method also includes receiving a second message from the client. The method also includes sending the stored changes to the client with a response to the second message.
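The message sequence in this abstract amounts to a bridge that buffers server-side changes and piggybacks them on its next response to the client. The sketch below illustrates that flow only; the class `Bridge`, the dictionary message shape, and the integer object IDs are assumptions for the example, not details from the patent.

```python
class Bridge:
    """Hypothetical bridge sitting between a client and a server."""

    def __init__(self):
        self.objects = {}   # object_id -> current field values
        self.pending = []   # stored server changes not yet sent to the client
        self._next_id = 1

    def handle_client_message(self, message):
        """Answer a client message, attaching any changes stored since the last one."""
        response = {"reply_to": message["id"]}
        if message.get("type") == "create":
            object_id = self._next_id
            self._next_id += 1
            self.objects[object_id] = dict(message.get("fields", {}))
            response["object_id"] = object_id
        # Deliver the stored changes with this response, then clear the store.
        response["changes"] = self.pending
        self.pending = []
        return response

    def handle_server_change(self, object_id, changes):
        """Apply a change pushed by the server and store it for the client."""
        self.objects[object_id].update(changes)
        self.pending.append((object_id, changes))
```

Holding changes until the client's next message suits clients that can only receive data in reply to their own requests.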


    MANAGING TASK EXECUTION
    8.
    Patent application
    Status: in force

    Publication No.: US20100211953A1

    Publication date: 2010-08-19

    Application No.: US12704998

    Filing date: 2010-02-12

    IPC classification: G06F9/46 G06F9/54 G06F11/00

    CPC classification: G06F9/5038 G06F2209/506

    Abstract: Managing task execution includes: receiving a specification of a plurality of tasks to be performed by respective functional modules; processing a flow of input data using a dataflow graph that includes nodes representing data processing components connected by links representing flows of data between data processing components; in response to at least one flow of data provided by at least one data processing component, generating a flow of messages; and in response to each of the messages in the flow of messages, performing an iteration of a set of one or more tasks using one or more corresponding functional modules.
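The data-to-message-to-task pattern in this abstract can be illustrated with a generator feeding a task loop. Everything named here (`filter_component`, `generate_messages`, `run_tasks`, and the record fields) is hypothetical, and a single filtering component stands in for the whole dataflow graph.

```python
def filter_component(records, threshold):
    """Stand-in data processing component: passes on records above a threshold."""
    for record in records:
        if record["amount"] > threshold:
            yield record


def generate_messages(flow):
    """Turn each record in a component's output flow into a message."""
    for record in flow:
        yield {"record_id": record["id"], "amount": record["amount"]}


def run_tasks(messages, tasks):
    """Perform one iteration of the task set for every message."""
    results = []
    for message in messages:
        results.append([task(message) for task in tasks])
    return results
```

Because both stages are generators, each task iteration starts as soon as the component emits a record rather than after the whole input has been processed.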


    Evaluating dataflow graph characteristics

    Publication No.: US09727438B2

    Publication date: 2017-08-08

    Application No.: US13217778

    Filing date: 2011-08-25

    IPC classification: G06F11/34 G06F9/50

    CPC classification: G06F11/3476 G06F9/50

    Abstract: One or more expressions are evaluated that represent one or more characteristics of a dataflow graph that includes vertices representing data processing components connected by links representing flows of work elements between the components. A request is received by a computing system to evaluate the one or more expressions that include one or more operations on one or more variables; and the one or more expressions are evaluated by the computing system. The evaluating includes: defining a data structure that includes one or more fields, collecting, during execution of the dataflow graph, tracking information associated with one or more components of the dataflow graph, storing values associated with the tracking information in the one or more fields, and replacing one or more variables of the one or more expressions with the values stored in the one or more fields to compute a result of evaluating the one or more expressions.
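The evaluation steps listed in this abstract (store tracking values in fields, then substitute them for variables in an expression) can be sketched with Python's standard `ast` module. The field names, the `record` helper, and the restriction to the four arithmetic operators are assumptions made for this example; the patent does not specify an expression language.

```python
import ast
import operator

# Assumed operator set for the illustration.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def record(fields, name, value):
    """Store a value collected while the graph runs in a named field."""
    fields[name] = fields.get(name, 0) + value


def evaluate(expression, fields):
    """Evaluate an arithmetic expression, replacing variables with field values."""
    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return fields[node.id]  # variable -> stored tracking value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](visit(node.left), visit(node.right))
        raise ValueError("unsupported expression element")

    return visit(ast.parse(expression, mode="eval"))
```

For instance, after recording `records_out` and `cpu_seconds` for a component, `evaluate("records_out / cpu_seconds", fields)` yields that component's throughput.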

    MANAGING OBJECTS USING A CLIENT-SERVER BRIDGE
    10.
    Patent application
    Status: in force

    Publication No.: US20110153711A1

    Publication date: 2011-06-23

    Application No.: US12967533

    Filing date: 2010-12-14

    IPC classification: G06F15/16

    Abstract: A method for supporting communication between a client and a server includes receiving a first message from a client. The method also includes creating an object in response to the first message. The method also includes sending a response to the first message to the client. The method also includes receiving changes to the object from a server. The method also includes storing the changes to the object. The method also includes receiving a second message from the client. The method also includes sending the stored changes to the client with a response to the second message.
