OPERATIONALIZING METADATA
    11.
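    Published invention application

    Publication (announcement) number: US20240070163A1

    Publication (announcement) date: 2024-02-29

    Application number: US18104066

    Filing date: 2023-01-31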

    CPC classification number: G06F16/254 G06F16/26 G06F16/9024

    Abstract: A method for using a metadata model to perform operations on data items, with the metadata model including parent nodes and child nodes connected by edges, with the parent nodes specifying logical metadata and the child nodes specifying physical metadata representing the data items, and with the edges specifying relationships between the nodes. The method includes: identifying a given data item and physical metadata of that given data item, accessing the metadata model, identifying, in the metadata model, a child node representing the physical metadata of the given data item, traversing one or more edges in the metadata model to identify parent nodes of the child node, determining, from logical metadata associated with the identified parent nodes, one or more operations to be performed on the given data item, applying the one or more operations to the given data item to transform the data item, and storing the transformed data item.
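
The traversal described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the node names, the `PARENTS` and `OPERATIONS` tables, and the masking rules are all hypothetical, and the "store" is just an in-memory dict.

```python
from collections import deque

# Hypothetical metadata model: edges point from a child (physical) node
# to its parent (logical) nodes. All names are illustrative assumptions.
PARENTS = {
    "crm.customers.ssn_col": ["SSN"],
    "SSN": ["PII"],
    "PII": [],
}

# Hypothetical logical metadata: operations attached to parent nodes.
OPERATIONS = {
    "SSN": [lambda v: v.replace("-", "")],
    "PII": [lambda v: "*" * (len(v) - 4) + v[-4:]],
}


def ancestors(child):
    """Walk edges breadth-first from a child node to all of its parent nodes."""
    seen, order, queue = set(), [], deque(PARENTS.get(child, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        queue.extend(PARENTS.get(node, []))
    return order


def transform(value, physical_node):
    """Apply every operation found on the parents of the physical node."""
    for parent in ancestors(physical_node):
        for op in OPERATIONS.get(parent, []):
            value = op(value)
    return value


# Identify the item, transform it, and store the result.
store = {}
store["crm.customers.ssn_col"] = transform("123-45-6789", "crm.customers.ssn_col")
```

In this sketch the nearest parent's operations run first, so the value is normalized as an SSN before the broader PII masking is applied. The abstract does not state an ordering; that choice is an assumption of the example.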

    Publishing to a data warehouse
    14.
    Granted invention patent

    Publication (announcement) number: US11835994B2

    Publication (announcement) date: 2023-12-05

    Application number: US16517320

    Filing date: 2019-07-19

    Abstract: A method for generating an executable application to transform and load data into a structured dataset includes receiving a metadata file that specifies values for parameters for structuring data feeds, received from a networked data source, into a structured database. The metadata file specifies logical rules for transforming the data feeds. The values of the parameters and the logical rules for transforming the plurality of the data feeds are validated to ensure logical consistency for each data feed. Data rules are generated that specify standards for transforming each data feed in accordance with the validated values of the parameters and logical rules. The executable application is generated that is configured to receive source data comprising a data feed from one or more data sources and transform the source data into structured data that satisfies the one or more standards for the structured data record in compliance with the data rules.
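
One way to picture the validate-then-generate flow in this abstract is the sketch below. It assumes the metadata file has already been parsed into a dict; the field names, the `CASTS` and `OPS` tables, and `build_loader` are hypothetical stand-ins, and the "executable application" is reduced to a plain Python function.

```python
# Hypothetical metadata file content, already parsed into a dict.
metadata = {
    "feed": "orders",
    "key_field": "order_id",
    "fields": {"order_id": "int", "amount": "float", "currency": "str"},
    "rules": [{"field": "currency", "op": "upper"}],
}

# Assumed vocabularies for parameter values and logical rules.
CASTS = {"int": int, "float": float, "str": str}
OPS = {"upper": str.upper, "strip": str.strip}


def validate(meta):
    """Check that parameter values and logical rules are mutually consistent."""
    errors = []
    if meta["key_field"] not in meta["fields"]:
        errors.append("key_field is not a declared field")
    for name, type_name in meta["fields"].items():
        if type_name not in CASTS:
            errors.append(f"unknown type {type_name!r} for field {name!r}")
    for rule in meta["rules"]:
        if rule["field"] not in meta["fields"]:
            errors.append(f"rule targets undeclared field {rule['field']!r}")
        if rule["op"] not in OPS:
            errors.append(f"unknown operation {rule['op']!r}")
    return errors


def build_loader(meta):
    """Turn validated metadata into data rules and return a loader function."""
    errors = validate(meta)
    if errors:
        raise ValueError("; ".join(errors))
    casts = {name: CASTS[t] for name, t in meta["fields"].items()}
    rules = [(r["field"], OPS[r["op"]]) for r in meta["rules"]]

    def load(raw_record):
        # Structure the raw feed record, then apply each transformation rule.
        row = {name: cast(raw_record[name]) for name, cast in casts.items()}
        for field_name, op in rules:
            row[field_name] = op(row[field_name])
        return row

    return load


load = build_loader(metadata)
row = load({"order_id": "17", "amount": "9.50", "currency": "usd"})
```

Validation happens once, when the loader is built, so inconsistent metadata is rejected before any feed data is processed.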

    DATAFLOW GRAPH DATASETS
    15.
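    Published invention application

    Publication (announcement) number: US20230359668A1

    Publication (announcement) date: 2023-11-09

    Application number: US18114212

    Filing date: 2023-02-24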

    CPC classification number: G06F16/9024

    Abstract: Described herein are techniques, performed by a data processing system, for enabling efficient development of software application programs in a dynamic environment with multiple datasets by generating entries in a dataset catalog to provide a software application program with access to output data dynamically generated by dataflow graphs, the entries associated with respective software application programs developed as dataflow graphs. The techniques include identifying a subgraph, wherein, when the subgraph is executed, the subgraph generates output data by applying one or more data processing operations to data obtained from one or more data sources; creating, in the dataset catalog, a new entry associated with the identified subgraph, the new entry associated with information indicating nodes, links, and configuration parameters of the identified subgraph; and configuring the dataset catalog to enable access to the new entry, in the dataset catalog, associated with the identified subgraph.
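
The catalog idea in this abstract can be illustrated with a toy dataflow graph. Everything named here (`GRAPH`, `register`, `read_dataset`, the node names) is a hypothetical assumption made for the example; the sketch only shows how an entry might record a subgraph's nodes, links, and parameters and then produce its output on demand.

```python
# Hypothetical dataflow graph: node -> (operation, list of input nodes).
GRAPH = {
    "read": (lambda: [3, 1, 2], []),
    "sort": (lambda xs: sorted(xs), ["read"]),
    "top2": (lambda xs: xs[-2:], ["sort"]),
    "report": (lambda xs: f"top: {xs}", ["top2"]),
}


def upstream_subgraph(graph, target):
    """Collect the target node and every node it depends on."""
    nodes, stack = set(), [target]
    while stack:
        node = stack.pop()
        if node not in nodes:
            nodes.add(node)
            stack.extend(graph[node][1])
    return nodes


def run(graph, node):
    """Execute a node after recursively executing its inputs."""
    op, inputs = graph[node]
    return op(*(run(graph, i) for i in inputs))


catalog = {}


def register(name, target, params=None):
    """Create a catalog entry describing the subgraph that ends at `target`."""
    nodes = upstream_subgraph(GRAPH, target)
    catalog[name] = {
        "nodes": sorted(nodes),
        "links": sorted((src, dst) for dst in nodes for src in GRAPH[dst][1]),
        "params": params or {},
        "output_node": target,
    }


def read_dataset(name):
    """Access an entry: the output is generated by running the subgraph."""
    return run(GRAPH, catalog[name]["output_node"])


register("top_two_values", "top2")
values = read_dataset("top_two_values")
```

A consumer only needs the catalog name `top_two_values`; it never refers to the graph nodes directly, and the data is computed at read time rather than stored.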

    Debugging an executable control flow graph that specifies control flow

    Publication (announcement) number: US11782820B2

    Publication (announcement) date: 2023-10-10

    Application number: US17029828

    Filing date: 2020-09-23

    Abstract: A computer-implemented method for debugging an executable control flow graph that specifies control flow among a plurality of functional modules, with the control flow being represented as transitions among the plurality of functional modules, the computer-implemented method including: specifying a position in the executable control flow graph at which execution of the executable control flow graph is to be interrupted; wherein the specified position represents a transition to a given functional module, a transition to a state in which contents of the given functional module are executed or a transition from the given functional module; starting execution of the executable control flow graph in an execution environment; and at a point of execution representing the specified position, interrupting execution of the executable control flow graph; and providing data representing one or more attributes of the execution environment in which the given functional module is being executed.
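
The three breakpoint positions named in this abstract (transition to a module, transition to the state in which its contents execute, transition from the module) can be sketched as below. This is a simplified assumption-laden model: the graph is a linear chain, the module names and the `run` function are hypothetical, and "interrupting" is represented by a callback that receives a snapshot of the environment rather than by an actual pause.

```python
def load_rows(env):
    env["rows"] = [1, 2, 3]


def total_rows(env):
    env["sum"] = sum(env["rows"])


# Hypothetical control flow graph: module name -> (callable, next module or None).
GRAPH = {"load": (load_rows, "total"), "total": (total_rows, None)}


def run(graph, start, breakpoints, on_break):
    """Execute modules in order, reporting the environment at each breakpoint.

    A position is a (phase, module) pair with phase in
    {"enter", "execute", "exit"}.
    """
    env, name = {}, start
    while name is not None:
        func, next_name = graph[name]
        # Positions reached before the module's contents run.
        for phase in ("enter", "execute"):
            if (phase, name) in breakpoints:
                on_break((phase, name), dict(env))
        func(env)
        # Position reached on the transition out of the module.
        if ("exit", name) in breakpoints:
            on_break(("exit", name), dict(env))
        name = next_name
    return env


snapshots = []
final_env = run(
    GRAPH,
    "load",
    breakpoints={("enter", "total"), ("exit", "total")},
    on_break=lambda position, env: snapshots.append((position, env)),
)
```

Comparing the two snapshots shows what the `total` module changed: `sum` is absent on entry and present on exit.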

    Workload automation and data lineage analysis

    Publication (announcement) number: US11748165B2

    Publication (announcement) date: 2023-09-05

    Application number: US16906193

    Filing date: 2020-06-19

    CPC classification number: G06F9/5038

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for workload automation and job scheduling information. One of the methods includes obtaining job dependency information, the job dependency information specifying an order of execution of a plurality of jobs. The method also includes obtaining data lineage information that identifies dependency relationships between data stores and transformation, wherein at least one transformation accepts data from a first data store and produces data for a second data store. The method also includes creating links between the job dependency information and the data lineage information. The method also includes determining an impact of a change in a planned execution of an application of the plurality of applications based on the job dependency information, the created links, and the data lineage information.
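
A rough sketch of linking job dependencies to data lineage and using both for impact analysis is given below. The three tables and the `impact` function are hypothetical; in particular, representing the links as a job-to-transformation mapping is an assumption of this example, not something the abstract specifies.

```python
# Hypothetical job dependency information: job -> jobs that run after it.
JOB_ORDER = {"extract": ["clean"], "clean": ["report"], "report": []}

# Hypothetical data lineage: transformation -> (input stores, output stores).
LINEAGE = {
    "t_extract": (["src_db"], ["raw"]),
    "t_clean": (["raw"], ["curated"]),
    "t_report": (["curated"], ["dashboard"]),
    "t_audit": (["raw"], ["audit_log"]),
}

# Hypothetical links between the two: job -> transformations it runs.
LINKS = {"extract": ["t_extract"], "clean": ["t_clean"], "report": ["t_report"]}


def downstream_jobs(job):
    """Return the job and every job scheduled after it."""
    seen, stack = set(), [job]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(JOB_ORDER[current])
    return seen


def impact(job):
    """Return the jobs and data stores affected if `job` is delayed or skipped."""
    jobs = downstream_jobs(job)
    stores = {out for j in jobs for t in LINKS.get(j, []) for out in LINEAGE[t][1]}
    # Follow lineage: a transformation reading an affected store affects its outputs.
    changed = True
    while changed:
        changed = False
        for inputs, outputs in LINEAGE.values():
            if stores & set(inputs) and not set(outputs) <= stores:
                stores |= set(outputs)
                changed = True
    return jobs, stores


jobs, stores = impact("extract")
```

Here `audit_log` appears in the result even though no scheduled job is linked to `t_audit`: the schedule alone would miss it, and the lineage information is what exposes it.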

    Generating, accessing, and displaying lineage metadata

    Publication (announcement) number: US11741091B2

    Publication (announcement) date: 2023-08-29

    Application number: US15829152

    Filing date: 2017-12-01

    CPC classification number: G06F16/245 G06F16/22 G06F16/248 G06F16/83 G06F40/117

    Abstract: Among other things, we describe a method of receiving a portion of metadata from a data source, the portion of metadata describing nodes and edges; generating instances of a data structure representing the portion of metadata, at least one instance of the data structure including an identification value that identifies a corresponding node, one or more property values representing respective properties of the corresponding node, and one or more pointers to respective identification values, each pointer representing an edge associated with a node identified by the corresponding respective identification value; storing the instances of the data structure in random access memory; receiving a query that includes an identification of at least one particular element of data; and using at least one instance of the data structure to cause a display of a computer system to display a representation of lineage of the particular element of data.
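
The data structure in this abstract (an identification value, property values, and pointers to other identification values) might look something like the following in memory. The `Node` class, the sample metadata, and the `lineage` function are hypothetical, and the "display" step is reduced to returning a list of names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int                                       # identification value
    properties: dict                                   # property values
    upstream_ids: list = field(default_factory=list)   # pointers, one per edge


# Hypothetical portion of metadata as received: nodes and (source, target) edges.
raw = {
    "nodes": [
        (1, {"name": "orders.csv"}),
        (2, {"name": "clean_orders"}),
        (3, {"name": "revenue_report"}),
    ],
    "edges": [(1, 2), (2, 3)],
}


def build_index(metadata):
    """Create one in-memory instance per node and record edges as id pointers."""
    index = {nid: Node(nid, props) for nid, props in metadata["nodes"]}
    for src, dst in metadata["edges"]:
        index[dst].upstream_ids.append(src)
    return index


def lineage(index, node_id):
    """Return the names of everything upstream of node_id, nearest first."""
    names, seen, queue = [], set(), list(index[node_id].upstream_ids)
    while queue:
        nid = queue.pop(0)
        if nid in seen:
            continue
        seen.add(nid)
        names.append(index[nid].properties["name"])
        queue.extend(index[nid].upstream_ids)
    return names


index = build_index(raw)
upstream = lineage(index, 3)
```

Because each pointer is just an identification value, following an edge is a dictionary lookup, which is what makes answering lineage queries from random access memory cheap in this sketch.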
