Generating rules for data processing values of data fields from semantic labels of the data fields

    Publication No.: US12242443B2

    Publication date: 2025-03-04

    Application No.: US18399545

    Filing date: 2023-12-28

    Abstract: Methods and systems are configured to determine a semantic meaning for data and generate data processing rules based on the semantic meaning of the data. The semantic meaning includes syntactical or contextual meaning for the data that is determined, for example, by profiling, by the data processing system, values stored in a field included in data records of one or more datasets; applying, by the data processing system, one or more classifiers to the profiled values; identifying, based on applying the one or more classifiers, one or more attributes indicative of a logical or syntactical characteristic for the values of the field, with each of the one or more attributes having a respective confidence level that is based on an output of each of the one or more classifiers. The attributes are associated with the fields and are used for generating data processing rules and processing the data.
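The flow in this abstract (profile a field, apply classifiers, keep attributes with a confidence level, derive rules) can be sketched as follows. This is a minimal illustration, not the patented method: the classifier set, the regular expressions, the 0.8 threshold, and the rule texts are all assumptions made up for the example.

```python
import re

# Hypothetical classifiers: each maps profiled values to a confidence in [0, 1].
CLASSIFIERS = {
    "email": lambda vals: sum(
        bool(re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v)) for v in vals
    ) / len(vals),
    "us_zip": lambda vals: sum(
        bool(re.fullmatch(r"\d{5}(-\d{4})?", v)) for v in vals
    ) / len(vals),
}

# Hypothetical mapping from a semantic label to a data processing rule.
RULES = {
    "email": "mask local part before '@'",
    "us_zip": "validate against 5- or 9-digit ZIP format",
}


def profile(records, field):
    """Collect the non-empty values stored in one field across all records."""
    return [str(r[field]) for r in records if r.get(field) not in (None, "")]


def label_field(records, field, threshold=0.8):
    """Return (label, confidence) pairs at or above the threshold, best first."""
    values = profile(records, field)
    if not values:
        return []
    scores = {name: clf(values) for name, clf in CLASSIFIERS.items()}
    kept = [(name, score) for name, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda pair: -pair[1])


def generate_rules(records, fields):
    """Attach the rule of the most confident label to each field that has one."""
    rules = {}
    for field in fields:
        labels = label_field(records, field)
        if labels:
            rules[field] = RULES[labels[0][0]]
    return rules
```

Calling `generate_rules(records, ["contact", "zip"])` on records whose `contact` values look like email addresses would associate the masking rule with that field. Fields with no label above the threshold get no rule.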

    PUBLISHING TO A DATA WAREHOUSE
    Published invention application

    Publication No.: US20240104113A1

    Publication date: 2024-03-28

    Application No.: US18492425

    Filing date: 2023-10-23

    Abstract: A method for generating an executable application to transform and load data into a structured dataset includes receiving a metadata file that specifies values for parameters for structuring data feeds, received from a networked data source, into a structured database. The metadata file specifies logical rules for transforming the data feeds. The values of the parameters and the logical rules for transforming the plurality of the data feeds are validated to ensure logical consistency for each data feed. Data rules are generated that specify standards for transforming each data feed in accordance with the validated values of the parameters and logical rules. The executable application is generated that is configured to receive source data comprising a data feed from one or more data sources and transform the source data into structured data that satisfies the one or more standards for the structured data record in compliance with the data rules.
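One way to picture the sequence in this abstract (read metadata, validate it, then generate an application that transforms feeds) is the sketch below. The metadata layout, the parameter names `feed_name`, `fields`, and `target_table`, and the two transform operations are hypothetical. The metadata is passed in as an already-parsed dictionary rather than read from a file.

```python
# Hypothetical required parameters and supported transform operations.
REQUIRED_PARAMS = {"feed_name", "fields", "target_table"}
TRANSFORMS = {"upper": str.upper, "strip": str.strip}


def validate(metadata):
    """Return a list of consistency errors found in the feed definitions."""
    errors = []
    for feed in metadata["feeds"]:
        name = feed.get("feed_name", "?")
        missing = REQUIRED_PARAMS - set(feed)
        if missing:
            errors.append(f"{name}: missing parameters {sorted(missing)}")
        for rule in feed.get("rules", []):
            if rule["field"] not in feed.get("fields", []):
                errors.append(f"{name}: rule targets undeclared field {rule['field']!r}")
            if rule["op"] not in TRANSFORMS:
                errors.append(f"{name}: unknown operation {rule['op']!r}")
    return errors


def build_application(metadata):
    """Validate the metadata, then return a callable that transforms feed rows."""
    errors = validate(metadata)
    if errors:
        raise ValueError("; ".join(errors))
    feeds = {feed["feed_name"]: feed for feed in metadata["feeds"]}

    def application(feed_name, source_rows):
        feed = feeds[feed_name]
        structured = []
        for row in source_rows:
            # Keep only declared fields, then apply each rule in order.
            record = {field: row.get(field, "") for field in feed["fields"]}
            for rule in feed.get("rules", []):
                record[rule["field"]] = TRANSFORMS[rule["op"]](record[rule["field"]])
            structured.append(record)
        return structured

    return application
```

Validation runs once, when the application is built, so a malformed feed definition fails before any source data is processed.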

    Publishing to a data warehouse
    Granted invention patent

    Publication No.: US11835994B2

    Publication date: 2023-12-05

    Application No.: US16517320

    Filing date: 2019-07-19

    Abstract: A method for generating an executable application to transform and load data into a structured dataset includes receiving a metadata file that specifies values for parameters for structuring data feeds, received from a networked data source, into a structured database. The metadata file specifies logical rules for transforming the data feeds. The values of the parameters and the logical rules for transforming the plurality of the data feeds are validated to ensure logical consistency for each data feed. Data rules are generated that specify standards for transforming each data feed in accordance with the validated values of the parameters and logical rules. The executable application is generated that is configured to receive source data comprising a data feed from one or more data sources and transform the source data into structured data that satisfies the one or more standards for the structured data record in compliance with the data rules.

    Static and runtime analysis of computer program ecosystems

    Publication No.: US11487534B2

    Publication date: 2022-11-01

    Application No.: US17306075

    Filing date: 2021-05-03

    Abstract: A method for analyzing a computer program ecosystem includes performing a static analysis, including identifying static dependencies among elements of the ecosystem based on values of parameters in one or more parameter sets associated with the ecosystem, the elements of the ecosystem including the computer programs of the ecosystem and data resources associated with the computer programs. The method includes performing a runtime analysis, including identifying elements of the ecosystem that were utilized during execution of the ecosystem to process data records. The method includes performing a schedule analysis, including identifying a computer program of the ecosystem that has a schedule dependency from another computer program of the ecosystem. The method includes identifying a subset of the elements of the ecosystem as an ecosystem unit based on the results of the static, runtime, and schedule analyses. The method includes migrating the ecosystem unit, testing the ecosystem unit, or both.
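A rough sketch of how the three analyses might be combined into ecosystem units follows. It assumes, purely for illustration, that a unit is a connected component over static and schedule dependency edges, restricted to elements observed at run time. The abstract does not state this grouping criterion, and the element names in the usage note are invented.

```python
from collections import defaultdict


def ecosystem_units(static_deps, runtime_used, schedule_deps):
    """Group runtime-observed elements linked by static or schedule dependencies.

    static_deps, schedule_deps: iterables of (element_a, element_b) pairs.
    runtime_used: set of elements seen during execution.
    """
    graph = defaultdict(set)
    for a, b in list(static_deps) + list(schedule_deps):
        # Ignore edges that touch elements never used at run time.
        if a in runtime_used and b in runtime_used:
            graph[a].add(b)
            graph[b].add(a)

    seen, units = set(), []
    for start in sorted(runtime_used):
        if start in seen:
            continue
        # Depth-first walk to collect one connected component.
        stack, unit = [start], set()
        while stack:
            node = stack.pop()
            if node in unit:
                continue
            unit.add(node)
            stack.extend(graph[node] - unit)
        seen |= unit
        units.append(unit)
    return units
```

For example, if `load_job` reads `customers_table` and `enrich_job` is scheduled after `load_job`, all three land in one unit that could be migrated or tested together. An element that never ran is left out.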

    Transforming a specification into a persistent computer program

    Publication No.: US11423083B2

    Publication date: 2022-08-23

    Application No.: US15795917

    Filing date: 2017-10-27

    Abstract: A method performed by a computer system including: accessing a specification that specifies a plurality of modules to be implemented by the computer program for processing the one or more values of the one or more fields in the structured data item; transforming the specification into the computer program that implements the plurality of modules, wherein the transforming includes: for each of one or more first modules of the plurality of modules: identifying one or more second modules of the plurality of modules that each receive input that is at least partly based on an output of the first module; and formatting an output data format of the first module such that the first module outputs only one or more values of one or more fields of the structured data item.
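The transformation step can be illustrated loosely as below. This sketch reads the abstract as: find the modules that consume each module's output, then trim that output to the fields those consumers read. That reading, the specification layout (`inputs`, `reads`, `fn`), and the module names in the usage note are assumptions, not details from the patent.

```python
def compile_spec(spec):
    """Turn a module specification into a callable program.

    spec: {name: {"inputs": [upstream module names],
                  "reads": [field names the module reads],
                  "fn": callable(dict) -> dict}}
    Modules are assumed to be listed in execution order.
    """
    # For each first module, identify the second modules that receive its output.
    downstream = {
        name: [m for m, d in spec.items() if name in d["inputs"]] for name in spec
    }
    # Fields a module must emit: the union of what its consumers read.
    emit = {
        name: {f for m in downstream[name] for f in spec[m]["reads"]} for name in spec
    }

    def program(item):
        outputs = {}
        for name, d in spec.items():
            merged = dict(item)
            for upstream in d["inputs"]:
                merged.update(outputs[upstream])
            result = d["fn"](merged)
            # Terminal modules keep everything; others emit only consumed fields.
            keep = emit[name] if downstream[name] else set(result)
            outputs[name] = {k: v for k, v in result.items() if k in keep}
        return outputs

    return program
```

With a `normalize` module feeding a `greet` module that reads only `name`, any extra field produced by `normalize` is dropped before it is passed along.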

    STATIC AND RUNTIME ANALYSIS OF COMPUTER PROGRAM ECOSYSTEMS

    Publication No.: US20210263734A1

    Publication date: 2021-08-26

    Application No.: US17306075

    Filing date: 2021-05-03

    Abstract: A method for analyzing a computer program ecosystem including multiple computer programs includes performing a static analysis of the ecosystem, including identifying static dependencies among elements of the ecosystem based on values of parameters in one or more parameter sets associated with the ecosystem, the elements of the ecosystem including the computer programs of the ecosystem and data resources associated with the computer programs. The method includes performing a runtime analysis of the ecosystem, including identifying elements of the ecosystem that were utilized during execution of the ecosystem to process data records. The method includes performing a schedule analysis of the ecosystem, including identifying a computer program of the ecosystem that has a schedule dependency from another computer program of the ecosystem. The method includes identifying a subset of the elements of the ecosystem as an ecosystem unit based on the results of the static, runtime, and schedule analyses. The method includes migrating the ecosystem unit from a first computer system to a second computer system, testing the ecosystem unit, or both.

    DATA GENERATION
    Invention application
    Status: under examination (published)

    Publication No.: US20150169428A1

    Publication date: 2015-06-18

    Application No.: US14573038

    Filing date: 2014-12-17

    CPC classification number: G06F11/36 G06F11/3688

    Abstract: A method includes receiving data indicative of a number of times each of one or more rules was executed by a data processing application during processing of one or more records; based on the number of times each of the rules was executed by the data processing application, determining a content criterion for each of one or more particular fields; generating content for each of the particular fields based on the content criterion; and populating each of the particular fields with the generated content.
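A small sketch of this idea follows, assuming for illustration that each rule fires when a numeric field falls in a given range and that the goal is to generate test content for rules that ran too rarely. The rule table, the `min_hits` threshold, and the one-record-per-range strategy are hypothetical.

```python
import random

# Hypothetical rules: name -> (field, inclusive low, inclusive high) that triggers it.
RULES = {
    "minor": ("age", 0, 17),
    "adult": ("age", 18, 64),
    "senior": ("age", 65, 120),
}


def content_criteria(execution_counts, min_hits=1):
    """For each rule executed fewer than min_hits times, record the value
    range its field would need in order to trigger the rule."""
    criteria = {}
    for rule, (field, low, high) in RULES.items():
        if execution_counts.get(rule, 0) < min_hits:
            criteria.setdefault(field, []).append((low, high))
    return criteria


def generate_records(criteria, rng):
    """Populate one record per criterion with content that satisfies it."""
    records = []
    for field, ranges in criteria.items():
        for low, high in ranges:
            records.append({field: rng.randint(low, high)})
    return records
```

Given counts such as `{"minor": 0, "adult": 12, "senior": 0}`, the criteria cover only the two ranges that never fired, and the generated records exercise exactly those rules.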


    DATA RECORDS SELECTION
    Invention application
    Status: in force

    Publication No.: US20140222752A1

    Publication date: 2014-08-07

    Application No.: US13827558

    Filing date: 2013-03-14

    CPC classification number: G06F11/3684 G06F17/30306 G06F17/30867

    Abstract: A computer-implemented method includes accessing a plurality of data records, each data record having a plurality of data fields. The method further includes analyzing values for one or more of the data fields for at least some of the plurality of data records and generating a profile of the plurality of data records based on the analyzing. The method further includes formulating at least one subsetting rule based on the profile; and selecting a subset of data records from the plurality of data records based on the at least one subsetting rule.
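The profile, rule, and selection steps can be sketched as below. The example assumes one particular subsetting rule, keeping up to a fixed number of records per distinct field value so the subset still covers every value in the profile. The abstract does not prescribe this rule, and the function names are made up for the illustration.

```python
from collections import Counter


def profile(records, field):
    """Count how often each distinct value of the field occurs."""
    return Counter(r[field] for r in records)


def subsetting_rule(prof, per_value=1):
    """Build a rule that accepts up to per_value records for each profiled value."""
    quota = {value: per_value for value in prof}

    def rule(value):
        if quota.get(value, 0) > 0:
            quota[value] -= 1
            return True
        return False

    return rule


def select_subset(records, field, per_value=1):
    """Select records, in input order, that the profile-derived rule accepts."""
    rule = subsetting_rule(profile(records, field), per_value)
    return [r for r in records if rule(r[field])]
```

On five records with states CA, CA, NY, TX, NY and `per_value=1`, this keeps the first CA, NY, and TX records, giving a smaller dataset that still contains every state.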

