Profiling in a massive parallel processing environment
    1.
    Type: granted invention patent
    Legal status: in force

    Publication No.: US09251212B2

    Publication date: 2016-02-02

    Application No.: US12413289

    Application date: 2009-03-27

    IPC classification: G06F17/30

    CPC classification: G06F17/30469 G06F17/30445

    Abstract: A computer-implemented method of profiling a data set in a parallel processing environment includes vertically partitioning an initial data set. One or more attribute subsets are then profiled. A list of subjects is generated, each corresponding to a specific attribute value identified in the profiling. Values of multiple attributes are extracted for each identified subject, and the sample results are assembled and merged.

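The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the row layout, the `id` subject key, every function name, the choice of the most frequent value as the "specific attribute value", and the thread pool standing in for parallel processing nodes are all assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data; "id" plays the role of the subject key.
ROWS = [
    {"id": 1, "city": "Paris", "plan": "free", "age": 31},
    {"id": 2, "city": "Paris", "plan": "pro", "age": 45},
    {"id": 3, "city": "Oslo", "plan": "free", "age": 27},
    {"id": 4, "city": "Paris", "plan": "free", "age": 52},
]

def vertical_partition(rows, subsets, key="id"):
    """Build one partition per attribute subset; each keeps the subject key."""
    return [[{key: r[key], **{a: r[a] for a in attrs}} for r in rows]
            for attrs in subsets]

def profile(partition, key="id"):
    """Count value frequencies for every non-key attribute in a partition."""
    counts = {}
    for row in partition:
        for attr, value in row.items():
            if attr != key:
                counts.setdefault(attr, Counter())[value] += 1
    return counts

def subjects_for(partition, attr, value, key="id"):
    """List the subjects whose attribute equals the profiled value."""
    return [row[key] for row in partition if row.get(attr) == value]

def extract(partition, subjects, key="id"):
    """Pull this partition's attribute values for the selected subjects."""
    wanted = set(subjects)
    return {row[key]: {a: v for a, v in row.items() if a != key}
            for row in partition if row[key] in wanted}

def merge(samples):
    """Assemble the per-partition samples into one record per subject."""
    merged = {}
    for sample in samples:
        for subject, values in sample.items():
            merged.setdefault(subject, {}).update(values)
    return merged

def profile_and_sample(rows, subsets, attr):
    partitions = vertical_partition(rows, subsets)
    with ThreadPoolExecutor() as pool:  # stand-in for parallel nodes
        profiles = list(pool.map(profile, partitions))
    # Assumption: the value of interest is the most frequent one for `attr`.
    source = next(part for part, p in zip(partitions, profiles) if attr in p)
    counts = next(p[attr] for p in profiles if attr in p)
    value, _ = counts.most_common(1)[0]
    subjects = subjects_for(source, attr, value)
    with ThreadPoolExecutor() as pool:
        samples = list(pool.map(lambda part: extract(part, subjects), partitions))
    return value, merge(samples)
```

Calling `profile_and_sample(ROWS, [["city"], ["plan", "age"]], "city")` profiles the two column groups separately, selects the subjects that share the most common city, and merges their values from both partitions.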

    Apparatus and method for creating portable ETL jobs
    2.
    Type: granted invention patent
    Legal status: in force

    Publication No.: US08639652B2

    Publication date: 2014-01-28

    Application No.: US11303047

    Application date: 2005-12-14

    IPC classification: G06F7/00

    CPC classification: G06F17/30563

    Abstract: A computer-readable medium includes executable instructions to receive a job and correlate a data store with each data source associated with the job. A first configuration profile is associated with the data store. A second configuration profile is specified for the data store. Dependent flows are identified. The dependent flow is updated to include additional configuration information derived from the second configuration profile.

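The configuration-profile idea in the abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the `DataStore` and `Flow` classes, the profile names, and the rule that a flow depends on a store when it references the store by name are all hypothetical and not taken from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class DataStore:
    name: str
    # Hypothetical layout: profile name -> connection settings.
    profiles: dict = field(default_factory=dict)

@dataclass
class Flow:
    name: str
    store: str                      # name of the data store the flow uses
    config: dict = field(default_factory=dict)

def add_profile(store, profile_name, settings):
    """Specify a further configuration profile for the data store."""
    store.profiles[profile_name] = dict(settings)

def dependent_flows(flows, store):
    """Identify the flows that depend on the given data store."""
    return [f for f in flows if f.store == store.name]

def apply_profile(flows, store, profile_name):
    """Update each dependent flow with settings derived from the profile."""
    updated = dependent_flows(flows, store)
    for flow in updated:
        flow.config.update(store.profiles[profile_name])
    return updated
```

For example, a store created with a `dev` profile can later receive a `prod` profile through `add_profile`, after which `apply_profile(flows, store, "prod")` rewrites only the flows that reference that store and leaves the others untouched.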

    Client-server systems and methods for accessing metadata information across a network using proxies
    3.
    Type: granted invention patent
    Legal status: in force

    Publication No.: US07966383B2

    Publication date: 2011-06-21

    Application No.: US12413268

    Application date: 2009-03-27

    IPC classification: G06F15/167 G06F15/173

    CPC classification: H04L67/1002

    Abstract: Embodiments of the present invention include computer-implemented systems and methods for accessing metadata across a network. A metadata server receives requests to access a data source from one or more clients. The metadata server is coupled between one or more backend servers and the clients. The backend servers may be coupled to the data sources of interest. The metadata server provides a metadata service proxy for establishing communications with the backend servers and for signaling the backend servers to establish connections to data sources. Data sources may be stateful or stateless. For stateless data sources, the metadata server may dynamically create reusable metadata service provider proxies that receive metadata from metadata service providers on the backend servers. For stateful data sources, unique metadata service provider proxies may be dynamically created and used to service client requests.

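The distinction the abstract draws between reusable proxies for stateless data sources and unique proxies for stateful ones can be sketched in Python. The class names, the in-memory `backend` dictionary standing in for backend servers, and the proxy numbering are assumptions for illustration only, not the patent's design.

```python
import itertools

class ProviderProxy:
    """Hypothetical stand-in for a metadata service provider proxy."""
    _ids = itertools.count(1)

    def __init__(self, source):
        self.source = source
        self.proxy_id = next(ProviderProxy._ids)

    def fetch_metadata(self, backend):
        # A real proxy would call a backend server; here we read a dict.
        return backend[self.source]

class MetadataServer:
    def __init__(self, backend, stateful_sources):
        self.backend = backend                  # source name -> metadata
        self.stateful = set(stateful_sources)
        self._shared = {}                       # reusable proxies by source

    def proxy_for(self, source):
        if source in self.stateful:
            # Stateful source: create a unique proxy for each request.
            return ProviderProxy(source)
        # Stateless source: create the proxy once, then reuse it.
        if source not in self._shared:
            self._shared[source] = ProviderProxy(source)
        return self._shared[source]

    def get_metadata(self, source):
        proxy = self.proxy_for(source)
        return proxy.proxy_id, proxy.fetch_metadata(self.backend)
```

With a server built as `MetadataServer({"catalog": ["tables"], "session_db": ["cursors"]}, {"session_db"})`, repeated requests for `catalog` share one proxy, while each request for `session_db` gets its own.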

    PROFILING IN A MASSIVE PARALLEL PROCESSING ENVIRONMENT
    4.
    Type: invention application
    Legal status: in force

    Publication No.: US20100250563A1

    Publication date: 2010-09-30

    Application No.: US12413289

    Application date: 2009-03-27

    IPC classification: G06F17/30 G06F7/14 G06F7/16

    CPC classification: G06F17/30469 G06F17/30445

    Abstract: A computer-implemented method of profiling a data set in a parallel processing environment includes vertically partitioning an initial data set. One or more attribute subsets are then profiled. A list of subjects is generated, each corresponding to a specific attribute value identified in the profiling. Values of multiple attributes are extracted for each identified subject, and the sample results are assembled and merged.

    Apparatus and method for creating portable ETL jobs
    5.
    Type: invention application
    Legal status: in force

    Publication No.: US20070136324A1

    Publication date: 2007-06-14

    Application No.: US11303047

    Application date: 2005-12-14

    IPC classification: G06F7/00

    CPC classification: G06F17/30563

    Abstract: A computer-readable medium includes executable instructions to receive a job and correlate a data store with each data source associated with the job. A first configuration profile is associated with the data store. A second configuration profile is specified for the data store. Dependent flows are identified. The dependent flow is updated to include additional configuration information derived from the second configuration profile.

    CLIENT-SERVER SYSTEMS AND METHODS FOR ACCESSING METADATA INFORMATION ACROSS A NETWORK USING PROXIES
    6.
    Type: invention application
    Legal status: in force

    Publication No.: US20100250648A1

    Publication date: 2010-09-30

    Application No.: US12413268

    Application date: 2009-03-27

    IPC classification: G06F15/16 G06F17/30

    CPC classification: H04L67/1002

    Abstract: Embodiments of the present invention include computer-implemented systems and methods for accessing metadata across a network. A metadata server receives requests to access a data source from one or more clients. The metadata server is coupled between one or more backend servers and the clients. The backend servers may be coupled to the data sources of interest. The metadata server provides a metadata service proxy for establishing communications with the backend servers and for signaling the backend servers to establish connections to data sources. Data sources may be stateful or stateless. For stateless data sources, the metadata server may dynamically create reusable metadata service provider proxies that receive metadata from metadata service providers on the backend servers. For stateful data sources, unique metadata service provider proxies may be dynamically created and used to service client requests.
