Abstract:
A computer-implemented method of profiling a data set in a parallel processing environment includes vertically partitioning an initial data set. One or more attribute subsets are then profiled. A list of subjects is generated, each subject corresponding to a specific attribute value identified in the profiling. Values of multiple attributes are extracted for each identified subject, and the sample results are assembled and merged.
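The steps in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the claimed method: the function names, the `id` subject key, the use of value counts as the "profile", and the thread pool standing in for a parallel processing environment are all assumptions made for the example. It also assumes the attribute groups are disjoint.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def vertical_partition(rows, attribute_groups):
    """Split rows column-wise: one partition per attribute group.

    Each partition keeps the (assumed) subject key "id" so that
    results can be merged back together later.
    """
    return [
        [{k: row[k] for k in ("id", *group)} for row in rows]
        for group in attribute_groups
    ]


def profile_partition(partition):
    """Profile one partition by counting the values of each attribute."""
    counts = {}
    for row in partition:
        for attr, value in row.items():
            if attr != "id":
                counts.setdefault(attr, Counter())[value] += 1
    return counts


def profile_and_sample(rows, attribute_groups, attr, value):
    partitions = vertical_partition(rows, attribute_groups)

    # Profile the attribute subsets in parallel (threads are a stand-in).
    with ThreadPoolExecutor() as pool:
        profiles = list(pool.map(profile_partition, partitions))
    merged_profile = {}
    for partial in profiles:
        merged_profile.update(partial)

    # List the subjects that carry the attribute value of interest.
    subjects = [
        row["id"]
        for part in partitions
        for row in part
        if row.get(attr) == value
    ]

    # Extract the other attributes for those subjects from every
    # partition, then assemble and merge them into one record each.
    samples = {s: {"id": s} for s in subjects}
    for part in partitions:
        for row in part:
            if row["id"] in samples:
                samples[row["id"]].update(row)
    return merged_profile, [samples[s] for s in subjects]


rows = [
    {"id": 1, "city": "Oslo", "age": 30, "plan": "a"},
    {"id": 2, "city": "Rome", "age": 41, "plan": "b"},
    {"id": 3, "city": "Oslo", "age": 25, "plan": "a"},
]
groups = [("city",), ("age", "plan")]
profile, samples = profile_and_sample(rows, groups, "city", "Oslo")
```

Here `profile` holds the per-attribute value counts and `samples` holds the merged records for the subjects whose `city` is `"Oslo"`.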
Abstract:
A computer-readable medium includes executable instructions to receive a job and correlate a data store with each data source associated with the job. A first configuration profile is associated with the data store. A second configuration profile is specified for the data store. Dependent flows are identified. Each dependent flow is updated to include additional configuration information derived from the second configuration profile.
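One way to picture the propagation step is the following minimal sketch. The `DataStore` and `Flow` classes, the `add_profile` function, and the rule that a flow depends on a data store when it names that store are all hypothetical; the abstract does not say how dependencies are tracked or how configuration information is derived.

```python
from dataclasses import dataclass, field


@dataclass
class DataStore:
    name: str
    # Maps a configuration profile name to its settings.
    profiles: dict = field(default_factory=dict)


@dataclass
class Flow:
    name: str
    datastore: str  # name of the data store this flow reads from
    # Per-profile configuration carried by the flow.
    config: dict = field(default_factory=dict)


def add_profile(store, profile_name, settings, flows):
    """Register a further profile and propagate it to dependent flows."""
    store.profiles[profile_name] = dict(settings)

    # Identify the flows that depend on this data store.
    dependents = [f for f in flows if f.datastore == store.name]

    # Update each dependent flow with configuration derived from the
    # new profile (copied verbatim here, as a simplifying assumption).
    for flow in dependents:
        flow.config[profile_name] = dict(settings)
    return dependents


store = DataStore("sales_db", {"dev": {"host": "dev-host"}})
flows = [
    Flow("load_sales", "sales_db", {"dev": {"host": "dev-host"}}),
    Flow("load_hr", "hr_db"),
]
updated = add_profile(store, "prod", {"host": "prod-host"}, flows)
```

Only `load_sales` is updated, because it is the only flow tied to `sales_db`; `load_hr` is left untouched.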
Abstract:
Embodiments of the present invention include computer-implemented systems and methods for accessing metadata across a network. A metadata server receives requests to access a data source from one or more clients. The metadata server is coupled between one or more backend servers and the clients. The backend servers may be coupled to the data sources of interest. The metadata server provides a metadata service proxy for establishing communications with the backend servers and for signaling the backend servers to establish connections to data sources. Data sources may be stateful or stateless. For stateless data sources, the metadata server may dynamically create reusable metadata service provider proxies that receive metadata from metadata service providers on the backend servers. For stateful data sources, unique metadata service provider proxies may be dynamically created and used to service client requests.
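The difference between the stateless and stateful cases can be illustrated with a small sketch. The class names, the `get_proxy` method, and the idea of declaring stateful sources up front are assumptions for the example; networking and the backend metadata service providers are left out entirely.

```python
import itertools


class ProviderProxy:
    """Stand-in for a metadata service provider proxy."""

    _ids = itertools.count(1)

    def __init__(self, source):
        self.source = source
        self.proxy_id = next(self._ids)


class MetadataServer:
    def __init__(self, stateful_sources):
        # Which sources count as stateful is assumed to be known up front.
        self.stateful_sources = set(stateful_sources)
        self._shared = {}  # reusable proxies, keyed by stateless source

    def get_proxy(self, source):
        if source in self.stateful_sources:
            # Stateful source: create a unique proxy for this request.
            return ProviderProxy(source)
        # Stateless source: create the proxy once, then reuse it.
        if source not in self._shared:
            self._shared[source] = ProviderProxy(source)
        return self._shared[source]


server = MetadataServer(stateful_sources={"olap_cube"})
a = server.get_proxy("csv_files")
b = server.get_proxy("csv_files")
c = server.get_proxy("olap_cube")
d = server.get_proxy("olap_cube")
```

In this sketch `a` and `b` are the same object, while `c` and `d` are distinct proxies, mirroring the reusable versus unique proxies described above.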