摘要:
A learning-based method for estimating costs or statistics of an operator in a continuous query includes a cost estimation model learning procedure and a model applying procedure. The model learning procedure builds a cost estimation model from training data, and the applying procedure uses the model to estimate the cost associated with a given query. The learning procedure uses a feature extractor and a cost estimator. The feature extractor collects relevant training data and obtains feature values. The extracted feature values are associated with costs and used to create the cost estimator. When applying the cost estimator to a continuous stream of data, the feature extractor extracts feature values from the data stream and uses the extracted feature values as inputs into the cost estimator to obtain the desired cost values.
摘要:
Systems and methods for conducting attribute-based queries over a plurality of objects using bounded memory locations and minimizing costly input and output operations are provided. A plurality of attributes are associated with each object, and a plurality of data groups, one each for the identified attributes are created. The objects associated with the attributes are placed into the appropriate data groups, and the objects contained within each data group are sorted into blocks such that each block within a given attribute contains that objects having the same attribute value. Results to the query are created by loading blocks into a primary memory location in a middleware system and combining the loaded blocks to create the desire query results. Block combinations are created based upon the fit of the given block combination to the query as expressed in an aggregation function. A second dedicated memory location can also be provided to hold multiple block combinations to optimize the order in which blocks are loaded and combined. Empty block buffers and external storage devices can also be provided to further enhance the generation of query results.
摘要:
Disclosed is a method of internally encrypting data in a relational database, comprising the steps of providing a security dictionary comprising one or more security catalogs, receiving data from a user associating said data with a database column and at least one authorized user, generating a working encryption key, internally encrypting said working encryption key using a public key from an authorized user, storing said encrypted working key in a security catalog, and using said working key to internally encrypt said data.
摘要:
Interoperability is enabled between participants in a network by determining values associated with a value metric defined for at least a portion of the network. Information flow is directed between two or more of the participants based at least in part on semantic models corresponding to the participants and on the values associated with the value metric. The semantic models may define interactions between the participants and define at least a portion of information produced or consumed by the participants. The determination of the values and the direction of the information flow may be performed multiple times in order to modify the one or more value metrics. The direction of information flow may allow participants to be deleted from the network, may allow participants to be added to the network, or may allow behavior of the participants to be modified.
摘要:
A computer-implemented method, computer program product and a system for supporting star and snowflake data schemas for use with an Extract, Transform, Load (ETL) process, comprising selecting a data source comprising dimensional data, where the dimensional data comprises at least one source table comprising at least one source column, importing a data model for the dimensional data into a data integration system, analyzing the imported data model to select a star or snowflake target data schema comprising target dimensions and target facts, generating a meta-model representation by mapping at least one source table or source column to each target fact and target dimension, automatically converting the meta-model representation into one or more ETL jobs, and executing the ETL jobs to extract the dimensional data from the data source and loading the dimensional data into the selected target data schema in a target data system.
摘要:
A computer-implemented method, computer program product and a system for identifying and handling slowly changing dimension (SCD) attributes for use with an Extract, Transform, Load (ETL) process, comprising importing a data model for dimensional data into a data integration system, where the dimensional data comprises a plurality of attributes, identifying via a data discovery analyzer one or more attributes in the data model as SCD attributes, importing the identified SCD attributes into the data integration system, selecting a data source comprising dimensional data, automatically generating an ETL job for the dimensional data utilizing the imported SCD attributes, and executing the automatically generated ETL to extract the dimensional data from the data source and loading the dimensional data into the imported SCD attributes in a target data system.
摘要:
Disclosed are a method, information processing system, and computer readable medium for scanning a storage medium table. The method includes retrieving location information associated with at least one other storage medium table scan. A storage medium table scan is started at a location within a storage medium table based on at least a location of the one other storage medium table scan. A weight is assigned to at least one storage medium block based on at least a current scanning location within the storage medium table relative to the location of the one other table scan. The method determines if a distance between the current scanning location and the location of the one other table scan is greater than a first given threshold. A current scanning operation is delayed, in response to the distance being greater than the given threshold, until the distance is below a second given threshold.
摘要:
Disclosed are a method, information processing system, and computer readable medium for managing table scan processes. The method includes monitoring a plurality of storage medium table scan processes. Each storage medium table scan process in the plurality of storage medium table scan processes is placed into a plurality of scan groups based on storage medium pages to be scanned by each of the storage medium table scan processes. Each storage medium table scan process in a scan group can share data within a storage medium page.
摘要:
A system, apparatus and method are provided for the dynamic caching of data based on queries performed by a local application, where the system includes a remote server having a complete database, a local database on an edge server including a subset of the complete database, the edge server in communication with the remote server, shared tables within the local database on the edge server for caching results from the complete database, receiving locally generated data, and adjusting the contents of the cache based on available storage requirements while ensuring consistency of the data between the local database and the remote database; the apparatus includes an edge data cache including a query evaluator, a cache index, cache repository, resource manager, containment checker, query parser and consistency manager all in signal communication with the query evaluator; and the method for a local server to satisfy a database query meant for at least one remote server includes dynamically caching results of previous database queries of the remote server, associating a local database with the local server, storing a plurality of the caching results in shared tables in the local database, and using the plurality of the caching results in satisfying a new database query with the local server.