Abstract:
Dynamic aggregation of data is provided. A server device receives a subscriber request for a report based on a subset of metadata contained in a data dimensions catalog. The server device analyzes data aggregation requirements from a plurality of data sources for the report based on the subset of metadata defined in the subscriber request. The server device generates a data access plan for movement of data from the plurality of data sources based on the data aggregation requirements for the report. Then, the server device executes the data access plan to fetch the data from the plurality of data sources based on the data aggregation requirements for the report.
Abstract:
A method for dynamically aggregating data is provided. A server device receives a subscriber request for a report based on a subset of metadata contained in a data dimensions catalog. The server device analyzes data aggregation requirements from a plurality of data sources for the report based on the subset of metadata defined in the subscriber request. The server device generates a data access plan for movement of data from the plurality of data sources based on the data aggregation requirements for the report. Then, the server device executes the data access plan to fetch the data from the plurality of data sources based on the data aggregation requirements for the report. A computer system and computer program product for dynamically aggregating data are also provided.
Abstract:
Access is obtained to a parallel corpus including a problem corpus and a solution corpus. A first plurality of topics are mined from the problem corpus and a second plurality of topics are mined from the solution corpus. A transition probability from the first plurality of topics to the second plurality of topics is determined, to identify a most appropriate one of the topics from the solution corpus for a given one of the topics from the problem corpus.
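One way the transition probability between problem topics and solution topics might be estimated is sketched below. The abstract does not say how the probability is computed, so this is an assumption: each aligned problem/solution document pair is taken to have already been assigned a single dominant topic, and the probability is estimated by counting co-occurrences. The function names and the topic labels are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(pairs):
    """Estimate P(solution topic | problem topic) from aligned topic pairs.

    pairs: iterable of (problem_topic, solution_topic), one per aligned
    document pair in the parallel corpus (assumed input format).
    """
    counts = defaultdict(Counter)
    for problem_topic, solution_topic in pairs:
        counts[problem_topic][solution_topic] += 1

    probs = {}
    for problem_topic, row in counts.items():
        total = sum(row.values())
        # Normalize each row of counts so it sums to 1.
        probs[problem_topic] = {s: c / total for s, c in row.items()}
    return probs


def best_solution_topic(probs, problem_topic):
    """Return the solution topic with the highest transition probability."""
    row = probs[problem_topic]
    return max(row, key=row.get)


# Hypothetical topic labels, for illustration only.
pairs = [
    ("disk_full", "cleanup"),
    ("disk_full", "cleanup"),
    ("disk_full", "add_storage"),
    ("login_fail", "reset_password"),
]
probs = transition_probabilities(pairs)
best = best_solution_topic(probs, "disk_full")
```

A real system would more likely work with per-document topic distributions from a topic model rather than single labels; the counting step here only illustrates the idea of mapping one topic set onto the other.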
Abstract:
The invention provides a method and system for visualizing a data set. The method comprises: dividing the data set into a plurality of information layers based on different information dimensions; and visually processing each of the information layers according to its information dimension, in order to present a respective view of each layer. By presenting different overviews of the data set from different information dimensions, the invention ensures that comprehensive information about the data set is presented to a data set analyst while preventing distortion of the presented content and visual clutter.
Abstract:
A method, system and computer program product for managing and querying a graph. The method includes the steps of: receiving a graph; partitioning the graph into homogeneous blocks; compressing the homogeneous blocks; and storing the compressed homogeneous blocks in files, where at least one of the steps is carried out using a computer device.
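The partition, compress, and store steps can be illustrated with a minimal sketch. The abstract does not define how homogeneous blocks are found, so the code below substitutes a fixed-width grid partition of the adjacency matrix as a stand-in and makes no attempt at homogeneity. The block width, file naming scheme, JSON serialization, and use of `zlib` are all assumptions made for illustration.

```python
import json
import os
import tempfile
import zlib
from collections import defaultdict

BLOCK = 2  # hypothetical block width: number of node ids per block


def store_graph(edges, directory):
    """Partition edges into grid blocks, compress each block, write one file per block."""
    blocks = defaultdict(list)
    for u, v in edges:
        # An edge (u, v) belongs to the block at (row of u, column of v).
        blocks[(u // BLOCK, v // BLOCK)].append((u, v))
    for (i, j), block_edges in blocks.items():
        payload = zlib.compress(json.dumps(sorted(block_edges)).encode("utf-8"))
        with open(os.path.join(directory, f"block_{i}_{j}.bin"), "wb") as f:
            f.write(payload)


def out_neighbors(node, directory):
    """Answer a query by reading only the block files in the node's row."""
    row = node // BLOCK
    result = []
    for name in sorted(os.listdir(directory)):
        if name.startswith(f"block_{row}_"):
            with open(os.path.join(directory, name), "rb") as f:
                block_edges = json.loads(zlib.decompress(f.read()))
            result.extend(v for u, v in block_edges if u == node)
    return sorted(result)


edges = [(0, 1), (0, 3), (1, 2), (2, 3), (3, 0)]
with tempfile.TemporaryDirectory() as d:
    store_graph(edges, d)
    neighbors_of_0 = out_neighbors(0, d)
    neighbors_of_3 = out_neighbors(3, d)
```

The point of storing blocks in separate files is that a query only has to decompress the blocks it touches, rather than the whole graph.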
Abstract:
Systems and methods for risk factor identification include identifying a first set of risk factors from personal data. A second set of risk factors is identified from at least one of a user input and a knowledge source. The first set is combined with the second set, using a processor, by selecting a number of risk factors from the first set that augment the second set of risk factors to determine a combined list of risk factors that predict a condition of interest.
Abstract:
Common sub-process patterns in a plurality of deployed process models may be discovered, and performance measures associated with the sub-process patterns may be computed based on runtime events of the deployed process models. Positive or negative performance patterns among sub-process patterns may be identified and used for creating new process models or improving existing process models.
Abstract:
Computer-implemented methods, systems, and articles of manufacture for determining the importance of a data item. A method includes: (a) receiving a node graph; (b) approximating a number of neighbor nodes of a node; and (c) calculating an average shortest path length of the node to the remaining nodes using the approximation from step (b), where this calculation demonstrates the importance of a data item represented by the node. Another method includes: (a) receiving a node graph; (b) building a decomposed line graph of the node graph; (c) calculating stationary probabilities of incident edges of a node graph node in the decomposed line graph; and (d) calculating a summation of the stationary probabilities of the incident edges associated with the node, where the summation demonstrates the importance of a data item represented by the node. Both methods have at least one step carried out using a computer device.
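To make the quantity in the first method concrete, the sketch below computes the average shortest path length from one node to all nodes reachable from it. It uses an exact breadth-first search on an unweighted graph, not the neighbor-count approximation the abstract describes, so it shows what is being estimated rather than how the patented method estimates it. The adjacency-list format and the example graph are assumptions.

```python
from collections import deque


def average_shortest_path_length(graph, source):
    """Exact average BFS distance from source to every other reachable node.

    graph: dict mapping each node to a list of its neighbors (unweighted).
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)

    others = [d for n, d in dist.items() if n != source]
    # A node with no reachable neighbors gets an infinite average distance.
    return sum(others) / len(others) if others else float("inf")


# Hypothetical path graph: a - b - c - d
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
avg_a = average_shortest_path_length(graph, "a")
avg_b = average_shortest_path_length(graph, "b")
```

A smaller average distance indicates a more central node; in the path graph above, the interior node `b` scores lower than the endpoint `a`. Exact BFS from every node is expensive on large graphs, which is the usual motivation for an approximation.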
Abstract:
A system and method for a composite distance metric leveraging multiple expert judgments includes inputting a data distribution of multiple expert judgments stored on a computer readable storage medium. Base distance metrics are converted into neighborhoods for comparison, wherein each base distance metric represents an expert. The neighborhoods are combined to leverage the local discriminalities of all base distance metrics by applying at least one iterative process to output a composite distance metric.