Abstract:
Methods, systems, and computer program products for expressive temporal predictions over semantically-driven time windows are provided herein. A computer-implemented method includes identifying, within a knowledge graph pertaining to a given prediction, a subset of the knowledge graph related to one or more predicted training examples, wherein the subset comprises (i) a set of nodes and (ii) one or more relationships among the set of nodes; determining, for the identified subset, one or more snapshots of the knowledge graph relevant to the given prediction; quantifying a validity window for the one or more predicted training examples, wherein the validity window comprises a temporal bound for prediction validity; and computing a validity window for the given prediction based on the quantified validity window for the one or more predicted training examples.
Abstract:
A mechanism is provided for identifying a set of top-m clusters from a set of top-k plans. A planning problem and an integer value k indicating a number of top plans to be identified are received. A set of top-k plans of size at most k is generated, where the set of top-k plans is determined with respect to a given measure of plan quality. The plans in the set of top-k plans are clustered based on a similarity between plans such that each cluster contains similar plans and each plan is grouped into only one cluster, thereby forming the set of top-m clusters. A representative plan from each top-m cluster is presented to the user.
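The clustering step can be sketched as follows. This is only an illustration under stated assumptions: each plan is a cost plus a sequence of action names, lower cost stands in for the measure of plan quality, similarity is Jaccard overlap of action sets, and the representative is the cheapest plan in each cluster. The abstract specifies none of these choices, and `cluster_plans`, `representatives`, and the threshold value are hypothetical names and settings.

```python
from typing import List, Tuple

# (cost, actions); lower cost is treated as better (an assumed quality measure).
Plan = Tuple[float, Tuple[str, ...]]


def jaccard(a: Tuple[str, ...], b: Tuple[str, ...]) -> float:
    """Jaccard similarity between the action sets of two plans."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def cluster_plans(plans: List[Plan], k: int, threshold: float = 0.6) -> List[List[Plan]]:
    """Keep the k best plans, then greedily put each one into exactly one cluster."""
    top_k = sorted(plans, key=lambda p: p[0])[:k]
    clusters: List[List[Plan]] = []
    for plan in top_k:
        for cluster in clusters:
            # Compare only against the cluster's first (lowest-cost) member.
            if jaccard(plan[1], cluster[0][1]) >= threshold:
                cluster.append(plan)
                break
        else:
            # No sufficiently similar cluster exists, so start a new one.
            clusters.append([plan])
    return clusters


def representatives(clusters: List[List[Plan]]) -> List[Plan]:
    """Pick the lowest-cost plan of each cluster as its representative."""
    return [min(c, key=lambda p: p[0]) for c in clusters]


plans = [
    (3.0, ("load", "drive", "unload")),
    (3.5, ("load", "drive", "refuel", "unload")),
    (4.0, ("load", "fly", "unload")),
    (9.0, ("walk",)),
]
clusters = cluster_plans(plans, k=3)
reps = representatives(clusters)
```

With k = 3 the most expensive plan is dropped before clustering, and the number of clusters m falls out of the similarity threshold rather than being fixed in advance.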
Abstract:
Embodiments include methods, systems, and computer program products for predicting adverse drug events on a computational system. Aspects include receiving known drug data from drug databases and one or more of a candidate drug, a drug pair, and a candidate drug-patient pair. Aspects also include calculating an adverse event prediction rating representing a confidence level of an adverse drug event for the candidate drug, the drug pair, or the candidate drug-patient pair, the rating being based on the known drug data. Aspects also include associating adverse event features with the candidate drug, the drug pair, or the candidate drug-patient pair, including a nature, cause, mechanism, or severity of the adverse drug event. Aspects also include calculating and outputting an adverse event prediction rating.
Abstract:
A unified interface abstracts the underlying differences among heterogeneous data sources and data formats to produce uniform search results. While the result of an initial search may be exactly what the user was seeking, it is likely that the result is merely in the neighborhood of what was sought. The end user may be aided in locating related data during data exploration by guided data navigation suggestions, by analysis that identifies data similarities among disparate data sources, and by guided combination options. The guided data navigation suggestions may include suggestions based on schematic, semantic, and social information. Guided data navigation may aid the user in moving from the initial search landing point in the data to the precise result sought.
Abstract:
A computer-implemented method is provided that includes accessing candidate text and a candidate pair including first and second phrases, and substituting the first and second phrases into cause-effect patterns to generate variant sentences. An artificial intelligence model is leveraged to determine respective probabilities that the variant sentences are inferred from the candidate text, calculate a statistical measure of the respective probabilities, and assess the calculated statistical measure to ascertain whether the first and second phrases possess a causal relationship or a non-causal relationship to one another. A knowledge base including one or more cause-effect phrase pairs is populated with the first and second phrases possessing the causal relationship. A computer system and a computer program product are also provided.
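The pattern-substitution and scoring flow might look roughly like the sketch below. Everything concrete in it is an assumption: the three templates in `PATTERNS`, the use of the mean as the statistical measure, the 0.5 threshold, and above all `toy_entail_prob`, a word-overlap stand-in used only so the example runs without a real inference model. The abstract names none of these.

```python
from statistics import mean
from typing import Callable, List, Tuple

# Hypothetical cause-effect templates; the actual patterns are not given in the abstract.
PATTERNS = [
    "{a} causes {b}.",
    "{a} leads to {b}.",
    "{b} is a result of {a}.",
]


def variant_sentences(first: str, second: str) -> List[str]:
    """Substitute the two phrases into every cause-effect pattern."""
    return [p.format(a=first, b=second) for p in PATTERNS]


def toy_entail_prob(premise: str, hypothesis: str) -> float:
    """Stand-in scorer (NOT a real inference model): share of hypothesis words found in the premise."""
    premise_words = set(premise.lower().replace(".", "").split())
    hyp_words = hypothesis.lower().replace(".", "").split()
    return sum(w in premise_words for w in hyp_words) / len(hyp_words)


def is_causal(
    text: str,
    first: str,
    second: str,
    entail_prob: Callable[[str, str], float] = toy_entail_prob,
    threshold: float = 0.5,
) -> bool:
    """Average the per-variant probabilities and compare the mean to a threshold."""
    probs = [entail_prob(text, s) for s in variant_sentences(first, second)]
    return mean(probs) >= threshold


text = "Heavy smoking causes lung cancer in many patients."
knowledge_base: List[Tuple[str, str]] = []
for pair in [("smoking", "lung cancer"), ("patients", "rain")]:
    if is_causal(text, *pair):
        knowledge_base.append(pair)  # only pairs judged causal are stored
```

In practice `entail_prob` would be replaced by a call to whatever inference model is available; the surrounding logic does not depend on how the probabilities are produced.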
Abstract:
Methods, systems, and computer program products for iterative and targeted feature selection are provided herein. A computer-implemented method includes generating a first prediction value for a variable attribute of a set of objects by executing a predictive model that comprises a set of features for the set of objects; evaluating the prediction error of the predictive model based on said first prediction value; generating additional features upon a determination that the prediction error exceeds a threshold; incorporating the additional features into the predictive model, thereby generating an updated predictive model; generating a second prediction value for the variable attribute by executing the updated predictive model; evaluating the prediction error of the updated predictive model based on said second prediction value; and outputting the second prediction value to a user upon a determination that the prediction error of the updated predictive model is below the threshold.
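The predict, evaluate, and extend loop can be illustrated with a deliberately simple sketch. The assumptions are substantial: the "model" is a group-mean lookup, the error is mean absolute error measured on the same rows used for fitting, and additional features are drawn from a fixed candidate list rather than generated. `fit_predict`, `mean_abs_error`, and `refine` are hypothetical names, not anything taken from the abstract.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, float]


def fit_predict(rows: Sequence[Row], features: Sequence[str], target: str) -> List[float]:
    """Toy model: predict the mean target of all rows sharing the same feature values."""
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[f] for f in features)].append(r[target])
    return [mean(groups[tuple(r[f] for f in features)]) for r in rows]


def mean_abs_error(rows: Sequence[Row], preds: Sequence[float], target: str) -> float:
    """In-sample mean absolute error (a simplification; no held-out data is used)."""
    return mean([abs(r[target] - p) for r, p in zip(rows, preds)])


def refine(
    rows: Sequence[Row],
    features: Sequence[str],
    candidates: Sequence[str],
    target: str,
    threshold: float,
) -> Tuple[List[str], List[float], float]:
    """Add one candidate feature at a time while the error stays above the threshold."""
    used, pending = list(features), list(candidates)
    preds = fit_predict(rows, used, target)
    error = mean_abs_error(rows, preds, target)
    while error > threshold and pending:
        used.append(pending.pop(0))          # incorporate an additional feature
        preds = fit_predict(rows, used, target)  # rerun the updated model
        error = mean_abs_error(rows, preds, target)
    return used, preds, error


rows = [
    {"size": 1.0, "rooms": 1.0, "price": 10.0},
    {"size": 1.0, "rooms": 2.0, "price": 20.0},
    {"size": 2.0, "rooms": 1.0, "price": 30.0},
    {"size": 2.0, "rooms": 2.0, "price": 40.0},
]
used, preds, error = refine(rows, ["size"], ["rooms"], "price", threshold=1.0)
```

A realistic version would evaluate on held-out data and stop on a budget as well as on the threshold; the loop structure is the only part meant to mirror the abstract.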
Abstract:
A method, system, and recording medium for knowledge graph augmentation using data based on a statistical analysis of attributes in the data, including mapping classes, attributes, and instances of the classes of the data, indexing semantically similar input data elements based on the mapped data using at least one of a label-based analysis, a content-based analysis, and an attribute-based clustering, and ranking the semantically similar input data elements to create a ranked list.
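As one possible reading of the label-based analysis and ranking steps, the sketch below scores candidate attribute labels against a query label and sorts them. It assumes plain string similarity from Python's standard `difflib.SequenceMatcher` as the label measure, which is a swapped-in technique and not something the abstract specifies; `rank_by_label` and the sample labels are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def label_similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] between two labels, case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_by_label(query: str, candidates: List[str]) -> List[Tuple[str, float]]:
    """Return candidates sorted from most to least similar to the query label."""
    scored = [(c, label_similarity(query, c)) for c in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranked = rank_by_label("birth_date", ["salary", "date_of_birth", "birthdate"])
```

Pure string matching misses synonyms with no shared characters, which is presumably why the abstract also mentions content-based analysis and attribute-based clustering.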
Abstract:
Organizing data within a database is provided. In response to determining that a group of coarsified data records within a database table is not an aligned group of data records, a virtually replicated subgroup of coarsified data records that corresponds to the group of coarsified data records is generated from different groups of coarsified data records within the database table. The virtually replicated subgroup of coarsified data records is aligned with the corresponding group of coarsified data records.
Abstract:
A method and system for interfacing with an end user to search, navigate, and combine large numbers of heterogeneous data sources with varying data characteristics. Search terms entered by the end user are received, and the end user is then presented with a guided exploration including search results and search result details. The end user is also presented with a guided combination including search result combination options and combination details. Both the guided exploration and the guided combination render all data from the heterogeneous data sources in a uniform data format, and both can culminate in saving selected results.
Abstract:
Embodiments for automated feature engineering are provided. Data associated with a machine learning model is received. The received data is mapped to at least one description associated with the data. A feature for the machine learning model is generated based on a formula within a corpus. The formula is associated with the at least one description.
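A small sketch of the description-to-formula lookup is given below. The column names, the description mapping, and the one-entry formula corpus are all hypothetical, chosen only to make the flow concrete; body mass index (weight divided by height squared) is used because it is a well-known formula, not because the abstract mentions it.

```python
from typing import Callable, Dict, Tuple

# Hypothetical mapping from raw column names to human-readable descriptions.
COLUMN_DESCRIPTIONS: Dict[str, str] = {
    "wt_kg": "body weight",
    "ht_m": "body height",
}

# Hypothetical corpus: feature name -> (descriptions the formula needs, formula).
FORMULA_CORPUS: Dict[str, Tuple[Tuple[str, ...], Callable[..., float]]] = {
    "body_mass_index": (("body weight", "body height"), lambda w, h: w / (h * h)),
}


def engineer_features(record: Dict[str, float]) -> Dict[str, float]:
    """Map columns to descriptions, then apply every formula whose inputs are all present."""
    by_description = {
        COLUMN_DESCRIPTIONS[col]: value
        for col, value in record.items()
        if col in COLUMN_DESCRIPTIONS
    }
    features: Dict[str, float] = {}
    for name, (needed, formula) in FORMULA_CORPUS.items():
        if all(d in by_description for d in needed):
            features[name] = formula(*(by_description[d] for d in needed))
    return features


features = engineer_features({"wt_kg": 80.0, "ht_m": 2.0, "age": 40.0})
```

Columns with no known description, such as `age` here, are simply ignored, and a formula is skipped when any of its required descriptions is missing.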