Abstract:
Similarities between events that include a plurality of dimensions are computed, the similarities computed based on binary comparisons between the events and based on user-specified weights for the dimensions. Multidimensional scaling (MDS) values are calculated based on the computed similarities between the events. A graphical visualization is generated of a temporal plot of the events, the temporal plot comprising a first axis corresponding to time, and a second axis corresponding to the MDS values, and the temporal plot representing overlapping time slices each containing pixels representing a respective subset of the events.
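The similarity step described above can be illustrated with a minimal sketch. It assumes each event is a dictionary of dimension values, that a binary comparison means an exact match on a dimension, and that the similarity is the matched share of the total user-specified weight. The function names and sample events are hypothetical and are not taken from the abstract.

```python
def weighted_binary_similarity(a, b, weights):
    """Share of the total weight on dimensions where the two events match exactly."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    matched = sum(w for dim, w in weights.items() if a.get(dim) == b.get(dim))
    return matched / total


def similarity_matrix(events, weights):
    """Pairwise similarities between all events (symmetric, 1.0 on the diagonal)."""
    return [[weighted_binary_similarity(a, b, weights) for b in events] for a in events]


# Hypothetical events and user-specified weights, for illustration only.
events = [
    {"type": "login", "host": "a", "status": "ok"},
    {"type": "login", "host": "b", "status": "ok"},
    {"type": "logout", "host": "b", "status": "fail"},
]
weights = {"type": 2.0, "host": 1.0, "status": 1.0}
sim = similarity_matrix(events, weights)
```

Dissimilarities such as 1 - sim[i][j] could then be passed to an MDS routine to obtain the values plotted on the second axis; that step is not shown here.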
Abstract:
Visually interactive identification of a cohort of similar data objects is disclosed. One example is a system including a data processor to access a plurality of data objects, each data object comprising a plurality of numerical components, where each component represents a data feature of a plurality of data features, and to identify, for each data feature, a feature distribution of the numerical components. A selector selects a sub-plurality of the data features of a query object, where a given data feature is selected if the component representing the given data feature is a peak for the feature distribution. An evaluator determines a similarity measure based on the sub-plurality of the data features. An interaction processor iteratively processes selection of a sub-plurality of the data features based on domain knowledge, and identifies, based on the similarity measures, a cohort of data objects similar to the query object.
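One possible reading of the feature selection above is sketched below. The abstract does not define "peak", so this sketch assumes a feature is a peak when the query object's value lies at least a chosen number of standard deviations above the feature mean. The threshold, the similarity formula, and all names are assumptions made for illustration.

```python
import math
import statistics


def peak_features(query, objects, z_threshold=1.5):
    """Indices of features where the query value is unusually high (assumed 'peak' test)."""
    selected = []
    for j, value in enumerate(query):
        column = [obj[j] for obj in objects]
        mean = statistics.mean(column)
        stdev = statistics.pstdev(column)
        if stdev > 0 and (value - mean) / stdev >= z_threshold:
            selected.append(j)
    return selected


def similarity(a, b, features):
    """Similarity in (0, 1] from Euclidean distance over the selected features only."""
    d = math.sqrt(sum((a[j] - b[j]) ** 2 for j in features))
    return 1.0 / (1.0 + d)


# Hypothetical data objects with three numerical features each.
objects = [[1, 10, 5], [2, 11, 5], [1, 9, 6], [2, 10, 5], [9, 10, 5]]
query = objects[4]
features = peak_features(query, objects)
scores = [similarity(query, obj, features) for obj in objects]
```

In an interactive setting, a domain expert could override the automatically selected features before the cohort is ranked by score.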
Abstract:
Synthetic healthcare data generation can include receiving an indication of a particular quantity of people, receiving an indication of a particular quantity of time periods, assigning a respective set of characteristics to each of the people based on a statistical model, simulating a respective path for each of the people through a set of clinical practice guidelines over the specified time periods, wherein each path is determined based on the respective set of characteristics, determining a probability associated with a progression of a medical condition for each of the people at the end of each time period, and generating a synthetic data set for each of the people based on the simulated paths and the determined probabilities.
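The generation loop described above can be sketched as follows. The statistical model, the guideline graph, and the progression formula are all hypothetical placeholders chosen for illustration. The abstract specifies only the overall flow of assigning characteristics, simulating paths over the time periods, and recording a probability per period.

```python
import random

# Hypothetical clinical practice guideline: next step given whether the condition progressed.
GUIDELINE = {
    "screening": {False: "screening", True: "treatment"},
    "treatment": {False: "follow_up", True: "treatment"},
    "follow_up": {False: "follow_up", True: "treatment"},
}


def generate_synthetic_records(num_people, num_periods, seed=0):
    """Simulate one guideline path per person; every formula here is an assumption."""
    rng = random.Random(seed)
    records = []
    for person_id in range(num_people):
        # Stand-in for a statistical model of patient characteristics.
        traits = {"age": rng.randint(18, 90), "smoker": rng.random() < 0.2}
        step, path = "screening", []
        for period in range(num_periods):
            # Assumed progression probability at the end of the period.
            p = 0.05 + 0.002 * (traits["age"] - 18) + (0.1 if traits["smoker"] else 0.0)
            progressed = rng.random() < p
            step = GUIDELINE[step][progressed]
            path.append({"period": period, "step": step, "progression_probability": p})
        records.append({"person_id": person_id, "traits": traits, "path": path})
    return records


records = generate_synthetic_records(num_people=3, num_periods=4, seed=1)
```

Seeding the generator makes a data set reproducible, which helps when the synthetic records are used for testing.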
Abstract:
According to an example, in a method for displaying visual analytics of entity data, geographic locations of entities may be plotted as first pixel cells on a first region and as second pixel cells on a second region of a geographic map. A determination may be made that the first pixel cells have a higher degree of overlap with each other in the first region compared to the second pixel cells in the second region. The geographic map may be distorted to enlarge the first region and the first pixel cells may be arranged in the first region in a manner that prevents the first pixel cells from overlapping each other. A color value for each of the pixel cells may be determined from a multi-paired color map that represents two variables corresponding to the entities by color and the pixel cells may be caused to be displayed on the distorted geographic map according to the determined respective color values.
Abstract:
A multi-attribute visualization is generated that includes non-overlapped cells that represent respective items. The cells are placed in the visualization according to geographic locations associated with the items, and the cells are assigned visual indicators to represent a first attribute of the items. The cells are arranged in clusters in the visualization, where a size of a particular one of the clusters indicates a second attribute representing a number of cases associated with a corresponding one of the items. Multiple coordinated views of the cells are presented in the visualization, the multiple views corresponding to respective different time intervals.
Abstract:
A user-selected group of data points is received. Weighted distances between further data points and the user-selected group of data points are computed, based on respective weights assigned to dimensions of the data points. Density-based grouping of the further data points is performed based on the computed weighted distances, the density-based grouping producing cohorts of data points. A graphical visualization is generated including pixels representing the user-selected group of data points and the cohorts of data points. The graphical visualization provides a temporal-based visualized identification of the cohorts with the user-selected group of data points.
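A minimal sketch of the two computational steps above follows. It assumes a weighted Euclidean distance and takes the distance from a point to the group as the distance to the nearest group member. As a simplified stand-in for density-based grouping, points are sorted by that distance and split wherever the gap exceeds eps, with small groups discarded. The parameters eps and min_size and the sample points are hypothetical.

```python
import math


def weighted_distance(p, q, weights):
    """Weighted Euclidean distance between two points."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(p, q, weights)))


def distance_to_group(point, group, weights):
    """Distance from a point to its nearest member of the user-selected group."""
    return min(weighted_distance(point, g, weights) for g in group)


def density_cohorts(points, group, weights, eps=0.5, min_size=2):
    """Group point indices whose distances to the group form dense runs (gap <= eps)."""
    scored = sorted((distance_to_group(p, group, weights), i) for i, p in enumerate(points))
    cohorts, current, prev = [], [], None
    for dist, idx in scored:
        if prev is not None and dist - prev > eps:
            if len(current) >= min_size:
                cohorts.append(current)
            current = []
        current.append(idx)
        prev = dist
    if len(current) >= min_size:
        cohorts.append(current)
    return cohorts


selected = [(0.0, 0.0)]
others = [(1.0, 0.0), (1.2, 0.0), (5.0, 0.0), (5.1, 0.4), (9.0, 9.0)]
cohorts = density_cohorts(others, selected, weights=(1.0, 0.25))
```

Here the isolated point at index 4 is treated as noise because its run contains fewer than min_size points.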
Abstract:
Visualization of a cohort for high-dimensional categorical data is disclosed. One example is a system including a display module to identify real-time selection of a query data element in an interactive visual representation of high-dimensional categorical data elements comprising a plurality of categorical components. A matrix generator generates a binary distance matrix with columns representing categorical components, and entries in a row indicative of a degree of similarity of respective categorical components of the selected query data element to a data element represented by the row, and determines a category weighting matrix by associating a weight with entries in each column of the binary distance matrix. An evaluator evaluates a weighted similarity score for a data element represented by a row of the category weighting matrix based on entries of the row. A selector iteratively and interactively selects, based on weighted similarity scores, a cohort of categorical data elements.
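The matrix operations above can be sketched as follows. The sketch assumes each entry is 1 when an element's categorical component equals the query's and 0 otherwise, that the weighting matrix scales each column by a per-category weight, and that the score is the row sum. Selecting the cohort by a fixed threshold is also an assumption, as are all names and sample values.

```python
def binary_match_matrix(query, elements):
    """One row per element; entry is 1 where the component matches the query, else 0."""
    return [[1 if value == q else 0 for value, q in zip(row, query)] for row in elements]


def apply_weights(matrix, weights):
    """Category weighting matrix: each column scaled by its category weight."""
    return [[entry * w for entry, w in zip(row, weights)] for row in matrix]


def select_cohort(query, elements, weights, threshold):
    """Indices of elements whose weighted similarity score reaches the threshold."""
    weighted = apply_weights(binary_match_matrix(query, elements), weights)
    scores = [sum(row) for row in weighted]
    cohort = [i for i, s in enumerate(scores) if s >= threshold]
    return cohort, scores


# Hypothetical categorical data elements: (color, body, fuel).
query = ("red", "suv", "diesel")
elements = [
    ("red", "suv", "petrol"),
    ("blue", "suv", "diesel"),
    ("blue", "sedan", "petrol"),
    ("red", "suv", "diesel"),
]
cohort, scores = select_cohort(query, elements, weights=(5, 3, 2), threshold=6)
```

An interactive tool could let the user adjust the weights or the threshold and recompute the cohort on each change.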