Abstract:
Systems and techniques for determining significance between entities are disclosed. The systems and techniques identify a first entity having an association with a second entity, apply a plurality of association criteria to the association, weight each of the criteria based on defined weight values, and compute a significance score for the first entity with respect to the second entity based on a sum of a plurality of weighted criteria values. The systems and techniques utilize information from disparate sources to create a uniquely powerful signal. The systems and techniques can be used to identify the significance of relationships (e.g., associations) among various entities including, but not limited to, organizations, people, products, industries, geographies, commodities, financial indicators, economic indicators, events, topics, subject codes, unique identifiers, social tags, industry terms, general terms, metadata elements, classification codes, and combinations thereof.
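The weighted-sum scoring described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the criterion names, their values, and the weight values are hypothetical placeholders, since the abstract does not specify them.

```python
from typing import Mapping


def significance_score(
    criteria_values: Mapping[str, float],
    weights: Mapping[str, float],
) -> float:
    """Return the sum of weight * value over all association criteria."""
    missing = set(criteria_values) - set(weights)
    if missing:
        raise KeyError(f"no weight defined for criteria: {sorted(missing)}")
    return sum(weights[name] * value for name, value in criteria_values.items())


# Hypothetical criteria applied to the association between a first
# entity and a second entity (names and numbers are assumptions).
criteria = {"co_occurrence": 0.8, "recency": 0.5, "source_diversity": 0.25}
weights = {"co_occurrence": 0.5, "recency": 0.25, "source_diversity": 0.25}

score = significance_score(criteria, weights)
print(f"significance of first entity w.r.t. second entity: {score:.4f}")
```

Keeping the weights in a separate mapping makes it possible to retune them without touching the code that evaluates each criterion.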
Abstract:
Systems and techniques for improving the training of machine learning classifiers are disclosed. A classifier is trained using a set of validated documents that are accurately associated with a set of class labels. A subset of non-validated documents is also identified and is used to further train and improve accuracy of the classifier.
Abstract:
A method and system for scene parsing and model fusion in laparoscopic and endoscopic 2D/2.5D image data is disclosed. A current frame of an intra-operative image stream including a 2D image channel and a 2.5D depth channel is received. A 3D pre-operative model of a target organ segmented in pre-operative 3D medical image data is fused to the current frame of the intra-operative image stream. Semantic label information is propagated from the pre-operative 3D medical image data to each of a plurality of pixels in the current frame of the intra-operative image stream based on the fused pre-operative 3D model of the target organ, resulting in a rendered label map for the current frame of the intra-operative image stream. A semantic classifier is trained based on the rendered label map for the current frame of the intra-operative image stream.
Abstract:
A method (100) of constructing a probabilistic graphical model (10) of a system from data that includes both normal and anomalous data includes the step of learning parameters of a structure for the probabilistic graphical model (10). The structure includes at least one latent variable (26) on which other variables (12, 14, 16, 18, 20, 22, 24) are conditional, and has a plurality of components. The method further includes the steps of: iteratively associating one or more of the plurality of components of the latent variable (26) with normal data; constructing a matrix of the associations; detecting abnormal components of the latent variable (26) based on one of a low association with the normal data or the matrix of associations; and deleting the abnormal components of the latent variable (26) from the probabilistic graphical model (10).
Abstract:
For therapy response assessment, texture features are input for machine learning a classifier and for using a machine-learnt classifier. Rather than or in addition to using formula-based texture features, data-driven texture features are derived from training images. Such data-driven texture features are independent analysis features, such as features from independent subspace analysis. The texture features may be used to predict the outcome of therapy based on a small number of scans, or even a single scan, of the patient.
Abstract:
A method for a partially self-training learning system is disclosed. The learning systems, such as document classifiers (10), are initially trained on a small amount of hand-sorted data (12). The learning systems process unlabeled data by assigning classifications to the data. A confidence level in the classification is verified for each newly classified document. If the classification is made with a sufficiently high confidence level, the learning system trains on the word vector of the newly classified document. If the classification of the newly classified document is not made with a sufficiently high confidence level, the learning system does not use the word vector in the newly classified document for training purposes.
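The confidence-gated training loop described above can be sketched as follows. This is an illustration under stated assumptions: the abstract does not name a classifier or a threshold, so a small multinomial naive Bayes model with add-one smoothing and a threshold of 0.8 are used here as stand-ins, and the class name, function name, and example documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes classifier (an assumed stand-in)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> documents seen
        self.vocab = set()

    def train(self, words, label):
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, words):
        """Return (best label, posterior probability of that label)."""
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)  # add-one smoothing
            score = math.log(n_docs / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            log_scores[label] = score
        top = max(log_scores.values())
        unnormalized = {l: math.exp(s - top) for l, s in log_scores.items()}
        norm = sum(unnormalized.values())
        best = max(unnormalized, key=unnormalized.get)
        return best, unnormalized[best] / norm


def self_train(clf, unlabeled_docs, threshold=0.8):
    """Train on a newly classified document only if confidence is high enough."""
    accepted = []
    for words in unlabeled_docs:
        label, confidence = clf.classify(words)
        if confidence >= threshold:
            clf.train(words, label)          # use the word vector for training
            accepted.append((words, label))
        # otherwise the document is not used for training
    return accepted


# Initial training on a small amount of hand-sorted data (hypothetical).
clf = NaiveBayes()
clf.train(["goal", "match", "team"], "sports")
clf.train(["election", "vote", "senate"], "politics")

unlabeled = [["team", "goal", "goal"], ["weather"]]
accepted = self_train(clf, unlabeled, threshold=0.8)
print(accepted)
```

The threshold trades coverage against label noise: a higher value admits fewer documents into training but reduces the risk of reinforcing a wrong classification.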
Abstract:
An approach is provided for generating synthetic image data for machine learning. The approach, for instance, involves determining, by a processor, a set of parameters for indicating an action by one or more objects. The action is a dynamic movement of the one or more objects through a geographic space over a period of time. The approach also involves processing the set of parameters to generate synthetic image data. The synthetic image data includes a computer-generated image sequence of the one or more objects performing the action in the geographic space over the period of time. The approach further involves automatically labeling the synthetic image data with at least one label representing the action, the set of parameters, or a combination thereof. The approach further involves providing the labeled synthetic image data for training or evaluating a machine learning model to detect the action.
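The parameter-to-labeled-sequence pipeline described above can be sketched as follows. This is a toy illustration, not the disclosed renderer: the geographic space is assumed to be a small square grid, the object a single pixel, and the action a constant-velocity movement; `ActionParams`, `render_sequence`, and `make_labeled_sample` are hypothetical names.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ActionParams:
    """Hypothetical parameter set indicating an action by one object."""
    action: str       # label for the action, e.g. "move_east"
    start: tuple      # (row, col) position in the first frame
    velocity: tuple   # (d_row, d_col) displacement per frame
    n_frames: int     # length of the period of time, in frames
    grid_size: int    # side length of the square grid


def render_sequence(p: ActionParams):
    """Generate a sequence of binary frames with the object marked as 1."""
    frames = []
    for t in range(p.n_frames):
        frame = [[0] * p.grid_size for _ in range(p.grid_size)]
        row = p.start[0] + t * p.velocity[0]
        col = p.start[1] + t * p.velocity[1]
        if 0 <= row < p.grid_size and 0 <= col < p.grid_size:
            frame[row][col] = 1  # object is inside the grid in this frame
        frames.append(frame)
    return frames


def make_labeled_sample(p: ActionParams):
    """Label the synthetic sequence with the action and its parameters."""
    return {
        "frames": render_sequence(p),
        "label": p.action,
        "params": asdict(p),
    }


params = ActionParams(
    action="move_east", start=(2, 0), velocity=(0, 1), n_frames=4, grid_size=5
)
sample = make_labeled_sample(params)
print(sample["label"], len(sample["frames"]))
```

Because the label is derived from the same parameters that drive rendering, no manual annotation step is needed for the generated sequences.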