Abstract:
A computer-implemented method for privacy-preserving data mining to determine cancer survival rates includes providing a random matrix B agreed to by a plurality of entities, wherein each entity i possesses a data matrix Ai of cancer survival data that is not publicly available, providing a class matrix Di for each of the data matrices Ai, providing a kernel K(Ai, B) by each of said plurality of entities to allow public computation of a full kernel, and computing a binary classifier that incorporates said public full kernel, wherein said classifier is adapted to classify a new data vector according to a sign of said classifier.
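The kernel-sharing step above can be illustrated with a minimal sketch. It assumes a Gaussian kernel, represents each class matrix Di by its vector of ±1 labels, and uses regularized least squares as a stand-in for the training step, which the abstract does not specify. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, mu=0.1):
    """K(A, B)[i, j] = exp(-mu * ||A_i - B_j||^2) for rows A_i of A and B_j of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-mu * np.maximum(sq, 0.0))

def fit_public_classifier(kernel_blocks, label_blocks, reg=1e-3):
    """Stack the shared blocks K(A_i, B) into a full kernel and fit a
    regularized least-squares classifier (an assumed training step)."""
    K = np.vstack(kernel_blocks)
    d = np.concatenate(label_blocks)
    Z = np.hstack([K, np.ones((K.shape[0], 1))])  # append a bias column
    sol = np.linalg.solve(Z.T @ Z + reg * np.eye(Z.shape[1]), Z.T @ d)
    return sol[:-1], sol[-1]

def classify(x, B, u, gamma, mu=0.1):
    """Classify a new data vector by the sign of K(x, B) u + gamma."""
    return int(np.sign(gaussian_kernel(x[None, :], B, mu) @ u + gamma)[0])

rng = np.random.default_rng(0)

def make_entity(n):
    """Synthetic private data: two clusters labelled +1 and -1."""
    d = np.where(rng.random(n) < 0.5, 1.0, -1.0)
    return rng.standard_normal((n, 2)) + 3.0 * d[:, None], d

B = rng.standard_normal((20, 2))          # random matrix agreed on by all entities
A1, d1 = make_entity(30)                  # entity 1 keeps A1 private
A2, d2 = make_entity(30)                  # entity 2 keeps A2 private
K1, K2 = gaussian_kernel(A1, B), gaussian_kernel(A2, B)  # only these are shared
u, gamma = fit_public_classifier([K1, K2], [d1, d2])
print(classify(np.array([3.0, 3.0]), B, u, gamma))
```

Only the blocks K(Ai, B) and the labels leave each entity in this sketch; the raw rows of Ai are never exchanged.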
Abstract:
Knowledge-based interpretable predictive modeling is provided. Expert knowledge is used to seed training of a model by a machine. The expert knowledge may be incorporated as diagram information, which relates known causal relationships between predictive variables. A predictive model is trained. In one embodiment, the model operates even with a missing value for one or more variables by using the relationship between variables. For application, the model outputs a prediction, such as the likelihood of survival for two years of a lung cancer patient. A graphical representation of the model is also output. The graphical representation shows the variables and relationships between variables used to determine the prediction. The graphical representation is interpretable by a physician or other user to assist in understanding.
Abstract:
A system for modeling complete response prediction is provided. The system includes an input that is operable to receive treatment information representing treatment data that may be used to predict a complete response of a tumor. The complete response may include a disappearance of all or substantially all of a disease. A processor may be operable to use a model to predict complete response of the tumor as a function of the treatment data. The model represents a probability of complete response to treatment given the treatment data. A display is operable to output an image as a function of the complete response prediction.
Abstract:
A method for multiple-label data analysis includes: obtaining labeled data points from more than one labeler; building a classifier that maximizes a measure relating the data points, labels on the data points and a predicted output label; and assigning an output label to an input data point by using the classifier.
Abstract:
We propose using different classifiers based on the spatial location of the object. The intuitive idea behind this approach is that several classifiers may learn local concepts better than a “universal” classifier that covers the whole feature space. The use of local classifiers ensures that the objects of a particular class have a higher degree of resemblance within that particular class. The use of local classifiers also results in memory, storage and performance improvements, especially when the classifier is kernel-based. As used herein, the term “kernel-based classifier” refers to a classifier where a mapping function (i.e., the kernel) has been used to map the original training data to a higher dimensional space where the classification task may be easier.
Abstract:
A list of biomarkers indicative of patient outcome is reduced. A computer program is applied to a set of biomarkers indicative of a patient outcome (e.g., prognosis, diagnosis, or treatment result). The computer program models the set of biomarkers with a subset of the biomarkers. The subset is identified without labeling based on the patient outcome. Instead, biomarker scores (e.g., sequence score) are used to identify the subset of biomarkers.
Abstract:
An incremental greedy method for feature selection is described. This method results in a final classifier that performs optimally and depends on only a few features. Generally, a small number of features is desired because the complexity of a classification method often depends on the number of features. It is well known that a large number of features may lead to overfitting on the training set, which in turn leads to poor generalization performance on new and unseen data. The incremental greedy method selects a limited subset of features from the feature space. By providing low feature dependency, the incremental greedy method requires fewer computations than a feature extraction approach, such as principal component analysis.
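One way to picture an incremental greedy selection is plain forward selection, sketched below. The abstract does not give the selection criterion, so the least-squares training error used here is an assumption, as are the function names and the synthetic data.

```python
import numpy as np

def lstsq_error(X, y):
    """Mean squared residual of a least-squares fit with an intercept."""
    Z = np.hstack([X, np.ones((X.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return float(np.mean((Z @ coef - y) ** 2))

def greedy_select(X, y, max_features):
    """Add, one at a time, the feature that most reduces the fit error
    (assumed criterion), stopping after max_features features."""
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        errors = {j: lstsq_error(X[:, selected + [j]], y) for j in remaining}
        best = min(errors, key=errors.get)
        selected.append(best)
        remaining.remove(best)
    return selected

# Synthetic example: only features 3 and 7 carry class information.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))
y = np.sign(2.0 * X[:, 3] + X[:, 7])
chosen = greedy_select(X, y, max_features=2)
print(chosen)
```

Unlike principal component analysis, which needs every original feature to compute each component, the resulting classifier only has to evaluate the selected columns.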
Abstract:
CAD (computer-aided diagnosis) systems and applications for breast imaging are provided, which implement methods to automatically extract and analyze features from a collection of patient information (including image data and/or non-image data) of a subject patient, to provide decision support for various aspects of physician workflow including, for example, automated diagnosis of breast cancer and other automated decision support functions that enable decision support for, e.g., screening and staging for breast cancer. The CAD systems implement machine-learning techniques that use a set of training data obtained (learned) from a database of labeled patient cases in one or more relevant clinical domains and/or expert interpretations of such data to enable the CAD systems to “learn” to analyze patient data and make proper diagnostic assessments and decisions for assisting physician workflow.
Abstract:
A method for computer aided detection of anatomical abnormalities in medical images includes providing a plurality of abnormality candidates and features of said abnormality candidates, and classifying said abnormality candidates as true positives or false positives using a hierarchical cascade of linear classifiers of the form sign(w^T x + b), wherein x is a feature vector, w is a weighting vector and b is a model parameter, wherein different weights are used to penalize false negatives and false positives, and wherein more complex features are used for each successive stage of said cascade of classifiers.
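The evaluation side of such a cascade can be sketched as follows. The sketch covers inference only: training with asymmetric penalties for false negatives and false positives is not shown, and the stage weights, feature indices, and function name are hypothetical values chosen for illustration.

```python
import numpy as np

def cascade_classify(x, stages):
    """Return +1 if the candidate passes every stage, else -1.
    Each stage is (feature_indices, w, b) and evaluates sign(w^T x[idx] + b)."""
    for idx, w, b in stages:
        if np.dot(w, x[idx]) + b <= 0:
            return -1  # rejected early; later stages are never evaluated
    return 1

# Hypothetical two-stage cascade: stage 1 uses one cheap feature,
# stage 2 adds two more complex features.
stages = [
    ([0], np.array([1.0]), 0.5),
    ([0, 1, 2], np.array([1.0, 1.0, -1.0]), -1.0),
]

print(cascade_classify(np.array([1.0, 2.0, 0.5]), stages))   # passes both stages
print(cascade_classify(np.array([-1.0, 5.0, 0.0]), stages))  # rejected at stage 1
print(cascade_classify(np.array([0.0, 0.5, 1.0]), stages))   # rejected at stage 2
```

Because most candidates are rejected by the early stages, the more complex features need to be computed only for the candidates that survive.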