Abstract:
Disclosed are embodiments for creating a model capable of identifying one or more clusters in a healthcare dataset. An input is received pertaining to a range of numbers. Each number in the range of numbers is representative of a number of clusters in the healthcare dataset. For a cluster, one or more first parameters of a distribution associated with the cluster are estimated. Thereafter, a threshold value is determined based on the one or more first parameters. An inverse cumulative distribution of each of one or more n-dimensional variables in the healthcare dataset is determined. The one or more first parameters are updated to generate one or more second parameters based on the determined inverse cumulative distribution. A model is created for each number in the range of numbers based on the one or more second parameters.
Abstract:
Disclosed are embodiments of methods and systems for predicting a health condition of a first human subject. The method comprises extracting historical data including physiological parameters of one or more second human subjects. A latent variable is determined based on an inverse cumulative distribution of transformed historical data, where the transformed historical data is obtained by ranking the historical data. Further, one or more parameters of a first distribution, deterministic of health conditions in the historical data, are determined based on the latent variable. For each physiological parameter, a random variable is sampled from a second distribution of the physiological parameter based on the one or more parameters. Further, based on the random variable, the latent variable is updated. Thereafter, the one or more parameters are re-estimated based on the updated latent variable. Based on the first distribution, a classifier is trained to predict the health condition of the first human subject.
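The rank-based step above can be illustrated with a minimal sketch. It assumes the latent variable is a standard normal score obtained by passing rescaled ranks through the inverse cumulative distribution of a standard normal; the abstract does not name the distribution, so that choice, the function name `normal_scores`, and the sample readings are all hypothetical.

```python
from statistics import NormalDist


def normal_scores(values):
    """Map each value to a latent normal score via its rank (assumed transform)."""
    n = len(values)
    # Assign ranks 1..n; ties are broken by original order (a simplification).
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    std_normal = NormalDist()
    # Dividing by n + 1 keeps every rank strictly inside (0, 1),
    # so the inverse CDF stays finite.
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


heart_rates = [72, 88, 65, 101, 79]  # hypothetical readings
latent = normal_scores(heart_rates)
```

The sampling and re-estimation steps of the abstract are not shown; this covers only the initial transform from ranked data to a latent variable.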
Abstract:
According to embodiments illustrated herein, there is provided a system for predicting a health condition of a first patient. The system includes a document processor configured to extract one or more headings from one or more medical records of the first patient based on one or more predefined rules. The document processor is further configured to extract one or more words from one or more phrases written under each of the extracted one or more headings, wherein the one or more phrases correspond to documentation of the observation of the first patient by a medical attender. The system further includes one or more processors configured to predict the health condition of the first patient based on a count of the one or more words in historical medical records and the one or more medical records.
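As a rough illustration of the extraction step, the sketch below assumes a single predefined rule: a heading is a capitalized line that ends with a colon. The rule, the name `words_by_heading`, and the sample record are hypothetical and are not taken from the disclosure.

```python
import re
from collections import Counter

# Hypothetical heading rule: a capitalized phrase followed by a colon, alone on a line.
HEADING_RULE = re.compile(r"^([A-Z][A-Za-z ]+):\s*$")


def words_by_heading(record_text):
    """Return a word-count table for the text written under each heading."""
    sections = {}
    current = None
    for line in record_text.splitlines():
        match = HEADING_RULE.match(line.strip())
        if match:
            current = match.group(1)
            sections[current] = Counter()
        elif current is not None:
            # Count lowercase alphabetic tokens under the current heading.
            sections[current].update(re.findall(r"[a-z]+", line.lower()))
    return sections


record = """Chief Complaint:
Patient reports chest pain and shortness of breath.
Assessment:
Chest pain likely cardiac; monitor closely.
"""
counts = words_by_heading(record)
```

Counts like these could then be compared against counts from historical records; the prediction step itself is outside the scope of this sketch.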
Abstract:
Disclosed are methods and systems for classifying one or more human subjects in one or more categories indicative of a health condition of the one or more human subjects. The method includes categorizing one or more parameters of each of the one or more human subjects in one or more data views based on a data type of each of the one or more parameters. A data view corresponds to a first data structure storing a set of parameters categorized in the data view, associated with each of the one or more human subjects. The one or more data views are transformed to a second data structure representative of the set of parameters across the one or more data views. Thereafter, a classifier is trained based on the second data structure, wherein the classifier classifies the one or more human subjects in the one or more categories.
Abstract:
LASSO constraints can lead to a Gaussian mixture copula model (GMCM) that is more robust, better conditioned, and more reflective of the actual clusters in the training data. These qualities of the GMCM have been shown with data obtained from: digital images of fine needle aspirates of breast tissue for detecting cancer; email for detecting spam; two-dimensional terrain data for detecting hills and valleys; and video sequences of hand movements for detecting gestures. Using training data, a GMCM estimate can be produced and iteratively refined to maximize a penalized log likelihood estimate until sequential iterations are within a threshold value of one another. The GMCM estimate can then be used to classify further samples. The LASSO constraints help keep the analysis tractable, so that results can be obtained and applied while they are still useful.
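The stopping rule described above (refine until successive penalized log likelihood values differ by less than a threshold) can be sketched on a deliberately small stand-in problem. The code below is not a GMCM: it assumes a single unit-variance Gaussian with an L1 (LASSO) penalty on its mean and uses a proximal gradient update, chosen here only because it is short. All names and values are hypothetical.

```python
def soft_threshold(value, amount):
    """Shrink value toward zero by amount (the proximal step for an L1 penalty)."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def penalized_log_likelihood(mu, samples, lam):
    # Unit-variance Gaussian log likelihood (constants dropped) minus the L1 penalty.
    return -0.5 * sum((x - mu) ** 2 for x in samples) - lam * abs(mu)


def fit(samples, lam, tol=1e-10, max_iter=1000):
    """Refine mu until the penalized log likelihood changes by less than tol."""
    step = 0.5 / len(samples)
    mu = 0.0
    previous = penalized_log_likelihood(mu, samples, lam)
    for _ in range(max_iter):
        gradient = sum(x - mu for x in samples)
        mu = soft_threshold(mu + step * gradient, step * lam)
        current = penalized_log_likelihood(mu, samples, lam)
        if abs(current - previous) < tol:
            break
        previous = current
    return mu


estimate = fit([1.0, 2.0, 3.0], lam=0.6)
```

For this toy objective the maximizer is the sample mean shrunk toward zero by `lam / n`, which gives a simple way to check that the loop has converged to the right place.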
Abstract:
A method, non-transitory computer readable medium and apparatus for predicting mortality of a current patient are disclosed. For example, the method includes: receiving data associated with a plurality of different patients with known mortality outcomes, wherein the data includes a subset of data for each one of a plurality of different measurement timepoints for each one of the plurality of different patients; calculating n classifiers, wherein n is equal to the number of different measurement timepoints; receiving data associated with the current patient at an i-th measurement timepoint; predicting that the current patient has a high mortality risk based on an output of the i-th classifier of the n classifiers; and transmitting a signal to a health administration server to cause an alarm to be generated in response to the predicted high mortality risk.
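The one-classifier-per-timepoint arrangement can be sketched as follows. The classifier here is a toy nearest-class-mean rule on a single measurement, assumed purely for illustration; the abstract does not specify the classifier type, and the data values and function names are hypothetical.

```python
from statistics import mean


def train_threshold_classifier(values, died):
    """Toy classifier: high risk when a value is nearer the non-survivor mean."""
    mean_died = mean(v for v, d in zip(values, died) if d)
    mean_survived = mean(v for v, d in zip(values, died) if not d)

    def classify(value):
        return abs(value - mean_died) < abs(value - mean_survived)

    return classify


# history[t][p] is a hypothetical measurement for patient p at timepoint t.
history = [
    [1.0, 1.2, 2.9, 3.1],
    [1.1, 1.0, 3.5, 3.8],
    [0.9, 1.1, 4.2, 4.0],
]
died = [False, False, True, True]  # known mortality outcomes

# One classifier per measurement timepoint, so n == len(history).
classifiers = [train_threshold_classifier(values, died) for values in history]


def high_risk(value, i):
    """Apply the i-th classifier to a current patient's value at timepoint i."""
    return classifiers[i](value)
```

A call such as `high_risk(3.6, 1)` uses only the classifier trained on timepoint 1 data; signalling a health administration server is left out of the sketch.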
Abstract:
The disclosed embodiments illustrate methods and systems for digitizing a document. The method includes receiving at least one first transcription of content of at least one portion of the document from at least one crowdworker, in response to the at least one portion being crowdsourced as a digitization task to the at least one crowdworker. Thereafter, one or more second transcriptions are determined based on the at least one first transcription. The one or more second transcriptions correspond to intended transcriptions for the at least one portion. Further, the one or more second transcriptions are ranked based at least on a measure of similarity between the at least one first transcription and each of the one or more second transcriptions. At least one second transcription is selected from the one or more second transcriptions as an acceptable transcription for the at least one portion based on the ranking.
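The ranking step can be sketched with the standard library's `difflib.SequenceMatcher` as the similarity measure. That measure is an assumption made here for illustration, as are the function name `rank_candidates` and the sample strings; the abstract does not say how similarity is computed or how candidate transcriptions are generated.

```python
from difflib import SequenceMatcher


def rank_candidates(first_transcription, candidates):
    """Order candidate transcriptions by similarity to the crowdworker's text."""

    def similarity(candidate):
        return SequenceMatcher(None, first_transcription, candidate).ratio()

    return sorted(candidates, key=similarity, reverse=True)


crowd_text = "Jhon Smith"  # hypothetical crowdworker transcription with a typo
candidates = ["Joan Smyth", "John Smith", "Jane Smithers"]  # hypothetical intended values
ranked = rank_candidates(crowd_text, candidates)
accepted = ranked[0]  # the top-ranked candidate is taken as acceptable
```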
Abstract:
Disclosed are methods and systems for classifying one or more patients in one or more categories. A distribution of one or more physiological parameters associated with the one or more patients is determined based on a patient dataset. The one or more physiological parameters correspond to at least a stroke scale score. One or more parameters associated with a copula are estimated by one or more processors. In an embodiment, the copula defines a joint distribution of the one or more physiological parameters. A classifier is created based on the one or more parameters, wherein the classifier classifies the one or more patients in the one or more categories. The one or more categories correspond to a range of the stroke scale score.
Abstract:
Disclosed are embodiments of methods and systems for predicting mortality of a first patient. The method comprises categorizing historical data into a first category and a second category. The method further comprises determining a first test parameter and a second test parameter based on at least one of sample data of the first patient and the historical data corresponding to at least one of the first category and the second category. The method further comprises determining a probability score based on a cumulative distribution of at least one of the first test parameter and the second test parameter. The method further comprises categorizing the sample data in one of the first category and the second category based on the probability score. Further, the method comprises predicting the mortality of the first patient based on at least the categorization of the sample data of the first patient.
Abstract:
Disclosed are methods and systems for creating one or more statistical classifiers. A first set of performance parameters, corresponding to one or more applications and one or more computing infrastructures, is extracted from historical data pertaining to the execution of the one or more applications on the one or more computing infrastructures. Further, a set of application-specific parameters and a set of infrastructure-specific parameters are selected from the first set of performance parameters based on one or more statistical techniques. A similarity is determined between each pair of the applications, each pair of the computing infrastructures, and each pair of possible combinations of an application and a computing infrastructure. One or more statistical classifiers are created based on the determined similarity.
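The pairwise similarity step can be sketched for the application-to-application case. Cosine similarity is assumed here only as one plausible measure; the abstract does not name one, and the application names and parameter vectors below are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length parameter vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm


# Hypothetical application-specific parameters (e.g. CPU, disk, memory intensity).
app_profiles = {
    "web": [0.9, 0.1, 0.3],
    "batch": [0.2, 0.8, 0.7],
    "cache": [0.8, 0.2, 0.2],
}

# Similarity for every unordered pair of applications.
pairwise = {
    (a, b): cosine_similarity(app_profiles[a], app_profiles[b])
    for a, b in combinations(app_profiles, 2)
}
```

The same function could be applied to pairs of infrastructures or to application and infrastructure combinations, provided each is described by a vector of the same length.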