Abstract:
Medical-related quality-of-care information is extracted and edited for reporting. Patient records are mined. The mining may include mining unstructured data to create structured information. Measures are derived automatically from the structured information. A user may then edit the measures, the data points used to derive the measures, or other quality metrics based on expert review. The editing may allow for a better quality report. Tools may be provided to configure reports, allowing generation of new or different reports.
Abstract:
Medical or other data is de-identified by obfuscation. Located instances are replaced. By replacing with values in the same format and at the same level of generality, multiple possible identifications (the replacement values and the instances not located) are provided in the data, obfuscating the original identification. By replacing as a function of a probability, the resulting data set has different instances distributed in a way that makes it more difficult to identify the actual or original instances that the search did not locate.
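The replacement step described above can be sketched as follows. This is a minimal illustration of one possible reading, not the patented method: the surrogate pool, its weights, and the function name obfuscate_names are all hypothetical, and "replacing as a function of a probability" is interpreted here as drawing each surrogate from a weighted distribution.

```python
import random
import re

# Hypothetical surrogate pool with sampling weights (assumed, not from the source).
# Every surrogate has the same "First Last" format as the names it replaces.
SURROGATES = {
    "Alice Morgan": 0.4,
    "Brian Chen": 0.3,
    "Carla Diaz": 0.2,
    "David Okoye": 0.1,
}


def obfuscate_names(text, located_names, rng=None):
    """Replace every located name with a surrogate drawn by weighted sampling."""
    rng = rng or random.Random()
    if not located_names:
        return text
    # Longest names first so that a longer name wins over a name it contains.
    ordered = sorted(located_names, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in ordered))
    pool = list(SURROGATES)
    weights = list(SURROGATES.values())

    def substitute(_match):
        # Each occurrence is sampled independently from the weighted pool.
        return rng.choices(pool, weights=weights, k=1)[0]

    return pattern.sub(substitute, text)


note = "John Smith was referred by Dr. Lee. John Smith agreed."
print(obfuscate_names(note, ["John Smith"], rng=random.Random(0)))
```

In this sketch, any name the search missed ("Dr. Lee" in the example) stays in the text alongside the surrogates, so a reader cannot tell which of the remaining names are original.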
Abstract:
A method for interpreting date information from unstructured text includes performing phrase tokenization on the unstructured text to identify one or more temporal phrases. Word categorization is performed on the one or more temporal phrases to categorize one or more words of each temporal phrase. Grammar analysis is performed to match each temporal phrase to an understood syntax using the categorizations of the words of each temporal phrase. Each temporal phrase is interpreted based on the matched syntax.
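The four stages described above (phrase tokenization, word categorization, grammar matching, interpretation) can be sketched as a small pipeline. This is a minimal illustration under stated assumptions: the vocabulary, the category names (NUM, UNIT, AGO, DAYWORD), and the two supported syntaxes are hypothetical and far narrower than a practical system would need.

```python
import re
from datetime import date, timedelta

# Hypothetical, deliberately small vocabulary.
UNITS = {"day": 1, "days": 1, "week": 7, "weeks": 7}
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}

# Stage 1: phrase tokenization. Find candidate temporal phrases in free text.
PHRASE_RE = re.compile(
    r"\b(?:\d+|one|two|three|four|five)\s+(?:days?|weeks?)\s+ago\b"
    r"|\b(?:yesterday|today)\b",
    re.IGNORECASE,
)


def categorize(word):
    """Stage 2: word categorization. Map a word to a coarse category."""
    w = word.lower()
    if w.isdigit() or w in NUMBER_WORDS:
        return "NUM"
    if w in UNITS:
        return "UNIT"
    if w == "ago":
        return "AGO"
    if w in ("yesterday", "today"):
        return "DAYWORD"
    return "OTHER"


def interpret(phrase, reference):
    """Stages 3 and 4: match the category sequence to a known syntax, then resolve it."""
    words = phrase.split()
    syntax = tuple(categorize(w) for w in words)
    if syntax == ("NUM", "UNIT", "AGO"):
        first = words[0].lower()
        count = int(first) if first.isdigit() else NUMBER_WORDS[first]
        return reference - timedelta(days=count * UNITS[words[1].lower()])
    if syntax == ("DAYWORD",):
        offset = 1 if words[0].lower() == "yesterday" else 0
        return reference - timedelta(days=offset)
    return None  # No understood syntax matched.


def extract_dates(text, reference):
    """Return (phrase, resolved date) pairs for each temporal phrase in text."""
    return [(m.group(0), interpret(m.group(0), reference))
            for m in PHRASE_RE.finditer(text)]


print(extract_dates("Surgery 2 weeks ago; patient returned yesterday.", date(2020, 1, 15)))
```

Relative phrases are resolved against a reference date supplied by the caller, for example the date of the note in which the phrase appears.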
Abstract:
A method for sequence tagging medical patient records includes providing a labeled corpus of sentences taken from a set of medical records, initializing generative parameters $\theta$ and discriminative parameters $\tilde{\theta}$, and providing a functional $LL - C \times \mathrm{Penalty}$, where $LL$ is the log-likelihood function
$$LL = \log p(\theta, \tilde{\theta}) + \sum_{l=1}^{M} \left[ \log p(X_l, Y_l \mid \tilde{\theta}) - \log p(X_l \mid \tilde{\theta}) \right] + \sum_{l=1}^{M} \log p(X_l \mid \theta)$$
and
$$\mathrm{Penalty} = \sum_{y \in V_Y} \left( em_y^2 + tr_y^2 + e\tilde{m}_y^2 + t\tilde{r}_y^2 \right),$$
where $em_y = 1 - \sum_{x_i \in V_X} p(x_i \mid y)$ and $e\tilde{m}_y = 1 - \sum_{x_i \in V_X} \tilde{p}(x_i \mid y)$ are emission probability constraints, and $tr_y = 1 - \sum_{y_i \in V_Y} p(y_i \mid y)$ and $t\tilde{r}_y = 1 - \sum_{y_i \in V_Y} \tilde{p}(y_i \mid y)$ are transition probability constraints. The method further includes extracting gradients of $LL - C \times \mathrm{Penalty}$ with respect to the transition and emission probabilities, solving for the $\theta_k^*, \tilde{\theta}_k^*$ that maximize $LL - C \times \mathrm{Penalty}$, initializing a new iteration with $\theta_k^*, \tilde{\theta}_k^*$, incrementing $C$, and repeating until the solutions have converged, where the parameters $\theta, \tilde{\theta}$ are the probabilities that a new sentence $X'$ is labeled as $Y'$.
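The Penalty term above measures how far each emission and transition distribution is from summing to one. The following minimal sketch evaluates it for small probability tables. The nested-dictionary layout and the function names are assumptions made for illustration, and the log-likelihood is passed in as a plain number rather than computed from a corpus.

```python
def normalization_gap(table):
    """For each label y, return 1 - sum_i p(i | y) for the given table.

    `table` maps a label y to a dict of conditional probabilities p(. | y).
    """
    return {y: 1.0 - sum(dist.values()) for y, dist in table.items()}


def penalty(emission, transition, emission_tilde, transition_tilde):
    """Sum over labels of the squared gaps em, tr, and their tilde counterparts."""
    em = normalization_gap(emission)
    tr = normalization_gap(transition)
    em_t = normalization_gap(emission_tilde)
    tr_t = normalization_gap(transition_tilde)
    return sum(em[y] ** 2 + tr[y] ** 2 + em_t[y] ** 2 + tr_t[y] ** 2 for y in em)


def penalized_objective(log_likelihood, c, *tables):
    """Value of LL - C x Penalty for a given log-likelihood and weight C."""
    return log_likelihood - c * penalty(*tables)


# Two labels, two observable words; all distributions properly normalized.
emission = {"A": {"x": 0.5, "y": 0.5}, "B": {"x": 0.25, "y": 0.75}}
transition = {"A": {"A": 0.5, "B": 0.5}, "B": {"A": 1.0, "B": 0.0}}
print(penalty(emission, transition, emission, transition))

# An emission row that sums to 0.75 leaves a gap of 0.25.
unnormalized = {"A": {"x": 0.5, "y": 0.25}, "B": {"x": 0.25, "y": 0.75}}
print(penalized_objective(-10.0, 2.0, unnormalized, transition, emission, transition))
```

Because the penalty is zero only when every distribution is normalized, increasing C across iterations pushes the maximizer toward valid probability tables.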
Abstract:
The present invention provides a data mining framework for mining high-quality structured clinical information. The data mining framework includes a data miner that mines medical information from a computerized patient record (CPR) based on domain-specific knowledge contained in a knowledge base. The data miner includes components for extracting information from the CPR, combining all available evidence in a principled fashion over time, and drawing inferences from this combination process. The mined medical information is stored in a structured CPR which can be a data warehouse.
Abstract:
A technique is provided for automatically generating performance measurement information. At least some of the obtained performance measurement information may be derived from unstructured data sources, such as free text physician notes, medical images, and waveforms. The performance measurement may be sent to a health care accreditation organization. The health care accreditation organization can use the performance measurement to evaluate a health care provider for its quality of patient care. Alternatively, performance measurement information can be provided directly to consumers.
Abstract:
The present invention provides a graphical user interface for presentation, exploration and verification of patient information. In various embodiments, a method is provided for browsing mined patient information. The method includes selecting patient information to view, at least some of the patient information being probabilistic, presenting the selected patient information on a screen, the selected patient information including links to related information. The selected patient information may include elements, factoids, and/or conclusions. The selected patient information may include an element linked to unstructured information. For example, an element linked to a note with highlighted information may be presented. Additionally, the unstructured information may include medical images and waveform information.
Abstract:
A method for creating and searching medical ontologies includes providing a semi-structured information source comprising a plurality of articles linked to each other, each article having one or more sections and being associated with a concept, creating a directed unlabeled graph representative of the information source, providing a plurality of labels, labeling a subset of edges, and assigning each unlabeled edge an equal probability of being assigned any one of the labels. For each node, the probability of each outgoing edge is updated by smoothing each probability by an overall probability distribution of labels over all outgoing edges of that node, and the probability of each incoming edge is updated the same way. To create a labeled graph, the label with the maximum probability is assigned to an edge if that maximum probability is greater than a predetermined threshold.
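The edge-labeling procedure described above can be sketched as follows. The abstract does not give the smoothing formula, so this illustration assumes a simple interpolation between an edge's own distribution and the mean distribution over its node's edges. The mixing weight alpha, the iteration count, the choice to keep seed-labeled edges fixed, and all function names are assumptions.

```python
from collections import defaultdict


def init_distributions(edges, labels, seed_labels):
    """Seed edges get a one-hot distribution; all other edges start uniform."""
    uniform = 1.0 / len(labels)
    dists = {}
    for edge in edges:
        if edge in seed_labels:
            dists[edge] = {l: 1.0 if l == seed_labels[edge] else 0.0 for l in labels}
        else:
            dists[edge] = {l: uniform for l in labels}
    return dists


def smooth_pass(dists, labels, seed_labels, node_of, alpha):
    """Pull each non-seed edge toward the mean distribution of its node's edges."""
    groups = defaultdict(list)
    for edge in dists:
        groups[node_of(edge)].append(edge)
    updated = dict(dists)
    for node_edges in groups.values():
        mean = {l: sum(dists[e][l] for e in node_edges) / len(node_edges) for l in labels}
        for edge in node_edges:
            if edge in seed_labels:
                continue  # Assumption: hand-labeled edges stay fixed.
            updated[edge] = {l: (1 - alpha) * dists[edge][l] + alpha * mean[l]
                             for l in labels}
    return updated


def label_graph(edges, labels, seed_labels, threshold=0.6, iterations=5, alpha=0.5):
    """Return {edge: label} for edges whose best label exceeds the threshold."""
    dists = init_distributions(edges, labels, seed_labels)
    for _ in range(iterations):
        dists = smooth_pass(dists, labels, seed_labels, lambda e: e[0], alpha)  # outgoing
        dists = smooth_pass(dists, labels, seed_labels, lambda e: e[1], alpha)  # incoming
    labeled = {}
    for edge, dist in dists.items():
        best = max(dist, key=dist.get)
        if dist[best] > threshold:
            labeled[edge] = best
    return labeled


edges = [("Aspirin", "Headache"), ("Aspirin", "Fever"),
         ("Aspirin", "Pain"), ("Ibuprofen", "Ulcer")]
seeds = {("Aspirin", "Headache"): "treats", ("Aspirin", "Fever"): "treats"}
print(label_graph(edges, ["treats", "causes"], seeds))
```

In this example the unlabeled edge from "Aspirin" to "Pain" inherits "treats" from its sibling edges, while the isolated edge from "Ibuprofen" to "Ulcer" stays at the uniform distribution and is left unlabeled.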
Abstract:
Members of a medical entity class are extracted from patient data. A semi-supervised approach uses one or more initial medical terms, such as terms from an ontology, for a given category or medical canonical entity. A larger set of medical terms is extracted from the medical information. In one example, the extraction is performed using lexical surface form features, rather than syntactical parsing.