Abstract:
Consumer-generated media (CGM) and/or other media are monitored to allow an organization to become aware of, and respond to, issues that may affect how it is perceived by the public. An extract, transform, load (ETL) engine is used to process CGM and other media content, and an analytical engine utilizes a multi-step progressive filtering approach to identify those documents that are most relevant. The filtering approach includes executing broad queries to extract relevant content from different CGM and other sources, extracting text snippets from the relevant content and performing de-duplication, defining organizational identity (e.g., brand name, trade name, or company name) and hot-topic models using a combined rule-based and statistical approach, and using the models together in an orthogonal filtering approach to generate alerts and reports. The methodology is found to be substantially more effective than a conventional keyword-based approach.
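The progressive filtering steps described above (snippet extraction, de-duplication, then two independent models applied together) might be sketched roughly as follows. This is a minimal illustration, not the patented method: the function names, the sentence-window snippet rule, the exact-match de-duplication, and the keyword-only toy models are all assumptions, and the statistical part of the models is not attempted.

```python
import re

def extract_snippets(text, brand_terms, window=1):
    """Return each sentence that mentions a brand term, plus `window` neighbours per side."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    snippets = []
    for i, sentence in enumerate(sentences):
        if any(term in sentence.lower() for term in brand_terms):
            lo, hi = max(0, i - window), min(len(sentences), i + window + 1)
            snippets.append(" ".join(sentences[lo:hi]))
    return snippets

def deduplicate(snippets):
    """Drop snippets that are identical after case and whitespace normalisation."""
    seen, unique = set(), []
    for snippet in snippets:
        key = " ".join(snippet.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(snippet)
    return unique

def orthogonal_filter(snippets, identity_model, topic_model):
    """Keep only snippets accepted by both independent models."""
    return [s for s in snippets if identity_model(s) and topic_model(s)]

# Hypothetical rule-based stand-ins for the identity and hot-topic models.
identity_model = lambda s: "acme" in s.lower()
topic_model = lambda s: any(w in s.lower() for w in ("fire", "recall"))

documents = [
    "I bought an Acme toaster. It caught fire yesterday. Very scary.",
    "I bought an Acme toaster.  It caught fire yesterday. Very scary.",
    "Acme makes toasters. They are shiny.",
]
snippets = [s for doc in documents for s in extract_snippets(doc, ["acme"])]
alerts = orthogonal_filter(deduplicate(snippets), identity_model, topic_model)
```

In this toy run the two near-identical posts collapse into one snippet, and the third document is dropped because it passes the identity model but not the hot-topic model.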
Abstract:
A method for identifying emerging concepts in unstructured text streams comprises: selecting a subset V of documents from a set U of documents; generating at least one Boolean combination of terms that partitions the set U into a plurality of categories that represent a generalized, statistically based model of the selected subset V wherein the categories are disjoint inasmuch as each document of U is included in only one category of the partition; and generating a descriptive label for each of the disjoint categories from the Boolean combination of terms for that category.
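One way to picture the partitioning step is a decision-list style assignment: terms that are over-represented in V relative to U are chosen, each document of U falls into the category of the first chosen term it contains, and the category label is read off the Boolean combination. The sketch below is only an illustration under those assumptions; the scoring formula, the function names `pick_terms` and `partition`, and the whitespace tokenisation are hypothetical and are not taken from the abstract.

```python
from collections import Counter

def pick_terms(U, V, k=2):
    """Rank terms by how much more often they occur in V than in U (document frequency)."""
    df_u = Counter(t for doc in U for t in set(doc.lower().split()))
    df_v = Counter(t for doc in V for t in set(doc.lower().split()))
    score = {t: df_v[t] / len(V) - df_u[t] / len(U) for t in df_v}
    return sorted(score, key=lambda t: (-score[t], t))[:k]

def partition(U, terms):
    """Assign every document to exactly one category; the label is the Boolean rule."""
    categories = {}
    for doc in U:
        words = set(doc.lower().split())
        label_parts = []
        for term in terms:
            if term in words:
                label_parts.append(term)      # first matching term closes the rule
                break
            label_parts.append(f"NOT {term}")
        categories.setdefault(" AND ".join(label_parts), []).append(doc)
    return categories

U = ["battery drains fast", "battery overheats", "screen cracked", "great camera"]
V = U[:2]   # hypothetical selected subset
categories = partition(U, ["battery", "screen"])
```

Because each document stops at its first matching term, the categories cannot overlap, which mirrors the disjointness requirement in the abstract.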
Abstract:
Data may be modeled as an undirected graph. A set of entities and a set of attributes may be defined. A set of relationships may be defined to represent semantic associations with each association connecting at least two entities. Attributes may be associated with entities rather than with relationships. A hierarchical query language with a set of atomic operations on modeled data may be employed. The modeled data may be displayed on a display unit.
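The data model described above could be sketched as follows, assuming a small in-memory structure: entities carry attributes, relationships are unordered and attribute-free, and each relationship connects at least two entities. The class name `Graph` and its methods are hypothetical, and `neighbors` stands in for just one possible atomic operation; the hierarchical query language itself is not attempted here.

```python
from dataclasses import dataclass, field

@dataclass
class Graph:
    entities: dict = field(default_factory=dict)       # entity id -> attribute dict
    relationships: list = field(default_factory=list)  # (label, frozenset of entity ids)

    def add_entity(self, entity_id, **attributes):
        """Attributes live on entities only, never on relationships."""
        self.entities[entity_id] = dict(attributes)

    def relate(self, label, *entity_ids):
        """Record an undirected association among two or more known entities."""
        members = frozenset(entity_ids)
        if len(members) < 2:
            raise ValueError("a relationship must connect at least two entities")
        missing = [m for m in members if m not in self.entities]
        if missing:
            raise KeyError(f"unknown entities: {missing}")
        self.relationships.append((label, members))

    def neighbors(self, entity_id, label=None):
        """Atomic operation: entities sharing a relationship with `entity_id`."""
        found = set()
        for rel_label, members in self.relationships:
            if entity_id in members and label in (None, rel_label):
                found |= members - {entity_id}
        return found

g = Graph()
g.add_entity("alice", role="engineer")
g.add_entity("bob", role="analyst")
g.add_entity("acme", kind="company")
g.relate("works_at", "alice", "acme")
g.relate("knows", "alice", "bob")
```

Storing members as a `frozenset` keeps each relationship undirected and also allows associations among more than two entities.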
Abstract:
A method for analyzing sentiment comprising: collecting objects from an external content repository, the collected objects forming a content database; extracting a snippet related to a subject from the content database; calculating a sentiment score for the snippet; classifying the snippet into a sentiment category; creating a sentiment taxonomy using the sentiment categories, the sentiment taxonomy classifying the snippets as positive, negative, or neutral; identifying topic words within the sentiment taxonomy; classifying each topic word as a sentiment topic word candidate or a non-sentiment topic word candidate; filtering the non-sentiment topic word candidates; identifying the frequency of the non-sentiment topic words in each of the sentiment categories; identifying the importance of each non-sentiment topic word for each of the sentiment categories; and ranking the topic words, wherein the rank is calculated by combining the frequency of a topic word in each of the categories with its importance.
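The ranking step can be illustrated with a toy example. The abstract does not say how frequency and importance are combined, so the sketch below makes loud assumptions: sentiment is scored with a tiny hand-made lexicon, importance is taken to be the share of a word's occurrences that fall in a given category, and the rank score is frequency multiplied by importance. All names and word lists are hypothetical.

```python
from collections import Counter, defaultdict

POSITIVE = {"great", "love", "excellent"}            # hypothetical toy lexicon
NEGATIVE = {"bad", "terrible", "broken"}             # hypothetical toy lexicon
STOPWORDS = {"the", "is", "a", "i", "this", "and", "was"}

def sentiment_score(snippet):
    """Positive-word count minus negative-word count."""
    words = snippet.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def categorize(score):
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def rank_topic_words(snippets):
    """Rank non-sentiment topic words per category by frequency * importance."""
    excluded = POSITIVE | NEGATIVE | STOPWORDS
    freq = defaultdict(Counter)
    for snippet in snippets:
        category = categorize(sentiment_score(snippet))
        for word in snippet.lower().split():
            if word not in excluded:
                freq[category][word] += 1
    total = Counter()
    for counts in freq.values():
        total.update(counts)
    ranked = {}
    for category, counts in freq.items():
        # Assumed importance: fraction of the word's occurrences in this category.
        scored = {w: n * (n / total[w]) for w, n in counts.items()}
        ranked[category] = sorted(scored, key=lambda w: (-scored[w], w))
    return ranked

snippets = [
    "the battery is great",
    "love the battery",
    "the screen is broken",
    "screen was terrible and bad",
    "battery is bad",
]
ranked = rank_topic_words(snippets)
```

Here "screen" outranks "battery" among negative snippets because it is both more frequent there and appears nowhere else.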
Abstract:
A system for classifying documents in a collection of documents according to their intended readerships includes: a computer configured to select a document in the collection of documents; and a computer configured to determine a characteristic of the selected document, the characteristic being: misleading when the document includes one or more features that are determined to be for a purpose other than reading the document; commercial when the document includes features that are presented for a commercial purpose; or personal when the document includes features of a personal opinion. A computer classifies the selected document as misleading, commercial, or personal according to its determined characteristic; and a computer repeats the steps of selecting a document, determining a characteristic of the selected document, and classifying the selected document for additional documents in the collection. At least some documents are classified as misleading, at least some as commercial, and at least some as personal.
Abstract:
One embodiment is a computer-implemented method for classifying documents in a collection of documents according to their intended readerships. The method comprises using a computer to select a document in the collection of documents; and using a computer to determine a characteristic of the selected document, the characteristic being: misleading when the document includes one or more features that are determined to be for a purpose other than reading the document; commercial when the document includes features that are presented for a commercial purpose; or personal when the document includes features of a personal opinion. The method further includes using a computer to classify the selected document as misleading, commercial, or personal according to its determined characteristic; and using a computer to repeat the steps of selecting a document, determining a characteristic of the selected document, and classifying the selected document for additional documents in the collection. At least some documents are classified as misleading, at least some documents are classified as commercial, and at least some documents are classified as personal. Other methods and computer program products are also disclosed according to further embodiments.
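The select, determine, classify, and repeat loop could look roughly like the sketch below. The abstracts do not name the features used, so everything concrete here is an assumption: keyword stuffing stands in for a feature "for a purpose other than reading", a short cue-word list stands in for commercial features, and first-person opinion words stand in for personal features.

```python
import re
from collections import Counter

COMMERCIAL_CUES = {"buy", "sale", "discount", "price", "order"}   # hypothetical
PERSONAL_CUES = {"i", "my", "me", "think", "feel"}                 # hypothetical

def determine_characteristic(text):
    """Return 'misleading', 'commercial', 'personal', or None if no rule fires."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return None
    # Assumed "misleading" feature: a single word dominating the text (keyword stuffing).
    _, top_count = Counter(words).most_common(1)[0]
    if len(words) >= 6 and top_count / len(words) > 0.5:
        return "misleading"
    if COMMERCIAL_CUES & set(words):
        return "commercial"
    if PERSONAL_CUES & set(words):
        return "personal"
    return None

def classify_collection(documents):
    """Repeat the determine-and-classify step for every document in the collection."""
    return {doc_id: determine_characteristic(text) for doc_id, text in documents.items()}

documents = {
    "d1": "cheap cheap cheap cheap cheap cheap watches",
    "d2": "Order now and get a discount on every watch",
    "d3": "I think my new watch feels too heavy",
}
labels = classify_collection(documents)
```

The rule order is a design choice in this sketch: the misleading check runs first so that stuffed pages containing commercial words are not mistaken for ordinary advertising.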
Abstract:
The present invention provides a method, Web server and computer system for converging a desktop application and a Web application. The method may comprise: in response to a request from a client user for using a target desktop application, starting a desktop application initialization process on the Web server and determining an appropriate corresponding hosting server for the user; preparing and provisioning a desktop application environment on the corresponding hosting server and starting the target desktop application; transmitting the corresponding hosting server's address to the client so as to enable desktop application interaction between the client and the corresponding hosting server; and in response to the completion of the desktop application interaction, stopping and exiting the target desktop application on the corresponding hosting server.
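The request lifecycle described above might be modeled, very loosely, as a broker that picks a hosting server, starts the application there, hands the address back to the client, and tears the application down on completion. The class names, the least-loaded selection rule, and the example addresses are all hypothetical; the abstract says only that an "appropriate" hosting server is determined.

```python
from dataclasses import dataclass, field

@dataclass
class HostingServer:
    address: str
    running: set = field(default_factory=set)   # (user, app) pairs currently running

    def provision_and_start(self, user, app):
        """Stand-in for preparing the environment and launching the application."""
        self.running.add((user, app))

    def stop(self, user, app):
        """Stand-in for stopping and exiting the application."""
        self.running.discard((user, app))

class WebServer:
    def __init__(self, hosts):
        self.hosts = hosts
        self.sessions = {}   # (user, app) -> HostingServer

    def request_app(self, user, app):
        """Choose a host, start the app there, and return the host address to the client."""
        host = min(self.hosts, key=lambda h: len(h.running))   # assumed: least loaded
        host.provision_and_start(user, app)
        self.sessions[(user, app)] = host
        return host.address

    def complete(self, user, app):
        """Called when the interaction ends: stop the app on its hosting server."""
        host = self.sessions.pop((user, app))
        host.stop(user, app)

hosts = [HostingServer("host-a.example"), HostingServer("host-b.example")]
web = WebServer(hosts)
first = web.request_app("alice", "editor")
second = web.request_app("bob", "editor")
web.complete("alice", "editor")
```

After the address is returned, the client is assumed to talk to the hosting server directly, so the broker only tracks which session lives where.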
Abstract:
Among the various aspects of the present disclosure is the provision of methods of detecting biomarkers of endoplasmic reticulum (ER) stress-associated kidney diseases. Another aspect of the present disclosure provides for a method of treating an endoplasmic reticulum (ER) stress-associated kidney disease in a subject.