Abstract:
Different advantageous embodiments provide for response prediction. A social element is received by a prediction mechanism. A feature set is generated for the social element. A prediction is generated using the feature set and a prediction model.
Abstract:
An analysis module, triggered by a synchronization framework when a new data item is added to a project data store, runs a series of analysis feature extractors on the new content. An analysis may be conducted, and features of interest may be extracted from the data item. The analysis utilizes natural language processing, as well as other technologies, to provide automatic or semi-automatic extraction of information. The extracted features of interest are saved as metadata within the project data store and are associated with the data item from which they were extracted. The analysis module may also be utilized to discover additional information that may be gleaned from content that is already in the project data store.
Abstract:
The present invention provides a system for identifying, extracting, clustering and analyzing sentiment-bearing text. In one embodiment, the invention implements a pipeline capable of accessing raw text and presenting it in a highly usable and intuitive way.
Abstract:
A computer-implemented system and method for assessing the editorial quality of a textual unit (document, paragraph or sentence) is provided. The method includes generating a plurality of training-time feature vectors by automatically extracting features from first and last versions of training documents. The method also includes training a machine-learned classifier based on the plurality of training-time feature vectors. A run-time feature vector is generated for the textual unit to be assessed by automatically extracting features from the textual unit. The run-time feature vector is evaluated using the machine-learned classifier to provide an assessment of the editorial quality of the textual unit.
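The train-then-assess flow described above can be sketched as follows. This is a minimal illustration, not the patented method: the three surface features, the nearest-centroid classifier, and the label names `needs_editing` and `polished` are all assumptions chosen to keep the example self-contained.

```python
def extract_features(text):
    """Hypothetical surface features; the abstract does not list the real ones."""
    words = text.split()
    # Count immediately repeated words such as "the the".
    repeats = sum(1 for a, b in zip(words, words[1:]) if a.lower() == b.lower())
    starts_lower = 1.0 if text[:1].islower() else 0.0
    no_final_punct = 0.0 if text.rstrip().endswith((".", "!", "?")) else 1.0
    return [float(repeats), starts_lower, no_final_punct]


def centroid(vectors):
    # Column-wise mean of a list of equal-length feature vectors.
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def train(version_pairs):
    """Build training-time vectors from (first_version, last_version) pairs."""
    first = [extract_features(f) for f, _ in version_pairs]
    last = [extract_features(l) for _, l in version_pairs]
    # Nearest-centroid stands in for the unspecified machine-learned classifier.
    return {"needs_editing": centroid(first), "polished": centroid(last)}


def assess(model, text):
    """Evaluate a run-time feature vector against the trained centroids."""
    vec = extract_features(text)

    def sq_dist(center):
        return sum((x - y) ** 2 for x, y in zip(vec, center))

    return min(model, key=lambda label: sq_dist(model[label]))


pairs = [
    ("the the report is is late", "The report is late."),
    ("we we need more more data", "We need more data."),
]
model = train(pairs)
print(assess(model, "this this draft needs work"))
print(assess(model, "This draft is finished."))
```

First versions serve as examples of unedited text and last versions as examples of edited text, so no manual labeling is needed in this sketch.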
Abstract:
In one embodiment, a web service engine server 104 may predict a successive action by a user based on an entity reference 302. The web service engine server 104 identifies an entity reference 302 in a data transmission caused by a user. The web service engine server 104 determines from the data transmission a user intention towards the entity reference 302 using an intention model based on a transmission log. The web service engine server 104 predicts a related successive web action option 522 for the entity reference 302 based on the user intention.
Abstract:
Architecture that detects and corrects writing errors in a human language based on the utilization of three different stages: error detection, correction candidate generation, and correction candidate ranking. The architecture is a generic framework for generating fluent alternatives to non-grammatical word sequences in a written sample. Error detection is addressed by a suite of language model related scores and other scores such as parse scores that can identify a particularly unlikely sequence of words. Correction candidate generation is addressed by a lookup in a very large corpus of “correct” English that looks for alternative arrangements of the same or similar words or subsequences of these words in the same context. Correction candidate ranking is addressed by a language model ranker.
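The three stages can be illustrated with a toy pipeline. Everything here is an assumption made for the sketch: the three-sentence `CORPUS` stands in for the very large corpus, a bigram-coverage score stands in for the suite of language model and parse scores, and candidate generation simply permutes the input words rather than looking up alternatives in context.

```python
from collections import Counter
from itertools import permutations

# Toy stand-in for a very large corpus of "correct" English.
CORPUS = [
    "i have a red car",
    "she has a red car",
    "i have a big dog",
]


def bigrams(tokens):
    padded = ["<s>"] + list(tokens) + ["</s>"]
    return list(zip(padded, padded[1:]))


BIGRAM_COUNTS = Counter(bg for line in CORPUS for bg in bigrams(line.split()))


def lm_score(tokens):
    """Fraction of the sequence's bigrams that occur in the corpus."""
    bgs = bigrams(tokens)
    return sum(1 for bg in bgs if BIGRAM_COUNTS[bg] > 0) / len(bgs)


def detect_error(tokens, threshold=1.0):
    # Stage 1: flag a sequence that contains at least one unseen bigram.
    return lm_score(tokens) < threshold


def generate_candidates(tokens):
    # Stage 2: alternative arrangements of the same words.
    return set(permutations(tokens))


def rank_candidates(candidates):
    # Stage 3: best language-model score first; ties broken alphabetically.
    return sorted(candidates, key=lambda c: (-lm_score(c), c))


def correct(sentence):
    tokens = sentence.split()
    if not detect_error(tokens):
        return sentence
    return " ".join(rank_candidates(generate_candidates(tokens))[0])


print(correct("i have a car red"))
```

Permuting every word is only workable for very short inputs; it is used here because it keeps the candidate stage visible in a few lines.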
Abstract:
Described is estimating whether an online search query is a news-related query, and if so, outputting news-related results in association with other search results returned in response to the query. The query is processed into features, including by accessing corpora that correspond to relatively current events, e.g., recently crawled from news and blog articles. A corpus of static reference data, such as an online encyclopedia, may be used to help determine whether the query is less likely to be about current events. Features include frequency-related data and context-related data corresponding to frequency and context information maintained in the corpora. Additional features may be obtained by processing text of the query itself, e.g., “query-only” features.
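A rough sketch of the frequency-related and "query-only" features might look like the following. The feature names, the add-one smoothing, and the tiny corpora are assumptions for illustration; the abstract does not specify how the features are computed, and context-related features are omitted.

```python
from collections import Counter


def term_frequencies(documents):
    """Count lowercase whitespace-separated terms across a corpus."""
    counts = Counter()
    for doc in documents:
        counts.update(doc.lower().split())
    return counts


def query_features(query, news_counts, reference_counts):
    terms = query.lower().split()
    news_freq = sum(news_counts[t] for t in terms)
    ref_freq = sum(reference_counts[t] for t in terms)
    return {
        "news_freq": news_freq,            # frequency in the recent corpus
        "reference_freq": ref_freq,        # frequency in the static reference corpus
        "news_ratio": (news_freq + 1) / (ref_freq + 1),  # add-one smoothing (assumed)
        "query_length": len(terms),        # a "query-only" feature
    }


news = term_frequencies(["earthquake hits city", "earthquake relief effort"])
reference = term_frequencies([
    "an earthquake is a shaking of the ground",
    "a city is a large settlement",
])
print(query_features("earthquake relief", news, reference))
```

A ratio well above 1 suggests the query terms are more common in recent content than in reference content; a downstream classifier, not shown here, would consume these features.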
Abstract:
An overwhelming number of articles are available every day via the internet. Unfortunately, it is impossible to peruse more than a handful, and it is difficult to ascertain an article's social context. The techniques disclosed herein address this problem by harnessing implicit and explicit contextual information from social media. By extracting text surrounding a hyperlink to an article in a post and assessing the article as a function of content surrounding the hyperlink, an article's social context is determined and presented. Additionally, articles that are sufficiently similar in content may be grouped to establish a many-to-one relationship between posts and an article, creating a more accurate assessment.
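Extracting the text around a hyperlink and collecting it per article could be sketched as below. The word-window approach, the `window` parameter, and the example URLs are assumptions. Grouping here is by identical URL only, a simplification of the content-similarity grouping the abstract describes.

```python
import re
from collections import defaultdict

URL_PATTERN = re.compile(r"https?://\S+")


def link_contexts(post, window=5):
    """Return (url, surrounding_words) pairs for each hyperlink in a post."""
    tokens = post.split()
    contexts = []
    for i, token in enumerate(tokens):
        if URL_PATTERN.fullmatch(token):
            before = tokens[max(0, i - window):i]
            after = tokens[i + 1:i + 1 + window]
            contexts.append((token, " ".join(before + after)))
    return contexts


def contexts_by_article(posts, window=5):
    """Many posts to one article: collect every context that points at a URL."""
    grouped = defaultdict(list)
    for post in posts:
        for url, context in link_contexts(post, window):
            grouped[url].append(context)
    return dict(grouped)


posts = [
    "Great analysis of the election http://example.com/a worth a read",
    "Misleading numbers in http://example.com/a again",
]
print(contexts_by_article(posts, window=3))
```

The collected context strings are the raw material for assessing the article; the assessment step itself (for example, sentiment scoring) is left out of this sketch.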
Abstract:
A summarization system and method. The summarization method includes utilizing a first body of information to obtain a second body of information, which is identified (by a hyperlink, an attachment identifier, a reference, etc.) in the first body of information. A summary of the obtained second body of information is then computed. The computed summary can be displayed to a user and/or stored for later use.
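One way to picture the method: find the references in the first body, obtain each referenced body, and compute a summary of it. In this sketch, references are limited to hyperlinks, the `fetch` callable is a hypothetical stand-in for retrieving the second body, and the word-frequency extractive summarizer is an assumed technique that the abstract does not prescribe.

```python
import re
from collections import Counter


def find_references(first_body):
    # Only hyperlinks are handled; attachments and other identifiers are omitted.
    return re.findall(r"https?://\S+", first_body)


def summarize(text, max_sentences=1):
    """Keep the sentences whose words are, on average, most frequent in the text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freqs = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(freqs[w] for w in words) / len(words) if words else 0.0

    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Emit the selected sentences in their original order.
    return " ".join(s for s in sentences if s in top)


def summarize_references(first_body, fetch, max_sentences=1):
    """Map each reference in the first body to a summary of the body it names."""
    return {ref: summarize(fetch(ref), max_sentences)
            for ref in find_references(first_body)}


# Hypothetical in-memory store standing in for real retrieval.
documents = {
    "http://example.com/report":
        "Sales rose in March. Sales rose again in April. Weather was mild.",
}
email = "See http://example.com/report for details."
print(summarize_references(email, documents.get))
```

The returned dictionary can be displayed to the user or stored for later use, matching the two outcomes the abstract mentions.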
Abstract:
A method for providing aligned editorial corrections to a database is discussed. The method includes receiving a first text in a language and organizing the first text into one or more sentences. The method further includes editing a copy of the first text to create a second text. The second text is in the language of the first text. The method further includes aligning the sentences of the first text with corresponding sentences of the second text and storing the aligned sentences on a computer readable medium. A system for providing a data structure having aligned editorial corrections is also discussed. The system includes an alignment component for receiving a first text and organizing the first text into sentences. The system also includes a user interface configured to provide a second text, wherein the second text is an edited version of the first text in the language of the first text.
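The sentence-alignment step can be sketched with Python's standard `difflib`. The naive punctuation-based sentence splitter and the use of `SequenceMatcher` are assumptions for illustration; the abstract does not name an alignment algorithm, and storage of the aligned pairs is omitted.

```python
import re
from difflib import SequenceMatcher
from itertools import zip_longest


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def align_sentences(first_text, second_text):
    """Pair each original sentence with its edited counterpart.

    Sentences deleted by the editor pair with None on the right;
    sentences inserted by the editor pair with None on the left.
    """
    first = split_sentences(first_text)
    second = split_sentences(second_text)
    matcher = SequenceMatcher(a=first, b=second, autojunk=False)
    pairs = []
    for _tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # zip_longest covers equal, replace, delete and insert blocks alike.
        pairs.extend(zip_longest(first[i1:i2], second[j1:j2]))
    return pairs


original = "The cat sat. It were happy. The end."
edited = "The cat sat. It was happy. The end."
for before, after in align_sentences(original, edited):
    print(repr(before), "->", repr(after))
```

Unchanged sentences anchor the alignment, so an edited sentence is paired with the original that sits between the same unchanged neighbors.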