Abstract:
Techniques are provided for creating an inverted index over the features of a set of data elements. Each data element is represented by a vector of features, and the inverted index, when queried with a feature, outputs one or more data elements containing that feature. The features of the set of data elements are ranked. For each feature in the ranked list, the inverted index is queried for data elements that have the feature and do not have any previously selected feature, and a cluster of the data elements is created based on the results returned in response to the query.
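The clustering loop described above can be sketched as follows. This is a minimal illustration, not the claimed method: the abstract does not say how features are ranked, so ranking by how many elements contain a feature (ties broken alphabetically) is an assumption, as are the names `build_inverted_index` and `cluster_by_ranked_features` and the sample data.

```python
from collections import defaultdict


def build_inverted_index(elements):
    """Map each feature to the set of element ids that contain it."""
    index = defaultdict(set)
    for element_id, features in elements.items():
        for feature in features:
            index[feature].add(element_id)
    return dict(index)


def cluster_by_ranked_features(elements, ranked_features=None):
    """Create one cluster per ranked feature from elements not yet covered."""
    index = build_inverted_index(elements)
    if ranked_features is None:
        # Assumed ranking: most frequent feature first, ties alphabetical.
        ranked_features = sorted(index, key=lambda f: (-len(index[f]), f))

    covered = set()  # elements that have a previously selected feature
    clusters = {}
    for feature in ranked_features:
        # Query: has this feature AND has no previously selected feature.
        members = index.get(feature, set()) - covered
        if members:
            clusters[feature] = members
            covered |= index[feature]
    return clusters


# Hypothetical data elements, each given as a set of features.
elements = {
    "d1": {"index", "search"},
    "d2": {"search", "rank"},
    "d3": {"cluster"},
    "d4": {"cluster", "rank"},
}
clusters = cluster_by_ranked_features(elements)
```

With this sample data the features are visited in the order `cluster`, `rank`, `search`, `index`, and every element lands in exactly one cluster because each query excludes elements already covered by an earlier feature.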
Abstract:
Techniques are provided for electronic Information Retrieval (IR) applied to an electronic search in a search environment. At indexing time, a searched document is mapped to at least one element of an organizational structure of an enterprise associated with the search environment. At query time, a querying user is associated with at least one element of the organizational structure of the enterprise. The organizational information of the searched document and that of the querying user are compared. Based on the compared organizational information, a searched document with a closer organizational relation to the querying user is given a higher rank than searched documents with a less close relation to the querying user.
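One way to picture the comparison step is to treat the organizational structure as a tree and measure how far apart two units are. The sketch below is an assumption-laden illustration: the abstract does not define "closer", so tree distance through the nearest common ancestor, the boost formula `1 / (1 + distance)`, the `weight` parameter, and all unit names are hypothetical.

```python
def path_to_root(unit, parent):
    """Return the chain of units from `unit` up to the root."""
    path = [unit]
    while unit in parent:
        unit = parent[unit]
        path.append(unit)
    return path


def org_distance(a, b, parent):
    """Number of tree edges between two units, or None if unrelated."""
    depth_from_b = {u: i for i, u in enumerate(path_to_root(b, parent))}
    for steps_from_a, unit in enumerate(path_to_root(a, parent)):
        if unit in depth_from_b:
            return steps_from_a + depth_from_b[unit]
    return None


def rerank(results, user_unit, doc_unit, parent, weight=0.5):
    """Sort (doc_id, base_score) pairs, boosting organizationally close docs."""
    def score(item):
        doc_id, base_score = item
        distance = org_distance(user_unit, doc_unit[doc_id], parent)
        boost = 0.0 if distance is None else 1.0 / (1 + distance)
        return base_score + weight * boost
    return sorted(results, key=score, reverse=True)


# Hypothetical organization: child unit -> parent unit.
parent = {
    "search-team": "engineering",
    "ui-team": "engineering",
    "engineering": "company",
    "sales": "company",
}
# Mapping assumed to be produced at indexing time: document -> unit.
doc_unit = {"a": "sales", "b": "search-team", "c": "ui-team"}
results = [("a", 1.0), ("b", 1.0), ("c", 1.0)]
ranked = rerank(results, "search-team", doc_unit, parent)
```

For a user in `search-team`, document `b` (same unit) ranks first, `c` (sibling unit) second, and `a` (a unit reached only through the company root) last, even though all three have the same base score.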
Abstract:
An embodiment of a method for enhanced content browsing includes loading a web page in a user interface; detecting entities of a first specified type in the web page by an analysis service; tagging the detected entities in the web page; calling an action service associated with the analysis service when a detected entity is activated; and displaying a result of the action service in the user interface. Embodiments of systems for enhanced content browsing are also provided.
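The detect, tag, and act sequence above can be sketched on plain text. Everything specific here is an assumption made for illustration: the entity type (short phone numbers matched by a regular expression), the `<span>` markup used for tagging, and the `ACTIONS` registry standing in for an action service.

```python
import html
import re

# Assumed entity type: phone numbers of the form 555-0100.
PHONE = re.compile(r"\b\d{3}-\d{4}\b")


def detect_entities(text):
    """Analysis step: return (start, end, value) for each detected entity."""
    return [(m.start(), m.end(), m.group()) for m in PHONE.finditer(text)]


def tag_entities(text):
    """Wrap each detected entity in a span; escape everything else."""
    parts = []
    position = 0
    for start, end, value in detect_entities(text):
        parts.append(html.escape(text[position:start]))
        parts.append(
            f'<span class="entity" data-type="phone">{html.escape(value)}</span>'
        )
        position = end
    parts.append(html.escape(text[position:]))
    return "".join(parts)


# Hypothetical action service registry, keyed by entity type.
ACTIONS = {"phone": lambda value: f"Dial {value}"}


def activate(entity_type, value):
    """Call the action associated with an activated entity."""
    return ACTIONS[entity_type](value)


tagged = tag_entities("Call 555-0100 & ask")
result = activate("phone", "555-0100")
```

In a browser the activation would come from a user event on the tagged element; here `activate` is called directly to keep the example self-contained.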
Abstract:
According to one embodiment of the present invention, a method for social bookmarking and tagging documents is provided. The method comprises receiving a new document in a tagging server having a storage unit with stored tags associated with a preexisting document, and comparing the new document with the tags using a processor to find matching instances between parts of the new document and the tags. Each matching instance in the new document is marked with tag information. The marked-up new document is delivered for display on a display unit.
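The matching and marking steps might look like the sketch below. It assumes stored tags are plain strings mapped to a piece of tag information, that matching is case-insensitive on whole words, and that a match is marked as `[text|info]`; none of these details come from the abstract, and `mark_up` is a hypothetical name.

```python
import re


def mark_up(document, tags):
    """Mark every occurrence of a stored tag in `document` with its info."""
    if not tags:
        return document

    # Try longer tags first so multi-word tags win over their substrings.
    ordered = sorted(tags, key=len, reverse=True)
    pattern = re.compile(
        "|".join(r"\b" + re.escape(tag) + r"\b" for tag in ordered),
        re.IGNORECASE,
    )
    info_by_tag = {tag.lower(): info for tag, info in tags.items()}

    def mark(match):
        text = match.group()
        return f"[{text}|{info_by_tag[text.lower()]}]"

    return pattern.sub(mark, document)


# Hypothetical stored tags from a previously bookmarked document.
stored_tags = {"inverted index": "search", "clustering": "ml"}
marked = mark_up("Clustering with an inverted index.", stored_tags)
# marked == "[Clustering|ml] with an [inverted index|search]."
```

The original casing of each matching instance is preserved in the output, and only the tag information is attached to it.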
Abstract:
A framework for information extraction from natural language documents is application independent and provides a high degree of reusability. The framework integrates different Natural Language/Machine Learning techniques, such as parsing and classification. The architecture of the framework is integrated in an easy-to-use access layer. The framework performs general information extraction, classification/categorization of natural language documents, automated electronic data transmission (e.g., E-mail and facsimile) processing and routing, and plain parsing. Inside the framework, requests for information extraction are passed to the actual extractors. The framework can handle both pre- and post-processing of the application data, control the extractors, and enrich the information they extract. The framework can also suggest necessary actions the application should take on the data. To achieve the goal of easy integration and extension, the framework provides an integration (outside) application program interface (API) and an extractor (inside) API.
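The split between an outside (integration) API and an inside (extractor) API can be sketched as below. This is only an illustration of the idea: the class names, the `extract` and `process` signatures, the whitespace and lowercase pre-processing, and the routing rule used as a suggested action are all assumptions.

```python
from typing import Protocol


class Extractor(Protocol):
    """Inside API: what the framework expects from every extractor."""

    def extract(self, text: str) -> dict: ...


class KeywordClassifier:
    """A toy extractor that categorizes text by keyword lookup."""

    def __init__(self, keywords):
        self.keywords = keywords  # category -> list of trigger words

    def extract(self, text):
        for category, words in self.keywords.items():
            if any(word in text for word in words):
                return {"category": category}
        return {"category": "unknown"}


class Framework:
    """Outside API: applications register extractors and call process()."""

    def __init__(self):
        self._extractors = []

    def register(self, extractor):
        self._extractors.append(extractor)

    def process(self, raw_text):
        # Pre-processing: lowercase and collapse runs of whitespace.
        text = " ".join(raw_text.lower().split())
        info = {}
        for extractor in self._extractors:
            info.update(extractor.extract(text))
        # Post-processing: suggest an action for the application to take.
        category = info.get("category", "unknown")
        info["suggested_action"] = (
            "manual_review" if category == "unknown" else f"route_to_{category}"
        )
        return info


framework = Framework()
framework.register(KeywordClassifier({"billing": ["invoice", "payment"]}))
info = framework.process("Please resend my  INVOICE")
```

The application only sees `register` and `process`, while new extraction techniques can be added by implementing `extract`, which is the kind of separation the two APIs are meant to provide.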