Abstract:
A computer-readable medium comprises a data structure for providing information about levels of similarity between pairs of N documents. The data structure comprises a plurality of entries of similarity values representing levels of similarity for a plurality of pairs of the documents. Each of the similarity values represents a level of similarity of one document of a given pair relative to the other document of the given pair. The similarity value of each entry is greater than a threshold similarity value that is greater than zero. The plurality of similarity-value entries are fewer than $N^2 - N$ in number if the similarity values are asymmetric with regard to document pairing, and the plurality of similarity-value entries are fewer than $\frac{N^2 - N}{2}$ in number if the similarity values are symmetric with regard to document pairing. A method and apparatus for generating the data structure are described.
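As a rough illustration of the thresholded structure described above, the following sketch stores only the symmetric pairs whose similarity exceeds a positive threshold. The Jaccard measure, the function names, and the sample documents are assumptions made for the example; the abstract does not specify a similarity measure or a storage layout.

```python
from itertools import combinations

def jaccard(a, b):
    """Symmetric similarity between two token sets (an assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def build_sparse_similarity(docs, threshold):
    """Keep only the pairs whose similarity exceeds a positive threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be greater than zero")
    token_sets = [set(d.lower().split()) for d in docs]
    entries = {}
    # Symmetric case: each unordered pair (i, j) with i < j is visited once.
    for i, j in combinations(range(len(docs)), 2):
        s = jaccard(token_sets[i], token_sets[j])
        if s > threshold:
            entries[(i, j)] = s
    return entries

# Hypothetical sample documents.
docs = [
    "sparse similarity matrix for documents",
    "similarity matrix for sparse documents",
    "genetic algorithm for optimization",
    "ranking candidate solutions",
]
entries = build_sparse_similarity(docs, threshold=0.2)
n = len(docs)
max_symmetric = (n * n - n) // 2  # size of a full symmetric table
print(len(entries), "of", max_symmetric, "possible entries stored")
```

Because low-similarity pairs are dropped, the number of stored entries stays below the $\frac{N^2 - N}{2}$ bound whenever at least one pair falls at or under the threshold.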
Abstract:
A computer-implemented method of solving a system optimization problem having a plurality of parameters of unknown value comprises randomly generating sets of values for the unknown parameters within the optimization problem. A population of original candidate solutions is generated by applying an algorithm for deterministic optimization to each of the sets of values. The population of solutions is ranked. Additional candidate solutions are iteratively generated from at least certain of the solutions in the population. The validity of the additional candidate solutions is checked, and the valid additional candidate solutions are added to the population of solutions. The population of solutions is re-ranked, and at least one solution from the population of solutions is output when a predetermined criterion is met, whereby the values for the parameters in the output solution may be used for controlling a system.
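The loop described above can be sketched on a toy problem. Everything concrete here is an assumption for illustration: the quadratic cost, the bounds used as the validity check, gradient descent as the deterministic optimizer, and averaging the two best solutions with Gaussian noise as the way of generating additional candidates.

```python
import random

BOUNDS = (-10.0, 10.0)  # assumed validity range for each parameter
TARGET = (3.0, -1.0)    # optimum of the toy cost (hypothetical)

def cost(p):
    """Toy cost: squared distance to TARGET (lower is better)."""
    return sum((v - t) ** 2 for v, t in zip(p, TARGET))

def local_descent(p, steps=10, lr=0.1):
    """Deterministic optimization: a few gradient-descent steps."""
    for _ in range(steps):
        p = tuple(v - lr * 2.0 * (v - t) for v, t in zip(p, TARGET))
    return p

def is_valid(p):
    """Validity check: every parameter must lie within BOUNDS."""
    return all(BOUNDS[0] <= v <= BOUNDS[1] for v in p)

def optimize(pop_size=8, generations=30, tol=1e-4, seed=0):
    rng = random.Random(seed)
    # Randomly generate sets of values for the unknown parameters.
    seeds = [tuple(rng.uniform(*BOUNDS) for _ in TARGET) for _ in range(pop_size)]
    # Original candidates: deterministic optimization applied to each set, then ranked.
    population = sorted((local_descent(s) for s in seeds), key=cost)
    history = [cost(population[0])]
    for _ in range(generations):
        if history[-1] < tol:  # predetermined stopping criterion
            break
        # Additional candidate derived from the two best-ranked solutions.
        a, b = population[0], population[1]
        child = tuple((u + v) / 2.0 + rng.gauss(0.0, 0.1) for u, v in zip(a, b))
        if is_valid(child):
            population.append(child)
        # Re-rank and keep the population at a fixed size.
        population = sorted(population, key=cost)[:pop_size]
        history.append(cost(population[0]))
    return population[0], history

best, history = optimize()
print("best parameters:", best, "cost:", history[-1])
```

Keeping the best solution in the population at every re-ranking means the best cost never gets worse from one iteration to the next.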
Abstract:
A method and apparatus for performing an informed semantic merge operation comprises selecting a source region in a document and a target region in the same or a different document. A bi-directionally coupled surface region is identified in the source region and a bi-directionally coupled surface region is identified in the target region. A first semantic object coupled to the surface region in the source region is identified and a second semantic object coupled to the surface region in the target region is identified. The subcomponents of the first semantic object are combined with the subcomponents of the second semantic object by merging.
Abstract:
A method and apparatus for the recording and maintenance of semantic elements in electronically-held information objects provide for grounding semantic objects in an ontology, such that inheritance and other relations between concepts are preserved in persistent storage. The disclosed method and apparatus provide semantic document authors with a means to anchor concept references to specific, persistent, semantic objects, thereby providing the system with access to all properties of the underlying data model of the semantic objects being referenced, while also specifying the type and scope of their relations, as well as behavioral aspects of the visual and editing environment.
Abstract:
A technique for representing an information need and employing one or more filters to select documents that satisfy the represented information need, including a technique of creating filters that involves (a) dividing a set of documents into one or more subsets such that each subset can be used as the source of features for creating a filtering profile or used to set or validate the score threshold for the profile and (b) determining whether multiple profiles are required and how to combine them to create an effective filter. Multiple profiles can be incorporated into an individual filter and the individual filters combined to create an ensemble filter. Ensemble filters can then be further combined to create meta filters.
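A minimal sketch of the profile, threshold, and ensemble idea follows. The scoring scheme (raw term counts), the rule for setting the threshold (lowest score among the threshold-setting documents), the any-filter-accepts combination, and all names and sample texts are assumptions; the abstract leaves these choices open.

```python
from collections import Counter

def tokens(text):
    return text.lower().split()

def make_profile(feature_docs, top_k=5):
    """Term-weight profile built from the feature subset (raw counts, assumed)."""
    counts = Counter(t for d in feature_docs for t in tokens(d))
    return dict(counts.most_common(top_k))

def score(profile, doc):
    """Sum the profile weights of the distinct terms found in the document."""
    return sum(profile.get(t, 0) for t in set(tokens(doc)))

def make_filter(feature_docs, threshold_docs):
    """One subset supplies features; another sets the score threshold."""
    profile = make_profile(feature_docs)
    threshold = min(score(profile, d) for d in threshold_docs)
    return lambda doc: score(profile, doc) >= threshold

def ensemble(filters):
    """Accept a document if any member filter accepts it (assumed rule)."""
    return lambda doc: any(f(doc) for f in filters)

# Hypothetical training subsets for two profiles.
solar = make_filter(
    ["solar panel efficiency", "solar cell efficiency gains"],
    ["solar efficiency report"],
)
wind = make_filter(
    ["wind turbine blades", "wind turbine noise"],
    ["offshore wind turbine"],
)
energy = ensemble([solar, wind])

print(energy("new solar panel efficiency record"))
print(energy("wind turbine maintenance"))
print(energy("stock market news"))
```

A meta filter could be formed the same way by passing several ensemble filters to `ensemble`.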
Abstract:
One example of a semantically informed text operation comprises selecting a source region of a document and determining if the source region has a surface region bi-directionally coupled to a semantic object. The coupled semantic object is identified, as are the presentation(s) associated with the semantic object. A target region of the same or another document is selected. Any of the presentations that cannot be expressed in the target region are eliminated to identify a set of remaining presentations. A set of semantic choices based on the remaining presentations is determined. One of the semantic choices is selected and executed in the target region.
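The elimination and selection steps above might look like the following sketch. The dictionary layout for regions and semantic objects, the presentation kinds, and the `choose` callback standing in for the user's selection are all hypothetical; the abstract does not define a data model.

```python
def semantic_choices(presentations, target_kinds):
    """Drop presentations the target cannot express; the rest are the choices."""
    return {k: v for k, v in presentations.items() if k in target_kinds}

def paste(source, target, choose):
    """Execute one chosen presentation of the source's semantic object in the target."""
    obj = source.get("semantic_object")
    if obj is None:  # the source region has no coupled semantic object
        return None
    choices = semantic_choices(obj["presentations"], target["kinds"])
    if not choices:
        return None
    kind = choose(sorted(choices))  # selection among the remaining choices
    target["content"].append(choices[kind])
    return kind

# Hypothetical source region coupled to a date object with three presentations.
source = {
    "semantic_object": {
        "name": "meeting-date",
        "presentations": {
            "text": "12 March 2024",
            "iso": "2024-03-12",
            "calendar_widget": "<calendar 2024-03-12>",
        },
    }
}
# Hypothetical target region that can only express plain text and ISO dates.
target = {"kinds": {"text", "iso"}, "content": []}

chosen = paste(source, target, choose=lambda options: options[0])
print(chosen, target["content"])
```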
Abstract:
A method of automatically constructing a model of an activity from an unsupervised examination of a plurality of textual documents describing the activity is comprised of: extracting prototypical steps from the plurality of textual documents; sequencing the extracted steps; aligning the sequenced steps; and constructing the model based on the aligned steps. The model may take the form of a step vs. position matrix which identifies the prototypical steps that make up the activity and provides the probability of each step occupying each position within the activity. The model thus constitutes common sense knowledge that encodes the stereotypical steps of an activity and the stereotypical sequencing of the steps.
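To make the step vs. position matrix concrete, the sketch below assumes that extraction, sequencing, and alignment have already produced equal-position step lists, with `None` marking an alignment gap. The function name and the tea-making sequences are invented for the example.

```python
from collections import Counter, defaultdict

def step_position_matrix(aligned_sequences):
    """Map each step to a list of probabilities, one per position."""
    n_docs = len(aligned_sequences)
    length = max(len(seq) for seq in aligned_sequences)
    counts = defaultdict(Counter)
    for seq in aligned_sequences:
        for pos, step in enumerate(seq):
            if step is not None:  # None marks an alignment gap
                counts[step][pos] += 1
    # Probability = share of documents with the step at that position.
    return {
        step: [c[pos] / n_docs for pos in range(length)]
        for step, c in counts.items()
    }

# Hypothetical aligned step sequences from four documents.
sequences = [
    ["boil water", "add tea", "steep", "pour"],
    ["boil water", "add tea", None, "pour"],
    ["boil water", "steep", "add tea", "pour"],
    ["boil water", "add tea", "steep", "pour"],
]
matrix = step_position_matrix(sequences)
for step, row in matrix.items():
    print(f"{step:12s}", row)
```

Each row shows where a prototypical step tends to occur, so a row concentrated on one position indicates a step with a stereotypical place in the activity.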
Abstract:
A computer-based process that permits user interaction is described for identifying, from among a set of documents, clusters of documents that have some degree of similarity. A plurality of seed candidate documents is identified. Candidate probes based upon the seed candidate documents are generated, and information regarding the candidate probes is displayed to a user. User input regarding the candidate probes is received, and a set of probes from which to form clusters of documents is defined based upon the user input regarding the candidate probes. A probe is selected, and a cluster of documents is formed, using the probe, from among available documents not yet clustered. The process can be repeated to generate further clusters. The process can be implemented with a computer system, and associated programming instructions can be contained within a computer-readable medium.
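The interactive flow above can be sketched as follows. The probe representation (the seed document's term set), the overlap score, the `min_sim` cutoff, and the `approve` callback that stands in for displaying probes and receiving user input are assumptions made for illustration.

```python
def tokens(text):
    return set(text.lower().split())

def overlap(probe, doc):
    """Fraction of probe terms present in the document (assumed similarity)."""
    return len(probe & tokens(doc)) / len(probe) if probe else 0.0

def cluster_with_probes(docs, seed_indices, approve, min_sim=0.5):
    # Candidate probes are generated from the seed candidate documents.
    candidates = [(i, tokens(docs[i])) for i in seed_indices]
    # User input decides which candidate probes are kept.
    probes = [probe for i, probe in candidates if approve(i, probe)]
    available = set(range(len(docs)))
    clusters = []
    for probe in probes:
        # Form a cluster only from documents not yet clustered.
        members = sorted(i for i in available if overlap(probe, docs[i]) >= min_sim)
        if members:
            clusters.append(members)
            available -= set(members)
    return clusters, sorted(available)

# Hypothetical document set.
docs = [
    "neural network training",
    "training a deep neural network",
    "tomato soup recipe",
    "easy tomato soup",
    "quarterly tax filing",
]
# Simulated user input: reject the probe generated from document 4.
clusters, leftover = cluster_with_probes(
    docs, seed_indices=[0, 2, 4], approve=lambda i, probe: i != 4
)
print(clusters, leftover)
```

Documents left unclustered remain available, so the same function can be called again with new seeds to generate further clusters.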