Abstract:
A method and analytics tools for information mining incorporating domain specific knowledge and conceptual structures are disclosed, the method including: providing a first set of documents related to a first topic of interest; using a first taxonomy to categorize the first set of documents into a set of categories; providing a second set of documents related to a second topic of interest; categorizing the second set of documents according to the set of categories of the first set of documents; using an element of domain knowledge to re-categorize the first set of documents; and examining a category to identify a document of interest.
Abstract:
A data warehouse is created using an input file that can contain sub-documents of different formats. A root document model including path names to all nodes among the sub-documents is generated, and a table is generated with columns being derived from the path names of the root model. The sub-documents are shredded to populate the table. Then, the dimensions of the data warehouse are defined by selecting respective columns. A routine such as a DDL may then be generated to populate the data warehouse with data.
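The shredding step can be illustrated with a minimal sketch. It assumes the input file is XML and that each child of the root element is one sub-document; the names `leaf_paths`, `shred`, and the sample data are hypothetical and not taken from the disclosure. Column names are the union of all leaf path names, and each sub-document fills only the columns it has.

```python
import xml.etree.ElementTree as ET

def leaf_paths(elem, prefix=""):
    """Yield (path, text) for every leaf node under elem."""
    path = f"{prefix}/{elem.tag}"
    children = list(elem)
    if not children:
        yield path, (elem.text or "").strip()
    for child in children:
        yield from leaf_paths(child, path)

def shred(xml_text):
    """Return (columns, rows): one row per sub-document, one column per path."""
    root = ET.fromstring(xml_text)
    # Assumption: a repeated path inside one sub-document keeps its last value.
    records = [dict(leaf_paths(sub)) for sub in root]
    # The root model is the union of the path names seen in any sub-document.
    columns = sorted({p for rec in records for p in rec})
    rows = [[rec.get(c) for c in columns] for rec in records]
    return columns, rows

SAMPLE = """<batch>
  <patent><id>P1</id><assignee>Acme</assignee></patent>
  <article><id>A7</id><journal>J. Chem.</journal></article>
</batch>"""

columns, rows = shred(SAMPLE)
```

Warehouse dimensions would then be chosen by selecting some of the entries in `columns`.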
Abstract:
A method is disclosed for use with at least one initial document describing a technical concept suitable for licensing, the method comprising: retrieving a set of intellectual property documents from a data warehouse; partitioning the set of intellectual property documents into a plurality of document categories; classifying the set of intellectual property documents by an industry parameter; constructing a contingency table that includes a listing of industry classifications for each of the document categories; and identifying documents within a particular one of the document categories that have different industry classifications so as to identify at least one potential new licensee industry of the technical concept described in the initial document.
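A rough sketch of the contingency-table step follows. It assumes each document already carries a category label and an industry label; the field names, the helper names, and the sample records are hypothetical. Industries that appear in the concept's category but differ from its home industry are returned as candidates.

```python
from collections import Counter, defaultdict

def contingency(docs):
    """Count documents per (category, industry) pair."""
    table = defaultdict(Counter)
    for doc in docs:
        table[doc["category"]][doc["industry"]] += 1
    return table

def candidate_industries(table, category, home_industry):
    """Industries seen in this category other than the concept's own industry."""
    return sorted(ind for ind in table[category] if ind != home_industry)

# Hypothetical sample data for illustration only.
docs = [
    {"category": "coatings", "industry": "automotive"},
    {"category": "coatings", "industry": "automotive"},
    {"category": "coatings", "industry": "medical devices"},
    {"category": "sensors", "industry": "automotive"},
]
table = contingency(docs)
candidates = candidate_industries(table, "coatings", "automotive")
```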
Abstract:
A method and analytics tools for locating experts with specific sets of expertise are disclosed, the method including providing a collection of documents P0; generating categories representing fields of expertise derived from the collection of documents P0; refining the taxonomy of the categories by applying user domain knowledge; extracting structured fields from the collection of documents P0; constructing a contingency table having a first axis defined by the extracted structured fields and a second axis defined by the categories; and using the contingency table to identify a set of experts having a related expertise. The method may also include a network graph analysis that aids visualization of the relationship between people and expertise.
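As a minimal sketch of the expert-locating table, assume the extracted structured field is an author list and each document already has one category. The function names, the `min_docs` threshold, and the sample records are illustrative assumptions rather than part of the disclosed method.

```python
from collections import Counter

def expertise_table(docs):
    """Map (author, category) -> number of documents."""
    counts = Counter()
    for doc in docs:
        for author in doc["authors"]:
            counts[(author, doc["category"])] += 1
    return counts

def experts_for(counts, category, min_docs=2):
    """Authors with at least min_docs documents in the category, most prolific first."""
    hits = [(n, a) for (a, c), n in counts.items() if c == category and n >= min_docs]
    return [a for n, a in sorted(hits, key=lambda t: (-t[0], t[1]))]

# Hypothetical sample data for illustration only.
docs = [
    {"authors": ["Lee", "Kim"], "category": "text mining"},
    {"authors": ["Lee"], "category": "text mining"},
    {"authors": ["Kim"], "category": "databases"},
    {"authors": ["Lee", "Park"], "category": "text mining"},
]
counts = expertise_table(docs)
experts = experts_for(counts, "text mining")
```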
Abstract:
A method for analyzing predefined subject matter in a patent database, the method being for use with a set of target patents, each target patent related to the predefined subject matter, the method comprising: creating a feature space based on frequently occurring terms found in the set of target patents; creating a partition taxonomy based on a clustered configuration of the feature space; editing the partition taxonomy using domain expertise to produce an edited partition taxonomy; creating a classification taxonomy based on structured features present in the edited partition taxonomy; creating a contingency table by comparing the edited partition taxonomy and the classification taxonomy to provide entries in the contingency table; and identifying all significant relationships in the contingency table to help determine the presence of any white space.
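The first step, building a feature space from frequently occurring terms, might look like the sketch below. It assumes "frequent" means appearing in at least `min_docs` documents and uses a plain lowercase word tokenizer; both choices, the function names, and the sample titles are assumptions. Stop-word removal, clustering, and the later taxonomy steps are not shown.

```python
import re
from collections import Counter

def build_feature_space(texts, min_docs=2):
    """Return the terms that occur in at least min_docs documents, sorted."""
    doc_freq = Counter()
    for text in texts:
        # Count each term once per document (document frequency).
        doc_freq.update(set(re.findall(r"[a-z]+", text.lower())))
    return sorted(t for t, n in doc_freq.items() if n >= min_docs)

def vectorize(text, features):
    """Term-count vector of one document over the feature space."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return [counts[f] for f in features]

# Hypothetical patent titles for illustration only.
patents = [
    "Polymer coating for steel",
    "Steel alloy with polymer binder",
    "Optical sensor array",
]
features = build_feature_space(patents)
vectors = [vectorize(p, features) for p in patents]
```

The resulting vectors would be the input to whatever clustering routine produces the partition taxonomy.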
Abstract:
A vectorization process is employed in which chemical identifier strings are converted into respective vectors. These vectors may then be searched to identify molecules that are identical or similar to each other. The dimensions of the vector space can be defined by sequences of symbols that make up the chemical identifier strings. The International Chemical Identifier (InChI) string defined by the International Union of Pure and Applied Chemistry (IUPAC) is particularly well suited for these methods.
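One way to picture the vectorization is the sketch below, which treats every two-character sequence in an identifier string as a dimension and compares molecules by cosine similarity. The n-gram length, the similarity measure, and the function names are assumptions made for illustration; the disclosure does not fix them here.

```python
import math
from collections import Counter

def ngram_vector(identifier, n=2):
    """Sparse vector: count of each length-n symbol sequence in the string."""
    return Counter(identifier[i:i + n] for i in range(len(identifier) - n + 1))

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

# Standard InChI strings for ethanol and methanol, used as sample input.
ethanol = ngram_vector("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
methanol = ngram_vector("InChI=1S/CH4O/c1-2/h2H,1H3")

same = cosine(ethanol, ethanol)        # identical molecules
similar = cosine(ethanol, methanol)    # related but different molecules
```

Identical strings give a similarity of 1, so an exact-match search and a nearest-neighbor search can share the same vectors.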
Abstract:
A method and system for identifying interesting relationships in text documents includes generating a dictionary of keywords in the text documents, forming categories of the text documents using the dictionary and an automated algorithm, counting occurrences of structured variables, categories, and structured variable/category combinations in the text documents, and calculating probabilities of occurrences of the structured variable/category combinations.
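The counting and probability steps can be sketched as follows. As an assumption, "interesting" is approximated here by comparing the observed count of each structured variable/category combination with the count expected if the two were independent; the abstract does not name a specific statistic, and the function name and sample data are hypothetical.

```python
from collections import Counter

def cooccurrence_stats(docs):
    """docs: iterable of (structured_value, category) pairs, one per document."""
    docs = list(docs)
    total = len(docs)
    var_counts = Counter(v for v, _ in docs)
    cat_counts = Counter(c for _, c in docs)
    pair_counts = Counter(docs)
    stats = {}
    for (v, c), observed in pair_counts.items():
        # Expected count if the variable and the category were independent.
        expected = var_counts[v] * cat_counts[c] / total
        stats[(v, c)] = {
            "observed": observed,
            "expected": expected,
            "ratio": observed / expected,
        }
    return stats

# Hypothetical sample: filing year as the structured variable.
docs = (
    [("2019", "batteries")] * 3 + [("2019", "sensors")]
    + [("2020", "sensors")] * 3 + [("2020", "batteries")]
)
stats = cooccurrence_stats(docs)
```

A ratio well above 1 flags a combination that occurs more often than chance would suggest; a formal significance test could replace the ratio.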
Abstract:
According to a preferred embodiment of the present invention, a bridging system (100) and method link two independent data systems by receiving a dataset from a source data system. The bridging system (100) translates the dataset from a source schema to a target schema according to a set of mapping rules, queues the translated dataset in persistent storage, and then sends the translated dataset to a destination data system. The system (100) includes an XML bridge (114), multiple application-specific gateways (116, 118), and a web admin interface (210), all in communication via a wide area network.
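The translate, queue, and send flow can be sketched in a few lines. This is a simplified illustration under several assumptions: records are plain dictionaries rather than XML, mapping rules are simple field renames, and an in-memory deque stands in for persistent storage. The `Bridge` class, the rule names, and the sample record are hypothetical.

```python
import json
from collections import deque

# Hypothetical mapping rules: source field name -> target field name.
MAPPING_RULES = {"cust_name": "customerName", "cust_id": "customerId"}

def translate(record, rules):
    """Rename source-schema fields to target-schema fields; drop unmapped fields."""
    return {target: record[source] for source, target in rules.items() if source in record}

class Bridge:
    def __init__(self, rules, send):
        self.rules = rules
        self.send = send          # callable that delivers one record to the destination
        self.queue = deque()      # in-memory stand-in for persistent storage

    def receive(self, dataset):
        """Translate each incoming record and queue it in serialized form."""
        for record in dataset:
            self.queue.append(json.dumps(translate(record, self.rules)))

    def flush(self):
        """Send queued records in order, dequeuing each only after it is sent."""
        while self.queue:
            self.send(json.loads(self.queue[0]))
            self.queue.popleft()

delivered = []
bridge = Bridge(MAPPING_RULES, delivered.append)
bridge.receive([{"cust_name": "Ada", "cust_id": 7, "internal_flag": True}])
bridge.flush()
```

Dequeuing only after `send` returns means a failed delivery leaves the record queued for a later retry, which is the point of queuing before sending.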