Abstract:
A method for stabilizing a knowledge graph includes: generating a knowledge graph in which identical entities in a list of semantic relations between entities, provided as an input, are represented as a single node based on the names and types of the entities; computing, on the knowledge graph, semantic similarities between all potential entity pairs of the same entity type by comparing, for each potential entity pair, the type of each relation associated with an entity in the pair and the opponent entity at the other end of that relation; and selecting, based on the semantic similarities, a representative entity from each semantically similar entity pair on the knowledge graph and integrating the other entity of the pair into the representative entity. The method further includes computing relation weights between the entities by using graph analysis and statistical information, and adding the weights to the knowledge graph.
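The node-merging steps described above can be illustrated with a minimal sketch. It is not the patented method: the Jaccard overlap of (relation, neighbor) sets, the 0.8 threshold, the "shorter name wins" rule for picking the representative, and the sample triples are all assumptions made for illustration. The relation-weighting step is not covered.

```python
from itertools import combinations

# Hypothetical input: (subject name, subject type, relation, object name, object type)
triples = [
    ("IBM", "Company", "headquartered_in", "Armonk", "City"),
    ("IBM", "Company", "founded_in", "1911", "Year"),
    ("International Business Machines", "Company", "headquartered_in", "Armonk", "City"),
    ("International Business Machines", "Company", "founded_in", "1911", "Year"),
    ("Apple", "Company", "headquartered_in", "Cupertino", "City"),
]

def build_graph(triples):
    """Map each (name, type) node to its set of (relation, neighbor node) edges."""
    graph = {}
    for s_name, s_type, rel, o_name, o_type in triples:
        s, o = (s_name, s_type), (o_name, o_type)
        graph.setdefault(s, set()).add((rel, o))
        graph.setdefault(o, set())
    return graph

def similarity(graph, a, b):
    """Jaccard overlap of (relation, neighbor) sets; an assumed similarity measure."""
    union = graph[a] | graph[b]
    return len(graph[a] & graph[b]) / len(union) if union else 0.0

def resolve(merged, node):
    """Follow merge links until a representative node is reached."""
    while node in merged:
        node = merged[node]
    return node

def merge_similar(graph, threshold=0.8):
    """Merge same-type node pairs whose similarity reaches the threshold."""
    merged = {}
    for a, b in combinations(sorted(graph), 2):
        if a[1] != b[1] or a in merged or b in merged:
            continue
        if similarity(graph, a, b) >= threshold:
            # Assumed rule: the node with the shorter name is the representative.
            rep, other = sorted((a, b), key=lambda n: len(n[0]))
            merged[other] = rep
    result = {}
    for node, edges in graph.items():
        target = resolve(merged, node)
        result.setdefault(target, set()).update(
            (rel, resolve(merged, nb)) for rel, nb in edges
        )
    return result, merged

graph = build_graph(triples)
stabilized, merged = merge_similar(graph)
```

In this sample, the two IBM nodes share every relation and neighbor, so the longer-named node is folded into `("IBM", "Company")`, while `Apple` stays separate.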
Abstract:
An electronic document processing apparatus includes: a document set storage unit storing hash tables that contain hash values of the documents to be processed; a content extraction unit for extracting body contents from a newly input electronic document; and a sentence separation unit for separating sentences from the extracted body contents. The apparatus further includes a duplicate document determination unit for converting the separated sentences into unique hash values by a hash algorithm, checking whether each separated sentence is a duplicate sentence depending on whether there is a collision between its converted hash value and the hash values in the hash tables of the document set storage unit, and determining whether the electronic document is a duplicate document based on the ratio of duplicate sentences to all of the sentences in the electronic document.
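The sentence-hash check described above might look like the following sketch. The naive punctuation-based sentence splitter, the case and whitespace normalization, the choice of SHA-1, the 0.7 ratio threshold, and the class name `DocumentSetStore` are all assumptions; the abstract does not specify any of them.

```python
import hashlib
import re

def split_sentences(text):
    """Naive splitter on '.', '!' or '?' followed by whitespace (an assumption)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def sentence_hash(sentence):
    """Hash a case- and whitespace-normalized sentence with SHA-1."""
    normalized = " ".join(sentence.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()

class DocumentSetStore:
    """Hypothetical stand-in for the document set storage unit."""

    def __init__(self):
        self.hashes = set()

    def add(self, text):
        """Register the sentence hashes of an already processed document."""
        self.hashes.update(sentence_hash(s) for s in split_sentences(text))

    def duplicate_ratio(self, text):
        """Share of sentences in `text` whose hash is already stored."""
        sentences = split_sentences(text)
        if not sentences:
            return 0.0
        hits = sum(sentence_hash(s) in self.hashes for s in sentences)
        return hits / len(sentences)

    def is_duplicate(self, text, threshold=0.7):
        """Flag the document when the duplicate-sentence ratio reaches the threshold."""
        return self.duplicate_ratio(text) >= threshold

store = DocumentSetStore()
store.add("The cat sat. The dog ran. Birds fly.")
ratio = store.duplicate_ratio("The cat sat. The dog ran. Fish swim.")
```

Here two of the three sentences in the new document are already stored, so `ratio` is 2/3, which falls just below the assumed 0.7 threshold.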
Abstract:
An apparatus for verifying training data using machine learning includes: a training data separation unit for separating provided initial training data into N training data and N verification data, where N is a natural number; a machine learning unit for performing machine learning on the separated training data to generate a training model; an automatic tagging unit for automatically tagging an original text of the verification data using the generated training model to provide automatic tagging results; and an error determination unit for comparing the verification data to the automatic tagging results to determine error candidates of the training data.
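The verification loop described above resembles N-fold cross-validation, and can be sketched as follows. The toy "model" (the most frequent tag per token), the interleaved fold split, and the sample data are assumptions standing in for whatever machine learning method and corpus the apparatus actually uses.

```python
from collections import Counter, defaultdict

def split_folds(data, n):
    """Yield n (training, verification) pairs; each item is verified exactly once."""
    for i in range(n):
        verification = data[i::n]
        training = [x for j, x in enumerate(data) if j % n != i]
        yield training, verification

def train(training):
    """Toy training model: remember the most frequent tag seen for each token."""
    counts = defaultdict(Counter)
    for token, tag in training:
        counts[token][tag] += 1
    return {token: c.most_common(1)[0][0] for token, c in counts.items()}

def find_error_candidates(data, n=3):
    """Return (token, given tag, predicted tag) where automatic tagging disagrees."""
    candidates = []
    for training, verification in split_folds(data, n):
        model = train(training)
        for token, tag in verification:
            predicted = model.get(token)  # None for tokens unseen in training
            if predicted is not None and predicted != tag:
                candidates.append((token, tag, predicted))
    return candidates

# Hypothetical tagged data with one suspicious label ("Paris" tagged as PER).
data = [
    ("Paris", "LOC"), ("Paris", "LOC"), ("Paris", "LOC"),
    ("Paris", "PER"), ("Seoul", "LOC"), ("Seoul", "LOC"),
]
candidates = find_error_candidates(data, n=3)
```

A disagreement only marks an error candidate, as in the abstract; a human reviewer or a further check would still have to decide whether the original tag is actually wrong.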
Abstract:
A question type and domain identifying apparatus includes: a question type identifier for recognizing the number of words in a user's question to identify whether the user's question is a query for information searching or a question for question answering (Q&A); a question domain distributor for assigning, based on the recognized number of words, one of a plurality of preset domain-specialized Q&A engines as the Q&A engine for the user's question; and a Q&A engine block, including the domain-specialized Q&A engines, for selectively performing information searching or Q&A on the user's question in response to the assignment by the question domain distributor.
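A rough sketch of the routing idea follows. The word-count threshold of 3, the engine names, and the keyword sets are hypothetical. The abstract does not say how a domain engine is chosen beyond using the word count, so the keyword lookup here is only a placeholder for that step.

```python
QUERY_MAX_WORDS = 3  # assumed threshold: short inputs are treated as search queries

# Hypothetical domain-specialized engines and their trigger keywords.
DOMAIN_KEYWORDS = {
    "travel": {"flight", "hotel", "visa"},
    "health": {"symptom", "fever", "dose"},
}

def identify_type(text):
    """Classify the input as 'query' or 'question' from its word count."""
    word_count = len(text.split())
    kind = "query" if word_count <= QUERY_MAX_WORDS else "question"
    return kind, word_count

def pick_domain(text):
    """Placeholder domain assignment by keyword overlap; falls back to 'general'."""
    words = {w.strip("?.,!").lower() for w in text.split()}
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if words & keywords:
            return domain
    return "general"

def route(text):
    """Combine type identification and domain assignment into one routing record."""
    kind, word_count = identify_type(text)
    return {"type": kind, "words": word_count, "engine": pick_domain(text)}

short = route("cheap hotel Seoul")
long_form = route("What dose of ibuprofen is safe for a fever?")
```

With these assumptions, the three-word input is routed as a search query to the travel engine, and the nine-word input is routed as a question to the health engine.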
Abstract:
The invention provides an apparatus and method for selecting an online advertisement. An apparatus for selecting an online advertisement based on content sentiment and intention analysis includes: a context analysis unit for analyzing the context of contents; a context matching advertisement recommendation unit for selecting, from an advertisement database (DB), an advertisement matching the context of the contents based on the result of the context analysis; a sentiment information analysis unit for analyzing a sentiment object and the sentiment information variously described in the contents based on the result of the context analysis; an intention recognition unit for recognizing the writing intention of the contents; and an advertisement selection unit for excluding the selected advertisement for the contents or selecting an alternative advertisement depending on the result of the context analysis, the analyzed sentiment object and sentiment information, and the recognized writing intention.
Abstract:
A topic map based indexing apparatus analyzes community Q/A lists to acquire Q/A analysis information, removes redundant answers depending on the Q/A analysis information, removes insignificant answers based on the degree of reliability, ranks answer lists, and extracts the highest ranking answer as a best answer, to thereby store, in a community Q/A topic map, index information containing the community Q/A lists and the Q/A analysis information. A topic map based searching apparatus analyzes a user question to acquire question analysis information, searches similar questions from community Q/A lists belonging to a specific topic node of a pre-stored community Q/A topic map, ranks the searched similar questions depending on the question analysis information, removes redundant answers among answers to the ranked similar questions, ranks the answers, and extracts the highest ranking answer as a best answer.
Abstract:
A method for automatically extracting product information includes searching documents based on product names, and extracting, from the retrieved documents, sentences that describe advantages and disadvantages of the products having those product names. The method further includes classifying the extracted sentences into groups of similar content, selecting a representative sentence from each group, and calculating a weight for each selected representative sentence.
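The grouping, representative selection, and weighting steps can be sketched as follows. The document search and sentence extraction steps are skipped. The Jaccard token overlap, the 0.5 threshold, the greedy grouping against each group's first sentence, the "shortest sentence is the representative" rule, and the weight defined as the group's share of all sentences are assumptions, not details from the abstract.

```python
def tokens(sentence):
    """Lowercased word set with surrounding punctuation stripped."""
    return {w.strip(".,!?").lower() for w in sentence.split()} - {""}

def jaccard(a, b):
    """Jaccard overlap of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def group_sentences(sentences, threshold=0.5):
    """Greedy grouping: join the first group whose first sentence is similar enough."""
    groups = []
    for s in sentences:
        for g in groups:
            if jaccard(tokens(s), tokens(g[0])) >= threshold:
                g.append(s)
                break
        else:
            groups.append([s])
    return groups

def summarize(sentences, threshold=0.5):
    """Return (representative sentence, weight) for each group of similar sentences."""
    groups = group_sentences(sentences, threshold)
    total = len(sentences)
    # Assumed rules: representative = shortest sentence; weight = group share.
    return [(min(g, key=len), len(g) / total) for g in groups]

# Hypothetical opinion sentences already extracted for one product.
sentences = [
    "The battery life is great.",
    "Battery life is great!",
    "The screen scratches easily.",
    "Great battery life.",
]
summary = summarize(sentences)
```

In this sample, the three battery sentences form one group with weight 0.75 and the screen sentence forms its own group with weight 0.25.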