Abstract:
An information analysis device (1) takes a plurality of linguistic expressions as an analysis target and includes a link information generating unit (3) and a correlation value calculation unit (4). The link information generating unit (3) extracts, from a plurality of electronic documents each including at least one of the plurality of linguistic expressions, time information included in each electronic document and a relationship between the electronic documents; detects a link between one linguistic expression and another linguistic expression in the plurality of linguistic expressions, together with an appearance time of the link, based on the extracted time information and the relationship between the electronic documents; and generates link information specifying the detected link and the appearance time of the link. The correlation value calculation unit (4) specifies, based on the link information, the number of appearances of links between the one linguistic expression and the other linguistic expression and the appearance time of each link, and calculates a correlation value between the one linguistic expression and the other linguistic expression according to the degree to which the link appears continuously, using the specified number of appearances and the appearance time of each link.
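The abstract does not give a formula for the correlation value, so the following is only a minimal sketch under stated assumptions: the analysis period is split into fixed-width buckets, "continuity" is taken to be the share of buckets in which the link appears at least once, and the correlation value is the number of appearances multiplied by that share. The function name, the `bucket_days` parameter and the scoring rule are all hypothetical.

```python
from datetime import date

def correlation_value(appearance_dates, period_start, period_end, bucket_days=7):
    """Hypothetical score: number of link appearances scaled by continuity."""
    # Keep only appearances that fall inside the analysis period.
    inside = [d for d in appearance_dates if period_start <= d <= period_end]
    if not inside:
        return 0.0
    # Split the period into fixed-width buckets and mark those containing a link.
    total_buckets = (period_end - period_start).days // bucket_days + 1
    occupied = {(d - period_start).days // bucket_days for d in inside}
    # Assumed continuity measure: share of buckets in which the link appears.
    continuity = len(occupied) / total_buckets
    return len(inside) * continuity

period = (date(2024, 1, 1), date(2024, 1, 28))
steady = [date(2024, 1, 1), date(2024, 1, 8), date(2024, 1, 15), date(2024, 1, 22)]
burst = [date(2024, 1, 1)] * 4
print(correlation_value(steady, *period), correlation_value(burst, *period))
```

Under this assumed rule, a link that appears once a week over the whole period scores higher than the same number of appearances concentrated on a single day.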
Abstract:
A new case of the same type as a case about information desired to be extracted can be generated with high accuracy. A new case generation device according to the present invention includes: new case generating means that receives a case about information desired to be extracted and a case context, which is text data that includes the case and the parts present near the case, and that generates, on the basis of the received case and the received case context, new cases and new case contexts with the use of document data, the new cases being of the same type as the received case, and the new case contexts being text data that includes the new cases and the parts present near the new cases and being different from the case context; similarity calculating means that calculates similarities between the case context and the new case contexts; and new case narrowing down means that narrows down, on the basis of the similarities calculated by the similarity calculating means, the new cases generated by the new case generating means and outputs a new case selected by the narrowing-down operation.
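The narrowing-down step can be illustrated with a small sketch. The abstract does not say which similarity measure is used, so this example assumes a Jaccard similarity over lower-cased whitespace tokens and a fixed threshold; `jaccard`, `narrow_down` and `threshold` are hypothetical names introduced only for illustration.

```python
def jaccard(text_a, text_b):
    """Assumed context similarity: Jaccard overlap of lower-cased tokens."""
    tokens_a, tokens_b = set(text_a.lower().split()), set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

def narrow_down(case_context, new_cases, threshold=0.3):
    """Keep new cases whose context is similar enough to the original context.

    new_cases is a list of (new_case, new_case_context) pairs.
    """
    return [case for case, context in new_cases
            if jaccard(case_context, context) >= threshold]

original_context = "the company Acme announced a new product"
candidates = [
    ("Globex", "the company Globex announced a new service"),
    ("Paris", "we flew to Paris last summer"),
]
print(narrow_down(original_context, candidates))
```

Here "Globex" is kept because its context resembles the original context, while "Paris" is dropped because its context shares no tokens with it.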
Abstract:
[Problems] To accurately calculate similarity between media data and a query even if the media data or its metadata contains an error. [Means for Solving the Problems] A similarity calculation device, used when calculating similarity between first media data and a query, includes: a single score calculation device which calculates a single score that shows similarity between the query and second media data different from the first media data; an inter-media similarity calculation device which calculates inter-media similarity that shows the similarity between the second media data and the first media data; and a query similarity calculation device which obtains similarity between the first media data and the query by using the inter-media similarity of the second media data and the single score.
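One way to read the combination step is as a weighted average: each second media item contributes its single score, weighted by how similar it is to the first media item. The abstract does not state the combination rule, so the weighted average below is an assumption, and `query_similarity` and its parameters are hypothetical names.

```python
def query_similarity(target, single_scores, inter_media_similarity):
    """Estimate similarity between `target` and the query from other media items.

    single_scores maps each media item to its own score against the query.
    inter_media_similarity(a, b) returns a non-negative similarity between items.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, score in single_scores.items():
        if other == target:
            continue  # only media data different from the target contributes
        weight = inter_media_similarity(target, other)
        weighted_sum += weight * score
        weight_total += weight
    return weighted_sum / weight_total if weight_total > 0 else 0.0

# Toy data: "b" matches the query and is close to "a"; "c" does neither.
scores = {"b": 1.0, "c": 0.0}
closeness = {("a", "b"): 0.75, ("a", "c"): 0.25}
print(query_similarity("a", scores, lambda x, y: closeness[(x, y)]))
```

Because the target's own score is never used, an error in the target's metadata does not directly affect the result in this sketch.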
Abstract:
A translation supporting apparatus which searches out a translation example useful for a translation task from within a translation example database is disclosed. The translation example database stores character strings of a first language and translation results of a second language corresponding to the character strings in a unit of a document. A retrieval request inputting apparatus inputs a translation target sentence. A similarity retrieval apparatus determines, for each translation example, a similarity to the translation target sentence, a similarity to a translation example context which is another translation example having such a predetermined relationship that it is included in the same document and is present within one sentence before or after the translation example, a similarity to a retrieval request context which is another translation target character string having such a predetermined relationship that it is included in the same document as the translation target character string and is present within the range of one sentence before or after the translation target character string, and a similarity between the translation example context and the retrieval request context, and integrates the four similarities. A similar example outputting apparatus refers to the integrated similarities and outputs those translation examples similar to the translation target character string.
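The integration of the four similarities can be sketched as follows. The abstract does not specify the string similarity measure or how the four values are integrated, so this example assumes `difflib.SequenceMatcher` ratios and an equal-weight sum; the pairing of the second and third similarities (example context against target, example against request context) is one possible reading of the abstract, and all function and parameter names are hypothetical.

```python
from difflib import SequenceMatcher

def sim(a, b):
    """Assumed string similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()

def integrated_similarity(target, request_context, example, example_context,
                          weights=(0.25, 0.25, 0.25, 0.25)):
    """Combine the four similarities into one score by a weighted sum."""
    parts = (
        sim(example, target),                   # example vs. translation target
        sim(example_context, target),           # example context vs. target
        sim(example, request_context),          # example vs. request context
        sim(example_context, request_context),  # context vs. context
    )
    return sum(w * p for w, p in zip(weights, parts))

# Two stored examples with the same text but different neighbouring sentences.
with_matching_context = integrated_similarity("abc", "def", "abc", "def")
with_other_context = integrated_similarity("abc", "def", "abc", "xyz")
print(with_matching_context, with_other_context)
```

In this toy case the example whose neighbouring sentence matches the request context ranks above an otherwise identical example taken from an unrelated context.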
Abstract:
Provided is a similarity search apparatus for searching data at a higher speed than that of the prior art without limiting the character types of a search key. A unit position correspondence memory stores therein a table that expresses, for each unit in a search key inputted by means of a keyboard, the ordinal position at which that unit appears within the search key. A search section refers to the table stored in the unit position correspondence memory and, every time units are read out one by one from a database memory including a plurality of units, generates a plurality of status parameters, each of which includes a similarity, a position of coincidence and a skip number. These express up to which unit from the top of the search key the units read out from the database have coincided and at what degree of similarity, and how many units in the database have subsequently been skipped over. Through the above process, the search section updates each status parameter stored in a status parameter memory and, upon detecting a unit string coincident at a similarity equal to or lower than an inputted similarity, outputs the detected unit string as a similar unit string.
Abstract:
In a processor for extracting information on a specified field from a text written in a natural language, keywords and structural analysis are used jointly to improve performance. When a set of keywords is split across more than one sentence, the set is assembled by means of context-defining words in a sentence. A multi-language summary generator uses this type of processor.
Abstract:
A natural-language processing system includes a registration-candidate storage section that stores registration-candidate dictionary data; a judgment means that compares input data against the registration-candidate dictionary data to judge whether or not the input data includes a word corresponding to the registration-candidate dictionary data; an inquiry means that, if it is judged that a corresponding word exists, asks a user whether or not the corresponding dictionary data is to be registered in a dictionary storage section and accepts the user's instruction; a dictionary registration means that registers the corresponding dictionary data in the dictionary storage section based on the input instruction; and a natural-language processing means that executes natural-language processing on the input data by using the dictionary data registered in the dictionary storage section.
Abstract:
A document analysis apparatus comprises: a feature expression acquisition unit that acquires a feature expression appearing during an attention period in an analysis object document collection; a document collection acquisition unit that acquires, from an analysis population including the analysis object document collection, a feature expression containing document (FECD) collection in which the feature expression appears; a context determination unit that, for every feature expression, specifies the analysis/FECDs, which are the documents in the FECD collection that correspond to analysis object documents, and specifies a context in which the feature expression appears across multiple analysis/FECDs; a context comparison determination unit that specifies the non-analysis/FECDs, which are the documents in the FECD collection that do not correspond to analysis object documents, and compares the context in which the feature expression appears in them with the previously specified context; and a feature degree setting unit that, based on the comparison, performs processing such as assigning a feature degree to the feature expression.
Abstract:
Described are a reputation analysis device, reputation analysis method, and reputation analysis program capable of suitably analyzing temporal changes in the reputation of an object indicated by a keyword. The disclosed reputation analysis device is provided with a voluntary activity description extraction means for extracting, from within a plurality of documents, descriptions representing voluntary activity related to an object indicated by an input keyword; and a reputation chronological data estimation means for counting the number of occurrences of voluntary activity at each time point at which the voluntary activity expressed by such a description was performed, and estimating reputation chronological data that chronologically represents evaluations of the object by the agents of the voluntary activity.
Abstract:
Disclosed is an information estimation device that estimates an appropriate issue time from a time representation described in a document without operator intervention. The information estimation device (1), which estimates the issue time of a document to be estimated, includes a candidate generation unit (11) which extracts a time representation described in the document and, on the basis of the extracted time representation, generates a plurality of issue time candidates that may correspond to the issue time of the document; and an issue time estimation unit (12) which obtains, for each of the plurality of issue time candidates, a temporal proximity between that issue time candidate and the other issue time candidates, and, on the basis of the obtained temporal proximity, estimates the issue time of the document.
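The estimation step can be sketched as picking the candidate that lies closest in time to the other candidates. The abstract does not define temporal proximity, so this example assumes a score of 1 / (1 + distance in days) summed over the other candidates; candidate generation is taken as given, and `estimate_issue_time` is a hypothetical name.

```python
from datetime import date

def estimate_issue_time(candidates):
    """Return the candidate with the highest summed proximity to the others."""
    if not candidates:
        return None

    def proximity(index):
        # Assumed proximity: closer candidates contribute more, up to 1.0 each.
        return sum(1.0 / (1 + abs((candidates[index] - other).days))
                   for j, other in enumerate(candidates) if j != index)

    best_index = max(range(len(candidates)), key=proximity)
    return candidates[best_index]

# Three candidates cluster around early March 2010; one is a year off.
issue_candidates = [date(2010, 3, 5), date(2010, 3, 6),
                    date(2010, 3, 7), date(2009, 3, 6)]
print(estimate_issue_time(issue_candidates))
```

With this scoring, an outlying candidate produced by a misread time representation has little influence, because it is far from every other candidate.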