Abstract:
A communication assistance device (10) includes a communication level determination unit (11) that determines the level of the relationship between users who communicate with each other. The communication level determination unit (11) determines this level (the communication level) based on two inputs: the similarity between the users, obtained from preference information showing the users' preferences, and user action records, which record the actions one user has taken toward the partner user with whom that user communicates.
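As a rough illustration of how the two inputs could be combined, the following minimal sketch scores a pair of users. The abstract does not specify the similarity measure, the weighting, or the number of levels; the Jaccard similarity, the equal weighting, the cap of 10 actions, and the three-level scale are all assumptions made for this example, and every function name is hypothetical.

```python
def jaccard(prefs_a, prefs_b):
    """Similarity between two users' preference sets (assumed measure)."""
    a, b = set(prefs_a), set(prefs_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def communication_level(prefs_a, prefs_b, actions, max_actions=10, weight=0.5):
    """Combine preference similarity and action records into a level 1..3.

    `actions` is the list of recorded actions one user took toward the
    partner. The weighting and thresholds are illustrative assumptions.
    """
    similarity = jaccard(prefs_a, prefs_b)
    activity = min(len(actions), max_actions) / max_actions
    score = weight * similarity + (1 - weight) * activity
    if score < 1 / 3:
        return 1
    if score < 2 / 3:
        return 2
    return 3


alice = {"jazz", "hiking", "chess"}
bob = {"jazz", "chess", "cooking"}
actions_alice_to_bob = ["message", "message", "like", "comment", "message"]

level = communication_level(alice, bob, actions_alice_to_bob)
print(level)
```

Here the two users share two of four distinct preferences and five actions are on record, so both inputs contribute equally to a mid-range level.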
Abstract:
A meaning extraction device includes a clustering unit, an extraction rule generation unit, and an extraction rule application unit. The clustering unit acquires feature vectors whose elements are numerical features representing words having specific meanings and their surrounding words, and clusters the acquired feature vectors into a plurality of clusters based on the degree of similarity between feature vectors. The extraction rule generation unit performs machine learning, for each cluster, on the feature vectors within that cluster and generates extraction rules for extracting words having specific meanings. The extraction rule application unit receives feature vectors generated from the words in documents subject to meaning extraction, identifies the optimum extraction rules for each feature vector, and applies those rules to the feature vector to extract the meaning of the word from which it was generated.
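The three units can be sketched end to end in a few lines. This is only an illustration under stated assumptions: the abstract names neither the clustering algorithm nor the learning method, so greedy leader clustering with cosine similarity and a per-label centroid "rule" are stand-ins chosen for brevity, and the toy vectors and labels are invented.

```python
import math
from collections import defaultdict


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cluster(samples, threshold=0.8):
    """Clustering unit (assumed method): join the first cluster whose
    leader vector is similar enough, otherwise start a new cluster."""
    clusters = []
    for vec, label in samples:
        for members in clusters:
            if cosine(vec, members[0][0]) >= threshold:
                members.append((vec, label))
                break
        else:
            clusters.append([(vec, label)])
    return clusters


def learn_rule(members):
    """Rule generation unit (stand-in for machine learning): one centroid
    per meaning label, computed only from this cluster's vectors."""
    by_label = defaultdict(list)
    for vec, label in members:
        by_label[label].append(vec)
    return {label: mean(vecs) for label, vecs in by_label.items()}


def extract(vec, clusters, rules):
    """Rule application unit: pick the most similar cluster, then apply
    that cluster's rule to the incoming feature vector."""
    best = max(
        range(len(clusters)),
        key=lambda i: cosine(vec, mean([v for v, _ in clusters[i]])),
    )
    rule = rules[best]
    return max(rule, key=lambda label: cosine(vec, rule[label]))


samples = [
    ([1.0, 0.0, 0.0], "person"),
    ([0.9, 0.3, 0.0], "company"),
    ([0.0, 0.0, 1.0], "place"),
    ([0.0, 0.3, 0.9], "person"),
]
clusters = cluster(samples)
rules = [learn_rule(members) for members in clusters]

print(extract([0.8, 0.4, 0.0], clusters, rules))
print(extract([0.0, 0.1, 1.0], clusters, rules))
```

The point of the per-cluster rules is that the same label ("person" here) can be learned separately in different regions of the feature space, so each incoming vector is judged only by the rule trained on similar contexts.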
Abstract:
An information analysis apparatus analyzes text information to determine whether the text information corresponds to target information. The apparatus includes a storage device that stores the text information; a density estimation unit that estimates, for each unit of analysis composed of a plurality of sentences of the text information, a density indicating the degree to which the target information is included in that unit; and a determination unit that obtains, from the estimated density of each unit of analysis, an evaluation value indicating the degree to which each sentence in the unit corresponds to the target information, and determines whether the sentence corresponds to the target information based on the evaluation value.
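One way to picture the density-based determination is with sliding windows of sentences. The sketch below is an assumption-laden illustration, not the method of the abstract: it treats each window of three consecutive sentences as a unit of analysis, estimates density as the share of sentences containing a keyword, and takes a sentence's evaluation value to be the mean density of the units that contain it. The keyword test, the window size, and the 0.5 threshold are all hypothetical choices.

```python
def keyword_flags(sentences, keywords):
    """1 if a sentence mentions any keyword, else 0 (assumed base signal)."""
    return [int(any(k in s.lower() for k in keywords)) for s in sentences]


def unit_densities(flags, size):
    """Density of each unit of analysis (a window of `size` sentences)."""
    return [sum(flags[i:i + size]) / size for i in range(len(flags) - size + 1)]


def sentence_values(flags, size=3):
    """Evaluation value per sentence: mean density of the units covering it."""
    if len(flags) < size:
        raise ValueError("need at least `size` sentences")
    densities = unit_densities(flags, size)
    values = []
    for idx in range(len(flags)):
        first = max(0, idx - size + 1)
        last = min(idx, len(flags) - size)
        covering = densities[first:last + 1]
        values.append(sum(covering) / len(covering))
    return values


sentences = [
    "The battery overheats quickly.",
    "Battery life is poor.",
    "I bought it last week.",
    "The battery was replaced.",
    "Shipping was fast.",
    "The box was blue.",
    "Nothing else to add.",
]
flags = keyword_flags(sentences, ["battery"])
values = sentence_values(flags, size=3)
decisions = [v >= 0.5 for v in values]

for sentence, decision in zip(sentences, decisions):
    print(decision, sentence)
```

In this toy run the third sentence has no keyword of its own but sits in a dense region and is accepted, while the fourth mentions the keyword but falls in a sparser region and is rejected, which is the kind of context sensitivity a density-based evaluation value provides.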
Abstract:
Sets of strings whose drawing positions are aligned in one direction are extracted from a document as attribute groups. An attribute name score is calculated for each attribute group, indicating the extent to which the group is a set of attribute names. Based on the attribute name scores, an attribute name group is selected from among the attribute groups. Next, an attribute group is selected that includes a string identical to at least one string of the attribute name group and drawn at the same position as that string. The attribute name is extracted from the string at the shared drawing position, and the attribute value corresponding to that attribute name is extracted from the remaining strings of the selected attribute group.
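The selection steps can be traced on a small table. In this sketch each string carries an `(x, y)` drawing position, attribute groups are the sets of strings sharing a `y` (rows) or an `x` (columns), and the attribute name score is the share of strings found in a small lexicon. The lexicon `KNOWN_NAMES`, the scoring formula, and the exact-position grouping are assumptions made for illustration; the abstract does not say how the score is computed.

```python
from collections import defaultdict

KNOWN_NAMES = {"name", "price", "color"}  # hypothetical lexicon of attribute names


def attribute_groups(cells):
    """Group strings aligned in one direction: same y (rows) or same x (columns)."""
    rows, cols = defaultdict(list), defaultdict(list)
    for cell in cells:
        _, x, y = cell
        rows[y].append(cell)
        cols[x].append(cell)
    return list(rows.values()) + list(cols.values())


def name_score(group):
    """Assumed score: fraction of strings that look like attribute names."""
    return sum(text.lower() in KNOWN_NAMES for text, _, _ in group) / len(group)


def extract_attributes(cells):
    groups = attribute_groups(cells)
    name_group = max(groups, key=name_score)
    table = {}
    for group in groups:
        if group is name_group:
            continue
        # Strings shared with the name group: same text at the same position.
        shared = [cell for cell in group if cell in name_group]
        if len(shared) != 1:
            continue
        header = shared[0]
        table[header[0]] = [cell[0] for cell in group if cell != header]
    return table


cells = [
    ("Name", 0, 0), ("Price", 10, 0), ("Color", 20, 0),
    ("Pen", 0, 1), ("2", 10, 1), ("blue", 20, 1),
    ("Cup", 0, 2), ("5", 10, 2), ("red", 20, 2),
]
print(extract_attributes(cells))
```

The header row wins the name score, each column intersects it in exactly one string, and that string becomes the attribute name for the rest of the column.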
Abstract:
A cooccurrence dictionary creating system includes: a language analyzing section that subjects a text to morpheme analysis, clause specification, and analysis of modification relationships between clauses; a cooccurrence relationship collecting section that collects, as cooccurrence relationships, cooccurrences of nouns in each clause of the text, modification relationships between nouns and declinable words, and modification relationships between declinable words; a cooccurrence score calculating section that calculates a cooccurrence score for each cooccurrence relationship based on the frequency of the collected relationship; and a cooccurrence dictionary storage section that stores a cooccurrence dictionary describing the correspondence between each calculated cooccurrence score and its cooccurrence relationship.
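To make the scoring and storage steps concrete, the sketch below starts from relationships that are assumed to have already been collected by the language analysis (the morpheme and modification analysis is out of scope here). The abstract only says the score is based on frequency; the Dice coefficient used below is one common frequency-based choice and is an assumption, as are the sample word pairs.

```python
from collections import Counter


def build_cooccurrence_dictionary(relations):
    """Map each collected relationship (word_a, word_b) to a score.

    Score (assumed): Dice coefficient, 2 * f(a, b) / (f(a) + f(b)),
    where the frequencies are counted over the collected relationships.
    """
    pair_freq = Counter(relations)
    word_freq = Counter()
    for a, b in relations:
        word_freq[a] += 1
        word_freq[b] += 1
    return {
        (a, b): 2 * freq / (word_freq[a] + word_freq[b])
        for (a, b), freq in pair_freq.items()
    }


# Hypothetical output of the collecting section.
relations = [
    ("coffee", "drink"),
    ("coffee", "drink"),
    ("tea", "drink"),
    ("coffee", "cup"),
]
dictionary = build_cooccurrence_dictionary(relations)

for pair, score in sorted(dictionary.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

The resulting mapping from relationship to score plays the role of the stored cooccurrence dictionary.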
Abstract:
When gathering words through a dictionary growth process, a dictionary growth unit (102) stores, in a gathering process memory unit (107), information indicating the input and output process through which each word was gathered. A clustering unit (103) then classifies the gathered words into clusters based on the information recorded in the gathering process memory unit (107). Next, for each cluster, a type determination unit (104) determines, based on the information recorded in the gathering process memory unit (107), whether the words composing the cluster are of the same type as a seed word or of a different type. In addition, an output unit (105) associates each gathered word with the cluster to which it belongs and with the indication of whether that cluster is of the same type as the seed word or of a different type, and displays the result.
Abstract:
The disclosed apparatus uses a training data generation apparatus 2, which generates training data used for creating characteristic expression extraction rules. The training data generation apparatus 2 includes a training data candidate clustering unit 21 and a training data generation unit 22. The training data candidate clustering unit 21 clusters a plurality of training data candidates, each assigned a label indicating an annotation class, based on feature values containing their respective context information. The training data generation unit 22 refers to each cluster obtained from the clustering results, obtains the distribution of the labels of the training data candidates within the cluster, identifies training data candidates that meet a preset condition based on the obtained distribution, and generates training data using the identified candidates.
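The role of the label distribution can be shown with a short filter. This sketch assumes the clustering has already been done and each candidate carries a cluster id; the "preset condition" is taken to be that a candidate's label accounts for at least a given share of its cluster, which is only one possible reading. The field names, the 0.6 share, and the sample candidates are hypothetical.

```python
from collections import Counter, defaultdict


def select_training_data(candidates, min_share=0.6):
    """Keep candidates whose label is sufficiently dominant in their cluster.

    Each candidate is a dict with "text", "label", and "cluster" keys
    (assumed layout). `min_share` stands in for the preset condition.
    """
    by_cluster = defaultdict(list)
    for candidate in candidates:
        by_cluster[candidate["cluster"]].append(candidate)

    selected = []
    for members in by_cluster.values():
        distribution = Counter(m["label"] for m in members)
        for m in members:
            if distribution[m["label"]] / len(members) >= min_share:
                selected.append(m)
    return selected


candidates = [
    {"text": "Mr. Sato said", "label": "PERSON", "cluster": 0},
    {"text": "Ms. Tanaka said", "label": "PERSON", "cluster": 0},
    {"text": "Dr. Suzuki said", "label": "PERSON", "cluster": 0},
    {"text": "Acme said", "label": "ORG", "cluster": 0},
    {"text": "shares of Acme", "label": "ORG", "cluster": 1},
    {"text": "shares of Globex", "label": "ORG", "cluster": 1},
]
training_data = select_training_data(candidates)
print([c["text"] for c in training_data])
```

The lone ORG candidate in a cluster dominated by PERSON contexts is dropped as a likely annotation error, while the remaining candidates are kept as training data.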
Abstract:
A technique is provided for restructuring an ontology in a prescribed form into a structure that reflects the features of the data. An ontology processing device has a structuralizing device and an ontology storage device. The structuralizing device structures the properties of the ontology in the prescribed form according to the features of the objects; the ontology is generated from a set of instance data, each item containing a combination of a subject, a property, and an object expressed as a character string. The ontology storage device stores the ontology structured by the structuralizing device. With this structure, the properties of the ontology in the prescribed form are corrected or expressed as an ontology structure that reflects the characteristics of the set of objects obtained from the data.