Abstract:
The data classifier system of the present invention selects a plurality of classifications correlated with a data group and outputs classification axes based on the hierarchical classifications and the data group. The system includes a basic category accumulation means, a classification axis candidate creation means, and a priority calculation means. The basic category accumulation means stores, in advance, classifications that serve as basic categories for selecting the desired classifications. The classification axis candidate creation means creates classification axis candidates from combinations of classifications that are descendants of each basic category and are each correlated with at least one data item. The priority calculation means calculates a priority for each classification axis candidate created by the classification axis candidate creation means, based on the hierarchical distances between classifications in the classification hierarchy.
Abstract:
A classification hierarchy regeneration system is provided in which, when a new classification hierarchy is generated by restructuring an existing classification hierarchy, a classification hierarchy that takes the hierarchical relationships among classifications into account and a classification hierarchy that integrates classifications of the same meaning can be efficiently generated. The clustering means clusters a data group associated with hierarchical classifications and generates a classification group, i.e., a group obtained by extracting classifications that satisfy a predefined condition from the classifications corresponding to the respective data in a cluster. The cooccurrence degree calculation means calculates a degree of cooccurrence of two classifications selected from the classification group. The classification hierarchy regeneration means regenerates the classification hierarchy based on the classification group and the degree of cooccurrence.
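The cooccurrence step can be illustrated with a minimal sketch. The abstract does not state how the degree of cooccurrence is computed, so the Jaccard coefficient used here is an assumption, as are the function name `cooccurrence_degrees` and the representation of each classification group as a set of classification labels (one group per cluster).

```python
from collections import Counter
from itertools import combinations


def cooccurrence_degrees(groups):
    """Return a Jaccard-style cooccurrence degree for each pair of classifications.

    `groups` is an iterable of classification groups, one per cluster.
    The Jaccard coefficient is an assumed measure, not taken from the abstract.
    """
    single = Counter()  # number of groups containing each classification
    pair = Counter()    # number of groups containing both members of a pair
    for group in groups:
        members = sorted(set(group))
        single.update(members)
        pair.update(combinations(members, 2))
    # Jaccard: |groups with both| / |groups with either|
    return {
        (a, b): both / (single[a] + single[b] - both)
        for (a, b), both in pair.items()
    }
```

Pairs with a high degree are candidates for integration as classifications of the same meaning; how the degree drives the regenerated hierarchy is left open by the abstract and is not modeled here.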
Abstract:
In order to calculate a reliability that serves as an index of how reliable an evaluator of a document is, a reliability calculation apparatus (2) is provided with a reliability calculation unit (21). The reliability calculation unit (21) specifies the evaluation made by each evaluator with respect to each author, based on first information specifying the correspondence relationships between the documents targeted for evaluation, the evaluators who evaluated the documents, and the contents of the evaluations, and on second information specifying the correspondence relationships between the documents and the authors of the documents. It then calculates the reliability of each evaluator based on the specified evaluations with respect to each author.
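The first step, deriving each evaluator's evaluations of each author from the two kinds of information, can be sketched as a simple join. The data shapes are assumptions: the first information is modeled as `(document, evaluator, score)` tuples and the second as a document-to-author mapping. The reliability formula itself is not given in the abstract and is deliberately left out.

```python
from collections import defaultdict


def evaluations_by_author(evaluations, authorship):
    """Group evaluation scores by (evaluator, author).

    evaluations: iterable of (document, evaluator, score) tuples (assumed shape
                 of the "first information").
    authorship:  dict mapping document -> author (assumed shape of the
                 "second information").
    """
    grouped = defaultdict(list)
    for document, evaluator, score in evaluations:
        author = authorship[document]  # join on the document
        grouped[(evaluator, author)].append(score)
    return dict(grouped)
```

A reliability measure would then be computed per evaluator from these per-author lists.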
Abstract:
In a taxonomy in which each node has an inverted list, the inverted list of the highest node is a list of integer values indicating identifiers of the search subject data, whereas the inverted list of any node other than the highest node is, in place of the identifiers, a list of integer values indicating positions in the inverted list of the node one level higher. Furthermore, the list of integer values in the inverted list of each node is divided into two or more blocks, and the differential value between each integer value and the integer value immediately preceding it in the block is converted into a bit string of a variable-length integer code.
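The encoding described above can be sketched as follows. Several details are assumptions not fixed by the abstract: the variable-length code is taken to be a base-128 varint (7 data bits per byte plus a continuation bit), the first value of each block is stored as-is because it has no predecessor within the block, and all function names are hypothetical.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint, low-order group first."""
    if n < 0:
        raise ValueError("varint requires a non-negative integer")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varints(data: bytes) -> list[int]:
    """Decode a concatenation of varints back into integers."""
    values, current, shift = [], 0, 0
    for byte in data:
        current |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            values.append(current)
            current, shift = 0, 0
    return values


def to_parent_positions(parent_list: list[int], child_ids: list[int]) -> list[int]:
    """Replace each identifier in a child list by its position in the parent's list."""
    position = {value: i for i, value in enumerate(parent_list)}
    return [position[value] for value in child_ids]


def encode_blocks(values: list[int], block_size: int = 4) -> list[bytes]:
    """Split an ascending list into blocks and delta-encode within each block."""
    blocks = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        encoded = bytearray(encode_varint(block[0]))  # first value stored as-is
        for prev, cur in zip(block, block[1:]):
            encoded += encode_varint(cur - prev)  # gap to the preceding value
        blocks.append(bytes(encoded))
    return blocks


def decode_blocks(blocks: list[bytes]) -> list[int]:
    """Invert encode_blocks by accumulating the gaps inside each block."""
    result = []
    for block in blocks:
        running = 0
        for i, value in enumerate(decode_varints(block)):
            running = value if i == 0 else running + value
            result.append(running)
    return result
```

Because a lower node's list holds positions in its parent's list rather than raw identifiers, its values and gaps tend to be smaller, which is what makes the variable-length code compact. Restarting the deltas at each block boundary lets a block be decoded without reading the blocks before it.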
Abstract:
A communication assistance device (10) includes a communication level determination unit (11) and a topic recommendation unit (16) so as to determine a level of a relationship between users who communicate with each other and provide communication assistance using the result of the determination. The communication level determination unit (11) determines the level (communication level) of the relationship between the users based on similarity between the users obtained from preference information showing preferences of the users, and on user action records showing records of actions taken by a certain user toward a partner user with whom the certain user communicates out of the users. The topic recommendation unit (16) selects, from among a group of topics prepared in advance, a topic that can be transmitted to the partner user based on the determined level of the relationship between the users and on preferences of the certain user and the partner user.
Abstract:
In the provided document clustering system (100), a concept tree structure accumulation unit (11) stores a concept tree structure that represents a hierarchical relationship among concepts represented by each of a plurality of words. For any two words, a concept similarity computation unit (12) obtains a concept similarity, which is an index indicating how close the concepts represented by the two words are. Using concept similarities for words that appear in two documents in a document set, an inter-document similarity computation unit (13) obtains an inter-document similarity, which indicates how similar the two documents are semantically. A clustering unit (14) uses inter-document similarities to cluster the documents in the document set.
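A minimal sketch of the two similarity computations follows. The abstract does not give formulas, so both are assumptions: concept similarity is modeled with a Wu-Palmer-style measure (depth of the lowest common ancestor relative to the depths of the two words), and inter-document similarity as a symmetrized average of best-match concept similarities. The `PARENT` tree and its words are hypothetical sample data.

```python
# Hypothetical concept tree stored as child -> parent (the root maps to None).
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def path_to_root(word):
    """Return the chain of concepts from `word` up to the root."""
    path = []
    while word is not None:
        path.append(word)
        word = PARENT[word]
    return path


def concept_similarity(a, b):
    """Wu-Palmer-style similarity (an assumed measure): 2*depth(lca) / (depth(a)+depth(b))."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_a = set(path_a)
    # First node on b's path that is also an ancestor of a: the lowest common ancestor.
    lca = next(node for node in path_b if node in ancestors_a)
    return 2 * len(path_to_root(lca)) / (len(path_a) + len(path_b))


def document_similarity(doc_a, doc_b):
    """Symmetrized average of each word's best concept match in the other document."""
    def directed(src, dst):
        return sum(max(concept_similarity(w, v) for v in dst) for w in src) / len(src)
    return (directed(doc_a, doc_b) + directed(doc_b, doc_a)) / 2
```

The resulting pairwise similarities could be passed to any standard clustering method. The sketch assumes every word appears in the concept tree and that documents are non-empty lists of words.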
Abstract:
Provided is a technique for restructuring an ontology in a prescribed form into a structure that reflects features of the data. An ontology processing device has a structuralizing device that structures the properties of the ontology in the prescribed form, which is generated from a set of instance data each containing a combination of a subject, a property, and an object expressed as a character string, according to the features of the objects, and an ontology storage device that stores the ontology structured by the structuralizing device. With this structure, the properties of the ontology in the prescribed form are corrected or expressed as an ontology structure that reflects the characteristics of the set of objects obtained from the data.
Abstract:
A communication assistance device (10) includes a communication level determination unit (11) so as to determine a level of a relationship between users who communicate with each other. The communication level determination unit (11) determines the level (communication level) of the relationship between the users based on similarity between the users obtained from preference information showing preferences of the users, and on user action records showing records of actions taken by a certain user toward a partner user with whom the certain user communicates out of the users.
Abstract:
A meaning extraction device includes a clustering unit, an extraction rule generation unit, and an extraction rule application unit. The clustering unit acquires feature vectors whose elements are numerical features representing the characteristics of words having specific meanings and of their surrounding words, and clusters the acquired feature vectors into a plurality of clusters on the basis of the degree of similarity between feature vectors. For each cluster, the extraction rule generation unit performs machine learning based on the feature vectors within the cluster and generates extraction rules for extracting words having specific meanings. The extraction rule application unit receives feature vectors generated from the words in documents that are subject to meaning extraction, specifies the optimum extraction rules for the feature vectors, and extracts the meanings of the words from which the feature vectors were generated by applying the specified extraction rules to the feature vectors.