Abstract:
A text-mining system and method automatically extracts useful information from a large set of tree-structured data by generating successive sets of candidate tree-structured association patterns for comparison with the tree-structured data. The system counts how many times each candidate association pattern matches a tree in the set of tree-structured data in order to determine which candidate association patterns frequently match a tree in the data set. Each successive set of candidate association patterns is generated from the frequent association patterns determined from the previous set of candidate association patterns.
Abstract:
A method for performing a plurality of aggregations in parallel and at high speed in a computer system in which each of a plurality of processors connected across a network has its own memory area and its own part of a database containing data categorized into one or more groups. The method comprises the steps of: (a) reserving, in each processor's own memory area, space for storing the results of M of the N aggregate queries (M is an integer equal to or less than N); (b) executing, in each processor, all M aggregate queries against that processor's own part of the database; (c) transmitting the results of the M aggregate queries executed by each processor to another processor, which tallies them and calculates the final result; and (d) repeating steps (a) to (c) until each processor has executed all N aggregate queries.
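Steps (a) to (d) can be sketched as follows. This is a minimal single-process illustration, not the patented implementation: the function name `run_batched_aggregations`, the use of group-by counts as the aggregate, and the in-memory lists standing in for processors are all assumptions made for the example. A real system would run the per-partition loop on separate processors and send partial results over the network.

```python
from collections import Counter


def run_batched_aggregations(partitions, queries, m):
    """Run N group-by count queries over partitioned data, M at a time.

    partitions: one list of records (dicts) per simulated processor.
    queries: attribute names; each query counts records per attribute value.
    m: number of query results assumed to fit in a processor's memory at once.
    """
    final = {}
    for start in range(0, len(queries), m):
        batch = queries[start:start + m]        # step (a): space for M results
        merged = {q: Counter() for q in batch}
        for part in partitions:                 # step (b): scan own partition only
            local = {q: Counter() for q in batch}
            for record in part:
                for q in batch:
                    local[q][record[q]] += 1
            for q in batch:                     # step (c): merge partial results
                merged[q].update(local[q])
        final.update(merged)                    # step (d): next batch until N are done
    return final


partitions = [
    [{"color": "red", "size": "S"}, {"color": "blue", "size": "S"}],
    [{"color": "red", "size": "L"}],
]
totals = run_batched_aggregations(partitions, ["color", "size"], m=1)
```

With `m=1` each partition is scanned once per query; a larger `m` trades memory for fewer scans, which is the balance the abstract's choice of M addresses.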
Abstract:
The two-dimensional coordinates of a vertex are extracted in each of a top view and a front view and, if their X-coordinates are equal to each other, the combination of their Y-coordinate values is determined to be the two-dimensional coordinates of a candidate vertex in a side view. Then, candidate line segments for the side view are extracted from the line segments connecting two candidate vertices, excluding not only those line segments for which no corresponding line segment exists in the top and front views, but also those line segments for which corresponding horizontal or vertical line segments exist in the top and front views and which are not horizontal or vertical in the side view.
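The vertex-pairing step can be illustrated with a short sketch. The function name `candidate_side_vertices`, the tuple representation of vertices, and the ordering of the resulting pair (top-view Y first, front-view Y second) are assumptions; the abstract does not fix them. The sketch covers only candidate vertices, not the later filtering of line segments, and it compares X-coordinates exactly, whereas real drawing data would likely need a tolerance.

```python
def candidate_side_vertices(top_vertices, front_vertices):
    """Pair top-view and front-view vertices that share an X-coordinate.

    Each vertex is an (x, y) tuple. The result is the sorted set of
    (y_top, y_front) pairs, used here as side-view candidate coordinates.
    """
    candidates = set()
    for x_top, y_top in top_vertices:
        for x_front, y_front in front_vertices:
            if x_top == x_front:  # assumption: exact match, no tolerance
                candidates.add((y_top, y_front))
    return sorted(candidates)


top = [(0, 0), (0, 2), (1, 0)]
front = [(0, 0), (0, 3), (2, 1)]
side = candidate_side_vertices(top, front)
# side == [(0, 0), (0, 3), (2, 0), (2, 3)]
```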
Abstract:
An object of the present invention is to find parts that are a highly probable cause of failure without searching all of the part data of all products. Dispersed parts data on a parts tree are accessed sequentially, starting from a set of known failed products, and part attribute values having higher support in the faulty products are extracted. In this process, a subset of the parts used in the faulty products is obtained at the same time. The part attribute values having higher support and the subset of parts used in the faulty products are represented as a tree in which a parts type serves as a node. Next, for two part attribute values having higher support on the parts-type tree, the information gain of the rule that having both part attribute values is a cause of failure is calculated. This calculation is performed locally at the common parent part of the two parts, and parts having a certain information gain are output as a cause of failure. The two part attributes are selected so that part attributes located closer to each other on the tree are evaluated first, and the first part attributes found are made candidates for the cause of failure.
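The information-gain calculation for a two-attribute rule can be sketched as below. This uses the standard entropy-based definition of information gain, which the abstract does not spell out, so treat it as an assumption. The names `entropy` and `information_gain` and the representation of a product as a set of attribute values plus a failure flag are hypothetical.

```python
import math


def entropy(pos, neg):
    """Binary entropy (in bits) of a group with pos failures and neg non-failures."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(products, attr_a, attr_b):
    """Gain of the rule 'has both attr_a and attr_b, therefore fails'.

    products: list of (set_of_attribute_values, failed) pairs.
    """
    n = len(products)
    if n == 0:
        return 0.0
    have = [failed for attrs, failed in products if attr_a in attrs and attr_b in attrs]
    lack = [failed for attrs, failed in products if not (attr_a in attrs and attr_b in attrs)]

    def group_entropy(group):
        failures = sum(group)  # True counts as 1
        return entropy(failures, len(group) - failures)

    before = group_entropy([failed for _, failed in products])
    after = (len(have) / n) * group_entropy(have) + (len(lack) / n) * group_entropy(lack)
    return before - after


products = [
    ({"lot=A", "vendor=X"}, True),
    ({"lot=A", "vendor=X"}, True),
    ({"lot=A"}, False),
    ({"vendor=X"}, False),
]
gain = information_gain(products, "lot=A", "vendor=X")
```

In this toy data the pair of attribute values separates failed from working products perfectly, so the gain equals the full one bit of entropy in the failure label.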
Abstract:
A candidate synonym acquisition device acquires, from the data of each writer, a set of candidate synonyms similar to an input word for that writer, and acquires a set of candidate synonyms similar to the input word from the collective data. The generated candidate synonym sets are input to a candidate synonym determination device, which evaluates the candidate synonyms of the collective data. In the evaluation, the status of “absolute” is given to a word matching a word ranked first in the candidate synonyms for any writer, and the status of “negative” is given to words matching words ranked second or lower therein.
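The status assignment can be sketched as follows. The function name `label_candidates`, the ranked-list inputs, and the `"unlabeled"` fallback are assumptions. The abstract also does not say what happens when a word is ranked first for one writer and lower for another; this sketch lets "absolute" win, which is a choice made only for the example.

```python
def label_candidates(collective, per_writer):
    """Assign a status to each candidate synonym from the collective data.

    collective: ranked list of candidates from the merged data.
    per_writer: dict mapping writer id -> ranked list of that writer's candidates.
    """
    first = {ranked[0] for ranked in per_writer.values() if ranked}
    lower = {word for ranked in per_writer.values() for word in ranked[1:]}
    status = {}
    for word in collective:
        if word in first:
            status[word] = "absolute"   # assumption: first-rank match takes precedence
        elif word in lower:
            status[word] = "negative"
        else:
            status[word] = "unlabeled"  # assumption: not covered by the abstract
    return status


collective = ["client", "buyer", "user", "vendor"]
per_writer = {"w1": ["client", "user"], "w2": ["buyer", "client"]}
status = label_candidates(collective, per_writer)
```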
Abstract:
Provided are graphics display apparatus, systems and methods for effectively presenting information obtained by data mining, and for improving the visibility of individual data elements and of attributes of data included in a particular category while still providing an overview of the whole of large-scale hierarchical data. An example embodiment includes an aggregation unit for aggregating attributes of nodes in the hierarchical data according to given aggregation criteria; a filtering unit for filtering the result of the aggregation performed by the aggregation unit according to given filtering criteria to select nodes to be displayed from the hierarchical data; and a visualization unit for generating a graphics image that includes the nodes selected for display by the filtering unit and reflects the hierarchical structure of the hierarchical data.
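The aggregation, filtering, and visualization units can be pictured as a three-stage pipeline. In this sketch the aggregation criterion is a sum of a numeric `value` over each subtree, the filtering criterion is a minimum total, and an indented text outline stands in for the graphics image. All three choices, and the function names, are assumptions for illustration only.

```python
def aggregate(node):
    """Return a copy of the tree where each node carries the sum over its subtree."""
    children = [aggregate(child) for child in node.get("children", [])]
    total = node.get("value", 0) + sum(child["total"] for child in children)
    return {"name": node["name"], "total": total, "children": children}


def filter_nodes(node, min_total):
    """Keep only children whose aggregated total meets the threshold; the root is kept."""
    kept = [filter_nodes(c, min_total) for c in node["children"] if c["total"] >= min_total]
    return {"name": node["name"], "total": node["total"], "children": kept}


def render(node, depth=0):
    """Produce an indented outline that preserves the hierarchy of the kept nodes."""
    lines = ["  " * depth + f"{node['name']} ({node['total']})"]
    for child in node["children"]:
        lines.extend(render(child, depth + 1))
    return lines


tree = {
    "name": "root",
    "children": [
        {"name": "A", "value": 5, "children": [{"name": "A1", "value": 1}]},
        {"name": "B", "value": 2},
    ],
}
outline = render(filter_nodes(aggregate(tree), min_total=3))
# outline == ["root (8)", "  A (6)"]
```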
Abstract:
Different virtual labels, for example +1 and −1, are assigned to two data sets. A change analysis problem for the two data sets is reduced to a supervised learning problem by using the virtual labels. Specifically, a classifier such as logistic regression, a decision tree or an SVM is prepared and is trained on a data set obtained by merging the two data sets to which the virtual labels have been assigned. A feature selection function of the resultant classifier is used to rank and output every attribute contributing to the classification together with its contribution rate.
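The reduction to supervised learning can be sketched without any external library. The abstract names logistic regression, decision trees and SVMs; this example swaps in a one-level decision tree (a decision stump) per attribute and uses its best training accuracy as a stand-in for the contribution rate. The function names and the sample data are hypothetical.

```python
def stump_accuracy(values, labels):
    """Best training accuracy of a single-threshold rule on one attribute."""
    n = len(values)
    best = 0.0
    for threshold in sorted(set(values)):
        # Rule: predict +1 when value <= threshold, else -1.
        correct = sum(
            1 for v, y in zip(values, labels) if (1 if v <= threshold else -1) == y
        )
        # The flipped rule gets exactly the complementary count right.
        best = max(best, correct / n, (n - correct) / n)
    return best


def rank_changed_attributes(data_a, data_b, names):
    """Label data_a as +1 and data_b as -1, merge, and rank attributes by separability."""
    merged = [(row, +1) for row in data_a] + [(row, -1) for row in data_b]
    labels = [label for _, label in merged]
    scores = []
    for i, name in enumerate(names):
        values = [row[i] for row, _ in merged]
        scores.append((name, stump_accuracy(values, labels)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


before = [(1.0, 5.0), (1.2, 7.0), (0.9, 6.0)]
after = [(3.0, 5.5), (3.1, 6.5), (2.9, 7.0)]
ranking = rank_changed_attributes(before, after, ["temp", "pressure"])
```

An attribute whose distribution did not change between the two data sets cannot separate the virtual labels much better than chance, so it ranks low; here `temp` shifted and separates the sets perfectly, while `pressure` did not.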
Abstract:
An apparatus is provided with base table storage sections that store base tables and delta tables for the base tables, a summary table storage section that stores a summary table for storing results of queries to a plurality of base tables and delta information about the summary table, delta data processing sections that insert delta data of the base tables into the delta tables, and a delta computation processing section that generates delta information about the summary table. The delta computation processing section is provided with a generation section that generates delta information about a specified base table on the basis of an update that has been performed for the base table, in a situation where a subsequent update of the specified base table is permitted; and a control section that performs control so that, when a different base table is specified, delta information is generated in a different transaction.