Abstract:
A rule learning method for making a computer perform rule learning processing in machine learning includes firstly calculating an evaluation value of respective features in the training examples by using data and weights of the training examples; selecting a given number of features in descending order of the evaluation values; secondly calculating a confidence value for one of the given number of selected features; updating the weights of the training examples by using the data and weights of the training examples and the confidence value corresponding to the one feature; firstly repeating the updating for the remaining features of the given number of features; and secondly repeating, for a given number of times, the firstly calculating, the selecting, the secondly calculating, the updating, and the firstly repeating.
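The loop described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give formulas, so the gain |sqrt(W+) - sqrt(W-)|, the smoothed log-ratio confidence, and the exponential weight update are borrowed from the standard confidence-rated boosting formulation, and the names `learn_rules`, `top_k`, `rounds`, and `eps` are hypothetical.

```python
import math

def class_weights(feature, examples, labels, weights):
    """Sum the weights of positive and negative examples that contain the feature."""
    w_pos = sum(w for x, y, w in zip(examples, labels, weights) if feature in x and y > 0)
    w_neg = sum(w for x, y, w in zip(examples, labels, weights) if feature in x and y < 0)
    return w_pos, w_neg

def learn_rules(examples, labels, top_k=2, rounds=3, eps=1e-3):
    """examples: list of feature sets; labels: +1 or -1 (assumed encoding)."""
    weights = [1.0] * len(examples)
    features = sorted(set().union(*examples))
    rules = []  # (feature, confidence) pairs

    def gain(feature):
        # Evaluation value (assumed form): |sqrt(W+) - sqrt(W-)|
        w_pos, w_neg = class_weights(feature, examples, labels, weights)
        return abs(math.sqrt(w_pos) - math.sqrt(w_neg))

    for _ in range(rounds):  # outer ("second") repetition
        # Evaluate every feature once, then keep the top_k in descending order.
        selected = sorted(features, key=gain, reverse=True)[:top_k]
        for feature in selected:  # inner ("first") repetition
            # Confidence is computed from the current, already updated weights.
            w_pos, w_neg = class_weights(feature, examples, labels, weights)
            confidence = 0.5 * math.log((w_pos + eps) / (w_neg + eps))
            rules.append((feature, confidence))
            # Down-weight examples the rule classifies correctly, up-weight the rest.
            weights = [
                w * math.exp(-y * confidence) if feature in x else w
                for x, y, w in zip(examples, labels, weights)
            ]
    return rules, weights
```

The point of the sketch is that the features are evaluated once per round while several rules are produced from that single evaluation, each with a confidence computed against the weights left by the previous update.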
Abstract:
A rule learning method in machine learning includes distributing features to a given number of buckets based on a weight of the features which are correlated with a training example; specifying a feature with a maximum gain value as a rule based on a weight of the training example from each of the buckets; calculating a confidence value of the specified rule based on the weight of the training example; storing the specified rule and the confidence value in a rule data storage unit; updating the weights of the training examples based on the specified rule, the confidence value of the specified rule, data of the training example, and the weight of the training example; and repeating the distributing, the specifying, the calculating, the storing, and the updating, when the rule and the confidence value are to be further generated.
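A minimal sketch of the bucket-based variant follows. Several details are assumptions rather than anything stated in the abstract: a feature's weight is taken to be the summed weight of the examples containing it, features are dealt round-robin into buckets in descending weight order, the gain and confidence formulas follow the usual confidence-rated boosting form, and the weights are updated after each bucket's rule. All function and parameter names are hypothetical.

```python
import math

def distribute(examples, weights, num_buckets):
    """Deal features into buckets round-robin, heaviest feature first (assumed scheme)."""
    feature_weight = {}
    for x, w in zip(examples, weights):
        for f in x:
            feature_weight[f] = feature_weight.get(f, 0.0) + w
    ranked = sorted(feature_weight, key=lambda f: (-feature_weight[f], f))
    buckets = [[] for _ in range(num_buckets)]
    for i, f in enumerate(ranked):
        buckets[i % num_buckets].append(f)
    return buckets

def class_weights(feature, examples, labels, weights):
    w_pos = sum(w for x, y, w in zip(examples, labels, weights) if feature in x and y > 0)
    w_neg = sum(w for x, y, w in zip(examples, labels, weights) if feature in x and y < 0)
    return w_pos, w_neg

def learn_with_buckets(examples, labels, num_buckets=2, rounds=1, eps=1e-3):
    weights = [1.0] * len(examples)
    rule_store = []  # stands in for the rule data storage unit
    for _ in range(rounds):
        for bucket in distribute(examples, weights, num_buckets):
            if not bucket:
                continue
            # Specify the maximum-gain feature in this bucket as the rule.
            scored = [(f, *class_weights(f, examples, labels, weights)) for f in bucket]
            feature, w_pos, w_neg = max(
                scored, key=lambda t: abs(math.sqrt(t[1]) - math.sqrt(t[2]))
            )
            confidence = 0.5 * math.log((w_pos + eps) / (w_neg + eps))
            rule_store.append((feature, confidence))
            # Update example weights with the stored rule and its confidence.
            weights = [
                w * math.exp(-y * confidence) if feature in x else w
                for x, y, w in zip(examples, labels, weights)
            ]
    return rule_store, weights
```

Restricting each search to one bucket means only a fraction of the features is scored per rule, which is the apparent motivation for distributing them in the first place.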
Abstract:
A named entity extraction apparatus includes an extraction result acquisition unit for acquiring a named entity extraction result obtained as a result of a named entity extraction process; and a lexicon information creation unit for creating lexicon information which is utilized as clues in extracting named entities from text data, on the basis of the named entity extraction result acquired by said extraction result acquisition unit.
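One way to picture the lexicon information creation unit is the sketch below. It assumes, purely for illustration, that an extraction result is a list of (surface string, entity type) pairs and that the lexicon keeps the most frequent type for each surface string; the abstract does not specify either, and `build_lexicon` and `min_count` are hypothetical names.

```python
from collections import Counter, defaultdict

def build_lexicon(extraction_results, min_count=1):
    """Turn (surface, entity_type) extraction results into a surface -> type lexicon."""
    counts = defaultdict(Counter)
    for surface, entity_type in extraction_results:
        counts[surface][entity_type] += 1

    lexicon = {}
    for surface, type_counts in counts.items():
        # Keep the type this surface string was most often extracted as.
        entity_type, n = type_counts.most_common(1)[0]
        if n >= min_count:
            lexicon[surface] = entity_type
    return lexicon
```

A later extraction pass could then look up each token in this lexicon and use the stored type as an additional clue.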
Abstract:
A generation-target selecting unit selects supervised data from a supervised-data storage unit. A supervised generation unit generates new supervised data from the selected supervised data. A validity determining unit makes a rule learning unit learn the generated data and the supervised data, and makes an extracting unit extract information using test data to evaluate the result of the extraction. When the result is improved compared with the result before the generated supervised data was added, the generated supervised data is taken as correct supervised data.
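The validity check can be sketched as a simple accept-if-improved loop. The `train` and `evaluate` callables are hypothetical stand-ins for the rule learning unit and the extracting unit; nothing about their interfaces comes from the abstract.

```python
def accept_generated(supervised, candidates, train, evaluate, test_data):
    """Keep a generated example only if adding it improves the test-data score.

    train(data) -> model and evaluate(model, test_data) -> float are
    caller-supplied (hypothetical) stand-ins for the learning and extracting units.
    """
    accepted = list(supervised)
    best = evaluate(train(accepted), test_data)  # score before adding anything
    for candidate in candidates:
        trial = accepted + [candidate]
        score = evaluate(train(trial), test_data)
        if score > best:
            # Improved result: treat the generated example as correct supervised data.
            accepted, best = trial, score
    return accepted, best
```

Because each candidate is judged against the best score so far, a generated example that undoes an earlier improvement is rejected.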
Abstract:
An object of the present invention is to carry out publication control for a portion of contents according to its valid period. This invention includes: reading out publication data including first data whose publication should be controlled, publication control condition data relating to a valid period of the first data, and second data whose publication does not have to be controlled from a publication data storage storing the publication data to judge whether or not a condition defined in the publication control condition data is satisfied; and upon detecting that the condition defined in the publication control condition data is satisfied, generating current publication data including the first data corresponding to the publication control condition data whose condition is judged to be satisfied and the second data and outputting the generated current publication data. Because the publication of the first data is controlled based on the publication control condition data concerning the valid period, information that has lost its validity, such as a contact telephone number for inquiries about an event, can be withheld from the public after the event has ended.
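A minimal sketch of the valid-period check is shown below. It assumes the publication control condition is simply an end date attached to each controlled part; the `Part` class, its `valid_until` field, and `current_publication` are hypothetical names, and the sample data is invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Part:
    text: str
    # None marks "second data" whose publication is not controlled.
    valid_until: Optional[date] = None

def current_publication(parts, today):
    """Return the texts to publish today: uncontrolled parts plus still-valid ones."""
    return [
        p.text
        for p in parts
        if p.valid_until is None or today <= p.valid_until
    ]
```

For example, a page made of an event overview and an inquiry telephone number valid until the end of the event would publish both parts during the event and only the overview afterwards.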
Abstract:
In an information processing device, a display control unit arranges one of the options at a predetermined center position, while arranging the others radially around the center option, and displays the options in a selectable manner. Before a user provides input through an input unit by operating an arrow key, a cursor is placed on the center option. The options are displayed in a matrix, and the cursor is initially placed and displayed in the center of the matrix.
Abstract:
This invention provides a technique for conveying the content to be published correctly to human readers while preventing machines from collecting parts of that content whose distribution the information provider does not want. This invention includes: reading out contents data to be published, which includes text data, and identifying a character string whose output as the text data should be avoided from the contents data; converting the identified character string into substitution data other than the text data so as to maintain content of the identified character string; and generating publication contents data to maintain publication content of the contents data by using data other than the identified character string in the contents data and the substitution data. Thus, by carrying out such processing, it becomes possible to conceal the character string from machines without changing the published content for human readers.
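The identify-and-substitute step might look like the sketch below. It assumes that the strings to conceal are e-mail addresses found by a regular expression and that the substitution data is an image referenced from the page; both are illustrative choices, not details from the abstract. Rendering the image itself is left to the caller, and `conceal` and the `concealed_N.png` file names are hypothetical.

```python
import re

# Assumed pattern for strings that should not appear as text (e-mail addresses).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def conceal(html_text, pattern=EMAIL):
    """Replace matching strings with image references.

    Returns the rewritten text and a list of (file name, original string)
    pairs that still have to be rendered as images by some other component.
    """
    to_render = []

    def substitute(match):
        name = f"concealed_{len(to_render)}.png"
        to_render.append((name, match.group(0)))
        return f'<img src="{name}" alt="">'

    return pattern.sub(substitute, html_text), to_render
```

The published text then no longer contains the string itself, while a human reader still sees it once the listed images have been rendered.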
Abstract:
An index-item extracting unit extracts an index item that forms an index of an electronic document, together with appearing position information of the index item, from the electronic document. An index-list creating unit creates link information that includes the appearing position in the electronic document of the extracted index item as a link, attaches the created link information to the index item, and creates an index list by arranging the index item to which the link information is attached.
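As a rough sketch of the two units, the code below takes a list of candidate index terms, records the first position at which each appears, and emits an alphabetically arranged list of links. The term list, the `#pos-N` anchor scheme, and the name `build_index` are assumptions made for illustration; the abstract does not say how index items are recognized or how links are encoded.

```python
import html

def build_index(document, index_terms):
    """Create an index list whose entries link to the term's first position."""
    entries = []
    for term in index_terms:
        position = document.find(term)  # appearing position, -1 if absent
        if position >= 0:
            entries.append((term, position))
    # Arrange the index items alphabetically, ignoring case.
    entries.sort(key=lambda entry: entry[0].lower())
    return [
        f'<a href="#pos-{position}">{html.escape(term)}</a>'
        for term, position in entries
    ]
```

Terms that never occur in the document are simply left out of the list.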