Abstract:
A multimedia object retrieval apparatus and method for retrieving multimedia objects from structured documents containing both a multimedia object and relevant explanation text. The apparatus and method parse an input structured document into a parsing result such as an HTML DOM tree; recognize a main block in the parsing result and output a main-block-annotated structured document model; extract pairs of a multimedia object and its corresponding explanation, and output a structured object index such as an XML format object index; and search through the structured object index to form a target object list. The apparatus and method can be applied to various kinds of structured documents, and can extract object explanations with high precision. The apparatus and method may also identify the relationship between the object and the title of the input structured document.
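The parse, extract, and search steps can be illustrated with a minimal Python sketch. It is not the claimed method: it skips main block recognition entirely, assumes the multimedia objects are `<img>` elements, and assumes the explanation is simply the first non-empty text following each image. The names `ImageCaptionExtractor`, `build_object_index`, and `search_index` are hypothetical.

```python
from html.parser import HTMLParser


class ImageCaptionExtractor(HTMLParser):
    """Pairs each <img> with the first non-empty text that follows it.

    Assumption: "nearest following text" stands in for the explanation
    extraction described in the abstract.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []       # list of (src, explanation) tuples
        self._pending = None  # src of an image still waiting for text

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            # A previous image with no text in between gets an empty explanation.
            if self._pending is not None:
                self.pairs.append((self._pending, ""))
            self._pending = dict(attrs).get("src", "")

    def handle_data(self, data):
        text = data.strip()
        if text and self._pending is not None:
            self.pairs.append((self._pending, text))
            self._pending = None

    def close(self):
        super().close()
        # Flush a trailing image that never received any text.
        if self._pending is not None:
            self.pairs.append((self._pending, ""))
            self._pending = None


def build_object_index(html):
    """Parse the document and return the (object, explanation) index."""
    parser = ImageCaptionExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


def search_index(index, keyword):
    """Return the objects whose explanation contains the keyword."""
    kw = keyword.lower()
    return [src for src, text in index if kw in text.lower()]


if __name__ == "__main__":
    page = ('<div><img src="a.png"><p>A red car</p>'
            '<img src="b.png"><p>Blue boat</p></div>')
    index = build_object_index(page)
    print(index)
    print(search_index(index, "car"))
```

A real implementation would replace the "next text" heuristic with layout-aware analysis of the document tree and would serialize the index (for example, as XML) rather than keep it in memory.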
Abstract:
A management module of an annotation server receives annotation data, which contains location information of web page data, content information of an annotation, and position information of an object to which the annotation is linked, from a web client machine together with a registration request. Then, the management module issues an annotation ID. The module retrieves a text, which consists of the object to which the annotation is linked and an adjacent part that has a relationship satisfying a predetermined condition with the object, as context information from a source text of the web page. Then, the module registers the context information, together with the previously received annotation data, in an annotation database.
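The registration flow can be sketched as follows. This is an illustration only: the abstract does not state the "predetermined condition", so the sketch assumes it is a fixed character window on either side of the object. `AnnotationStore`, `extract_context`, the record layout, and the `window` parameter are all hypothetical.

```python
import itertools


def extract_context(source_text, start, end, window=20):
    """Return the object at [start, end) plus `window` characters on each side.

    Assumption: a fixed-size window stands in for the unspecified
    "predetermined condition" relating the adjacent text to the object.
    """
    lo = max(0, start - window)
    hi = min(len(source_text), end + window)
    return source_text[lo:hi]


class AnnotationStore:
    """In-memory stand-in for the annotation database."""

    def __init__(self):
        self._ids = itertools.count(1)  # issues annotation IDs 1, 2, 3, ...
        self.records = {}

    def register(self, url, content, start, end, source_text, window=20):
        """Issue an ID, derive the context, and store both with the annotation."""
        annotation_id = next(self._ids)
        self.records[annotation_id] = {
            "url": url,                # location information of the web page
            "content": content,        # content of the annotation
            "position": (start, end),  # position of the linked object
            "context": extract_context(source_text, start, end, window),
        }
        return annotation_id


if __name__ == "__main__":
    source = "The quick brown fox jumps over the lazy dog."
    start = source.index("fox")
    store = AnnotationStore()
    annotation_id = store.register(
        "http://example.com/page", "a note", start, start + 3, source, window=6
    )
    print(annotation_id, store.records[annotation_id]["context"])
```

Storing the surrounding text alongside the position gives the server a second way to locate the object if the page is later edited and the stored offsets no longer match.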
Abstract:
An apparatus translates a document (character string) that carries various kinds of typographical information, such as font size and font style, as character attributes, and reflects the typographical information attached to the original text in the translation result. For a document having typographical information between characters, the apparatus performs a morpheme analysis with the typographical information preserved, by regarding each piece of typographical information as a morpheme. For a sentence carrying typographical information as a character attribute, the apparatus performs a morpheme analysis, then judges the typographical information attached to each character forming a morpheme, and determines the morpheme's typographical information when the characters forming a single morpheme carry different pieces of typographical information. The apparatus also separates the analyzed sentence into typographical information and original text, translates the original text into another language, and converts the typographical information into an appropriate form as necessary, in anticipation of cases in which the typographical information attached to the original document cannot be attached "as is" to the translation result.
Abstract:
An annotation server stores annotation data sent from a web client into a first annotation database. The annotation server retrieves from the first database annotation data whose description information requires an execution result of a predetermined program, and incorporates the execution result of the predetermined program into the description information of the retrieved annotation data. Then, the computer transfers the data to a second database. On receiving a sending request for annotation data from a web client, the annotation server retrieves the annotation data from the second database and sends it to the web client that sent the request. As a result, the web client displays the latest information as an annotation over a web page according to the annotation data received from the computer.
Abstract:
A text data generation program generates text data as a target of a text-processing tool. The program controls a computer to receive location information of web page data from the text-processing tool, acquire the web page data from a web server via a communication device in response to the location information, acquire annotation data linked with the web page data from an annotation server via the communication device, convert the contents of the acquired annotation into a form that can be interpreted by the text-processing tool, embed the converted contents at the position to which the annotation should link, and output the web page data in which the contents of the annotation are embedded to the text-processing tool.
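The convert-and-embed step might look like the following Python sketch. It assumes the page text and annotations have already been fetched, that each annotation is a hypothetical `(offset, content)` pair with a character offset into the page text, and that the tool-readable form is a bracketed inline note. None of these details come from the abstract.

```python
def embed_annotations(page_text, annotations, template="[annotation: {}]"):
    """Insert each converted annotation at its character offset.

    `annotations` is a list of (offset, content) pairs (a hypothetical
    format). Insertions run from the highest offset to the lowest so that
    earlier offsets stay valid while the text grows.
    """
    result = page_text
    for offset, content in sorted(annotations, key=lambda a: a[0], reverse=True):
        result = result[:offset] + template.format(content) + result[offset:]
    return result


if __name__ == "__main__":
    page = "Price list. Contact us."
    notes = [(11, "updated weekly"), (23, "by email")]
    print(embed_annotations(page, notes))
```

Working from the end of the text backward avoids recomputing offsets after each insertion, which keeps the function short at the cost of one sort.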
Abstract:
The present invention provides a file recognition apparatus and method for recognizing a specific information type in a web page file group collected from the Internet or stored in another storage apparatus. The file recognition apparatus of the invention comprises: a file grouping section for classifying, from a predetermined viewpoint, the file group to be recognized by file type; a file type recognition section for recognizing the type of the files according to characteristics specific to the specific information type; and a file-type-recognition correction section for correcting the recognition result of each file in consideration of the recognition precision of all files in the group. The apparatus and method of the invention can recognize various types of information, and can obtain satisfactory recognition precision.
Abstract:
An evaluation apparatus learns the correspondence between domains and evaluation items from a Web page group on the Internet, generates an evaluation set group, and generates a specified domain evaluation set by extracting evaluation items corresponding to the specified domain from the evaluation set group. Then, it evaluates a Web page to be evaluated based on the specified domain evaluation set.
Abstract:
A query-and-response processing method for analyzing the intention of a query provided by a user reduces search result information to an amount manageable for the user, sorts out the result information, and presents it in an easily readable form to the user. A search request analyzer analyzes a search request provided from the user, a search criteria generator generates search criteria, then a search executor searches through a database. A query intention analyzer analyzes the intention of a query from the user, such as a query topic, and an output formatter, based on the result of the analysis, selects items to be presented to the user from the search results and determines the output format of the search results. A presentation module receives the results and presents the data to the user.
Abstract:
An object of the present invention is to carry out publication control for a portion of contents according to its valid period. The invention includes: reading out publication data from a publication data storage, the publication data including first data whose publication should be controlled, publication control condition data relating to a valid period of the first data, and second data whose publication does not have to be controlled; judging whether a condition defined in the publication control condition data is satisfied; and, upon detecting that the condition is satisfied, generating and outputting current publication data that includes the second data and the first data corresponding to the satisfied publication control condition data. Because publication of the first data is controlled based on the publication control condition data concerning the valid period, information that is no longer valid, such as a contact telephone number for inquiries about an event, can be withheld from the public after the event has ended, for example.
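A minimal Python sketch of this valid-period check follows. It assumes the publication control condition is an optional start and end date attached to each part of the content, and that a part with no dates is uncontrolled "second data". The `Part` class, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Part:
    """One portion of the publication data.

    Assumption: the publication control condition is modeled as an optional
    valid period; a part with no dates is always published.
    """
    text: str
    valid_from: Optional[date] = None
    valid_until: Optional[date] = None


def is_published(part, today):
    """Judge whether the part's valid-period condition is satisfied today."""
    if part.valid_from is not None and today < part.valid_from:
        return False
    if part.valid_until is not None and today > part.valid_until:
        return False
    return True


def current_publication(parts, today):
    """Generate the current publication data: only parts whose condition holds."""
    return [p.text for p in parts if is_published(p, today)]


if __name__ == "__main__":
    parts = [
        Part("Event report"),  # uncontrolled "second data"
        # Controlled "first data"; the number and date are made-up samples.
        Part("Inquiries: 555-0100", valid_until=date(2024, 5, 31)),
    ]
    print(current_publication(parts, date(2024, 5, 1)))
    print(current_publication(parts, date(2024, 6, 1)))
```

Evaluating the condition at generation time, rather than editing the stored data, leaves the stored publication data untouched, so the same source can be regenerated for any date.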
Abstract:
The accuracy of retrieving or clipping documents is improved. A document to be processed is input via a document input section. Event specifying means looks up knowledge information stored in knowledge information storing means to specify the type of an event described in the input document. Attribute value extracting means extracts, from the document, attribute values of attributes relating to the specified event. Correlating means performs a process of correlating the attribute values extracted by the attribute value extracting means with entities in the real world. Document storing means stores information (normalized information) generated by the correlating means and the document or information specifying a storage location thereof in a manner associated with each other. Document extracting means compares a query input from a user interface section with the normalized information and extracts, from the document storing means, matching documents or information specifying their storage locations.