Abstract:
An apparatus comprises: a unit configured to divide input document data into a body region, a caption region, and an object region; a unit configured to acquire text information included in each of the body region and the caption region; a unit configured to search the text information in the body region for an anchor term, to extract an anchor term from the text information in the caption region, and to generate a bi-directional link between a portion corresponding to the anchor term in the body region and a portion of the object region to which the caption region is appended; and a unit configured to convert the input document data into digital document data in which the portion corresponding to the anchor term in the body region and the portion corresponding to the object region to which the caption region is appended are bi-directionally linked based on the link.
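The linking step can be illustrated with a minimal sketch. It is not the claimed implementation: the anchor-term pattern (`Fig.`, `Figure`, `Table` followed by a number), the function names, and the use of character offsets as the "portion" of the body region are all assumptions made for illustration.

```python
import re

# Hypothetical anchor-term pattern; the abstract does not define one.
ANCHOR_RE = re.compile(r"\b(?:Fig\.|Figure|Table)\s*\d+")

def extract_anchor(caption_text):
    """Return the first anchor term found in a caption, or None."""
    match = ANCHOR_RE.search(caption_text)
    return match.group(0) if match else None

def build_links(body_text, captions):
    """captions maps an object-region id to its caption text.

    Returns two lookup tables so a link can be followed in either
    direction: body span -> object id, and object id -> body spans.
    """
    body_to_object = {}
    object_to_body = {}
    for object_id, caption in captions.items():
        anchor = extract_anchor(caption)
        if anchor is None:
            continue
        # The lookahead keeps "Fig. 1" from matching inside "Fig. 10".
        pattern = re.escape(anchor) + r"(?!\d)"
        for match in re.finditer(pattern, body_text):
            span = (match.start(), match.end())
            body_to_object[span] = object_id
            object_to_body.setdefault(object_id, []).append(span)
    return body_to_object, object_to_body
```

Keeping both tables is one simple way to make the link bi-directional; a real document format would more likely store paired link annotations on each side.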
Abstract:
An image processing apparatus comprises: a character information acquisition unit configured to acquire character information included in each of a body region and a caption region; an accumulation unit configured to divide the character information acquired from the body region into predetermined set units and to accumulate the character information and position information of each divided set unit in a memory; an anchor term extraction unit configured to extract an anchor term from the character information acquired from the caption region; an anchor term search unit configured to search, based on the character information accumulated in the memory for each set unit, for the set unit including the extracted anchor term; and a link information generation unit configured to generate link generation information that associates the set unit found by the anchor term search unit with the object region to which the caption region including the anchor term is appended.
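A minimal sketch of the accumulate-then-search flow follows. The choice of a text line as the "set unit", the `(page, line_no)` position format, and all names are assumptions; the abstract only says the units are predetermined.

```python
from dataclasses import dataclass

@dataclass
class SetUnit:
    text: str      # character information of the unit
    page: int      # position information (hypothetical format)
    line_no: int

def accumulate(body_text, page):
    """Split body text into line-sized set units and keep their positions."""
    units = []
    for line_no, line in enumerate(body_text.split("\n")):
        if line.strip():  # skip empty lines
            units.append(SetUnit(text=line, page=page, line_no=line_no))
    return units

def find_units(units, anchor):
    """Return every accumulated set unit whose text contains the anchor term."""
    return [unit for unit in units if anchor in unit.text]

def make_link_info(units, anchor, object_id):
    """Associate each matching set unit with the captioned object region."""
    return [
        {"anchor": anchor, "page": u.page, "line_no": u.line_no, "object_id": object_id}
        for u in find_units(units, anchor)
    ]
```

Searching per unit means a hit already carries its position, so no second pass over the body text is needed to locate the link source.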
Abstract:
An image in which at least one color component among a plurality of color components in an input color image has a resolution lower than that of the other color components is held as a base image. A rectangular region that includes an object image is extracted, a region of the object image and a region of a background image are specified in the rectangular region, and fill-up processing of the specified object image is executed in the base image. Fill-up processing executed at a boundary between the object image and the base image differs from fill-up processing executed at a boundary between the object image and the background image.
Abstract:
This invention generates a digital document by applying character recognition to character images in a document image and rendering the character recognition result on the document image in a transparent color. When a search is conducted, this digital document allows the part of the document image that corresponds to the search keyword to be identified. The generated digital document includes a description that lets glyph data (font data) of a simple character shape be shared by a plurality of character types as the font data used to render the character recognition result. Therefore, even when the digital document needs to save font data, the increase in file size can be minimized. In addition, because rendering uses a simple character shape, the data size of the font data itself can be reduced.
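The shared-glyph idea can be sketched as follows. The dictionary layout, the rectangle path string, and the function names are illustrative assumptions and do not correspond to any particular document or font format.

```python
def build_shared_glyph_font(recognized_text, glyph_path="M0,0 H1 V1 H0 Z"):
    """Build a font table in which every character type uses one glyph.

    glyph_path is a hypothetical simple shape (a unit rectangle).
    """
    glyphs = [glyph_path]  # only one shape is stored
    cmap = {ch: 0 for ch in sorted(set(recognized_text))}  # all map to glyph 0
    return {"glyphs": glyphs, "cmap": cmap}

def render_ops(recognized, font):
    """recognized: list of (char, x, y) from character recognition.

    Emits drawing operations with a transparent fill so the scanned image
    stays visible while the text remains searchable.
    """
    return [
        {"char": ch, "glyph": font["cmap"][ch], "x": x, "y": y, "fill": "transparent"}
        for ch, x, y in recognized
    ]
```

However many distinct characters are recognized, the stored glyph list keeps a length of one, which is where the file-size saving in this sketch comes from.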
Abstract:
The present invention is intended to generate data optimal for both display and reuse from an image. From an input image, vector data of a display foreground layer, vector data of a non-display foreground layer, and a display background layer in which a portion of the input image is filled are generated. Next, electronic data including the display foreground layer, display background layer, and the non-display foreground layer is generated. By using the multi-layered electronic data, a composite image of the display foreground layer and the display background layer is provided for display, and the layers for display are switched for reuse. This makes it possible to provide data optimal for both display and reuse.
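The three-layer structure can be sketched as a small data model. The abstract does not say which layers are shown during reuse, so the sketch leaves that choice to the caller; the layer names and the `show_only` helper are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    content: str   # stand-in for vector data or a filled background image
    visible: bool

def make_document(display_fg, hidden_fg, background):
    """Assemble multi-layered electronic data with display defaults."""
    return [
        Layer("display_background", background, True),
        Layer("display_foreground", display_fg, True),
        Layer("non_display_foreground", hidden_fg, False),
    ]

def show_only(document, names):
    """Switch layer visibility; returns the names of the visible layers."""
    for layer in document:
        layer.visible = layer.name in names
    return [layer.name for layer in document if layer.visible]
```

For display, the default visibility composites the display foreground over the display background. For reuse, a caller might, for example, call `show_only(doc, {"non_display_foreground"})`, though which layers to expose is a guess here.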
Abstract:
According to the present invention, it is possible to create electronic document data in which an object detected through a search is highlighted so that a user can easily recognize it. An image processing apparatus extracts an object from an input image and extracts metadata related to the object. When the image processing apparatus determines that a frame is to be described with a shape that follows the shape of the object, it creates a vector path description of a frame having that shape. The image processing apparatus then creates an electronic document including the data of the input image and the vector path description of the frame, with which the metadata is associated. When a keyword search is performed on the created electronic document, highlight display is performed in accordance with the vector path description of the frame whose associated metadata matches the keyword.
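The search-and-highlight step can be sketched as below. The frame record layout (`path`, `metadata`) and the case-insensitive substring match are assumptions; the abstract does not specify how metadata is matched against the keyword.

```python
def frames_to_highlight(frames, keyword):
    """Return the vector paths of frames whose metadata matches the keyword.

    frames: list of dicts with 'path' (vector path description of the frame)
    and 'metadata' (text associated with the object).
    """
    needle = keyword.lower()
    return [f["path"] for f in frames if needle in f["metadata"].lower()]
```

A viewer would then stroke or fill each returned path, so the highlight follows the outline of the object rather than its bounding rectangle.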
Abstract:
An image processing apparatus extracts an object area (e.g., character, picture, line drawing, or table) from an input image and acquires metadata to be associated with the object. The image processing apparatus generates a transparent graphics description for each object area whose attribute requires one, and generates an electronic document in which the transparent graphics description is associated with the metadata. Graphics of an arbitrary shape can be used as the transparent graphics description. Accordingly, the image processing apparatus can generate electronic document data suited to highlight display that users can easily recognize when a keyword search is used to find an object included in an electronic document.
Abstract:
This invention identifies, from the image data of a document to be copied, the original document file corresponding to that document, and performs the print process based on the identified file so as to prevent deterioration of image quality. When the document to be copied is not registered, a registration process is executed so that deterioration of image quality is suppressed from an early stage. Furthermore, since the document is converted into vector data, re-use of the document is facilitated, and deterioration of image quality can be suppressed even when image processing such as enlargement is applied. To this end, when an original digital file cannot be identified, an apparatus of this embodiment executes a vectorization process (S54), converts the obtained vector data into a data format that can be re-used by an application (S55), and registers the converted file in a file server (S56). Since this registration process settles the location of the file, that location information is composited on an image to be scanned using an identifier such as a two-dimensional barcode (S48), and the composite image can be printed (S49). Even when the printed document is scanned again, the registered digital file can be easily identified.
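The control flow around steps S54 to S56 and S48 to S49 can be sketched as follows. Everything except the step labels is hypothetical: the file server is a plain dictionary, vectorization and format conversion are string placeholders, and the `doc-N` location scheme is invented for the example.

```python
def handle_copy(scanned_image, file_server, find_location):
    """Return (print_job, steps) for one copy request.

    file_server: dict mapping a location string to file contents.
    find_location: callable returning the location recovered from the scan
    (for example from an embedded identifier), or None if there is none.
    """
    location = find_location(scanned_image)
    if location is not None and location in file_server:
        # Original file identified: print from it instead of the scan.
        return {"source": file_server[location]}, ["print_from_original"]

    vector_data = f"vector({scanned_image})"    # S54: vectorization (placeholder)
    app_file = f"app_format({vector_data})"     # S55: convert to a reusable format
    location = f"doc-{len(file_server) + 1}"    # hypothetical location scheme
    file_server[location] = app_file            # S56: register in the file server
    job = {"source": app_file, "identifier": location}  # S48: composite identifier
    return job, ["S54", "S55", "S56", "S48", "S49"]      # S49: print the composite
```

Calling it twice shows the intended effect: the first scan is registered and printed with an identifier, and a later scan that yields that identifier is printed from the registered file.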