Abstract:
The present invention provides an apparatus for logically processing a composite graph in a formatted document, the apparatus comprising: a composite graph block extraction unit, used to extract a composite graph block from the formatted document; a document parsing unit, used to parse the formatted document to obtain the text elements contained therein; a cutline element extraction unit, used to extract a cutline element from the text elements; a correlativity detection unit, used to detect correlativity between the composite graph block and the cutline element; and a correlativity storage unit, used to store the detected correlativity. The present invention also provides a method for logically processing a composite graph in a formatted document. With the technical scheme disclosed in the present invention, layout understanding of a composite graph in a mixed graph-and-text layout of the formatted document is easily achieved, so that logical errors are avoided.
Abstract:
This invention provides a logic processing apparatus for composite graphs in a fixed-layout document. The apparatus includes: a composite graph block extraction unit, for extracting composite graph blocks from the fixed-layout document; a document parsing unit, for parsing the fixed-layout document to obtain the text primitives contained therein; a legend primitive extraction unit, for extracting legend primitives from the text primitives; a correlation detection unit, for detecting correlations between the composite graph blocks and the legend primitives; and a correlation storage unit, for storing the detected correlations. A logic processing method for composite graphs in a fixed-layout document is also provided.
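The abstract names the units but does not say how legends are recognized or how a correlation is scored. As a minimal sketch only, the following Python assumes two things that are not in the source: a legend primitive is a text primitive that starts with "Fig." or "Figure" followed by a number, and a legend correlates with the nearest composite graph block that overlaps it horizontally and lies within a small vertical gap. All class names, function names, and the `max_gap` threshold are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass(frozen=True)
class TextPrimitive:
    id: int
    text: str
    box: Box


@dataclass(frozen=True)
class GraphBlock:
    id: int
    box: Box


# Assumption (not from the abstract): legends look like "Fig. 3 ..." or "Figure 3 ...".
LEGEND_RE = re.compile(r"^\s*(fig\.?|figure)\s*\d+", re.IGNORECASE)


def extract_legends(primitives):
    """Legend primitive extraction: keep text primitives that look like legends."""
    return [p for p in primitives if LEGEND_RE.match(p.text)]


def horizontal_overlap(a, b):
    """Width of the horizontal overlap of two boxes (0 if they do not overlap)."""
    return max(0.0, min(a.x1, b.x1) - max(a.x0, b.x0))


def vertical_gap(a, b):
    """Vertical distance between two boxes (0 if they overlap vertically)."""
    return max(0.0, max(a.y0, b.y0) - min(a.y1, b.y1))


def detect_correlations(blocks, legends, max_gap=30.0):
    """Correlation detection: map each legend id to its nearest graph block id.

    Assumption: "correlated" means horizontally overlapping and no more than
    max_gap units apart vertically. The returned dict plays the role of the
    correlation storage unit.
    """
    correlations = {}
    for legend in legends:
        candidates = [
            b for b in blocks
            if horizontal_overlap(b.box, legend.box) > 0
            and vertical_gap(b.box, legend.box) <= max_gap
        ]
        if candidates:
            nearest = min(candidates, key=lambda b: vertical_gap(b.box, legend.box))
            correlations[legend.id] = nearest.id
    return correlations
```

Calling `detect_correlations(blocks, extract_legends(text_primitives))` returns a plain dictionary from legend identifier to block identifier; legends with no block in range are simply left out rather than forced onto a distant block.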
Abstract:
An extraction device for composite graphs in a fixed-layout document, comprising: a document parsing unit, for parsing the fixed-layout document and determining the primitives of the fixed-layout document and their types; a layer generation unit, for extracting the text primitives to form a text layer and using the remaining non-text primitives to form a non-text layer; a page analysis unit, for performing page analysis on the text layer and the non-text layer separately; a block generation unit, for generating text blocks in the text layer and graph blocks in the non-text layer; a correlation block determination unit, for determining the text blocks correlated with each graph block and merging the correlated text blocks and graph blocks into a composite graph block; and an identifier storage unit, for storing the identifiers of all the primitives contained in the composite graph block.
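The layer split and the merge step can be illustrated with a short Python sketch. It is not the patented method: the abstract does not define how a text block is judged to correlate with a graph block, so the sketch assumes bounding-box intersection within a small margin. It also skips page analysis and block generation and takes ready-made blocks as input. The names `Primitive`, `Block`, `split_layers`, `build_composite`, and the `margin` parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Primitive:
    id: int
    kind: str   # e.g. "text", "image", "path"
    box: tuple  # (x0, y0, x1, y1)


@dataclass
class Block:
    primitive_ids: list
    box: tuple  # (x0, y0, x1, y1)


def split_layers(primitives):
    """Layer generation: text primitives form one layer, everything else the other."""
    text_layer = [p for p in primitives if p.kind == "text"]
    non_text_layer = [p for p in primitives if p.kind != "text"]
    return text_layer, non_text_layer


def union(a, b):
    """Smallest box that contains both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def intersects(a, b, margin=0.0):
    """True if box b touches box a after a is grown by margin on every side."""
    return (
        a[0] - margin <= b[2] and b[0] <= a[2] + margin
        and a[1] - margin <= b[3] and b[1] <= a[3] + margin
    )


def build_composite(graph_block, text_blocks, margin=5.0):
    """Merge a graph block with the text blocks assumed to correlate with it.

    Assumption: a text block correlates with the graph block when their boxes
    intersect within margin. The returned block's primitive_ids list is what
    an identifier storage unit would persist.
    """
    ids = list(graph_block.primitive_ids)
    box = graph_block.box
    for tb in text_blocks:
        if intersects(graph_block.box, tb.box, margin):
            ids.extend(tb.primitive_ids)
            box = union(box, tb.box)
    return Block(sorted(ids), box)
```

Storing only primitive identifiers, as the abstract describes, keeps the composite graph block a lightweight view over the parsed page: the primitives themselves are not copied.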