Abstract:
Semantic objects are created that provide a structure for markup language representations of documents. The semantic objects include text runs that are produced from the markup language representation and placed into semantic blocks, which group the text runs according to how the text is logically structured in the document being represented. The text runs of each semantic block are ordered to correspond to the logical order of the document. The semantic blocks corresponding to each page of the document are likewise ordered to correspond to that logical order. The ordered semantic blocks, including their ordered text runs, are saved as a semantic object, which can then be used to exploit the logical structure of the document represented by the markup language.
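
The arrangement described above can be illustrated with a minimal sketch. It is not the claimed implementation: the class names (`TextRun`, `SemanticBlock`, `SemanticObject`), the `block_id` grouping key, and the use of `x`/`y` layout coordinates as a stand-in for logical order are all assumptions made for illustration, since the abstract does not state how grouping or ordering is determined.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TextRun:
    text: str
    page: int
    block_id: str  # hypothetical: identifies the logical group of the run
    x: float       # hypothetical layout position, used here as an ordering proxy
    y: float


@dataclass
class SemanticBlock:
    page: int
    block_id: str
    runs: List[TextRun] = field(default_factory=list)


@dataclass
class SemanticObject:
    blocks: List[SemanticBlock]

    def plain_text(self) -> str:
        # One line per semantic block, runs joined in their stored order.
        return "\n".join(" ".join(r.text for r in b.runs) for b in self.blocks)


def build_semantic_object(runs: List[TextRun]) -> SemanticObject:
    # 1. Place each text run into the semantic block it belongs to.
    grouped: Dict[Tuple[int, str], SemanticBlock] = {}
    for run in runs:
        key = (run.page, run.block_id)
        block = grouped.setdefault(key, SemanticBlock(run.page, run.block_id))
        block.runs.append(run)

    # 2. Order the runs inside each block (assumed: top-to-bottom, left-to-right).
    for block in grouped.values():
        block.runs.sort(key=lambda r: (r.y, r.x))

    # 3. Order the blocks page by page, then by the position of their first run.
    ordered = sorted(
        grouped.values(),
        key=lambda b: (b.page, b.runs[0].y, b.runs[0].x),
    )

    # 4. Save the ordered blocks, with their ordered runs, as one semantic object.
    return SemanticObject(ordered)


# Runs arrive in arbitrary order, as they might from a markup representation.
example_runs = [
    TextRun("world", 1, "p1", 50, 10),
    TextRun("Second", 1, "p2", 0, 40),
    TextRun("Hello", 1, "p1", 0, 10),
    TextRun("Next page", 2, "p1", 0, 5),
    TextRun("block", 1, "p2", 60, 40),
]
semantic_object = build_semantic_object(example_runs)
print(semantic_object.plain_text())
```

A consumer of the saved object can then walk `blocks` and `runs` in order without re-deriving the document's structure from the markup; a real system would likely replace the coordinate-based sort with whatever ordering rule the full specification defines.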