Invention Grant
US09355087B2 Identification of content in an electronic document (in force)

Identification of content in an electronic document
Abstract:
In some embodiments, a method includes receiving an electronic document that comprises a plurality of sections. The method includes marking each of the plurality of sections as a content section or a non-content section using a visual attribute of the section that includes at least one of a width of the section, a density of hyperlinks in the section, a size of a font of text in the section, and whether a title of the electronic document overlaps with text in the section. The method also includes storing the marking of the plurality of sections of the electronic document in a machine-readable medium.
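
The marking step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the Section fields, the helper names, every threshold value, and the simple voting rule are assumptions made here for illustration. The abstract only requires that at least one of the four visual attributes be used, and it does not say how they are combined.

```python
import json
import string
from dataclasses import dataclass


@dataclass
class Section:
    """Hypothetical representation of one rendered section of a document."""
    text: str
    width_px: int        # rendered width of the section
    link_count: int      # number of hyperlinks inside the section
    font_size_pt: float  # dominant font size of the section's text


def _words(text):
    # Lowercase the words and strip surrounding punctuation.
    stripped = (w.strip(string.punctuation).lower() for w in text.split())
    return [w for w in stripped if w]


def link_density(section):
    # Hyperlinks per word; sections with no words get a denominator of 1.
    return section.link_count / max(len(_words(section.text)), 1)


def title_overlap(title, section):
    # Fraction of the title's distinct words that also appear in the section.
    title_words = set(_words(title))
    if not title_words:
        return 0.0
    return len(title_words & set(_words(section.text))) / len(title_words)


def mark_section(section, title, min_width_px=400, max_link_density=0.3,
                 min_font_pt=10.0, min_title_overlap=0.5, min_votes=3):
    # Each attribute casts one vote for "content". All thresholds and the
    # voting rule are illustrative assumptions, not values from the patent.
    votes = 0
    votes += section.width_px >= min_width_px
    votes += link_density(section) <= max_link_density
    votes += section.font_size_pt >= min_font_pt
    votes += title_overlap(title, section) >= min_title_overlap
    return "content" if votes >= min_votes else "non-content"


def mark_document(title, sections):
    # One marking per section, in document order.
    return [mark_section(s, title) for s in sections]


def store_markings(markings, path):
    # Persist the markings in a machine-readable form (JSON is an assumption).
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(markings, fh)
```

Under these assumed thresholds, a wide article body with few links and words shared with the title would be marked as content, while a narrow, link-heavy navigation bar in a small font would be marked as non-content.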