Invention Grant
US08260062B2 System and method for identifying document genres (status: active)

Abstract:
A system, a computer-readable storage medium including instructions, and a method for generating genre models used to identify the genres of a document. For each document image in a set of document images that are associated with one or more genres, the document image is segmented into a plurality of tiles, sized so that document page features are identifiable, and features of the document image and of the plurality of tiles are computed. At least one genre classifier is then trained to classify document images as being associated with one or more genres, based on the features of the document images in the set, the features of the tiles of those document images, and the one or more genres associated with each document image in the set.
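
The pipeline described in the abstract (tile segmentation, feature computation, classifier training) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the function names, the single "ink density" feature, the 2 x 2 tile grid, and the nearest-centroid model standing in for the genre classifier. Images are assumed to be grayscale, given as lists of pixel rows, and at least as large as the tile grid.

```python
from statistics import mean


def segment_into_tiles(image, rows, cols):
    """Split a 2-D grayscale image (list of pixel rows) into rows x cols tiles."""
    height, width = len(image), len(image[0])
    tiles = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            tiles.append([row[left:right] for row in image[top:bottom]])
    return tiles


def ink_density(pixels, threshold=128):
    """Fraction of pixels darker than the threshold (hypothetical feature)."""
    flat = [p for row in pixels for p in row]
    return sum(p < threshold for p in flat) / len(flat)


def compute_features(image, rows=2, cols=2):
    """One page-level feature followed by one feature per tile."""
    tiles = segment_into_tiles(image, rows, cols)
    return [ink_density(image)] + [ink_density(t) for t in tiles]


def train_genre_models(labeled_images, rows=2, cols=2):
    """Build one centroid per genre from (image, genres) pairs.

    A document image may carry several genres; its feature vector then
    contributes to the centroid of each of them.
    """
    by_genre = {}
    for image, genres in labeled_images:
        feats = compute_features(image, rows, cols)
        for genre in genres:
            by_genre.setdefault(genre, []).append(feats)
    return {g: [mean(col) for col in zip(*vecs)] for g, vecs in by_genre.items()}


def classify(image, models, rows=2, cols=2):
    """Return the genre whose centroid is closest in squared Euclidean distance."""
    feats = compute_features(image, rows, cols)

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(feats, centroid))

    return min(models, key=lambda g: squared_distance(models[g]))
```

This sketch returns only the single closest genre. Assigning several genres to one document, as the abstract allows, would more plausibly use one binary classifier per genre; that variant is not shown here.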