Granted invention patent
US09058382B2 Augmenting a training set for document categorization (in force)

Augmenting a training set for document categorization
Abstract:
A method and system for augmenting a training set used to train a classifier of documents is provided. The augmentation system augments a training set with training data derived from features of documents based on a document hierarchy. The training data of the initial training set may be derived from the root documents of the hierarchies of documents. The augmentation system generates additional training data that includes an aggregate feature that represents the overall characteristics of a hierarchy of documents, rather than just the root document. After the training data is generated, the augmentation system augments the initial training set with the newly generated training data.
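The augmentation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say what the features are or how the aggregate feature is computed, so everything concrete here is an assumption made for illustration: features are taken to be bag-of-words counts, the aggregate is taken to be the sum of those counts over the root and all its descendants, and the names `Document`, `features`, `aggregate_features`, and `augment` are hypothetical.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Document:
    """A node in a document hierarchy (hypothetical structure)."""
    text: str
    children: list[Document] = field(default_factory=list)


def features(doc: Document) -> Counter:
    """Bag-of-words counts for one document (assumed feature type)."""
    return Counter(doc.text.lower().split())


def aggregate_features(root: Document) -> Counter:
    """Sum word counts over the root and all of its descendants.

    Summation is an assumption; the abstract only says the aggregate
    represents the overall characteristics of the hierarchy.
    """
    total = features(root)
    for child in root.children:
        total += aggregate_features(child)
    return total


def augment(training_set, labeled_roots):
    """Return the initial training set plus one aggregate example per hierarchy."""
    extra = [(aggregate_features(root), label) for root, label in labeled_roots]
    return list(training_set) + extra


# A root document with two child documents, labeled "travel".
root = Document(
    "cheap flights",
    [Document("flights to paris"), Document("hotel deals")],
)

# Initial training set: features of the root document only.
initial = [(features(root), "travel")]

# Augmented training set: the initial example plus the aggregate example.
augmented = augment(initial, [(root, "travel")])
```

In this sketch the first training example reflects only the root document, while the added example also carries terms that appear only in the child documents, such as "paris" and "hotel".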