摘要:
A device for classifying a document comprises a module to generate a data tree structure and configured to assign terms to a first plurality of nodes of the data tree structure, where each of the first plurality of nodes is assigned a weight. In assigning the weights of the first plurality of nodes, a first generation of combinations of possible weights assignable as the weights of the first plurality of nodes is obtained, and a second generation of combinations of possible weights assignable as the weights of the first plurality of nodes is obtained by performing the genetic algorithms in the first generation of combinations of possible weights. The device determines whether the document is in a document class based at least the weights of the first plurality of nodes.