Granted invention patent
US07792846B1 Training procedure for N-gram-based statistical content classification (status: in force)

Training procedure for N-gram-based statistical content classification
Abstract:
A training procedure for N-gram-based statistical document classification is disclosed. In one embodiment, a set of N-grams is selected from a second set of N-grams, where each N-gram is a sequence of N bytes and N is an integer. A statistical content classification model is then generated based on the occurrences, if any, of the selected N-grams in a set of training documents and a set of validation documents. The model is provided to content filters, which use it to classify content.
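
The abstract does not say how the N-grams are selected or what form the model takes, so the following Python sketch is only one plausible reading, not the patented method. It assumes a two-class problem, picks N-grams by the gap in document frequency between the classes, assigns each selected N-gram a smoothed log-odds weight from the training documents, and uses the validation documents only to choose a decision threshold. All function names, the scoring rule, and the sample documents are hypothetical.

```python
import math
from collections import Counter

def byte_ngrams(data: bytes, n: int) -> set:
    """Distinct sequences of n consecutive bytes found in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}

def select_ngrams(docs, labels, n, k):
    """Select k N-grams out of the candidate set (every N-gram seen in docs).
    Assumed criterion: largest gap in per-class document frequency."""
    pos, neg = Counter(), Counter()
    for doc, label in zip(docs, labels):
        (pos if label else neg).update(byte_ngrams(doc, n))
    n_pos = sum(labels) or 1
    n_neg = (len(labels) - sum(labels)) or 1
    gap = lambda g: abs(pos[g] / n_pos - neg[g] / n_neg)
    # Ties are broken by byte order so the selection is deterministic.
    return sorted(set(pos) | set(neg), key=lambda g: (-gap(g), g))[:k]

def train_weights(selected, docs, labels, n):
    """Log-odds weight per selected N-gram, with add-one smoothing."""
    chosen = set(selected)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos, neg = Counter(), Counter()
    for doc, label in zip(docs, labels):
        (pos if label else neg).update(byte_ngrams(doc, n) & chosen)
    return {g: math.log((pos[g] + 1) / (n_pos + 2))
               - math.log((neg[g] + 1) / (n_neg + 2)) for g in selected}

def score(weights, doc, n):
    """Sum the weights of the selected N-grams that occur in doc."""
    return sum(weights[g] for g in byte_ngrams(doc, n) if g in weights)

def pick_threshold(weights, docs, labels, n):
    """Choose the cut-off that maximizes accuracy on the validation set."""
    scores = [score(weights, d, n) for d in docs]
    cuts = sorted(set(scores))
    # Try midpoints between adjacent distinct scores (or the single score).
    candidates = [(a + b) / 2 for a, b in zip(cuts, cuts[1:])] or cuts
    def accuracy(t):
        return sum((s >= t) == bool(l) for s, l in zip(scores, labels)) / len(labels)
    return max(candidates, key=accuracy)

def build_model(train, train_labels, valid, valid_labels, n=3, k=10):
    """Train on one document set, tune on the other, return a plain dict."""
    selected = select_ngrams(train, train_labels, n, k)
    weights = train_weights(selected, train, train_labels, n)
    threshold = pick_threshold(weights, valid, valid_labels, n)
    return {"n": n, "weights": weights, "threshold": threshold}

def classify(model, doc: bytes) -> bool:
    """What a content filter would run once it has received the model."""
    return score(model["weights"], doc, model["n"]) >= model["threshold"]

# Hypothetical toy data: label 1 = unwanted content, label 0 = wanted content.
train_docs = [b"buy cheap pills now", b"cheap pills online",
              b"meeting notes attached", b"project meeting at noon"]
train_labels = [1, 1, 0, 0]
valid_docs = [b"cheap pills here", b"notes from meeting"]
valid_labels = [1, 0]
model = build_model(train_docs, train_labels, valid_docs, valid_labels)
```

Returning the model as a plain dictionary keeps it easy to serialize and ship to separate filter processes, which only need `classify`. The sketch assumes non-empty training and validation sets and makes no attempt at the scale a production filter would need.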