发明授权
US07707129B2 Text classification by weighted proximal support vector machine based on positive and negative sample sizes and weights 有权
基于正,负样本大小和权重的加权近端支持向量机进行文本分类

Text classification by weighted proximal support vector machine based on positive and negative sample sizes and weights
摘要:
Embodiments of the invention relate to improvements to the support vector machine (SVM) classification model. When text data is significantly unbalanced (i.e., positive and negative labeled data are in disproportion), the classification quality of standard SVM deteriorates. Embodiments of the invention are directed to a weighted proximal SVM (WPSVM) model that achieves substantially the same accuracy as the traditional SVM model while requiring significantly less computational time. A weighted proximal SVM (WPSVM) model in accordance with embodiments of the invention may include a weight for each training error and a method for estimating the weights, which automatically solves the unbalanced data problem. And, instead of solving the optimization problem via the KKT (Karush-Kuhn-Tucker) conditions and the Sherman-Morrison-Woodbury formula, embodiments of the invention use an iterative algorithm to solve an unconstrained optimization problem, which makes WPSVM suitable for classifying relatively high dimensional data.
信息查询
0/0