Invention Grant
- Patent Title: Text classification by weighted proximal support vector machine based on positive and negative sample sizes and weights
- Patent Title (中): 基于正,负样本大小和权重的加权近端支持向量机进行文本分类
- Application No.: US11384889
- Application Date: 2006-03-20
- Publication No.: US07707129B2
- Publication Date: 2010-04-27
- Inventor: Dong Zhuang, Benyu Zhang, Zheng Chen, Hua-Jun Zeng, Jian Wang
- Applicant: Dong Zhuang, Benyu Zhang, Zheng Chen, Hua-Jun Zeng, Jian Wang
- Applicant Address: US WA Redmond
- Assignee: Microsoft Corporation
- Current Assignee: Microsoft Corporation
- Current Assignee Address: US WA Redmond
- Agency: Perkins Coie LLP
- Main IPC: G06F15/18
- IPC: G06F15/18; G06E1/00; G06E3/00

Abstract:
Embodiments of the invention relate to improvements to the support vector machine (SVM) classification model. When text data is significantly unbalanced (i.e., the numbers of positively and negatively labeled examples are highly disproportionate), the classification quality of a standard SVM deteriorates. Embodiments of the invention are directed to a weighted proximal SVM (WPSVM) model that achieves substantially the same accuracy as the traditional SVM model while requiring significantly less computational time. A WPSVM model in accordance with embodiments of the invention may include a weight for each training error and a method for estimating the weights, which automatically solves the unbalanced data problem. Instead of solving the optimization problem via the KKT (Karush-Kuhn-Tucker) conditions and the Sherman-Morrison-Woodbury formula, embodiments of the invention use an iterative algorithm to solve an unconstrained optimization problem, which makes WPSVM suitable for classifying relatively high-dimensional data.
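
The idea in the abstract, a per-example weight on each squared training error and an iterative solver for the resulting unconstrained problem, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the function name `train_wpsvm`, the weighting rule (each class receives the same total weight, so examples of the rarer class count more), the regularization parameter `nu`, and the choice of conjugate gradient as the iterative algorithm. The objective minimized is `nu/2 * (||w||^2 + b^2) + 1/2 * sum_i d_i * (1 - y_i * (w.x_i + b))^2`.

```python
import numpy as np

def train_wpsvm(X, y, nu=1.0, n_iter=200, tol=1e-20):
    """Illustrative weighted proximal SVM trained by conjugate gradient.

    X: (n, m) feature matrix; y: (n,) labels in {-1, +1}.
    Returns (w, b) for the decision function sign(X @ w + b).
    """
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    n_pos = np.sum(y == 1)
    n_neg = np.sum(y == -1)
    # Assumed weighting rule (hypothetical): both classes get total weight n/2.
    d = np.where(y == 1, n / (2.0 * n_pos), n / (2.0 * n_neg))

    # Append a constant column so the bias is learned as the last coefficient.
    A = np.hstack([X, np.ones((n, 1))])

    # Because y_i is +1 or -1, (1 - y_i * a_i.beta)^2 == (y_i - a_i.beta)^2,
    # so the minimizer satisfies (nu*I + A^T D A) beta = A^T D y.
    def matvec(v):
        # Applies the system matrix without ever forming it explicitly.
        return nu * v + A.T @ (d * (A @ v))

    rhs = A.T @ (d * y)
    beta = np.zeros(A.shape[1])
    r = rhs - matvec(beta)   # residual
    p = r.copy()             # search direction
    rs = r @ r
    rs0 = rs
    for _ in range(n_iter):
        if rs <= tol * rs0:  # squared relative residual small enough
            break
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        beta += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return beta[:-1], beta[-1]
```

Because the solver only needs matrix-vector products with the data matrix, the m-by-m system matrix is never built or inverted, which is the property that makes an iterative approach attractive for high-dimensional text features. For sparse document vectors, `X` could be replaced by a SciPy sparse matrix with small changes to how the bias column is appended.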
Public/Granted literature
- US20070239638A1 Text classification by weighted proximal support vector machine. Public/Granted day: 2007-10-11