Reclassification of training data to improve classifier accuracy

Invention Grant

US09342588B2 Reclassification of training data to improve classifier accuracy (in force)

Patent Title: Reclassification of training data to improve classifier accuracy
Application No.: US11764291

Application Date: 2007-06-18
Publication No.: US09342588B2

Publication Date: 2016-05-17
Inventor: Rajesh Balchandran, Linda M. Boyer, Gregory Purdy
Applicant: Rajesh Balchandran, Linda M. Boyer, Gregory Purdy
Applicant Address: US NY Armonk
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Current Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Current Assignee Address: US NY Armonk
Agency: Cuenot, Forsythe & Kim, LLC
Main IPC: G06F17/27
IPC: G06F17/27 ; G06F17/30

Abstract:

A method of creating a statistical classification model for a classifier within a natural language understanding system can include processing training data using an existing statistical classification model. Sentences of the training data correctly classified into a selected class of the statistical classification model can be selected. The selected sentences of the training data can be assigned to a fringe group or a core group according to confidence score. The training data can be updated by associating the fringe group with a fringe subclass of the selected class and the core group with a core subclass of the selected class. A new statistical classification model can be built from the updated training data. The new statistical classification model can be output.
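
The relabeling step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the fixed confidence `threshold`, the `_core` and `_fringe` label suffixes, the `reclassify` function name, and the `toy_classifier` stand-in for the existing statistical classification model are all assumptions made for the example. The abstract also does not say how misclassified sentences are handled; here they simply keep their original label.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (sentence, class label)
Classifier = Callable[[str], Tuple[str, float]]  # sentence -> (label, confidence)


def reclassify(
    training_data: List[Example],
    classify: Classifier,
    selected_class: str,
    threshold: float,
) -> List[Example]:
    """Relabel correctly classified sentences of one class as core or fringe."""
    updated = []
    for sentence, label in training_data:
        predicted, confidence = classify(sentence)
        if label == selected_class and predicted == label:
            # Correctly classified into the selected class: split by confidence.
            # (Assumption: a single fixed threshold separates core from fringe.)
            group = "core" if confidence >= threshold else "fringe"
            updated.append((sentence, f"{selected_class}_{group}"))
        else:
            # Assumption: other classes and misclassified sentences are unchanged.
            updated.append((sentence, label))
    return updated


# Hypothetical stand-in for the existing statistical classification model.
def toy_classifier(sentence: str) -> Tuple[str, float]:
    scores = {
        "pay my bill": ("billing", 0.95),
        "question about charges": ("billing", 0.55),
        "reset my password": ("support", 0.90),
    }
    return scores[sentence]


data = [
    ("pay my bill", "billing"),
    ("question about charges", "billing"),
    ("reset my password", "support"),
]
updated_data = reclassify(data, toy_classifier, "billing", threshold=0.7)
```

A new statistical classification model would then be trained on `updated_data` with whatever trainer the natural language understanding system already uses; that step is left out here because the abstract does not specify it.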

Published/Granted literature

US20080312906A1 Reclassification of Training Data to Improve Classifier Accuracy (publication date: 2008-12-18)
