Adversarial training data augmentation data for text classifiers

Invention Grant

US11093707B2 Adversarial training data augmentation data for text classifiers (Active)

Patent Title: Adversarial training data augmentation data for text classifiers
Application No.: US16247620

Application Date: 2019-01-15
Publication No.: US11093707B2

Publication Date: 2021-08-17
Inventor: Ming Tan, Ruijian Wang, Inkit Padhi, Saloni Potdar
Applicant: International Business Machines Corporation
Applicant Address: US NY Armonk
Assignee: International Business Machines Corporation
Current Assignee: International Business Machines Corporation
Current Assignee Address: US NY Armonk
Agency: Lieberman & Brandsdorfer, LLC
Main IPC: G06F40/279
IPC: G06F40/279; G06F40/205; G06K9/62; G06F16/35; G06N3/08

Abstract:

An intelligent computer platform introduces adversarial training to natural language processing (NLP). An initial training set is modified with synthetic training data to create an adversarial training set. The modification uses natural language understanding (NLU) to parse the initial training set into components and to identify component categories. One or more paraphrase terms are identified with respect to the components and component categories, and function as replacement terms. The synthetic training data is effectively a merging of the initial training set with the replacement terms. As input is presented, a classifier leverages the adversarial training set to identify the intent of the input and to output a classification label to generate accurate and reflective response data.
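
The augmentation flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the whitespace tokenizer and the `CATEGORIES` lookup are hypothetical stand-ins for NLU parsing, the `PARAPHRASES` table is a hypothetical stand-in for a paraphrase source, and the function names `parse`, `synthesize`, and `augment` are assumptions made for illustration only.

```python
from itertools import product

# Hypothetical paraphrase table keyed by (term, category). A real system would
# draw replacement terms from a paraphrase resource, not a hard-coded dict.
PARAPHRASES = {
    ("buy", "verb"): ["purchase", "order"],
    ("ticket", "noun"): ["pass"],
}

# Hypothetical category lookup standing in for NLU parsing.
CATEGORIES = {"buy": "verb", "ticket": "noun"}


def parse(text):
    """Split text into (term, category) components; unknown terms get None."""
    return [(tok, CATEGORIES.get(tok)) for tok in text.lower().split()]


def synthesize(text):
    """Return variants of text with components swapped for paraphrase terms."""
    # Each position offers the original term followed by its replacements.
    options = [[tok] + PARAPHRASES.get((tok, cat), []) for tok, cat in parse(text)]
    variants = [" ".join(choice) for choice in product(*options)]
    return variants[1:]  # the first combination is the unmodified input


def augment(training_set):
    """Merge the initial (text, label) pairs with their synthetic variants."""
    augmented = list(training_set)
    for text, label in training_set:
        # Synthetic variants inherit the label (intent) of their source text.
        augmented.extend((variant, label) for variant in synthesize(text))
    return augmented


if __name__ == "__main__":
    for text, label in augment([("buy a ticket", "purchase_intent")]):
        print(label, "|", text)
```

Under these assumptions, the merged set keeps every original example and adds one labeled variant per combination of replacement terms; a classifier would then be trained on the merged set rather than on the initial set alone.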

Public/Granted literature

US20200226212A1 Adversarial Training Data Augmentation Data for Text Classifiers Public/Granted day: 2020-07-16

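IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: see G06N)
G06F40/00	Handling natural language data (speech analysis or synthesis, speech recognition: see G10L)
G06F40/20	. Natural language analysis (semantic analysis of natural language: see G06F40/30)
G06F40/279	.. Recognition of textual entities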