DATA AUGMENTATION AND BATCH BALANCING METHODS TO ENHANCE NEGATION AND FAIRNESS

Invention Publication

US20230153528A1 DATA AUGMENTATION AND BATCH BALANCING METHODS TO ENHANCE NEGATION AND FAIRNESS Pending - Published

Patent Title: DATA AUGMENTATION AND BATCH BALANCING METHODS TO ENHANCE NEGATION AND FAIRNESS
Application No.: US17984743

Application Date: 2022-11-10
Publication No.: US20230153528A1

Publication Date: 2023-05-18
Inventor: Duy Vu, Varsha Kuppur Rajendra, Dai Hoang Tran, Shivashankar Subramanian, Poorya Zaremoodi, Thanh Long Duong, Mark Edward Johnson
Applicant: Oracle International Corporation
Applicant Address: US CA Redwood Shores
Assignee: Oracle International Corporation
Current Assignee: Oracle International Corporation
Current Assignee Address: US CA Redwood Shores
Main IPC: G06F40/279
IPC: G06F40/279; G06F40/166; G06N5/02

Abstract:

Techniques are provided for augmentation and batch balancing of training data to enhance negation and fairness of a machine learning model. In one particular aspect, a method is provided that includes generating a list of demographic words associated with a demographic group; searching an unlabeled corpus of text to identify unlabeled examples in a target domain that comprise at least one demographic word from the list; rewriting the unlabeled examples to create one or more versions of each example and thereby generate a fairness invariance data set; and training the machine learning model using unlabeled examples from the fairness invariance data set.
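The data-preparation steps named in the abstract (word list, corpus search, rewriting) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method claimed in the application: the word list, the sample corpus, and the function names `find_examples`, `rewrite_example`, and `build_fairness_invariance_set` are all hypothetical, the rewrite is a naive whole-word swap, and the training step is omitted.

```python
import re

# Hypothetical word list for one demographic attribute (illustrative only).
DEMOGRAPHIC_WORDS = ["man", "woman"]


def _word_pattern(words):
    # Whole-word, case-insensitive match for any listed word.
    alternatives = "|".join(re.escape(w) for w in words)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def find_examples(corpus, words):
    """Return the unlabeled sentences that contain at least one listed word."""
    pattern = _word_pattern(words)
    return [sentence for sentence in corpus if pattern.search(sentence)]


def rewrite_example(sentence, words):
    """Return versions of the sentence with matched words swapped for another listed word."""
    pattern = _word_pattern(words)
    versions = []
    for replacement in words:
        def swap(match, replacement=replacement):
            # Preserve a leading capital letter from the original word.
            if match.group(0)[0].isupper():
                return replacement.capitalize()
            return replacement

        version = pattern.sub(swap, sentence)
        # Keep only versions that differ from the original and are not duplicates.
        if version != sentence and version not in versions:
            versions.append(version)
    return versions


def build_fairness_invariance_set(corpus, words):
    """Pair each matching unlabeled example with its rewritten versions."""
    return [(s, rewrite_example(s, words)) for s in find_examples(corpus, words)]


# Assumed toy corpus standing in for an unlabeled target-domain corpus.
corpus = [
    "A woman asked about a refund.",
    "What is the weather today?",
    "Can the man reset the password?",
]
fairness_set = build_fairness_invariance_set(corpus, DEMOGRAPHIC_WORDS)
```

Each entry in `fairness_set` groups an original example with its rewritten versions, which is the shape a training step would need if it were to encourage the same prediction across all versions of an example. A realistic rewrite would also have to handle pronouns and agreement, which this sketch does not attempt.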

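IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: see G06N)
G06F40/00	Handling natural language data (speech analysis or synthesis, speech recognition: see G10L)
G06F40/20	.Natural language analysis (semantic analysis of natural language: see G06F40/30)
G06F40/279	..Recognition of textual entities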