Learning representations of generalized cross-modal entailment tasks

Invention Grant

US11250299B2 Learning representations of generalized cross-modal entailment tasks (In force)

Patent Title: Learning representations of generalized cross-modal entailment tasks
Application No.: US16668680

Application Date: 2019-10-30
Publication No.: US11250299B2

Publication Date: 2022-02-15
Inventor: Farley Lai, Asim Kadav, Ning Xie
Applicant: NEC Laboratories America, Inc.
Applicant Address: US NJ Princeton
Assignee: NEC Laboratories America, Inc.
Current Assignee: NEC Laboratories America, Inc.
Current Assignee Address: US NJ Princeton
Agent: Joseph Kolodka
Main IPC: G06K9/00
IPC: G06K9/00; G06K9/62; G06N3/04; G06N3/08; G06K9/46; G06K9/72; G06K9/32

Abstract:

A method is provided for determining entailment between an input premise and an input hypothesis of different modalities. The method includes extracting features from the input hypothesis and an entirety of and regions of interest in the input premise. The method further includes deriving intra-modal relevant information while suppressing intra-modal irrelevant information, based on intra-modal interactions between elementary ones of the features of the input hypothesis and between elementary ones of the features of the input premise. The method also includes attaching cross-modal relevant information from the features from the input premise to the features from the input hypothesis to form a cross-modal representation, based on cross-modal interactions between pairs of different elementary features from different modalities. The method additionally includes classifying a relationship between the input premise and the input hypothesis using a label selected from the group consisting of entailment, neutral, and contradiction based on the cross-modal representation.
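The steps named in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: it assumes that features from both modalities have already been extracted and projected to a shared dimension, it substitutes plain scaled dot-product attention for the intra-modal and cross-modal interaction steps, and it uses a hypothetical linear classifier with placeholder weights. All function names (`attend`, `mean_pool`, `classify`) are invented for illustration.

```python
import math
import random

LABELS = ["entailment", "neutral", "contradiction"]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(queries, keys):
    """Scaled dot-product attention; the keys also serve as values.

    Assumption: attention weights stand in for "deriving relevant and
    suppressing irrelevant information" (low weight = suppressed).
    """
    dim = len(keys[0])
    scale = math.sqrt(dim)
    out = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        out.append([sum(w * k[i] for w, k in zip(weights, keys))
                    for i in range(dim)])
    return out


def mean_pool(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def classify(hypothesis_feats, premise_feats, weights):
    """Return (label, probabilities) for one premise/hypothesis pair."""
    # Intra-modal interactions within each modality (self-attention).
    hyp = attend(hypothesis_feats, hypothesis_feats)
    prem = attend(premise_feats, premise_feats)
    # Cross-modal interactions: each hypothesis feature gathers premise info.
    cross = attend(hyp, prem)
    # Attach the premise-derived vector to each hypothesis feature (concat).
    fused = [h + c for h, c in zip(hyp, cross)]
    pooled = mean_pool(fused)
    probs = softmax([dot(w, pooled) for w in weights])
    return LABELS[probs.index(max(probs))], probs


# Hypothetical inputs: random vectors stand in for real extracted features.
rng = random.Random(0)
dim = 4
hypothesis = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(3)]
# Row 0 stands in for the whole premise, rows 1-4 for regions of interest.
premise = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(5)]
# Untrained placeholder classifier weights: 3 labels x (2 * dim) inputs.
weights = [[rng.uniform(-1, 1) for _ in range(2 * dim)] for _ in range(3)]

label, probs = classify(hypothesis, premise, weights)
print(label, [round(p, 3) for p in probs])
```

Because the weights are random placeholders, the printed label carries no meaning; the sketch only shows how data might flow from per-modality features to one of the three labels.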

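IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06K	Graphical data reading (image or video recognition or understanding G06V); presentation of data; record carriers; handling record carriers
G06K9/00	Methods or arrangements for recognising patterns (methods or arrangements for graph-reading or for converting the pattern of mechanical parameters, e.g. force or presence, into electrical signals G06K11/00) (image or video recognition or understanding G06V) (speech recognition G10L15/00)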