Multimodal fine-grained mixing method and system, device, and storage medium

Granted invention patent

US11436451B2 Multimodal fine-grained mixing method and system, device, and storage medium (in force)


Patent title: Multimodal fine-grained mixing method and system, device, and storage medium
Application number: US17577099

Filing date: 2022-01-17
Publication (announcement) number: US11436451B2

Publication (announcement) date: 2022-09-06
Inventors: Qing Liao, Ye Ding, Binxing Fang, Xuan Wang
Applicants: Harbin Institute of Technology (Shenzhen) (Shenzhen Institute of Science and Technology Innovation, Harbin Institute of Technology), Dongguan University of Technology
Applicant addresses: CN Guangdong; CN Guangdong
Patentees: Harbin Institute of Technology (Shenzhen) (Shenzhen Institute of Science and Technology Innovation, Harbin Institute of Technology), Dongguan University of Technology
Current patentees: Harbin Institute of Technology (Shenzhen) (Shenzhen Institute of Science and Technology Innovation, Harbin Institute of Technology), Dongguan University of Technology
Current patentee addresses: CN Guangdong; CN Guangdong
Priority: CN202110094267.1 (2021-01-25)
Main classification: G06K9/62
IPC classification: G06K9/62


Abstract:

The present disclosure provides a multimodal fine-grained mixing method and system, a device, and a storage medium. The method includes: extracting data features from multimodal graphic and textual data, and obtaining each composition of the data features, the data features including a visual regional feature and a text word feature; performing fine-grained classification on modal information of each composition of the data features, to obtain classification results; and performing inter-modal and intra-modal information fusion on each composition according to the classification results, to obtain a fusion feature. The method enables a multimodal model to exploit the complementary characteristics of the multimodal data without being influenced by irrelevant information.
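
The three steps named in the abstract (feature extraction, per-component classification, classification-guided fusion) can be illustrated with a toy sketch. Nothing below is taken from the patent claims: the shared embedding dimension, the cosine-similarity threshold used as the "classifier", the labels `relevant`/`irrelevant`, the dot-product attention, and all function names (`classify`, `attend`, `fuse`) are assumptions chosen only to make the data flow concrete. Each "composition" is treated as one vector (an image region or a word) that has already been extracted and projected into a common space.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0


def attend(query, keys):
    """Softmax-weighted average of `keys`, scored by dot product with `query`."""
    scores = [dot(query, k) for k in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * k[i] for w, k in zip(weights, keys)) / total
            for i in range(len(query))]


def classify(components, other, threshold=0.5):
    """ASSUMED classifier: a component is 'relevant' if its best cosine
    similarity to any component of the other modality reaches `threshold`."""
    return ["relevant" if max(cosine(c, o) for o in other) >= threshold
            else "irrelevant"
            for c in components]


def fuse(regions, words, threshold=0.5):
    """Fuse region and word vectors into one feature vector.

    Every component gets intra-modal attention; only components labelled
    'relevant' additionally receive inter-modal attention (an assumption).
    """
    fused = []
    for own, other in ((regions, words), (words, regions)):
        for comp, label in zip(own, classify(own, other, threshold)):
            intra = attend(comp, own)
            if label == "relevant":
                inter = attend(comp, other)
                fused.append([(a + b) / 2 for a, b in zip(intra, inter)])
            else:
                fused.append(intra)
    # Mean-pool the per-component results into a single fusion feature.
    return [sum(v[i] for v in fused) / len(fused)
            for i in range(len(fused[0]))]


# Hypothetical 2-D features: two image regions and two words.
regions = [[1.0, 0.0], [0.0, 1.0]]
words = [[1.0, 0.1], [-1.0, 0.0]]
region_labels = classify(regions, words)
fusion_feature = fuse(regions, words)
```

In this sketch, components judged irrelevant to the other modality never take part in cross-modal mixing, which is one simple way to read the abstract's goal of using complementary information without interference from irrelevant information. A real system would presumably use learned feature extractors and a learned classifier in place of the fixed threshold.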

Publication/grant documents

US20220237420A1 MULTIMODAL FINE-GRAINED MIXING METHOD AND SYSTEM, DEVICE, AND STORAGE MEDIUM Publication/grant date: 2022-07-28


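IPC classification:

G	Physics
G06	Computing; calculating or counting
G06K	Graphical data reading (image or video recognition or understanding G06V); presentation of data; record carriers; handling record carriers
G06K9/00	Methods or arrangements for recognising patterns (methods or arrangements for graph-reading or for converting the pattern of mechanical parameters, e.g. force or presence, into electrical signals G06K11/00) (image or video recognition or understanding G06V) (speech recognition G10L15/00)
G06K9/62	. Methods or arrangements for recognition using electronic means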