Generating embeddings in a multimodal embedding space for cross-lingual digital image retrieval

Invention Grant

US11734339B2 Generating embeddings in a multimodal embedding space for cross-lingual digital image retrieval (In force)

Patent Title: Generating embeddings in a multimodal embedding space for cross-lingual digital image retrieval
Application No.: US17075450

Application Date: 2020-10-20
Publication No.: US11734339B2

Publication Date: 2023-08-22
Inventor: Ajinkya Kale, Zhe Lin, Pranav Aggarwal
Applicant: Adobe Inc.
Applicant Address: US CA San Jose
Assignee: Adobe Inc.
Current Assignee: Adobe Inc.
Current Assignee Address: US CA San Jose
Agency: Keller Preece PLLC
Main IPC: G06F16/535
IPC: G06F16/535; G06F16/538; G06F16/242; G06F40/279; G06N3/08; G06N3/04; G06F18/21

Abstract:

The present disclosure relates to methods, systems, and non-transitory computer-readable media for retrieving digital images in response to queries. For example, in one or more embodiments, the disclosed systems receive a query comprising text and generate a cross-lingual-multimodal embedding for the text within a multimodal embedding space. The disclosed systems further identify an image embedding for a digital image that corresponds to (e.g., is relevant to) the text from the query based on an embedding distance between the image embedding and the cross-lingual-multimodal embedding for the text within the multimodal embedding space. Accordingly, the disclosed systems retrieve the digital image associated with the image embedding for display on a client device, such as the client device that submitted the query.
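The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`cosine_distance`, `retrieve`), the toy three-dimensional vectors, and the choice of cosine distance are all assumptions made for illustration, since the abstract only says "embedding distance". The sketch also assumes that text and image embeddings have already been produced by some encoder and that none of them is the zero vector.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def retrieve(query_embedding, image_embeddings, top_k=1):
    """Return the ids of the top_k images nearest to the query embedding."""
    ranked = sorted(
        image_embeddings,
        key=lambda image_id: cosine_distance(
            query_embedding, image_embeddings[image_id]
        ),
    )
    return ranked[:top_k]


# Hypothetical precomputed image embeddings in the shared space.
image_embeddings = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.1, 0.9, 0.1],
    "city.jpg": [0.0, 0.2, 0.9],
}

# Hypothetical text embeddings for the same concept in two languages.
# In a cross-lingual space they are expected to land close together.
query_en = [0.8, 0.2, 0.1]    # e.g. "beach"
query_de = [0.85, 0.15, 0.05]  # e.g. "Strand"

result_en = retrieve(query_en, image_embeddings)
result_de = retrieve(query_de, image_embeddings)
```

Because queries in different languages map to nearby points in the same space, both lookups rank the same image first without any translation step at query time. A production system would likely replace the linear scan with an approximate nearest-neighbor index.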

Public/Granted literature

US20220121702A1 GENERATING EMBEDDINGS IN A MULTIMODAL EMBEDDING SPACE FOR CROSS-LINGUAL DIGITAL IMAGE RETRIEVAL Public/Granted day: 2022-04-21

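IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: see G06N)
G06F16/00	Information retrieval; database structures; file system structures
G06F16/50	. Still image data
G06F16/53	.. Querying
G06F16/535	... Filtering based on additional data, e.g. user or group profiles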