Visual and Audio Multimodal Searching System

Invention Publication

US20240362279A1 Visual and Audio Multimodal Searching System (Under examination, published)

Patent Title: Visual and Audio Multimodal Searching System
Application No.: US18306638

Application Date: 2023-04-25
Publication No.: US20240362279A1

Publication Date: 2024-10-31
Inventor: Harshit Kharbanda, Belinda Luna Zeng, Viviana Caso Corella, Christopher James Kelley, Jessica Lee, Pendar Yousefi, Dounia Berrada, Sundeep Vaddadi, Kai Yu, Balint Miklos, Severin Heiniger, Louis Wang
Applicant: Google LLC
Applicant Address: US CA Mountain View
Assignee: Google LLC
Current Assignee: Google LLC
Current Assignee Address: US CA Mountain View
Main IPC: G06F16/9532
IPC: G06F16/9532; G06F16/538; G06F40/40

Abstract:

A multimodal search system is described. The system can receive image data captured by a camera of a user device. Additionally, the system can receive audio data associated with the image data. The audio data can be captured by a microphone of the user device. Moreover, the system can process the image data to generate visual features. Furthermore, the system can process the audio data to generate a plurality of words. The system can generate a plurality of search terms based on the plurality of words and the visual features. Subsequently, the system can determine one or more search results associated with the plurality of search terms and provide the one or more search results as an output.
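
The flow in the abstract (image to visual features, audio to words, both combined into search terms, terms to results) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the publication: `extract_visual_features` and `transcribe` are stubs standing in for an image model and a speech recognizer, `STOPWORDS` is an arbitrary filter, and `search` ranks a toy in-memory index by keyword overlap. The abstract does not say how the terms are combined or how results are ranked, so those choices are assumptions.

```python
from dataclasses import dataclass

# Hypothetical filter for filler words in the spoken query (an assumption).
STOPWORDS = {"a", "an", "the", "in", "of", "this", "that", "is", "find", "me"}


@dataclass(frozen=True)
class Document:
    title: str
    keywords: frozenset


def extract_visual_features(image_labels):
    """Stub: treat the given labels as if an image model had produced them."""
    return [label.lower() for label in image_labels]


def transcribe(audio_transcript):
    """Stub: treat the given text as speech-recognizer output and split it into words."""
    return [w.strip(".,?!").lower() for w in audio_transcript.split()]


def build_search_terms(words, visual_features):
    """Combine visual features with the non-stopword spoken words, without duplicates."""
    terms = []
    for term in visual_features + [w for w in words if w not in STOPWORDS]:
        if term and term not in terms:
            terms.append(term)
    return terms


def search(terms, index):
    """Rank documents by how many search terms appear in their keywords."""
    term_set = set(terms)
    scored = [(len(doc.keywords & term_set), doc) for doc in index]
    scored = [(score, doc) for score, doc in scored if score > 0]
    scored.sort(key=lambda pair: -pair[0])  # stable sort: ties keep index order
    return [doc.title for _, doc in scored]


# Toy index and query: a photo of a dress plus the spoken words "Find this in green".
index = [
    Document("Red running shoes", frozenset({"shoe", "red", "running"})),
    Document("Green dress", frozenset({"dress", "green"})),
    Document("Blue dress", frozenset({"dress", "blue"})),
]
features = extract_visual_features(["Dress"])
words = transcribe("Find this in green")
terms = build_search_terms(words, features)
results = search(terms, index)
print(terms)
print(results)
```

In this toy run the spoken words refine the visual query: the image alone would match both dresses equally, while the word "green" pushes the green dress to the top.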

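IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models are classified in G06N)
G06F16/00	Information retrieval; database structures therefor; file system structures therefor
G06F16/90	. Details of database functions independent of the retrieved data types
G06F16/95	.. Retrieval from the web
G06F16/953	... Querying, e.g. by the use of web search engines
G06F16/9532	.... Query formulation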