Patent search ap:("GOOGLE LLC") AND inv:"Neel Joshi" Page 1

1.

Invention application
MULTIMODAL DIALOGS USING LARGE LANGUAGE MODEL(S) AND VISUAL LANGUAGE MODEL(S) (legal status: in force)

Publication No.: US20250005293A1

Publication date: 2025-01-02

Application No.: US18217313

Filing date: 2023-06-30

Applicant: GOOGLE LLC

Inventor: Tuan Nguyen, Sergei Volnov, William A. Truong, Yunfan Ye, Sana Mithani, Neel Joshi, Alexey Galata, Tzu-Chan Chuang, Liang-yu Chen, Qiong Huang, Krunal Shah, Sai Aditya Chitturu

IPC: G06F40/40, G06F40/30

Abstract: Implementations relate to leveraging large language model(s) (LLMs) and vision language model(s) (VLMs) to facilitate human-to-computer dialogs. In various implementations, one or more digital images may be processed using one or more VLMs to generate VLM output indicative of a state of an environment. An LLM prompt may be assembled based on the VLM output and a natural language input. The LLM prompt may be processed using one or more LLMs to generate content that is responsive to the natural language input. The content that is responsive to the natural language input may subsequently be rendered at one or more output devices.
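The flow described in the abstract (images to VLM output, VLM output plus natural language input to an LLM prompt, LLM prompt to rendered content) can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names, the prompt layout, and the use of plain callables as stand-ins for the VLM and LLM are all assumptions made for illustration, and no real model API is called.

```python
from typing import Callable, Sequence

# Hypothetical stand-ins (assumptions, not from the patent):
# a VLM maps raw image bytes to text, an LLM maps a prompt to text.
VisionLanguageModel = Callable[[bytes], str]
LargeLanguageModel = Callable[[str], str]


def describe_environment(images: Sequence[bytes], vlm: VisionLanguageModel) -> list[str]:
    """Process each digital image with the VLM and collect its textual output."""
    return [vlm(image) for image in images]


def assemble_prompt(vlm_output: Sequence[str], user_input: str) -> str:
    """Build one LLM prompt from the VLM output and the natural language input.

    The layout below is an assumed format chosen for readability.
    """
    observations = "\n".join(f"- {line}" for line in vlm_output)
    return (
        "Environment state:\n"
        f"{observations}\n\n"
        f"User: {user_input}\n"
        "Assistant:"
    )


def respond(
    images: Sequence[bytes],
    user_input: str,
    vlm: VisionLanguageModel,
    llm: LargeLanguageModel,
    render: Callable[[str], None] = print,
) -> str:
    """Run the full flow: VLM output, prompt assembly, LLM generation, rendering."""
    vlm_output = describe_environment(images, vlm)
    prompt = assemble_prompt(vlm_output, user_input)
    content = llm(prompt)
    render(content)  # stands in for rendering at an output device
    return content


if __name__ == "__main__":
    # Toy models so the sketch runs without any external service.
    fake_vlm = lambda image: f"an image of {len(image)} bytes showing a kitchen counter"
    fake_llm = lambda prompt: f"(reply generated from a {len(prompt)}-character prompt)"
    respond([b"\x00\x01"], "What can I cook with this?", fake_vlm, fake_llm)
```

Passing the models and the renderer in as callables keeps the three stages independently replaceable, which mirrors the abstract's "one or more VLMs", "one or more LLMs", and "one or more output devices" wording without committing to any particular model or device.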
