Publication No.: US20250087207A1
Publication Date: 2025-03-13
Application No.: US18736113
Filing Date: 2024-06-06
Applicant: Google LLC
Inventor: Harshit Kharbanda, Jessica Lee, Christopher James Kelley, Fabian Roth, Dounia Berrada, Samer Hassan Hassan, Afroz Mohiuddin, Misha Khalman, Ali Essam Ali Elqursh, Belinda Luna Zeng
IPC: G10L15/183, G06F16/583, G06V10/778, G06V30/14, G06V30/148, G10L15/22, G10L15/30
Abstract: The present disclosure provides computer-implemented methods, systems, and devices for responding to requests associated with an image. A computing system obtains an image, wherein the image depicts a first set of textual content. The computing system determines one or more characteristics of the first set of textual content. The computing system determines a response type from a plurality of response types based on the one or more characteristics. The computing system generates a model input, wherein the model input comprises data descriptive of the first set of textual content and a prompt associated with the response type. The computing system provides the model input as an input to a machine-learned language model. The computing system receives a second set of textual content as an output of the machine-learned language model as a result of the machine-learned language model processing the model input. The computing system provides the second set of textual content for display to a user, wherein the second set of textual content is associated with the response type.
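The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the method claimed in the application. Every name in it is hypothetical: the response types (`SUMMARIZE`, `ANSWER`, `EXPLAIN`), the characteristics (word count, whether the text ends in a question mark), the 50-word threshold, and the prompt strings are assumptions chosen only for illustration, since the abstract does not enumerate any of them. Text extraction from the image is assumed to have happened upstream, and the language model is replaced by a stand-in callable.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Tuple


class ResponseType(Enum):
    # Hypothetical response types; the abstract does not list them.
    SUMMARIZE = "summarize"
    ANSWER = "answer"
    EXPLAIN = "explain"


# Hypothetical prompt associated with each response type.
PROMPTS = {
    ResponseType.SUMMARIZE: "Summarize the following text.",
    ResponseType.ANSWER: "Answer the question in the following text.",
    ResponseType.EXPLAIN: "Explain the following text.",
}


@dataclass(frozen=True)
class Characteristics:
    # Assumed characteristics of the text depicted in the image.
    word_count: int
    is_question: bool


def determine_characteristics(text: str) -> Characteristics:
    stripped = text.strip()
    return Characteristics(len(stripped.split()), stripped.endswith("?"))


def determine_response_type(c: Characteristics) -> ResponseType:
    # Assumed rule of thumb: questions get answered, long text gets
    # summarized, everything else gets explained.
    if c.is_question:
        return ResponseType.ANSWER
    if c.word_count > 50:
        return ResponseType.SUMMARIZE
    return ResponseType.EXPLAIN


def build_model_input(text: str, response_type: ResponseType) -> str:
    # Model input = prompt for the response type + the extracted text.
    return f"{PROMPTS[response_type]}\n\n{text}"


def respond(image_text: str,
            model: Callable[[str], str]) -> Tuple[ResponseType, str]:
    response_type = determine_response_type(
        determine_characteristics(image_text))
    model_input = build_model_input(image_text, response_type)
    # The model's output is the second set of textual content.
    return response_type, model(model_input)


def echo_model(model_input: str) -> str:
    # Stand-in for a machine-learned language model.
    return f"[model output for {len(model_input)} chars]"
```

Calling `respond("What is the capital of France?", echo_model)` selects the hypothetical `ANSWER` type under these rules. A real system would replace `echo_model` with a call to an actual language model and would likely derive characteristics from the image as well as the text.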