Systems and methods for analyzing text extracted from images and performing appropriate transformations on the extracted text

Invention Grant

US12033620B1 Systems and methods for analyzing text extracted from images and performing appropriate transformations on the extracted text (In force)

Patent Title: Systems and methods for analyzing text extracted from images and performing appropriate transformations on the extracted text
Application No.: US18463951

Application Date: 2023-09-08
Publication No.: US12033620B1

Publication Date: 2024-07-09
Inventor: Harshit Kharbanda, Jessica Lee, Christopher James Kelley, Fabian Roth, Dounia Berrada, Samer Hassan Hassan, Afroz Mohiuddin, Mikhail Khalman, Ali Essam Ali Elqursh, Belinda Luna Zeng
Applicant: Google LLC
Applicant Address: US CA Mountain View
Assignee: GOOGLE LLC
Current Assignee: GOOGLE LLC
Current Assignee Address: US CA Mountain View
Agency: DORITY & MANNING P.A.
Main IPC: G06F3/0483
IPC: G06F3/0483; G06F16/30; G06F16/33; G06F16/583; G06V10/778; G06V30/14; G06V30/148; G10L15/183; G10L15/22; G10L15/30

Abstract:

The present disclosure provides computer-implemented methods, systems, and devices for responding to requests associated with an image. A computing system obtains an image, wherein the image depicts a first set of textual content. The computing system determines one or more characteristics of the first set of textual content. The computing system determines a response type from a plurality of response types based on the one or more characteristics. The computing system generates a model input, wherein the model input comprises data descriptive of the first set of textual content and a prompt associated with the response type. The computing system provides the model input as an input to a machine-learned language model. The computing system receives a second set of textual content as an output of the machine-learned language model as a result of the machine-learned language model processing the model input. The computing system provides the second set of textual content for display to a user, wherein the second set of textual content is associated with the response type.
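
The flow described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration only: the response types ("answer", "summarize", "explain"), the prompt wording, the two characteristics, the 50-word threshold, and the helper names are hypothetical and do not come from the patent. The language model is represented by any callable that maps a string to a string, and text extraction from the image is assumed to have already happened.

```python
from typing import Callable

# Hypothetical response types and prompts; the abstract does not enumerate them.
PROMPTS = {
    "answer": "Answer the question in the following text:",
    "summarize": "Summarize the following text:",
    "explain": "Explain the following text in simple terms:",
}


def characterize(text: str) -> dict:
    """Compute illustrative characteristics of the first set of textual content."""
    return {
        "word_count": len(text.split()),
        "is_question": text.strip().endswith("?"),
    }


def choose_response_type(traits: dict) -> str:
    """Pick one response type from the available ones (assumed rules)."""
    if traits["is_question"]:
        return "answer"
    if traits["word_count"] > 50:  # arbitrary threshold, chosen for the sketch
        return "summarize"
    return "explain"


def build_model_input(text: str, response_type: str) -> str:
    """Combine the prompt for the response type with the extracted text."""
    return f"{PROMPTS[response_type]}\n\n{text}"


def respond(text: str, model: Callable[[str], str]) -> tuple[str, str]:
    """Run the full flow and return (response type, second set of text)."""
    response_type = choose_response_type(characterize(text))
    return response_type, model(build_model_input(text, response_type))


if __name__ == "__main__":
    # Stand-in for a machine-learned language model: it only reports input size.
    def fake_model(model_input: str) -> str:
        return f"[model output for {len(model_input)} input characters]"

    print(respond("What is the boiling point of water?", fake_model))
```

Selecting the prompt from the characteristics before calling the model keeps the selection logic testable without a model; a real system would likely use richer characteristics than the two shown here.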

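IPC classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models G06N)
G06F3/00	Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
G06F3/01	.Input arrangements or combined input and output arrangements for interaction between user and computer (G06F3/16 takes precedence)
G06F3/048	..Interaction techniques based on graphical user interfaces
G06F3/0481	...Based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
G06F3/0483	....Interaction with page-structured environments, e.g. book metaphor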