Publication No.: US20240153297A1
Publication date: 2024-05-09
Application No.: US18501982
Filing date: 2023-11-03
Applicant: Google LLC
Inventor: Zizhao Zhang, Zifeng Wang, Vincent Perot, Jacob Devlin, Chen-Yu Lee, Guolong Su, Hao Zhang, Tomas Jon Pfister
IPC: G06V30/24, G06F16/21, G06V30/19, G06V30/412
CPC classification number: G06V30/24, G06F16/211, G06V30/19147, G06V30/412
Abstract: A method for extracting entities comprises obtaining a document that includes a series of textual fields that includes a plurality of entities. Each entity represents information associated with a predefined category. The method includes generating, using the document, a series of tokens representing the series of textual fields. The method includes generating an entity prompt that includes the series of tokens and one of the plurality of entities and generating a schema prompt that includes a schema associated with the document. The method includes generating a model query that includes the entity prompt and the schema prompt and determining, using an entity extraction model and the model query, a location of the one of the plurality of entities among the series of tokens. The method includes extracting, from the document, the one of the plurality of entities using the location of the one of the plurality of entities.
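
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names, the prompt layout, and the whitespace tokenizer are all assumptions made for illustration, and `toy_model` is a hypothetical rule-based stand-in for the entity extraction model, which the abstract does not describe.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # [start, end) token indices; (-1, -1) means "not found"


def tokenize(text_fields: List[str]) -> List[str]:
    """Turn the series of textual fields into one series of tokens (assumed: whitespace split)."""
    return [tok for field in text_fields for tok in field.split()]


def build_entity_prompt(tokens: List[str], entity: str) -> str:
    """Entity prompt: the token series plus the one entity being requested."""
    # Each token carries its index so a location can be reported as token positions.
    indexed = " ".join(f"{i}|{tok}" for i, tok in enumerate(tokens))
    return f"entity: {entity}\ntokens: {indexed}"


def build_schema_prompt(schema: List[str]) -> str:
    """Schema prompt: the schema associated with the document (assumed: a list of entity names)."""
    return "schema: " + ", ".join(schema)


def build_model_query(entity_prompt: str, schema_prompt: str) -> str:
    """Model query: the schema prompt and the entity prompt combined."""
    return schema_prompt + "\n" + entity_prompt


def toy_model(query: str) -> Span:
    """Hypothetical stand-in for the entity extraction model.

    Finds the token that matches the entity name (ignoring case and a trailing
    colon) and reports the single token that follows it as the entity location.
    """
    fields = dict(line.split(": ", 1) for line in query.splitlines())
    entity = fields["entity"].lower()
    for item in fields["tokens"].split(" "):
        idx, tok = item.split("|", 1)
        if tok.rstrip(":").lower() == entity:
            return int(idx) + 1, int(idx) + 2
    return -1, -1


def extract_entity(
    text_fields: List[str],
    entity: str,
    schema: List[str],
    model: Callable[[str], Span] = toy_model,
) -> str:
    """Run the pipeline: tokens -> prompts -> query -> location -> extracted text."""
    tokens = tokenize(text_fields)
    query = build_model_query(
        build_entity_prompt(tokens, entity), build_schema_prompt(schema)
    )
    start, end = model(query)
    return " ".join(tokens[start:end]) if start >= 0 else ""


if __name__ == "__main__":
    doc = ["Invoice: INV-42", "Total: 19.99", "Date: 2023-11-03"]
    print(extract_entity(doc, "Total", ["Invoice", "Total", "Date"]))
```

Having the model return a token span rather than free text mirrors the abstract's emphasis on determining a location among the tokens and then extracting from the document, so the extracted value is always copied from the source text.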