Polar relative distance transformer
Abstract:
A system can comprise a processor that can facilitate performance of operations, comprising: accessing a document comprising a plurality of text bounding boxes, wherein each respective text bounding box of the plurality of text bounding boxes comprises respective text; for each respective text bounding box, determining respective text bounding box coordinates and respective text bounding box input embeddings; based on the respective text bounding box coordinates, determining respective text bounding box positional encodings for each respective text bounding box; based on a transformer-based deep learning model applied to the respective text bounding box input embeddings, respective text bounding box coordinates, respective text bounding box positional encodings, and bias information representative of a modification to an attention weight of the transformer-based deep learning model, determining respective output embeddings for each respective text bounding box; and, based on the respective output embeddings, generating respective bounding box labels for each respective bounding box.
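The abstract does not say how the bias information is computed or how it modifies the attention weight. As an illustration only, the sketch below assumes one plausible reading suggested by the title: the bias is derived from the polar relative position (distance and angle) between bounding box centers and is added to the scaled dot-product score before the softmax. The function names, the `(x0, y0, x1, y1)` box format, the additive form, and the default distance penalty are all hypothetical and are not taken from the patent.

```python
import math

def box_center(box):
    """Center (cx, cy) of a box given as (x0, y0, x1, y1). Box format is an assumption."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)

def polar_relative(boxes):
    """Pairwise (distance, angle) from the center of box i to the center of box j."""
    centers = [box_center(b) for b in boxes]
    rel = []
    for cx_i, cy_i in centers:
        row = []
        for cx_j, cy_j in centers:
            dx, dy = cx_j - cx_i, cy_j - cy_i
            row.append((math.hypot(dx, dy), math.atan2(dy, dx)))
        rel.append(row)
    return rel

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def default_bias(distance, angle):
    """Hypothetical bias: penalize far-away boxes; the angle is ignored here."""
    return -0.01 * distance

def biased_attention(queries, keys, values, boxes, bias_fn=default_bias):
    """Single-head attention whose scores are shifted by a polar-position bias.

    The additive form (score + bias) is an assumption, not the patented method.
    """
    dim = len(queries[0])
    rel = polar_relative(boxes)
    outputs, weights = [], []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            dot = sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            distance, angle = rel[i][j]
            scores.append(dot + bias_fn(distance, angle))
        w = softmax(scores)
        weights.append(w)
        outputs.append([
            sum(w[j] * values[j][c] for j in range(len(values)))
            for c in range(len(values[0]))
        ])
    return outputs, weights

# Three boxes: two adjacent on one line, one far away on the page.
boxes = [(0, 0, 10, 10), (12, 0, 22, 10), (200, 300, 210, 310)]
# Identical toy embeddings, so any difference in attention comes from the bias.
embeddings = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
rel = polar_relative(boxes)
outputs, weights = biased_attention(embeddings, embeddings, embeddings, boxes)
```

Because the toy embeddings are identical, the attention weights for the first box differ only through the bias, so nearer boxes receive more weight than the distant one. In a trained model the bias function would presumably be learned and could also use the angle; both points are assumptions here.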