Machine Learning Based Document Visual Element Extraction

    Publication number: US20230419020A1

    Publication date: 2023-12-28

    Application number: US17808293

    Filing date: 2022-06-22

    Applicant: Google LLC

    CPC classification number: G06F40/174 G06V30/412 G06V30/19

    Abstract: A method includes obtaining a document with textual fields and a visual element. For each textual field, the method includes determining a textual offset for the textual field that indicates a location of the textual field relative to each other textual field in the document. The method includes detecting, using a machine learning vision model, the visual element and determining a visual element offset indicating a location of the visual element relative to each textual field in the document. The method includes assigning the visual element a visual element anchor token and inserting the visual element anchor token into the textual fields in an order based on the visual element offset and the respective textual offsets. The method also includes, after inserting the visual element anchor token, extracting, using a text-based extraction model, from the textual fields, structured entities representing the series of textual fields and the visual element.
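
The anchor-token insertion step in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not in the filing: each offset is reduced to a single reading-order number, and the names `Field`, `insert_anchor_tokens`, and the token string `[CHECKBOX_0]` are hypothetical. The vision model and the text-based extraction model are not shown; only the ordering logic is.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Field:
    """A textual field with a simplified scalar offset (assumption:
    smaller offset means earlier in reading order)."""
    text: str
    offset: int


def insert_anchor_tokens(
    fields: List[Field],
    visual_elements: List[Tuple[str, int]],
) -> List[str]:
    """Merge anchor tokens into the field sequence by offset.

    visual_elements holds (anchor_token, offset) pairs, where the offset
    is assumed to come from an upstream detector (not shown here).
    """
    # Rank 0 for text and 1 for anchors, so a textual field precedes an
    # anchor token that shares the same offset.
    items = [(f.offset, 0, f.text) for f in fields]
    items += [(offset, 1, token) for token, offset in visual_elements]
    items.sort(key=lambda item: (item[0], item[1]))
    return [text for _, _, text in items]


fields = [Field("Name:", 0), Field("Date:", 20)]
sequence = insert_anchor_tokens(fields, [("[CHECKBOX_0]", 10)])
print(sequence)
```

The merged sequence is what a text-only extraction model would then consume, so the visual element is represented by its token at the position its offset implies.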

Image Compression and Reconstruction Using Machine Learning Models

    Publication number: US20250069270A1

    Publication date: 2025-02-27

    Application number: US18724026

    Filing date: 2022-01-24

    Applicant: Google LLC

    Abstract: A method includes obtaining image data, identifying a machine learning-compressible (ML-compressible) portion of the image data, and determining a location of the ML-compressible portion within the image data. The method also includes selecting, from a plurality of ML compression models, an ML compression model for the ML-compressible portion based on an image content thereof, and generating, based on the ML-compressible portion and by the ML compression model, an ML-compressed representation of the ML-compressible portion. The method further includes generating a compressed image data file that includes the ML-compressed representation and the location of the ML-compressible portion, and outputting the compressed image data file. The compressed image data file is configured to cause an ML decompression model corresponding to the ML compression model to generate a reconstruction of the ML-compressible portion of the image data based on the ML-compressed representation.
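
The flow described in the abstract (select a model by content, compress the portion, store the result together with its location, then reconstruct with the corresponding model) can be sketched as follows. This is an illustration under stated assumptions, not the filing's implementation: `zlib` stands in for the learned compression and decompression models (and is lossless, whereas learned codecs are typically lossy), and the names `Codec`, `CODECS`, `CompressedFile`, `compress_portion`, and `reconstruct`, as well as the content labels, are hypothetical.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Codec:
    """A paired compressor and decompressor (stand-in for an ML model pair)."""
    compress: Callable[[bytes], bytes]
    decompress: Callable[[bytes], bytes]


# Hypothetical registry keyed by content label. zlib at different levels
# is only a placeholder for content-specific learned models.
CODECS: Dict[str, Codec] = {
    "text": Codec(lambda b: zlib.compress(b, 9), zlib.decompress),
    "face": Codec(lambda b: zlib.compress(b, 1), zlib.decompress),
}


@dataclass
class CompressedFile:
    """Compressed representation plus the metadata needed to reconstruct it."""
    model_id: str                        # which codec produced the payload
    location: Tuple[int, int, int, int]  # (x, y, width, height), assumed layout
    payload: bytes


def compress_portion(
    portion: bytes, content_label: str, location: Tuple[int, int, int, int]
) -> CompressedFile:
    """Select a codec by content label and compress the portion with it."""
    codec = CODECS[content_label]  # raises KeyError for an unknown label
    return CompressedFile(content_label, location, codec.compress(portion))


def reconstruct(compressed: CompressedFile) -> bytes:
    """Look up the decompressor matching the stored model id and apply it."""
    return CODECS[compressed.model_id].decompress(compressed.payload)


portion = b"ABCD" * 64
compressed = compress_portion(portion, "text", (10, 20, 16, 16))
restored = reconstruct(compressed)
print(compressed.model_id, compressed.location, len(compressed.payload))
```

Storing the model identifier next to the payload is one way to let the decoder pick the decompression model that corresponds to the compression model; the filing may encode this correspondence differently.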
