FINDING NATURAL IMAGES IN DOCUMENT PAGES

    Publication No.: US20220198185A1

    Publication date: 2022-06-23

    Application No.: US17127174

    Application date: 2020-12-18

    Inventor: Tim Prebble

    Abstract: An image processing method includes: generating, from combined connected components (CCs) of a document image, candidate text CCs, candidate background CCs, and candidate natural image CCs where the candidate background CCs are excluded from the combined CCs to generate the candidate natural image CCs with a predetermined criterion dependent on the candidate text CCs; generating a final natural image bounding box by expanding a candidate natural image bounding box of the candidate natural image CCs and including in the expanded candidate natural image bounding box at least one combined CC that intersects the expanded candidate natural image bounding box; and modifying, based on the final natural image bounding box, the document image and displaying the modified document image to a user.
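
The box-growing step in this abstract can be illustrated with a small sketch. It is not the patented implementation: the fixed pixel margin, the single pass over the connected components, and the names `expand`, `intersects`, `union`, and `final_natural_image_box` are all assumptions made for illustration. Boxes are `(left, top, right, bottom)` with exclusive right and bottom edges.

```python
from typing import Iterable, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def expand(box: Box, margin: int, width: int, height: int) -> Box:
    """Grow a box by `margin` pixels on every side, clamped to the page."""
    left, top, right, bottom = box
    return (max(0, left - margin), max(0, top - margin),
            min(width, right + margin), min(height, bottom + margin))


def intersects(a: Box, b: Box) -> bool:
    """True when the two boxes share at least one pixel."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def union(a: Box, b: Box) -> Box:
    """Smallest box that contains both inputs."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def final_natural_image_box(candidate: Box, cc_boxes: Iterable[Box],
                            margin: int, width: int, height: int) -> Box:
    """Expand the candidate box, then absorb every CC box that intersects it.

    Assumption: one pass, with intersection tested against the expanded
    candidate box only (not against the growing result).
    """
    expanded = expand(candidate, margin, width, height)
    final = expanded
    for cc in cc_boxes:
        if intersects(expanded, cc):
            final = union(final, cc)
    return final
```

For example, a candidate box `(10, 10, 20, 20)` expanded by 5 pixels on a 100 x 100 page absorbs a nearby component at `(24, 12, 30, 18)` but ignores a distant one at `(50, 50, 60, 60)`.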

    Extracting text from an image
    Invention patent application

    Publication No.: US20230094651A1

    Publication date: 2023-03-30

    Application No.: US17490770

    Application date: 2021-09-30

    Inventor: Tim Prebble

    Abstract: A method for extracting text from an input image and generating a document includes: generating an edges mask from the input image; generating an edges image that is derived from the edges mask; identifying, within the edges mask, one or more probable text areas; extracting a first set of text characters by performing a first optical character recognition (OCR) operation on each of one or more probable text portions, of the derived edges image, corresponding to each of the probable text areas; generating a modified image by erasing, from the input image, image characters corresponding to the first set of text characters extracted by the first OCR operation; and generating a document by overlaying the extracted first set of text characters on the modified image.
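
The sequence of steps in this abstract (edges mask, edges image, probable text areas, OCR, erase, overlay) can be sketched on a toy grayscale grid. Everything specific here is an assumption rather than the claimed method: the neighbour-difference edge test, the row-band grouping of text areas, the `ocr` callable (a hypothetical stand-in for a real OCR engine), and the representation of the output document as a modified image plus a list of `(box, text)` overlays.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale rows, 0 (black) to 255 (white)
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def edges_mask(img: Image, threshold: int = 64, background: int = 255) -> List[List[bool]]:
    """Flag pixels that differ sharply from any 4-neighbour.

    Pixels outside the image are treated as background.
    """
    h, w = len(img), len(img[0])

    def at(y: int, x: int) -> int:
        return img[y][x] if 0 <= y < h and 0 <= x < w else background

    return [
        [
            any(abs(img[y][x] - at(y + dy, x + dx)) > threshold
                for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)))
            for x in range(w)
        ]
        for y in range(h)
    ]


def probable_text_areas(mask: List[List[bool]]) -> List[Box]:
    """Assumed heuristic: one box per run of consecutive rows containing edges."""
    boxes, band = [], []
    sentinel = [False] * len(mask[0])  # closes a band that reaches the last row
    for y, row in enumerate(mask + [sentinel]):
        if any(row):
            band.append(y)
        elif band:
            xs = [x for yy in band for x, hit in enumerate(mask[yy]) if hit]
            boxes.append((min(xs), band[0], max(xs) + 1, band[-1] + 1))
            band = []
    return boxes


def extract_text(img: Image, ocr: Callable[[Image], str], background: int = 255):
    """Return (modified image, overlays) where overlays pair each box with its text."""
    mask = edges_mask(img, background=background)
    edges_img = [[0 if hit else 255 for hit in row] for row in mask]  # derived edges image
    modified = [row[:] for row in img]  # leave the input untouched
    overlays = []
    for left, top, right, bottom in probable_text_areas(mask):
        crop = [row[left:right] for row in edges_img[top:bottom]]
        text = ocr(crop)  # OCR runs on the edges image, not on the input image
        if text:
            for y in range(top, bottom):  # erase the recognised characters
                modified[y][left:right] = [background] * (right - left)
            overlays.append(((left, top, right, bottom), text))
    return modified, overlays
```

Passing the OCR engine in as a callable keeps the sketch independent of any particular library. A caller would wrap a real engine in a function that accepts a cropped edges image and returns the recognised string.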

    Extracting text from an image
    Granted invention patent

    Publication No.: US12062246B2

    Publication date: 2024-08-13

    Application No.: US17490770

    Application date: 2021-09-30

    Inventor: Tim Prebble

    CPC classification number: G06V30/18 G06F18/2431 G06T7/11 G06T7/13 G06T11/00

    Abstract: A method for extracting text from an input image and generating a document includes: generating an edges mask from the input image; generating an edges image that is derived from the edges mask; identifying, within the edges mask, one or more probable text areas; extracting a first set of text characters by performing a first optical character recognition (OCR) operation on each of one or more probable text portions, of the derived edges image, corresponding to each of the probable text areas; generating a modified image by erasing, from the input image, image characters corresponding to the first set of text characters extracted by the first OCR operation; and generating a document by overlaying the extracted first set of text characters on the modified image.

    Finding natural images in document pages

    Publication No.: US11721119B2

    Publication date: 2023-08-08

    Application No.: US17127174

    Application date: 2020-12-18

    Inventor: Tim Prebble

    CPC classification number: G06V30/413 G06T11/20 G06T2210/12

    Abstract: An image processing method includes: generating, from combined connected components (CCs) of a document image, candidate text CCs, candidate background CCs, and candidate natural image CCs where the candidate background CCs are excluded from the combined CCs to generate the candidate natural image CCs with a predetermined criterion dependent on the candidate text CCs; generating a final natural image bounding box by expanding a candidate natural image bounding box of the candidate natural image CCs and including in the expanded candidate natural image bounding box at least one combined CC that intersects the expanded candidate natural image bounding box; and modifying, based on the final natural image bounding box, the document image and displaying the modified document image to a user.

    Background noise reduction using a variable range of color values dependent upon the initial background color distribution

    Publication No.: US11069043B1

    Publication date: 2021-07-20

    Application No.: US16818089

    Application date: 2020-03-13

    Inventor: Tim Prebble

    Abstract: A method to reduce background noise in a document image. The method includes extracting, from the document image, a connected component corresponding to a background of the document image, generating a histogram of pixel values of the connected component, generating, using a non-linear mapping function based on the histogram, a non-linear probability distribution of the pixel values in the connected component, generating, based at least on a comparison between the non-linear probability distribution and a predetermined threshold, a replacement range of the pixel values, selecting, from the connected component, a pixel having a pixel value within the replacement range, and converting the pixel value of the pixel to a uniform background color.
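
A rough sketch of the histogram-to-replacement-range idea in this abstract follows. The abstract does not specify the non-linear mapping function, so the power-law mapping `(count / peak) ** gamma`, the default `gamma` and `threshold` values, and the choice of the most frequent pixel value as the uniform background color are all assumptions. The function names are hypothetical. Pixels are given as a flat list of grayscale values from the background connected component.

```python
from collections import Counter
from typing import List, Tuple


def replacement_range(pixels: List[int], gamma: float = 0.5,
                      threshold: float = 0.2) -> Tuple[int, int]:
    """Return (lowest, highest) pixel value whose mapped probability meets the threshold."""
    hist = Counter(pixels)                      # histogram of pixel values
    peak = max(hist.values())
    # Assumed non-linear mapping: a power law on the peak-normalised counts.
    prob = {value: (count / peak) ** gamma for value, count in hist.items()}
    kept = [value for value, p in prob.items() if p >= threshold]
    return min(kept), max(kept)


def flatten_background(pixels: List[int], gamma: float = 0.5,
                       threshold: float = 0.2) -> List[int]:
    """Convert every pixel inside the replacement range to one uniform color."""
    low, high = replacement_range(pixels, gamma, threshold)
    uniform = Counter(pixels).most_common(1)[0][0]  # assumed background color
    return [uniform if low <= value <= high else value for value in pixels]
```

With `gamma` below 1 the mapping lifts low counts, so values that occur moderately often near the peak fall inside the replacement range while rare outliers stay untouched.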

    Finding the page background color range

    Publication No.: US11330149B1

    Publication date: 2022-05-10

    Application No.: US17150070

    Application date: 2021-01-15

    Inventor: Tim Prebble

    Abstract: A method to reduce background noise in a document image includes: extracting, from the document image, a connected component corresponding to a background of the document image; generating a histogram of pixel values of the connected component; generating a replacement range using a range pruning algorithm that narrows a range of the histogram by iteratively discarding at least one pixel value and corresponding pixel count of the histogram from at least one side of the histogram; selecting, from the connected component, at least one pixel having a corresponding pixel value within the replacement range; converting the corresponding pixel value of the at least one pixel to a uniform background color; and outputting, subsequent to the converting, the document image.
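
The range pruning step in this abstract can be sketched as follows. The abstract only says that pixel values and counts are discarded iteratively from at least one side of the histogram. The stopping rule used here (stop before the retained pixel count would fall below `keep_fraction` of the total), the choice to drop whichever end has the smaller count, and the name `prune_range` are assumptions for illustration.

```python
from typing import Dict, Tuple


def prune_range(hist: Dict[int, int], keep_fraction: float = 0.95) -> Tuple[int, int]:
    """Narrow the histogram range by dropping the sparser end, one value at a time.

    `hist` maps pixel value to pixel count. Returns the (low, high) replacement range.
    """
    values = sorted(hist)
    lo, hi = 0, len(values) - 1
    total = sum(hist.values())
    kept = total
    while lo < hi:
        left, right = hist[values[lo]], hist[values[hi]]
        drop = min(left, right)
        # Assumed stopping rule: never discard more than (1 - keep_fraction) of the pixels.
        if kept - drop < keep_fraction * total:
            break
        if left <= right:
            lo += 1      # discard the lowest remaining value and its count
        else:
            hi -= 1      # discard the highest remaining value and its count
        kept -= drop
    return values[lo], values[hi]
```

For the histogram `{200: 1, 248: 16, 250: 100, 252: 9}`, keeping 95% of the pixels prunes only the outlier at 200, while keeping 90% also prunes 252.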
