Continuous learning for document processing and analysis

    Publication number: US12118813B2

    Publication date: 2024-10-15

    Application number: US17518191

    Filing date: 2021-11-03

    Inventor: Stanislav Semenov

    Abstract: A document processing method includes receiving one or more documents, performing optical character recognition on the one or more documents to detect words comprising symbols, and determining an encoding value for each of the symbols. It further includes applying a first hash function to each encoding value to generate a first set of hashed symbol values, applying a second hash function to each hashed symbol value to generate a vector array including a second set of hashed symbol values, and applying a linear transformation to each value of the second set of hashed symbol values in the vector array. The method also includes applying an irreversible non-linear activation function to the vector array to obtain abstract values associated with the symbols, and saving the abstract values to train a neural network to detect fields in an input document.
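The chain of steps in this abstract (encoding value, first hash, second hash, linear transformation, irreversible activation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not name the hash functions, the encoding, the transformation coefficients, or the activation. Here the encoding is assumed to be the Unicode code point, both hashes are keyed BLAKE2b digests, and ReLU stands in for the irreversible non-linear function. The names `stable_hash`, `abstract_values`, `weight`, `bias`, and `buckets` are hypothetical.

```python
import hashlib


def stable_hash(value: int, key: bytes, modulus: int) -> int:
    """Deterministic keyed hash of an integer, reduced to [0, modulus)."""
    digest = hashlib.blake2b(value.to_bytes(8, "big"), key=key, digest_size=8).digest()
    return int.from_bytes(digest, "big") % modulus


def abstract_values(word: str, weight: float = 2.0, bias: float = -0.5,
                    buckets: int = 1024) -> list[float]:
    """Map each symbol of a recognized word to an abstract value."""
    # Assumed encoding: the Unicode code point of each symbol.
    encodings = [ord(symbol) for symbol in word]

    # First hash function: one hashed value per encoding value.
    first_set = [stable_hash(e, b"first", 2**32) for e in encodings]

    # Second hash function: build the vector array, scaled to [0, 1).
    vector = [stable_hash(h, b"second", buckets) / buckets for h in first_set]

    # Linear transformation applied to each value (assumed form: w * x + b).
    transformed = [weight * v + bias for v in vector]

    # ReLU is not invertible: every negative input collapses to 0.0,
    # so the original value cannot be recovered from the output.
    return [max(0.0, t) for t in transformed]


if __name__ == "__main__":
    print(abstract_values("Total"))
```

With these assumptions the output is deterministic, identical symbols always map to the same abstract value, and the stored values do not expose the original characters directly. A real system would choose the hashes and coefficients to suit the downstream neural network.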

    Image generation method, computing device, and storage medium

    Publication number: US12118808B2

    Publication date: 2024-10-15

    Application number: US17830518

    Filing date: 2022-06-02

    Abstract: An image generation method obtains an original image. A character area, a background area, and a position of each flawless character in the original image are determined. The character area is segmented to obtain a first image of each flawless character. A background is removed from the first image to obtain a second image. First image processing is performed on the second image to obtain a third image. Second image processing is performed on the second image to obtain fourth images. Third image processing is performed on each of the fourth images to obtain fifth images. A similarity between each fifth image and the third image is calculated. When the similarity is greater than a defect threshold, a background image is segmented. Brightness of the background image is adjusted. The target fourth image and the adjusted background image are synthesized. The method can generate images with defective characters quickly.
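The similarity check against the defect threshold can be illustrated with a small sketch. The abstract does not say which similarity measure is used or how the target fourth image is chosen, so the following assumes cosine similarity over flattened grayscale pixel values and assumes that a fourth image is selected when its corresponding fifth image exceeds the threshold. The names `cosine_similarity` and `pick_targets` are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equally sized, flattened grayscale images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-black image has no direction to compare
    return dot / (norm_a * norm_b)


def pick_targets(third_image: list[float], fifth_images: list[list[float]],
                 defect_threshold: float) -> list[int]:
    """Return indices of fifth images whose similarity exceeds the threshold."""
    return [
        index
        for index, fifth in enumerate(fifth_images)
        if cosine_similarity(fifth, third_image) > defect_threshold
    ]


if __name__ == "__main__":
    third = [1.0, 0.0, 1.0, 0.0]
    fifths = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0], [1.0, 0.0, 0.9, 0.1]]
    print(pick_targets(third, fifths, defect_threshold=0.9))
```

In this toy input the first and third candidates pass the threshold and the second, which shares no bright pixels with the reference, does not.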

    Remotely verifying an identity of a person

    Publication number: US12099585B2

    Publication date: 2024-09-24

    Application number: US17275361

    Filing date: 2019-09-12

    Applicant: ISX IP Ltd

    Abstract: A computer-implemented method for remotely verifying an identity of a user is presented. A first data processing device (120) receives a live video stream (102) of the user from a second data processing device (140) via a video data connection (108) having a video bandwidth. A separate data connection (110), having a data bandwidth, is established between the first (120) and second (140) data processing devices. The first data processing device (120) receives, via the data connection (110), identifying data (104) captured from an identifying means, sent by the second data processing device (140) or another data processing device. The first data processing device (120) determines first biometric data based on the identifying data (104) and compares it to second biometric data based on the live video stream (102). The first data processing device (120) then verifies the identity of the user based on a correspondence between the first biometric data and the second biometric data.

    Systems and methods for recovering numerical readings of cumulative flow meters based on noisy image data

    Publication number: US12080085B2

    Publication date: 2024-09-03

    Application number: US18097961

    Filing date: 2023-01-17

    Applicant: Yuri P. Garbuzov

    Inventor: Yuri P. Garbuzov

    IPC classification: G06F7/24 G06V30/148

    CPC classification: G06V30/153

    Abstract: A meter readout on a meter has digits including a first digit, a second digit, etc. A sequence of images of the meter is obtained. The images include images of the digits in the meter readout. Automated recognition of the digits in the images results in likelihood arrays indicating the likelihoods of the digit values for the digits imaged in the meter images. Short chains of digit values are identified and spliced together to form a series of single-digit, two-digit, three-digit, etc. paths that are built up based on the likelihood arrays. Various criteria are used to discard most of the chains, which avoids the combinatorial explosion of possible paths and produces reliable meter readings without consuming considerable computational resources.
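The idea of growing digit chains from likelihood arrays while discarding most candidates can be sketched with a beam search. Beam search is a stand-in here: the abstract does not state which criteria the patent uses to discard chains, and this sketch handles a single readout rather than a sequence of images. The function name `best_readings` and the `beam_width` parameter are hypothetical.

```python
import math


def best_readings(likelihoods: list[list[float]],
                  beam_width: int = 3) -> list[tuple[str, float]]:
    """Build digit chains position by position, keeping only the best few.

    likelihoods[i][d] is the recognizer's likelihood that the digit at
    position i has value d (each inner list has 10 entries).
    """
    chains: list[tuple[tuple[int, ...], float]] = [((), 0.0)]  # (digits, log-likelihood)
    for position in likelihoods:
        extended = []
        for digits, score in chains:
            for value, p in enumerate(position):
                if p > 0.0:  # impossible digit values are never extended
                    extended.append((digits + (value,), score + math.log(p)))
        # Discard all but the highest-scoring chains; without this step the
        # candidate count would grow by a factor of up to 10 per digit.
        extended.sort(key=lambda chain: chain[1], reverse=True)
        chains = extended[:beam_width]
    return [("".join(map(str, digits)), math.exp(score)) for digits, score in chains]


if __name__ == "__main__":
    first = [0.0] * 10
    first[3], first[8] = 0.1, 0.9   # the recognizer hesitates between 3 and 8
    second = [0.0] * 10
    second[1], second[7] = 0.6, 0.4  # and between 1 and 7
    for reading, likelihood in best_readings([first, second]):
        print(reading, round(likelihood, 2))
```

For this toy input the top chain is "81" with a likelihood of about 0.54, followed by "87" and "31". A fuller treatment would also score chains across consecutive images, for example by using the fact that a cumulative meter's reading should not decrease.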