- Patent title: Document classification of files on the client side before upload
- Application number: US17223922
- Filing date: 2021-04-06
- Publication number: US11948383B2
- Publication date: 2024-04-02
- Inventors: William J. Farmer, II; Sreenidhi Narayanamangalathu Kesavan; Dimitri Bilenkin; William Clayton Jackson; Karthikeyan Palanivelu; Siddharth Mangalik
- Applicant: Capital One Services, LLC
- Applicant address: US VA McLean
- Patentee: Capital One Services, LLC
- Current patentee: Capital One Services, LLC
- Current patentee address: US VA McLean
- Agency: Sterne, Kessler, Goldstein & Fox P.L.L.C.
- Main classification: G06V30/413
- IPC classification: G06V30/413; G06N20/00
Abstract:
A method for classifying a document in real-time is disclosed. The method includes identifying one or more sections of the document likely to contain text based on a contrast between dark space and light space in an image of the document. Optical character recognition is performed within the identified sections of the document to identify a set of words within each identified section of the document. The sets of words are extracted from the identified sections of the document, and a subset of the sets of words is selected for classifying the document based on a preconfigured option. The document is then classified by inputting the selected subset of words into one or more machine learning models. The method includes transmitting the document and the determined classification of the document to an external server.
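The flow described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, thresholds, option values (`"first_section"`, `"all"`), and the keyword-count scorer standing in for the machine learning models are all hypothetical. The OCR and upload steps are omitted, so the word sets are assumed to have been produced already.

```python
from collections import Counter


def find_text_rows(image, dark_threshold=128, min_dark_ratio=0.1):
    """Return (start, end) row ranges whose share of dark pixels suggests text.

    `image` is a list of rows of grayscale values (0 = dark, 255 = light).
    Both thresholds are illustrative assumptions.
    """
    sections, start = [], None
    for y, row in enumerate(image):
        dark_ratio = sum(1 for px in row if px < dark_threshold) / len(row)
        if dark_ratio >= min_dark_ratio:
            if start is None:
                start = y  # a dark band begins here
        elif start is not None:
            sections.append((start, y))  # the band ended on the previous row
            start = None
    if start is not None:
        sections.append((start, len(image)))
    return sections


def select_words(word_sets, option):
    """Pick a subset of the per-section word sets per a preconfigured option."""
    if option == "first_section":
        return list(word_sets[0]) if word_sets else []
    if option == "all":
        return [word for words in word_sets for word in words]
    raise ValueError(f"unknown option: {option}")


def classify(words, keyword_models):
    """Hypothetical stand-in for the ML models: score labels by keyword hits."""
    counts = Counter(word.lower() for word in words)
    scores = {
        label: sum(counts[keyword] for keyword in keywords)
        for label, keywords in keyword_models.items()
    }
    return max(scores, key=scores.get)


# Toy 5-row "image": rows 1-2 and row 4 contain enough dark pixels.
image = [
    [255] * 10,
    [0] * 5 + [255] * 5,
    [0] * 3 + [255] * 7,
    [255] * 10,
    [0] * 2 + [255] * 8,
]
sections = find_text_rows(image)

# Word sets as an OCR step might return them for each section (assumed input).
word_sets = [["Wages", "tax", "form"], ["signature"]]
selected = select_words(word_sets, "all")
label = classify(selected, {"tax_form": {"wages", "tax"}, "id_card": {"license"}})
```

In a client-side deployment, the resulting label would be sent to the server together with the document, as the last sentence of the abstract describes.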