-
1.
Publication No.: US20250029416A1
Publication Date: 2025-01-23
Application No.: US18770714
Filing Date: 2024-07-12
Applicant: KYOCERA Document Solutions Inc.
Inventor: Naomichi HIGASHIYAMA, Kunihiko TANAKA, Kota KAMISONO, Noa KANEDA
IPC: G06V30/414, G06V10/44, G06V20/70, G06V30/10
Abstract: In an image reading apparatus, a character area extractor extracts character areas from a document image row by row. A title reliability calculator calculates a title reliability level for each character area using a feature quantity data set and a machine learning model. A character recognizer converts the character areas whose title reliability levels exceed a threshold into text data. A title judger collates the text data with a title candidate. When exactly one piece of text data coinciding with a title candidate is detected, the title judger sets that text data as the title of the document image. When a plurality of coinciding pieces of text data are detected, the title judger sets the piece with the highest title reliability level among them as the title of the document image.
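The selection logic described in this abstract can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the names `CharacterArea`, `select_title`, and `recognize` are hypothetical, the reliability scores are assumed to be already computed by the machine learning model, and returning `None` when nothing matches is an assumption, since the abstract does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Callable, Collection, Iterable, Optional


@dataclass
class CharacterArea:
    """One row-level character area extracted from the document image."""
    area_id: str        # hypothetical handle to the cropped row image
    reliability: float  # title reliability level (assumed precomputed by the model)


def select_title(
    areas: Iterable[CharacterArea],
    recognize: Callable[[CharacterArea], str],  # hypothetical OCR callback
    title_candidates: Collection[str],
    threshold: float,
) -> Optional[str]:
    """Return the matching text with the highest reliability, or None."""
    matches = []
    for area in areas:
        # Only areas whose reliability exceeds the threshold are sent to OCR.
        if area.reliability <= threshold:
            continue
        text = recognize(area).strip()
        # Collate the recognized text with the title candidates.
        if text in title_candidates:
            matches.append((area.reliability, text))
    if not matches:
        return None  # assumption: the abstract leaves this case unspecified
    # With one match this returns it; with several, the most reliable one.
    return max(matches, key=lambda m: m[0])[1]


# Usage with a stand-in OCR table (illustrative data only).
fake_ocr = {"r1": "Invoice", "r2": "Quotation", "r3": "Page 1", "r4": "Invoice"}
areas = [
    CharacterArea("r1", 0.91),
    CharacterArea("r2", 0.95),
    CharacterArea("r3", 0.70),
    CharacterArea("r4", 0.20),  # below threshold, never recognized
]
title = select_title(areas, lambda a: fake_ocr[a.area_id],
                     {"Invoice", "Quotation"}, threshold=0.5)
```

Running OCR only on areas that pass the threshold mirrors the order of steps in the abstract and keeps the expensive recognition step off rows that are unlikely to be titles.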
-
Publication No.: US20250029414A1
Publication Date: 2025-01-23
Application No.: US18776386
Filing Date: 2024-07-18
Applicant: KYOCERA Document Solutions Inc.
Inventor: Kota KAMISONO, Kunihiko TANAKA, Naomichi HIGASHIYAMA, Noa KANEDA
IPC: G06V30/412, G06V30/416
Abstract: An image reading apparatus reads a document bundle to acquire a document image, and determines a first page of the document image according to a division method selected by a user. A page number recognizer extracts a page number from the document image by executing page number recognition processing, and determines a document image whose page number indicates a first page to be the first page of a document. A layout recognizer detects a marginal area or a background color from the document image by executing layout recognition processing, and determines the first page of the document. A title recognizer extracts a title by executing title recognition processing, and determines the first page of the document. A divider divides the document image into documents on the basis of the determined first page, converts the divided document images into files, and stores the files in a storage device.
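The division step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: `PageImage`, `FIRST_PAGE_DETECTORS`, and `divide` are hypothetical names, the recognizer results are assumed to be already attached to each page, and the rule that the very first scanned page always opens a document is an assumption. File conversion and storage are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class PageImage:
    """One scanned page plus hypothetical, precomputed recognizer results."""
    index: int                          # position in the scanned bundle
    page_number: Optional[int] = None   # from page number recognition
    has_first_page_layout: bool = False # from layout recognition (margin/background)
    title: Optional[str] = None         # from title recognition


# One first-page test per user-selectable division method (assumed mapping).
FIRST_PAGE_DETECTORS: Dict[str, Callable[[PageImage], bool]] = {
    "page_number": lambda p: p.page_number == 1,
    "layout": lambda p: p.has_first_page_layout,
    "title": lambda p: p.title is not None,
}


def divide(pages: List[PageImage], method: str) -> List[List[PageImage]]:
    """Split the bundle into documents at every detected first page."""
    is_first_page = FIRST_PAGE_DETECTORS[method]
    documents: List[List[PageImage]] = []
    for page in pages:
        # Assumption: the first scanned page always starts a document,
        # even if the detector does not flag it.
        if is_first_page(page) or not documents:
            documents.append([])
        documents[-1].append(page)
    return documents


# Usage: a five-page bundle whose page numbers restart at 1 (illustrative data).
bundle = [PageImage(i, page_number=n) for i, n in enumerate([1, 2, 3, 1, 2])]
docs = divide(bundle, "page_number")
sizes = [len(d) for d in docs]
```

Keeping the detectors in a lookup table keyed by method name reflects the abstract's point that the user selects one division method, and that the same divider then works from whichever first-page decisions that method produced.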
-