Machine learning for document compression

Invention Grant

US11783611B2 Machine learning for document compression (in force)

Patent Title: Machine learning for document compression
Application No.: US17009526

Application Date: 2020-09-01
Publication No.: US11783611B2

Publication Date: 2023-10-10
Inventor: Hans-Martin Ramsl
Applicant: SAP SE
Applicant Address: DE Walldorf
Assignee: SAP SE
Current Assignee: SAP SE
Current Assignee Address: DE Walldorf
Agency: SCHWEGMAN LUNDBERG & WOESSNER, P.A.
Main IPC: G06K9/00
IPC: G06K9/00 ; G06N20/00 ; G06K9/34 ; G06N3/04 ; G06V30/414 ; G06V30/148 ; G06V30/10

Abstract:

In an example embodiment, machine learning is used to intelligently compress documents to reduce the overall footprint of storing large amounts of files for an organization. Specifically, a document is split into parts, with each part representing a grouping of text or an image. Optical character recognition is performed to identify the text in images. Machine learning techniques are then applied to a part of a document in order to determine how relevant the document is for the organization. The parts that are deemed to be not relevant may then be reduced in size, either by omitting them completely or by summarizing them. This allows for the compression to be tailored specifically to the organization, resulting in the ability to compress or eliminate parts of documents that other organizations might have found relevant (and thus would not have been compressed or eliminated through traditional means).
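The flow described in the abstract (split into parts, score each part for relevance, then omit or summarize the low-scoring parts) can be sketched as follows. This is only an illustrative sketch and not the patented implementation: the names `Part`, `split_into_parts`, `keyword_relevance`, `summarize`, and `compress`, the two thresholds, and the blank-line splitting rule are all assumptions made for this example. A simple keyword-overlap score stands in for the trained machine learning model, a first-sentence cut stands in for a real summarizer, and optical character recognition is assumed to have already turned any images into text.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Part:
    kind: str     # "text" or "image" (image text is assumed to come from OCR)
    content: str  # the text of this part


def split_into_parts(document: str) -> List[Part]:
    """Hypothetical splitter: blank-line-separated blocks become parts."""
    blocks = [block.strip() for block in document.split("\n\n")]
    return [Part("text", block) for block in blocks if block]


def keyword_relevance(part: Part, org_terms: Set[str]) -> float:
    """Stand-in for a trained model: share of words that are org terms."""
    words = [w.strip(".,;:()").lower() for w in part.content.split()]
    if not words:
        return 0.0
    return sum(w in org_terms for w in words) / len(words)


def summarize(text: str) -> str:
    """Placeholder summarizer: keep only the first sentence."""
    first, sep, _ = text.partition(". ")
    return first + "." if sep else text


def compress(document: str, org_terms: Set[str],
             omit_below: float = 0.05,
             summarize_below: float = 0.2) -> str:
    """Omit the least relevant parts, summarize borderline ones, keep the rest."""
    kept = []
    for part in split_into_parts(document):
        score = keyword_relevance(part, org_terms)
        if score < omit_below:
            continue                             # not relevant: omit
        if score < summarize_below:
            kept.append(summarize(part.content)) # borderline: summarize
        else:
            kept.append(part.content)            # relevant: keep in full
    return "\n\n".join(kept)
```

Because the set of organization terms is an input, the same document would compress differently for two organizations, which is the tailoring the abstract describes. In practice the scoring function would be replaced by a model trained on the organization's own documents.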

Public/Granted literature

US20220067364A1 MACHINE LEARNING FOR DOCUMENT COMPRESSION Public/Granted day: 2022-03-03

IPC classification:

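G	Physics
G06	Computing; calculating or counting
G06K	Graphical data reading (image or video recognition or understanding G06V); presentation of data; record carriers; handling record carriers
G06K9/00	Methods or arrangements for recognising patterns (methods or arrangements for graph-reading or for converting the pattern of mechanical parameters, e.g. force or presence, into electrical signals G06K11/00) (image or video recognition or understanding G06V) (speech recognition G10L15/00)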