Document data classification using a noise-to-content ratio

    Publication No.: US10275523B1

    Publication Date: 2019-04-30

    Application No.: US15668537

    Filing Date: 2017-08-03

    Abstract: A method and system for classifying document data is described. The method may include classifying a first portion of an electronic document as substantive content or noise, classifying a second portion of the electronic document as substantive content or noise, determining a first feature of the first portion of the electronic document indicative of substantive content using a machine learning algorithm, and determining a second feature of the second portion of the electronic document indicative of noise using the machine learning algorithm.
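
The title refers to a noise-to-content ratio, but the abstract does not say how portions are classified or how the ratio is computed. The Python sketch below is one illustrative reading, not the patented method. It assumes a keyword rule as a stand-in for the machine learning algorithm and assumes the ratio is noise characters divided by content characters. The names NOISE_MARKERS, Portion, classify_portion, classify_document, and noise_to_content_ratio are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical noise markers; the abstract does not list any features.
NOISE_MARKERS = ("unsubscribe", "confidential", "sent from my", "disclaimer")


@dataclass
class Portion:
    text: str
    label: str  # either "content" or "noise"


def classify_portion(text: str) -> str:
    """Label one portion with a toy keyword rule.

    Assumption: this rule only stands in for the machine learning
    algorithm named in the abstract; it is not that algorithm.
    """
    lowered = text.lower()
    if any(marker in lowered for marker in NOISE_MARKERS):
        return "noise"
    return "content"


def classify_document(portions: List[str]) -> List[Portion]:
    """Classify every portion of a document independently."""
    return [Portion(p, classify_portion(p)) for p in portions]


def noise_to_content_ratio(labeled: List[Portion]) -> float:
    """Noise characters divided by content characters (assumed definition)."""
    noise = sum(len(p.text) for p in labeled if p.label == "noise")
    content = sum(len(p.text) for p in labeled if p.label == "content")
    if content == 0:
        # No substantive content: infinite ratio if any noise exists.
        return float("inf") if noise else 0.0
    return noise / content


if __name__ == "__main__":
    doc = [
        "Please review the attached contract terms.",
        "This email is confidential. Unsubscribe here.",
    ]
    labeled = classify_document(doc)
    for portion in labeled:
        print(portion.label, "|", portion.text)
    print("noise-to-content ratio:", noise_to_content_ratio(labeled))
```

Measuring by character count is only one option; counting portions or tokens would fit the same structure, and a trained classifier could replace classify_portion without changing the ratio computation.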
