Publication No.: US10275523B1
Publication Date: 2019-04-30
Application No.: US15668537
Filing Date: 2017-08-03
Applicant: Amazon Technologies, Inc.
Inventors: Bernhard Wolkerstorfer, Lei Li, Narendra S. Parihar
Abstract: A method and system for classifying document data is described. The method may include classifying a first portion of an electronic document as substantive content or noise, classifying a second portion of the electronic document as substantive content or noise, determining a first feature of the first portion of the electronic document indicative of substantive content using a machine learning algorithm, and determining a second feature of the second portion of the electronic document indicative of noise using the machine learning algorithm.
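Illustrative sketch: the steps named in the abstract (classifying each portion of a document as substantive content or noise, then determining a feature indicative of that class) might look roughly like the following. This is not the method claimed in the patent; the abstract says only "a machine learning algorithm", so the naive Bayes-style word scoring used here is a substitute chosen for brevity. The training sentences and the names `TRAINING`, `tokenize`, `train`, `log_odds`, and `classify` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical labeled portions, invented purely for illustration.
TRAINING = [
    ("the battery lasts two days and charges quickly", "content"),
    ("the screen is bright and the sound is clear", "content"),
    ("click here to subscribe to our newsletter", "noise"),
    ("sent from my phone please excuse typos", "noise"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(samples):
    """Count word occurrences separately for each label."""
    counts = {"content": Counter(), "noise": Counter()}
    for text, label in samples:
        counts[label].update(tokenize(text))
    return counts


def log_odds(word, counts):
    """Laplace-smoothed log-odds: positive leans content, negative leans noise."""
    vocab_size = len(set(counts["content"]) | set(counts["noise"]))
    content_total = sum(counts["content"].values())
    noise_total = sum(counts["noise"].values())
    p_content = (counts["content"][word] + 1) / (content_total + vocab_size)
    p_noise = (counts["noise"][word] + 1) / (noise_total + vocab_size)
    return math.log(p_content / p_noise)


def classify(portion, counts):
    """Return (label, most indicative word) for one portion of a document."""
    words = tokenize(portion)
    if not words:
        # Assumption for this sketch: an empty portion is treated as noise.
        return "noise", None
    scores = {w: log_odds(w, counts) for w in words}
    total = sum(log_odds(w, counts) for w in words)  # equal class priors assumed
    if total >= 0:
        return "content", max(scores, key=scores.get)
    return "noise", min(scores, key=scores.get)


if __name__ == "__main__":
    model = train(TRAINING)
    for portion in ("battery charges quickly", "click here to subscribe"):
        print(portion, "->", classify(portion, model))
```

Each portion is scored independently, so the first and second portions of a document can receive different labels and different indicative features, mirroring the structure of the abstract. A real system would presumably use richer features than single words.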