Invention Grant
- Patent Title: Identifying longform articles
- Application No.: US14931576
- Application Date: 2015-11-03
- Publication No.: US09773166B1
- Publication Date: 2017-09-26
- Inventor: Miriam King Connor, Isabelle L. Stanton, Amarnag Subramanya
- Applicant: Google Inc.
- Applicant Address: US CA Mountain View
- Assignee: Google Inc.
- Current Assignee: Google Inc.
- Current Assignee Address: US CA Mountain View
- Agency: Fish & Richardson P.C.
- Main IPC: G06K9/00
- IPC: G06K9/00; G06K9/66

Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying documents. One of the methods includes obtaining a collection of training documents, the training documents including positive documents identified as being longform documents and negative documents identified as not being longform documents; extracting one or more features from the training documents, wherein the features represent lexical or textual content of the training documents; and generating a longform document classifier trained using feature instances extracted from the training documents, wherein the generated longform document classifier is trained such that input documents are classified as being longform documents or classified as not being longform documents.
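
The three steps named in the abstract (collect labeled training documents, extract lexical or textual features, train a binary classifier) can be illustrated with a minimal sketch. This is not the patented implementation: the abstract does not specify a model or a feature set, so the multinomial naive Bayes model, the word-count features, the length-bucket feature, and the function names `extract_features`, `train`, and `is_longform` are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def extract_features(text):
    """Assumed features: lowercase word counts plus a coarse length bucket."""
    tokens = re.findall(r"[a-z']+", text.lower())
    features = Counter(tokens)
    # Hypothetical textual feature: token count bucketed in steps of 50.
    features[f"__len_bucket_{min(len(tokens) // 50, 10)}"] += 1
    return features


def train(positive_docs, negative_docs):
    """Accumulate per-class feature counts (naive Bayes, an assumed model)."""
    counts = {True: Counter(), False: Counter()}
    doc_totals = {True: len(positive_docs), False: len(negative_docs)}
    for label, docs in ((True, positive_docs), (False, negative_docs)):
        for doc in docs:
            counts[label].update(extract_features(doc))
    vocab = set(counts[True]) | set(counts[False])
    return {"counts": counts, "doc_totals": doc_totals, "vocab": vocab}


def is_longform(model, text):
    """Return True if the longform class has the higher log-probability."""
    n_docs = sum(model["doc_totals"].values())
    features = extract_features(text)
    scores = {}
    for label in (True, False):
        total = sum(model["counts"][label].values())
        score = math.log(model["doc_totals"][label] / n_docs)
        for feat, n in features.items():
            # Add-one smoothing; the extra +1 reserves mass for unseen features.
            p = (model["counts"][label][feat] + 1) / (total + len(model["vocab"]) + 1)
            score += n * math.log(p)
        scores[label] = score
    return scores[True] > scores[False]


# Toy training data, invented for this example only.
positive = ["essay narrative investigation interview history",
            "narrative essay profile history reporting"]
negative = ["sale coupon discount buy now",
            "buy cheap discount deal today"]
model = train(positive, negative)
```

Calling `is_longform(model, "a narrative essay about history")` classifies the input as longform, while `is_longform(model, "buy discount coupon today")` does not. A real system would need far more training data and would likely use richer features than raw word counts.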