Invention Grant
- Patent Title: Document representation for machine-learning document classification
- Application No.: US15623071
- Application Date: 2017-06-14
- Publication No.: US10482118B2
- Publication Date: 2019-11-19
- Inventor: Xin Zheng
- Applicant: SAP SE
- Applicant Address: DE Walldorf
- Assignee: SAP SE
- Current Assignee: SAP SE
- Current Assignee Address: DE Walldorf
- Agency: Fish & Richardson P.C.
- Main IPC: G06F17/30
- IPC: G06F17/30 ; G06F16/35 ; G06F17/28 ; G06N20/00

Abstract:
Methods, systems, and computer-readable storage media for providing weighted vector representations of documents, with actions including receiving text data, the text data including a plurality of documents, each document including a plurality of words, processing the text data to provide a plurality of word-vectors, each word-vector being based on a respective word of the plurality of words, determining a plurality of similarity scores based on the plurality of word-vectors, each similarity score representing a degree of similarity between word-vectors, grouping words of the plurality of words into clusters based on the plurality of similarity scores, each cluster including two or more words of the plurality of words, and providing a document representation for each document in the plurality of documents, each document representation including a feature vector, each feature corresponding to a cluster.
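The steps listed in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the toy word-vectors, the cosine similarity measure, the 0.8 threshold, the single-link grouping, and the use of raw word counts as feature values are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed similarity score)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_words(word_vectors, threshold=0.8):
    """Group words whose pairwise similarity meets the threshold (single-link, union-find).

    Only clusters with two or more words are kept, matching the abstract.
    """
    words = sorted(word_vectors)
    parent = {w: w for w in words}

    def find(w):
        # Follow parent links to the cluster root, compressing the path as we go.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if cosine(word_vectors[a], word_vectors[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for w in words:
        groups.setdefault(find(w), []).append(w)
    return sorted(sorted(g) for g in groups.values() if len(g) >= 2)


def document_features(tokens, clusters):
    """One feature per cluster: how many tokens of the document fall in that cluster."""
    counts = Counter(tokens)
    return [sum(counts[w] for w in cluster) for cluster in clusters]


# Hypothetical 3-dimensional word-vectors; in practice these would come from
# a trained embedding model.
word_vectors = {
    "invoice": [1.0, 0.1, 0.0],
    "bill": [0.9, 0.2, 0.0],
    "server": [0.0, 0.1, 1.0],
    "host": [0.1, 0.0, 0.9],
    "banana": [0.0, 1.0, 0.0],
}

clusters = cluster_words(word_vectors)
doc_a = "invoice bill bill server".split()
doc_b = "host server banana".split()
features_a = document_features(doc_a, clusters)
features_b = document_features(doc_b, clusters)
```

With these toy vectors, "bill" and "invoice" form one cluster and "host" and "server" another, while "banana" stays unclustered and contributes to no feature. Each document is then represented by a vector whose length equals the number of clusters rather than the vocabulary size.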
Public/Granted literature
- US20180365248A1 DOCUMENT REPRESENTATION FOR MACHINE-LEARNING DOCUMENT CLASSIFICATION Public/Granted day: 2018-12-20