WORD CLUSTERING AND CATEGORIZATION
    Patent application

    Publication No.: US20200035229A1

    Publication date: 2020-01-30

    Application No.: US16044031

    Filing date: 2018-07-24

    Abstract: A system for categorizing words into clusters includes a receiver to receive a set of sentences formed by a plurality of words. The set of sentences is indicative of interaction of a user with a virtual assistant. A categorizer categorizes the plurality of words into a first set of clusters by using a first clustering technique, and categorizes the plurality of words into a second set of clusters by using a second clustering technique. A detector detects words that appear in similar clusters after categorization by the first clustering technique and the second clustering technique. Similarity of clusters is based on a nature of words forming the clusters. A generator generates a confidence score for each of the plurality of words based on the detection. The confidence score of a word is indicative of accuracy of the categorization of the word.
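
    The detection and scoring steps described in the abstract can be illustrated with a minimal sketch. It is not the method claimed in the application: the abstract does not name the two clustering techniques or the scoring formula, so the sketch assumes the two cluster assignments are already available as word-to-label mappings, and it assumes that "similar clusters" can be measured by the Jaccard overlap of cluster members. The function names `members_by_cluster` and `confidence_scores` and the sample words are hypothetical.

```python
from typing import Dict, Hashable, Set


def members_by_cluster(assignment: Dict[str, Hashable]) -> Dict[Hashable, Set[str]]:
    """Invert a word -> cluster-label mapping into cluster-label -> set of words."""
    clusters: Dict[Hashable, Set[str]] = {}
    for word, label in assignment.items():
        clusters.setdefault(label, set()).add(word)
    return clusters


def confidence_scores(
    first: Dict[str, Hashable], second: Dict[str, Hashable]
) -> Dict[str, float]:
    """Score each word by how similar its two clusters are.

    ASSUMPTION: similarity is the Jaccard overlap of the members of the
    word's cluster under the first technique and under the second one.
    The source does not specify this formula.
    """
    first_clusters = members_by_cluster(first)
    second_clusters = members_by_cluster(second)
    scores: Dict[str, float] = {}
    # Only words categorized by both techniques can be compared.
    for word in first.keys() & second.keys():
        a = first_clusters[first[word]]
        b = second_clusters[second[word]]
        scores[word] = len(a & b) / len(a | b)
    return scores


# Hypothetical outputs of two clustering techniques over the same words.
first_assignment = {"book": 0, "reserve": 0, "cancel": 0, "flight": 1, "hotel": 1}
second_assignment = {"book": "x", "reserve": "x", "cancel": "y", "flight": "y", "hotel": "y"}

scores = confidence_scores(first_assignment, second_assignment)
for word in sorted(scores):
    print(f"{word}: {scores[word]:.2f}")
```

    In this toy input, "cancel" is grouped with the verbs by one technique and with the nouns by the other, so it receives the lowest score, which matches the abstract's idea that the score reflects how reliable a word's categorization is.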