DETERMINING TERM SCORES BASED ON A MODIFIED INVERSE DOMAIN FREQUENCY

    Publication number: US20170154107A1

    Publication date: 2017-06-01

    Application number: US15325807

    Filing date: 2014-12-11

    CPC classification number: G06F16/345 G06F16/35 G06F16/36

    Abstract: Determining term scores based on a modified inverse domain frequency is disclosed. One example is a system including a data processing engine, an evaluator, and a data analytics module. The data processing engine identifies a key term associated with an event, and a sub-plurality of a plurality of documents, the sub-plurality of documents associated with the event. The evaluator determines, based on the presence or absence of the key term, a first distribution related to the sub-plurality of documents, and a second distribution related to the plurality of documents, and evaluates, for the key term, a term score based on the first distribution and the second distribution, the term score indicative of a modified inverse domain frequency based on the sub-plurality of documents. The data analytics module includes the key term in a word cloud when the term score for the key term satisfies a threshold.
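
The flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the scoring formula, so the log ratio used here is an assumption chosen only for illustration, not the claimed modified inverse domain frequency. The names `term_score`, `word_cloud_terms`, and `presence_fraction`, the whitespace tokenization, and the `>=` threshold test are likewise hypothetical.

```python
import math


def presence_fraction(term, docs):
    """Fraction of documents in which the term is present (0.0 for an empty list)."""
    if not docs:
        return 0.0
    hits = sum(1 for doc in docs if term in doc.lower().split())
    return hits / len(docs)


def term_score(term, event_docs, all_docs):
    """Assumed score: log of the term's presence rate in the event-related
    subset divided by its presence rate in the full collection.
    This formula is an illustration, not the one defined in the patent."""
    p_event = presence_fraction(term, event_docs)  # first distribution (subset)
    p_all = presence_fraction(term, all_docs)      # second distribution (all documents)
    if p_event == 0.0 or p_all == 0.0:
        return 0.0
    return math.log(p_event / p_all)


def word_cloud_terms(terms, event_docs, all_docs, threshold):
    """Keep only the terms whose score satisfies the threshold."""
    return [t for t in terms if term_score(t, event_docs, all_docs) >= threshold]


all_docs = [
    "outage in network",
    "network latency report",
    "lunch menu today",
    "quarterly sales report",
]
event_docs = all_docs[:2]  # documents assumed to be associated with the event

selected = word_cloud_terms(["network", "report"], event_docs, all_docs, threshold=0.5)
```

Under this assumed formula, a term that is over-represented in the event-related documents relative to the whole collection scores above zero, while a term that is equally common in both scores zero and is excluded from the word cloud.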
