-
Publication No.: US10733220B2
Publication Date: 2020-08-04
Application No.: US15794487
Filing Date: 2017-10-26
Abstract: Embodiments of the invention include methods, systems, and computer program products for using a target similarity calculation to identify relevant content in a corpus of documents or records. The computer-implemented method includes creating, by a processor, a term frequency (TF) list for one or more documents of a corpus. The processor calculates an inverse document frequency (IDF) for each listed term. The processor calculates a TF-IDF for each listed term. The processor determines a similarity ranking for one or more documents of the corpus by applying a target similarity calculation to the TF-IDF for each listed term.
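The four steps named in the abstract (TF list, IDF, TF-IDF, similarity ranking) can be illustrated with a minimal Python sketch. The abstract does not define the "target similarity calculation", so cosine similarity is used here purely as a stand-in; the whitespace tokenizer, the unsmoothed logarithmic IDF, and all function names are likewise assumptions made for illustration, not the patented method.

```python
import math
from collections import Counter


def term_frequencies(doc):
    """Step 1: relative term frequency (TF) list for one document.

    Assumption: lowercase whitespace tokenization.
    """
    tokens = doc.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()}


def inverse_document_frequencies(tf_lists):
    """Step 2: IDF for each listed term, as log(N / document frequency).

    Assumption: plain unsmoothed logarithmic IDF.
    """
    n_docs = len(tf_lists)
    doc_freq = Counter(term for tf in tf_lists for term in tf)
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tfidf_vectors(corpus):
    """Step 3: TF-IDF weight for each listed term of each document."""
    tf_lists = [term_frequencies(doc) for doc in corpus]
    idf = inverse_document_frequencies(tf_lists)
    return [{term: tf * idf[term] for term, tf in tf_list.items()}
            for tf_list in tf_lists]


def cosine(a, b):
    """Stand-in similarity measure (assumption, not taken from the patent)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_similarity(corpus, target_index):
    """Step 4: rank the other documents by similarity to the target document."""
    vectors = tfidf_vectors(corpus)
    target = vectors[target_index]
    scores = [(i, cosine(target, vec))
              for i, vec in enumerate(vectors) if i != target_index]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy corpus; document 0 is the target.
    corpus = ["apple banana apple", "apple banana cherry", "dog cat mouse"]
    for index, score in rank_by_similarity(corpus, 0):
        print(index, round(score, 3))
```

In this toy corpus, the document that shares terms with the target ranks first, and the document with no shared terms scores zero. A production system would more likely rely on an established library implementation and a smoothed IDF.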
-
Publication No.: US20190130024A1
Publication Date: 2019-05-02
Application No.: US15794487
Filing Date: 2017-10-26
IPC Classification: G06F17/30
Abstract: Embodiments of the invention include methods, systems, and computer program products for using a target similarity calculation to identify relevant content in a corpus of documents or records. The computer-implemented method includes creating, by a processor, a term frequency (TF) list for one or more documents of a corpus. The processor calculates an inverse document frequency (IDF) for each listed term. The processor calculates a TF-IDF for each listed term. The processor determines a similarity ranking for one or more documents of the corpus by applying a target similarity calculation to the TF-IDF for each listed term.
-