-
Publication No.: US20200234005A1
Publication date: 2020-07-23
Application No.: US16839192
Filing date: 2020-04-03
Inventors: Pamela Bogdan, Gary Gressel, Gary Reser, Alex Rubarkh, Kenneth Shirley
IPC classification: G06F40/216, G06F40/30, G06F16/35, G06F16/28
Abstract: Aspects of the subject disclosure may include, for example, a process that performs a statistical, natural-language processing analysis on a group of text documents to determine a group of topics. The topics are determined according to parameters obtained by training on a sample of documents. One or more topics in a subset of topics are associated with each document, resulting in topic-document pairs. A bias is identified for each topic-document pair, and clusters of topics are created from the subset of topics. Each cluster of topics is determined from a value for each bias of each topic-document pair and from a frequency of occurrence of each topic. Each cluster is presentable according to a corresponding image configuration based on all or a subset of the bias dimensions and the frequency of occurrence of topics in a cluster that distinguishes the cluster from other clusters. Other embodiments are disclosed.
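The clustering step in the abstract can be illustrated with a minimal sketch. It is not the method claimed in the patent. The abstract does not say how a bias is scored or how clusters are formed, so everything here is an assumption: the bias is taken to be a single score in [-1, 1], the sample pairs are invented, and `summarize_topics`, `cluster_topics`, and `min_freq` are hypothetical names. Topics are grouped by the sign of their mean bias and by whether they occur often enough.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical input: (document id, topic, bias value) triples.
# Assumption: bias is one score in [-1, 1]; the abstract does not define it.
PAIRS = [
    ("doc1", "billing", -0.8),
    ("doc2", "billing", -0.6),
    ("doc3", "billing", -0.7),
    ("doc1", "coverage", 0.5),
    ("doc4", "coverage", 0.7),
    ("doc5", "roaming", -0.2),
]


def summarize_topics(pairs):
    """Return {topic: (mean bias, frequency)} from topic-document pairs."""
    biases = defaultdict(list)
    for _doc, topic, bias in pairs:
        biases[topic].append(bias)
    return {topic: (mean(values), len(values)) for topic, values in biases.items()}


def cluster_topics(summary, min_freq=2):
    """Group topics by bias sign and by whether they occur at least min_freq times."""
    clusters = defaultdict(list)
    for topic, (avg_bias, freq) in sorted(summary.items()):
        polarity = "negative" if avg_bias < 0 else "non-negative"
        volume = "frequent" if freq >= min_freq else "rare"
        clusters[(polarity, volume)].append(topic)
    return dict(clusters)


summary = summarize_topics(PAIRS)
clusters = cluster_topics(summary)
for key, topics in sorted(clusters.items()):
    print(key, topics)
```

Each resulting key, a (polarity, volume) pair, could then be mapped to a distinct visual style, which loosely mirrors the "image configuration" mentioned in the abstract.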
-
Publication No.: US11010548B2
Publication date: 2021-05-18
Application No.: US16839192
Filing date: 2020-04-03
Inventors: Pamela Bogdan, Gary Gressel, Gary Reser, Alex Rubarkh, Kenneth Shirley
IPC classification: G06F40/00, G06F40/216, G06F16/28, G06F16/35, G06F40/30
Abstract: Aspects of the subject disclosure may include, for example, a process that performs a statistical, natural-language processing analysis on a group of text documents to determine a group of topics. The topics are determined according to parameters obtained by training on a sample of documents. One or more topics in a subset of topics are associated with each document, resulting in topic-document pairs. A bias is identified for each topic-document pair, and clusters of topics are created from the subset of topics. Each cluster of topics is determined from a value for each bias of each topic-document pair and from a frequency of occurrence of each topic. Each cluster is presentable according to a corresponding image configuration based on all or a subset of the bias dimensions and the frequency of occurrence of topics in a cluster that distinguishes the cluster from other clusters. Other embodiments are disclosed.
-
Publication No.: US20210232764A1
Publication date: 2021-07-29
Application No.: US17232546
Filing date: 2021-04-16
Inventors: Pamela Bogdan, Gary Gressel, Gary Reser, Alex Rubarkh, Kenneth Shirley
IPC classification: G06F40/216, G06F16/28, G06F16/35, G06F40/30
Abstract: Aspects of the subject disclosure may include, for example, a process that performs a statistical, natural-language processing analysis on a group of text documents to determine a group of topics. The topics are determined according to parameters obtained by training on a sample of documents. One or more topics in a subset of topics are associated with each document, resulting in topic-document pairs. A bias is identified for each topic-document pair, and clusters of topics are created from the subset of topics. Each cluster of topics is determined from a value for each bias of each topic-document pair and from a frequency of occurrence of each topic. Each cluster is presentable according to a corresponding image configuration based on all or a subset of the bias dimensions and the frequency of occurrence of topics in a cluster that distinguishes the cluster from other clusters. Other embodiments are disclosed.
-
-