Method for automatically finding frequently asked questions in a helpdesk data set
    Type: Granted invention patent
    Status: Lapsed

    Publication number: US06804670B2

    Publication date: 2004-10-12

    Application number: US09935473

    Filing date: 2001-08-22

    IPC classification: G06F17/30

    Abstract: A system and method automatically identify candidate helpdesk problem categories that are most amenable to automated solutions. The system generates a dictionary in which each word in the text data set is identified and the number of documents containing that word is counted. The documents are partitioned into clusters. For each generated cluster, the system sorts the dictionary terms in order of decreasing occurrence frequency. It then determines a search space by selecting the top dictionary terms, as specified by a user-defined depth of search. Next, the system chooses a set of terms from the search space, as specified by a user-defined value indicating the desired level of detail. For each possible combination of frequent terms in the search space, the system finds the set of examples containing all the terms and then determines whether the frequency is sufficiently high and the overlap sufficiently low for this candidate set of examples to be a frequently asked question.
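
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions because the abstract does not specify them: the clusters are taken as precomputed lists of document indices, terms are ranked by document frequency within each cluster, the "level of detail" is read as the number of terms per combination, and "overlap" is read as the fraction of a candidate's examples already covered by a previously accepted candidate in the same cluster. The function name, parameter names, and thresholds (`min_freq`, `max_overlap`) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Return the set of lowercase whitespace-separated words in a document."""
    return set(text.lower().split())


def find_faq_candidates(docs, clusters, depth=4, detail=2,
                        min_freq=2, max_overlap=0.5):
    """Sketch of the FAQ search described in the abstract.

    docs     : list of document strings
    clusters : list of lists of indices into docs (assumed precomputed)
    depth    : number of top terms forming the search space
    detail   : number of terms per combination (assumed meaning)
    """
    doc_terms = [tokenize(d) for d in docs]
    results = []
    for cluster in clusters:
        # Count, for each term, how many documents in this cluster contain it.
        freq = Counter()
        for i in cluster:
            freq.update(doc_terms[i])
        # Sort terms by decreasing frequency (ties broken alphabetically).
        ranked = sorted(freq, key=lambda t: (-freq[t], t))
        search_space = ranked[:depth]
        covered = set()  # examples already assigned to an accepted candidate
        for combo in combinations(search_space, detail):
            # Examples in the cluster that contain every term of the combination.
            matches = {i for i in cluster
                       if all(t in doc_terms[i] for t in combo)}
            if len(matches) < min_freq:
                continue  # not frequent enough
            overlap = len(matches & covered) / len(matches)
            if overlap > max_overlap:
                continue  # too similar to an earlier candidate (assumed test)
            results.append((combo, sorted(matches)))
            covered |= matches
    return results
```

Calling `find_faq_candidates(docs, clusters, depth=3, detail=2)` on a small set of tickets returns a list of `(terms, example_indices)` pairs, one per accepted candidate. Raising `depth` widens the search space at a combinatorial cost, since the number of combinations tested per cluster is the binomial coefficient of `depth` over `detail`.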
