-
Publication No.: EP2595065B1
Publication date: 2019-08-14
Application No.: EP11189099.2
Filing date: 2011-11-15
Inventors: Larsson, Tomas; Lindgren, Mats
-
Publication No.: EP2595065A1
Publication date: 2013-05-22
Application No.: EP11189099.2
Filing date: 2011-11-15
Inventors: Larsson, Tomas; Lindgren, Mats
CPC classification: G06F16/24578, G06F16/3346, G06F16/35, G06F16/353, G06F16/355, G06F16/358, G06F16/36, G06F16/367, G06F16/38, G06F16/93
Abstract: A device for categorizing data sets obtained from a number of sources comprises: a symbol frequency determining unit (24) that determines the frequency of appearance of symbols in a first collection of data sets and the frequency of appearance of symbols in a second collection of data sets; a significance determining unit (26) that determines the most significant symbols for the second collection based on the frequency of appearance in the first collection and the frequency of appearance in the second collection; a grouping unit (28) that groups the most significant symbols into groups according to their appearance in the same data set; and a ranking unit (30) that ranks the data sets in relation to the symbol groups according to a ranking scheme.
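Illustrative sketch (not part of the patent record): the four units named in the abstract can be pictured as a small pipeline. The abstract does not state the significance measure, the grouping rule, or the ranking scheme, so everything below is an assumption made for illustration: significance is taken as the ratio of relative frequencies with a hypothetical `floor` for symbols unseen in the first collection, grouping is taken as connected components of co-occurrence within a data set, and ranking is taken as a count of group symbols per data set. All function and parameter names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def symbol_frequencies(collection):
    """Relative frequency of each symbol across a collection of data sets."""
    counts = Counter(sym for data_set in collection for sym in data_set)
    total = sum(counts.values())
    return {sym: n / total for sym, n in counts.items()}


def most_significant(first, second, top_n=4, floor=1e-3):
    """Symbols over-represented in `second` relative to `first`.

    Assumed measure: ratio of relative frequencies. `floor` (hypothetical)
    stands in for the frequency of symbols that never occur in `first`.
    """
    f1 = symbol_frequencies(first)
    f2 = symbol_frequencies(second)
    score = {s: f / max(f1.get(s, 0.0), floor) for s, f in f2.items()}
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(score, key=lambda s: (-score[s], s))[:top_n]


def group_symbols(symbols, collection):
    """Group symbols that appear together in the same data set.

    Assumed rule: connected components of the co-occurrence graph,
    computed with a small union-find.
    """
    parent = {s: s for s in symbols}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    for data_set in collection:
        present = [s for s in symbols if s in data_set]
        for a, b in combinations(present, 2):
            parent[find(a)] = find(b)

    groups = {}
    for s in symbols:
        groups.setdefault(find(s), set()).add(s)
    return sorted(sorted(g) for g in groups.values())


def rank_data_sets(collection, group):
    """Indices of data sets ordered by how many group symbols they contain.

    Assumed ranking scheme: simple occurrence count; data sets with no
    group symbol are left out.
    """
    scored = [(sum(sym in group for sym in ds), i)
              for i, ds in enumerate(collection)]
    ordered = sorted(scored, key=lambda t: (-t[0], t[1]))
    return [i for score, i in ordered if score > 0]
```

In this sketch a data set is a list of symbols (for example, the words of a document), the first collection plays the role of a reference corpus, and the second is the collection being categorized. A caller would pass both collections to `most_significant`, feed the result to `group_symbols`, and then call `rank_data_sets` once per group.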
-