Publication No.: US20090313243A1
Publication Date: 2009-12-17
Application No.: US12324619
Filing Date: 2008-11-26
Applicant: Paul Buitelaar, Pinar Wennerberg, Sonja Zillner
Inventor: Paul Buitelaar, Pinar Wennerberg, Sonja Zillner
CPC Classification: G06F16/367
Abstract: A semantic data resource of a domain is processed by calculating relevance scores for terms which occur in domain corpora and weighting the semantic data resource depending on the relevance scores calculated for these terms. The semantic data resource may include domain-specific terms and relations, such as a domain ontology, a domain terminology and a domain classification. The domain ontology may include a domain-specific hierarchy of terms assigned to nodes which are connected by edges and may be encoded in a web ontology language. The relevance scores may be chi-square scores which are calculated depending on a frequency of a term in the domain corpora and an expected frequency of the term.
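The chi-square relevance score named in the abstract can be illustrated with a minimal sketch. The abstract only states that the score depends on a term's frequency in the domain corpora and on its expected frequency; everything else here is an assumption made for illustration: the expected frequency is estimated from a separate reference corpus, add-one smoothing is applied, and the function names `chi_square_scores` and `weight_resource` and the term-to-node mapping are hypothetical and not taken from the patent.

```python
from collections import Counter


def chi_square_scores(domain_tokens, reference_tokens):
    """Score each domain term as (observed - expected)^2 / expected.

    Assumption: the expected count is the term's smoothed relative
    frequency in a reference corpus, scaled to the domain corpus size.
    """
    domain_counts = Counter(domain_tokens)
    reference_counts = Counter(reference_tokens)
    domain_size = len(domain_tokens)
    reference_size = len(reference_tokens)
    vocab_size = len(set(domain_counts) | set(reference_counts))

    scores = {}
    for term, observed in domain_counts.items():
        # Add-one smoothing (an assumption) keeps the expected count above zero.
        ref_prob = (reference_counts[term] + 1) / (reference_size + vocab_size)
        expected = ref_prob * domain_size
        scores[term] = (observed - expected) ** 2 / expected
    return scores


def weight_resource(term_to_node, scores):
    """Map each node to the score of its term; unseen terms get 0.0."""
    return {node: scores.get(term, 0.0) for term, node in term_to_node.items()}


# Toy corpora, invented for illustration only.
domain = ["lymphoma"] * 3 + ["the"] * 2
reference = ["the"] * 4 + ["cat"]
scores = chi_square_scores(domain, reference)
weights = weight_resource(
    {"lymphoma": "node_lymphoma", "spleen": "node_spleen"}, scores
)
```

In this toy setup the domain-specific term "lymphoma" scores higher than the common word "the", and a node whose term never occurs in the domain corpus receives weight 0.0. Note that this unsigned form of the score also rises for terms that are rarer than expected; whether and how the patent handles that case is not stated in the abstract.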