Hierarchical data classification using frequency analysis
Abstract:
A method of classifying individual documents in a document collection according to a hierarchy may include selecting an object from the hierarchy, generating one or more variants for the object, and, for each of the one or more variants, determining a frequency threshold based at least in part on how frequently the one or more variants occur in the document collection. The method may also include selecting a first document in the document collection, where the first document includes one or more objects that match at least one of the one or more variants. The method may additionally include determining that the number of the one or more matching objects exceeds the frequency threshold and classifying the first document with the object in the hierarchy.
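The steps in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not say how variants are generated or how the threshold is computed, so the naive plural rule in `generate_variants`, the mean-occurrences-per-document rule in `frequency_threshold`, the single threshold shared by all variants, and every function name and sample document here are assumptions made for illustration.

```python
import re


def generate_variants(obj: str) -> set[str]:
    # Hypothetical variant rule: lowercase form plus a naive "-s" plural.
    base = obj.lower()
    return {base, base + "s"}


def count_matches(text: str, variants: set[str]) -> int:
    # Count whole-word occurrences of any variant in the text.
    lowered = text.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(v) + r"\b", lowered)) for v in variants
    )


def frequency_threshold(collection: dict[str, str], variants: set[str]) -> float:
    # Assumed rule: mean number of variant occurrences per document
    # across the whole collection.
    total = sum(count_matches(text, variants) for text in collection.values())
    return total / len(collection) if collection else 0.0


def classify(collection: dict[str, str], obj: str) -> list[str]:
    # Return the ids of documents whose match count exceeds the threshold.
    variants = generate_variants(obj)
    threshold = frequency_threshold(collection, variants)
    return [
        doc_id
        for doc_id, text in collection.items()
        if count_matches(text, variants) > threshold
    ]


# Made-up sample collection for illustration only.
docs = {
    "d1": "Databases store data. A database index speeds up database queries.",
    "d2": "The weather is nice. No database here.",
    "d3": "Cooking pasta requires water and salt.",
}

classified = classify(docs, "Database")
print(classified)
```

In this sample, the variants occur four times across three documents, so the assumed threshold is about 1.33. Only `d1`, with three matches, exceeds it; `d2` mentions the object once in passing and is not classified under it.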