Publication No.: US06973446B2
Publication date: 2005-12-06
Application No.: US09730616
Filing date: 2000-12-06
Applicant: Hiroshi Mamitsuka, Naoki Abe
Inventor: Hiroshi Mamitsuka, Naoki Abe
CPC classification: G06N5/025, G06F2216/03
Abstract: A general-purpose knowledge-finding method that finds knowledge efficiently by selectively sampling only highly informative data (data with a large information amount) from a database. Learning means 104 causes a lower-order learning algorithm, input via an input unit 107, to learn on plural partial samples generated by sampling from the data stored in a high-speed main memory 120, thereby obtaining plural hypotheses. Data selection means 105 uses these hypotheses to estimate the information amount of each candidate datum read from a large-capacity data storage device 130, and additionally stores only the data with large information amounts in the high-speed main memory 120. A control unit 106 repeats this processing a predetermined number of times and stores the resulting final hypotheses. A prediction unit 102 uses the final hypotheses to predict the label value of data with an unknown label input via the input unit 107, and an output unit 101 outputs the predicted value.
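The loop described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the lower-order learner is a hypothetical one-dimensional threshold classifier (`train_stump`), partial samples are drawn by resampling with replacement, and the "information amount" of a candidate is approximated by how strongly the hypotheses disagree on it. The abstract specifies none of these choices, and all function and parameter names are invented for illustration.

```python
import random
from collections import Counter


def train_stump(sample):
    """Hypothetical lower-order learner: a 1-D threshold classifier.

    sample is a list of (x, label) pairs with label in {0, 1}.
    """
    best = None
    for t in sorted({x for x, _ in sample}):
        for polarity in (0, 1):
            errors = sum((int(x >= t) ^ polarity) != y for x, y in sample)
            if best is None or errors < best[0]:
                best = (errors, t, polarity)
    _, t, polarity = best
    return lambda x: int(x >= t) ^ polarity


def disagreement(hypotheses, x):
    """Assumed information measure: 0.0 when all hypotheses agree on x."""
    votes = Counter(h(x) for h in hypotheses)
    return 1.0 - votes.most_common(1)[0][1] / len(hypotheses)


def find_knowledge(storage, rounds=5, n_hyp=7, batch=50, keep=5, seed=0):
    """Repeat learn-then-select for a fixed number of rounds."""
    rng = random.Random(seed)
    pool = list(storage)          # stands in for the large-capacity storage
    rng.shuffle(pool)
    memory = [pool.pop() for _ in range(keep)]  # stands in for main memory
    hypotheses = []
    for _ in range(rounds):
        # Learning step: one hypothesis per partial sample drawn from memory.
        hypotheses = [
            train_stump(rng.choices(memory, k=len(memory)))
            for _ in range(n_hyp)
        ]
        # Selection step: read a batch of candidates and keep only those
        # the hypotheses disagree on most.
        candidates, pool = pool[:batch], pool[batch:]
        candidates.sort(key=lambda d: disagreement(hypotheses, d[0]),
                        reverse=True)
        memory.extend(candidates[:keep])
    return hypotheses, memory


def predict(hypotheses, x):
    """Predict a label by majority vote of the final hypotheses."""
    return Counter(h(x) for h in hypotheses).most_common(1)[0][0]
```

With a data set such as `[(x, int(x >= 0.5)) for x in ...]`, calling `find_knowledge` and then `predict(hypotheses, x)` mirrors the roles of the control unit and the prediction unit. Only `keep` items per round enter `memory`, which is the point of the selection step: the learner works from a small, informative subset rather than the full store.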