DATA PROCESSING METHOD AND SYSTEM, AND RELEVANT APPARATUS
    Invention application
    Legal status: in force

    Publication number: US20130159236A1

    Publication date: 2013-06-20

    Application number: US13722078

    Filing date: 2012-12-20

    CPC classification number: G06N5/02 G06F17/2785 G06F17/3071 G06N5/003

    Abstract: Embodiments of the present invention disclose a data processing method including: sending global initial statistical information to each slave node; merging the local statistical information received from each slave node, to obtain new global statistical information; if the Gibbs sampling performed by a slave node has ended, calculating a probability distribution between a document and a topic and a probability distribution between the topic and a word according to the new global statistical information; establishing a likelihood function of a text set according to the calculated probability distributions, and maximizing the likelihood function, to obtain a new hLDA hyper-parameter; and, if the iteration of solving for the hLDA hyper-parameter has converged, calculating and outputting, according to the new hLDA hyper-parameter, the probability distribution between the document and the topic and the probability distribution between the topic and the word.
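
The merge and probability-calculation steps of the abstract can be illustrated with a minimal Python sketch. It is not the patented method: it assumes, for illustration only, that each slave node reports its local statistics as a `Counter` of (topic, word) counts, that merging is a plain sum, and that the topic-word distribution is a smoothed count ratio with a single hyper-parameter `eta`. The hLDA topic tree, the Gibbs sampling itself, and the hyper-parameter update are left out. The names `merge_local_stats` and `topic_word_distribution` are hypothetical.

```python
from collections import Counter


def merge_local_stats(local_stats):
    """Sum the per-slave (topic, word) counts into one global Counter.

    Assumption: each slave reports counts only for its own document shard,
    so a plain sum gives the global statistics.
    """
    merged = Counter()
    for stats in local_stats:
        merged.update(stats)  # Counter.update adds counts key by key
    return merged


def topic_word_distribution(global_counts, vocab, topics, eta):
    """Smoothed estimate p(word | topic) = (n_tw + eta) / (n_t + V * eta).

    eta stands in for a hyper-parameter; V is the vocabulary size.
    """
    dist = {}
    for t in topics:
        total = sum(global_counts[(t, w)] for w in vocab)
        denom = total + len(vocab) * eta
        dist[t] = {w: (global_counts[(t, w)] + eta) / denom for w in vocab}
    return dist


# Two hypothetical slave nodes report their local counts.
local = [
    Counter({(0, "a"): 2, (0, "b"): 1}),
    Counter({(0, "a"): 1, (1, "b"): 3}),
]
merged = merge_local_stats(local)
dist = topic_word_distribution(merged, ["a", "b"], [0, 1], eta=0.5)
```

A document-topic distribution could be estimated the same way from (document, topic) counts. In the method described by the abstract, the master would repeat this merge after each round until Gibbs sampling ends, then re-estimate the hyper-parameter by maximizing the likelihood.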
