Invention application
US20070239431A1 Scalable probabilistic latent semantic analysis (in force)

Scalable probabilistic latent semantic analysis
Abstract:
A scalable two-pass probabilistic latent semantic analysis (PLSA) methodology is disclosed that may perform more efficiently, and in some cases more accurately, than traditional PLSA, especially where large and/or sparse data sets are provided for analysis. The improved methodology can greatly reduce the storage and/or computational costs of training a PLSA model. In the first pass of the two-pass methodology, objects are clustered into groups, and PLSA is performed on the groups instead of on the original individual objects. In the second pass, the conditional probability of a latent class, given an object, is obtained. This may be done by extending the training results of the first pass. During the second pass, the most likely latent classes for each object are identified.
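The two passes described above can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on several assumptions the abstract does not confirm: the group labels (`group_of`) are supplied by the caller rather than computed by any particular clustering method; the first pass uses a standard EM fit of PLSA on the summed group counts; and the second pass "extends" the first by a conventional fold-in step, holding P(feature | class) fixed and estimating P(class | object) for each object. All function and parameter names are hypothetical.

```python
import random

def normalize(values):
    """Scale non-negative values to sum to 1 (uniform if all are zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]

def plsa(counts, n_classes, n_iter=50, seed=0):
    """Fit PLSA by EM on a row-by-feature count matrix.
    Returns (P(class | row), P(feature | class))."""
    rng = random.Random(seed)
    n_rows, n_feats = len(counts), len(counts[0])
    p_z_r = [normalize([rng.random() for _ in range(n_classes)])
             for _ in range(n_rows)]
    p_f_z = [normalize([rng.random() for _ in range(n_feats)])
             for _ in range(n_classes)]
    for _ in range(n_iter):
        new_z_r = [[0.0] * n_classes for _ in range(n_rows)]
        new_f_z = [[0.0] * n_feats for _ in range(n_classes)]
        for r in range(n_rows):
            for f in range(n_feats):
                if counts[r][f] == 0:
                    continue  # sparse data: skip empty cells
                # E-step: posterior over classes for this (row, feature) cell
                post = normalize([p_z_r[r][z] * p_f_z[z][f]
                                  for z in range(n_classes)])
                # Accumulate expected counts for the M-step
                for z in range(n_classes):
                    w = counts[r][f] * post[z]
                    new_z_r[r][z] += w
                    new_f_z[z][f] += w
        p_z_r = [normalize(row) for row in new_z_r]
        p_f_z = [normalize(row) for row in new_f_z]
    return p_z_r, p_f_z

def fold_in(row, p_f_z, n_iter=30):
    """Estimate P(class | object) with P(feature | class) held fixed."""
    n_classes = len(p_f_z)
    p_z = [1.0 / n_classes] * n_classes
    for _ in range(n_iter):
        acc = [0.0] * n_classes
        for f, n in enumerate(row):
            if n == 0:
                continue
            post = normalize([p_z[z] * p_f_z[z][f] for z in range(n_classes)])
            for z in range(n_classes):
                acc[z] += n * post[z]
        p_z = normalize(acc)
    return p_z

def two_pass_plsa(counts, group_of, n_classes, seed=0):
    """Pass 1: PLSA on per-group summed counts. Pass 2: fold in each object."""
    n_groups = max(group_of) + 1
    n_feats = len(counts[0])
    group_counts = [[0] * n_feats for _ in range(n_groups)]
    for row, g in zip(counts, group_of):
        for f, n in enumerate(row):
            group_counts[g][f] += n
    _, p_f_z = plsa(group_counts, n_classes, seed=seed)
    p_z_obj = [fold_in(row, p_f_z) for row in counts]
    # Most likely latent class for each object
    best = [max(range(n_classes), key=p.__getitem__) for p in p_z_obj]
    return p_z_obj, best

# Illustrative data (made up): 6 objects, 4 features, 4 caller-supplied groups.
counts = [[5, 4, 0, 0], [3, 6, 0, 0], [4, 4, 0, 0],
          [0, 0, 5, 3], [0, 0, 2, 6], [0, 0, 4, 4]]
group_of = [0, 0, 1, 2, 2, 3]
p_z_obj, best = two_pass_plsa(counts, group_of, n_classes=2)
```

In this sketch the EM loop in the first pass runs over the group rows only (4 here instead of 6 objects), which is where any saving in storage and computation would come from when groups are far fewer than objects. Like any EM fit, the result depends on the random initialization.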