Patent Application
- Title: Parallel Processing Of Data Sets
- Title (Chinese): 数据集并行处理
- Application No.: US12942736
- Filing date: 2010-11-09
- Publication No.: US20120117008A1
- Publication date: 2012-05-10
- Inventors: Ning-Yi Xu, Feng-Hsiung Hsu, Feng Yan
- Applicants: Ning-Yi Xu, Feng-Hsiung Hsu, Feng Yan
- Applicant address: US WA Redmond
- Assignee: MICROSOFT CORPORATION
- Current assignee: MICROSOFT CORPORATION
- Current assignee address: US WA Redmond
- Main classification: G06F9/46
- IPC classification: G06F9/46; G06N5/02; G06N5/04; G06F15/18
Abstract:
Systems, methods, and devices are described for implementing learning algorithms on data sets. A data set may be partitioned into a plurality of data partitions that may be distributed to two or more processors, such as a graphics processing unit. The data partitions may be processed in parallel by each of the processors to determine local counts associated with the data partitions. The local counts may then be aggregated to form a global count that reflects the local counts for the data set. The partitioning may be performed by a data partition algorithm and the processing and the aggregating may be performed by a parallel collapsed Gibbs sampling (CGS) algorithm and/or a parallel collapsed variational Bayesian (CVB) algorithm. In addition, the CGS and/or the CVB algorithms may be associated with the data partition algorithm and may be parallelized to train a latent Dirichlet allocation model.
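
The partition, local-count, and global-count flow described in the abstract can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the patented method: threads stand in for the processors (GPUs in the abstract), a round-robin split stands in for the data partition algorithm, and the topic assignments are fixed inputs rather than values sampled by CGS or CVB. All function names (`partition`, `local_counts`, `aggregate`, `global_counts`) are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(docs, num_parts):
    """Split documents round-robin into num_parts partitions.

    Assumption: a simple stand-in for the data partition algorithm.
    """
    return [docs[i::num_parts] for i in range(num_parts)]


def local_counts(part):
    """Count (topic, word) assignments within a single partition."""
    counts = Counter()
    for doc in part:
        for word, topic in doc:
            counts[(topic, word)] += 1
    return counts


def aggregate(per_partition):
    """Sum the local counts into one global count."""
    total = Counter()
    for counts in per_partition:
        total.update(counts)
    return total


def global_counts(docs, num_workers=2):
    """Partition the data, count each partition in parallel, then aggregate."""
    parts = partition(docs, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        per_partition = list(pool.map(local_counts, parts))
    return aggregate(per_partition)


# Toy data set: each document is a list of (word, assigned_topic) pairs.
docs = [
    [("gpu", 0), ("kernel", 0), ("topic", 1)],
    [("topic", 1), ("model", 1)],
    [("gpu", 0), ("model", 1)],
]
counts = global_counts(docs, num_workers=2)
```

Because the aggregation is a plain sum, the global count matches what a single pass over the whole data set would produce, regardless of how the documents were partitioned.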
Published/Granted Documents
- US08868470B2 Parallel processing of data sets (publication/grant date: 2014-10-21)