Reduction server for fast distributed training
Abstract:
A data processing system includes one or more host processing devices. The host processing devices may be configured to support instantiation of a plurality of virtual machines, such that a first set of virtual machines runs one or more worker processes, each worker process operating on a respective data set to produce a respective gradient. The host processing devices may also be configured to support instantiation of a second set of virtual machines running one or more reducer processes that operate on the respective gradients produced by the worker processes to produce an aggregated gradient. The one or more reducer processes may cause the aggregated gradient to be broadcast to each worker process.
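The worker/reducer data flow described above can be illustrated with a minimal single-process sketch. It is not the claimed implementation: the virtual machines are stood in for by plain function calls, the loss (squared error on a linear model) is chosen only for illustration, and aggregation is assumed to be an element-wise sum, which the abstract does not specify. All names (`worker_gradient`, `reduce_gradients`, `broadcast`, `training_step`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Gradient = List[float]
Sample = Tuple[Sequence[float], float]  # (features, target)


def worker_gradient(weights: Sequence[float], shard: Sequence[Sample]) -> Gradient:
    """Worker process: gradient of a squared-error loss over its own data shard.

    The linear model and loss are assumptions made for this example only.
    """
    grad = [0.0] * len(weights)
    for features, target in shard:
        error = sum(w * x for w, x in zip(weights, features)) - target
        for i, x in enumerate(features):
            grad[i] += error * x
    return grad


def reduce_gradients(gradients: Sequence[Gradient]) -> Gradient:
    """Reducer process: combine the workers' gradients.

    Element-wise summation is an assumption; the abstract only says "aggregated".
    """
    return [sum(column) for column in zip(*gradients)]


def broadcast(aggregated: Gradient, num_workers: int) -> List[Gradient]:
    """Give every worker its own copy of the aggregated gradient."""
    return [list(aggregated) for _ in range(num_workers)]


def training_step(weights: Sequence[float],
                  shards: Sequence[Sequence[Sample]]) -> List[Gradient]:
    """One round: workers compute, the reducer aggregates, the result is broadcast."""
    local = [worker_gradient(weights, shard) for shard in shards]
    aggregated = reduce_gradients(local)
    return broadcast(aggregated, len(shards))
```

Calling `training_step(weights, shards)` with one shard per worker returns one copy of the aggregated gradient per worker. In a real deployment the three functions would run in separate processes on separate virtual machines and exchange gradients over the network.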