Parallelized machine learning with distributed lockless training
Abstract:
Systems and methods are disclosed for providing distributed learning over a plurality of parallel machine network nodes by allocating a per-sender receive queue at every machine network node and performing distributed in-memory training; and by training each unit replica while maintaining multiple copies of the unit replica being trained, wherein all unit replicas train, receive unit updates, and merge in parallel in a peer-to-peer fashion, wherein each receiving machine network node merges updates at a later point in time without interruption, and wherein the propagation and synchronization of unit replica updates are lockless and asynchronous.
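The scheme described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `Node`, `train_step`, `send_to`, `merge_pending`, and `run_round` are hypothetical, the local update is assumed to be a plain gradient step, and the merge is assumed to be a simple average of replicas. The nodes are simulated in a single process so the result is deterministic; in a real deployment each node would be a separate machine. The point it shows is that giving every sender its own receive queue leaves each queue with exactly one writer and one reader, so senders never contend for a shared lock and the receiver can merge whenever it chooses.

```python
from collections import deque


class Node:
    """One machine holding a model replica and one receive queue per sender."""

    def __init__(self, node_id, num_nodes, weights):
        self.node_id = node_id
        self.weights = list(weights)
        # One queue per remote sender: each queue has a single writer and a
        # single reader, so no lock is shared between senders.
        self.inbox = {s: deque() for s in range(num_nodes) if s != node_id}

    def train_step(self, gradient, lr=0.1):
        # Local update on this replica only (plain gradient step, assumed).
        self.weights = [w - lr * g for w, g in zip(self.weights, gradient)]

    def send_to(self, peer):
        # Asynchronous propagation: place a snapshot in the peer's queue
        # reserved for this sender and return without waiting for the peer.
        peer.inbox[self.node_id].append(list(self.weights))

    def merge_pending(self):
        # Deferred merge: drain every per-sender queue and average the
        # received replicas with the local one (averaging is assumed).
        received = []
        for queue in self.inbox.values():
            while queue:
                received.append(queue.popleft())
        if not received:
            return 0
        replicas = received + [self.weights]
        self.weights = [sum(col) / len(replicas) for col in zip(*replicas)]
        return len(received)


def run_round(nodes, gradients):
    # Every replica trains, then propagates, then merges, peer to peer.
    for node, grad in zip(nodes, gradients):
        node.train_step(grad)
    for node in nodes:
        for peer in nodes:
            if peer is not node:
                node.send_to(peer)
    for node in nodes:
        node.merge_pending()
```

For example, three nodes that start from the same weights and apply different gradients end the round with matching replicas, because each one averages its own weights with the two snapshots it received. Calling `merge_pending` again when nothing has arrived leaves the replica unchanged.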