Abstract:
A computer system is optimized for implementing a neural network nodal graph that has both dense inputs and sparse inputs. The computer system has a local machine that receives user inputs and is optimized for computing power, and a remote machine that stores embedding matrices and parameters and is optimized for memory capacity. In accordance with a cost function applied to each node, the neural network nodal graph is divided into graph segments based on each segment's input types and the computing resources needed for execution. In accordance with the cost function, the graph segments are distributed between the remote and local machines for execution, and the results of all the graph segments are combined in the local machine.
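As one illustration, the following is a minimal Python sketch of the cost-based partitioning described above. The node fields (`compute_cost`, `memory_cost`), the scoring rule, and the segment assignment are illustrative assumptions, not details from the abstract.

```python
# Hypothetical sketch: cost-based partitioning of a nodal graph between a
# compute-optimized local machine and a memory-optimized remote machine.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    input_type: str        # "dense" or "sparse"
    compute_cost: float    # estimated FLOPs to execute this node (assumed metric)
    memory_cost: float     # estimated bytes of parameters/embeddings (assumed metric)

def assign_segment(node: Node) -> str:
    """Apply a cost function to each node: sparse, memory-heavy nodes
    (e.g., embedding lookups) go to the memory-optimized remote machine;
    dense, compute-heavy nodes stay on the compute-optimized local machine."""
    memory_to_compute = node.memory_cost / (node.compute_cost + 1e-9)
    if node.input_type == "sparse" or memory_to_compute > 1.0:
        return "remote"
    return "local"

def partition(graph: list[Node]) -> dict[str, list[Node]]:
    segments = {"local": [], "remote": []}
    for node in graph:
        segments[assign_segment(node)].append(node)
    return segments

graph = [
    Node("embedding_lookup", "sparse", compute_cost=1e3, memory_cost=4e10),
    Node("dense_mlp", "dense", compute_cost=1e9, memory_cost=4e6),
]
segments = partition(graph)
# The remote machine executes its segments; results are sent back and
# combined with the local segments' outputs on the local machine.
```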
Abstract:
When an online system receives a request to present content items to a user, a content selection system included in the online system selects content items for presentation to the user during a latency period spanning from the time the request is received until the time the content items are sent. A feedback control mechanism communicates with each computing device of the content selection system to determine the latency period of each computing device. The feedback control mechanism also determines a target latency period within which content items are to be selected. By comparing each computing device's latency period to the target latency period, the amount of information to be evaluated by each computing device is adjusted based on whether that device's latency period is greater than or less than the target latency period.
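A minimal sketch of the feedback control idea follows, assuming a simple proportional update; the gain, the floor, and the per-device candidate counts are illustrative assumptions rather than values from the abstract.

```python
# Hypothetical sketch: adjust how many content items a device evaluates
# based on its measured latency relative to the target latency.
def adjust_candidates(current_candidates: int,
                      device_latency_ms: float,
                      target_latency_ms: float,
                      gain: float = 0.5,
                      floor: int = 10) -> int:
    """If a device finishes faster than the target latency, it may evaluate
    more content items on the next request; if it runs slower, fewer."""
    error = (target_latency_ms - device_latency_ms) / target_latency_ms
    scaled = int(current_candidates * (1.0 + gain * error))
    return max(floor, scaled)

# A device evaluating 1000 items at 80 ms against a 100 ms target gets a
# larger budget; one at 150 ms gets a smaller one.
print(adjust_candidates(1000, device_latency_ms=80.0, target_latency_ms=100.0))   # 1100
print(adjust_candidates(1000, device_latency_ms=150.0, target_latency_ms=100.0))  # 750
```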
Abstract:
The present disclosure is directed to a high-capacity training and prediction machine learning platform that can support high-capacity parameter models (e.g., with 10 billion weights). The platform implements a generic feature transformation layer for joint updating and a distributed training framework that utilizes shard servers to increase training speed for high-capacity models. The models generated by the platform can be utilized in conjunction with existing dense baseline models to predict compatibilities between different groupings of objects (e.g., a group of two objects, three objects, etc.).
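To make the sharding idea concrete, here is a minimal sketch of a sparse parameter table hash-partitioned across shard servers and combined with a dense baseline score. The shard count, the in-process dicts standing in for shard servers, and the additive combination rule are all illustrative assumptions, not the platform's actual design.

```python
# Hypothetical sketch: a high-capacity sparse parameter table split across
# shard servers, with predictions combined against a dense baseline model.
import numpy as np

NUM_SHARDS = 4
DIM = 8

# Each "shard server" (modeled here as an in-process dict) holds a slice
# of the sparse feature embeddings.
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_of(feature_id: int) -> int:
    return hash(feature_id) % NUM_SHARDS

def lookup(feature_id: int) -> np.ndarray:
    table = shards[shard_of(feature_id)]
    if feature_id not in table:
        table[feature_id] = np.zeros(DIM)  # lazily materialized weights
    return table[feature_id]

def apply_gradient(feature_id: int, grad: np.ndarray, lr: float = 0.01) -> None:
    # Each shard server updates only the parameters it owns, so updates
    # for different features can proceed in parallel across shards.
    shards[shard_of(feature_id)][feature_id] = lookup(feature_id) - lr * grad

def predict(sparse_features: list[int], dense_baseline_score: float) -> float:
    # Combine the high-capacity sparse model's contribution with an existing
    # dense baseline model's score (a simple additive combination here, as
    # one plausible reading of the abstract).
    sparse_score = sum(lookup(f).sum() for f in sparse_features)
    return dense_baseline_score + sparse_score
```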