Abstract:
Techniques are provided for delayed processing for arm policy determination for content management system messaging. During a delayed processing window, reward data is received for arm actions taken, where the arm actions were chosen based on a previous version of an arm choice policy, and the previous version of the arm choice policy was determined based on a previous set of reward data for a previous set of arm actions taken. When the delayed processing window has closed, a new arm choice policy is determined based at least in part on the reward data received during the window, together with the previous set of reward data and/or the previous arm choice policy. After a request to choose an arm is received, a particular arm action to take is determined based on the new arm choice policy. The chosen arm is provided in response to the request.
Abstract:
Computer-implemented techniques include, during a delayed processing window, receiving reward data for arm actions taken, where the arm actions were chosen based on a previous version of an arm choice policy, and the previous version of the arm choice policy was determined based on a previous set of reward data for a previous set of arm actions taken. When the delayed processing window has closed, a new arm choice policy is determined based at least in part on the reward data received during the window, together with the previous set of reward data and/or the previous arm choice policy. After a request to choose an arm is received, a particular arm action to take is determined based on the new arm choice policy. The chosen arm is provided in response to the request.
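
The delayed-window flow described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class name `DelayedBandit`, its method names, and the arm labels are hypothetical, and the policy used here (serve the arm with the highest observed mean reward, with no exploration) is a deliberately simple stand-in for whatever arm choice policy an embodiment would use. The point is only the timing: rewards are buffered while the window is open, and the serving policy changes only when the window closes.

```python
from typing import Dict, List, Tuple


class DelayedBandit:
    """Hypothetical sketch: the policy is rebuilt only when a window closes."""

    def __init__(self, arms: List[str]) -> None:
        self.arms = list(arms)
        # Previous reward data: (reward_sum, count) per arm from closed windows.
        self.totals: Dict[str, Tuple[float, int]] = {a: (0.0, 0) for a in self.arms}
        # Reward data received during the currently open window.
        self.pending: List[Tuple[str, float]] = []
        # Current arm choice policy (assumption: a single greedy arm).
        self.policy_arm = self.arms[0]

    def record_reward(self, arm: str, reward: float) -> None:
        """Buffer a reward; the serving policy is left unchanged."""
        self.pending.append((arm, reward))

    def close_window(self) -> None:
        """Merge buffered rewards with previous data and determine a new policy."""
        for arm, reward in self.pending:
            reward_sum, count = self.totals[arm]
            self.totals[arm] = (reward_sum + reward, count + 1)
        self.pending.clear()

        def mean(arm: str) -> float:
            reward_sum, count = self.totals[arm]
            return reward_sum / count if count else 0.0

        # Ties resolve to the arm listed first.
        self.policy_arm = max(self.arms, key=mean)

    def choose_arm(self) -> str:
        """Answer a request using the most recently determined policy."""
        return self.policy_arm
```

In this sketch, calls to `choose_arm()` made while a window is still open keep returning the arm selected by the previous policy, even if rewards favoring another arm have already been recorded.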
Abstract:
Techniques are provided for efficiently selecting upsell content to serve to users of an online service based on complex user archetypes. Serving upsell content to a user may include selecting upsell content to serve to the user based on efficiently matching the user to a complex user archetype in response to receiving a request from a computing device of the user. The user archetype may be complex in the sense that it represents a specific pattern of user interaction with the online service over a period of time, as opposed to being based solely on user demographic information such as the user's age, sex, gender, ethnicity, location of residence, profession, and the like. In addition, the specific pattern may vary depending on the type of online service. For example, according to some of the disclosed embodiments, upsell content may be efficiently selected to serve to a user of a content item management service based on the user having added five or more work files to the content item management service or edited one or more work files hosted with the content item management service during the last twenty-eight days.
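
The example archetype in the abstract above (five or more work files added, or one or more work files edited, during the last twenty-eight days) can be written as a small predicate. The Python sketch below is illustrative only: `FileEvent`, its fields, the `"add"`/`"edit"` labels, and `matches_work_archetype` are hypothetical names, and the linear scan over events does not attempt to show how the matching would be made efficient at scale.

```python
from datetime import datetime, timedelta
from typing import Iterable, NamedTuple


class FileEvent(NamedTuple):
    """Hypothetical record of one user interaction with a hosted file."""
    kind: str            # "add" or "edit" (assumed labels)
    is_work_file: bool
    when: datetime


def matches_work_archetype(
    events: Iterable[FileEvent],
    now: datetime,
    window_days: int = 28,
    min_adds: int = 5,
    min_edits: int = 1,
) -> bool:
    """Return True if the user's recent activity fits the example archetype."""
    cutoff = now - timedelta(days=window_days)
    # Only work-file events inside the look-back window count.
    recent = [e for e in events if e.is_work_file and cutoff <= e.when <= now]
    adds = sum(1 for e in recent if e.kind == "add")
    edits = sum(1 for e in recent if e.kind == "edit")
    return adds >= min_adds or edits >= min_edits
```

A request handler could call this predicate for the requesting user and select the associated upsell content only when it returns `True`; other archetypes for other types of online service would use different event kinds and thresholds.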