-
Publication No.: US10608976B2
Publication Date: 2020-03-31
Application No.: US15793787
Application Date: 2017-10-25
Applicant: Dropbox, Inc.
Inventor: Aditi Jain, Manveer Singh Chawla, Thomas Berg, Swapnil Zarekar, Robert Kajic, Karandeep Johar, Aaron Feldstein, Walter Kim, Joe Nudell, Jenny Dong, Jared Wilson, Luke Thompson, David Kriegman
Abstract: Computer-implemented techniques include, during a delayed processing window, receiving reward data for arm actions taken, where the arm actions were chosen based on a previous version of an arm choice policy, and the previous version of the arm choice policy was determined based on a previous set of reward data for a previous set of arm actions taken. When the delayed processing window has closed, a new arm choice policy is determined based at least in part on the action-reward data, and the previous set of reward data and/or the previous arm choice policy. After a request to choose an arm choice is received, a particular arm action to take is determined based on the new arm choice policy. This chosen arm is provided in response to the request.
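The flow in the abstract (buffer rewards while the delayed processing window is open, derive a new policy from the previous one when it closes, then answer arm requests from the new policy) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the policy is computed, so Beta-Bernoulli Thompson sampling is swapped in here, and the class name, method names, and arm labels are all hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class DelayedBanditPolicy:
    """Hypothetical sketch: Thompson sampling with batched (delayed) updates."""

    arms: list
    # Published policy: arm -> [alpha, beta] of an assumed Beta posterior.
    posterior: dict = field(default_factory=dict)
    # Reward data received while the delayed processing window is open.
    pending: list = field(default_factory=list)

    def __post_init__(self):
        # Start every arm from a uniform Beta(1, 1) prior (an assumption).
        for arm in self.arms:
            self.posterior.setdefault(arm, [1.0, 1.0])

    def record_reward(self, arm, reward):
        """Buffer a 0/1 reward; the published policy is left unchanged."""
        self.pending.append((arm, reward))

    def close_window(self):
        """Fold buffered rewards into the previous policy to form the new one."""
        for arm, reward in self.pending:
            params = self.posterior[arm]
            params[0] += reward        # successes
            params[1] += 1 - reward    # failures
        self.pending.clear()

    def choose_arm(self, rng=random):
        """Answer a request: sample each arm's posterior, return the best arm."""
        samples = {
            arm: rng.betavariate(alpha, beta)
            for arm, (alpha, beta) in self.posterior.items()
        }
        return max(samples, key=samples.get)


# Usage: rewards arrive during the window, but only take effect once it closes.
policy = DelayedBanditPolicy(arms=["message_a", "message_b"])
policy.record_reward("message_a", 1)
policy.record_reward("message_b", 0)
policy.close_window()
chosen = policy.choose_arm()
```

Keeping the buffer separate from the published posterior means every request inside one window is answered by the same policy version, which matches the abstract's statement that arm actions were chosen based on a previous version of the policy.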
-
Publication No.: US20200259777A1
Publication Date: 2020-08-13
Application No.: US16789200
Application Date: 2020-02-12
Applicant: Dropbox, Inc.
Inventor: Aditi Jain, Manveer Singh Chawla, Thomas Berg, Swapnil Zarekar, Robert Kajic, Karandeep Johar, Aaron Feldstein, Walter Kim, Joe Nudell, Jenny Dong, Jared Wilson, Luke Thompson, David Kriegman
Abstract: Techniques are provided for delayed processing for arm policy determination for content management system messaging, including, during a delayed processing window, receiving reward data for arm actions taken, where the arm actions were chosen based on a previous version of an arm choice policy, and the previous version of the arm choice policy was determined based on a previous set of reward data for a previous set of arm actions taken. When the delayed processing window has closed, a new arm choice policy is determined based at least in part on the action-reward data, and the previous set of reward data and/or the previous arm choice policy. After a request to choose an arm choice is received, a particular arm action to take is determined based on the new arm choice policy. This chosen arm is provided in response to the request.
-
Publication No.: US11171909B2
Publication Date: 2021-11-09
Application No.: US16789200
Application Date: 2020-02-12
Applicant: Dropbox, Inc.
Inventor: Aditi Jain, Manveer Singh Chawla, Thomas Berg, Swapnil Zarekar, Robert Kajic, Karandeep Johar, Aaron Feldstein, Walter Kim, Joe Nudell, Jenny Dong, Jared Wilson, Luke Thompson, David Kriegman
Abstract: Techniques are provided for delayed processing for arm policy determination for content management system messaging, including, during a delayed processing window, receiving reward data for arm actions taken, where the arm actions were chosen based on a previous version of an arm choice policy, and the previous version of the arm choice policy was determined based on a previous set of reward data for a previous set of arm actions taken. When the delayed processing window has closed, a new arm choice policy is determined based at least in part on the action-reward data, and the previous set of reward data and/or the previous arm choice policy. After a request to choose an arm choice is received, a particular arm action to take is determined based on the new arm choice policy. This chosen arm is provided in response to the request.
-
-