Risk Quantification for Policy Deployment
    1. Invention Application, Under Examination (Published)

    Publication No.: US20160148251A1

    Publication Date: 2016-05-26

    Application No.: US14552047

    Filing Date: 2014-11-24

    IPC Classification: G06Q30/02

    Abstract: Risk quantification, policy search, and automated safe policy deployment techniques are described. In one or more implementations, techniques are utilized to determine the safety of a policy, such as to express a level of confidence that a new policy will exhibit an increased measure of performance (e.g., interactions or conversions) over a currently deployed policy. To make this determination, reinforcement learning and concentration inequalities are utilized, which generate and bound confidence values regarding the measured performance of the policy and thus provide a statistical guarantee of that performance. These techniques are usable to quantify risk in deploying a policy, to select a policy for deployment based on estimated performance and a confidence level in that estimate (e.g., which may include use of a policy space to reduce the amount of data processed), to create a new policy through iteration in which the parameters of a policy are repeatedly adjusted and the effect of those adjustments is evaluated, and so forth.
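
    The filing does not name a specific estimator or concentration inequality, so the sketch below illustrates the idea with per-episode importance-weighted returns and a Hoeffding-style lower bound. The function names and the clipping of estimates to [0, max_return] are assumptions made for the example.

```python
import numpy as np

def importance_weighted_returns(episodes, new_policy_prob, behavior_policy_prob):
    """Per-episode importance-sampled estimates of the new policy's return,
    computed from trajectories logged under the currently deployed policy.
    Each episode is a list of (state, action, reward) tuples."""
    estimates = []
    for episode in episodes:
        weight, ret = 1.0, 0.0
        for state, action, reward in episode:
            weight *= new_policy_prob(state, action) / behavior_policy_prob(state, action)
            ret += reward
        estimates.append(weight * ret)
    return np.asarray(estimates)

def lower_confidence_bound(estimates, delta=0.05, max_return=1.0):
    """Hoeffding-style lower bound that holds with probability at least
    1 - delta, assuming the per-episode estimates are clipped to [0, max_return]."""
    clipped = np.clip(estimates, 0.0, max_return)
    n = len(clipped)
    return clipped.mean() - max_return * np.sqrt(np.log(1.0 / delta) / (2.0 * n))

def is_safe_to_deploy(estimates, deployed_performance, delta=0.05, max_return=1.0):
    """Treat the new policy as safe only if its lower confidence bound exceeds
    the measured performance of the currently deployed policy."""
    return lower_confidence_bound(estimates, delta, max_return) > deployed_performance
```

    The choice of concentration inequality controls how conservative the test is: a tighter inequality would declare more policies safe from the same logged data, at the cost of stronger assumptions about the return distribution.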

Searching for Safe Policies to Deploy
    2. Invention Application, Under Examination (Published)

    Publication No.: US20160148250A1

    Publication Date: 2016-05-26

    Application No.: US14551975

    Filing Date: 2014-11-24

    IPC Classification: G06Q30/02

    Abstract: Risk quantification, policy search, and automated safe policy deployment techniques are described. In one or more implementations, techniques are utilized to determine the safety of a policy, such as to express a level of confidence that a new policy will exhibit an increased measure of performance (e.g., interactions or conversions) over a currently deployed policy. To make this determination, reinforcement learning and concentration inequalities are utilized, which generate and bound confidence values regarding the measured performance of the policy and thus provide a statistical guarantee of that performance. These techniques are usable to quantify risk in deploying a policy, to select a policy for deployment based on estimated performance and a confidence level in that estimate (e.g., which may include use of a policy space to reduce the amount of data processed), to create a new policy through iteration in which the parameters of a policy are repeatedly adjusted and the effect of those adjustments is evaluated, and so forth.
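
    As a rough illustration of searching a policy space for a safe candidate, the sketch below scores each candidate policy by a lower confidence bound on its estimated performance and keeps the best candidate that is confidently better than the deployed baseline. The evaluate_returns helper, the [0, 1] range, and the Hoeffding-style bound are assumptions, not details taken from the application.

```python
import numpy as np

def search_safe_policy(candidate_policies, evaluate_returns, baseline_performance, delta=0.05):
    """Score each candidate policy by a lower confidence bound on its estimated
    performance and return the best candidate that is confidently better than
    the deployed baseline. `evaluate_returns(policy)` is assumed to return
    per-episode importance-weighted return estimates in [0, 1]."""
    best_policy, best_bound = None, baseline_performance
    for policy in candidate_policies:
        estimates = np.clip(evaluate_returns(policy), 0.0, 1.0)
        bound = estimates.mean() - np.sqrt(np.log(1.0 / delta) / (2.0 * len(estimates)))
        if bound > best_bound:
            best_policy, best_bound = policy, bound
    return best_policy  # None indicates no candidate met the safety test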

    TECHNIQUES FOR PROVIDING SEQUENTIAL RECOMMENDATIONS TO USERS

    Publication No.: US20180165590A1

    Publication Date: 2018-06-14

    Application No.: US15373849

    Filing Date: 2016-12-09

    IPC Classification: G06N7/00 G06F17/30 G06N99/00

    Abstract: Certain embodiments involve generating personalized recommendations for users by inferring each individual user's propensity to accept a recommendation. For example, a system generates a personalized user model based on a historical transition matrix that provides state transition probabilities drawn from a general population of users. The probabilities are adjusted based on the user's propensity to accept a recommendation. Based on the user model, the system determines a recommended action for the user to transition between predefined states. Once the user has performed an activity that transitions away from the current state, the system adjusts a probability distribution for the propensity estimate based on whether that activity was the recommended action.
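
    The abstract does not give the adjustment or update rules, so the sketch below assumes a simple blend of the population transition matrix with the user's estimated propensity and a Beta-Bernoulli update of that estimate; the blending scheme and names are illustrative.

```python
import numpy as np

def personalized_transitions(population_matrix, recommended_state, propensity):
    """Blend population state-transition probabilities with the user's estimated
    propensity to accept a recommendation: probability mass is shifted toward
    the recommended next state in proportion to the propensity estimate."""
    adjusted = (1.0 - propensity) * np.asarray(population_matrix, dtype=float)
    adjusted[:, recommended_state] += propensity
    return adjusted / adjusted.sum(axis=1, keepdims=True)  # keep rows normalized

def update_propensity(alpha, beta, followed_recommendation):
    """Beta-Bernoulli update of the propensity distribution after observing
    whether the user's activity matched the recommended action."""
    if followed_recommendation:
        alpha += 1.0
    else:
        beta += 1.0
    return alpha, beta, alpha / (alpha + beta)  # posterior mean estimate
```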

    Recommending Advertisements Using Ranking Functions

    Publication No.: US20170206549A1

    Publication Date: 2017-07-20

    Application No.: US14997987

    Filing Date: 2016-01-18

    IPC Classification: G06Q30/02 H04L29/08 G06N99/00

    Abstract: A digital medium environment is described for recommending advertisements using ranking functions. A ranking function is configured to compute a score by applying a user context vector associated with a user to individual ranking weight vectors associated with advertisements, and to provide the advertisement with the highest score to the user. To learn the ranking weight vectors for the ranking function, training data is obtained that includes user interactions with advertisements during previous sessions as well as user context vectors. The ranking weight vectors associated with each advertisement can then be learned by constraining the score generated by the ranking function to be higher for positive interactions than for negative interactions. To do so, the ranking weight vectors may be learned by optimizing an area under the curve ranking loss (AUCL) for the ranking function.
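
    A compact way to picture the scoring and learning steps is a linear per-advertisement scorer trained with a pairwise hinge surrogate of the AUC ranking loss. The surrogate and the update rule below are assumptions for illustration, not the exact optimization described in the application.

```python
import numpy as np

def score_ads(user_context, ad_weight_vectors):
    """Score every advertisement for a user as the dot product of the user
    context vector with that ad's ranking weight vector; recommend the argmax."""
    return np.asarray(ad_weight_vectors) @ np.asarray(user_context)

def pairwise_auc_update(weights, positive_context, negative_context, lr=0.01):
    """One stochastic step on a pairwise hinge surrogate of the AUC ranking
    loss: push the score of a positively interacted context above the score
    of a negatively interacted context by a margin of one."""
    margin = weights @ positive_context - weights @ negative_context
    if margin < 1.0:
        weights = weights + lr * (positive_context - negative_context)
    return weights
```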

System Identification Framework
    5. Invention Application, Under Examination (Published)

    Publication No.: US20150262205A1

    Publication Date: 2015-09-17

    Application No.: US14207145

    Filing Date: 2014-03-12

    CPC Classification: G06Q30/0202 G06Q10/067

    Abstract: Techniques for optimizing customer lifetime value (LTV) are described. In one or more implementations, a simulator is configured to derive a prediction model based on data indicative of users' online interactions with marketing offers. The prediction model may be produced by automatically classifying variables according to feature types and matching each feature type to a response function that defines how the variable responds to input actions. The classification of variables and/or the corresponding response functions in the prediction model may account for dependencies between variables and dependencies between successive states. An evaluator may then be invoked to apply the prediction model to test a proposed marketing strategy offline. Application of the prediction model is designed to predict user responses to simulated offers/actions and enable evaluation of marketing strategies with respect to one or more long-term objectives.
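
    The classification rules and response-function families are not specified in the abstract, so the sketch below uses three invented feature types and simple response functions purely to show the shape of mapping each variable to a response function when building the prediction model.

```python
import numpy as np

def classify_feature(values):
    """Assign an illustrative feature type from a variable's observed values."""
    values = np.asarray(values)
    if set(np.unique(values).tolist()) <= {0, 1}:
        return "binary"
    if np.issubdtype(values.dtype, np.integer):
        return "count"
    return "continuous"

# Illustrative response functions matched to each feature type; `a` is a
# per-variable response parameter to be fit from the interaction data.
RESPONSE_FUNCTIONS = {
    "binary": lambda x, a: 1.0 / (1.0 + np.exp(-a * x)),  # logistic response
    "count": lambda x, a: np.exp(a * x),                   # multiplicative response
    "continuous": lambda x, a: a * x,                      # linear response
}

def build_prediction_model(columns):
    """Map each variable to a (feature type, response function) pair, forming
    the simulator's prediction model for offline strategy evaluation."""
    model = {}
    for name, values in columns.items():
        feature_type = classify_feature(values)
        model[name] = (feature_type, RESPONSE_FUNCTIONS[feature_type])
    return model
```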

    LEARNING USER PREFERENCES USING SEQUENTIAL USER BEHAVIOR DATA TO PREDICT USER BEHAVIOR AND PROVIDE RECOMMENDATIONS

    Publication No.: US20180129971A1

    Publication Date: 2018-05-10

    Application No.: US15348747

    Filing Date: 2016-11-10

    IPC Classification: G06N99/00

    Abstract: Certain embodiments involve learning user preferences and predicting user behavior based on sequential user behavior data. For example, a system obtains data about sequences of prior actions taken by multiple users. The system determines similarities between the prior actions taken by those users and groups the users into clusters based at least in part on the similarities. The system trains a machine-learning algorithm that can be used to predict a user's subsequent action based on the clusters. The system further obtains data about a new user's current action and determines, based on that action, which cluster to associate with the new user. The system then determines an action to recommend to the new user based on the associated cluster; the recommendation can include a series or sequence of actions to be taken by the new user. The system provides that action, or the series or sequence of actions, to the new user.
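
    As a rough sketch of the cluster-then-recommend flow, the code below represents each user's action sequence as a bag-of-actions vector, clusters users with k-means, and recommends the action that most often follows the new user's position within the matched cluster. The featurization, the use of scikit-learn's KMeans, and the next-action heuristic are all assumptions.

```python
from collections import Counter

import numpy as np
from sklearn.cluster import KMeans

def sequences_to_features(sequences, n_actions):
    """Represent each action sequence as a bag-of-actions count vector,
    a simple stand-in for the similarity measure over prior actions."""
    features = np.zeros((len(sequences), n_actions))
    for i, sequence in enumerate(sequences):
        for action in sequence:
            features[i, action] += 1
    return features

def fit_user_clusters(sequences, n_actions, n_clusters=3):
    """Group users into clusters based on the similarity of their prior actions."""
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(
        sequences_to_features(sequences, n_actions))

def recommend_next_action(model, sequences, n_actions, new_user_actions):
    """Assign the new user to a cluster from their current actions and recommend
    the action that most often follows that position within the cluster."""
    cluster = model.predict(sequences_to_features([new_user_actions], n_actions))[0]
    next_actions = Counter()
    for sequence, label in zip(sequences, model.labels_):
        if label == cluster and len(sequence) > len(new_user_actions):
            next_actions[sequence[len(new_user_actions)]] += 1
    return next_actions.most_common(1)[0][0] if next_actions else None
```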

    Metric Forecasting Employing a Similarity Determination in a Digital Medium Environment

    Publication No.: US20180276691A1

    Publication Date: 2018-09-27

    Application No.: US15465449

    Filing Date: 2017-03-21

    IPC Classification: G06Q30/02 G06N3/08

    Abstract: Metric forecasting techniques and systems in a digital medium environment are described that leverage the similarity of elements, one to another, to generate a forecast value of a metric for a particular element. In one example, training data is received that describes a time series of values of the metric for a plurality of elements. A model is trained, using machine learning of a neural network based on the training data, to generate the forecast value of the metric. The training includes generating dimensional-transformation data configured to transform the training data into a simplified representation used to determine the similarity of the plurality of elements, one to another, with respect to the metric over the time series. The training also includes generating model parameters of the neural network based on the simplified representation in order to generate the forecast value of the metric.
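
    The dimensional transformation and the neural network are not detailed in the abstract; the sketch below substitutes an SVD projection for the learned transformation and a nearest-neighbor average for the network, only to show how similarity computed over a simplified representation can drive the forecast.

```python
import numpy as np

def simplified_representation(series_matrix, n_components=2):
    """Project each element's time series (rows of series_matrix) into a
    low-dimensional representation used to compare elements to one another."""
    centered = series_matrix - series_matrix.mean(axis=1, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def most_similar_elements(representation, element_index, k=5):
    """Indices of the k elements closest to the given element in the
    simplified representation."""
    distances = np.linalg.norm(representation - representation[element_index], axis=1)
    return np.argsort(distances)[1:k + 1]

def forecast_metric(series_matrix, element_index, k=5):
    """Forecast the next value of the metric for one element by averaging the
    last observed values of its most similar elements (a stand-in for the
    neural network described in the abstract)."""
    representation = simplified_representation(series_matrix)
    neighbors = most_similar_elements(representation, element_index, k)
    return float(series_matrix[neighbors, -1].mean())
```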

    Metric Forecasting in a Digital Medium Environment

    Publication No.: US20180211266A1

    Publication Date: 2018-07-26

    Application No.: US15413892

    Filing Date: 2017-01-24

    IPC Classification: G06Q30/02 G06Q10/10

    CPC Classification: G06Q30/0202 G06Q10/109

    Abstract: Metric forecasting techniques in a digital medium environment are described. An analytics system identifies a time series interval exhibited by input usage data. The input usage data describes values of a metric involved in the provision of digital content by a service provider system. The analytics system then determines whether historical usage data includes the identified time series interval. Based on the result of that determination and the identified time series interval, the analytics system selects a forecast model from a plurality of forecast models. A forecast module of the analytics system then generates forecast data configured to predict at least one value of the metric based on the selected forecast model, the result of the determination, and the input usage data.
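
    The candidate forecast models are not named in the abstract, so the sketch below estimates the dominant interval from the autocorrelation of the input usage data and falls back from a seasonal-naive forecast to a short moving average when the historical data does not cover that interval; both models are assumed for illustration.

```python
import numpy as np

def dominant_interval(values):
    """Estimate the dominant time series interval (period) exhibited by the
    input usage data from the lag with the largest autocorrelation."""
    centered = np.asarray(values, dtype=float)
    centered = centered - centered.mean()
    autocorr = np.correlate(centered, centered, mode="full")[len(centered):]
    return int(np.argmax(autocorr)) + 1

def select_and_forecast(values, historical_values):
    """Select a forecast model based on whether the historical usage data
    covers the identified interval: seasonal-naive if it does, otherwise a
    short moving average over the most recent input values."""
    interval = dominant_interval(values)
    historical_values = np.asarray(historical_values, dtype=float)
    if len(historical_values) >= 2 * interval:
        return float(historical_values[-interval])  # seasonal-naive forecast
    window = min(len(values), 7)
    return float(np.mean(np.asarray(values, dtype=float)[-window:]))  # fallback
```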

Automated System for Safe Policy Improvement
    9. Invention Application, Under Examination (Published)

    Publication No.: US20160148246A1

    Publication Date: 2016-05-26

    Application No.: US14551898

    Filing Date: 2014-11-24

    IPC Classification: G06Q30/02

    Abstract: Risk quantification, policy search, and automated safe policy deployment techniques are described. In one or more implementations, techniques are utilized to determine the safety of a policy, such as to express a level of confidence that a new policy will exhibit an increased measure of performance (e.g., interactions or conversions) over a currently deployed policy. To make this determination, reinforcement learning and concentration inequalities are utilized, which generate and bound confidence values regarding the measured performance of the policy and thus provide a statistical guarantee of that performance. These techniques are usable to quantify risk in deploying a policy, to select a policy for deployment based on estimated performance and a confidence level in that estimate (e.g., which may include use of a policy space to reduce the amount of data processed), to create a new policy through iteration in which the parameters of a policy are repeatedly adjusted and the effect of those adjustments is evaluated, and so forth.
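
    A minimal sketch of the iterative adjust-and-evaluate loop, assuming the policy is parameterized by a vector theta and that the caller supplies a lower_bound function (for example, one built from importance-weighted returns and a concentration inequality as in the related filings above); the random perturbation scheme is an assumption.

```python
import numpy as np

def improve_policy(theta, lower_bound, n_iterations=100, step=0.1, seed=0):
    """Iteratively adjust policy parameters and keep only adjustments whose
    lower confidence bound on performance improves, so the candidate policy
    offered for deployment never regresses below the current guarantee."""
    rng = np.random.default_rng(seed)
    best_bound = lower_bound(theta)
    for _ in range(n_iterations):
        candidate = theta + step * rng.normal(size=theta.shape)
        candidate_bound = lower_bound(candidate)
        if candidate_bound > best_bound:
            theta, best_bound = candidate, candidate_bound
    return theta, best_bound
```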

TESTING A MARKETING STRATEGY OFFLINE USING AN APPROXIMATE SIMULATOR
    10. Invention Application, Under Examination (Published)

    Publication No.: US20150134443A1

    Publication Date: 2015-05-14

    Application No.: US14080038

    Filing Date: 2013-11-14

    IPC Classification: G06Q30/02

    CPC Classification: G06Q30/0242

    Abstract: In various example embodiments, a system and method are provided for testing marketing strategies and approximate simulators offline for lifetime-value marketing. In example embodiments, real-world data, simulated data, and one or more policies that produced the simulated data are obtained. Errors between the real-world data and the simulated data are determined, and bounds are computed from those errors. Simulators are ranked based on the determined bounds, whereby a lower bound indicates a simulator whose simulated data is closer to the real-world data than that of a simulator with a higher bound.
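
    The specific error measure and bound are not given in the abstract, so the sketch below uses mean absolute error plus a Hoeffding-style confidence term as the per-simulator bound and ranks simulators with the lowest bound first.

```python
import numpy as np

def simulator_bound(real_data, simulated_data, delta=0.05):
    """Bound for one simulator: mean absolute error between real and simulated
    observations plus a Hoeffding-style confidence term, so a smaller bound
    indicates simulated data that tracks the real-world data more closely."""
    errors = np.abs(np.asarray(real_data, dtype=float) - np.asarray(simulated_data, dtype=float))
    n = len(errors)
    return float(errors.mean() + errors.max() * np.sqrt(np.log(1.0 / delta) / (2.0 * n)))

def rank_simulators(real_data, simulated_runs):
    """Rank simulators from lowest to highest bound. `simulated_runs` maps a
    simulator name to the data it produced under the evaluated policies."""
    bounds = {name: simulator_bound(real_data, data) for name, data in simulated_runs.items()}
    return sorted(bounds.items(), key=lambda item: item[1])
```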
