-
Publication number: EP3975063A1
Publication date: 2022-03-30
Application number: EP20198329.3
Application date: 2020-09-25
Inventors: Kandemir, Melih; Peters, Jan; Flynn, Hamish
Abstract: Device and computer-implemented method of machine learning a model for mapping a dataset to a solution of a task depending on a first parameter. The method is characterized by determining (204) a second parameter for assigning the second parameter to the first parameter in a first iteration of learning, and determining (204) a third parameter for determining a rate at which the first parameter is changed in at least one iteration of learning, depending on the third parameter and on a measure for evaluating the solution to the task. Determining (204) the second parameter or the third parameter comprises determining (204-2a) a solution of an initial value problem that depends on a derivative of the measure with respect to the first parameter. Determining (204-2a) the solution comprises determining a first part of the solution of the initial value problem depending on an initial value and determining a second part of the solution depending on the first part. The method further comprises determining (204-2c) a partial derivative for the first part, determining (204-2c) a partial derivative for the second part, and determining (204-2d) the second parameter and/or the third parameter depending on at least one of the partial derivatives.
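The piecewise solution and its partial derivatives can be illustrated with a minimal sketch. It is not the patented procedure: it assumes a scalar first parameter `theta`, a quadratic measure `0.5 * (theta - target)**2`, the initial value problem `d(theta)/dt = -alpha * dL/d(theta)`, and explicit Euler steps. The names `integrate_segment`, `meta_gradients`, `theta0` (second parameter) and `alpha` (third parameter) are hypothetical.

```python
def integrate_segment(theta, alpha, target, step, n_steps):
    """Euler-integrate d(theta)/dt = -alpha * (theta - target) over one segment.

    Returns the segment's end state and its partial derivatives with respect
    to the segment's start state and to alpha (start state held fixed).
    The quadratic measure is an assumption made for this sketch.
    """
    d_start = 1.0  # d(end) / d(start)
    d_alpha = 0.0  # d(end) / d(alpha)
    for _ in range(n_steps):
        grad = theta - target  # derivative of the measure w.r.t. theta
        # Differentiate the Euler update theta <- theta - alpha * step * grad.
        d_alpha = (1.0 - alpha * step) * d_alpha - step * grad
        d_start = (1.0 - alpha * step) * d_start
        theta = theta - alpha * step * grad
    return theta, d_start, d_alpha


def meta_gradients(theta0, alpha, target, step=0.1, n_steps=5):
    """Solve the problem in two parts and chain the per-part derivatives."""
    # First part: depends on the initial value theta0.
    mid, dmid_dtheta0, dmid_dalpha = integrate_segment(
        theta0, alpha, target, step, n_steps)
    # Second part: depends on the end of the first part.
    end, dend_dmid, dend_dalpha_direct = integrate_segment(
        mid, alpha, target, step, n_steps)
    # Chain rule across the two parts.
    dend_dtheta0 = dend_dmid * dmid_dtheta0
    dend_dalpha = dend_dalpha_direct + dend_dmid * dmid_dalpha
    dloss_dend = end - target
    return end, dloss_dend * dend_dtheta0, dloss_dend * dend_dalpha


# Hypothetical usage: one gradient step on the initial value and the rate.
end, g_theta0, g_alpha = meta_gradients(theta0=2.0, alpha=0.5, target=1.0)
new_theta0 = 2.0 - 0.1 * g_theta0
new_alpha = 0.5 - 0.1 * g_alpha
```

Splitting the trajectory keeps each part's derivative local to that part; the full sensitivity is recovered only when the parts are chained.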
-
Publication number: EP4386632A1
Publication date: 2024-06-19
Application number: EP22213403.3
Application date: 2022-12-14
Applicant: Robert Bosch GmbH
Inventors: Vinogradska, Julia; Peters, Jan; Berkenkamp, Felix; Bottero, Alessandro Giacomo; Luis Goncalves, Carlos Enrique
Abstract: According to various embodiments, a method for training a control policy is described. The method comprises estimating the variance of a value function, which associates a state with a value of the state or a pair of state and action with a value of the pair, by solving a Bellman uncertainty equation. For each of multiple states, the reward function of the Bellman uncertainty equation is set to the difference between the total uncertainty about the mean of the value of the subsequent state following the state and the average aleatoric uncertainty of the value of the subsequent state. The method further comprises biasing the control policy in training towards regions for which the estimation gives a higher variance of the value function than for other regions.
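One possible reading of this variance estimate is sketched below for a small tabular problem. Several things are assumptions not stated in the abstract: the policy is fixed and folded into the transition matrices, model uncertainty is represented by a hypothetical ensemble of transition matrices, the equation is scaled by `gamma ** 2`, and it is solved by fixed-point iteration. All function names are illustrative.

```python
def expect(p_row, values):
    """Expectation of `values` under the next-state distribution `p_row`."""
    return sum(p * v for p, v in zip(p_row, values))


def variance(p_row, values):
    """Variance of `values` under the next-state distribution `p_row`."""
    mean = expect(p_row, values)
    return sum(p * (v - mean) ** 2 for p, v in zip(p_row, values))


def evaluate(P, rewards, gamma, sweeps=500):
    """Iterative policy evaluation: V = r + gamma * P V."""
    V = [0.0] * len(rewards)
    for _ in range(sweeps):
        V = [rewards[s] + gamma * expect(P[s], V) for s in range(len(rewards))]
    return V


def value_variance(ensemble, rewards, gamma, sweeps=500):
    """Estimate the variance of the value function across an ensemble.

    `ensemble` is a list of transition matrices (assumed model samples).
    The scaling by gamma**2 is an assumption made for this sketch.
    """
    n, m = len(rewards), len(ensemble)
    values = [evaluate(P, rewards, gamma, sweeps) for P in ensemble]
    P_mean = [[sum(P[s][t] for P in ensemble) / m for t in range(n)]
              for s in range(n)]
    V_mean = [sum(V[s] for V in values) / m for s in range(n)]
    u = []
    for s in range(n):
        # Total uncertainty about the mean value of the subsequent state.
        total = variance(P_mean[s], V_mean)
        # Average aleatoric uncertainty over the ensemble members.
        aleatoric = sum(variance(P[s], V)
                        for P, V in zip(ensemble, values)) / m
        u.append(total - aleatoric)  # plays the role of the reward
    U = [0.0] * n
    for _ in range(sweeps):
        U = [gamma ** 2 * (u[s] + expect(P_mean[s], U)) for s in range(n)]
    return U


# Hypothetical two-state example with two sampled models.
P_a = [[0.9, 0.1], [0.2, 0.8]]
P_b = [[0.5, 0.5], [0.5, 0.5]]
U = value_variance([P_a, P_b], rewards=[0.0, 1.0], gamma=0.9)
```

States with a larger entry in `U` are those towards which training would be biased; when all ensemble members agree, the estimate is zero everywhere.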
-
Publication number: EP4307055A1
Publication date: 2024-01-17
Application number: EP22184158.8
Application date: 2022-07-11
Applicant: Robert Bosch GmbH
Inventors: Vinogradska, Julia; Peters, Jan; Berkenkamp, Felix; Bottero, Alessandro Giacomo; Luis Goncalves, Carlos Enrique
Abstract: The invention relates to a computer-implemented control method (700) for constrained control of a computer-controlled system. The system is controlled according to a control input, which is safe if a constraint quantity resulting from the controlling of the computer-controlled system exceeds a constraint threshold. A current control input is determined based on previous control inputs and corresponding previous noisy measurements. The computer-controlled system is controlled according to the current control input, thereby obtaining a current noisy measurement of the resulting constraint quantity. The current control input is determined based on a mutual information between a first random variable representing the constraint quantity resulting from the current control input and a second random variable indicating whether a further control input is safe.
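The mutual information criterion can be made concrete with a small numerical sketch. It assumes, beyond what the abstract states, that the constraint quantity at the current input `x` and at a further input `z` is jointly Gaussian (for example under a Gaussian process posterior), that the first random variable is the noisy measurement at `x`, and that the expectation is evaluated by trapezoidal quadrature. The function and argument names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def binary_entropy(p):
    """Entropy in nats of a Bernoulli variable with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def safety_mutual_information(mu_x, var_x, mu_z, var_z, cov_xz,
                              noise_var, threshold, n_grid=2001):
    """I(y; safe_z) = H(safe_z) - E_y[H(safe_z | y)].

    y is the noisy measurement at x; safe_z indicates whether the
    constraint quantity at z exceeds `threshold`.
    """
    var_y = var_x + noise_var
    std_y = math.sqrt(var_y)
    prior_safe = normal_cdf((mu_z - threshold) / math.sqrt(var_z))
    # Gaussian conditioning of the quantity at z on the measurement y.
    gain = cov_xz / var_y
    cond_std = math.sqrt(max(var_z - gain * cov_xz, 1e-12))
    # Trapezoidal rule over y within +/- 8 standard deviations.
    width = 16.0 / (n_grid - 1)
    expected_entropy = 0.0
    for i in range(n_grid):
        t = -8.0 + i * width
        weight = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi) * width
        if i == 0 or i == n_grid - 1:
            weight *= 0.5
        y = mu_x + std_y * t
        cond_mean = mu_z + gain * (y - mu_x)
        p_safe = normal_cdf((cond_mean - threshold) / cond_std)
        expected_entropy += weight * binary_entropy(p_safe)
    return binary_entropy(prior_safe) - expected_entropy


# Hypothetical comparison: a correlated input is more informative about z.
mi_correlated = safety_mutual_information(0.0, 1.0, 0.0, 1.0, 0.9, 0.01, 0.0)
mi_independent = safety_mutual_information(0.0, 1.0, 0.0, 1.0, 0.0, 0.01, 0.0)
```

A controller following this idea would evaluate the score for each candidate current input and prefer the one that is most informative about the safety of further inputs.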
-
-