-
Publication number: US20240037177A1
Publication date: 2024-02-01
Application number: US18022475
Filing date: 2020-09-29
Applicant: NEC Corporation
Inventor: Shinji Ito
IPC: G06F17/11
CPC classification number: G06F17/11
Abstract: In an optimization device, an acquisition means acquires a reward obtained by executing a certain policy. An updating means updates a probability distribution of the policy based on the obtained reward. Here, the updating means uses a weighted sum of the probability distributions updated in the past as a constraint. A determination means determines the policy to be executed, based on the updated probability distributions.
-
Publication number: US20240103812A1
Publication date: 2024-03-28
Application number: US18275121
Filing date: 2021-02-03
Applicant: NEC Corporation
Inventor: Shinji Ito
IPC: G06F7/76
CPC classification number: G06F7/76
Abstract: To enable selection of a useful vector sequence $a_1, a_2, \ldots, a_T$ in a bandit linear optimization algorithm for which a fixed strategy is ineffective, an information processing apparatus (1) includes a vector selection unit (11) that selects a vector $a_t$ in each round $t \in [T]$ ($T$ is any natural number) from a subset $A$ of a $d$-dimensional vector space $\mathbb{R}^d$ ($d$ is any natural number). The vector selection unit (11) uses $l_1, l_2, \ldots, l_T \in \mathbb{R}^d$ as loss vectors to select the vector $a_t$ in each round $t$ such that an asymptotic behavior of an expected value of tracking regret $R(u) = \sum_{t \in [T]} l_t^{\top} a_t - \sum_{t \in [T]} l_t^{\top} u_t$ with respect to any comparative vector sequence $u_1, u_2, \ldots, u_T \in A$, or an asymptotic behavior ignoring logarithmic factors of the expected value of the tracking regret $R(u)$, is constrained from above by a preset function $A(d, T, P)$, where $P$ is a natural number not less than 1 given by $P = |\{t \in [T-1] \mid u_t \neq u_{t+1}\}|$.
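The two quantities defined in this abstract, the tracking regret R(u) and the switch count P, can be computed directly from their definitions. The following is a minimal sketch of those definitions only, not of the selection method itself; the function names `tracking_regret` and `switch_count` are hypothetical and do not appear in the patent.

```python
def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


def tracking_regret(losses, actions, comparators):
    """R(u) = sum_t l_t^T a_t - sum_t l_t^T u_t over rounds t = 1..T.

    losses, actions and comparators are sequences of d-dimensional
    vectors, one per round (hypothetical argument names).
    """
    learner_loss = sum(dot(l, a) for l, a in zip(losses, actions))
    comparator_loss = sum(dot(l, u) for l, u in zip(losses, comparators))
    return learner_loss - comparator_loss


def switch_count(comparators):
    """Number of rounds t in [T-1] with u_t != u_{t+1}."""
    return sum(
        1
        for u, v in zip(comparators, comparators[1:])
        if tuple(u) != tuple(v)
    )


# Small illustrative data: three rounds in R^2.
losses = [(1, 0), (0, 1), (1, 0)]
actions = [(1, 0), (1, 0), (0, 1)]
comparators = [(0, 1), (1, 0), (0, 1)]
regret = tracking_regret(losses, actions, comparators)
switches = switch_count(comparators)
```

A comparator sequence that never changes gives a switch count of 0, whereas the abstract states P as a natural number not less than 1; the sketch returns the raw count and leaves that convention to the caller.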
-
Publication number: US11949809B2
Publication date: 2024-04-02
Application number: US17765139
Filing date: 2019-10-07
Applicant: NEC Corporation
Inventor: Shinji Ito
IPC: H04M15/00, H04L41/0823, H04W24/02
CPC classification number: H04M15/66, H04L41/0823, H04W24/02
Abstract: An optimization apparatus (100) includes a setting unit (110) that sets a predetermined non-linear objective function, a policy determination unit (120) that determines a policy to be executed in online optimization in a bandit problem, based on the non-linear objective function, a policy execution unit (130) that acquires a reward as an execution result of the determined policy, an update rate determination unit (140) that determines an update rate of the non-linear objective function by a multiplicative weight update method, based on the acquired reward and the non-linear objective function, and an update unit (150) that updates the non-linear objective function, based on the update rate.
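For orientation, the multiplicative weight update method named in this abstract is, in its textbook form, an exponential reweighting of the options according to observed reward. The sketch below shows that generic step in an Exp3-style bandit setting, where only the reward of the chosen option is observed. It is an assumption-laden illustration, not the update of a non-linear objective function claimed in the patent; `mwu_step`, `eta`, and the importance-weighted estimate are all choices made here.

```python
import math


def mwu_step(weights, chosen, reward, eta):
    """One textbook multiplicative-weights step under bandit feedback.

    weights: positive weights, one per policy (hypothetical layout)
    chosen:  index of the policy that was executed
    reward:  reward observed for that policy
    eta:     learning rate (hypothetical parameter)
    """
    total = sum(weights)
    probs = [w / total for w in weights]
    # Importance-weighted reward estimate: unbiased when the policy
    # was sampled from `probs`.
    estimate = reward / probs[chosen]
    updated = list(weights)
    # Multiplicative update: only the executed policy's weight changes.
    updated[chosen] = weights[chosen] * math.exp(eta * estimate)
    return updated


new_weights = mwu_step([1.0, 1.0], chosen=0, reward=1.0, eta=0.5)
```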
-
Publication number: US20230214855A1
Publication date: 2023-07-06
Application number: US17927999
Filing date: 2020-05-29
Applicant: NEC Corporation
Inventor: Shinji Ito
IPC: G06Q30/0201
CPC classification number: G06Q30/0201
Abstract: An optimization apparatus includes: a selection unit that selects, as a correction value, an element having a magnitude equal to or smaller than a predetermined value from among convex hulls of a policy set; an acquisition unit that acquires a result of execution of a second policy executed in a second round, the second round being a round a predetermined round before a first round for executing a first policy that is determined from among the policy set; a calculation unit that calculates an estimated value of a loss vector in the execution of the policy based on the result of the execution and the correction value selected in the second round; an update unit that updates a first probability distribution based on the estimated value; and a determination unit that determines a policy for a next round based on the updated first probability distribution.
-
Publication number: US11586951B2
Publication date: 2023-02-21
Application number: US16761071
Filing date: 2018-08-17
Applicant: NEC CORPORATION
Inventor: Shinji Ito, Ryohei Fujimaki
IPC: G06N5/04, G06N20/00, G06Q30/02, G06N5/00, G06Q30/0201, G06N5/045, G06Q30/0202
Abstract: A learning unit 81 generates a plurality of sample groups from samples to be used for learning, and generates a plurality of prediction models while inhibiting overlapping of a sample group to be used for learning among the generated sample groups. An optimization unit 82 generates an objective function based on an explained variable predicted by the prediction model and based on a constraint condition for optimization, and optimizes a generated objective function. An evaluation unit 83 evaluates an optimization result by using a sample group that has not been used in learning of a prediction model used for generating an objective function targeted for the optimization.
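The sample-handling idea in this abstract, learning from one sample group and evaluating with samples that group did not contain, resembles a held-out split. The following is a minimal sketch under that reading; the round-robin split and the names `split_groups` and `train_eval_pairs` are assumptions made for illustration and are not taken from the patent.

```python
def split_groups(samples, k):
    """Split samples into k disjoint groups (round-robin, an assumed scheme)."""
    return [samples[i::k] for i in range(k)]


def train_eval_pairs(groups):
    """Pair each learning group with the samples it did not use.

    Returns a list of (learning_samples, evaluation_samples) tuples,
    so a result derived from one group is evaluated on unseen samples.
    """
    pairs = []
    for i, learning in enumerate(groups):
        held_out = [s for j, g in enumerate(groups) if j != i for s in g]
        pairs.append((learning, held_out))
    return pairs


groups = split_groups(list(range(6)), 3)
pairs = train_eval_pairs(groups)
```

Each pair keeps the learning and evaluation samples disjoint, which is the property the abstract relies on when it evaluates an optimization result with samples not used for learning.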
-