OPTIMIZATION DEVICE, OPTIMIZATION METHOD, AND RECORDING MEDIUM

    Publication number: US20240037177A1

    Publication date: 2024-02-01

    Application number: US18022475

    Filing date: 2020-09-29

    Inventor: Shinji Ito

    CPC classification number: G06F17/11

    Abstract: In an optimization device, an acquisition means acquires a reward obtained by executing a certain policy. An updating means updates a probability distribution of the policy based on the obtained reward. Here, the updating means uses a weighted sum of the probability distributions updated in the past as a constraint. A determination means determines the policy to be executed, based on the updated probability distributions.
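The acquire, update, determine loop named in this abstract can be illustrated with a generic exponential-weights (Exp3-style) bandit. This is only a minimal sketch of that general loop: it does not implement the constraint based on a weighted sum of past distributions that the abstract describes, and the names `exp3_update`, `to_distribution`, `run`, `reward_fn`, and `eta` are hypothetical, not taken from the patent.

```python
import math
import random


def to_distribution(weights):
    """Normalize non-negative weights into a probability distribution."""
    total = sum(weights)
    return [w / total for w in weights]


def exp3_update(weights, probs, arm, reward, eta):
    """Update only the executed policy (arm) with an importance-weighted reward.

    Assumption: rewards lie in [0, 1]; eta is a hypothetical learning rate.
    """
    estimate = reward / probs[arm]  # unbiased estimate of the arm's reward
    new_weights = list(weights)
    new_weights[arm] *= math.exp(eta * estimate)
    return new_weights


def run(reward_fn, n_arms, rounds, eta=0.1, seed=0):
    """Repeat: determine a policy, acquire its reward, update the distribution."""
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    for _ in range(rounds):
        probs = to_distribution(weights)
        arm = rng.choices(range(n_arms), weights=probs)[0]  # determination
        reward = reward_fn(arm)                              # acquisition
        weights = exp3_update(weights, probs, arm, reward, eta)  # updating
    return to_distribution(weights)


if __name__ == "__main__":
    # Hypothetical environment: only policy 0 ever pays a reward.
    final = run(lambda arm: 1.0 if arm == 0 else 0.0, n_arms=3, rounds=200)
    print(final)
```

In this toy environment the distribution shifts toward policy 0, since it is the only one whose weight ever grows.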

    INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM

    Publication number: US20240103812A1

    Publication date: 2024-03-28

    Application number: US18275121

    Filing date: 2021-02-03

    Inventor: Shinji Ito

    CPC classification number: G06F7/76

    Abstract: To enable selection of a useful vector sequence $a_1, a_2, \ldots, a_T$ in a bandit linear optimization algorithm for which a fixed strategy is ineffective, an information processing apparatus (1) includes a vector selection unit (11) that selects a vector $a_t$ in each round $t \in [T]$ ($T$ is any natural number) from a subset $A$ of a $d$-dimensional vector space $\mathbb{R}^d$ ($d$ is any natural number). The vector selection unit (11) uses $l_1, l_2, \ldots, l_T \in \mathbb{R}^d$ as loss vectors to select the vector $a_t$ in each round $t$ such that an asymptotic behavior of an expected value of tracking regret $R(u) = \sum_{t \in [T]} l_t^{\top} a_t - \sum_{t \in [T]} l_t^{\top} u_t$ with respect to any comparative vector sequence $u_1, u_2, \ldots, u_T \in A$, or an asymptotic behavior ignoring logarithmic factors of the expected value of the tracking regret $R(u)$, is constrained from above by a preset function $A(d, T, P)$, where $P$ is a natural number not less than 1 given by $P = |\{t \in [T-1] \mid u_t \neq u_{t+1}\}|$.
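The two quantities defined in this abstract, the tracking regret R(u) and the switch count P, can be computed directly from their definitions. The sketch below evaluates them for given sequences; it is not the claimed selection algorithm, and the function names `tracking_regret` and `switch_count` and the sample data are hypothetical.

```python
def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


def tracking_regret(losses, actions, comparators):
    """R(u) = sum_t l_t^T a_t - sum_t l_t^T u_t over all rounds t."""
    return sum(
        dot(l, a) - dot(l, u)
        for l, a, u in zip(losses, actions, comparators)
    )


def switch_count(comparators):
    """Number of rounds t in [T-1] where u_t differs from u_{t+1}."""
    return sum(1 for u, v in zip(comparators, comparators[1:]) if u != v)


if __name__ == "__main__":
    # Hypothetical data: d = 2, T = 3.
    losses = [[1, 0], [0, 1], [1, 0]]
    actions = [[1, 0], [0, 1], [1, 0]]      # always incurs loss 1
    comparators = [[0, 1], [1, 0], [0, 1]]  # always incurs loss 0
    print(tracking_regret(losses, actions, comparators))  # 3
    print(switch_count(comparators))                      # 2
```

Here the comparator sequence changes twice, so no single fixed vector could match it, which is the situation the abstract describes as one where a fixed strategy is ineffective.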

    OPTIMIZATION APPARATUS, OPTIMIZATION METHOD, AND NON-TRANSITORY COMPUTER READABLE MEDIUM STORING OPTIMIZATION PROGRAM

    Publication number: US20230214855A1

    Publication date: 2023-07-06

    Application number: US17927999

    Filing date: 2020-05-29

    Inventor: Shinji Ito

    CPC classification number: G06Q30/0201

    Abstract: An optimization apparatus includes: a selection unit that selects, as a correction value, an element having a magnitude equal to or smaller than a predetermined value from among convex hulls of a policy set; an acquisition unit that acquires a result of execution of a second policy executed in a second round, the second round being a round a predetermined round before a first round for executing a first policy that is determined from among the policy set; a calculation unit that calculates an estimated value of a loss vector in the execution of the policy based on the result of the execution and the correction value selected in the second round; an update unit that updates a first probability distribution based on the estimated value; and a determination unit that determines a policy for a next round based on the updated first probability distribution.
