OPERATION RULE DETERMINATION DEVICE, OPERATION RULE DETERMINATION METHOD, AND RECORDING MEDIUM

    Publication No.: US20240345547A1

    Publication date: 2024-10-17

    Application No.: US18294590

    Filing date: 2021-08-23

    CPC classification number: G05B13/0265 G05B13/042

    Abstract: An operation rule determination device includes: an evaluation function setting unit that sets a second evaluation function that has been altered from a first evaluation function in which a condition relating to operation of a controlled object is reflected, such that a difference in an evaluation function between time steps of evaluation relating to the operation of the controlled object is reduced; and a learning unit that performs learning on an operation rule of the controlled object using the second evaluation function, and performs learning on the operation rule of the controlled object using a learning result and the first evaluation function.
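
The two-stage idea in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the blending rule in `soften_evaluations`, the `alpha` parameter, and the `train` callable are all hypothetical names introduced here, and the abstract does not say how the second evaluation function is actually derived from the first.

```python
def soften_evaluations(first_values, alpha):
    """Build 'second' evaluation values by pulling each value toward the mean.

    Assumption: alpha in [0, 1) scales deviations from the mean, so the
    difference between consecutive time steps shrinks by the same factor.
    """
    mean = sum(first_values) / len(first_values)
    return [mean + alpha * (v - mean) for v in first_values]


def max_step_difference(values):
    """Largest absolute change in evaluation between consecutive time steps."""
    return max(abs(b - a) for a, b in zip(values, values[1:]))


def two_stage_learning(train, initial_rule, first_eval, second_eval):
    """Learn with the softened evaluation first, then continue with the original.

    `train` is a hypothetical callable: train(rule, evaluation) -> new rule.
    """
    rule = train(initial_rule, second_eval)
    return train(rule, first_eval)
```

With `first_values = [0.0, 0.0, -10.0, 0.0]` and `alpha = 0.5`, the largest step-to-step difference drops from 10.0 to 5.0, which is the kind of reduction the abstract describes for the second evaluation function.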

    PARAMETER LEARNING APPARATUS, PARAMETER LEARNING METHOD, AND COMPUTER READABLE RECORDING MEDIUM

    Publication No.: US20220222442A1

    Publication date: 2022-07-14

    Application No.: US17614646

    Filing date: 2019-05-31

    Abstract: A parameter learning apparatus 100 extracts one entity in a document and a related text representation as a one-term document fact, outputs a one-term partial predicate fact including only the one entity using a predicate fact that includes entities and a predicate, calculates a first one-term score indicating the degree of establishment of the one-term document fact using a one-term partial predicate feature vector, a one-term text representation feature vector, and a one-term entity feature vector that are calculated from parameters, calculates a second one-term score with respect to a combination of one entity and a predicate or a text representation that is not extracted as the one-term partial predicate fact, updates the parameters such that the first one-term score is higher than the second one-term score, and calculates a score indicating the degree of establishment of the predicate fact and a score indicating the degree of establishment of a combination of entities and a predicate that is not obtained as the predicate fact using these scores.

    ARITHMETIC APPARATUS, ACTION DETERMINATION METHOD, AND NON-TRANSITORY COMPUTER READABLE MEDIUM STORING CONTROL PROGRAM

    Publication No.: US20220027708A1

    Publication date: 2022-01-27

    Application No.: US17311752

    Filing date: 2018-12-13

    Abstract: In an arithmetic apparatus (10), a prediction state determination unit (11) determines a plurality of prediction states for each of a plurality of candidate actions that can be executed in a first state by using a plurality of transition information units. A degree of variation calculation unit (12) calculates degrees of variation of the plurality of prediction states determined for each of the plurality of candidate actions by the prediction state determination unit (11). A candidate action selection unit (13) selects some of the candidate actions among the aforementioned plurality of candidate actions based on the plurality of degrees of variation calculated by the degree of variation calculation unit (12).
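
A rough sketch of the selection step described above, under stated assumptions: states are scalars, each "transition information unit" is modeled as a hypothetical callable `model(state, action)`, the degree of variation is the population variance, and the actions with the lowest variation are kept. The abstract only says that selection is based on the degrees of variation, so keeping the lowest is an illustrative choice.

```python
from statistics import pvariance


def select_low_variation_actions(state, candidate_actions, transition_models, keep):
    """Rank candidate actions by the spread of their predicted next states."""
    scored = []
    for action in candidate_actions:
        # One predicted next state per transition model (assumed scalar).
        predictions = [model(state, action) for model in transition_models]
        scored.append((pvariance(predictions), action))
    # Sort on the variation only; ties keep their original order.
    scored.sort(key=lambda pair: pair[0])
    return [action for _, action in scored[:keep]]
```

An exploration-oriented variant could instead keep the actions with the highest variation; only the sort direction would change.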

    PARAMETER LEARNING APPARATUS, PARAMETER LEARNING METHOD, AND COMPUTER READABLE RECORDING MEDIUM

    Publication No.: US20220245183A1

    Publication date: 2022-08-04

    Application No.: US17613549

    Filing date: 2019-05-31

    Abstract: An entity combination and a text representation are obtained as a first fact, and the entity combination and a related predicate are obtained as a second fact. Word distributed representations are input to a neural network, and real vectors at appearance positions of entities are specified and used as distributed representations. A first score indicating a degree of establishment of the first fact is calculated based on the distributed representations and on entity distributed representations. A second score indicating a degree of establishment is calculated with respect to an entity combination and a text representation that are not the first fact. A third score indicating a degree of establishment of the second fact is calculated based on predicate distributed representations and on entity distributed representations. A fourth score indicating a degree of establishment is also calculated with respect to an entity combination and a predicate that are not the second fact. The entity distributed representations, the predicate distributed representations, or weight parameters are updated by a gradient method, so that the first score becomes higher than one of the second score and the fourth score, and the third score becomes higher than one of the second score and the fourth score.
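
One common way to express "the score of an observed fact should exceed the score of an unobserved one" is a margin ranking (hinge) loss. The sketch below is an assumption for illustration: the abstract names a gradient method but does not state the loss, and `margin` is a hypothetical parameter.

```python
def margin_ranking_loss(positive_score, negative_score, margin=1.0):
    """Zero when the observed fact outscores the unobserved one by at least `margin`."""
    return max(0.0, margin - positive_score + negative_score)
```

Minimizing this loss by gradient descent over the representations and weights pushes the first and third scores above the second and fourth scores, in the sense described above.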

    OPERATION RULE DETERMINATION DEVICE, OPERATION RULE DETERMINATION METHOD, AND RECORDING MEDIUM

    Publication No.: US20220197230A1

    Publication date: 2022-06-23

    Application No.: US17611694

    Filing date: 2019-05-22

    Abstract: An operation rule determination device includes an environment execution unit that obtains a state of a control target after each operation and the degree associated with the state for a series of operations on the control target, by using degree information in which the state and the degree of desirability of the state are associated with each other, and a risk-considered history generation unit that calculates a cumulative degree obtained by accumulating the obtained degree for the series of operations, and, when the cumulative degree satisfies a condition, reduces the degree associated with the state after the series of operations in the degree information.
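
The history-rewriting step can be sketched as below. Assumptions made for illustration: the degree information is a dictionary from state to desirability, the condition is "cumulative degree below a threshold", and the reduction is a fixed `penalty`. None of these specifics are given in the abstract.

```python
def apply_risk_penalty(degree_info, visited_states, threshold, penalty):
    """Accumulate degrees over a series of operations and penalize a bad ending.

    Returns the cumulative degree and an updated copy of the degree information.
    """
    cumulative = sum(degree_info[state] for state in visited_states)
    updated = dict(degree_info)  # leave the caller's mapping untouched
    if cumulative < threshold:
        final_state = visited_states[-1]
        # Reduce the degree tied to the state reached after the series.
        updated[final_state] = degree_info[final_state] - penalty
    return cumulative, updated
```

For example, with degrees `{"a": 1.0, "b": -3.0}` and the visited sequence `["a", "b"]`, the cumulative degree is -2.0; with a threshold of 0.0 and a penalty of 1.0, the degree of `"b"` becomes -4.0.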

    LEARNING DEVICE, LEARNING METHOD, CONTROL SYSTEM, AND RECORDING MEDIUM

    Publication No.: US20240394554A1

    Publication date: 2024-11-28

    Application No.: US18695021

    Filing date: 2021-10-04

    Inventor: Takuya HIRAOKA

    Abstract: A learning device calculates each of a plurality of second evaluation values that include noise using a plurality of evaluation models, each of which calculates, on the basis of both a second state resulting from a first action performed by a control target in a first state, and a second action calculated from the second state using a policy model, a second evaluation value obtained by including noise in an index value indicating the result of evaluating the second action in the second state; and updates the policy model or the parameters of the policy model on the basis of the smallest of the plurality of second evaluation values and a first evaluation value, which is an index value indicating the result of evaluating the first action in the first state.
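
A minimal sketch of taking the smallest of several noisy evaluation values, assuming Gaussian noise and hypothetical callables `model(state, action)` for the evaluation models. The temporal-difference form in `td_error`, including `reward` and `discount`, is an assumption borrowed from standard reinforcement learning practice; the abstract does not state how the smallest second evaluation value and the first evaluation value are combined.

```python
import random


def min_noisy_evaluation(evaluation_models, next_state, next_action, noise_std, rng):
    """Evaluate (next_state, next_action) with every model, add noise, keep the minimum."""
    values = [
        model(next_state, next_action) + rng.gauss(0.0, noise_std)
        for model in evaluation_models
    ]
    return min(values)


def td_error(first_value, reward, discount, smallest_second_value):
    """Assumed update signal: pessimistic target minus the first evaluation value."""
    return reward + discount * smallest_second_value - first_value
```

Taking the minimum over several estimates is a familiar way to counter overestimation of action values; whether the patent uses it for that reason is not stated in the abstract.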

    PROCESSING DEVICE, PROCESSING METHOD, AND NON-TRANSITORY COMPUTER READABLE MEDIUM STORING PROGRAM

    Publication No.: US20240013028A1

    Publication date: 2024-01-11

    Application No.: US18027299

    Filing date: 2020-10-06

    CPC classification number: G06N3/042

    Abstract: A processing device includes: an entity pre-processing means for receiving a set of atoms, each atom indicating a combination of a predicate and an associative array of entities that are arguments of the predicate, and calculating, for each of the entities, an entity feature vector reflecting a correspondence between keys and the entity in the associative array; and a post-processing means for receiving a query indicating contents of processing, and executing the processing indicated by the query using the entity feature vector.

    PARAMETER CALCULATING DEVICE, PARAMETER CALCULATING METHOD, AND RECORDING MEDIUM HAVING PARAMETER CALCULATING PROGRAM RECORDED THEREON

    Publication No.: US20210065056A1

    Publication date: 2021-03-04

    Application No.: US16961121

    Filing date: 2018-01-10

    Inventor: Takuya HIRAOKA

    Abstract: Provided is a parameter calculating device that takes human prior knowledge into account. The parameter calculating device according to the present invention is provided with: an identifying means that identifies intermediate states from a certain state to a target state and rewards concerning the intermediate states on the basis of a plurality of states concerning a target system, associated information by which two states among the plurality of states are associated with each other, rewards concerning at least some of the states, model information including parameters representing the states of the target system, and given ranges concerning the parameters; and a parameter calculating means that calculates the values of the parameters in the case where the identified rewards and the degrees of the differences between the values of the parameters and the given ranges satisfy predetermined conditions.

    DIALOG APPARATUS, DIALOG SYSTEM, AND COMPUTER-READABLE RECORDING MEDIUM

    Publication No.: US20200050669A1

    Publication date: 2020-02-13

    Application No.: US16492664

    Filing date: 2017-03-13

    Inventor: Takuya HIRAOKA

    Abstract: A dialog apparatus 100 is an apparatus for responding to a dialog act of a user. The dialog apparatus 100 is provided with: a policy unit 40 configured to set a score to each of response candidates included in a set of response candidates based on the state of a dialog being performed with the user and a policy parameter, and referring to the set scores, to select one of the response candidates as a dialog act of the dialog apparatus 100; and a policy parameter updating unit 60 configured to obtain a reward in the state of the dialog using a reward function that, as the reward, returns an evaluation of a behavior performed in a specific circumstance as a quantitatively represented numeric value, and to update the policy parameter based on the obtained reward.
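
The policy unit and the policy parameter updating unit can be sketched as below. Assumptions made for illustration: the dialog state is a feature vector, the policy parameter holds one weight vector per response candidate, scores are linear, selection is greedy, and the update simply scales the chosen candidate's weights by the reward. The abstract does not specify any of these.

```python
def score_candidates(state_features, policy_params):
    """Score each response candidate as a dot product of its weights with the state."""
    return {
        candidate: sum(w * x for w, x in zip(weights, state_features))
        for candidate, weights in policy_params.items()
    }


def select_response(scores):
    """Pick the highest-scoring response candidate as the dialog act."""
    return max(scores, key=scores.get)


def update_policy(policy_params, chosen, state_features, reward, learning_rate):
    """Move the chosen candidate's weights along the state, scaled by the reward."""
    policy_params[chosen] = [
        w + learning_rate * reward * x
        for w, x in zip(policy_params[chosen], state_features)
    ]
```

A negative reward lowers the chosen candidate's score in similar dialog states, so a different response candidate may be selected the next time that state occurs.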
