METHOD FOR TRAINING DECISION-MAKING MODEL PARAMETER, DECISION DETERMINATION METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM

    Publication (Announcement) No.: US20230032324A1

    Publication (Announcement) Date: 2023-02-02

    Application No.: US17966127

    Filing Date: 2022-10-14

    Abstract: A method for training a decision-making model parameter, a decision determination method, an electronic device, and a non-transitory computer-readable storage medium are provided. In the method, a perturbation parameter is generated according to a meta-parameter, and first observation information of a primary training environment is acquired based on the perturbation parameter. According to the first observation information, an evaluation parameter of the perturbation parameter is determined. According to the perturbation parameter and the evaluation parameter thereof, an updated meta-parameter is generated. The updated meta-parameter is determined as a target meta-parameter, when it is determined, according to the meta-parameter and the updated meta-parameter, that a condition for stopping primary training is met. According to the target meta-parameter, a target memory parameter corresponding to a secondary training task is determined, where the target memory parameter and the target meta-parameter are used to make a decision corresponding to a prediction task.
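
The primary training loop in this abstract (perturb the meta-parameter, evaluate each perturbation, update, check a stopping condition) can be illustrated with a minimal sketch. The abstract does not say how perturbations are drawn, how the update is computed, or what the stopping condition is, so everything below is an assumption: Gaussian perturbations, an evolution-strategies-style update, and a stop test on the distance between the old and updated meta-parameter. The names `train_meta_parameter`, `evaluate`, `sigma`, `lr`, `pop`, and `tol` are hypothetical, and the secondary training step (the memory parameter) is not covered.

```python
import numpy as np


def train_meta_parameter(evaluate, meta, sigma=0.1, lr=0.05, pop=20,
                         tol=1e-4, max_iters=500, seed=0):
    """Evolution-strategies-style sketch of the primary training loop.

    evaluate: hypothetical callable mapping a perturbation parameter to a
              scalar evaluation (higher is better). It stands in for
              "acquire observation information, then score it".
    """
    rng = np.random.default_rng(seed)
    meta = np.asarray(meta, dtype=float)
    for _ in range(max_iters):
        # Assumption: perturbation parameters are Gaussian samples around meta.
        noise = rng.standard_normal((pop, meta.size))
        perturbed = meta + sigma * noise
        # Evaluation parameter of each perturbation parameter.
        scores = np.array([evaluate(p) for p in perturbed])
        # Subtract the mean score as a baseline to reduce variance.
        advantage = scores - scores.mean()
        # Assumption: the update follows a score-weighted sum of the noise.
        updated = meta + lr / (pop * sigma) * (noise.T @ advantage)
        # Assumption: stop when the meta-parameter has essentially stopped moving.
        if np.linalg.norm(updated - meta) < tol:
            return updated
        meta = updated
    return meta
```

For example, with a toy evaluation such as `lambda p: -np.sum((p - target) ** 2)`, the returned meta-parameter should settle close to `target`.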

    MODEL TRAINING
    12.
    Invention Application

    Publication (Announcement) No.: US20220198153A1

    Publication (Announcement) Date: 2022-06-23

    Application No.: US17694034

    Filing Date: 2022-03-14

    Abstract: A model training method, a model training platform, an electronic device and a storage medium are provided, which can be used in the field of artificial intelligence, particularly the fields of natural language processing and deep learning. The model training method includes: receiving an input; determining, based on the input, a user-oriented prefabricated function; determining, based on the input, a model training function; determining, based on the input, a pre-trained model; determining, based on the input, a network structure associated with the pre-trained model so as to support use of the pre-trained model; training, based on the input, the model by using the prefabricated function, the model training function, and the pre-trained model; and providing an output associated with a trained model.

    DIALOGUE MODEL TRAINING METHOD
    15.
    Invention Application

    Publication (Announcement) No.: US20240412002A1

    Publication (Announcement) Date: 2024-12-12

    Application No.: US18747641

    Filing Date: 2024-06-19

    Abstract: A method is provided. The method includes: obtaining a first sample dataset; inputting at least one first question text corresponding to at least one piece of first sample data into a dialog model separately to obtain at least one first answer prediction result; inputting each second question text into the dialog model to obtain a second answer prediction result output by the dialog model; inputting the second answer prediction result into a reward model to obtain a score of the second answer prediction result output by the reward model; determining a comprehensive loss based on the at least one first answer prediction result, a first answer text of each of the at least one piece of first sample data, and a score corresponding to each of at least one piece of second sample data; and adjusting at least one parameter of the dialog model based on the comprehensive loss.
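
One way to read the "comprehensive loss" is as a supervised term on the first sample data combined with a reward term on the second sample data. The abstract does not give the formula, so the sketch below is only an assumption: the supervised term is the mean of per-sample negative log-likelihoods (already computed from the first answer prediction results and the first answer texts), the reward term is the negated mean reward-model score, and the two are mixed by a hypothetical `reward_weight`. The function name `comprehensive_loss` and its parameters are illustrative, not taken from the patent.

```python
def comprehensive_loss(first_nll, second_scores, reward_weight=1.0):
    """Combine a supervised loss with a reward-model signal (assumed form).

    first_nll:     per-sample negative log-likelihoods of the first answer
                   texts under the dialog model (hypothetical input).
    second_scores: reward-model scores for the second answer predictions.
    reward_weight: hypothetical mixing coefficient between the two terms.
    """
    if not first_nll or not second_scores:
        raise ValueError("both sample sets must be non-empty")
    # Supervised term: average negative log-likelihood on the first samples.
    supervised = sum(first_nll) / len(first_nll)
    # Reward term: a higher score should lower the loss, so negate the mean.
    reward_term = -sum(second_scores) / len(second_scores)
    return supervised + reward_weight * reward_term
```

In a real training setup the same combination would be computed on differentiable tensors so that the dialog model's parameters can be adjusted from the resulting loss.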
