TASK EXECUTION METHOD FOR LARGE MODEL, DEVICE, AND MEDIUM

    Publication No.: US20250094792A1

    Publication date: 2025-03-20

    Application No.: US18968790

    Application date: 2024-12-04

    Abstract: A task execution method for a large model, an electronic device, and a storage medium are provided, which relate to the field of artificial intelligence technology, particularly to deep learning technology and large model technology. The method includes: executing a modality routing task by using a target computing unit based on a target feature to be processed, to obtain a modality recognition result; executing a field routing task by using the target computing unit based on the target feature to be processed and a target field gating model parameter, to obtain a field recognition result; and executing a feedforward task by using the target computing unit based on the target feature to be processed and a target feedforward task model parameter, to obtain a task execution result.
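The three steps in this abstract can be pictured with a minimal sketch. Everything below is an illustrative assumption rather than the claimed method: the abstract does not say how the target field gating parameter or the target feedforward parameter is chosen, so the sketch assumes the modality result selects the field gate and the field result selects the feedforward weights. The names `execute_task`, `modality_gate`, `field_gates`, and `ffn_weights` are hypothetical, and the ReLU layer stands in for an unspecified feedforward task.

```python
def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def execute_task(feature, modality_gate, field_gates, ffn_weights):
    # Step 1: modality routing on the feature to be processed.
    modality = argmax(matvec(modality_gate, feature))
    # Step 2: field routing with the gate assumed to belong to that modality.
    field = argmax(matvec(field_gates[modality], feature))
    # Step 3: feedforward with the weights assumed to belong to that field.
    hidden = matvec(ffn_weights[modality][field], feature)
    output = [max(0.0, v) for v in hidden]  # ReLU, an assumed activation
    return modality, field, output


# Hypothetical toy parameters: 2 modalities, 2 fields each, 2-dim features.
identity = [[1.0, 0.0], [0.0, 1.0]]
modality_gate = [[0.0, 1.0], [1.0, 0.0]]
field_gates = [identity, identity]
ffn_weights = [[identity, identity], [[[2.0, 0.0], [-1.0, 0.0]], identity]]

result = execute_task([1.0, 0.0], modality_gate, field_gates, ffn_weights)
```

All three steps read the same feature and run in one function, loosely mirroring the abstract's use of a single target computing unit for every task.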

    METHOD OF TRAINING DEEP LEARNING MODEL AND METHOD OF PROCESSING NATURAL LANGUAGE

    Publication No.: US20230047980A1

    Publication date: 2023-02-16

    Application No.: US17976049

    Application date: 2022-10-28

    Abstract: A method of training a deep learning model, a method of processing a natural language, an electronic device, and a storage medium are provided, which relate to the field of artificial intelligence, in particular to deep learning technology and natural language processing technology. The method includes: inputting first sample data into a first deep learning model to obtain a first output result; training the first deep learning model according to the first output result and a first target output result, where the first target output result is obtained by processing the first sample data using a reference deep learning model; inputting second sample data into a second deep learning model to obtain a second output result; and training the second deep learning model according to the second output result and a second target output result, to obtain a trained second deep learning model.
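Training one model against targets produced by a reference model is commonly done with a soft-target loss. The sketch below shows one such loss, cross-entropy between the softmax of the reference outputs and the softmax of the trained model's outputs. This is an assumption for illustration: the abstract does not name the loss, and `distillation_loss` and the logit values are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(model_logits, reference_logits):
    """Cross-entropy of the model's distribution against the reference model's.

    The reference distribution plays the role of the target output result;
    the loss is smallest when the two distributions coincide.
    """
    target = softmax(reference_logits)
    predicted = softmax(model_logits)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# Hypothetical logits: a model that agrees with the reference scores lower.
agree = distillation_loss([2.0, 0.0], [2.0, 0.0])
disagree = distillation_loss([0.0, 2.0], [2.0, 0.0])
```

Minimizing this loss over the first sample data pulls the first model's outputs toward the reference model's outputs, which is the role the abstract gives to the first target output result.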

    DATA PROCESSING
    Type: Invention application

    Publication No.: US20250028958A1

    Publication date: 2025-01-23

    Application No.: US18908380

    Application date: 2024-10-07

    Abstract: A data processing method, a data processing model, and a training method for the model are provided, which relate to the field of artificial intelligence, specifically to natural language processing, deep learning technologies, and large model technologies. An implementation solution includes: determining input data, where the input data includes a plurality of tokens; determining a correlation between each of the plurality of tokens and each of a plurality of expert networks based on a gating matrix, where the plurality of expert networks are used to reinforce the plurality of tokens; allocating the plurality of tokens to the plurality of expert networks in a uniform manner based on the correlation and a preset capacity of each expert network, to reinforce the plurality of tokens; and determining a data processing result based on the plurality of reinforced tokens.
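The allocation step can be illustrated with a small sketch. It assumes the correlation is a dot product between each token and a column of the gating matrix, and it uses a greedy rule (highest correlation first, skipping experts that are full). The abstract specifies neither choice, so both are stand-ins, and `route_tokens` is a hypothetical name.

```python
def route_tokens(tokens, gating, capacity):
    """Assign each token to one expert without exceeding a per-expert capacity.

    tokens:   list of d-dimensional vectors
    gating:   d x E matrix (list of d rows, one column per expert)
    capacity: preset maximum number of tokens per expert
    """
    num_experts = len(gating[0])
    # scores[t][e] is the assumed correlation between token t and expert e.
    scores = [
        [sum(x * gating[i][e] for i, x in enumerate(tok)) for e in range(num_experts)]
        for tok in tokens
    ]
    # Visit (token, expert) pairs from the highest correlation to the lowest.
    pairs = sorted(
        ((scores[t][e], t, e) for t in range(len(tokens)) for e in range(num_experts)),
        key=lambda p: -p[0],
    )
    load = [0] * num_experts
    assignment = {}
    for _, t, e in pairs:
        if t not in assignment and load[e] < capacity:
            assignment[t] = e
            load[e] += 1
    return assignment, load


# Three tokens prefer expert 0, but its capacity of 2 forces one to expert 1.
tokens = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
gating = [[1.0, 0.0], [0.0, 1.0]]
assignment, load = route_tokens(tokens, gating, capacity=2)
```

In this toy input the capacity limit is what evens out the load: the last token falls back to its second-choice expert once its first choice is full. Tokens are left unassigned if the total capacity is smaller than the number of tokens.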
