TASK EXECUTION METHOD FOR LARGE MODEL, DEVICE, AND MEDIUM

    Publication No.: US20250094792A1

    Publication date: 2025-03-20

    Application No.: US18968790

    Filing date: 2024-12-04

    Abstract: A task execution method for a large model, an electronic device, and a storage medium are provided, which relate to the field of artificial intelligence technology, particularly to deep learning technology and large model technology. The method includes: executing a modality routing task by using a target computing unit based on a target feature to be processed to obtain a modality recognition result; executing a field routing task by using the target computing unit based on the target feature to be processed and a target field gating model parameter to obtain a field recognition result; and executing a feedforward task by using the target computing unit based on the target feature to be processed and a target feedforward task model parameter to obtain a task execution result.

    DATA PROCESSING
    Invention application

    Publication No.: US20250028958A1

    Publication date: 2025-01-23

    Application No.: US18908380

    Filing date: 2024-10-07

    Abstract: A data processing method, and a data processing model and a training method therefor are provided, and relate to the field of artificial intelligence, and specifically, to natural language processing, deep learning technologies, and large model technologies. An implementation solution includes: determining input data, where the input data includes a plurality of tokens; determining a correlation between each of the plurality of tokens and each of a plurality of expert networks based on a gating matrix, where the plurality of expert networks are used to reinforce the plurality of tokens; allocating the plurality of tokens to the plurality of expert networks in a uniform manner based on the correlation and a preset capacity of each expert network, to reinforce the plurality of tokens; and determining a data processing result based on the plurality of reinforced tokens.
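
    The scoring and allocation steps named in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the dot-product scoring, the greedy highest-score-first allocation, and the names `correlation_scores` and `allocate_uniformly` are all assumptions made for illustration. The abstract states only that a gating matrix yields token-to-expert correlations and that each expert has a preset capacity.

```python
from typing import List


def correlation_scores(tokens: List[List[float]],
                       gating: List[List[float]]) -> List[List[float]]:
    """Score every token against every expert.

    Assumption: the correlation is the product tokens x gating, where
    gating has shape (feature_dim, num_experts).
    """
    num_experts = len(gating[0])
    return [
        [sum(x * gating[d][e] for d, x in enumerate(tok))
         for e in range(num_experts)]
        for tok in tokens
    ]


def allocate_uniformly(scores: List[List[float]], capacity: int) -> List[int]:
    """Assign each token to one expert without exceeding any expert's capacity.

    Assumption: a greedy pass over (token, expert) pairs in descending
    score order. A token that finds its preferred expert full falls
    through to its next-best expert that still has room.
    """
    num_tokens, num_experts = len(scores), len(scores[0])
    if capacity * num_experts < num_tokens:
        raise ValueError("total expert capacity is smaller than the token count")

    pairs = sorted(
        ((scores[t][e], t, e)
         for t in range(num_tokens) for e in range(num_experts)),
        key=lambda p: (-p[0], p[1], p[2]),  # best score first, stable tie-break
    )
    assignment = [-1] * num_tokens
    load = [0] * num_experts
    for _, t, e in pairs:
        if assignment[t] == -1 and load[e] < capacity:
            assignment[t] = e
            load[e] += 1
    return assignment


# Hypothetical data: four 2-dimensional tokens, two experts, capacity 2 each.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.3], [0.1, 1.0]]
gating = [[1.0, 0.0], [0.0, 1.0]]
assignment = allocate_uniformly(correlation_scores(tokens, gating), capacity=2)
print(assignment)
```

    In this example three tokens score highest on expert 0, but its capacity of 2 forces the third token onto expert 1, so both experts end up with two tokens each. The reinforcement of tokens by the experts and the final result computation are outside the scope of this sketch.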
