SYSTEM AND METHOD FOR CROSS-MODAL INTERACTION BASED ON PRE-TRAINED MODEL

    Publication number: US20240070436A1

    Publication date: 2024-02-29

    Application number: US17900592

    Filing date: 2022-08-31

    CPC classification number: G06N3/0454 G06F40/284

    Abstract: A method is provided for data processing performed by a processing system. The method comprises determining a set of first tokens for first data and a set of second tokens for second data, each token comprising information associated with a segment of the respective data, determining pair-wise similarities between the set of first tokens and the set of second tokens, each pair comprising a first token in the set of first tokens and a second token in the set of second tokens, determining, for each first token in the set of first tokens, a maximum similarity based on the determined pair-wise similarities between the respective first token and the second tokens in the set of second tokens, and determining a first similarity between the first data and the second data by aggregating the maximum similarities corresponding to the first tokens in the set of first tokens.
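
The token-level matching described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the claimed implementation: the abstract does not say which similarity measure or aggregation is used, so cosine similarity and a simple mean are assumptions here, and the function names `cosine` and `max_sim_score` are hypothetical. Token vectors are assumed to be non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def max_sim_score(first_tokens, second_tokens):
    """Similarity of first data to second data via per-token maxima.

    For each first token, take the maximum pair-wise similarity over all
    second tokens, then aggregate the maxima (here: a mean, by assumption).
    """
    per_token_max = [
        max(cosine(f, s) for s in second_tokens) for f in first_tokens
    ]
    return sum(per_token_max) / len(per_token_max)
```

The score is directional: it is computed from the first tokens toward the second tokens, so swapping the arguments can give a different value.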

    DATA PROCESSING METHOD AND RELATED DEVICE
    Published patent application

    Publication number: US20240119268A1

    Publication date: 2024-04-11

    Application number: US18524523

    Filing date: 2023-11-30

    CPC classification number: G06N3/048

    Abstract: This disclosure relates to the field of artificial intelligence, and discloses a data processing method. The method includes: obtaining a transformer model including a target network layer and a target module; and processing to-be-processed data by using the transformer model, to obtain a data processing result. The target module is configured to: perform a target operation on a feature map output at the target network layer, to obtain an operation result, and fuse the operation result and the feature map output, to obtain an updated feature map output. In this disclosure, the target module is inserted into the transformer model, and the operation result generated by the target module and an input are fused, so that information carried in a feature map output by the target network layer of the transformer model is increased.
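
The "operate, then fuse" pattern in this abstract can be sketched as below. The abstract names neither the target operation nor the fusion rule, so both are assumptions: the hypothetical `target_operation` averages each position with its predecessor, and `fuse` uses element-wise addition. The feature map is modeled as a plain list of rows, one row per token.

```python
def target_operation(feature_map):
    """Hypothetical target operation: average each row with the previous row.

    The first row has no predecessor, so it is averaged with itself.
    """
    result = []
    for i, row in enumerate(feature_map):
        prev = feature_map[i - 1] if i > 0 else row
        result.append([(a + b) / 2 for a, b in zip(prev, row)])
    return result


def fuse(feature_map, operation_result):
    """Assumed fusion rule: element-wise addition of the two maps."""
    return [
        [x + y for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(feature_map, operation_result)
    ]


def target_module(feature_map):
    """Apply the target operation to a layer's output and fuse it back in."""
    return fuse(feature_map, target_operation(feature_map))
```

Because the fused output has the same shape as the input, a module of this kind could in principle be inserted after a layer without changing what later layers expect.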

    MODEL DISTILLATION METHOD AND RELATED DEVICE

    Publication number: US20240185086A1

    Publication date: 2024-06-06

    Application number: US18443052

    Filing date: 2024-02-15

    CPC classification number: G06N3/096 G06N3/045

    Abstract: This disclosure relates to the field of artificial intelligence, and provides model distillation methods and apparatuses. In an implementation, a method includes: obtaining first input data and second input data from a second computing node, wherein the first input data is output data of a third sub-model, and the second input data is output data processed by a fourth sub-model; processing the first input data by using a first sub-model, to obtain a first intermediate output; processing the second input data by using a second sub-model, to obtain a second intermediate output, wherein the first intermediate output and the second intermediate output are used to determine a first gradient; and distilling the first sub-model based on the first gradient, to obtain an updated first sub-model.

    MODEL COMPRESSION METHOD AND APPARATUS
    Published patent application

    Publication number: US20230229912A1

    Publication date: 2023-07-20

    Application number: US18123768

    Filing date: 2023-03-20

    CPC classification number: G06N3/08

    Abstract: A model compression method is provided, which can be applied to the field of artificial intelligence. The method includes: obtaining a first neural network model, a second neural network model, and a third neural network model; processing first to-be-processed data using the first neural network model, to obtain a first output; processing the first to-be-processed data using the third neural network model, to obtain a second output; determining a first target loss based on the first output and the second output, and updating the second neural network model based on the first target loss, to obtain an updated second neural network model; and compressing the updated second neural network model to obtain a target neural network model. The model generated based on the method has higher processing precision.

    DATA PROCESSING METHOD AND RELATED DEVICE

    Publication number: US20220383078A1

    Publication date: 2022-12-01

    Application number: US17882895

    Filing date: 2022-08-08

    Abstract: In a data processing method, a processing device obtains a first neural network model and an available resource state of a terminal device, and determines a second neural network model based on the first neural network model and the available resource state. An appropriate model size is determined based on the available resource state, and a part of the first neural network model is selected, based on the determined model size, as the second neural network model on which data processing is to be performed.
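
The resource-aware selection in this abstract might look like the sketch below. All specifics are assumptions for illustration: the abstract does not say how the resource state is measured or which part of the model is selected, so the resource state is reduced to available memory in MB, candidate sizes are given as a hypothetical list of dictionaries, and the sub-model is simply the first `num_layers` layers of the first model.

```python
def choose_model_size(available_memory_mb, candidates):
    """Pick the largest candidate size that fits the available memory.

    `candidates` is a hypothetical list of dicts with keys
    "num_layers" and "memory_mb".
    """
    fitting = [c for c in candidates if c["memory_mb"] <= available_memory_mb]
    if not fitting:
        raise ValueError("no candidate model size fits the available memory")
    return max(fitting, key=lambda c: c["memory_mb"])


def select_sub_model(first_model_layers, size):
    """Select part of the first model as the second model.

    Assumption: keep the leading layers, up to the chosen layer count.
    """
    return first_model_layers[: size["num_layers"]]
```

For example, with candidates costing 100, 200, and 400 MB and 250 MB available, the 200 MB candidate is chosen and only its layer count is kept from the first model.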
