MULTIMODAL DATA GENERATION
    Invention Application

    Publication No.: US20250094713A1

    Publication Date: 2025-03-20

    Application No.: US18967529

    Filing Date: 2024-12-03

    Abstract: A multimodal data generation method is provided. The method includes: inputting a query data sequence into a multimodal model, to obtain a plurality of tokens in a response data sequence, where a current token is generated through the following operations: inputting the query data sequence and a current response data sequence into the multimodal model, so that the multimodal model generates the current token based on the query data sequence and the current response data sequence, in response to determining that the current token belongs to a first data modality; or inputting the query data sequence and a current response data sequence into the multimodal model, so that the multimodal model denoises an initial token sequence based on the query data sequence and the current response data sequence, to generate a result token sequence, in response to determining that the current token belongs to a second data modality.
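The claimed loop can be sketched as follows. This is a toy illustration of the abstract only, not the patented implementation; `predict_modality`, `next_token`, and `denoise` are hypothetical stand-ins.

```python
import random

def predict_modality(query, response):
    # Hypothetical modality classifier: toy rule that emits one
    # second-modality (e.g. image) block after two first-modality tokens.
    return "second" if len(response) == 2 else "first"

def next_token(query, response):
    # Stand-in for autoregressive generation of a single first-modality token.
    return f"txt{len(response)}"

def denoise(query, response, initial_tokens):
    # Stand-in for denoising an initial token sequence into a result
    # token sequence (e.g. diffusion-style image token generation).
    return [f"img{i}" for i in range(len(initial_tokens))]

def generate_response(query_tokens, max_len=8):
    """Per-token branch between autoregressive generation (first data
    modality) and denoising a whole token block (second data modality)."""
    response = []
    while len(response) < max_len:
        if predict_modality(query_tokens, response) == "first":
            response.append(next_token(query_tokens, response))
        else:
            initial = [random.random() for _ in range(4)]
            response.extend(denoise(query_tokens, response, initial))
    return response
```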

    TRAINING METHOD FOR A DEEP LEARNING MODEL

    Publication No.: US20250061305A1

    Publication Date: 2025-02-20

    Application No.: US18936686

    Filing Date: 2024-11-04

Abstract: A training method, an inference method, a device, an apparatus, and a medium for a deep learning model are provided. A first model includes a plurality of first parameters, and a second model includes a plurality of second parameters, which are initialized to the parameter values of a plurality of target parameters selected from the plurality of first parameters. The training method includes: determining a target loss for both the first model and the second model; and adjusting parameter values, including: in response to determining that the target loss indicates that the parameter values of at least part of the target parameters need to be adjusted, synchronously adjusting the parameter values of the corresponding second parameters; and in response to determining that the target loss indicates that the parameter values of at least part of the second parameters need to be adjusted, synchronously adjusting the parameter values of the corresponding target parameters.
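The parameter-sharing scheme can be sketched as below: the second model's parameters are a selected subset of the first model's, and an update on either side is mirrored to the other. All names and the plain-SGD update are illustrative assumptions, not the patented method.

```python
import numpy as np

rng = np.random.default_rng(0)
first_params = rng.normal(size=6)           # plurality of first parameters
target_idx = np.array([1, 3, 5])            # selected target parameters
second_params = first_params[target_idx].copy()  # initialized by copying

def sync_update(grad_first, grad_second, lr=0.1):
    """Adjust target parameters and second parameters, mirroring each
    change so the two models stay synchronized (toy SGD step)."""
    global first_params, second_params
    # Update the first model; mirror target-parameter values into model 2.
    first_params = first_params - lr * grad_first
    second_params = first_params[target_idx]
    # Update the second model; mirror its values back into the targets.
    second_params = second_params - lr * grad_second
    first_params[target_idx] = second_params
```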

    MODEL TRAINING METHOD, MODEL REASONING METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM

    Publication No.: US20250094802A1

    Publication Date: 2025-03-20

    Application No.: US18965684

    Filing Date: 2024-12-02

    Abstract: Provided is a model training method, a model reasoning method, an electronic device, and a storage medium, relating to the field of data processing, and especially to the technical fields of artificial intelligence, big data, deep learning and large models. The model training method includes: folding an initial token sequence for training a model based on a folding feature value for folding a token sequence to obtain at least a first token sequence subjected to the folding, wherein the initial token sequence represents a token sequence composed of T1 tokens, and the first token sequence has a sequence length less than that of the initial token sequence; and inputting at least the first token sequence into a preset model to train the preset model so as to obtain a target model.
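One plausible reading of the folding step is grouping consecutive tokens, which yields a sequence shorter than the initial one. The grouping rule below is an assumption for illustration; the abstract does not specify how the folding feature value is applied.

```python
def fold_tokens(tokens, fold):
    """Fold an initial token sequence of T1 tokens into a first token
    sequence of ceil(T1 / fold) folded tokens (assumed grouping rule)."""
    return [tokens[i:i + fold] for i in range(0, len(tokens), fold)]

seq = list(range(8))           # initial token sequence, T1 = 8
folded = fold_tokens(seq, 2)   # first token sequence, length 4 < 8
```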

    TASK EXECUTION METHOD AND APPARATUS FOR LARGE MODEL, ELECTRONIC DEVICE, AND STORAGE MEDIUM

    Publication No.: US20250094534A1

    Publication Date: 2025-03-20

    Application No.: US18968798

    Filing Date: 2024-12-04

Abstract: A task execution method for a large model relates to the fields of artificial intelligence, deep learning and large model technologies, and includes: executing attention tasks in a task group to be fused using a target computing unit to obtain attention features, where each attention task corresponds to a weighted matrix to be fused, and the weighted matrix to be fused is obtained by weighting a matrix to be fused using a weight; obtaining a processing result according to the attention features; determining loss information according to the processing result; and weighting and fusing the matrices to be fused using the target computing unit according to the weights for the task group to be fused if the loss information converges, to obtain a fusion matrix for a target task group, where a target task in the target task group is executed by the target computing unit according to the fusion matrix.
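The convergence-gated fusion can be sketched as below. The weighted sum and the convergence test on consecutive losses are illustrative assumptions; the abstract does not commit to a specific fusion rule.

```python
import numpy as np

# Each attention task in the task group contributes a matrix and a weight.
matrices = [np.eye(2), np.ones((2, 2))]
weights = [0.25, 0.75]

def fuse_if_converged(loss_history, tol=1e-3):
    """Return the fusion matrix once the loss information converges
    (assumed test: change between consecutive losses below tol)."""
    converged = (len(loss_history) >= 2
                 and abs(loss_history[-1] - loss_history[-2]) < tol)
    if not converged:
        return None
    # Weight and fuse the matrices into one fusion matrix.
    return sum(w * m for w, m in zip(weights, matrices))

fusion = fuse_if_converged([0.5100, 0.5101])
```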

    METHOD OF TRAINING FEATURE DETERMINATION MODEL, METHOD OF PERFORMING SEMANTIC ANALYSIS, AND ELECTRONIC DEVICE

    Publication No.: US20220327290A1

    Publication Date: 2022-10-13

    Application No.: US17852413

    Filing Date: 2022-06-29

Abstract: There is provided a method of training a feature determination model, which relates to the fields of deep learning and natural language processing. The method includes: determining, by a plurality of feature determination layers arranged in stages, a feature vector for each segment in a pre-training text; and pre-training the feature determination model according to the feature vectors. A current stage feature vector is determined by the feature determination layer of the current stage according to a preceding segment feature vector determined for the preceding segment, and a preceding stage feature vector determined by the feature determination layer of the preceding stage. A method of training a feature determination model for a target task, a method of performing semantic analysis for a target task, an electronic device, and a computer storage medium are also provided.
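The staged dependency can be sketched as a two-dimensional recurrence: each stage's feature for a segment combines the preceding segment's feature at the same stage with the preceding stage's feature for the same segment. The embedding and the additive combination rule are toy assumptions.

```python
def run_stages(segments, num_stages):
    """features[stage][segment]: stage 0 embeds raw segments (assumed:
    segment length); later stages combine the preceding-segment feature
    with the preceding-stage feature (assumed: by summation)."""
    features = [[float(len(s)) for s in segments]]
    for stage in range(1, num_stages):
        row = []
        for seg in range(len(segments)):
            preceding_segment = row[seg - 1] if seg > 0 else 0.0
            preceding_stage = features[stage - 1][seg]
            row.append(preceding_segment + preceding_stage)
        features.append(row)
    return features
```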
