LARGE MODEL-BASED METHOD OF GENERATING TEXT AND METHOD OF TRAINING TEXT GENERATION MODEL

    公开(公告)号:US20250094877A1

    公开(公告)日:2025-03-20

    申请号:US18969719

    申请日:2024-12-05

    Abstract: A large model-based method of generating a text, a method of training a text generation model, a device, and a medium are provided, which relate to a field of artificial intelligence technology, specifically to fields of deep learning, natural language processing and large model technologies. The large model-based method of generating a text includes: acquiring a memory state for a text to be processed, where the memory state is generated based on a previous text of the text to be processed; determining an embedding feature of the text to be processed as an initial hidden state, and processing the memory state and the initial hidden state by using a first attention mechanism to obtain an updated hidden state; and generating a subsequent text for the text to be processed based on the updated hidden state.

    MULTIMODAL DATA GENERATION
    2.
    发明申请

    公开(公告)号:US20250094713A1

    公开(公告)日:2025-03-20

    申请号:US18967529

    申请日:2024-12-03

    Abstract: A multimodal data generation method is provided. The method includes: inputting a query data sequence into a multimodal model, to obtain a plurality of tokens in a response data sequence, where a current token is generated through the following operations: inputting the query data sequence and a current response data sequence into the multimodal model, so that the multimodal model generates the current token based on the query data sequence and the current response data sequence, in response to determining that the current token belongs to a first data modality; or inputting the query data sequence and a current response data sequence into the multimodal model, so that the multimodal model denoises an initial token sequence based on the query data sequence and the current response data sequence, to generate a result token sequence, in response to determining that the current token belongs to a second data modality.

    GENERATING INSTRUCTION DATA
    3.
    发明申请

    公开(公告)号:US20250004771A1

    公开(公告)日:2025-01-02

    申请号:US18755148

    申请日:2024-06-26

    Abstract: A method, apparatus, device, and medium for generating instruction data is provided. The method includes: obtaining a natural language-based reference instruction to direct a large model to generate response data meeting multiple first requirements; obtaining a structured disassembly result of the reference instruction to derive several reference slots and slot values corresponding to these requirements; determining multiple sample slots and sample slot values based on the reference slots, slot values, and a predetermined rule; and generating a natural language-based sample instruction from these sample slots and values, which directs the large model to generate response data that fulfills multiple second requirements.

    METHOD OF EXECUTING TASK FOR LARGE LANGUAGE MODEL, DEVICE, AND STORAGE MEDIUM

    公开(公告)号:US20240378077A1

    公开(公告)日:2024-11-14

    申请号:US18782617

    申请日:2024-07-24

    Abstract: A method of executing a task for a large language model, a device, and a storage medium are provided, which relate to a field of artificial intelligence technology, and in particular to fields of deep learning, large language model, natural language processing and computer vision technologies. The method includes: determining, by using a determination unit, a target attention task from a plurality of attention tasks to be processed, based on a sparse representation corresponding to a feature to be processed, where the target attention task is a task corresponding to a non-fully masked region of the feature, the sparse representation represents a mask position of the feature, and the mask position represents mask endpoint positions in at least two non-intersecting intervals in a mask matrix corresponding to the feature; and executing the target attention task by using a computing unit, so as to obtain an attention feature.

    SPEECH RECOGNITION AND CODEC METHOD AND APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIUM

    公开(公告)号:US20230090590A1

    公开(公告)日:2023-03-23

    申请号:US17738651

    申请日:2022-05-06

    Abstract: The present disclosure provides speech recognition and codec methods and apparatuses, an electronic device and a storage medium, and relates to the field of artificial intelligence such as intelligent speech, deep learning and natural language processing. The speech recognition method may include: acquiring an audio feature of to-be-recognized speech; encoding the audio feature to obtain an encoding feature; truncating the encoding feature to obtain continuous N feature fragments, N being a positive integer greater than one; and acquiring, for any one of the feature segments, corresponding historical feature abstraction information, encoding the feature segment in combination with the historical feature abstraction information, and decoding an encoding result to obtain a recognition result corresponding to the feature segment, wherein the historical feature abstraction information is information obtained by feature abstraction of recognized historical feature fragments.

    INFORMATION SEARCH METHOD AND DEVICE, ELECTRONIC DEVICE, AND STORAGE MEDIUM

    公开(公告)号:US20230008897A1

    公开(公告)日:2023-01-12

    申请号:US17932598

    申请日:2022-09-15

    Abstract: An information search method includes: obtaining search words at least including a question to be searched and obtaining an initial text vector representation of the search words; obtaining a video corresponding to the search words, and obtaining multi-modality vector representations of the video; starting from the initial text vector representation, performing N rounds of interaction between the video and the search words based on the multi-modality vector representations and a text vector representation of the search words of a current round, to generate a target fusion vector representation, where N is an integer greater than or equal to 1; and obtaining target video frames matching the question to be searched by annotating the video based on the target fusion vector representation.

Patent Agency Ranking