LARGE LANGUAGE MODEL TRAINING METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM

    Publication No.: US20250094806A1

    Publication Date: 2025-03-20

    Application No.: US18967167

    Filing Date: 2024-12-03

    Abstract: Provided are a large language model training method, an electronic device and a storage medium, relating to the field of artificial intelligence technologies, and in particular, to the fields of deep learning, natural language processing and large models. The method includes: performing dimension reduction parameter fusion on a two-dimensional parameter matrix on each channel in each network layer in a first large language model, respectively, to obtain a second large language model; performing layer reduction parameter fusion on network layers in the second large language model based on a three-dimensional parameter matrix of each network layer in the second large language model, to obtain a third large language model; and training the third large language model to obtain a target large language model when a target loss function, determined based on the first and third large language models, meets a preset first function condition.
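    The two fusion steps can be sketched under loose assumptions: reading "dimension reduction parameter fusion" as low-rank truncation of each 2D weight matrix, and "layer reduction parameter fusion" as averaging groups of consecutive layers' stacked (3D) weights. The function names and the SVD/averaging choices are illustrative stand-ins, not the patented procedure.

    ```python
    import numpy as np

    def reduce_dims(weight, rank):
        """Low-rank fusion of one 2D parameter matrix via truncated SVD
        (one possible reading of "dimension reduction parameter fusion")."""
        u, s, vt = np.linalg.svd(weight, full_matrices=False)
        return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

    def fuse_layers(layer_weights, group_size):
        """Merge every `group_size` consecutive layers by averaging their
        stacked parameter matrices -- the layer-reduction step."""
        stacked = np.stack(layer_weights)          # (num_layers, d_out, d_in)
        n = (len(layer_weights) // group_size) * group_size
        groups = stacked[:n].reshape(-1, group_size, *stacked.shape[1:])
        return list(groups.mean(axis=1))
    ```

    A distillation-style loss between the first and third models would then drive the final training stage described in the abstract.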

    MULTIMODAL DATA GENERATION

    Publication No.: US20250094713A1

    Publication Date: 2025-03-20

    Application No.: US18967529

    Filing Date: 2024-12-03

    Abstract: A multimodal data generation method is provided. The method includes: inputting a query data sequence into a multimodal model, to obtain a plurality of tokens in a response data sequence, where a current token is generated through the following operations: inputting the query data sequence and a current response data sequence into the multimodal model, so that the multimodal model generates the current token based on the query data sequence and the current response data sequence, in response to determining that the current token belongs to a first data modality; or inputting the query data sequence and a current response data sequence into the multimodal model, so that the multimodal model denoises an initial token sequence based on the query data sequence and the current response data sequence, to generate a result token sequence, in response to determining that the current token belongs to a second data modality.
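    The modality-dependent branching in the abstract — sample a token directly for the first modality, iteratively denoise an initial token sequence for the second — can be illustrated with a toy stub. `ToyMultimodalModel` and its trivial arithmetic are stand-ins, not the actual model.

    ```python
    class ToyMultimodalModel:
        """Stub standing in for the multimodal model; real logits and
        diffusion are replaced with trivial arithmetic so the routing runs."""
        def next_token(self, ctx):
            return ("text", len(ctx))               # dummy next text token
        def init_noise(self):
            return [0.5, 0.5]                       # dummy noisy token sequence
        def denoise(self, ctx, seq):
            return [x * 0.5 for x in seq]           # dummy single denoising step

    def generate_token(model, query_seq, response_seq, modality, steps=3):
        """Route generation by the current token's modality, per the abstract."""
        if modality == "first":                     # e.g. text: direct sampling
            return model.next_token(query_seq + response_seq)
        seq = model.init_noise()                    # e.g. image: denoise a
        for _ in range(steps):                      # noisy token sequence
            seq = model.denoise(query_seq + response_seq, seq)
        return seq                                  # result token sequence
    ```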

    METHOD OF GENERATING CODE BASED ON LARGE MODEL, ELECTRONIC DEVICE, AND STORAGE MEDIUM

    Publication No.: US20250094139A1

    Publication Date: 2025-03-20

    Application No.: US18965152

    Filing Date: 2024-12-02

    Abstract: A method of generating a code based on a large model, an electronic device and a storage medium are provided, which relate to the field of artificial intelligence technology, in particular to the fields of deep learning technology and large model technology. The method includes: acquiring a first descriptive text input by a user, where the first descriptive text is configured to characterize a code requirement; searching for a positive code and a negative code matching the first descriptive text, where each of the positive code and the negative code is determined based on a preference operation of the user for a historical code output by the large model; generating a second descriptive text according to the first descriptive text, the positive code, and the negative code; and inputting the second descriptive text into the large model to output a target code matching the code requirement.
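    The prompt-augmentation step — building the second descriptive text from the first descriptive text plus the preference-matched positive and negative codes — might look like the following; the template wording is purely illustrative.

    ```python
    def build_prompt(first_text, positive_code, negative_code):
        """Compose the second descriptive text from the user's code
        requirement plus preference-matched examples (hypothetical template)."""
        return (
            f"Requirement: {first_text}\n"
            f"Example the user approved (follow this style):\n{positive_code}\n"
            f"Example the user rejected (avoid these patterns):\n{negative_code}\n"
            "Generate code meeting the requirement."
        )
    ```

    The resulting text is then fed to the large model in place of the raw requirement, steering it toward previously approved code.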

    DATA MARKING METHOD, APPARATUS, SYSTEM, DEVICE AND STORAGE MEDIUM

    Publication No.: US20250078305A1

    Publication Date: 2025-03-06

    Application No.: US18043705

    Filing Date: 2022-06-20

    Abstract: The present disclosure provides a data marking method, apparatus, system, device, and storage medium, and relates to the technical field of data processing, in particular to fields such as artificial intelligence, big data, and deep learning. The specific implementation solution is as follows: acquiring multiple pictures whose contents are continuous, wherein the multiple pictures contain at least one same object; for each object, determining a position offset of the object by using position information of the object in two adjacent pictures, wherein the two adjacent pictures include a first previous picture and a second previous picture, the second previous picture being the picture immediately preceding, in time sequence, a picture to be marked, and the first previous picture being the picture immediately preceding the second previous picture in time sequence; determining estimated position information of the object in the picture to be marked based on the position information of the second previous picture and the position offset; and marking the object in the picture to be marked based on the estimated position information. The present disclosure can speed up the marking of the same object across multiple pictures.
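    The estimate described above is a constant-velocity extrapolation: the offset between the two previous frames is applied once more to predict the object's position in the frame to be marked. Assuming positions are (x, y) box centers (a simplification; real annotations would carry full boxes), a minimal sketch is:

    ```python
    def estimate_position(first_prev, second_prev):
        """Predict the object's position in the picture to be marked by
        re-applying the offset observed between the two previous pictures."""
        offset = (second_prev[0] - first_prev[0],
                  second_prev[1] - first_prev[1])
        return (second_prev[0] + offset[0], second_prev[1] + offset[1])
    ```

    For example, an object at (10, 20) and then (14, 26) in the two previous frames is estimated at (18, 32) in the frame to mark.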

    Text processing method, device and storage medium

    Publication No.: US12223271B2

    Publication Date: 2025-02-11

    Application No.: US17874394

    Filing Date: 2022-07-27

    Abstract: Provided are a text processing method, a device and a storage medium, relating to the field of computer technology, and especially to fields of artificial intelligence such as natural language processing and deep learning. The specific implementation scheme includes: performing text processing on first text by using a text processing acceleration operator; and processing the resulting content in parallel, and thus faster, by using the same operator. Carrying out text processing and parallel acceleration with the text processing acceleration operator can improve the speed of text processing.

    Method for automatically producing map data, and related apparatus

    Publication No.: US12196572B2

    Publication Date: 2025-01-14

    Application No.: US17961930

    Filing Date: 2022-10-07

    Abstract: The present disclosure provides a method and apparatus for automatically producing map data. The method includes: performing track rectification on crowdsourced tracks based on corresponding standard tracks, and locating each included map element based on depth information of track point images contained in the rectified crowdsourced tracks; comparing, using a pre-built entity semantic map, a latest map element obtained by locating based on the rectified crowdsourced tracks with an old map element at the corresponding position; determining, in response to a change in the latest map element compared to the old map element, a target processing method according to a processing standard for the changed map element pre-abstracted from a map element update specification; and processing the latest map element according to the target processing method to obtain a processed latest map.

    METHOD AND APPARATUS FOR TRAINING A LARGE LANGUAGE MODEL, AND MEDIUM

    Publication No.: US20250013876A1

    Publication Date: 2025-01-09

    Application No.: US18889928

    Filing Date: 2024-09-19

    Abstract: An apparatus for training a large language model is provided. At least one sample text instruction is input into a target large language model to obtain at least one standard response text, and the at least one sample text instruction is input into a large language model to be trained to obtain at least one predicted response text. A first sample response text is determined from the at least one standard response text according to the score difference between a first quality score of a standard response text and a second quality score of the corresponding predicted response text. A first target training sample is generated according to the first sample response text and the sample text instruction corresponding to the first sample response text, and a training dataset is constructed according to the first target training sample.
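    The score-difference selection can be sketched as a simple filter: keep the (instruction, standard response) pairs where the teacher's response outscores the student's prediction by at least a threshold, on the intuition that those are the examples the student most needs. The tuple layout and threshold semantics are assumptions, not the claimed method.

    ```python
    def select_training_samples(candidates, min_gap):
        """Filter (instruction, standard_response, std_score, pred_score)
        tuples by teacher-vs-student quality gap; scores are plain floats
        from any quality scorer (hypothetical interface)."""
        selected = []
        for instr, std_resp, std_score, pred_score in candidates:
            if std_score - pred_score >= min_gap:   # student lags: train on it
                selected.append((instr, std_resp))
        return selected
    ```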

    METHOD AND APPARATUS FOR DIALOGUE

    Publication No.: US20250013679A1

    Publication Date: 2025-01-09

    Application No.: US18889817

    Filing Date: 2024-09-19

    Inventor: Jinghan ZHANG

    Abstract: The present disclosure provides a method and apparatus for dialogue, relates to the field of artificial intelligence technology, in particular to the field of natural language processing and deep learning technology, and can be used in application scenarios such as generative search, intelligent editing of documents, intelligent assistants, virtual assistants, or intelligent e-commerce. A specific embodiment of the method includes: determining an application scenario corresponding to user query information; acquiring user data in the application scenario; invoking a tool in the application scenario, to process the user query information and the user data to obtain a tool execution result; and generating, based on the tool execution result, answer information corresponding to the user query information.
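    The four steps of the embodiment — determine the scenario, acquire scenario-scoped user data, invoke the scenario's tool, compose the answer — can be strung together as a pipeline. Every callable here is a caller-supplied, hypothetical stand-in.

    ```python
    def dialogue_answer(query, classify, tools, user_data, compose):
        """Pipeline per the abstract; `classify`, `tools`, `user_data`,
        and `compose` are illustrative stand-ins, not the patented system."""
        scenario = classify(query)                 # 1. pick application scenario
        data = user_data.get(scenario, {})         # 2. user data in that scenario
        result = tools[scenario](query, data)      # 3. tool execution result
        return compose(query, result)              # 4. generate answer
    ```

    In a generative-search scenario, for instance, the tool would be the retriever and `compose` the answer-generation call to the model.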
