METHOD OF PROCESSING MULTIMEDIA DATA, DEVICE AND MEDIUM

    Publication No.: US20230115737A1

    Publication Date: 2023-04-13

    Application No.: US18080432

    Filing Date: 2022-12-13

    Abstract: A method of processing multimedia data, a device, and a medium are provided, relating to the field of artificial intelligence technology, and in particular to the fields of knowledge graphs and deep learning. The method of processing the multimedia data includes: recognizing the multimedia data to obtain at least one piece of key information of the multimedia data; querying a predetermined knowledge base according to the at least one piece of key information to determine a multimedia name associated with the key information and an association degree between the multimedia name and the key information; and, in response to the association degree being less than a first threshold value, determining a name of the multimedia data from the multimedia name based on a similarity between the multimedia data and alternative multimedia data for the multimedia name.
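The threshold-gated fallback described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the candidate record layout, the use of cosine similarity over embedding vectors, the default threshold of 0.8, and the choice to accept the knowledge-base result directly when the association degree reaches the threshold are all assumptions.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def resolve_name(candidates, query_embedding, first_threshold=0.8):
    """Pick a name for the multimedia data.

    candidates: hypothetical knowledge-base hits, each a dict with
      "name"        - candidate multimedia name
      "association" - association degree with the recognized key information
      "embedding"   - feature vector of alternative multimedia data for that name
    query_embedding: feature vector of the multimedia data being named.
    """
    best = max(candidates, key=lambda c: c["association"])
    # Assumption: a sufficiently strong association is accepted as-is.
    if best["association"] >= first_threshold:
        return best["name"]
    # Association too weak: fall back to content similarity against the
    # alternative multimedia data stored for each candidate name.
    closest = max(candidates, key=lambda c: cosine(c["embedding"], query_embedding))
    return closest["name"]
```

With two weakly associated candidates, the candidate whose alternative data most resembles the query wins; raising one candidate's association degree above the threshold short-circuits the similarity comparison.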

    VIDEO PROCESSING METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM

    Publication No.: US20220027634A1

    Publication Date: 2022-01-27

    Application No.: US17450158

    Filing Date: 2021-10-06

    Abstract: A video processing method, an electronic device and a storage medium are provided, which relate to the field of artificial intelligence, and in particular to the fields of deep learning, model training, knowledge mapping, video processing and the like. The method includes: acquiring a plurality of first video frames, and performing fine-grained splitting on the plurality of first video frames to obtain a plurality of second video frames; performing feature encoding on the plurality of second video frames according to multi-mode information related to the plurality of second video frames, to obtain feature fusion information for characterizing fusion of the multi-mode information; and performing similarity matching on the plurality of second video frames according to the feature fusion information, and obtaining a target video according to a result of the similarity matching.
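One way to picture the similarity-matching step is the sketch below. The abstract does not specify how the multi-mode information is fused or how matches are turned into a target video, so everything here is assumed for illustration: fusion is plain concatenation of a visual vector and a text vector, similarity is cosine similarity, and temporally adjacent segments are merged when their similarity reaches a hypothetical threshold of 0.9.

```python
from math import sqrt


def fuse(visual, text):
    """Assumed fusion: concatenate per-modality feature vectors."""
    return list(visual) + list(text)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def merge_similar_segments(fused_features, threshold=0.9):
    """Group temporally adjacent segments whose fused features match.

    fused_features: fused vectors for the fine-grained segments, in time order.
    Returns a list of index groups; each group stands for one merged clip.
    """
    if not fused_features:
        return []
    groups = [[0]]
    for i in range(1, len(fused_features)):
        if cosine(fused_features[i - 1], fused_features[i]) >= threshold:
            groups[-1].append(i)   # same content: extend the current clip
        else:
            groups.append([i])     # content changed: start a new clip
    return groups
```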

    METHOD AND APPARATUS FOR TRAINING QUESTION SOLVING MODEL, QUESTION SOLVING METHOD AND APPARATUS

    Publication No.: US20240354658A1

    Publication Date: 2024-10-24

    Application No.: US18745529

    Filing Date: 2024-06-17

    CPC classification number: G06N20/00 G06N5/04

    Abstract: A method and apparatus for training a question solving model, a question solving method and apparatus, an electronic device and a readable storage medium are disclosed. The method for training a question solving model includes: acquiring a first sample question; inputting the first sample question and a solving step grabbing template into a large language model to obtain a first sample solving step; inputting the first sample question, the first sample solving step and an answer grabbing template into the large language model to obtain a first sample answer; pre-training a step planning model according to the first sample question and the first sample solving step; pre-training the large language model according to the first sample question, the first sample solving step and the first sample answer; and acquiring the question solving model according to the step planning model and the large language model obtained by pre-training. The question solving method includes: acquiring a to-be-solved question; inputting the to-be-solved question into a step planning model to obtain a solving step; and inputting the to-be-solved question and the solving step into a large language model to obtain an answer.
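The data flow in this abstract (template-driven sample generation, then two-stage solving) can be sketched as below. The template wording and the function names are hypothetical, and the models are passed in as plain callables that map a string to a string; the pre-training itself is not shown.

```python
# Hypothetical prompt templates; the abstract names them but does not give their text.
STEP_TEMPLATE = "List the steps needed to solve this question:\n{question}"
ANSWER_TEMPLATE = "Question:\n{question}\nSteps:\n{steps}\nGive the final answer:"


def make_sample(question, llm):
    """Build one training sample by prompting the large language model twice.

    llm: any callable taking a prompt string and returning generated text.
    """
    # First call: question + step-grabbing template -> sample solving step.
    steps = llm(STEP_TEMPLATE.format(question=question))
    # Second call: question + solving step + answer-grabbing template -> sample answer.
    answer = llm(ANSWER_TEMPLATE.format(question=question, steps=steps))
    return {"question": question, "steps": steps, "answer": answer}


def solve(question, step_planner, llm):
    """Two-stage inference: plan the solving step, then generate the answer.

    step_planner: callable standing in for the pre-trained step planning model.
    llm: callable standing in for the pre-trained large language model.
    """
    steps = step_planner(question)
    return llm(ANSWER_TEMPLATE.format(question=question, steps=steps))
```

In this sketch, the (question, steps) pairs from `make_sample` would feed the step planning model and the full triples would feed the large language model, matching the split the abstract describes.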

    METHOD FOR TRAINING IMAGE-TEXT MATCHING MODEL, COMPUTING DEVICE, AND STORAGE MEDIUM

    Publication No.: US20230005284A1

    Publication Date: 2023-01-05

    Application No.: US17943458

    Filing Date: 2022-09-13

    Abstract: A computer-implemented method is provided. The method includes: obtaining a sample text and a sample image corresponding to the sample text; labeling a true semantic tag for the sample text according to a first preset rule; obtaining a text feature representation of the sample text and a predicted semantic tag output by a text coding sub-model; obtaining an image feature representation of the sample image output by an image coding sub-model; calculating a first loss based on the true semantic tag and the predicted semantic tag; calculating a contrast loss based on the text feature representation of the sample text and the image feature representation of the sample image; adjusting parameters of the text coding sub-model based on the first loss and the contrast loss; and adjusting parameters of the image coding sub-model based on the contrast loss.
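The split of losses between the two sub-models can be illustrated with the sketch below. The abstract does not give the form of either loss, so this assumes cross-entropy for the first (semantic tag) loss and a symmetric InfoNCE-style loss with a hypothetical temperature of 0.1 for the contrast loss. It only computes the scalar objectives; gradient updates are omitted.

```python
import math


def cross_entropy(logits, true_index):
    """Numerically stable softmax cross-entropy for one example."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[true_index]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def contrastive_loss(text_feats, image_feats, temperature=0.1):
    """Assumed contrast loss: pair i of text/image is the positive, all others negatives."""
    n = len(text_feats)
    total = 0.0
    for i in range(n):
        text_to_image = [dot(text_feats[i], image_feats[j]) / temperature for j in range(n)]
        image_to_text = [dot(image_feats[i], text_feats[j]) / temperature for j in range(n)]
        total += cross_entropy(text_to_image, i) + cross_entropy(image_to_text, i)
    return total / (2 * n)


def sub_model_losses(tag_logits, true_tags, text_feats, image_feats):
    """Return the objective used to adjust each sub-model.

    tag_logits: per-sample predicted semantic tag scores from the text sub-model.
    true_tags:  per-sample index of the true semantic tag.
    """
    first = sum(cross_entropy(l, t) for l, t in zip(tag_logits, true_tags)) / len(true_tags)
    contrast = contrastive_loss(text_feats, image_feats)
    # Text sub-model sees both losses; image sub-model sees only the contrast loss.
    return {"text": first + contrast, "image": contrast}
```

Summing the two losses for the text sub-model with equal weight is itself an assumption; a weighted sum would follow the same pattern.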

    MULTIMODAL CONTENT PROCESSING METHOD, APPARATUS, DEVICE AND STORAGE MEDIUM

    Publication No.: US20210192142A1

    Publication Date: 2021-06-24

    Application No.: US17024756

    Filing Date: 2020-09-18

    Abstract: The present disclosure provides a multimodal content processing method, apparatus, device and storage medium, which relate to the technical field of artificial intelligence. A specific implementation includes: receiving a content processing request from a user, the request being configured to request semantic understanding of multimodal content to be processed; analyzing the multimodal content to obtain multimodal knowledge nodes corresponding to the multimodal content; and determining a semantic understanding result of the multimodal content according to the multimodal knowledge nodes, a pre-constructed multimodal knowledge graph and the multimodal content, the multimodal knowledge graph including the multimodal knowledge nodes and an association relationship between the multimodal knowledge nodes. The technical solution can obtain an accurate semantic understanding result, realize an accurate application of multimodal content, and solve the problem in the prior art that multimodal content understanding is inaccurate.

    METHOD AND APPARATUS FOR PROCESSING MODEL GENERATION RESULT, ELECTRONIC DEVICE AND STORAGE MEDIUM

    Publication No.: US20240303430A1

    Publication Date: 2024-09-12

    Application No.: US18667504

    Filing Date: 2024-05-17

    CPC classification number: G06F40/20

    Abstract: A technical solution for processing a model generation result, which relates to the field of artificial intelligence technologies, is disclosed. An implementation includes: disassembling a text generation result of a generative large model to obtain a plurality of result logic units, wherein each result logic unit includes a segment in the text generation result, each segment is capable of independently identifying one premise or conclusion in a logical inference relationship of the text generation result, and the text generation result is a response result generated by the generative large model based on text input information; generating, based on the plurality of result logic units, a logical inference graph capable of characterizing a logical inference relationship among the plurality of result logic units; and determining, based on the logical inference graph, whether the logical inference by which the generative large model generated the text generation result is correct.
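A logical inference graph over premise and conclusion units can be represented and checked structurally as in the sketch below. The abstract does not state what criterion decides correctness, so the two checks used here (every conclusion has at least one supporting unit, and the support relation contains no cycle) are assumptions chosen for illustration, as are the data layout and function name. Extracting the units from model output is not shown.

```python
def check_inference_graph(units, edges):
    """Run two assumed structural checks on a logical inference graph.

    units: dict mapping unit id -> "premise" or "conclusion".
    edges: list of (src, dst) pairs meaning "src supports dst";
           both ids must be keys of `units`.
    """
    supports = {u: [] for u in units}
    for src, dst in edges:
        supports[dst].append(src)

    # Check 1: a conclusion that nothing supports is an unjustified claim.
    unsupported = [u for u, kind in units.items()
                   if kind == "conclusion" and not supports[u]]

    # Check 2: depth-first search for circular reasoning along support links.
    state = {}

    def has_cycle(node):
        state[node] = "visiting"
        for parent in supports[node]:
            if state.get(parent) == "visiting":
                return True
            if parent not in state and has_cycle(parent):
                return True
        state[node] = "done"
        return False

    circular = any(has_cycle(u) for u in units if u not in state)
    return {"unsupported": unsupported, "circular": circular,
            "ok": not unsupported and not circular}
```

For example, two premises supporting one conclusion that in turn supports a second conclusion passes both checks, while two conclusions that only support each other are flagged as circular.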
