Method and Apparatus for Selecting Sample Corpus Used to Optimize Translation Model

    Publication number: US20230140997A1

    Publication date: 2023-05-11

    Application number: US18089392

    Filing date: 2022-12-27

    CPC classification number: G06F40/58

    Abstract: A method and apparatus for selecting a sample corpus used to optimize a translation model, an electronic device, a computer readable storage medium, and a computer program product are provided. The method includes: after acquiring a first corpus, translating the first corpus by using a to-be-optimized translation model to acquire a second corpus in a different language; translating the second corpus by using the to-be-optimized translation model to acquire a third corpus; determining a difficulty level of the first corpus based on a similarity between the first corpus and the third corpus; and finally, in response to the difficulty level satisfying the requirements of a difficulty level threshold, determining the first corpus as a sample corpus used to perform optimization training on the to-be-optimized translation model.
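
The selection step in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the similarity measure (`difflib.SequenceMatcher`), the definition of difficulty as one minus similarity, the "difficulty at or above the threshold" rule, and the toy translators `toy_forward` and `toy_backward` are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import Callable, List


def round_trip_difficulty(
    first: str,
    forward: Callable[[str], str],
    backward: Callable[[str], str],
) -> float:
    """Translate first -> second -> third and score how much was lost."""
    second = forward(first)    # second corpus, in a different language
    third = backward(second)   # third corpus, compared against the first
    similarity = SequenceMatcher(None, first, third).ratio()
    # Assumption: lower round-trip similarity means a harder sample.
    return 1.0 - similarity


def select_samples(
    corpus: List[str],
    forward: Callable[[str], str],
    backward: Callable[[str], str],
    threshold: float,
) -> List[str]:
    """Keep texts whose difficulty meets the (assumed) threshold rule."""
    return [
        text
        for text in corpus
        if round_trip_difficulty(text, forward, backward) >= threshold
    ]


# Hypothetical stand-ins for the two directions of the model being optimized.
def toy_forward(text: str) -> str:
    return text.upper()


def toy_backward(text: str) -> str:
    # Deliberately lossy so that some inputs do not survive the round trip.
    return text.lower().replace("colour", "color")
```

With these toy translators, a text that survives the round trip unchanged gets difficulty 0.0 and is filtered out, while a text that the round trip alters is kept as a training sample.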

    Method for Evaluating Text Content, and Related Apparatus

    Publication number: US20230196026A1

    Publication date: 2023-06-22

    Application number: US18109813

    Filing date: 2023-02-14

    CPC classification number: G06F40/30 G06F40/289

    Abstract: A method for evaluating text content, which may include: after splitting a to-be-evaluated text into a plurality of clauses arranged in sequence according to punctuation information of the to-be-evaluated text, determining a first clause of the plurality of clauses as an actual tune name; then, in response to the number of clauses, from the third clause to the last clause, whose Chinese character counts satisfy the character count requirements of the clauses corresponding to the actual tune name exceeding a number threshold, determining actual prosodic information based on a Chinese phonetic alphabet text of the third clause to the last clause; and finally, in response to the actual prosodic information being consistent with standard prosodic information of the actual tune name, evaluating the to-be-evaluated text as a Ci-poetry text.

    TRANSLATION METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM

    Publication number: US20230153548A1

    Publication date: 2023-05-18

    Application number: US17885152

    Filing date: 2022-08-10

    CPC classification number: G06F40/58

    Abstract: A translation method, an electronic device and a storage medium, which relate to the field of artificial intelligence technologies, such as machine learning and information processing technologies, are disclosed. An implementation includes: acquiring an intermediate translation result generated by each of multiple pre-trained translation models for a to-be-translated specified sentence in a same iteration of a translation process, so as to obtain multiple intermediate translation results; acquiring a co-occurrence word based on the multiple intermediate translation results; and acquiring a target translation result of the specified sentence based on the co-occurrence word.
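
The abstract does not say how a co-occurrence word is derived, so the following is only one plausible reading, sketched under assumptions: each model's intermediate result is a list of tokens, and a word counts as co-occurring when at least `min_models` of the results contain it. The function name and the `min_models` parameter are hypothetical.

```python
from collections import Counter
from typing import List, Set


def co_occurrence_words(
    intermediate_results: List[List[str]],
    min_models: int = 2,
) -> Set[str]:
    """Return words present in the results of at least `min_models` models."""
    counts: Counter = Counter()
    for tokens in intermediate_results:
        # Use a set so that a word repeated by one model is counted once.
        counts.update(set(tokens))
    return {word for word, n in counts.items() if n >= min_models}
```

For example, given the three intermediate results `["the", "cat"]`, `["a", "cat"]` and `["the", "kitten"]`, the words shared by at least two models are `"the"` and `"cat"`.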

    TRAINING METHOD, TEXT TRANSLATION METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM

    Publication number: US20230076471A1

    Publication date: 2023-03-09

    Application number: US17982965

    Filing date: 2022-11-08

    Abstract: A training method, a text translation method, an electronic device, and a storage medium, which relate to a field of artificial intelligence, in particular to fields of natural language processing and deep learning technologies. A specific implementation solution includes: performing a feature extraction on source sample text data to obtain a sample feature vector sequence; obtaining a target sample feature vector according to the sample feature vector sequence; performing an autoregressive decoding and a non-autoregressive decoding on the sample feature vector sequence, respectively; performing a length prediction on the target sample feature vector; training a predetermined model by using translation sample data, the autoregressive text translation result, the non-autoregressive text translation result, a true length value of the source sample text, the first predicted length value, a true length value of the translation sample text, and the second predicted length value to obtain the text translation model.

    TEXT PROCESSING METHOD

    Publication number: US20230101401A1

    Publication date: 2023-03-30

    Application number: US18056197

    Filing date: 2022-11-16

    Abstract: A text processing method is provided. The method includes: a first probability value of each candidate character of a plurality of candidate characters corresponding to a target position is determined based on character feature information corresponding to the target position in a text fragment to be processed, wherein the character feature information is determined based on a context at the target position in the text fragment to be processed; a second probability value of each candidate character of the plurality of candidate characters is determined based on a character string including the candidate character and at least one character in at least one position in the text fragment to be processed adjacent to the target position; and a correction character at the target position is determined based on the first probability value and the second probability value of each candidate character of the plurality of candidate characters.
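
How the two probability values are combined is not stated in the abstract. The sketch below assumes a simple linear interpolation with a hypothetical `weight` parameter; `first_probs` stands for the context-based probabilities and `second_probs` for the probabilities derived from the adjacent character string, both supplied as plain dictionaries.

```python
from typing import Dict


def pick_correction(
    first_probs: Dict[str, float],
    second_probs: Dict[str, float],
    weight: float = 0.5,
) -> str:
    """Choose the candidate character with the best combined score."""
    scores = {
        # Assumption: weighted sum of the two values; a candidate missing
        # from second_probs contributes 0.0 for that term.
        char: weight * p1 + (1.0 - weight) * second_probs.get(char, 0.0)
        for char, p1 in first_probs.items()
    }
    return max(scores, key=scores.get)
```

With equal weights, a candidate that the context model ranks second can still win if the adjacent character string supports it strongly.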

    Translation Method, Apparatus and Storage Medium

    Publication number: US20230095352A1

    Publication date: 2023-03-30

    Application number: US18074853

    Filing date: 2022-02-05

    Abstract: The present disclosure provides a translation method and apparatus, an electronic device, and a non-transitory storage medium. An implementation includes: determining, by an encoding module, an encoded feature of a sentence to be translated; determining, by a graph network module, a knowledge fusion feature of the sentence to be translated based on a preset graph network, wherein the preset graph network is constructed based on a polysemous word in a source language corresponding to the sentence to be translated and a plurality of translated words corresponding to the polysemous word in a target language; and determining, by a decoding network, a translated sentence corresponding to the sentence to be translated based on the encoded feature and the knowledge fusion feature.

    METHOD FOR CORRECTING TEXT, METHOD FOR GENERATING TEXT CORRECTION MODEL, DEVICE

    Publication number: US20230090625A1

    Publication date: 2023-03-23

    Application number: US18053034

    Filing date: 2022-11-07

    Abstract: Disclosed are a method for correcting a text, an electronic device and a storage medium. The method includes: acquiring a text to be corrected; acquiring a phonetic symbol sequence of the text to be corrected; and obtaining a corrected text by inputting the text to be corrected and the phonetic symbol sequence into a text correction model, in which the text correction model obtains the corrected text by: detecting an error word in the text to be corrected; determining a phonetic symbol corresponding to the error word in the phonetic symbol sequence; adding the phonetic feature corresponding to the phonetic symbol after the error word to obtain a phonetic symbol text; and correcting the error word and the phonetic feature in the phonetic symbol text to obtain the corrected text.

    QUERY ANSWERING METHOD BASED ON LARGE MODEL, ELECTRONIC DEVICE, STORAGE MEDIUM, AND INTELLIGENT AGENT

    Publication number: US20250094460A1

    Publication date: 2025-03-20

    Application number: US18969597

    Filing date: 2024-12-05

    Abstract: A query answering method, an electronic device, a storage medium, and an intelligent agent are provided, which relate to a field of artificial intelligence technology, and in particular to fields of large model, intelligent search and information processing technology. The method includes: inputting, in response to a retrieval content set retrieved based on a query, the query, the retrieval content set and prompt information for answer generation into the large model, so that the large model performs operations of: processing, based on a current task in the prompt information and the query, a current text corresponding to the retrieval content set to obtain a processed text, where the current task is determined based on a task execution order in the prompt information; and obtaining, in a case of determining that the processed text meets a preset condition, an answer to the query based on the processed text.
