TEXT PROCESSING METHOD
    Invention application

    Publication (announcement) number: US20230101401A1

    Publication (announcement) date: 2023-03-30

    Application number: US18056197

    Application date: 2022-11-16

    Abstract: A text processing method is provided. The method includes: determining a first probability value for each of a plurality of candidate characters corresponding to a target position, based on character feature information corresponding to the target position in a text fragment to be processed, wherein the character feature information is determined based on the context at the target position in the text fragment; determining a second probability value for each candidate character based on a character string that includes the candidate character and at least one character at a position in the text fragment adjacent to the target position; and determining a correction character at the target position based on the first probability value and the second probability value of each candidate character.
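
A minimal sketch of the final step in the abstract above, assuming both probability values are already available for each candidate character. The function name `pick_correction`, the `weight` parameter, and the log-linear combination are illustrative assumptions; the abstract does not say how the two values are combined.

```python
import math

def pick_correction(first_probs, second_probs, weight=0.5):
    """Return the candidate character with the best combined score.

    first_probs:  candidate -> first probability value (from context features)
    second_probs: candidate -> second probability value (from the string formed
                  by the candidate and its adjacent characters)
    weight:       hypothetical interpolation weight (an assumption, not from
                  the abstract)
    """
    def score(candidate):
        # Clamp to a small floor so log() never receives zero.
        p1 = max(first_probs[candidate], 1e-12)
        p2 = max(second_probs.get(candidate, 0.0), 1e-12)
        return weight * math.log(p1) + (1 - weight) * math.log(p2)

    return max(first_probs, key=score)

# Toy values: the context model slightly prefers "a", but the string that
# contains "e" is far more likely, so "e" wins the combined score.
first = {"a": 0.5, "e": 0.4, "o": 0.1}
second = {"a": 0.1, "e": 0.8, "o": 0.1}
best = pick_correction(first, second)
print(best)  # e
```

Setting `weight=1.0` in this sketch ignores the second probability value entirely, which makes it easy to see how much the adjacent-character string changes the outcome.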

    Translation Method, Apparatus and Storage Medium

    Publication (announcement) number: US20230095352A1

    Publication (announcement) date: 2023-03-30

    Application number: US18074853

    Application date: 2022-02-05

    Abstract: The present disclosure provides a translation method and apparatus, an electronic device, and a non-transitory storage medium. An implementation includes: determining, by an encoding module, an encoded feature of a sentence to be translated; determining, by a graph network module, a knowledge fusion feature of the sentence to be translated based on a preset graph network, wherein the preset graph network is constructed based on a polysemous word in a source language corresponding to the sentence to be translated and a plurality of translated words corresponding to the polysemous word in a target language; and determining, by a decoding network, a translated sentence corresponding to the sentence to be translated based on the encoded feature and the knowledge fusion feature.

    METHOD OF TRAINING DEEP LEARNING MODEL AND METHOD OF PROCESSING TEXT DATA

    Publication (announcement) number: US20230088360A1

    Publication (announcement) date: 2023-03-23

    Application number: US18059389

    Application date: 2022-11-28

    Abstract: A method of training a deep learning model is provided, which relates to the field of artificial intelligence, in particular to natural language processing and machine translation technologies. A specific implementation solution includes: processing sample source data and corresponding sample target data respectively by using the deep learning model, so as to obtain a first output value and a second output value; determining a regularization function value according to the first output value and the second output value; and adjusting a parameter of the deep learning model according to the regularization function value, so as to obtain a pre-trained deep learning model. A method of processing text data, an electronic device, and a storage medium are further provided.

    Method and Apparatus for Selecting Sample Corpus Used to Optimize Translation Model

    Publication (announcement) number: US20230140997A1

    Publication (announcement) date: 2023-05-11

    Application number: US18089392

    Application date: 2022-12-27

    CPC classification number: G06F40/58

    Abstract: A method and apparatus for selecting a sample corpus used to optimize a translation model, an electronic device, a computer readable storage medium, and a computer program product are provided. The method includes: acquiring a first corpus; translating the first corpus by using a to-be-optimized translation model to acquire a second corpus in a different language; translating the second corpus by using the to-be-optimized translation model to acquire a third corpus; determining a difficulty level of the first corpus based on a similarity between the first corpus and the third corpus; and, in response to the difficulty level satisfying the requirements of a difficulty level threshold, determining the first corpus as a sample corpus used to perform optimization training on the to-be-optimized translation model.
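
The round-trip selection described above can be sketched as follows. This is a hedged illustration only: the `translate(text, target_lang)` callable, the lookup-table toy model, the use of `difflib.SequenceMatcher` as the similarity measure, the definition of difficulty as one minus similarity, and the `threshold` value are all assumptions, since the abstract does not name a similarity metric or say which side of the threshold qualifies.

```python
from difflib import SequenceMatcher

def round_trip_difficulty(first_corpus, translate, src_lang="en", pivot_lang="fr"):
    """Translate out and back with the same model, then compare.

    Difficulty is taken here as 1 - similarity (an assumption).
    """
    second_corpus = translate(first_corpus, pivot_lang)
    third_corpus = translate(second_corpus, src_lang)
    similarity = SequenceMatcher(None, first_corpus, third_corpus).ratio()
    return 1.0 - similarity

def select_samples(corpora, translate, threshold=0.3):
    """Keep corpora whose round-trip difficulty reaches the threshold."""
    return [c for c in corpora if round_trip_difficulty(c, translate) >= threshold]

# Hypothetical stand-in for the to-be-optimized model: a fixed lookup table.
TOY_TABLE = {
    ("hello world", "fr"): "bonjour le monde",
    ("bonjour le monde", "en"): "hello world",
    ("river bank", "fr"): "banque",
    ("banque", "en"): "financial institution",
}

def toy_translate(text, target_lang):
    return TOY_TABLE[(text, target_lang)]

selected = select_samples(["hello world", "river bank"], toy_translate)
print(selected)  # ['river bank']
```

In this toy run the first sentence survives the round trip unchanged, so its difficulty is zero, while the second comes back with a different sense and is kept as a training sample.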

    Method for Evaluating Text Content, and Related Apparatus

    Publication (announcement) number: US20230196026A1

    Publication (announcement) date: 2023-06-22

    Application number: US18109813

    Application date: 2023-02-14

    CPC classification number: G06F40/30 G06F40/289

    Abstract: A method for evaluating text content is provided, which may include: splitting a to-be-evaluated text into a plurality of clauses arranged in sequence according to punctuation information of the to-be-evaluated text, and determining the first clause of the plurality of clauses as an actual tune name; then, in response to the number of clauses, from the third clause to the last clause, whose Chinese character counts satisfy the clause character count requirements corresponding to the actual tune name exceeding a number threshold, determining actual prosodic information based on a Chinese phonetic alphabet text of the third clause to the last clause; and finally, in response to the actual prosodic information being consistent with standard prosodic information of the actual tune name, evaluating the to-be-evaluated text as a Ci-poetry text.

    TRANSLATION METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM

    Publication (announcement) number: US20230153548A1

    Publication (announcement) date: 2023-05-18

    Application number: US17885152

    Application date: 2022-08-10

    CPC classification number: G06F40/58

    Abstract: A translation method, an electronic device, and a storage medium are disclosed, which relate to the field of artificial intelligence technologies, such as machine learning and information processing technologies. An implementation includes: acquiring an intermediate translation result generated by each of multiple pre-trained translation models for a to-be-translated specified sentence in a same iteration of a translation process, so as to obtain multiple intermediate translation results; acquiring a co-occurrence word based on the multiple intermediate translation results; and acquiring a target translation result of the specified sentence based on the co-occurrence word.
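
A minimal sketch of the co-occurrence step, under the assumption that each model's intermediate result in a given iteration is a list of candidate words and that a "co-occurrence word" is a word proposed by at least two models. The function name `co_occurrence_word` and the majority-style selection are hypothetical; the abstract does not define how the co-occurrence word is chosen.

```python
from collections import Counter

def co_occurrence_word(intermediate_results):
    """Return the word proposed by the most models, or None if no word
    appears in at least two intermediate results."""
    counts = Counter()
    for result in intermediate_results:
        # Count each word once per model, keeping first-seen order.
        counts.update(list(dict.fromkeys(result)))
    if not counts:
        return None
    word, n_models = counts.most_common(1)[0]
    return word if n_models >= 2 else None

# Three hypothetical models each propose candidates for the same iteration.
results = [["cat", "feline"], ["cat", "kitty"], ["kitten", "cat"]]
print(co_occurrence_word(results))  # cat
```

A full decoder would call such a function once per iteration and append the chosen word to the running translation; that loop is omitted here because the abstract gives no detail on it.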

    TRAINING METHOD, TEXT TRANSLATION METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM

    Publication (announcement) number: US20230076471A1

    Publication (announcement) date: 2023-03-09

    Application number: US17982965

    Application date: 2022-11-08

    Abstract: A training method, a text translation method, an electronic device, and a storage medium are provided, which relate to the field of artificial intelligence, in particular to natural language processing and deep learning technologies. A specific implementation solution includes: performing a feature extraction on source sample text data to obtain a sample feature vector sequence; obtaining a target sample feature vector according to the sample feature vector sequence; performing an autoregressive decoding and a non-autoregressive decoding on the sample feature vector sequence, respectively; performing a length prediction on the target sample feature vector; and training a predetermined model by using translation sample data, the autoregressive text translation result, the non-autoregressive text translation result, a true length value of the source sample text, the first predicted length value, a true length value of the translation sample text, and the second predicted length value to obtain the text translation model.

    Machine Translation Method and Apparatus, Device and Storage Medium

    Publication (announcement) number: US20230153550A1

    Publication (announcement) date: 2023-05-18

    Application number: US18096297

    Application date: 2023-01-12

    CPC classification number: G06F40/58 G06F40/51

    Abstract: A machine translation method can include: acquiring a to-be-translated source text; generating an intervention text corresponding to the to-be-translated source text by using intervention symbols, the intervention text including a term vocabulary part and an other text part; translating the intervention text to obtain a first translation result of the intervention text, where the first translation result includes a translation result of the other text part and the term vocabulary part; and generating a target translated text of the to-be-translated source text based on the first translation result and preset translated content of the term vocabulary part.
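
The intervention flow above could look roughly like the sketch below. The `<tN>...</tN>` markers, the glossary format, the helper names, and the assumption that the translator preserves the markers around the term part are all illustrative choices; the abstract does not specify what the intervention symbols are.

```python
import re

def add_intervention(source, glossary):
    """Wrap each glossary term in hypothetical intervention symbols."""
    marked = source
    for i, term in enumerate(glossary):
        marked = marked.replace(term, f"<t{i}>{term}</t{i}>")
    return marked

def apply_preset(first_translation, glossary):
    """Replace each marked span with the preset translation of its term."""
    result = first_translation
    for i, preset in enumerate(glossary.values()):
        # A function replacement avoids backslash handling in the preset text.
        result = re.sub(rf"<t{i}>.*?</t{i}>", lambda m, p=preset: p, result)
    return result

def translate_with_terms(source, glossary, translate):
    intervention_text = add_intervention(source, glossary)
    first_translation = translate(intervention_text)
    return apply_preset(first_translation, glossary)

# Glossary maps a source term to its preset translated content; here the
# product name is kept untranslated.
glossary = {"Model Hub": "Model Hub"}

# Hypothetical translator that keeps the markers but mistranslates the term.
def toy_translate(text):
    return "Abra <t0>Centro de Modelos</t0> ahora"

print(translate_with_terms("Open Model Hub now", glossary, toy_translate))
# Abra Model Hub ahora
```

The design point this illustrates is that the model still translates the term part, but its output for that part is discarded in favor of the preset content.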
