SPEECH SYNTHESIZER USING ARTIFICIAL INTELLIGENCE, METHOD OF OPERATING SPEECH SYNTHESIZER AND COMPUTER-READABLE RECORDING MEDIUM

    Publication No.: US20210327407A1

    Publication date: 2021-10-21

    Application No.: US16499822

    Filing date: 2019-04-23

    Abstract: A speech synthesizer using artificial intelligence includes a memory configured to store a first ratio of a word classified into a minor class among a plurality of classes, a second ratio of the word which is not classified into the minor class, and a synthesized speech model, and a processor configured to change a first class classification probability set of the word to a second class classification probability set, based on the first ratio, the second ratio and the first class classification probability set, and to train the synthesized speech model using the changed second class classification probability set. The plurality of classes includes a first class corresponding to a first reading break, a second class corresponding to a second reading break greater than the first break, and a third class corresponding to a third reading break greater than the second break, and the minor class has the smallest count among the first to third classes.

    GATHERING USER'S SPEECH SAMPLES
    Invention application

    Publication No.: US20210134301A1

    Publication date: 2021-05-06

    Application No.: US17028527

    Filing date: 2020-09-22

    Abstract: Disclosed is a method of gathering a user's speech samples. According to an embodiment of the disclosure, a method of gathering learning samples may gather a speaker's speech data obtained while talking on a mobile terminal, together with text data generated from the speech data, as training data for generating a speech synthesis model. According to the disclosure, the method of gathering learning samples may be related to artificial intelligence (AI) modules, unmanned aerial vehicles (UAVs), robots, augmented reality (AR) devices, virtual reality (VR) devices, and 5G service-related devices.

    SPEECH SYNTHESIS METHOD BASED ON EMOTION INFORMATION AND APPARATUS THEREFOR

    Publication No.: US20200035216A1

    Publication date: 2020-01-30

    Application No.: US16593404

    Filing date: 2019-10-04

    Abstract: A speech synthesis method and apparatus based on emotion information are disclosed. A method for performing, by a speech synthesis apparatus, speech synthesis based on emotion information according to an embodiment of the present disclosure includes: receiving data; generating emotion information on the basis of the data; generating metadata corresponding to the emotion information; and transmitting the metadata to a speech synthesis engine, wherein the metadata is described in the form of a markup language, and the markup language includes a speech synthesis markup language (SSML). According to the present disclosure, an intelligent computing device constituting a speech synthesis apparatus may be associated with an artificial intelligence module, a drone (unmanned aerial vehicle, UAV), a robot, augmented reality (AR) devices, virtual reality (VR) devices, devices related to 5G services, and the like.
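The step of describing emotion metadata in SSML can be illustrated with a minimal sketch. It is not the patent's implementation: the function name `to_ssml`, the emotion labels, and the mapping from emotion to prosody settings are all assumptions made for illustration. Only the `<speak>` and `<prosody>` elements and their `rate` and `pitch` attributes come from the W3C SSML specification.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical mapping from an emotion label to SSML prosody attributes.
# The abstract does not say how emotion information is encoded in SSML.
PROSODY_BY_EMOTION = {
    "happy": {"rate": "fast", "pitch": "high"},
    "sad": {"rate": "slow", "pitch": "low"},
    "neutral": {"rate": "medium", "pitch": "medium"},
}


def to_ssml(text: str, emotion: str) -> str:
    """Wrap text in an SSML document whose prosody reflects the emotion."""
    # Unknown emotion labels fall back to neutral prosody.
    prosody = PROSODY_BY_EMOTION.get(emotion, PROSODY_BY_EMOTION["neutral"])
    attrs = " ".join(f"{name}={quoteattr(value)}" for name, value in prosody.items())
    return (
        '<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis">'
        f"<prosody {attrs}>{escape(text)}</prosody>"
        "</speak>"
    )


if __name__ == "__main__":
    print(to_ssml("Tom & I won the game!", "happy"))
```

The resulting string is what would be handed to a speech synthesis engine as metadata; escaping the text keeps characters such as `&` from breaking the markup.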

    EMOTION CLASSIFICATION INFORMATION-BASED TEXT-TO-SPEECH (TTS) METHOD AND APPARATUS

    Publication No.: US20210366462A1

    Publication date: 2021-11-25

    Application No.: US16485421

    Filing date: 2019-01-11

    Abstract: Disclosed are an emotion classification information-based text-to-speech (TTS) method and device. The emotion classification information-based TTS method according to an embodiment of the present invention may, when emotion classification information is set in a received message, transmit metadata corresponding to the set emotion classification information to a speech synthesis engine and, when no emotion classification information is set in the received message, generate new emotion classification information through semantic analysis and context analysis of sentences in the received message and transmit the metadata to the speech synthesis engine. The speech synthesis engine may perform speech synthesis by carrying emotion classification information based on the transmitted metadata.
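The two branches described in this abstract can be sketched as follows. This is an illustrative outline only: the message layout (a dict with `text` and an optional `emotion` key), the function name `build_emotion_metadata`, and the `analyze` callable that stands in for semantic and context analysis are all hypothetical.

```python
from typing import Callable, Dict, Optional


def build_emotion_metadata(
    message: Dict[str, Optional[str]],
    analyze: Callable[[str], str],
) -> Dict[str, str]:
    """Return metadata for the speech synthesis engine.

    If the sender already set an emotion on the message, it is used as is.
    Otherwise `analyze` (a stand-in for semantic and context analysis)
    infers one from the message text.
    """
    emotion = message.get("emotion")
    source = "sender"
    if emotion is None:
        emotion = analyze(message["text"] or "")
        source = "analysis"
    return {"emotion": emotion, "source": source}


if __name__ == "__main__":
    # Toy analyzer used only for this demonstration.
    toy_analyze = lambda text: "happy" if "!" in text else "neutral"
    print(build_emotion_metadata({"text": "See you soon!", "emotion": None}, toy_analyze))
    print(build_emotion_metadata({"text": "See you soon!", "emotion": "sad"}, toy_analyze))
```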

    METHOD AND DEVICE FOR SPEECH PROCESSING

    Publication No.: US20210082421A1

    Publication date: 2021-03-18

    Application No.: US16676160

    Filing date: 2019-11-06

    Abstract: Disclosed are a speech processing method and a speech processing apparatus in which speech processing is carried out by executing an artificial intelligence (AI) algorithm and/or a machine learning algorithm, such that the speech processing apparatus, a user terminal, and a server can communicate with each other in a 5G communication environment. The speech processing method according to one exemplary embodiment of the present invention includes converting a response text, which is generated in response to a spoken utterance of a user, to a spoken response utterance, obtaining external situation information while outputting the spoken response utterance, generating a dynamic spoken response utterance by converting the spoken response utterance on the basis of the external situation information, and outputting the dynamic spoken response utterance.

    METHOD FOR SYNTHESIZED SPEECH GENERATION USING EMOTION INFORMATION CORRECTION AND APPARATUS

    Publication No.: US20210074261A1

    Publication date: 2021-03-11

    Application No.: US16928815

    Filing date: 2020-07-14

    Abstract: A method includes generating first synthesized speech by using text and a first emotion information vector configured for the text, extracting a second emotion information vector included in the first synthesized speech, determining whether correction of the second emotion information vector is needed by comparing a loss value calculated by using the first emotion information vector and the second emotion information vector with a preconfigured threshold, re-performing speech synthesis by using a third emotion information vector generated by correcting the second emotion information vector, and outputting the generated synthesized speech, thereby configuring emotion information of speech in a more effective manner. A speech synthesis apparatus may be associated with an artificial intelligence module, a drone (unmanned aerial vehicle, UAV), a robot, augmented reality (AR) devices, virtual reality (VR) devices, devices related to 5G services, and the like.
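The synthesize, extract, compare, and re-synthesize loop in this abstract can be sketched as below. The abstract does not specify the loss function or the correction rule, so both are assumptions here: the loss is a Euclidean distance, and the corrected vector shifts the extracted vector by twice the residual. The `synthesize` and `extract_emotion` callables are hypothetical stand-ins for a TTS engine and an emotion recognizer.

```python
import math
from typing import Any, Callable, List, Sequence


def l2_loss(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two emotion vectors (assumed loss)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def synthesize_with_correction(
    text: str,
    first_vec: Sequence[float],
    synthesize: Callable[[str, Sequence[float]], Any],
    extract_emotion: Callable[[Any], Sequence[float]],
    threshold: float = 0.1,
) -> Any:
    """Synthesize speech, then re-synthesize once if the emotion drifted."""
    speech = synthesize(text, first_vec)
    second_vec = extract_emotion(speech)
    if l2_loss(first_vec, second_vec) <= threshold:
        return speech  # emotion was rendered closely enough
    # Assumed correction rule: move the extracted vector past the target
    # by the residual, so the next pass compensates for the shortfall.
    third_vec: List[float] = [
        s + 2.0 * (f - s) for f, s in zip(first_vec, second_vec)
    ]
    return synthesize(text, third_vec)
```

With a stub engine that renders only half of the requested emotion intensity, a target of `[1.0, 0.0]` is extracted as `[0.5, 0.0]`, the loss of 0.5 exceeds the threshold, and synthesis is repeated with `[1.5, 0.0]`.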

    APPARATUS AND METHOD FOR INSPECTING SPEECH RECOGNITION

    Publication No.: US20200013388A1

    Publication date: 2020-01-09

    Application No.: US16572955

    Filing date: 2019-09-17

    Abstract: Disclosed are a speech recognition verification device and a speech recognition verification method, which verify speech recognition results by executing artificial intelligence (AI) algorithms and/or machine learning algorithms in a 5G environment connected for Internet-of-Things. According to an embodiment, the speech recognition verification method includes converting a verification target text item to a verification target spoken utterance by applying a preset utterance condition, analyzing the verification target spoken utterance and outputting a recognition result text item corresponding to an analysis result, and verifying speech recognition performance through comparison between the verification target text item and the recognition result text item. According to the present disclosure, the speech recognition result may be verified objectively by using a spoken utterance generated with random text and various utterance conditions as input of speech recognition.
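The comparison between the verification target text and the recognition result text can be illustrated with a word error rate (WER) calculation. The abstract does not name a metric, so WER is an assumption chosen because it is a common way to score recognition output; the function name and the whitespace tokenization are also illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / max(len(ref), 1)


if __name__ == "__main__":
    target = "turn on the light"        # verification target text
    recognized = "turn off the light"   # recognition result text
    print(word_error_rate(target, recognized))  # 0.25
```

In a full verification pipeline the target text would first be converted to speech under a chosen utterance condition and then fed to the recognizer; only the final text comparison is shown here.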

    SPEECH SYNTHESIS DEVICE AND SPEECH SYNTHESIS METHOD

    Publication No.: US20230148275A1

    Publication date: 2023-05-11

    Application No.: US17959050

    Filing date: 2022-10-03

    CPC classification number: G10L13/047 G10L25/30

    Abstract: Provided is a speech synthesis device capable of outputting a synthetic voice having various speech styles. The speech synthesis device includes a speaker and a processor configured to acquire voice feature information through a text and a user input, generate a synthetic voice by inputting the text and the voice feature information into a decoder trained under supervision to minimize a difference between feature information of a learning text and characteristic information of a learning voice, and output the generated synthetic voice through the speaker.

    SPEECH SYNTHESIZER USING ARTIFICIAL INTELLIGENCE, METHOD OF OPERATING SPEECH SYNTHESIZER AND COMPUTER-READABLE RECORDING MEDIUM

    Publication No.: US20210327406A1

    Publication date: 2021-10-21

    Application No.: US16499816

    Filing date: 2019-02-15

    Abstract: A speech synthesizer includes a memory configured to store a plurality of sentences and prior information of a word classified into a minor class among a plurality of classes with respect to each sentence, and a processor configured to determine an oversampling rate of the word based on the prior information, determine the number of times the word is to be oversampled using the determined oversampling rate, and generate sentences including the word according to the determined number of times. The plurality of classes includes a first class corresponding to a first reading break, a second class corresponding to a second reading break greater than the first break, and a third class corresponding to a third reading break greater than the second break, and the minor class has the smallest count among the first to third classes in one sentence.
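The oversampling flow (rate from prior information, then a count, then extra sentences) can be sketched as below. The abstract gives no formula, so the rate rule here is a placeholder assumption: the rate grows linearly with `prior`, taken to be the fraction of the word's occurrences labelled with the minor class. Sentence generation is simplified to duplicating existing sentences that contain the word.

```python
import math
from typing import List


def oversample(
    sentences: List[str],
    word: str,
    prior: float,
    max_rate: float = 3.0,
) -> List[str]:
    """Return the corpus plus extra copies of sentences containing `word`.

    prior:    assumed fraction (0..1) of the word's occurrences that fall
              in the minor class.
    max_rate: hypothetical cap on the oversampling rate.
    """
    rate = max_rate * prior  # placeholder rule, not taken from the patent
    matching = [s for s in sentences if word in s.split()]
    if not matching:
        return list(sentences)
    times = math.floor(len(matching) * rate)
    # Cycle through the matching sentences until `times` copies are added.
    extra = [matching[i % len(matching)] for i in range(times)]
    return list(sentences) + extra


if __name__ == "__main__":
    corpus = ["but we left", "we stayed home", "but it rained"]
    print(len(oversample(corpus, "but", prior=0.5)))  # 6
```

Here two sentences contain "but", the assumed rate is 1.5, and three extra sentences are added to the original three.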
