EMOTION CLASSIFICATION INFORMATION-BASED TEXT-TO-SPEECH (TTS) METHOD AND APPARATUS

    Publication No.: US20210366462A1

    Publication Date: 2021-11-25

    Application No.: US16485421

    Application Date: 2019-01-11

    Abstract: Disclosed are an emotion classification information-based text-to-speech (TTS) method and device. According to an embodiment of the present invention, when emotion classification information is set in a received message, the method transmits metadata corresponding to the set emotion classification information to a speech synthesis engine; when no emotion classification information is set in the received message, the method generates new emotion classification information through semantic and context analysis of the sentences in the received message and transmits corresponding metadata to the speech synthesis engine. The speech synthesis engine may then perform speech synthesis that carries the emotion classification information indicated by the transmitted metadata.

    METHOD AND DEVICE FOR SPEECH PROCESSING

    Publication No.: US20210082421A1

    Publication Date: 2021-03-18

    Application No.: US16676160

    Application Date: 2019-11-06

    Abstract: Disclosed are a speech processing method and a speech processing apparatus, characterized in that speech processing is carried out by executing an artificial intelligence (AI) algorithm and/or a machine learning algorithm, such that the speech processing apparatus, a user terminal, and a server can communicate with each other in a 5G communication environment. The speech processing method according to one exemplary embodiment of the present invention includes converting a response text, which is generated in response to a spoken utterance of a user, into a spoken response utterance; obtaining external situation information while outputting the spoken response utterance; generating a dynamic spoken response utterance by converting the spoken response utterance on the basis of the external situation information; and outputting the dynamic spoken response utterance.
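The conversion step (adapting an already-rendered spoken response to external situation information observed during playback) can be illustrated with a toy rule. The noise threshold, the gain/rate adjustments, and the parameter names are assumptions for illustration only; the patent does not specify them.

```python
def adapt_utterance(params: dict, ambient_noise_db: float) -> dict:
    """Convert a spoken-response rendering based on external situation
    information observed while the response is playing: here, raise the
    gain and slow the speaking rate in a noisy environment."""
    adapted = dict(params)          # leave the original rendering intact
    if ambient_noise_db > 60.0:     # illustrative "noisy room" threshold
        adapted["gain_db"] = params["gain_db"] + 6.0
        adapted["rate"] = params["rate"] * 0.9
    return adapted


base = {"gain_db": 0.0, "rate": 1.0}
print(adapt_utterance(base, ambient_noise_db=72.0))  # dynamic utterance
print(adapt_utterance(base, ambient_noise_db=40.0))  # unchanged
```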

    METHOD FOR SYNTHESIZED SPEECH GENERATION USING EMOTION INFORMATION CORRECTION AND APPARATUS

    Publication No.: US20210074261A1

    Publication Date: 2021-03-11

    Application No.: US16928815

    Application Date: 2020-07-14

    Abstract: A method includes generating first synthesized speech by using text and a first emotion information vector configured for the text; extracting a second emotion information vector from the first synthesized speech; determining whether correction of the second emotion information vector is needed by comparing a loss value, calculated from the first and second emotion information vectors, with a preconfigured threshold; re-performing speech synthesis by using a third emotion information vector generated by correcting the second emotion information vector; and outputting the resulting synthesized speech, thereby configuring the emotion information of speech in a more effective manner. The speech synthesis apparatus may be associated with an artificial intelligence module, a drone (unmanned aerial vehicle, UAV), a robot, augmented reality (AR) devices, virtual reality (VR) devices, devices related to 5G services, and the like.
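The correction loop in this abstract has a clear algorithmic shape: compute a loss between the target (first) and extracted (second) emotion vectors, and produce a corrected (third) vector only when the loss exceeds the threshold. The sketch below assumes mean squared error as the loss and a simple interpolation step as the correction; both are illustrative choices, not the patent's.

```python
def mse(a: list[float], b: list[float]) -> float:
    """Loss between the target and extracted emotion information vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def correct_emotion_vector(first, second, threshold=0.01, step=0.5):
    """If the loss between the first (target) and second (extracted)
    vectors exceeds the threshold, nudge the extracted vector toward the
    target to obtain a corrected third vector; otherwise keep it as-is.
    Returns (vector, correction_was_needed)."""
    if mse(first, second) <= threshold:
        return second, False
    third = [y + step * (x - y) for x, y in zip(first, second)]
    return third, True


target = [0.8, 0.1, 0.1]      # e.g. (joy, sadness, anger) weights
extracted = [0.4, 0.3, 0.3]   # as measured from the first synthesized speech
corrected, needed = correct_emotion_vector(target, extracted)
print(needed, corrected)
```

In the abstract's flow, a `True` result would trigger re-synthesis with the corrected vector before the speech is output.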

    APPARATUS AND METHOD FOR INSPECTING SPEECH RECOGNITION

    Publication No.: US20200013388A1

    Publication Date: 2020-01-09

    Application No.: US16572955

    Application Date: 2019-09-17

    Abstract: Disclosed are a speech recognition verification device and a speech recognition verification method, which verify speech recognition results by executing artificial intelligence (AI) algorithms and/or machine learning algorithms in a 5G environment connected for the Internet of Things (IoT). According to an embodiment, the speech recognition verification method includes converting a verification target text item into a verification target spoken utterance by applying a preset utterance condition, analyzing the verification target spoken utterance and outputting a recognition result text item corresponding to the analysis result, and verifying speech recognition performance through comparison between the verification target text item and the recognition result text item. According to the present disclosure, the speech recognition result may be verified objectively by using spoken utterances generated from random text under various utterance conditions as input to the speech recognition.
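The final comparison step (verification target text vs. recognition result text) can be sketched with a simple token-overlap score. This metric, built on the standard library's `difflib.SequenceMatcher`, is an illustrative stand-in; production systems typically report word error rate instead. The normalization rule is also an assumption.

```python
import difflib


def normalize(text: str) -> list[str]:
    """Case- and punctuation-insensitive token stream for comparison."""
    cleaned = "".join(c.lower() if c.isalnum() or c.isspace() else " "
                      for c in text)
    return cleaned.split()


def verify_recognition(target_text: str, recognized_text: str) -> float:
    """Score speech recognition by comparing the verification target text
    with the recognizer's output: the fraction of target tokens the
    recognizer reproduced in order."""
    target = normalize(target_text)
    hyp = normalize(recognized_text)
    matcher = difflib.SequenceMatcher(a=target, b=hyp)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(target)


print(verify_recognition("Read me the weather forecast",
                         "read me the whether forecast"))
```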

    VOICE INTERPRETATION DEVICE

    Publication No.: US20200243106A1

    Publication Date: 2020-07-30

    Application No.: US16850810

    Application Date: 2020-04-16

    Abstract: An apparatus includes a microphone and a processor. The processor is configured to receive, via the microphone, audio comprising a person's voice, and to determine whether the received audio is an actual voice or a synthesized voice. The apparatus provides a first notification indicating that the received audio is the actual voice when the received audio is the actual voice, and a second notification indicating that the received audio is the synthesized voice when the received audio is the synthesized voice.
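The classify-then-notify flow of this abstract can be sketched as below. The threshold rule standing in for the detector is purely an assumption for illustration: real synthesized-voice detection uses spectral/prosodic cues or a trained anti-spoofing model, and the feature name and notification strings are hypothetical.

```python
def classify_audio(features: dict) -> bool:
    """Placeholder detector returning True for an actual human voice.
    The spectral-flatness threshold is an illustrative assumption."""
    return features.get("spectral_flatness", 1.0) < 0.5


def notify(is_actual: bool) -> str:
    """Select the notification per the abstract: a first notification for
    an actual voice, a second for a synthesized voice."""
    return ("Notice: actual voice detected" if is_actual
            else "Notice: synthesized voice detected")


audio_features = {"spectral_flatness": 0.31}  # hypothetical measurement
print(notify(classify_audio(audio_features)))
```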

    VOICE INTERPRETATION DEVICE

    Publication No.: US20200043515A1

    Publication Date: 2020-02-06

    Application No.: US16151091

    Application Date: 2018-10-03

    Abstract: An apparatus includes a microphone and a processor. The processor is configured to receive, via the microphone, audio comprising a person's voice, and to determine whether the received audio is an actual voice or a synthesized voice. The apparatus provides a first notification indicating that the received audio is the actual voice when the received audio is the actual voice, and a second notification indicating that the received audio is the synthesized voice when the received audio is the synthesized voice.

    SPEECH SYNTHESIS METHOD AND APPARATUS BASED ON EMOTION INFORMATION

    Publication No.: US20200035215A1

    Publication Date: 2020-01-30

    Application No.: US16593161

    Application Date: 2019-10-04

    Abstract: A speech synthesis method and apparatus based on emotion information are disclosed. The speech synthesis method extracts speech synthesis target text from received data and determines whether the received data includes situation explanation information. When it does, first metadata corresponding to first emotion information is generated on the basis of the situation explanation information. When the received data does not include situation explanation information, second metadata corresponding to second emotion information, generated on the basis of semantic analysis and context analysis, is generated. One of the first metadata and the second metadata is added to the speech synthesis target text to synthesize speech corresponding to the received data. The speech synthesis apparatus of this disclosure may be associated with an artificial intelligence module, a drone (unmanned aerial vehicle, UAV), a robot, augmented reality (AR) devices, virtual reality (VR) devices, devices related to 5G services, and the like.
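The first-metadata/second-metadata selection this abstract describes can be sketched as a small dispatch function. The field names, the metadata layout, and the toy `analyze` helper standing in for semantic/context analysis are all assumptions for illustration.

```python
def analyze(text: str) -> str:
    """Toy semantic/context analysis; a real system would use an NLU model."""
    return "joy" if "!" in text else "neutral"


def build_synthesis_input(data: dict) -> dict:
    """Choose between first metadata (from explicit situation explanation
    information) and second metadata (from semantic/context analysis of
    the text), and attach it to the speech synthesis target text."""
    text = data["text"]                      # speech synthesis target text
    if "situation" in data:                  # first path: explicit explanation
        metadata = {"source": "situation", "emotion": data["situation"]}
    else:                                    # second path: inferred
        metadata = {"source": "analysis", "emotion": analyze(text)}
    return {"text": text, "metadata": metadata}


print(build_synthesis_input({"text": "We won!", "situation": "excited"}))
print(build_synthesis_input({"text": "Meeting at three."}))
```

As in the listing's first patent, the synthesis engine downstream consumes a single metadata record regardless of which path produced it.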
