-
Publication No.: US20230017302A1
Publication date: 2023-01-19
Application No.: US17949741
Filing date: 2022-09-21
Applicant: Samsung Electronics Co., Ltd.
Inventor: Kyoungbo MIN, Seungdo CHOI, Doohwa HONG
Abstract: An electronic device for providing a text-to-speech (TTS) service and an operating method therefor are provided. The operating method of the electronic device includes obtaining target voice data based on an utterance input of a specific speaker, determining a number of learning steps of the target voice data, based on data features including a data amount of the target voice data, generating a target model by training a pre-trained model pre-trained to convert text into an audio signal, by using the target voice data as training data, based on the determined number of learning steps, generating output data obtained by converting input text into an audio signal, by using the generated target model, and outputting the generated output data.
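The abstract above says only that the number of learning steps is determined from data features that include the amount of target voice data. A minimal sketch of one way such a rule could look follows. The linear scaling, the clamping bounds, and the names `VoiceDataFeatures` and `learning_steps` are assumptions made for illustration, not the method claimed in the application.

```python
from dataclasses import dataclass


@dataclass
class VoiceDataFeatures:
    """Hypothetical container for features of the target voice data."""
    total_seconds: float  # data amount of the target speaker's recordings


def learning_steps(features: VoiceDataFeatures,
                   steps_per_second: float = 10.0,
                   min_steps: int = 500,
                   max_steps: int = 20000) -> int:
    """Pick a fine-tuning step count from the data amount.

    Assumption: steps grow linearly with the amount of audio and are
    clamped to [min_steps, max_steps]. The constants are illustrative.
    """
    raw = int(features.total_seconds * steps_per_second)
    return max(min_steps, min(max_steps, raw))


if __name__ == "__main__":
    for seconds in (10.0, 60.0, 10000.0):
        print(seconds, learning_steps(VoiceDataFeatures(seconds)))
```

The resulting step count would then be passed to whatever routine fine-tunes the pre-trained TTS model on the target speaker's data.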
-
Publication No.: US20230230569A1
Publication date: 2023-07-20
Application No.: US17990358
Filing date: 2022-11-18
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Seungdo CHOI, Kyoungbo MIN, Sooyeon PARK
IPC: G10K11/178, H04R1/10
CPC classification number: G10K11/17885, H04R1/1083, G10K2210/1081, H04R2460/01
Abstract: An electronic apparatus includes an inner microphone provided on a first surface of the electronic apparatus; an outer microphone disposed on a second surface opposite the first surface; and a processor configured to: receive a voice signal of a counterpart and a voice signal of a wearer of the electronic apparatus that are input through the inner microphone and the outer microphone, based on a size of the voice signal of the wearer input through the inner microphone being greater than or equal to a predetermined threshold, remove the voice signal of the wearer input through the outer microphone based on the voice signal of the wearer input through the inner microphone, and amplify the voice signal of the counterpart input through the outer microphone and from which the voice signal of the wearer is removed and output the amplified voice signal, wherein the size of the voice signal of the wearer input through the inner microphone is greater than a size of the voice signal of the wearer input through the outer microphone.
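The processing described in the abstract above (threshold check on the inner microphone, removal of the wearer's voice from the outer microphone, amplification of what remains) can be sketched per audio frame as below. The abstract does not say how the removal is performed; the least-squares scaling used here, the RMS level measure, and the names `rms`, `process_frame`, `threshold`, and `gain` are assumptions chosen to keep the example small.

```python
import math
from typing import List


def rms(frame: List[float]) -> float:
    """Root-mean-square level of one frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(v * v for v in frame) / len(frame))


def process_frame(inner: List[float], outer: List[float],
                  threshold: float = 0.1, gain: float = 2.0) -> List[float]:
    """Suppress the wearer's voice in the outer-mic frame, then amplify.

    Assumption: the inner-mic frame serves as a reference for the
    wearer's voice, and a single least-squares scale factor models how
    that voice appears at the outer microphone.
    """
    cleaned = list(outer)
    energy = sum(v * v for v in inner)
    if rms(inner) >= threshold and energy > 0.0:
        # Scale of the inner signal that best explains the outer signal.
        scale = sum(i * o for i, o in zip(inner, outer)) / energy
        cleaned = [o - scale * i for i, o in zip(inner, outer)]
    # Amplify what remains (ideally the counterpart's voice).
    return [gain * v for v in cleaned]


if __name__ == "__main__":
    wearer = [1.0, -1.0, 1.0, -1.0]
    counterpart = [0.2, 0.2, -0.2, -0.2]
    outer = [0.5 * w + c for w, c in zip(wearer, counterpart)]
    print(process_frame(wearer, outer))
```

A real device would more plausibly use an adaptive filter over many taps rather than one scale factor; the single-factor version only shows the order of operations.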
-
Publication No.: US20210134269A1
Publication date: 2021-05-06
Application No.: US17081251
Filing date: 2020-10-27
Applicant: Samsung Electronics Co., Ltd.
Inventor: Kyoungbo MIN, Seungdo CHOI, Doohwa HONG
Abstract: An electronic device for providing a text-to-speech (TTS) service and an operating method therefor are provided. The operating method of the electronic device includes obtaining target voice data based on an utterance input of a specific speaker, determining a number of learning steps of the target voice data, based on data features including a data amount of the target voice data, generating a target model by training a pre-trained model pre-trained to convert text into an audio signal, by using the target voice data as training data, based on the determined number of learning steps, generating output data obtained by converting input text into an audio signal, by using the generated target model, and outputting the generated output data.
-
Publication No.: US20210065678A1
Publication date: 2021-03-04
Application No.: US17007793
Filing date: 2020-08-31
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Seungdo CHOI, Kyoungbo MIN, Sangjun PARK, Kihyun CHOO
IPC: G10L13/08, G10L13/047
Abstract: A speech synthesis method, performed by an electronic apparatus to synthesize speech from text, includes: obtaining text input to the electronic apparatus; obtaining a text representation by encoding the text using a text encoder of the electronic apparatus; obtaining an audio representation of a first audio frame set from an audio encoder of the electronic apparatus, based on the text representation; obtaining an audio representation of a second audio frame set based on the text representation and the audio representation of the first audio frame set; obtaining an audio feature of the second audio frame set by decoding the audio representation of the second audio frame set; and synthesizing speech based on an audio feature of the first audio frame set and the audio feature of the second audio frame set.
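The order of operations in the abstract above can be traced with a toy dataflow sketch. Only the sequence of steps is taken from the abstract. The functions `text_encoder`, `audio_encoder`, `audio_decoder`, and `synthesize` are hypothetical placeholders with arbitrary arithmetic standing in for trained networks, and the frame-set size is an assumption.

```python
from typing import List

Vector = List[float]


def text_encoder(text: str) -> Vector:
    """Placeholder text encoder: one value in [0, 1) per character."""
    return [(ord(c) % 128) / 128.0 for c in text]


def audio_encoder(text_rep: Vector, prev_audio_rep: Vector) -> Vector:
    """Placeholder audio encoder: blends text context with prior frames."""
    context = sum(text_rep) / len(text_rep)
    return [0.5 * p + 0.5 * context for p in prev_audio_rep]


def audio_decoder(audio_rep: Vector) -> Vector:
    """Placeholder decoder: maps a representation to audio features."""
    return [2.0 * v - 1.0 for v in audio_rep]


def synthesize(text: str, frames_per_set: int = 4) -> Vector:
    """Follow the abstract's steps for two consecutive audio frame sets."""
    if not text:
        raise ValueError("text must be non-empty")
    text_rep = text_encoder(text)
    # First frame set: conditioned on the text representation only
    # (an all-zero previous representation stands in for "no history").
    first_rep = audio_encoder(text_rep, [0.0] * frames_per_set)
    # Second frame set: conditioned on the text representation and the
    # audio representation of the first frame set.
    second_rep = audio_encoder(text_rep, first_rep)
    first_feat = audio_decoder(first_rep)
    second_feat = audio_decoder(second_rep)
    # "Speech" here is just the concatenated features of both sets.
    return first_feat + second_feat


if __name__ == "__main__":
    print(synthesize("hi"))
```

In a real system the concatenated features would be passed to a vocoder to produce a waveform; that stage is left out here.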
-