-
Publication No.: US20210233537A1
Publication Date: 2021-07-29
Application No.: US16917784
Filing Date: 2020-06-30
Applicant: LG ELECTRONICS INC.
Inventor: Siyoung YANG, Yongchul PARK, Sungmin HAN, Sangki KIM, Juyeong JANG, Minook KIM
Abstract: Disclosed is a device for controlling a plurality of voice recognition devices for determining and selecting a first voice recognition device that a user wants to use based on a point in time when the voice of the user is spoken or a place where the user spoke the voice. The device for controlling a plurality of voice recognition devices according to the present disclosure may be associated with an artificial intelligence module, a robot, an augmented reality (AR) device, a virtual reality (VR) device, a device related to 5G service, etc.
-
Publication No.: US20210096810A1
Publication Date: 2021-04-01
Application No.: US16703768
Filing Date: 2019-12-04
Applicant: LG ELECTRONICS INC.
Inventor: Sang Ki KIM, Yongchul PARK, Sungmin HAN, Siyoung YANG, Juyeong JANG, Minook KIM
Abstract: Disclosed are a sound source focus method and device that, in a 5G communication environment, amplify and output a sound source signal of a user's object of interest extracted from an acoustic signal included in video content by executing a loaded artificial intelligence (AI) algorithm and/or machine learning algorithm. The sound source focus method includes playing video content including a video signal including at least one moving object and the acoustic signal in which sound sources output by the object are mixed, determining the user's object of interest from the video signal, acquiring unique sound source information about the user's object of interest, extracting an actual sound source for the user's object of interest corresponding to the unique sound source information from the acoustic signal, and outputting the actual sound source extracted for the user's object of interest.
-
Publication No.: US20210174796A1
Publication Date: 2021-06-10
Application No.: US16810013
Filing Date: 2020-03-05
Applicant: LG ELECTRONICS INC.
Inventor: Jong Hoon CHAE, Minook KIM, Yongchul PARK, Sungmin HAN, Siyoung YANG, Sangki KIM, Juyeong JANG
Abstract: A method and apparatus for controlling a device according to an embodiment of the present disclosure may be based on a speech feature of a user reflecting the Lombard effect so as to operate a device located far away from the user, among a plurality of electronic devices. As such, even when the user calls a device located far away from the user without any separate context information, speech recognition neural networks and weight calculation neural networks may be selected and used to operate the device located far away from the user, and reception of a speech signal of the user calling a device located far away from the user may be performed in an Internet of Things (IoT) environment using a 5G network.
-
Publication No.: US20200058290A1
Publication Date: 2020-02-20
Application No.: US16660947
Filing Date: 2019-10-23
Applicant: LG ELECTRONICS INC.
Inventor: Jonghoon CHAE, Minook KIM, Sangki KIM, Yongchul PARK, Siyoung YANG, Juyeong JANG, Sungmin HAN
Abstract: Disclosed herein is an artificial intelligence apparatus including a memory configured to store learning target text and human speech of a person who pronounces the text, a processor configured to generate synthesized speech in which the text is pronounced by synthesized sound and extract a synthesized speech feature set including information on a feature pronounced in the synthesized speech and a human speech feature set including information on a feature pronounced in the human speech, and a learning processor configured to train a speech correction model for outputting a corrected speech feature set to allow predetermined synthesized speech to be corrected based on a human pronunciation feature when a synthesized speech feature set extracted from predetermined synthesized speech is input, based on the synthesized speech feature set and the human speech feature set.
-
Publication No.: US20200043495A1
Publication Date: 2020-02-06
Application No.: US16601787
Filing Date: 2019-10-15
Applicant: LG ELECTRONICS INC.
Inventor: Yongchul PARK, Minook KIM, Sang Ki KIM, Siyoung YANG, Juyeong JANG, Sungmin HAN
Abstract: A method for performing multi-language communication includes receiving an utterance, identifying a language of the received utterance, determining whether the identified language matches a preset reference language, applying, to the received utterance, an interpretation model interpreting the identified language into the reference language when the identified language does not match the reference language, changing, to text, speech data which is outputted in the reference language as a result of applying the interpretation model, generating a response message responding to the text of the speech data, and outputting the response message. Here, the interpretation model may be a deep neural network model generated through machine learning, and the interpretation model may be stored in an edge device or provided through a server in an Internet of things environment through a 5G network.
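The order of steps in this abstract can be illustrated with a minimal Python sketch. It is not the applicant's implementation: utterances are represented as plain strings instead of audio (so the speech-to-text step is collapsed), the function names `handle_utterance`, `toy_identify`, `toy_interpret`, and `toy_respond` are hypothetical, and passing a matching-language utterance straight through is an assumption, since the abstract does not say what happens in that case.

```python
from typing import Callable

def handle_utterance(
    utterance: str,
    reference_language: str,
    identify_language: Callable[[str], str],
    interpret: Callable[[str, str, str], str],
    respond: Callable[[str], str],
) -> str:
    """Identify the language, interpret only on a mismatch, then respond."""
    language = identify_language(utterance)
    if language != reference_language:
        # Mismatch: apply the interpretation model (hypothetical callable).
        text = interpret(utterance, language, reference_language)
    else:
        # Assumption: a matching utterance is used as-is.
        text = utterance
    return respond(text)

# Toy stand-ins so the sketch runs; a real system would use trained models.
_TOY_DICTIONARY = {"hola": "hello"}

def toy_identify(utterance: str) -> str:
    return "es" if utterance in _TOY_DICTIONARY else "en"

def toy_interpret(utterance: str, source: str, target: str) -> str:
    return _TOY_DICTIONARY[utterance]

def toy_respond(text: str) -> str:
    return f"You said: {text}"

reply = handle_utterance("hola", "en", toy_identify, toy_interpret, toy_respond)
```

Passing the models in as callables keeps the control flow independent of where the interpretation model runs, whether on an edge device or on a server.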
-
Publication No.: US20210174782A1
Publication Date: 2021-06-10
Application No.: US16803941
Filing Date: 2020-02-27
Applicant: LG ELECTRONICS INC.
Inventor: Minook KIM, Yongchul PARK, Sungmin HAN, Siyoung YANG, Sangki KIM, Juyeong JANG
IPC: G10L13/10, G10L13/047, G06N20/00, G06N5/04
Abstract: An artificial intelligence device includes a memory and a processor. The memory is configured to store audio data having a predetermined speech style. The processor is configured to generate a condition vector relating to a condition for determining the speech style of the audio data, reduce a dimension of the condition vector to a predetermined reduction dimension, acquire a sparse code vector based on a dictionary vector acquired through sparse dictionary coding with respect to the condition vector having the predetermined reduction dimension, and change a vector element value included in the sparse code vector.
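As a rough illustration of the vector operations named in this abstract, the Python sketch below reduces a vector with a fixed projection matrix, codes it against a given dictionary, and edits one code element. Everything concrete here is an assumption: the abstract does not specify the reduction method, the dictionary is supplied by hand here instead of being learned, and greedy matching pursuit is swapped in as one simple way to obtain a sparse code. The names `project`, `matching_pursuit`, and `reconstruct` are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(vector, matrix):
    """Reduce dimension with a (reduced_dim x original_dim) matrix."""
    return [dot(row, vector) for row in matrix]

def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse code over unit-norm atoms (assumed technique)."""
    residual = list(signal)
    code = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        scores = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        code[best] += scores[best]
        # Remove the chosen atom's contribution from the residual.
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return code

def reconstruct(code, atoms):
    """Weighted sum of atoms, one weight per code element."""
    dim = len(atoms[0])
    return [sum(c * atom[i] for c, atom in zip(code, atoms)) for i in range(dim)]

# Illustrative values only.
condition = [2.0, 4.0, 1.0, 1.0]
projection = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
atoms = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]  # unit-norm dictionary

reduced = project(condition, projection)
code = matching_pursuit(reduced, atoms, n_nonzero=1)
code[0] = 1.5  # change one element value of the sparse code
edited = reconstruct(code, atoms)
```

Because the code is sparse, changing a single element alters the reconstructed vector along one dictionary atom only.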
-
Publication No.: US20210134262A1
Publication Date: 2021-05-06
Application No.: US17029582
Filing Date: 2020-09-23
Applicant: LG ELECTRONICS INC.
Inventor: Minook KIM, Yongchul PARK, Sungmin HAN, Siyoung YANG, Sangki KIM, Juyeong JANG
IPC: G10L13/033, G10L15/02, G10L13/047, G10L25/51, G10L15/24
Abstract: Disclosed is speech synthesis in a noisy environment. According to an embodiment of the disclosure, a method of speech synthesis may generate a Lombard effect-applied synthesized speech using a feature vector generated from an utterance feature. According to the disclosure, the speech synthesis method and device may be related to artificial intelligence (AI) modules, unmanned aerial vehicles (UAVs), robots, augmented reality (AR) devices, virtual reality (VR) devices, and 5G service-related devices.
-
Publication No.: US20200035215A1
Publication Date: 2020-01-30
Application No.: US16593161
Filing Date: 2019-10-04
Applicant: LG Electronics Inc.
Inventor: Siyoung YANG, Minook KIM, Sangki KIM, Yongchul PARK, Juyeong JANG, Sungmin HAN
Abstract: A speech synthesis method and apparatus based on emotion information are disclosed. A speech synthesis method based on emotion information extracts speech synthesis target text from received data and determines whether the received data includes situation explanation information. First metadata corresponding to first emotion information is generated on the basis of the situation explanation information. When the extracted data does not include situation explanation information, second metadata corresponding to second emotion information generated on the basis of semantic analysis and context analysis is generated. One of the first metadata and the second metadata is added to the speech synthesis target text to synthesize speech corresponding to the extracted data. A speech synthesis apparatus of this disclosure may be associated with an artificial intelligence module, drone (unmanned aerial vehicle, UAV), robot, augmented reality (AR) devices, virtual reality (VR) devices, devices related to 5G services, and the like.
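The branching described in this abstract (use situation explanation information when present, otherwise fall back to analysis of the text) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `ReceivedData` type, the `attach_emotion_metadata` function, the cue tables, and the keyword lookup are all hypothetical stand-ins, and the keyword lookup in particular replaces the semantic and context analysis, which the abstract does not detail.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ReceivedData:
    text: str
    situation_explanation: Optional[str] = None

# Hypothetical cue tables standing in for real analysis models.
_SITUATION_CUES = {"laughing": "joy", "crying": "sadness"}
_TEXT_CUES = {"congratulations": "joy", "sorry": "sadness"}

def _lookup(cues: dict, source: str) -> str:
    lowered = source.lower()
    for cue, emotion in cues.items():
        if cue in lowered:
            return emotion
    return "neutral"

def attach_emotion_metadata(data: ReceivedData) -> dict:
    """Attach exactly one emotion tag to the synthesis target text."""
    if data.situation_explanation is not None:
        # First metadata: derived from the situation explanation.
        emotion = _lookup(_SITUATION_CUES, data.situation_explanation)
        origin = "situation"
    else:
        # Second metadata: derived from the text itself (toy analysis).
        emotion = _lookup(_TEXT_CUES, data.text)
        origin = "analysis"
    return {"text": data.text, "emotion": emotion, "origin": origin}

tagged = attach_emotion_metadata(ReceivedData("See you soon", "laughing"))
```

Recording the origin alongside the emotion tag makes it easy to check which of the two branches produced the metadata.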
-
Publication No.: US20200005763A1
Publication Date: 2020-01-02
Application No.: US16561410
Filing Date: 2019-09-05
Applicant: LG ELECTRONICS INC.
Inventor: Jonghoon CHAE, Minook KIM, Sangki KIM, Yongchul PARK, Siyoung YANG, Juyeong JANG, Sungmin HAN
IPC: G10L13/08, G10L15/02, G10L13/047, G10L13/033
Abstract: Disclosed is an artificial intelligence (AI)-based voice sampling apparatus for providing a speech style, including a rhyme encoder configured to receive a user's voice, extract a voice sample, and analyze a vocal feature included in the voice sample, a text encoder configured to receive text for reflecting the vocal feature, a processor configured to classify the vocal feature of the voice sample input to the rhyme encoder according to a label, extract an embedding vector representing the vocal feature from the label, and generate a speech style from the embedding vector and apply the generated speech style to the text, and a rhyme decoder configured to output synthesized voice data in which the speech style is applied to the text by the processor.
-
Publication No.: US20210134301A1
Publication Date: 2021-05-06
Application No.: US17028527
Filing Date: 2020-09-22
Applicant: LG ELECTRONICS INC.
Inventor: Siyoung YANG, Yongchul PARK, Sungmin HAN, Sangki KIM, Juyeong JANG, Minook KIM
Abstract: Disclosed is gathering a user's speech samples. According to an embodiment of the disclosure, a method of gathering learning samples may gather a speaker's speech data obtained while talking on a mobile terminal and text data generated from the speech data and gather training data for generating a speech synthesis model. According to the disclosure, the method of gathering learning samples may be related to artificial intelligence (AI) modules, unmanned aerial vehicles (UAVs), robots, augmented reality (AR) devices, virtual reality (VR) devices, and 5G service-related devices.