-
Publication No.: US11531455B2
Publication date: 2022-12-20
Application No.: US17278977
Filing date: 2019-10-11
Inventors: Minkyu Shin, Sangyoon Kim, Dokyun Lee, Changwoo Han, Jonguk Yoo, Jaewon Lee
IPC classification: G06F3/04842, G06V20/10, G06F3/04883, G10L15/26
Abstract: Provided are an electronic device capable of providing text information corresponding to a user voice through a user interface and a method of controlling the electronic device. Specifically, an electronic device according to the present disclosure, when an image including at least one object is obtained, analyzes the image to identify the at least one object included in the image, and when a user voice is received, performs voice recognition on the user voice to obtain text information corresponding to the user voice, then identifies an object corresponding to the user voice among the at least one object included in the image, and displays a memo user interface (UI) including the text information on an area corresponding to the object identified as corresponding to the user voice among areas on a display.
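
The matching step in this abstract (recognized text, then the object it refers to, then the display area for the memo) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: `DetectedObject`, `match_object`, and `place_memo` are hypothetical names, image analysis and voice recognition are taken as already done, and the object is matched by a plain substring test on its label, which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass
class DetectedObject:
    label: str   # name assigned by a (hypothetical) image analyzer
    box: tuple   # (x, y, width, height) of the object's area on the display


def match_object(text, objects):
    """Return the first object whose label occurs in the recognized text, else None."""
    lowered = text.lower()
    for obj in objects:
        if obj.label.lower() in lowered:
            return obj
    return None


def place_memo(text, objects):
    """Describe a memo UI anchored at the area of the matched object."""
    obj = match_object(text, objects)
    if obj is None:
        return None  # no object corresponds to the user voice
    return {"text": text, "anchor": obj.label, "area": obj.box}


# Assumed inputs: two objects found in an image and one recognized utterance.
objects = [
    DetectedObject("milk", (40, 120, 80, 160)),
    DetectedObject("cabbage", (200, 300, 120, 90)),
]
memo = place_memo("Drink the milk by Friday", objects)
```

A real device would presumably use a more robust correspondence than substring matching; the sketch only shows how the memo's area follows from the identified object.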
-
Publication No.: US11989690B2
Publication date: 2024-05-21
Application No.: US18084024
Filing date: 2022-12-19
Inventors: Minkyu Shin, Sangyoon Kim, Dokyun Lee, Changwoo Han, Jonguk Yoo, Jaewon Lee
IPC classification: G06Q10/087, G06F3/04842, G06F3/04883, G06V10/764, G06V20/10, G10L15/26
CPC classification: G06Q10/087, G06F3/04842, G06F3/04883, G06V10/764, G06V20/10, G10L15/26
Abstract: Provided are an electronic device capable of providing text information corresponding to a user voice through a user interface and a method of controlling the electronic device. Specifically, an electronic device according to the present disclosure, when an image including at least one object is obtained, analyzes the image to identify the at least one object included in the image, and when a user voice is received, performs voice recognition on the user voice to obtain text information corresponding to the user voice, then identifies an object corresponding to the user voice among the at least one object included in the image, and displays a memo user interface (UI) including the text information on an area corresponding to the object identified as corresponding to the user voice among areas on a display.
-
Publication No.: US11217267B2
Publication date: 2022-01-04
Application No.: US17023880
Filing date: 2020-09-17
Inventors: Changwoo Han, Minkyu Shin, Jonguk Yoo, Dokyun Lee, Kangseok Choi
IPC classification: H04R29/00, G10L25/51, G06N3/08, G10L21/0272, G10L25/30
Abstract: A method, performed by an electronic device, of identifying location information of an external device includes: obtaining map information including location information for each time interval of a mobile device based on the mobile device moving while generating noise near the external device; obtaining audio signal information about an audio signal including the noise generated by the mobile device, the audio signal being collected by a microphone of the external device; identifying, from the audio signal information, a time point at which a noise level of the noise generated by the mobile device is highest; obtaining, from the map information, location information of the mobile device corresponding to the identified time point; and identifying the location information of the external device based on the obtained location information of the mobile device.
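
The localization steps in this abstract reduce to a peak search followed by a lookup, which the following Python sketch illustrates. The data layout is an assumption: `track` stands for the map information as `(time, (x, y))` samples of the mobile device, `noise_levels` stands for the audio signal information as `(time, level)` samples from the external device's microphone, and `locate_external_device` is a hypothetical name. Taking the nearest track sample to the peak time is also an assumption; the abstract only says the location "corresponding to" that time point is used.

```python
def locate_external_device(track, noise_levels):
    """Estimate the external device's position from a moving noise source.

    track: list of (time, (x, y)) positions of the mobile device.
    noise_levels: list of (time, level) measured at the external device.
    """
    # Time point at which the measured noise level is highest.
    peak_time, _ = max(noise_levels, key=lambda sample: sample[1])
    # Position of the mobile device at the track sample closest to that time.
    _, position = min(track, key=lambda point: abs(point[0] - peak_time))
    return position


# Assumed example data: the mobile device passes closest to the external device at t = 2.
track = [(0, (0.0, 0.0)), (1, (1.0, 0.0)), (2, (2.0, 0.0)), (3, (2.0, 1.0))]
noise_levels = [(0, 41.0), (1, 48.5), (2, 63.2), (3, 55.0)]
position = locate_external_device(track, noise_levels)
```

The returned position is that of the mobile device at the loudest moment, used as the estimate for the external device.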
-
Publication No.: US12026666B2
Publication date: 2024-07-02
Application No.: US18084024
Filing date: 2022-12-19
Inventors: Minkyu Shin, Sangyoon Kim, Dokyun Lee, Changwoo Han, Jonguk Yoo, Jaewon Lee
IPC classification: G06Q10/087, G06F3/04842, G06F3/04883, G06V10/764, G06V20/10, G10L15/26
CPC classification: G06Q10/087, G06F3/04842, G06F3/04883, G06V10/764, G06V20/10, G10L15/26
Abstract: Provided are an electronic device capable of providing text information corresponding to a user voice through a user interface and a method of controlling the electronic device. Specifically, an electronic device according to the present disclosure, when an image including at least one object is obtained, analyzes the image to identify the at least one object included in the image, and when a user voice is received, performs voice recognition on the user voice to obtain text information corresponding to the user voice, then identifies an object corresponding to the user voice among the at least one object included in the image, and displays a memo user interface (UI) including the text information on an area corresponding to the object identified as corresponding to the user voice among areas on a display.
-
Publication No.: US11961522B2
Publication date: 2024-04-16
Application No.: US17296806
Filing date: 2019-11-22
Inventors: Chanwoo Kim, Dhananjaya N. Gowda, Sungsoo Kim, Minkyu Shin, Larry Paul Heck, Abhinav Garg, Kwangyoun Kim, Mehul Kumar
Abstract: The disclosure relates to an electronic apparatus for recognizing user voice and a method of recognizing, by the electronic apparatus, the user voice. According to an embodiment, the method of recognizing the user voice includes obtaining an audio signal segmented into a plurality of frame units, determining an energy component for each filter bank by applying a filter bank distributed according to a preset scale to a frequency spectrum of the audio signal segmented into the frame units, smoothing the determined energy component for each filter bank, extracting a feature vector of the audio signal based on the smoothed energy component for each filter bank, and recognizing the user voice in the audio signal by inputting the extracted feature vector to a voice recognition model.
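
The feature-extraction chain in this abstract (filter bank energies, smoothing, feature vector) can be sketched in plain Python as follows. Several choices are assumptions the abstract does not state: the "preset scale" is taken to be the mel scale, the filters are triangular, smoothing is an exponential moving average of each filter's energy across frames, and the feature vector is the log of the smoothed energies. All function names are hypothetical, each frame's power spectrum is assumed to be already computed, and the voice recognition model is left out.

```python
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filter_bank(n_filters, n_bins, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    nyquist = sample_rate / 2.0
    max_mel = hz_to_mel(nyquist)
    # n_filters + 2 edge points, evenly spaced in mel, expressed as spectrum bin positions.
    edges = [mel_to_hz(max_mel * i / (n_filters + 1)) / nyquist * (n_bins - 1)
             for i in range(n_filters + 2)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        weights = []
        for k in range(n_bins):
            if lo < k <= mid:
                weights.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                weights.append((hi - k) / (hi - mid))   # falling slope
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


def filter_bank_energies(power_spectrum, bank):
    """Energy component for each filter: weighted sum over the frame's power spectrum."""
    return [sum(w * p for w, p in zip(weights, power_spectrum)) for weights in bank]


def smooth_over_time(energies_per_frame, alpha=0.5):
    """Exponential moving average of each filter's energy across frames (assumed smoothing)."""
    smoothed, prev = [], None
    for energies in energies_per_frame:
        cur = list(energies) if prev is None else [
            alpha * e + (1.0 - alpha) * p for e, p in zip(energies, prev)]
        smoothed.append(cur)
        prev = cur
    return smoothed


def feature_vectors(smoothed, floor=1e-10):
    """Log of the smoothed energies, one feature vector per frame."""
    return [[math.log(e + floor) for e in frame] for frame in smoothed]


# Assumed example: two frames with flat power spectra of 33 bins at 16 kHz.
bank = mel_filter_bank(n_filters=4, n_bins=33, sample_rate=16000)
frames = [[1.0] * 33, [3.0] * 33]
energies = [filter_bank_energies(frame, bank) for frame in frames]
smoothed = smooth_over_time(energies)
features = feature_vectors(smoothed)
```

Each entry of `features` is what would be passed to the recognition model for one frame; the patented method may define the scale and the smoothing differently.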
-
Publication No.: US11380326B2
Publication date: 2022-07-05
Application No.: US16875236
Filing date: 2020-05-15
Inventors: Changwoo Han, Minkyu Shin, Jonguk Yoo, Dokyun Lee, Kangseok Choi, Jaewon Lee, Hyeontaek Lim
IPC classification: G10L15/00, G10L15/22, G10L15/02, G10L19/008, G10L15/16, G10L15/08, G10L21/0208
Abstract: A speech recognition method includes receiving a first multi-channel audio signal; obtaining at least one of a speech signal characteristic or a noise signal characteristic for at least one frequency band of frequency bands corresponding to channel audio signals included in the first multi-channel audio signal; generating a signal with an enhanced speech component by performing beamforming on the first multi-channel audio signal based on the speech signal characteristic, a speech signal characteristic obtained for a previous frame that was obtained before a certain time that the first multi-channel audio signal was obtained, and the noise signal characteristic; determining whether the enhanced speech component includes a wake word; and, based on determining that the enhanced speech component includes the wake word, activating a speech recognition operation based on the signal with the enhanced speech component.