ARTIFICIAL INTELLIGENCE APPARATUS AND METHOD FOR ESTIMATING SOUND SOURCE LOCALIZATION THEREOF

    Publication number: US20240061907A1

    Publication date: 2024-02-22

    Application number: US18062483

    Filing date: 2022-12-06

    CPC classification number: G06F18/217 G01S3/8006 G06F18/214

    Abstract: An artificial intelligence (AI) apparatus includes a memory and a processor configured to estimate a sound source localization based on at least one of image information, sound source information, and sensor information stored in the memory. The processor is configured to pre-process at least one of the image information, the sound source information, or the sensor information to generate test data, input the test data into a pre-trained AI model to estimate the sound source localization, calculate a sound source localization estimation evaluation score of the AI model for the test data, classify the test data into validation data based on the calculated sound source localization estimation evaluation score, change the AI model based on the classified validation data, and input the test data into the changed AI model to update the AI model.
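The scoring and classification steps in this abstract can be illustrated with a minimal sketch. The abstract does not say how the evaluation score is computed or what rule moves test data into the validation set, so everything here is an assumption: each sample is taken to carry a reference angle, the score is a normalized angular error, and samples scoring below a hypothetical threshold are selected as validation data.

```python
def angular_error_deg(estimated, reference):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = abs(estimated - reference) % 360.0
    return min(diff, 360.0 - diff)


def evaluation_score(estimated, reference):
    """Hypothetical score: 1.0 for a perfect estimate, 0.0 for a 180 degree error."""
    return 1.0 - angular_error_deg(estimated, reference) / 180.0


def select_validation_data(samples, threshold=0.9):
    """Assumed rule: keep the samples the model localizes poorly (score < threshold)."""
    return [
        s for s in samples
        if evaluation_score(s["estimated"], s["reference"]) < threshold
    ]


samples = [
    {"id": "a", "estimated": 10.0, "reference": 12.0},   # 2 degree error
    {"id": "b", "estimated": 170.0, "reference": 20.0},  # 150 degree error
]
validation = select_validation_data(samples)
print([s["id"] for s in validation])
```

Under these assumptions, the selected samples would then drive the model change (for example, fine-tuning) that the abstract describes; that step is not sketched here.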

    METHOD OF PROVIDING INTERACTIVE ASSISTANT FOR EACH SEAT IN VEHICLE

    Publication number: US20210280182A1

    Publication date: 2021-09-09

    Application number: US17069508

    Filing date: 2020-10-13

    Abstract: A method and a device for providing an interactive assistant for each seat in a vehicle are provided. The method includes receiving a plurality of voice signals through a beamformed microphone array for a plurality of regions preset in the vehicle, and generating and selecting at least one cluster using the plurality of voice signals. Accordingly, an interactive assistant capable of removing noise and offering enhanced convenience can be provided. The vehicle of the present disclosure may be associated with an artificial intelligence module, an unmanned aerial vehicle (UAV), a robot, an augmented reality (AR) device, a virtual reality (VR) device, and a device related to a 5G service.
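As a rough illustration of grouping per-region voice signals into clusters, the sketch below splits seat regions into a "speech" cluster and a "noise" cluster by signal level. The abstract does not describe the clustering criterion, so the RMS measure, the threshold, and the region names are all assumptions made for this example, not the method of the patent.

```python
import math


def rms(signal):
    """Root-mean-square level of a list of samples (0.0 for an empty list)."""
    if not signal:
        return 0.0
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def cluster_by_energy(region_signals, threshold):
    """Assumed criterion: regions at or above the RMS threshold form the speech cluster."""
    clusters = {"speech": [], "noise": []}
    for region, signal in region_signals.items():
        key = "speech" if rms(signal) >= threshold else "noise"
        clusters[key].append(region)
    return clusters


# Hypothetical beamformed outputs, one per preset seat region.
region_signals = {
    "driver": [0.5, -0.5, 0.5, -0.5],
    "rear_left": [0.01, -0.01, 0.01, -0.01],
}
clusters = cluster_by_energy(region_signals, threshold=0.1)
print(clusters["speech"])
```

Selecting the "speech" cluster then identifies which seats the assistant should respond to, while the "noise" cluster can be discarded.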

    FAR-END TERMINAL AND VOICE FOCUSING METHOD THEREOF

    Publication number: US20240223707A1

    Publication date: 2024-07-04

    Application number: US17921301

    Filing date: 2022-06-07

    CPC classification number: H04M3/568 H04M3/567

    Abstract: A far-end terminal including a communication interface configured to wirelessly communicate with a near-end terminal for performing a video conference between the far-end terminal and the near-end terminal, a camera configured to capture a region in front of the far-end terminal including a plurality of counterpart speakers, a display configured to display the plurality of counterpart speakers captured through the camera and to display an image of a speaker at the near-end terminal, and a processor configured to receive focusing mode setting information from the near-end terminal indicating that an operation mode of the near-end terminal is a wide focusing mode, in response to the focusing mode setting information indicating that the operation mode is the wide focusing mode, obtain an angle range corresponding to a narrower partial region of an entire region including the plurality of counterpart speakers at the far-end terminal, perform selective audio focusing on a received voice within the obtained angle range to selectively increase a gain of the received voice and to selectively decrease a gain of other received voices outside the obtained angle range, and transmit audio, which is a result of performing the beamforming, to the near-end terminal.
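The gain behavior described here (raise the gain of voices inside the obtained angle range, lower it outside) can be sketched in simplified form. This example assumes the voices are already available as separate signals with known arrival angles, which a real terminal would not have; an actual beamformer works on raw microphone channels. The boost and cut factors are hypothetical values chosen for illustration.

```python
def focus_gain(angle_deg, angle_range, boost=2.0, cut=0.5):
    """Return the boost gain inside the inclusive angle range, the cut gain outside."""
    low, high = angle_range
    return boost if low <= angle_deg <= high else cut


def apply_focus(sources, angle_range, boost=2.0, cut=0.5):
    """Mix (angle_deg, samples) pairs after applying the angle-dependent gain."""
    length = max(len(samples) for _, samples in sources)
    mixed = [0.0] * length
    for angle_deg, samples in sources:
        gain = focus_gain(angle_deg, angle_range, boost, cut)
        for i, sample in enumerate(samples):
            mixed[i] += gain * sample
    return mixed


# Two hypothetical speakers: one in front (0 degrees), one off to the side (60 degrees).
sources = [(0.0, [1.0, 1.0]), (60.0, [1.0, -1.0])]
focused = apply_focus(sources, angle_range=(-15.0, 15.0))
print(focused)
```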

    ARTIFICIAL SOUND SOURCE SEPARATION METHOD AND DEVICE OF THEREOF

    Publication number: US20200035256A1

    Publication date: 2020-01-30

    Application number: US16593155

    Filing date: 2019-10-04

    Inventor: Hyeonsik CHOI

    Abstract: An artificial sound source separation method and device are disclosed. The sound source separation method, performed by the artificial sound source separation device based on dictionary learning, generates a dictionary matrix by performing dictionary learning, receives an overlapping sound source in which at least two sound sources are mixed, separates a target sound source from the overlapping sound source based on the dictionary matrix, and detects the target sound source. The dictionary learning may be performed using a K-SVD algorithm. The intelligent computing device configuring a sound source processing device of the present disclosure may be associated with an artificial intelligence module, a drone (unmanned aerial vehicle, UAV), a robot, augmented reality (AR) devices, virtual reality (VR) devices, devices related to 5G services, and the like.
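To make the dictionary-based separation step concrete, the sketch below assumes a dictionary is already available and that some of its atoms are known to belong to the target source. The dictionary learning itself (K-SVD in the abstract) is omitted, the atoms are hand-written, and the sparse coding uses plain matching pursuit, which is a choice made for this example rather than something the abstract specifies.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal, atoms, n_iter):
    """Greedy sparse coding of signal over unit-norm atoms; returns one coefficient per atom."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        correlations = [dot(residual, atom) for atom in atoms]
        # Pick the atom most correlated with what is still unexplained.
        k = max(range(len(atoms)), key=lambda i: abs(correlations[i]))
        coeffs[k] += correlations[k]
        residual = [r - correlations[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs


def separate_target(mixture, atoms, target_indices, n_iter=10):
    """Rebuild the target source using only the atoms assumed to belong to it."""
    coeffs = matching_pursuit(mixture, atoms, n_iter)
    target = [0.0] * len(mixture)
    for i in target_indices:
        target = [t + coeffs[i] * a for t, a in zip(target, atoms[i])]
    return target


# Hand-written unit-norm atoms; atom 0 is assumed to represent the target source.
atoms = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
mixture = [3.0, 0.0, 2.0, 0.0]
target = separate_target(mixture, atoms, target_indices=[0])
print(target)
```

With a learned dictionary the atoms would not be orthogonal, so a stronger sparse coder such as orthogonal matching pursuit would usually be preferred over this greedy loop.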
