-
Publication No.: US11943591B2
Publication Date: 2024-03-26
Application No.: US17565894
Filing Date: 2021-12-30
Inventors: Seungwoo Kang, Euihyeok Lee, Chulhong Min
CPC Classification: H04R3/005, G06F1/163, G06F1/1694, G10L25/63, G10L25/78, H04R1/1091, H04R5/04
Abstract: Disclosed herein is a system for automatic detection of music listening reactions, comprising a wearable sensor and a mobile device. The wearable sensor is worn on the listener's ear. The mobile device receives an inertial signal and a sound signal of the listener from the wearable sensor. The mobile device determines a vocal reaction of the listener based on the inertial signal, the sound signal and music information of the music being played and a motion reaction of the listener based on the inertial signal and the music information. Other embodiments are described and shown.
-
Publication No.: US11942107B2
Publication Date: 2024-03-26
Application No.: US17183288
Filing Date: 2021-02-23
IPC Classification: G10L25/78, G06N5/01, G06N20/10, G10L25/09, G10L25/30, G10L25/51, H04R25/00, G10L15/16, G10L19/26
CPC Classification: G10L25/78, G06N5/01, G06N20/10, G10L25/09, G10L25/30, G10L25/51, H04R25/40, H04R25/505, H04R25/604, G10L15/16, G10L19/26
Abstract: The present disclosure is directed to a device and method for detecting presence or absence of human speech. The device and method utilize a low-power accelerometer. The device and method generate an acceleration signal using the accelerometer, filter the acceleration signal with a band pass filter or a high pass filter, determine at least one calculation of the filtered acceleration signal, detect a presence or absence of a voice based on the at least one calculation, and output a detection signal that indicates the presence or absence of the voice. The device and method are well suited for portable audio devices, such as true wireless stereo headphones, that have a limited power supply.
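
The filter, calculate, detect sequence in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the first-order high-pass filter, the per-frame RMS as the "calculation", and the names `high_pass`, `frame_rms`, `detect_voice`, `alpha`, `frame_len` and `threshold` are all hypothetical choices that do not come from the record.

```python
import math


def high_pass(samples, alpha=0.9):
    """First-order high-pass filter (assumed); removes the slowly varying gravity/posture offset."""
    filtered = []
    prev_x = samples[0] if samples else 0.0
    prev_y = 0.0
    for x in samples:
        y = alpha * (prev_y + x - prev_x)
        filtered.append(y)
        prev_x, prev_y = x, y
    return filtered


def frame_rms(frame):
    """Root-mean-square level of one frame; stands in for the 'at least one calculation'."""
    return math.sqrt(sum(v * v for v in frame) / len(frame))


def detect_voice(accel, frame_len=64, threshold=0.02, alpha=0.9):
    """Return one boolean detection flag per full frame of the acceleration signal."""
    filtered = high_pass(accel, alpha)
    flags = []
    for start in range(0, len(filtered) - frame_len + 1, frame_len):
        frame = filtered[start:start + frame_len]
        flags.append(frame_rms(frame) > threshold)  # hypothetical fixed threshold
    return flags
```

A real device would presumably tune the filter cutoff and threshold to the accelerometer's sample rate and mounting; the values here are placeholders.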
-
Publication No.: US11942087B2
Publication Date: 2024-03-26
Application No.: US17147991
Filing Date: 2021-01-13
Inventors: Robert A. Zurek, Adrian M. Schuster, Fu-Lin Shau, Jincheng Wu
IPC Classification: G10L15/22, B60N2/00, G06F3/01, G06V20/59, G06V40/16, G06V40/18, G06V40/19, G06V40/20, G10L15/20, G10L15/24, G10L15/25, G10L15/26, G10L21/0208, G10L21/0216, G10L25/78
CPC Classification: G10L15/22, G06F3/013, G06V20/59, G06V40/166, G06V40/19, G06V40/20, G10L15/20, G10L15/25, G10L15/26, G10L21/0208, B60N2/002, G06V40/18, G10L2015/223, G10L2015/227, G10L15/24, G10L2021/02166, G10L25/78, H04R2430/20, H04R2460/07, H04R2499/11
Abstract: A device performs a method for using image data to aid voice recognition. The method includes the device capturing image data of a vicinity of the device and adjusting, based on the image data, a set of parameters for voice recognition performed by the device. The set of parameters for the device performing voice recognition include, but are not limited to: a trigger threshold of a trigger for voice recognition; a set of beamforming parameters; a database for voice recognition; and/or an algorithm for voice recognition. The algorithm may include using noise suppression or using acoustic beamforming.
-
Publication No.: US11942079B2
Publication Date: 2024-03-26
Application No.: US17096288
Filing Date: 2020-11-12
Applicant: Sungpil Chun, Yongseob Lim
Inventors: Sungpil Chun, Yongseob Lim
IPC Classification: G06F40/35, G06Q10/10, G06V40/10, G06V40/20, G10L15/06, G10L15/18, G10L15/22, G10L15/24, G10L25/51, G06F40/30, G10L25/78
CPC Classification: G10L15/1815, G06Q10/10, G06V40/10, G06V40/20, G10L15/063, G10L15/22, G10L15/24, G10L25/51, G06F40/30, G06F40/35, G10L15/1822, G10L25/78
Abstract: The present invention provides an artificial intelligence-based data processing apparatus and method using the same to prevent dispute over various kinds of inconveniences such as inter-floor noise occurring in an apartment house and to solve them in a friendly and communicative manner based on mutual consideration.
The AI-based data processing apparatus according to embodiments of the present invention can communicate with neighbors conveniently, quickly and accurately by voice, and communicate in a manner that does not offend each other as if it were through an unbiased mediator. By acting in consideration, it is possible to effectively prevent and resolve inter-floor noise related disputes.
-
Publication No.: US11941504B2
Publication Date: 2024-03-26
Application No.: US17040299
Filing Date: 2019-03-22
Applicant: Google LLC
Inventors: Pararth Shah, Dilek Hakkani-Tur, Juliana Kew, Marek Fiser, Aleksandra Faust
IPC Classification: G06N3/008, B25J9/16, B25J13/08, G05B13/02, G05D1/00, G05D1/02, G06F18/21, G06N3/044, G06T7/593, G06V20/10, G06V30/262, G10L15/16, G10L15/18, G10L15/22, G10L25/78
CPC Classification: G06N3/008, B25J9/161, B25J9/162, B25J9/163, B25J9/1697, B25J13/08, G05B13/027, G05D1/0221, G06F18/21, G06N3/044, G06T7/593, G06V20/10, G06V30/274, G10L15/16, G10L15/1815, G10L15/22, G10L25/78, G10L2015/223
Abstract: Implementations relate to using deep reinforcement learning to train a model that can be utilized, at each of a plurality of time steps, to determine a corresponding robotic action for completing a robotic task. Implementations additionally or alternatively relate to utilization of such a model in controlling a robot. The robotic action determined at a given time step utilizing such a model can be based on: current sensor data associated with the robot for the given time step, and free-form natural language input provided by a user. The free-form natural language input can direct the robot to accomplish a particular task, optionally with reference to one or more intermediary steps for accomplishing the particular task. For example, the free-form natural language input can direct the robot to navigate to a particular landmark, with reference to one or more intermediary landmarks to be encountered in navigating to the particular landmark.
-
Publication No.: US11930333B2
Publication Date: 2024-03-12
Application No.: US17748022
Filing Date: 2022-05-18
Inventors: Qian Li, Yuan Jiang, Xingqiang Wu
IPC Classification: H04R3/04, G10L21/0264, G10L25/78, H04R3/00, H04R5/04
CPC Classification: H04R3/04, G10L21/0264, G10L25/78, H04R3/005, H04R5/04, H04R2410/07
Abstract: In certain aspects, a noise suppression method and system for a personal sound amplification product (PSAP) are disclosed. An environmental audio signal acquired through one or more microphones is processed to generate a set of first sub-band signals in a set of first sub-bands. The environmental audio signal is also processed to generate a set of second sub-band signals in a set of second sub-bands. A set of first gains for the set of first sub-band signals in the set of first sub-bands is determined based on the set of second sub-band signals in the set of second sub-bands. The set of first sub-band signals is processed based on the set of first gains to generate a noise-suppressed audio signal.
-
Publication No.: US11922933B2
Publication Date: 2024-03-05
Application No.: US16889965
Filing Date: 2020-06-02
Applicant: YAMAHA CORPORATION
Inventors: Tetsuto Kawai
Abstract: Voice processing method and device includes obtaining a probability value of an audio signal representing sound, collected by a first microphone on a near-end side, including a person's voice, determining a gain of the audio signal based on the determined probability value, processing the audio signal based on the determined gain of the audio signal, and sending the processed audio signal to a far-end side.
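
One way to picture the probability-to-gain step in this abstract is the minimal sketch below. It assumes the voice probability is already available from some upstream estimator, which the record does not describe. The piecewise-linear mapping and the names `gain_from_probability`, `process_frame`, `floor` and `threshold` are hypothetical and are not taken from the patent.

```python
def gain_from_probability(p_voice, floor=0.1, threshold=0.5):
    """Map a voice probability in [0, 1] to a linear gain (assumed piecewise-linear curve)."""
    p = min(max(p_voice, 0.0), 1.0)  # clamp out-of-range estimates
    if p >= threshold:
        return 1.0  # likely voice: pass the signal unchanged
    # Unlikely voice: fade linearly from `floor` at p = 0 up to 1.0 at p = threshold.
    return floor + (1.0 - floor) * (p / threshold)


def process_frame(frame, p_voice):
    """Apply the probability-derived gain to one frame of near-end samples."""
    gain = gain_from_probability(p_voice)
    return [gain * sample for sample in frame]
```

In this sketch the processed frame is what would then be sent to the far-end side; a smoother curve or per-frame gain smoothing could replace the linear ramp.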
-
Publication No.: US11922924B2
Publication Date: 2024-03-05
Application No.: US17617547
Filing Date: 2020-05-21
Inventors: Jingzhou Yang, Lei He
IPC Classification: G10L25/78, G10L13/00, G10L13/033, G10L13/047, G10L13/10
CPC Classification: G10L13/10, G10L13/033, G10L13/047
Abstract: Method and apparatus for generating speech through multilingual neural text-to-speech (TTS) synthesis are provided in the present disclosure. A text input in at least a first language may be received. Speaker latent space information of a target speaker may be provided through a speaker encoder. Language latent space information of a second language may be provided through a language encoder. At least one acoustic feature may be generated, through an acoustic feature predictor, based on the text input, the speaker latent space information and the language latent space information of the second language. A speech waveform corresponding to the text input may be generated, through a neural vocoder, based on the at least one acoustic feature.
-
Publication No.: US11922585B2
Publication Date: 2024-03-05
Application No.: US17641273
Filing Date: 2020-08-11
Applicant: AUDI AG
Inventors: Marcus Kuehne
CPC Classification: G06T19/006, B60R1/29, G06F3/011, G10L25/51, G10L25/78, B60R2300/20, B60R2300/304, B60R2300/8006
Abstract: A control device in a motor vehicle receives a signal from a sensor device which includes at least one image of the current interior situation. The control device generates a superimposition signal containing the at least one image of the current interior situation and transmits the superimposition signal to a display element of a display apparatus. The display apparatus superimposes the at least one image of the current interior situation onto predefined output content output to a user.
-
Publication No.: US20240062760A1
Publication Date: 2024-02-22
Application No.: US18497188
Filing Date: 2023-10-30
Applicant: The Notebook, LLC
IPC Classification: G10L15/22, A61B5/00, A61B5/16, B60K28/06, G06V40/16, G10L15/18, G10L25/24, G10L25/66, G10L25/78, G10L25/90
CPC Classification: G10L15/22, A61B5/165, A61B5/4803, B60K28/06, G06V40/165, G06V40/167, G06V40/171, G06V40/174, G10L15/1815, G10L15/1822, G10L25/24, G10L25/66, G10L25/78, G10L25/90, G10L2015/223, G10L2015/227, G10L25/93
Abstract: Systems and methods are disclosed. A digitized human vocal expression of a user and digital images are received over a network from a remote device. The digitized human vocal expression is processed to determine characteristics of the human vocal expression, including: pitch, volume, rapidity, a magnitude spectrum identify, and/or pauses in speech. Digital images are received and processed to detect characteristics of the user face, including detecting if any of the following is present: a sagging lip, a crooked smile, uneven eyebrows, and/or facial droop. Using the human vocal expression characteristics and face characteristics, a determination is made as to what action is to be taken. A cepstrum pitch may be determined using an inverse Fourier transform of a logarithm of a spectrum of a human vocal expression signal. The volume may be determined using peak heights in a power spectrum of the human vocal expression.
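
The cepstrum pitch mentioned at the end of this abstract (inverse Fourier transform of the logarithm of a spectrum) can be sketched as below. This is a generic textbook-style illustration, not the applicant's implementation. The Hann window, the small constant added before the logarithm, the naive stdlib-only DFT, and the names `dft`, `cepstrum_pitch`, `f_min` and `f_max` are assumptions introduced here.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform; adequate for one short frame."""
    n = len(values)
    sign = 1 if inverse else -1
    twiddle = [cmath.exp(sign * 2j * math.pi * m / n) for m in range(n)]
    out = []
    for k in range(n):
        acc = sum(values[t] * twiddle[(k * t) % n] for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def cepstrum_pitch(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate pitch in Hz from the strongest cepstral peak in the allowed range."""
    n = len(frame)
    # Hann window to limit spectral leakage (an assumption of this sketch).
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    # Logarithm of the magnitude spectrum; the small constant avoids log(0).
    log_spectrum = [math.log(abs(c) + 1e-12) for c in dft(windowed)]
    # Real cepstrum: inverse transform of the log spectrum.
    cepstrum = [c.real for c in dft(log_spectrum, inverse=True)]
    # Search only quefrencies (lags in samples) that map to f_min..f_max.
    q_lo = max(1, int(sample_rate / f_max))
    q_hi = min(int(sample_rate / f_min), n // 2)
    if q_lo > q_hi:
        raise ValueError("frame too short for the requested pitch range")
    peak = max(range(q_lo, q_hi + 1), key=lambda q: cepstrum[q])
    return sample_rate / peak
```

A production system would normally use an FFT library instead of the quadratic-time transform; the naive version is used only to keep the sketch dependency-free.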