Environment aware voice-assistant devices, and related systems and methods

    Publication No.: US11501758B2

    Publication date: 2022-11-15

    Application No.: US16988052

    Filing date: 2020-08-07

    Applicant: Apple Inc.

    Abstract: An appliance can include a microphone transducer, a processor, and a memory storing instructions. The appliance is configured to receive an audio signal at the microphone transducer and to detect an utterance in the audio signal. The appliance is further configured to classify a speech mode based on the utterance. The appliance is further configured to determine conditions of an environment of the appliance. The appliance is further configured to select at least one of a playback volume or a speech output mode from a plurality of speech output modes based on the classification and the conditions of the environment of the appliance. The appliance is further configured to adapt the playback volume and/or mode of played-back speech according to the speech output mode. The appliance may be configured to synthesize speech according to the speech output mode, or to modify synthesized speech according to the speech output mode.
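
The selection step described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the mode names ("whisper", "normal", "loud"), the `Environment` type, the noise thresholds, and the volume values are all hypothetical and chosen only to show how a classified speech mode and an environment condition might jointly pick an output mode and a playback volume.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Environment:
    # Hypothetical environment condition: ambient noise estimate in dB SPL.
    noise_db: float


def select_output(speech_mode: str, env: Environment) -> Tuple[str, float]:
    """Return (speech_output_mode, playback_volume in the range 0.0..1.0).

    Thresholds and volumes are illustrative assumptions, not values
    taken from the patent.
    """
    noisy = env.noise_db >= 65.0  # assumed "noisy room" threshold
    quiet = env.noise_db <= 35.0  # assumed "quiet room" threshold

    # A whispered request in a quiet room gets a whispered, low-volume reply.
    if speech_mode == "whisper" and quiet:
        return ("whisper", 0.2)
    # A noisy room or a raised voice gets a louder, more projected reply.
    if noisy or speech_mode == "loud":
        return ("loud", 0.9)
    # Everything else falls back to the default mode.
    return ("normal", 0.5)
```

Under these assumptions, `select_output("whisper", Environment(noise_db=30.0))` yields the whisper mode at low volume, while the same whispered request in a 70 dB room yields the loud mode, because the environment condition overrides the user's speech mode.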

    BEAM SELECTION FOR NOISE SUPPRESSION BASED ON SEPARATION

    Publication No.: US20170337932A1

    Publication date: 2017-11-23

    Application No.: US15159698

    Filing date: 2016-05-19

    Applicant: Apple Inc.

    IPC classification: G10L21/0208 G10L25/78

    Abstract: An audio system has a housing in which are integrated a number of microphones. A programmed processor accesses the microphone signals and produces a number of acoustic pickup beams. A number of separation values are computed, each being a measure of the difference between the strength of a respective beam and the strength of a noise reference input signal. The beam whose separation value is the largest is selected, and the selected beam is applied to a first input of a two-channel noise suppression process, while the noise reference input signal is applied to the second input of the noise suppression process. Other embodiments are also described and claimed.
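
The beam selection rule in this abstract can be sketched as follows. The abstract does not say how "strength" or "difference" is measured, so this example assumes mean signal power expressed in dB and a simple dB subtraction; the function names `strength_db` and `select_beam` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def strength_db(signal: Sequence[float]) -> float:
    """Mean power of a signal frame in dB (assumed strength measure)."""
    power = sum(x * x for x in signal) / len(signal)
    return 10.0 * math.log10(power + 1e-12)  # small offset avoids log(0)


def select_beam(
    beams: Sequence[Sequence[float]], noise_ref: Sequence[float]
) -> Tuple[int, List[float]]:
    """Return the index of the beam with the largest separation value,
    together with all separation values (beam strength minus noise
    reference strength, in dB)."""
    noise = strength_db(noise_ref)
    separations = [strength_db(beam) - noise for beam in beams]
    best = max(range(len(beams)), key=lambda i: separations[i])
    return best, separations


# Three toy beams and one noise reference frame.
beams = [
    [0.1, -0.1, 0.1, -0.1],
    [0.5, -0.5, 0.5, -0.5],
    [0.2, -0.2, 0.2, -0.2],
]
noise_ref = [0.1, -0.1, 0.1, -0.1]
best, separations = select_beam(beams, noise_ref)
```

In a full pipeline, `beams[best]` would feed the first input of the two-channel noise suppressor and `noise_ref` the second; the suppressor itself is outside the scope of this sketch.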

    User voice location estimation for adjusting portable device beamforming settings

    Type: Granted patent

    Status: In force

    Publication No.: US09525938B2

    Publication date: 2016-12-20

    Application No.: US13838213

    Filing date: 2013-03-15

    Applicant: Apple Inc.

    IPC classification: H04R3/00 H04R1/32

    Abstract: An audio device may use the audio detected at two oppositely facing front and rear omnidirectional microphones to determine the angular directional location of a user's voice while the device is in speaker mode or audio command input mode. The angular directional location may be determined to be at the front, side, or rear of the device during a period of time by calculating an energy ratio of the audio signals output by the front and rear microphones during that period. Comparing the ratio to experimental data for sound received from different directions around the device may provide the location of the user's voice. Based on the determination, audio beamforming input settings may be adjusted for user voice beamforming. As a result, the device can perform better beamforming to combine the signals captured by the microphones and generate a single output that isolates the user's voice from background noise.
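
The energy-ratio test in this abstract can be illustrated with a short sketch. The patent compares the ratio against experimental data; here a single symmetric margin in dB (`margin_db`, a hypothetical parameter) stands in for that data, and the function names are assumptions made for illustration.

```python
import math
from typing import Sequence


def energy(signal: Sequence[float]) -> float:
    """Sum of squared samples over the analysis period."""
    return sum(x * x for x in signal)


def locate_voice(
    front: Sequence[float], rear: Sequence[float], margin_db: float = 3.0
) -> str:
    """Classify the talker as 'front', 'side', or 'rear' from the
    front/rear microphone energy ratio.

    margin_db is an assumed stand-in for the experimentally measured
    ratios the abstract refers to.
    """
    ratio_db = 10.0 * math.log10((energy(front) + 1e-12) / (energy(rear) + 1e-12))
    if ratio_db > margin_db:
        return "front"  # front microphone clearly louder
    if ratio_db < -margin_db:
        return "rear"  # rear microphone clearly louder
    return "side"  # roughly equal energy at both microphones
```

The returned label would then drive the choice of beamforming input settings, for example which direction the pickup beam is steered toward.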

    USER VOICE LOCATION ESTIMATION FOR ADJUSTING PORTABLE DEVICE BEAMFORMING SETTINGS

    Type: Patent application

    Status: In force

    Publication No.: US20140219471A1

    Publication date: 2014-08-07

    Application No.: US13838213

    Filing date: 2013-03-15

    Applicant: APPLE INC.

    IPC classification: H04R3/00

    Abstract: An audio device may use the audio detected at two oppositely facing front and rear omnidirectional microphones to determine the angular directional location of a user's voice while the device is in speaker mode or audio command input mode. The angular directional location may be determined to be at the front, side, or rear of the device during a period of time by calculating an energy ratio of the audio signals output by the front and rear microphones during that period. Comparing the ratio to experimental data for sound received from different directions around the device may provide the location of the user's voice. Based on the determination, audio beamforming input settings may be adjusted for user voice beamforming. As a result, the device can perform better beamforming to combine the signals captured by the microphones and generate a single output that isolates the user's voice from background noise.

    Environment aware voice-assistant devices, and related systems and methods

    Publication No.: US12087284B1

    Publication date: 2024-09-10

    Application No.: US17955509

    Filing date: 2022-09-28

    Applicant: Apple Inc.

    Abstract: An appliance can include a microphone transducer, a processor, and a memory storing instructions. The appliance is configured to receive an audio signal at the microphone transducer and to detect an utterance in the audio signal. The appliance is further configured to classify a speech mode based on the utterance. The appliance is further configured to determine conditions of an environment of the appliance. The appliance is further configured to select at least one of a playback volume or a speech output mode from a plurality of speech output modes based on the classification and the conditions of the environment of the appliance. The appliance is further configured to adapt the playback volume and/or mode of played-back speech according to the speech output mode. The appliance may be configured to synthesize speech according to the speech output mode, or to modify synthesized speech according to the speech output mode.