Method for dynamic adjustment of audio input gain in a speech system
    1.
    Granted invention patent
    Status: in force

    Publication No.: US06651040B1

    Publication date: 2003-11-18

    Application No.: US09583845

    Filing date: 2000-05-31

    IPC classification: H03G3/20

    CPC classification: G10L15/02

    Abstract: A method for adjusting audio input signal gain in a speech system can include seven steps. First, an upper and a lower threshold can be predetermined, where the upper and lower thresholds define an optimal range of audio data signal amplitude measurements. Second, a frame of unpredicted digital audio data samples can be received. Each sample can indicate an amplitude measurement of the audio data signal at a particular point in time. Third, a maximum signal amplitude can be calculated for a configurable measurement percentile of the unpredicted digital audio data samples. Fourth, the audio input signal gain can be incrementally adjusted downward if the maximum signal amplitude exceeds the upper threshold. Conversely, fifth, the audio input signal gain can be incrementally adjusted upward if the maximum signal amplitude falls below the lower threshold. Sixth, additional frames of unpredicted digital audio data samples can be received. Finally, seventh, each of the third through the sixth steps can be repeated with the received additional frames until the calculated maximum signal amplitude falls within the optimal range of audio signal amplitude.

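
The seven steps in the abstract above can be sketched roughly as follows. This is only an illustrative reading of the abstract, not the patented implementation: the function names, the nearest-rank percentile rule, the default step size, and the 0 to 100 gain range are all assumptions, and `read_frame` is a hypothetical callable that returns one frame of samples captured at the given gain.

```python
import math


def percentile_amplitude(frame, percentile=98.0):
    """Return the |sample| value at the given percentile (nearest-rank rule)."""
    magnitudes = sorted(abs(sample) for sample in frame)
    if not magnitudes:
        return 0
    rank = max(1, math.ceil(percentile / 100.0 * len(magnitudes)))
    return magnitudes[rank - 1]


def adjust_gain(gain, frame, lower, upper, step=1, percentile=98.0,
                gain_min=0, gain_max=100):
    """Apply one incremental adjustment; return (new_gain, in_range)."""
    peak = percentile_amplitude(frame, percentile)
    if peak > upper:      # too loud: step the gain down
        return max(gain_min, gain - step), False
    if peak < lower:      # too quiet: step the gain up
        return min(gain_max, gain + step), False
    return gain, True     # inside the optimal range: stop adjusting


def calibrate(read_frame, gain, lower, upper, max_frames=100, **options):
    """Repeat the adjustment on new frames until the peak is in range.

    read_frame is a hypothetical callable: read_frame(gain) -> list of samples.
    """
    for _ in range(max_frames):
        gain, in_range = adjust_gain(gain, read_frame(gain), lower, upper,
                                     **options)
        if in_range:
            break
    return gain
```

Using a percentile below 100 rather than the absolute maximum keeps a single click or pop from driving the gain down, which is presumably why the abstract makes the percentile configurable.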

    Device and method for performing diagnostics on a microphone
    2.
    Granted invention patent
    Status: lapsed

    Publication No.: US5822718A

    Publication date: 1998-10-13

    Application No.: US790401

    Filing date: 1997-01-29

    IPC classification: H04R29/00

    CPC classification: H04R29/004

    Abstract: A device and method are disclosed which perform diagnostics on a microphone and display diagnostic information and instructions to a user. The invention uses a processor to create histograms of the PCM (Pulse Code Modulation) signal after removing any DC bias to determine signal and noise levels and ratios, as well as other parameters. Messages are generated and displayed by the device and method to inform a user that the microphone is working correctly or about possible malfunctions, such as low gain. The messages can advise the user on steps to take to correct the malfunctions, for example, to try a different adapter cable or plug.

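
As a rough illustration of the histogram-based diagnosis described in the abstract above, the sketch below removes the DC bias, builds an amplitude histogram, and derives signal level, noise level, and a signal-to-noise ratio from it. The abstract does not say how the levels are read off the histogram, so the 10% and 95% cut points, the bin width, the thresholds, and the message wording are all assumptions made for this example.

```python
import math
from collections import Counter


def diagnose(pcm, bin_width=256, low_gain_level=1000.0, min_snr_db=15.0):
    """Estimate signal/noise levels from an amplitude histogram, pick a message."""
    if not pcm:
        raise ValueError("no samples to analyze")
    dc_bias = sum(pcm) / len(pcm)
    # Histogram of absolute amplitudes after removing the DC bias.
    histogram = Counter(int(abs(s - dc_bias)) // bin_width for s in pcm)

    def level_at(fraction):
        """Centre of the bin where the cumulative count reaches `fraction`."""
        target = fraction * len(pcm)
        seen = 0
        for bin_index in sorted(histogram):
            seen += histogram[bin_index]
            if seen >= target:
                return (bin_index + 0.5) * bin_width

    noise = level_at(0.10)    # assumption: quietest 10% approximates noise
    signal = level_at(0.95)   # assumption: 95th percentile approximates speech
    snr_db = 20.0 * math.log10(signal / noise)
    if signal < low_gain_level:
        message = "Low gain: check the microphone volume, cable, or plug."
    elif snr_db < min_snr_db:
        message = "Noisy input: try a different adapter cable or plug."
    else:
        message = "The microphone appears to be working correctly."
    return {"dc_bias": dc_bias, "signal": signal, "noise": noise,
            "snr_db": snr_db, "message": message}
```

Reading the levels from bin centres keeps the noise estimate strictly positive, so the ratio is always defined even for a silent recording.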

    Audio device characterization for accurate predictable volume control

    Publication No.: US06999591B2

    Publication date: 2006-02-14

    Application No.: US09794784

    Filing date: 2001-02-27

    IPC classification: H04R29/00

    CPC classification: H03G3/3089

    Abstract: An automatic gain control method in accordance with the inventive arrangements can include the following steps. Initially, an audio signal can be provided to an audio device which has a range of permissible signal level settings and a signal level controller for establishing a particular signal level setting. In addition, an actual signal level can be measured for the audio signal at an established signal level setting. The measured actual signal level further can be stored in a volume map along with the corresponding established signal level setting. Following the storage of the measured actual signal level in the volume map, a different signal level setting can be established using the signal level controller. Subsequently, the actual signal level can be re-measured and the re-measured actual signal level and corresponding established different signal level setting can be stored in the volume map. Finally, the volume map can be used during an audio processing session to determine a signal level setting for the audio device, wherein the signal level setting corresponds to a desired actual audio signal level. In one aspect of the present invention, the method can also include detecting a hysteresis condition in the volume map.
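
A minimal sketch of the volume-map idea in the abstract above, assuming the device is reachable through two hypothetical callables, `set_level(setting)` and `measure_level()`. The nearest-match lookup and the reading of "hysteresis" as a disagreement between an upward and a downward sweep are assumptions of this example, not details taken from the patent.

```python
def build_volume_map(set_level, measure_level, settings):
    """Step through the settings and record the measured level at each one."""
    volume_map = {}
    for setting in settings:
        set_level(setting)                      # establish the setting
        volume_map[setting] = measure_level()   # store the actual level
    return volume_map


def setting_for(volume_map, desired_level):
    """Return the setting whose measured level is closest to the desired one."""
    return min(volume_map, key=lambda s: abs(volume_map[s] - desired_level))


def has_hysteresis(up_map, down_map, tolerance):
    """Assumed check: the same setting measures differently on the way up
    than on the way down by more than `tolerance`."""
    return any(abs(up_map[s] - down_map[s]) > tolerance
               for s in up_map if s in down_map)
```

Building one map with an ascending list of settings and another with the same settings in descending order gives the two inputs that `has_hysteresis` compares.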

    Compensating for ambient noise levels in text-to-speech applications
    4.
    Granted invention patent
    Status: in force

    Publication No.: US06988068B2

    Publication date: 2006-01-17

    Application No.: US10396037

    Filing date: 2003-03-25

    IPC classification: G10L19/14 G10L13/08 G10L21/00

    CPC classification: G10L13/033 H03G3/32

    Abstract: A method of automatically adjusting volume of speech generated by a text-to-speech application can include measuring an ambient noise level of an audio environment. A target volume for speech output generated by a text-to-speech application can be calculated based in part upon the ambient noise level. A volume of speech generated by the text-to-speech application can be automatically adjusted responsive to the performed calculation.

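
One simple way to picture the calculation in the abstract above is a clamped linear mapping from measured ambient noise to output volume. The abstract does not specify the mapping, so everything here is an assumption for illustration: noise is measured as RMS level in dB relative to full scale, and the `quiet_dbfs`, `loud_dbfs`, `min_volume`, and `max_volume` defaults are made-up values.

```python
import math


def rms_dbfs(samples, full_scale=32768.0):
    """RMS level of the samples in dB relative to full scale."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms / full_scale) if rms > 0 else float("-inf")


def target_volume(noise_dbfs, quiet_dbfs=-60.0, loud_dbfs=-20.0,
                  min_volume=30, max_volume=100):
    """Map ambient noise to a speech volume, clamped to [min_volume, max_volume]."""
    fraction = (noise_dbfs - quiet_dbfs) / (loud_dbfs - quiet_dbfs)
    fraction = min(1.0, max(0.0, fraction))   # clamp outside the assumed range
    return round(min_volume + fraction * (max_volume - min_volume))
```

A text-to-speech application would call `target_volume(rms_dbfs(ambient_samples))` before speaking and pass the result to whatever volume control its audio API provides.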

    Speech recognition optimization tool
    5.
    Granted invention patent
    Status: in force

    Publication No.: US07340397B2

    Publication date: 2008-03-04

    Application No.: US10378506

    Filing date: 2003-03-03

    IPC classification: G10L15/04

    CPC classification: G10L15/02 G10L15/20

    Abstract: A method of optimizing audio input for speech recognition applications can include identifying a source waveform and at least one optimization parameter, wherein the optimization parameter is configured to adjust audio input to a speech recognition application. The source waveform can be modified according to the optimization parameter, resulting in a modified waveform. At least one optimization parameter can be synchronized with the source waveform. At least two time-dependent graphs can be displayed, where the time-dependent graphs can include the source waveform, the modified waveform, and/or a graph for the optimization parameter plotted against time.

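
To make the abstract above concrete, the sketch below takes gain as the optimization parameter (an assumption; the abstract does not name one) and produces three time-aligned series that a plotting tool could display: sample times, the parameter value at each sample, and the modified waveform. The `gain_track` format, a time-sorted list of `(start_time_seconds, gain)` change points, is hypothetical.

```python
def apply_gain_track(source, gain_track, sample_rate):
    """Apply a time-varying gain to `source`.

    gain_track: list of (start_time_seconds, gain) pairs sorted by time.
    Returns (times, gains, modified), all the same length as `source`.
    """
    times, gains, modified = [], [], []
    current_gain = 1.0    # gain in effect before the first change point
    next_change = 0
    for n, sample in enumerate(source):
        t = n / sample_rate
        # Advance to the last change point at or before this sample's time.
        while next_change < len(gain_track) and gain_track[next_change][0] <= t:
            current_gain = gain_track[next_change][1]
            next_change += 1
        times.append(t)
        gains.append(current_gain)
        modified.append(sample * current_gain)
    return times, gains, modified
```

Plotting `source`, `modified`, and `gains` against `times` on a shared time axis gives the kind of synchronized view the abstract describes.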

    Using a loudness-level-reference segment of audio to normalize relative audio levels among different audio files when combining content of the audio files
    6.
    Granted invention patent
    Status: in force

    Publication No.: US07822498B2

    Publication date: 2010-10-26

    Application No.: US11463683

    Filing date: 2006-08-10

    IPC classification: G06F17/00 H03G3/00 H04B1/20

    Abstract: The present invention records a loudness-level-reference segment of audio when creating speech audio files and audio files including background sounds. The speech audio files can then be combined with the audio files containing background sound in any desirable combination. When combining the files, the relative audio level of the files is matched by matching the loudness-level-reference segments with each other. Any of a variety of known digital signal processing techniques can be used to normalize the component audio files. The combined audio files containing speech and background sounds (e.g., ambient noise) having matching relative audio levels can be used to test and/or train a speech recognition engine or a speech processing system.

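
The level matching described in the abstract above might look like the following sketch. The abstract leaves the normalization technique open ("any of a variety of known digital signal processing techniques"), so RMS matching is simply one choice made for this example, and the function names and the truncate-to-shorter mixing rule are assumptions.

```python
import math


def rms(samples):
    """Root-mean-square level of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_with_matched_levels(speech, speech_ref, background, background_ref):
    """Scale the background so its reference segment matches the speech
    reference segment in RMS level, then mix the two sample by sample."""
    background_rms = rms(background_ref)
    if background_rms == 0:
        raise ValueError("background reference segment is silent")
    scale = rms(speech_ref) / background_rms
    length = min(len(speech), len(background))   # truncate to the shorter file
    return [speech[i] + scale * background[i] for i in range(length)]
```

Because the scale factor comes only from the reference segments, the same background file can be combined with any number of speech files without re-analysing its content.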

    SPEECH RECOGNITION OPTIMIZATION TOOL
    7.
    Invention patent application
    Status: in force

    Publication No.: US20070299663A1

    Publication date: 2007-12-27

    Application No.: US11852193

    Filing date: 2007-09-07

    IPC classification: G10L15/00

    CPC classification: G10L15/02 G10L15/20

    Abstract: A method of optimizing audio input for speech recognition applications can include identifying a source waveform and at least one optimization parameter, wherein the optimization parameter is configured to adjust audio input to a speech recognition application. The source waveform can be modified according to the optimization parameter, resulting in a modified waveform. At least one optimization parameter can be synchronized with the source waveform. At least two time-dependent graphs can be displayed, where the time-dependent graphs can include the source waveform, the modified waveform, and/or a graph for the optimization parameter plotted against time.


    Speech recognition optimization tool
    9.
    Granted invention patent
    Status: in force

    Publication No.: US07490038B2

    Publication date: 2009-02-10

    Application No.: US11852193

    Filing date: 2007-09-07

    IPC classification: G10L15/00

    CPC classification: G10L15/02 G10L15/20

    Abstract: A method of optimizing audio input for speech recognition applications can include identifying a source waveform and at least one optimization parameter, wherein the optimization parameter is configured to adjust audio input to a speech recognition application. The source waveform can be modified according to the optimization parameter, resulting in a modified waveform. At least one optimization parameter can be synchronized with the source waveform. At least two time-dependent graphs can be displayed, where the time-dependent graphs can include the source waveform, the modified waveform, and/or a graph for the optimization parameter plotted against time.
