Model-based voice activity detection system and method using a log-likelihood ratio and pitch
    2.
    Granted invention patent (status: in force)

    Publication No.: US06615170B1

    Publication date: 2003-09-02

    Application No.: US09519960

    Filing date: 2000-03-07

    IPC classification: G10L15/20

    CPC classification: G10L25/78

    Abstract: A system and method for voice activity detection, in accordance with the invention, includes the steps of inputting data including frames of speech and noise, and deciding if the frames of the input data include speech or noise by employing a log-likelihood ratio test statistic and pitch. The frames of the input data are tagged based on the log-likelihood ratio test statistic and pitch characteristics of the input data as being most likely noise or most likely speech. The tags are counted in a plurality of frames to determine if the input data is speech or noise.
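
The tag-then-count flow described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the single scalar feature (frame energy), the two one-dimensional Gaussian models, the rule that a frame is tagged speech when either the log-likelihood ratio or the pitch flag favors speech, and the window and vote sizes are all assumptions introduced here.

```python
import math

def gaussian_loglik(x, mean, var):
    """Log-density of a scalar feature under a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def tag_frame(energy, has_pitch, speech=(5.0, 1.0), noise=(1.0, 1.0), threshold=0.0):
    """Tag one frame as 'speech' or 'noise'.

    speech/noise are hypothetical (mean, variance) model parameters.
    Assumption: a frame is tagged speech if the log-likelihood ratio
    exceeds the threshold or the frame is voiced (pitch detected).
    """
    llr = gaussian_loglik(energy, *speech) - gaussian_loglik(energy, *noise)
    return "speech" if (llr > threshold or has_pitch) else "noise"

def detect(frames, window=5, min_speech_tags=3):
    """Tag every frame, then count tags over a sliding window of frames."""
    tags = [tag_frame(energy, has_pitch) for energy, has_pitch in frames]
    decisions = []
    for i in range(len(tags) - window + 1):
        count = tags[i:i + window].count("speech")
        decisions.append("speech" if count >= min_speech_tags else "noise")
    return tags, decisions
```

Counting tags over several frames smooths out isolated misclassified frames, which is presumably why the abstract separates per-frame tagging from the final decision.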

    System and method for sampling rate transformation in speech recognition
    3.
    Granted invention patent (status: in force)

    Publication No.: US06199041B1

    Publication date: 2001-03-06

    Application No.: US09197024

    Filing date: 1998-11-20

    IPC classification: G10L15/00

    CPC classification: G10L15/065 G10L21/00

    Abstract: A method and system for transforming a sampling rate in speech recognition systems, in accordance with the present invention, includes the steps of providing cepstral based data including utterances comprised of segments at a reference frequency, the segments being represented by cepstral vector coefficients, converting the cepstral vector coefficients to energy bands in logarithmic spectra, filtering the energy bands of the logarithmic spectra to remove energy bands having a frequency above a predetermined portion of a target frequency and converting the filtered logarithmic spectra to modified cepstral vector coefficients at the target frequency. Another method and system convert system prototypes for speech recognition systems from a reference frequency to a target frequency.
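
The three conversion steps in this abstract (cepstra to log-spectral bands, band removal, back to cepstra) can be sketched as below. This is only an illustration under stated assumptions: it uses an orthonormal DCT pair as the cepstrum/log-spectrum mapping, takes the "predetermined portion" to be half the target rate, and the names `band_centers_hz` and `portion` are hypothetical rather than taken from the patent.

```python
import math

def dct(log_bands):
    """Orthonormal DCT-II: log band energies -> cepstral coefficients."""
    n = len(log_bands)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(v * math.cos(math.pi * (j + 0.5) * k / n)
                               for j, v in enumerate(log_bands)))
    return out

def idct(cepstra):
    """Inverse of dct() (DCT-III): cepstral coefficients -> log band energies."""
    n = len(cepstra)
    out = []
    for j in range(n):
        out.append(sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * c
                       * math.cos(math.pi * (j + 0.5) * k / n)
                       for k, c in enumerate(cepstra)))
    return out

def transform_cepstra(cepstra, band_centers_hz, target_rate_hz, portion=0.5):
    """Convert reference-rate cepstra to cepstra at a lower target rate.

    Assumption: bands whose center lies above portion * target_rate_hz
    are dropped; the remaining bands are transformed back to cepstra.
    """
    cutoff = portion * target_rate_hz
    log_bands = idct(cepstra)                      # cepstra -> log spectra
    kept = [v for v, f in zip(log_bands, band_centers_hz) if f <= cutoff]
    return dct(kept)                               # log spectra -> cepstra
```

Working in the cepstral domain this way avoids resampling the audio and recomputing features, at the cost of assuming the band layout is known.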

    Voice prompts for use in speech-to-speech translation system

    Publication No.: US20060253272A1

    Publication date: 2006-11-09

    Application No.: US11123287

    Filing date: 2005-05-06

    IPC classification: G06F17/28

    CPC classification: G10L13/00 G10L2015/223

    Abstract: Techniques for employing improved prompts in a speech-to-speech translation system are disclosed. By way of example, a technique for use in indicating a dialogue turn in an automated speech-to-speech translation system comprises the following steps/operations. One or more text-based scripts are obtained. The one or more text-based scripts are synthesizable into one or more voice prompts. At least one of the one or more voice prompts is synthesized for playback from at least one of the one or more text-based scripts, the at least one synthesized voice prompt comprising an audible message in a language understandable to a speaker interacting with the speech-to-speech translation system, the audible message indicating a dialogue turn in the automated speech-to-speech translation system.

    System and method for determining a connectionless communication path for communicating audio data through an address and port translation device
    6.
    Granted invention patent (status: in force)

    Publication No.: US06928082B2

    Publication date: 2005-08-09

    Application No.: US09819492

    Filing date: 2001-03-28

    Abstract: A method of audio communication utilizing media datagrams between a first telephony client located behind a network address translation (NAT) server and a remote second telephony client is disclosed. Each client utilizes a single port number for both sending and receiving media datagrams. A media datagram is sent from the first telephony client to the second telephony client on a UDP/IP channel utilizing a destination IP address and port number provided by the second telephony client. The second telephony client extracts the source IP address and source port number from the received media datagram to determine if the first telephony client is located behind a NAT server. If the first telephony client is located behind a NAT server, the extracted source IP address and port number are stored and used to send media datagrams to the first telephony client located behind the NAT server.
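
The receiver-side logic in this abstract can be sketched as a small state holder. The abstract does not say how the NAT is detected; the sketch assumes the second client compares the datagram's observed source address with the address the first client advertised during call setup. The class name `MediaPeer` and its fields are hypothetical, and the addresses in the usage example are documentation placeholders.

```python
class MediaPeer:
    """Tracks where to send media datagrams for one remote telephony client."""

    def __init__(self, advertised_addr):
        # (ip, port) the remote client announced in signaling (assumed input).
        self.advertised_addr = advertised_addr
        self.return_addr = advertised_addr
        self.behind_nat = False

    def on_datagram(self, source_addr):
        """Handle the (ip, port) extracted from a received media datagram.

        Assumption: a mismatch with the advertised address means the
        sender sits behind a NAT, so the observed address is stored and
        used for all outgoing media.
        """
        if source_addr != self.advertised_addr:
            self.behind_nat = True
            self.return_addr = source_addr
        return self.return_addr


peer = MediaPeer(("192.168.1.10", 5004))
reply_to = peer.on_datagram(("203.0.113.7", 40001))
```

Because each client uses one port for both sending and receiving, replying to the observed source address reuses the mapping the NAT already opened for the outbound datagram.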

    Automatic segmentation of continuous text using statistical approaches
    7.
    Granted invention patent (status: lapsed)

    Publication No.: US5806021A

    Publication date: 1998-09-08

    Application No.: US700823

    Filing date: 1996-09-04

    IPC classification: G06F17/27 G06F17/20

    CPC classification: G06F17/277

    Abstract: An automatic segmenter for continuous text segments such text in a rapid, consistent and semantically accurate manner. Two statistical methods for segmentation of continuous text are used. The first method, called "forward-backward matching", is easy and fast but can produce occasional errors in long phrases. The second method, called "statistical stack search segmenter", utilizes statistical language models to generate more accurate segmentation output at an expense of two times more execution time than the "forward-backward matching" method. In some applications where speed is a major concern, "forward-backward matching" can be used, while in other applications where highly accurate output is desired, "statistical stack search segmenter" is ideal.
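
The abstract names "forward-backward matching" without giving its details. One plausible reading is greedy longest-match segmentation run in both directions against a vocabulary, sketched below. The vocabulary, the `max_len` limit, and the rule for choosing between the two results (fewer words wins, ties go to the backward pass) are assumptions made for illustration, not the patent's stated procedure.

```python
def forward_match(text, vocab, max_len=4):
    """Greedy longest match scanning left to right."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words

def backward_match(text, vocab, max_len=4):
    """Greedy longest match scanning right to left."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                words.append(piece)
                j -= size
                break
    return words[::-1]

def segment(text, vocab, max_len=4):
    """Assumed combination rule: fewer words wins; ties go to the backward pass."""
    fwd = forward_match(text, vocab, max_len)
    bwd = backward_match(text, vocab, max_len)
    return fwd if len(fwd) < len(bwd) else bwd
```

With the vocabulary {"研究", "研究生", "生命", "起源"} and the text "研究生命起源", the forward pass greedily takes "研究生" and strands "命", while the backward pass yields "研究", "生命", "起源". Disagreements like this are the kind of occasional error the abstract attributes to the fast method.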

    Voice prompts for use in speech-to-speech translation system
    8.
    Granted invention patent (status: in force)

    Publication No.: US08560326B2

    Publication date: 2013-10-15

    Application No.: US12115205

    Filing date: 2008-05-05

    IPC classification: G06F17/28 G06F17/20 G10L21/00

    CPC classification: G10L13/00 G10L2015/223

    Abstract: Techniques for employing improved prompts in a speech-to-speech translation system are disclosed. By way of example, a technique for use in indicating a dialogue turn in an automated speech-to-speech translation system comprises the following steps/operations. One or more text-based scripts are obtained. The one or more text-based scripts are synthesizable into one or more voice prompts. At least one of the one or more voice prompts is synthesized for playback from at least one of the one or more text-based scripts, the at least one synthesized voice prompt comprising an audible message in a language understandable to a speaker interacting with the speech-to-speech translation system, the audible message indicating a dialogue turn in the automated speech-to-speech translation system.

    Voice Prompts for Use in Speech-to-Speech Translation System
    9.
    Invention application (status: in force)

    Publication No.: US20080243476A1

    Publication date: 2008-10-02

    Application No.: US12115205

    Filing date: 2008-05-05

    IPC classification: G06F17/28

    CPC classification: G10L13/00 G10L2015/223

    Abstract: Techniques for employing improved prompts in a speech-to-speech translation system are disclosed. By way of example, a technique for use in indicating a dialogue turn in an automated speech-to-speech translation system comprises the following steps/operations. One or more text-based scripts are obtained. The one or more text-based scripts are synthesizable into one or more voice prompts. At least one of the one or more voice prompts is synthesized for playback from at least one of the one or more text-based scripts, the at least one synthesized voice prompt comprising an audible message in a language understandable to a speaker interacting with the speech-to-speech translation system, the audible message indicating a dialogue turn in the automated speech-to-speech translation system.