METHOD AND APPARATUS FOR KEYWORD-BASED MEDIA ITEM TRANSMISSION
    3.
    Invention application
    Status: Under examination (published)

    Publication No.: US20080162454A1

    Publication date: 2008-07-03

    Application No.: US11619465

    Filing date: 2007-01-03

    IPC classification: G06F17/30 G10L15/00 G06F17/27

    Abstract: A system includes a first communication device [105] that participates in a conversation with at least a second communication device [110]. An intelligent communication agent [120] monitors the conversation for at least one keyword. In response to detecting the at least one keyword, the intelligent communication agent performs a search for multimedia content corresponding to the at least one keyword and retrieves the multimedia content. A logic engine [135] determines relevant content of the multimedia content based on at least one of a conversation profile and at least one user profile for at least one of a user of the first communication device and at least a second user of the at least a second communication device. A transmission element [130] transmits the relevant content to at least one of the first communication device, the at least a second communication device, and a predetermined multimedia device [145].
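
The flow in this abstract (monitor a conversation for keywords, search for matching media, then filter by profile) can be sketched as below. This is only an illustration under stated assumptions: the `CATALOGUE` list, the tag-overlap relevance rule, and every function name are hypothetical and do not come from the application.

```python
from dataclasses import dataclass


@dataclass
class MediaItem:
    title: str
    tags: frozenset


# Hypothetical in-memory catalogue standing in for a multimedia search backend.
CATALOGUE = [
    MediaItem("Paris travel video", frozenset({"paris", "travel"})),
    MediaItem("Paris jazz playlist", frozenset({"paris", "music"})),
    MediaItem("Tokyo street photos", frozenset({"tokyo", "travel"})),
]


def detect_keywords(utterance, watched):
    """Return the watched keywords that occur in one conversation turn."""
    words = {w.strip(".,!?").lower() for w in utterance.split()}
    return words & watched


def search_media(keywords):
    """Retrieve every catalogue item tagged with at least one detected keyword."""
    return [item for item in CATALOGUE if item.tags & keywords]


def relevant_content(items, profile_interests):
    """Assumed relevance rule: keep items sharing a tag with the user profile."""
    return [item for item in items if item.tags & profile_interests]


def handle_turn(utterance, watched, profile_interests):
    """Monitor one turn; search and filter only when a keyword is detected."""
    keywords = detect_keywords(utterance, watched)
    if not keywords:
        return []
    return relevant_content(search_media(keywords), profile_interests)
```

For example, `handle_turn("We loved Paris last year!", {"paris", "tokyo"}, {"travel"})` would keep only the travel item tagged with "paris"; a real logic engine would presumably weigh the conversation profile as well.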

    METHOD AND APPARATUS FOR VOICE SEARCHING IN A MOBILE COMMUNICATION DEVICE
    4.
    Invention application
    Status: Under examination (published)

    Publication No.: US20080162472A1

    Publication date: 2008-07-03

    Application No.: US11617134

    Filing date: 2006-12-28

    IPC classification: G06F17/30 G06F3/048

    Abstract: A method and apparatus for performing a voice search in a mobile communication device are disclosed. The method may include receiving a search query from a user of the mobile communication device, converting speech parts in the search query into linguistic representations, comparing the query linguistic representations to the linguistic representations of all items in a voice search database to find matches, wherein the voice search database has indexed all items that are associated with the device, displaying the matches to the user, receiving the user's selection from the displayed matches, and retrieving and executing the user's selection.
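
The matching step described here (convert the query to a linguistic representation, then compare it against the indexed representation of every item on the device) can be illustrated with a small sketch. The representation below is a toy stand-in (lowercase letters rather than phonemes), and the edit-distance ranking is an assumption; the abstract does not specify either.

```python
def to_representation(text):
    """Toy stand-in for a phoneme converter: the lowercase letters of the text."""
    return tuple(ch for ch in text.lower() if ch.isalpha())


def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute x with y
        prev = curr
    return prev[-1]


def build_index(items):
    """Index every item on the device by its representation."""
    return {name: to_representation(name) for name in items}


def voice_search(query, index, max_results=3):
    """Rank indexed items by distance to the query representation."""
    q = to_representation(query)
    ranked = sorted(index, key=lambda name: (edit_distance(q, index[name]), name))
    return ranked[:max_results]
```

The ranked list is what would be displayed to the user, who then selects the item to retrieve and execute.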

    Noise reduced speech recognition parameters
    5.
    Granted patent
    Status: In force

    Publication No.: US06678656B2

    Publication date: 2004-01-13

    Application No.: US10061048

    Filing date: 2002-01-30

    IPC classification: G10L15/20

    Abstract: A voice sample characterization front-end suitable for use in a distributed speech recognition context. A digitized voice sample 31 is split between a low frequency path 32 and a high frequency path 33. Both paths are used to determine spectral content suitable for use when determining speech recognition parameters (such as cepstral coefficients) that characterize the speech sample for recognition purposes. The low frequency path 32 has a thorough noise reduction capability. In one embodiment, the results of this noise reduction are used by the high frequency path 33 to aid in de-noising without requiring the same level of resource capacity as used by the low frequency path 32.

    Method and apparatus for generating and updating a voice tag
    6.
    Granted patent
    Status: Lapsed

    Publication No.: US07471775B2

    Publication date: 2008-12-30

    Application No.: US11170892

    Filing date: 2005-06-30

    Applicant: Yan Ming Cheng

    Inventor: Yan Ming Cheng

    IPC classification: H04M1/64

    Abstract: A method and apparatus (100) for updating a voice tag comprising N stored voice tag phoneme sequences includes a function (110) for determining (205) an accepted stored voice tag phoneme sequence for an utterance, a function (140) for extracting (210) a current set of M phoneme sequences having highest likelihoods of representing the utterance, a function (160) for updating (215) a reference histogram associated with the accepted voice tag, and a function (160) for updating (225) the voice tag with N selected phoneme sequences that are selected from the current set of M phoneme sequences and the set of N voice tag phoneme sequences, wherein the N selected phoneme sequences have phoneme histograms most closely matching the reference histogram. The method and apparatus (100) also generate a voice tag using some functions (110, 140, 160) that are common with the method and apparatus to update the voice tag, such as the extracting (410) of the current set of M phoneme sequences.
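
The selection rule in this abstract (keep the N sequences, drawn from the stored N and the current M, whose phoneme histograms best match a reference histogram) can be sketched as follows. The abstract does not say how the reference histogram is updated or how histograms are compared; accumulating raw phoneme counts and using an L1 distance are assumptions made here for illustration, as are all function names.

```python
from collections import Counter


def histogram(seq):
    """Normalised phoneme histogram of one phoneme sequence."""
    counts = Counter(seq)
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()}


def l1_distance(h1, h2):
    """Sum of absolute differences over all phonemes in either histogram."""
    keys = set(h1) | set(h2)
    return sum(abs(h1.get(k, 0.0) - h2.get(k, 0.0)) for k in keys)


def update_reference(reference_counts, current_sequences):
    """Assumed update rule: add the phoneme counts of the current M sequences."""
    updated = Counter(reference_counts)
    for seq in current_sequences:
        updated.update(seq)
    return updated


def update_voice_tag(stored, current, reference_counts):
    """Return the new N-sequence voice tag and the updated reference counts.

    Sequences are tuples of phoneme symbols so they can be de-duplicated.
    """
    n = len(stored)
    reference_counts = update_reference(reference_counts, current)
    total = sum(reference_counts.values())
    reference = {p: c / total for p, c in reference_counts.items()}
    candidates = list(dict.fromkeys(stored + current))  # de-duplicate, keep order
    candidates.sort(key=lambda s: l1_distance(histogram(s), reference))
    return candidates[:n], reference_counts
```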

    METHOD AND APPARATUS PERTAINING TO THE PROCESSING OF SAMPLED AUDIO CONTENT USING A MULTI-RESOLUTION SPEECH RECOGNITION SEARCH PROCESS
    7.
    Invention application
    Status: Under examination (published)

    Publication No.: US20080162129A1

    Publication date: 2008-07-03

    Application No.: US11617908

    Filing date: 2006-12-29

    Applicant: Yan Ming Cheng

    Inventor: Yan Ming Cheng

    IPC classification: G10L15/00

    CPC classification: G10L15/148

    Abstract: One provides (101) a plurality of frames of sampled audio content and then processes (102) that plurality of frames using a speech recognition search process that comprises, at least in part, searching for at least two of state boundaries, subword boundaries, and word boundaries using different search resolutions.
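
One simple way to picture "different search resolutions" is a schedule in which each boundary type is examined at its own frame interval. The intervals below (every frame, every second frame, every fourth frame) are purely hypothetical; the abstract does not give values or say that resolution is defined this way.

```python
# Hypothetical resolutions: examine each boundary type every k-th frame.
RESOLUTION = {"state": 1, "subword": 2, "word": 4}


def boundaries_to_search(frame_index, resolution=RESOLUTION):
    """Return the boundary types that are searched at this frame."""
    return [kind for kind, step in resolution.items() if frame_index % step == 0]


def search_schedule(num_frames, resolution=RESOLUTION):
    """List, frame by frame, which boundary types are searched."""
    return [boundaries_to_search(i, resolution) for i in range(num_frames)]
```

Under these assumed intervals, word boundaries are examined a quarter as often as state boundaries, which is where any saving in search effort would come from.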

    Methods and apparatus for reducing noise associated with an electrical speech signal
    8.
    Granted patent
    Status: In force

    Publication No.: US06480821B2

    Publication date: 2002-11-12

    Application No.: US09774840

    Filing date: 2001-01-31

    IPC classification: G10L21/02

    Abstract: A system for enhancing the signal-to-noise ratio of a speech signal is provided. A plurality of local energy maximums associated with a speech signal are determined. Presumably, each of these local energy maximums defines a speech pitch period. Typically, human pitch frequencies are approximately 100-400 Hz (pitch periods of roughly 2.5-10 ms), depending on the sex and age of the speaker. Because human speech typically includes more energy near the beginning of a pitch period than at the end of the pitch period, and background noise tends to remain relatively constant throughout the pitch period, the speech signal may be enhanced by increasing the energy associated with the beginning of the pitch period and/or by decreasing the energy associated with the end of the pitch period. Preferably, the amount of energy increase in the earlier portion of the pitch period is approximately equal to the amount of energy reduction in the later portion of the pitch period. In this manner, the total energy remains constant.
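
The energy redistribution described here can be sketched per pitch period: boost the early half, attenuate the late half, and choose the two gains so that the energy added equals the energy removed. The half-way split, the `shift` fraction, and the assumption that pitch-period boundaries are already known (the abstract derives them from local energy maxima) are all illustrative choices, not details from the patent.

```python
import math


def energy(samples):
    """Sum of squared sample values."""
    return sum(x * x for x in samples)


def enhance_period(period, shift=0.5):
    """Move a fraction `shift` (0..1) of the late-half energy into the early half."""
    mid = len(period) // 2
    early, late = period[:mid], period[mid:]
    e_early, e_late = energy(early), energy(late)
    if e_early == 0.0 or e_late == 0.0:
        return list(period)  # nothing to rescale
    moved = shift * e_late
    g_early = math.sqrt((e_early + moved) / e_early)  # gain > 1 for the early half
    g_late = math.sqrt((e_late - moved) / e_late)     # gain < 1 for the late half
    return [g_early * x for x in early] + [g_late * x for x in late]


def enhance(signal, boundaries, shift=0.5):
    """Apply enhance_period between consecutive pitch-period boundaries."""
    out = list(signal[:boundaries[0]])
    for start, end in zip(boundaries, boundaries[1:]):
        out.extend(enhance_period(signal[start:end], shift))
    out.extend(signal[boundaries[-1]:])
    return out
```

Because the early-half energy grows by exactly the amount removed from the late half, the total energy of each period is unchanged up to floating-point rounding.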

    Method and apparatus for distributed voice searching
    9.
    Granted patent
    Status: In force

    Publication No.: US07818170B2

    Publication date: 2010-10-19

    Application No.: US11733306

    Filing date: 2007-04-10

    Applicant: Yan Ming Cheng

    Inventor: Yan Ming Cheng

    IPC classification: G10L17/00

    Abstract: A method for distributed voice searching may include receiving a search query from a user of a mobile communication device, generating a lattice of coarse linguistic representations from speech parts in the search query, extracting query features from the generated lattice of coarse linguistic representations, generating coarse search feature vectors based on the extracted query features, performing a coarse search using the generated coarse search feature vectors and transmitting the generated coarse search feature vectors to a remote voice search processing unit, receiving remote resultant web indices from the remote voice search processing unit, generating a lattice of fine linguistic representations from speech parts in the search query, generating fine search feature vectors from the lattice of fine linguistic representations, performing a fine search using the coarse search results, the remote resultant web indices and the generated fine search feature vectors, and displaying the fine search results to the user.
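
The coarse-then-fine structure of this method can be sketched as a two-pass ranking: a cheap coarse pass shortlists local items and is also sent to a remote unit, and a fine pass re-ranks the merged candidates. Everything concrete below is an assumption for illustration: text strings stand in for speech, first-letter sets and character bigrams stand in for the coarse and fine lattices, Jaccard similarity stands in for the scoring, and `remote_search` is a hypothetical callable representing the remote voice search processing unit.

```python
def coarse_features(text):
    """Coarse representation: the set of first letters of the words."""
    return frozenset(w[0] for w in text.lower().split())


def fine_features(text):
    """Fine representation: the set of character bigrams, spaces removed."""
    s = text.lower().replace(" ", "")
    return frozenset(s[i:i + 2] for i in range(len(s) - 1))


def jaccard(a, b):
    """Set overlap in [0, 1]; defined as 0 when both sets are empty."""
    return len(a & b) / len(a | b) if a | b else 0.0


def coarse_search(query, documents, keep=3):
    """Shortlist local documents using only the coarse features."""
    q = coarse_features(query)
    ranked = sorted(documents, key=lambda d: (-jaccard(q, coarse_features(d)), d))
    return ranked[:keep]


def distributed_search(query, local_docs, remote_search, keep=3):
    """Coarse pass locally and remotely, then a fine pass over the merged hits."""
    local_hits = coarse_search(query, local_docs, keep)
    remote_hits = remote_search(coarse_features(query))  # only coarse features are sent
    candidates = list(dict.fromkeys(local_hits + list(remote_hits)))
    q = fine_features(query)
    return sorted(candidates, key=lambda d: (-jaccard(q, fine_features(d)), d))
```

The design point the sketch tries to capture is that the expensive fine comparison runs only on the small merged shortlist, not on every indexed item.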

    METHOD AND APPARATUS PERTAINING TO THE PROCESSING OF SAMPLED AUDIO CONTENT USING A FAST SPEECH RECOGNITION SEARCH PROCESS
    10.
    Invention application
    Status: Under examination (published)

    Publication No.: US20080162128A1

    Publication date: 2008-07-03

    Application No.: US11617892

    Filing date: 2006-12-29

    Applicant: Yan Ming Cheng

    Inventor: Yan Ming Cheng

    IPC classification: G10L15/00

    CPC classification: G10L15/148

    Abstract: One provides (101) a plurality of frames of sampled audio content and then processes (102) that plurality of frames using a speech recognition search process that comprises, at least in part, determining whether to search each subword boundary contained within each frame on a frame-by-frame basis. These teachings will also readily accommodate determining whether to search each word boundary contained within each frame on a frame-by-frame basis.
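
The abstract does not say how the per-frame decision is made. As a purely hypothetical illustration, the sketch below searches subword boundaries only at frames whose feature vector differs enough from the previous frame; the change measure, the `threshold` value, and the function names are all assumptions.

```python
def frame_change(prev, curr):
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(a - b) for a, b in zip(prev, curr)) / len(curr)


def frames_to_search(frames, threshold=0.5):
    """Indices of frames whose subword boundaries would be searched."""
    selected = []
    for i in range(1, len(frames)):
        # Assumed criterion: search only where the signal changes noticeably.
        if frame_change(frames[i - 1], frames[i]) >= threshold:
            selected.append(i)
    return selected
```

The same per-frame test could be applied to word boundaries with a different threshold, in line with the second sentence of the abstract.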
