Client-server speech processing system, apparatus, method, and storage medium
    1.
    Granted patent
    Status: Expired

    Publication No.: US07058580B2

    Publication date: 2006-06-06

    Application No.: US10956130

    Filing date: 2004-10-04

    IPC classification: G10L15/04

    CPC classification: G10L15/30

    Abstract: The system achieves high-accuracy speech recognition while limiting the amount of data transferred between the client and the server. To this end, the client's speech processing unit compression-encodes speech parameters, and the client sends the compression-encoded speech parameters to the server. The server receives them, its speech processing unit performs speech recognition on the compression-encoded speech parameters, and the server sends information corresponding to the recognition result back to the client.
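
The data flow in this abstract can be illustrated with a minimal sketch. The abstract does not say which codec is used, so the 8-bit scalar quantizer below is purely an assumption, as are the names `encode_params` and `decode_params` and the value range of -10 to 10.

```python
def encode_params(frames, lo=-10.0, hi=10.0):
    """Client side: quantize each float parameter to one byte.

    Assumed codec (not from the patent): uniform 8-bit scalar quantization
    over the hypothetical value range [lo, hi].
    """
    scale = 255.0 / (hi - lo)
    payload = bytearray()
    for frame in frames:
        for x in frame:
            clipped = min(max(x, lo), hi)
            payload.append(round((clipped - lo) * scale))
    return bytes(payload)


def decode_params(payload, dim, lo=-10.0, hi=10.0):
    """Server side: rebuild approximate parameter frames of `dim` values each."""
    step = (hi - lo) / 255.0
    values = [lo + b * step for b in payload]
    return [values[i:i + dim] for i in range(0, len(values), dim)]


frames = [[0.0, 1.5, -2.25], [9.9, -10.0, 3.0]]
payload = encode_params(frames)          # one byte per parameter
restored = decode_params(payload, dim=3)  # what a recognizer would consume
```

Under these assumptions each parameter travels as one byte instead of a four-byte float, at the cost of a bounded quantization error of at most half a quantization step.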

    Information processing apparatus and method, and program
    2.
    Patent application
    Status: Under examination (published)

    Publication No.: US20050119888A1

    Publication date: 2005-06-02

    Application No.: US10497499

    Filing date: 2002-12-10

    Abstract: A GUI display module displays, within a display area, a contents image based on contents data, and a display portion switching input module accepts an instruction to change the portion of the contents image shown in the display area. In response to this instruction, a display portion switching module changes the displayed portion. A synthesis text determination module determines which data in the contents data is to undergo speech synthesis, based on display portion information that is held by a display portion holding module and indicates the displayed portion. A speech synthesis module synthesizes speech from that data, and a speech output module outputs the synthesized speech.
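
As a rough illustration of the synthesis text determination step, the sketch below assumes the contents data is a list of text lines and the display portion information is a first visible line plus a line count. Both representations and the function name `text_to_synthesize` are hypothetical; the abstract does not define them.

```python
def text_to_synthesize(content_lines, first_visible, visible_count):
    """Return the text of the currently displayed portion.

    Assumption: the display portion is a contiguous run of lines; blank
    lines carry nothing to speak and are skipped.
    """
    visible = content_lines[first_visible:first_visible + visible_count]
    return " ".join(line.strip() for line in visible if line.strip())


lines = ["Title", "", "First paragraph.", "Second paragraph.", "Footer"]
spoken = text_to_synthesize(lines, first_visible=1, visible_count=3)
```

Scrolling changes only `first_visible`, so the text handed to the speech synthesizer follows what is on screen.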

    Encoding method for syllables
    4.
    Granted patent
    Status: Expired

    Publication No.: US5208863A

    Publication date: 1993-05-04

    Application No.: US608376

    Filing date: 1990-11-02

    IPC classification: G10L13/06 G10L15/02 G10L19/00

    CPC classification: G10L15/02

    Abstract: A method for encoding the syllables of a language, particularly Japanese, and for facilitating the extraction of sound codes from input syllables for voice recognition or voice synthesis, includes the step of providing a syllable classifying table in which each syllable is represented by an upper byte code indicating the consonant part of the syllable and a lower byte code indicating the non-consonant part. The consonants constitute a first category of data classified by phonetic features, and the non-consonants constitute a second category classified by phonetic features, so that a consonant or non-consonant sound can be extracted by searching only the first or only the second category. Diphthongs are encoded so that, when their codes are divided by the number of vowels contained in the second category, those containing the same vowel leave the same remainder, which corresponds to the code of that vowel; a vowel can therefore be extracted from a diphthong by a simple division.
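
The two-byte layout and the remainder rule can be sketched as follows. The concrete code values, the small consonant table, and the function names are illustrative assumptions; the patent's actual classifying table is not reproduced in this abstract.

```python
# Hypothetical code assignments, chosen only to satisfy the remainder rule.
VOWELS = ["a", "i", "u", "e", "o"]          # vowel codes 0..4
NON_CONSONANTS = {
    "a": 0, "i": 1, "u": 2, "e": 3, "o": 4,
    "ya": 5,   # 5 % 5 == 0 -> "a"
    "yu": 7,   # 7 % 5 == 2 -> "u"
    "yo": 9,   # 9 % 5 == 4 -> "o"
}
CONSONANTS = {"": 0, "k": 1, "s": 2, "t": 3, "n": 4}


def encode(consonant, non_consonant):
    """Upper byte: consonant part. Lower byte: non-consonant part."""
    return (CONSONANTS[consonant] << 8) | NON_CONSONANTS[non_consonant]


def consonant_code(code):
    """Extract the consonant part by looking at the upper byte only."""
    return code >> 8


def vowel_of(code):
    """Extract the vowel from the lower byte with a single division."""
    return VOWELS[(code & 0xFF) % len(VOWELS)]


kyo = encode("k", "yo")   # 0x0109 under the assumed tables
```

With these assumed tables, `vowel_of(kyo)` yields `"o"` without any table search, which is the property the abstract describes for diphthongs.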

    Speech recognition system, speech recognition server, speech recognition client, their control method, and computer readable memory

    Publication No.: US07099824B2

    Publication date: 2006-08-29

    Application No.: US09993570

    Filing date: 2001-11-27

    IPC classification: G10L15/00

    CPC classification: G10L15/30

    Abstract: A client sends three items to a server via a communication module: a user dictionary, formed by storing the pronunciations and notations of target recognition words designated by the user in correspondence with each other; input speech recognition data; and dictionary management data used to determine the recognition field of the recognition dictionary to be used in recognizing the speech recognition data. In the server, a dictionary management unit looks up an identifier table to select, from a plurality of kinds of recognition dictionaries, the recognition dictionary corresponding to the dictionary management data received from the client. A speech recognition module recognizes the speech recognition data using at least the selected recognition dictionary. The recognition result is sent to the client via a communication module.
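
The server-side dictionary selection might look like the sketch below. The table contents, the field identifiers, and the decision to merge the user dictionary into the selected vocabulary are all assumptions made for illustration; the abstract only says that recognition uses "at least" the selected dictionary.

```python
# Hypothetical identifier table: recognition field -> dictionary name.
IDENTIFIER_TABLE = {"address": "address_dictionary", "name": "name_dictionary"}

# Hypothetical recognition dictionaries: pronunciation -> notation.
RECOGNITION_DICTIONARIES = {
    "address_dictionary": {"toukyou": "Tokyo", "oosaka": "Osaka"},
    "name_dictionary": {"satou": "Sato", "suzuki": "Suzuki"},
}


def select_vocabulary(field_id, user_dictionary):
    """Pick the dictionary for the field and overlay the user's own words."""
    dictionary_name = IDENTIFIER_TABLE[field_id]
    vocabulary = dict(RECOGNITION_DICTIONARIES[dictionary_name])
    vocabulary.update(user_dictionary)  # assumed: user entries take priority
    return vocabulary


vocab = select_vocabulary("address", {"shimomaruko": "Shimomaruko"})
```

Keeping the large dictionaries on the server and sending only a field identifier plus a small user dictionary is one way to read the division of labor the abstract describes.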

    Image-forming apparatus and image-forming method
    6.
    Patent application
    Status: In force

    Publication No.: US20050147439A1

    Publication date: 2005-07-07

    Application No.: US11055254

    Filing date: 2005-02-10

    CPC classification: B41J13/106 B41J11/0095

    Abstract: A printing-job accepting unit accepts a requested printing job, and a printing executing unit prints according to the accepted job, delivering output sheets onto a discharge tray. An output-sheet detecting unit detects the presence of printed output sheets on the discharge tray. When removal of sheets from the discharge tray is detected, a printed-sheet mixture determining unit determines whether sheets printed for a plurality of requests may be mixed among the removed output sheets. If such mixing is determined to be possible, a warning voice producing unit produces warning information.
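
A minimal sketch of the mixture check, assuming the apparatus records which request produced each sheet currently on the tray. The class `DischargeTray` and its method names are hypothetical, and the real determination may use other signals.

```python
class DischargeTray:
    """Tracks which print requests have sheets lying on the tray."""

    def __init__(self):
        self.requests_on_tray = []

    def sheet_printed(self, request_id):
        """Called whenever a sheet for `request_id` lands on the tray."""
        self.requests_on_tray.append(request_id)

    def sheets_removed(self):
        """Called when removal is detected; True means a warning is due.

        Assumption: mixing is possible whenever the removed stack holds
        sheets from more than one request.
        """
        mixed = len(set(self.requests_on_tray)) > 1
        self.requests_on_tray.clear()
        return mixed


tray = DischargeTray()
tray.sheet_printed("job-A")
tray.sheet_printed("job-B")
should_warn = tray.sheets_removed()
```

In this sketch the warning voice would be triggered only when `sheets_removed()` returns `True`.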

    Client-server speech processing system, apparatus, method, and storage medium
    7.
    Granted patent
    Status: Expired

    Publication No.: US06813606B2

    Publication date: 2004-11-02

    Application No.: US09739878

    Filing date: 2000-12-20

    IPC classification: G10L21/00

    CPC classification: G10L15/30

    Abstract: The system achieves high-accuracy speech recognition while limiting the amount of data transferred between the client and the server. To this end, the client's speech processing unit compression-encodes speech parameters, and the client sends the compression-encoded speech parameters to the server. The server receives them, its speech processing unit performs speech recognition on the compression-encoded speech parameters, and the server sends information corresponding to the recognition result back to the client.

    Speech processing apparatus and method and computer readable medium encoded with a program for recognizing input speech by performing searches based on a normalized current feature parameter
    9.
    Granted patent
    Status: Expired

    Publication No.: US06236962B1

    Publication date: 2001-05-22

    Application No.: US09038898

    Filing date: 1998-03-12

    IPC classification: G10L15/20

    CPC classification: G10L15/20

    Abstract: An apparatus, method, and storage medium eliminate the influence of line characteristics in real time, in order to raise the recognition precision of input speech and to enable the speech to be recognized in real time. They include a device and step for obtaining an estimate of the long-time mean of a parameter from sequentially input speech feature parameters, using the speech feature parameters that have already been input, and a device and step for normalizing the speech feature parameter input at that time point by using the obtained estimate. Each time a speech feature parameter is input, the latest estimate is obtained from the parameters input so far, including the newly input one, and the newly input speech feature parameter is normalized by using the updated estimate. Because the estimate becomes more reliable as the number of speech feature parameters used to obtain it grows, the normalization applies a weight to the estimate in accordance with that reliability.
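
The frame-by-frame normalization can be sketched as a running mean subtraction. The weighting formula `n / (n + tau)` and the parameter `tau` are assumptions standing in for the reliability weight, which the abstract does not specify; the function name is hypothetical as well.

```python
def normalize_stream(frames, tau=10.0):
    """Normalize each feature frame as it arrives, using only past input.

    Assumed weighting (not from the patent): the running mean is trusted
    in proportion to n / (n + tau), where n is the number of frames seen.
    """
    total = None
    normalized = []
    for n, frame in enumerate(frames, start=1):
        if total is None:
            total = [0.0] * len(frame)
        total = [t + x for t, x in zip(total, frame)]
        mean = [t / n for t in total]          # latest long-time mean estimate
        weight = n / (n + tau)                 # grows toward 1 as n grows
        normalized.append([x - weight * m for x, m in zip(frame, mean)])
    return normalized


out = normalize_stream([[2.0], [4.0]], tau=1.0)
```

Because every frame is normalized with the estimate available at that moment, no look-ahead over the whole utterance is needed, which is what makes real-time operation possible.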

    Speech recognition method and apparatus for use therein
    10.
    Granted patent
    Status: Expired

    Publication No.: US5751898A

    Publication date: 1998-05-12

    Application No.: US199968

    Filing date: 1994-02-22

    CPC classification: G10L15/12 G10L15/10

    Abstract: Speech recognition is achieved using a normalized cumulative distance. A normalized Dynamic Programming (DP) value is calculated by dividing a cumulative path distance by an optimal integral path length. The path length is calculated iteratively by adding 2 if the warping path is diagonal, or by adding 3 if the warping path is horizontal or vertical. The distance may be calculated by measuring the difference between input power and average power. The power difference is weighted by a coefficient λ between 0 and 1; a Mahalanobis distance is then weighted by (1 - λ) and added to the weighted power difference.
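
The normalized DP value can be sketched with a small dynamic-programming routine. The step increments 2 and 3 come from the abstract; the initial path length of 2, the rule of choosing the predecessor with the lowest cumulative distance, and the default absolute-difference local distance are assumptions, as are the function names.

```python
def combined_distance(power_difference, mahalanobis, lam):
    """Local distance: lam * power difference + (1 - lam) * Mahalanobis."""
    return lam * power_difference + (1.0 - lam) * mahalanobis


def normalized_dp(a, b, dist=lambda x, y: abs(x - y)):
    """Cumulative path distance divided by the accumulated path length."""
    rows, cols = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * cols for _ in range(rows)]
    length = [[0] * cols for _ in range(rows)]
    cost[0][0] = dist(a[0], b[0])
    length[0][0] = 2  # assumed starting length
    for i in range(rows):
        for j in range(cols):
            if i == 0 and j == 0:
                continue
            candidates = []
            if i > 0 and j > 0:  # diagonal step adds 2
                candidates.append((cost[i - 1][j - 1], length[i - 1][j - 1] + 2))
            if i > 0:            # vertical step adds 3
                candidates.append((cost[i - 1][j], length[i - 1][j] + 3))
            if j > 0:            # horizontal step adds 3
                candidates.append((cost[i][j - 1], length[i][j - 1] + 3))
            best_cost, best_length = min(candidates)
            cost[i][j] = best_cost + dist(a[i], b[j])
            length[i][j] = best_length
    return cost[-1][-1] / length[-1][-1]


score = normalized_dp([0, 0], [1, 1])
```

Dividing by the accumulated length keeps scores comparable between short and long warping paths, so a longer reference pattern is not penalized merely for having more frames.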