Automatic and attendant speech to text conversion in a selective call
radio system and method
    1.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US6151572A

    Publication date: 2000-11-21

    Application No.: US67779

    Filing date: 1998-04-27

    IPC classification: G10L15/00 G10L15/26 H04M1/64

    Abstract: A radio communication system includes a voice recognition system (218), a transmitter (202) and a processing system (210). The voice recognition system receives caller-initiated messages, and the transmitter transmits messages to a plurality of SCRs (selective call radios) (122) of the radio communication system. The processing system, which is coupled to the voice recognition system and the transmitter, is adapted to cause the voice recognition system to convert a voice signal representative of a voice message originated by a caller of the radio communication system to a text message (401, 417), wherein the text message is intended for an SCR, to then generate a likelihood of success that the voice signal has been flawlessly converted to a text message, to have a human listen to an audible representation of the voice signal, and to cause the transmitter to transmit the text message to the SCR (432). The converting step includes autocorrelation by Fourier transform, measuring a degree of voiceness for each band, applying the degree of voiceness to a corresponding plurality of phoneme models, and deriving a text equivalent by searching through a phoneme library.
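
    The "autocorrelation by Fourier transform" step can be illustrated with the Wiener-Khinchin relation: the autocorrelation of a signal is the inverse transform of its power spectrum. The sketch below is not the patented method. The function names, the single-band treatment, and the use of the normalized autocorrelation peak as a stand-in for the "degree of voiceness" are all assumptions made for illustration.

```python
import cmath
import math


def dft(values, inverse=False):
    """Plain O(n^2) discrete Fourier transform (stdlib only, for clarity)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(
            values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
            for t in range(n)
        )
        out.append(total / n if inverse else total)
    return out


def autocorrelation(signal):
    """Linear autocorrelation as the inverse DFT of the power spectrum."""
    n = len(signal)
    # Zero-pad to twice the length so circular wrap-around does not leak in.
    padded = list(signal) + [0.0] * n
    power = [abs(c) ** 2 for c in dft(padded)]
    acf = dft(power, inverse=True)
    return [acf[lag].real for lag in range(n)]


def voicing_degree(signal, min_lag, max_lag):
    """Assumed voicing measure: largest autocorrelation value in a lag range,
    normalized by the zero-lag energy. Near 1 for periodic (voiced) input."""
    acf = autocorrelation(signal)
    if acf[0] <= 0.0:
        return 0.0  # silent frame
    return max(acf[min_lag:max_lag + 1]) / acf[0]


if __name__ == "__main__":
    # A sine with a 20-sample period stands in for one band of voiced speech.
    frame = [math.sin(2 * math.pi * t / 20) for t in range(200)]
    print(round(voicing_degree(frame, 10, 40), 3))
```

    In a fuller system, one such value per frequency band would be passed on to the phoneme models; that stage is not sketched here.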

    Speech recognition in selective call systems
    2.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US5719996A

    Publication date: 1998-02-17

    Application No.: US491329

    Filing date: 1995-06-30

    IPC classification: G10L15/14 G10L5/06

    CPC classification: G10L15/144

    Abstract: A selective call communication system (100) has a speech recognition system using an acoustic space (400) which has a plurality of probability density functions (pdfs). The selective call communication system (100) has an acoustic space generator (136) for representing speech in the acoustic space (400), which has a plurality of regions (1-14), each having a subset of the plurality of probability density functions (502-516). The selective call communication system (100) has a tree generator (138) for generating a hierarchical tree structure (500) representing the subset of the plurality of probability density functions (502-516) associated with the plurality of regions (1-14), a score computer (132) for determining a region of the plurality of regions (1-14) indicative of a minimum distance to a center of the region for a speech sample received, and a speech recognizer (130) for calculating the probability density functions of the region for recognizing the speech sample received.
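
    The region-then-pdf idea in this abstract can be sketched as a tree descent followed by a local likelihood evaluation. The code below is an illustrative sketch only: the `Region` and `Gaussian` classes, the diagonal-Gaussian form of the pdfs, and the squared Euclidean distance are assumptions, not details taken from the patent.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Gaussian:
    label: str
    mean: list
    var: list

    def log_pdf(self, x):
        # Diagonal-covariance Gaussian log-density (assumed pdf form).
        return sum(
            -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, self.mean, self.var)
        )


@dataclass
class Region:
    center: list
    children: list = field(default_factory=list)  # sub-regions in the tree
    pdfs: list = field(default_factory=list)      # pdfs held at leaf regions


def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def find_region(root, x):
    """Descend the tree, always moving to the child whose center is nearest."""
    node = root
    while node.children:
        node = min(node.children, key=lambda child: sq_dist(child.center, x))
    return node


def recognize(root, x):
    """Evaluate only the pdfs of the selected region and return the best label."""
    region = find_region(root, x)
    return max(region.pdfs, key=lambda g: g.log_pdf(x)).label


if __name__ == "__main__":
    tree = Region(center=[5.0, 5.0], children=[
        Region(center=[0.0, 0.0], pdfs=[
            Gaussian("ah", [0.0, 0.0], [1.0, 1.0]),
            Gaussian("eh", [1.0, 1.0], [1.0, 1.0]),
        ]),
        Region(center=[10.0, 10.0], pdfs=[
            Gaussian("s", [10.0, 10.0], [1.0, 1.0]),
            Gaussian("sh", [11.0, 9.0], [1.0, 1.0]),
        ]),
    ])
    print(recognize(tree, [9.8, 10.1]))
```

    The point of the structure is cost: only the pdfs of one region are evaluated per sample instead of every pdf in the acoustic space.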

    VOICE WEB SEARCH
    3.
    Type: Patent application
    Status: In force

    Publication No.: US20110145214A1

    Publication date: 2011-06-16

    Application No.: US12639176

    Filing date: 2009-12-16

    IPC classification: G06F17/30

    CPC classification: G06F17/30899 G06F17/30654

    Abstract: A search system receives a voice query and uses speech recognition with a predefined vocabulary to generate a textual transcription of the voice query. Queries are sent to a text search engine, retrieving multiple web page results for each of these initial text queries. A collection of keywords is extracted from the resulting web pages and is phonetically indexed to form a voice-query-dependent, phonetically searchable index database. Finally, a phonetically based voice search engine searches the original voice query against this index database to find the keywords and/or key phrases that best match what was originally spoken. These keywords and/or key phrases are then used as a final text query for a search engine. Search results from the final text query are then presented to the user.
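
    The phonetic-index stage can be approximated in a few lines. This sketch swaps in a crude Soundex-like key computed from spelling (it is not the real Soundex algorithm, and it does not operate on audio as the described system does). The helper names and the sample pages are hypothetical.

```python
import difflib

# Consonant classes for the assumed phonetic key; vowels and h, w, y get no code.
CODES = {}
for letters, code in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                      ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = code


def normalize(word):
    return "".join(ch for ch in word.lower() if ch.isalpha())


def phonetic_key(word):
    """First letter plus consonant-class codes, with adjacent repeats collapsed."""
    word = normalize(word)
    if not word:
        return ""
    key = word[0]
    prev = CODES.get(word[0], "")
    for ch in word[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code
    return key


def build_index(pages):
    """Map phonetic keys to the keywords extracted from result pages."""
    index = {}
    for page in pages:
        for word in page.split():
            key = phonetic_key(word)
            if key:
                index.setdefault(key, set()).add(normalize(word))
    return index


def final_text_query(transcription, index):
    """Replace each transcribed word with the indexed keyword(s) it sounds like."""
    matched = []
    for word in transcription.split():
        key = phonetic_key(word)
        if key not in index:
            close = difflib.get_close_matches(key, list(index), n=1, cutoff=0.6)
            if not close:
                continue
            key = close[0]
        matched.extend(sorted(index[key]))
    return " ".join(matched)


if __name__ == "__main__":
    pages = ["Python tutorial for beginners", "Monty Python sketches"]
    print(final_text_query("pithon tootorial", build_index(pages)))
```

    Because the index is built only from pages returned for the initial queries, the search space stays small and specific to the spoken query.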

    ANALYZING AND PROCESSING A VERBAL EXPRESSION CONTAINING MULTIPLE GOALS
    4.
    Type: Patent application
    Status: In force

    Publication No.: US20110144996A1

    Publication date: 2011-06-16

    Application No.: US12639067

    Filing date: 2009-12-16

    IPC classification: G10L15/00

    Abstract: Disclosed is a method for parsing a verbal expression received from a user to determine whether or not the expression contains a multiple-goal command. Specifically, known techniques are applied to extract terms from the verbal expression. The extracted terms are assigned to categories. If two or more terms are found in the parsed verbal expression that are in associated categories and that do not overlap one another temporally, then the confidence levels of these terms are compared. If the confidence levels are similar, then the terms may be parallel entries in the verbal expression and may represent multiple goals. If a multiple-goal command is found, then the command is either presented to the user for review and possible editing or is executed. If the parsed multiple-goal command is presented to the user for review, then the presentation can be made via any appropriate interface including voice and text interfaces.
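
    The multiple-goal test described in this abstract reduces to three checks per pair of terms: associated categories, no temporal overlap, and similar confidence. The sketch below is a minimal illustration under stated assumptions: "associated" is taken to mean "same category", and the `tolerance` value of 0.15 is a hypothetical threshold, neither of which comes from the source.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Term:
    text: str
    category: str
    start: float       # seconds into the utterance
    end: float
    confidence: float  # recognizer confidence in [0, 1]


def overlaps(a, b):
    """True when the two terms share any stretch of time."""
    return a.start < b.end and b.start < a.end


def find_parallel_terms(terms, tolerance=0.15):
    """Return pairs of terms that may be parallel goals of one command."""
    pairs = []
    for a, b in combinations(terms, 2):
        if a.category != b.category:   # assumed meaning of "associated"
            continue
        if overlaps(a, b):             # competing hypotheses, not parallel goals
            continue
        if abs(a.confidence - b.confidence) <= tolerance:
            pairs.append((a.text, b.text))
    return pairs


if __name__ == "__main__":
    # "call Alice and Bob", with "Alex" as a weaker alternative for "Alice".
    terms = [
        Term("Alice", "contact", 0.5, 0.9, 0.82),
        Term("Alex", "contact", 0.5, 0.9, 0.40),
        Term("Bob", "contact", 1.2, 1.5, 0.78),
    ]
    print(find_parallel_terms(terms))
```

    Terms that overlap in time are treated as alternative readings of the same audio, which is why only non-overlapping terms can count as separate goals.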

    METHOD AND APPARATUS FOR GENERATING A MULTIMEDIA-BASED QUERY
    5.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20100145971A1

    Publication date: 2010-06-10

    Application No.: US12329979

    Filing date: 2008-12-08

    IPC classification: G06F7/10 G06F17/30 G06F7/00

    CPC classification: G06F16/43

    Abstract: A method and apparatus for generating a query from multimedia content is provided herein. During operation, a query generator (101) receives multimedia content and separates it into at least a video portion and an audio portion. A query is generated based on both the video portion and the audio portion. The query may comprise a single query based on both the video and audio portions, or the query may comprise a "bundle" of queries. The bundle of queries contains at least a query for the video portion and a query for the audio portion of the multimedia event.
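
    The difference between a single combined query and a "bundle" of per-portion queries can be shown with a small data structure. This is only a sketch: the `Query` class, the dictionary layout of the content, and the text descriptors are hypothetical, and real demultiplexing and feature extraction are left out.

```python
from dataclasses import dataclass


@dataclass
class Query:
    modality: str    # "video" or "audio"
    descriptor: str  # hypothetical feature summary of that portion


def split_content(content):
    """Separate multimedia content into its video and audio portions."""
    return content["video"], content["audio"]


def generate_bundle(content):
    """One query per portion, kept together as a bundle."""
    video, audio = split_content(content)
    return [Query("video", video), Query("audio", audio)]


def generate_single_query(content):
    """One combined query built from both portions."""
    video, audio = split_content(content)
    return f"{video} {audio}"


if __name__ == "__main__":
    clip = {"video": "red car", "audio": "engine noise"}
    print(generate_bundle(clip))
    print(generate_single_query(clip))
```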

    Voice web search
    6.
    Type: Granted patent
    Status: In force

    Publication No.: US09081868B2

    Publication date: 2015-07-14

    Application No.: US12639176

    Filing date: 2009-12-16

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30899 G06F17/30654

    Abstract: A search system receives a voice query and uses speech recognition with a predefined vocabulary to generate a textual transcription of the voice query. Queries are sent to a text search engine, retrieving multiple web page results for each of these initial text queries. A collection of keywords is extracted from the resulting web pages and is phonetically indexed to form a voice-query-dependent, phonetically searchable index database. Finally, a phonetically based voice search engine searches the original voice query against this index database to find the keywords and/or key phrases that best match what was originally spoken. These keywords and/or key phrases are then used as a final text query for a search engine. Search results from the final text query are then presented to the user.

    Analyzing and processing a verbal expression containing multiple goals
    7.
    Type: Granted patent
    Status: In force

    Publication No.: US08914289B2

    Publication date: 2014-12-16

    Application No.: US12639067

    Filing date: 2009-12-16

    IPC classification: G10L15/22 G06F17/30 G06F17/27

    Abstract: A method for parsing a verbal expression received from a user to determine whether or not the expression contains a multiple-goal command is described. Specifically, known techniques are applied to extract terms from the verbal expression. The extracted terms are assigned to categories. If two or more terms are found in the parsed verbal expression that are in associated categories and that do not overlap one another temporally, then the confidence levels of these terms are compared. If the confidence levels are similar, then the terms may be parallel entries in the verbal expression and may represent multiple goals. If a multiple-goal command is found, then the command is either presented to the user for review and possible editing or is executed. If the parsed multiple-goal command is presented to the user for review, then the presentation can be made via any appropriate interface including voice and text interfaces.

    Methods for creating and searching a database of speakers
    8.
    Type: Granted patent
    Status: In force

    Publication No.: US08442823B2

    Publication date: 2013-05-14

    Application No.: US12907729

    Filing date: 2010-10-19

    IPC classification: G10L15/00

    CPC classification: G10L15/1822

    Abstract: A method of performing a search of a database of speakers includes: receiving a query speech sample spoken by a query speaker; deriving a query utterance from the query speech sample; extracting query utterance statistics from the query utterance; performing Kernelized Locality-Sensitive Hashing (KLSH) using a kernel function, the KLSH using as input the query utterance statistics and utterance statistics extracted from a plurality of utterances included in a database of speakers in order to select a subset of the plurality of utterances; and comparing, using an utterance comparison equation, the query utterance statistics to the utterance statistics for each utterance in the subset to generate a list of speakers from the database of utterances having a highest similarity to the query speaker.
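
    The two-stage search (hash to select a subset, then compare in detail) can be sketched with a simpler hashing scheme. The code below substitutes plain random-hyperplane locality-sensitive hashing for KLSH and cosine similarity for the utterance comparison equation. Both substitutions, the fixed-length vectors standing in for utterance statistics, and the `radius` parameter are assumptions made for illustration.

```python
import math
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Random hyperplanes; each one contributes a bit to the hash signature."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    """Bit i is 1 when vec lies on the non-negative side of hyperplane i."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, database, planes, radius=2, top_k=3):
    """Stage 1: keep utterances whose signature is within `radius` bits of the
    query's. Stage 2: rank that subset by cosine similarity to the query."""
    q_sig = signature(query, planes)
    subset = [
        (speaker, vec) for speaker, vec in database
        if hamming(q_sig, signature(vec, planes)) <= radius
    ]
    ranked = sorted(subset, key=lambda item: cosine(query, item[1]), reverse=True)
    return [speaker for speaker, _ in ranked[:top_k]]


if __name__ == "__main__":
    database = [
        ("alice", [1.0, 0.2, 0.1]),
        ("bob", [-0.9, 0.1, 0.3]),
        ("carol", [0.1, -1.0, 0.5]),
    ]
    planes = make_hyperplanes(dim=3, n_bits=8)
    print(search([2.0, 0.4, 0.2], database, planes))
```

    In practice the database signatures would be computed once and stored in hash tables, so that the first stage avoids scanning every utterance.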
