Patent search ap:"Matthew Lennig", page 1

1.

Granted invention patent
Updating Markov models based on speech input and additional information for automated telephone directory assistance (Expired)

Publication number: US5644680A

Publication date: 1997-07-01

Application number: US452191

Filing date: 1995-05-25

Applicants: Gregory J. Bielby, Vishwa N. Gupta, Lauren C. Hodgson, Matthew Lennig, R. Douglas Sharp, Hans A. Wasmeier

Inventors: Gregory J. Bielby, Vishwa N. Gupta, Lauren C. Hodgson, Matthew Lennig, R. Douglas Sharp, Hans A. Wasmeier

IPC classification: G06F17/30, G10L15/00, G10L15/06, G10L15/14, G10L15/22, H04M3/42, H04M3/493, H04M3/50, H04M3/51, H04M3/60, H04Q3/545, H04Q3/72, G10L5/06

CPC classification: G10L15/063, G10L15/22, H04M3/4931, G10L15/142, G10L25/24, H04M2201/40, H04M2242/22, H04M3/42093, H04M3/42102, H04M3/51, H04Q3/72

Abstract: In methods and apparatus for at least partially automating a telephone directory assistance function, directory assistance callers are prompted to speak locality or called entity names associated with desired directory listings. A speech recognition algorithm is applied to speech signals received in response to prompting to determine spoken locality or called entity names. Desired telephone numbers are released to callers, and released telephone numbers are used to confirm or correct at least some of the recognized locality or called entity names. Speech signal representations labelled with the confirmed or corrected names are used as labelled speech tokens to refine prior training of the speech recognition algorithm. The training refinement automatically adjusts for deficiencies in prior training of the speech recognition algorithm and for long-term changes in the speech patterns of directory assistance callers served by a particular directory assistance installation. The methods can be generalized to other speech recognition applications.
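The confirm-or-correct step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the `NUMBER_TO_LOCALITY` table, the `label_token` helper, and the sample numbers are all hypothetical, and the sketch assumes each released telephone number maps to exactly one locality.

```python
from typing import NamedTuple, Optional

# Hypothetical directory data: released number -> locality of the listing.
NUMBER_TO_LOCALITY = {
    "5145550101": "Montreal",
    "6135550123": "Ottawa",
}


class LabelledToken(NamedTuple):
    features: list   # speech signal representation (placeholder values)
    label: str       # confirmed or corrected locality name
    corrected: bool  # True when the recognizer's guess was overridden


def label_token(features: list, recognized: str,
                released_number: str) -> Optional[LabelledToken]:
    """Confirm or correct a recognized locality using the released number.

    Returns None when the number is not in the table, so that utterance
    is simply not used as a training token.
    """
    truth = NUMBER_TO_LOCALITY.get(released_number)
    if truth is None:
        return None
    return LabelledToken(features, truth, corrected=(truth != recognized))
```

Tokens returned by `label_token` would then be fed to whatever retraining procedure the recognizer uses; that step is outside the scope of this sketch.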

2.

Granted invention patent
Speech recognition (Expired)

Publication number: US4956865A

Publication date: 1990-09-11

Application number: US191824

Filing date: 1988-05-02

Applicants: Matthew Lennig, Paul Mermelstein, Vishwa N. Gupta

Inventors: Matthew Lennig, Paul Mermelstein, Vishwa N. Gupta

IPC classification: G10L11/02, G10L15/02

CPC classification: G10L25/87, G10L15/02

Abstract: In a speech recognizer, for recognizing unknown utterances in isolated-word speech or continuous speech, improved recognition accuracy is obtained by augmenting the usual spectral representation of the unknown utterance with a dynamic component. A corresponding dynamic component is provided in the templates with which the spectral representation of the utterance is compared. In preferred embodiments, the representation is mel-based cepstral and the dynamic components comprise vector differences between pairs of primary cepstra. Preferably the time interval between each pair is about 50 milliseconds. It is also preferable to compute a dynamic perceptual loudness component along with the dynamic parameters.
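The dynamic component described in the abstract, a vector difference between a pair of cepstral frames, can be illustrated with a short sketch. The function name `add_dynamic_component`, the centered pairing, the edge clamping, and the 12.5 ms frame step mentioned in the docstring are assumptions made for illustration; the abstract only states that the pair is about 50 milliseconds apart.

```python
def add_dynamic_component(cepstra, half_span=2):
    """Append a delta vector to each cepstral frame.

    cepstra: list of frames, each a list of floats of equal length.
    half_span: frames on each side of the current frame (assumption);
        with a hypothetical 12.5 ms frame step, half_span=2 places the
        two frames of each pair about 50 ms apart.
    """
    last = len(cepstra) - 1
    augmented = []
    for t, frame in enumerate(cepstra):
        # Clamp at the utterance edges so every frame gets a delta.
        later = cepstra[min(last, t + half_span)]
        earlier = cepstra[max(0, t - half_span)]
        delta = [a - b for a, b in zip(later, earlier)]
        augmented.append(list(frame) + delta)
    return augmented
```

The same augmentation would be applied to the templates so that utterance and template vectors have matching dimensions.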

3.

Granted invention patent
Prosody based endpoint detection (Expired)

Publication number: US06873953B1

Publication date: 2005-03-29

Application number: US09576116

Filing date: 2000-05-22

Applicant: Matthew Lennig

Inventor: Matthew Lennig

IPC classification: G10L11/02, G10L15/04

CPC classification: G10L25/87

Abstract: A method and apparatus are provided for performing prosody based endpoint detection of speech in a speech recognition system. Input speech represents an utterance, which has an intonation pattern. An end-of-utterance condition is identified based on prosodic parameters of the utterance, such as the intonation pattern and the duration of the final syllable of the utterance, as well as non-prosodic parameters, such as the log energy of the speech.

4.

Granted invention patent
System architecture for and method of voice processing (Expired)

Publication number: US06119087A

Publication date: 2000-09-12

Application number: US39203

Filing date: 1998-03-13

Applicants: Thomas Murray Kuhn, Matthew Lennig, Peter Christopher Monaco, David Bruce Peters

Inventors: Thomas Murray Kuhn, Matthew Lennig, Peter Christopher Monaco, David Bruce Peters

IPC classification: G10L15/22, G10L15/30, G10L15/14

CPC classification: G10L15/30, G10L15/22

Abstract: A system and method for efficiently distributing voice call data received from speech recognition servers over a telephone network having a shared processing resource is disclosed. Incoming calls are received from phone lines and assigned grammar types by speech recognition servers. A request for processing the voice call data is sent to a resource manager which monitors the shared processing resource and identifies a preferred processor within the shared resource. The resource manager sends an instruction to the speech recognition server to send the voice call data to a preferred processor for processing. The preferred processor is determined by known processor efficiencies for voice call data having the assigned grammar type of the incoming voice call data and a measure of processor loads. While the system is operating, the resource manager develops and updates a history of each processor. The histories include processing efficiency values for all grammar types received. The processing efficiencies are stored, tabulated and assigned usage number values for each processor. When incoming voice call data is received, the resource manager evaluates the total sum of the usage numbers for processing requests assigned to each processor and the usage number for the grammar type of the incoming data as applied to each processor. The incoming data is distributed to the processor with the lowest sum of the total usage numbers for assigned requests and the usage number assigned to the incoming data for that processor.
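The selection rule in the last two sentences of the abstract can be sketched as below. The data layout (`usage_table`, `assigned`), the function name `pick_processor`, and the sample usage numbers are hypothetical; the sketch only assumes that a lower usage number means cheaper processing, as the abstract's "lowest sum" rule implies.

```python
def pick_processor(usage_table, assigned, grammar_type):
    """Choose the processor with the lowest projected usage.

    usage_table: dict processor -> dict grammar type -> usage number.
    assigned: dict processor -> list of grammar types of the requests
        currently assigned to that processor.
    grammar_type: grammar type of the incoming voice call data.
    """
    def projected(proc):
        # Load already on the processor, in usage numbers.
        current = sum(usage_table[proc][g] for g in assigned.get(proc, []))
        # Plus the cost of the incoming request on that same processor.
        return current + usage_table[proc][grammar_type]

    return min(usage_table, key=projected)


# Hypothetical usage numbers for two processors and two grammar types.
USAGE = {
    "p1": {"digits": 1, "names": 5},
    "p2": {"digits": 2, "names": 3},
}
ASSIGNED = {"p1": ["digits", "digits"], "p2": ["names"]}
```

With these sample values, an incoming "names" request goes to `p2` (projected 6 versus 7) while an incoming "digits" request goes to `p1` (projected 3 versus 5).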

5.

Granted invention patent
Rejection method for speech recognition (Expired)

Publication number: US5097509A

Publication date: 1992-03-17

Application number: US501993

Filing date: 1990-03-28

Applicant: Matthew Lennig

Inventor: Matthew Lennig

IPC classification: G10L15/00, G10L15/08

CPC classification: G10L15/08

Abstract: A speech recognizer for recognizing unknown utterances in isolated-word small-vocabulary speech has improved rejection of out-of-vocabulary utterances. Both a usual spectral representation including a dynamic component and an equalized representation are used to match unknown utterances to templates for in-vocabulary words. In a preferred embodiment, the representations are mel-based cepstral with dynamic components being signed vector differences between pairs of primary cepstra. The equalized representation being the signed difference of each cepstral coefficient less an average value of the coefficients. Factors are generated from the ordered lists of templates to determine the probability of the top choice being a correct acceptance, with different methods being applied when the usual and equalized representations yield a different match. For additional enhancement, the rejection method may use templates corresponding to non-vocabulary utterances or decoys. If the top choice corresponds to a decoy, the input is rejected.
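The decoy rule in the final two sentences of the abstract can be illustrated with a small sketch. Only the decoy check is shown; the probability factors and the equalized representation are not modelled. The function name, the tuple layout of `templates`, and the use of plain Euclidean distance on equal-length vectors are assumptions standing in for the recognizer's real template match.

```python
import math


def recognize_with_decoys(utterance, templates):
    """Return the best in-vocabulary word, or None if a decoy wins.

    utterance: feature vector (list of floats).
    templates: list of (label, is_decoy, vector) tuples.
    """
    def distance(vec):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(utterance, vec)))

    label, is_decoy, _ = min(templates, key=lambda tpl: distance(tpl[2]))
    # A decoy as top choice means the input is treated as out of vocabulary.
    return None if is_decoy else label


# Hypothetical templates: two vocabulary words and one decoy.
TEMPLATES = [
    ("yes", False, [1.0, 0.0]),
    ("no", False, [0.0, 1.0]),
    ("<cough>", True, [5.0, 5.0]),
]
```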

6.

Invention application
Spoken language proficiency assessment by computer (Under examination, published)

Publication number: US20070033017A1

Publication date: 2007-02-08

Application number: US11490290

Filing date: 2006-07-20

Applicants: Anish Nair, Matthew Lennig, Brent Townshend

Inventors: Anish Nair, Matthew Lennig, Brent Townshend

IPC classification: G10L19/00

CPC classification: G09B19/06, G09B5/00, G09B7/00, G09B17/003, G10L15/26

Abstract: A system and method for spoken language proficiency assessment by a computer is described. A user provides a spoken response to a constructed response question. A speech recognition system processes the spoken response into a sequence of linguistic units. At training time, features matching a linguistic template are extracted by identifying matches between a training sequence of linguistic units and pre-selected templates. Additionally, a generalized count of the extracted features is computed. At runtime, linguistic features are detected by comparing a runtime sequence of linguistic units to the feature set extracted at training time. This comparison results in a generalized count of linguistic features. The generalized count is then used to compute a score.

7.

Granted invention patent
Distributed voice web architecture and associated components and methods (In force)

Publication number: US06785653B1

Publication date: 2004-08-31

Application number: US09561680

Filing date: 2000-05-01

Applicants: James E. White, Matthew Lennig

Inventors: James E. White, Matthew Lennig

IPC classification: G10L21/00

CPC classification: G10L25/87, G10L15/30, H04M3/4938

Abstract: A speech-enabled distributed processing system forming a Voice Web includes a gateway, one or more voice content sites coupled to the gateway over a wide area network, and a browser coupled to the gateway over a network, which may or may not be the wide area network. The gateway receives telephone calls from one or more users over telephony connections and performs endpointing of speech of each user. The browser provides the gateway with information enabling the gateway to selectively direct the endpointed speech to a voice content site via the wide area network. The gateway outputs the endpointed speech in the form of application protocol requests onto the wide area network to the appropriate site, as specified by the browser, or to the browser. The gateway receives prompts in the form of application protocol responses from the browser or a voice content site and plays the prompts to the appropriate user over the telephony connection. While accessing a selected voice content site, the gateway reroutes the endpointed speech to the browser if the endpointing result represents a hotword candidate.

8.

Granted invention patent
Speech recognition method using a two-pass search (Expired)

Publication number: US5515475A

Publication date: 1996-05-07

Application number: US80543

Filing date: 1993-06-24

Applicants: Vishwa N. Gupta, Matthew Lennig

Inventors: Vishwa N. Gupta, Matthew Lennig

IPC classification: G10L15/14, G10L15/08, G10L5/06, G10L9/00

CPC classification: G10L15/08, G10L15/142

Abstract: A method of recognizing speech comprises searching a vocabulary of words for a match to an unknown utterance. Words in the vocabulary are represented by concatenated allophone models and the vocabulary is represented as a network. On a first pass of the search, a one-state duration constrained model is used to search the vocabulary network. The one-state model has as its transition probability the maximum observed transitional probability (model distance) of the unknown utterance for the corresponding allophone model. Words having top scores are chosen from the first pass search and, in a second pass of the search, rescored using a full Viterbi trellis with the complete allophone models and model distances. The rescores are sorted to provide a few top choices. Using a second set of speech parameters these few top choices are again rescored. Comparison of the scores using each set of speech parameters determines a recognition choice. Post processing is also possible to further enhance recognition accuracy. Test results indicate that the two-pass search provides approximately the same recognition accuracy as a full Viterbi search of the vocabulary network.
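The shortlist-then-rescore structure of the two-pass search can be sketched generically. The sketch does not implement the one-state duration-constrained model or the Viterbi trellis; `fast_score` and `full_score` are caller-supplied placeholders for them, and the function name, the "higher is better" convention, and the sample scores are assumptions.

```python
def two_pass_search(vocabulary, fast_score, full_score, shortlist_size=3):
    """Shortlist words with a cheap score, then rescore them fully.

    fast_score, full_score: functions mapping a word to a float, where a
        higher value is a better match (an assumption of this sketch).
    Returns the shortlisted words sorted by full score, best first.
    """
    # First pass: cheap scoring over the whole vocabulary.
    shortlist = sorted(vocabulary, key=fast_score, reverse=True)[:shortlist_size]
    # Second pass: expensive scoring only on the survivors.
    return sorted(shortlist, key=full_score, reverse=True)


# Hypothetical scores for a four-word vocabulary.
VOCAB = ["alpha", "beta", "gamma", "delta"]
FAST = {"alpha": 0.9, "beta": 0.8, "gamma": 0.1, "delta": 0.7}
FULL = {"alpha": 0.5, "beta": 0.95, "gamma": 0.99, "delta": 0.6}
```

In this sample, "gamma" has the best full score but is pruned in the first pass, which shows the trade-off: the two-pass result matches a full search only when the cheap score rarely discards the true best word.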

9.

Granted invention patent
Method and apparatus for training speech recognition algorithms for directory assistance applications (Expired)

Publication number: US5488652A

Publication date: 1996-01-30

Application number: US227830

Filing date: 1994-04-14

Applicants: Gregory J. Bielby, Vishwa N. Gupta, Lauren C. Hodgson, Matthew Lennig, R. Douglas Sharp, Hans A. Wasmeier

Inventors: Gregory J. Bielby, Vishwa N. Gupta, Lauren C. Hodgson, Matthew Lennig, R. Douglas Sharp, Hans A. Wasmeier

IPC classification: G06F17/30, G10L15/00, G10L15/06, G10L15/14, G10L15/22, H04M3/42, H04M3/493, H04M3/50, H04M3/51, H04M3/60, H04Q3/545, H04Q3/72

CPC classification: G10L15/063, G10L15/22, H04M3/4931, G10L15/142, G10L25/24, H04M2201/40, H04M2242/22, H04M3/42093, H04M3/42102, H04M3/51, H04Q3/72

Abstract: In methods and apparatus for at least partially automating a telephone directory assistance function, directory assistance callers are prompted to speak locality or called entity names associated with desired directory listings. A speech recognition algorithm is applied to speech signals received in response to prompting to determine spoken locality or called entity names. Desired telephone numbers are released to callers, and released telephone numbers are used to confirm or correct at least some of the recognized locality or called entity names. Speech signal representations labelled with the confirmed or corrected names are used as labelled speech tokens to refine prior training of the speech recognition algorithm. The training refinement automatically adjusts for deficiencies in prior training of the speech recognition algorithm and for long-term changes in the speech patterns of directory assistance callers served by a particular directory assistance installation. The methods can be generalized to other speech recognition applications.

10.

Granted invention patent
Method and apparatus for automation of directory assistance using speech recognition (Expired)

Publication number: US5479488A

Publication date: 1995-12-26

Application number: US193522

Filing date: 1994-02-08

Applicants: Matthew Lennig, Robert D. Sharp, Gregory J. Bielby

Inventors: Matthew Lennig, Robert D. Sharp, Gregory J. Bielby

IPC classification: H04M3/493, H04M3/64, G10L9/06

CPC classification: H04M3/4931, H04M2201/40

Abstract: In a telecommunications system, automatic directory assistance uses a voice processing unit comprising a lexicon of lexemes and data representing a predetermined relationship between each lexeme and calling numbers in a locality served by the automated directory assistance apparatus. The voice processing unit issues messages to a caller making a directory assistance call to prompt the caller to utter a required one of said lexemes. The unit detects the calling number originating a directory assistance call and, responsive to the calling number and the relationship data, computes a probability index representing the likelihood of a lexeme being the subject of the directory assistance call. The unit employs a speech recognizer to recognize, on the basis of the acoustics of the caller's utterance and the probability index, a lexeme corresponding to that uttered by the caller.
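One common way to combine acoustic evidence with a caller-dependent prior is to add log scores, which the following sketch illustrates. The abstract does not state how the probability index and the acoustics are combined, so the additive log rule, the function name `recognize_lexeme`, and all sample values are assumptions.

```python
import math


def recognize_lexeme(acoustic_log_likelihood, prior_by_lexeme):
    """Pick the lexeme maximizing acoustic log-likelihood plus log prior.

    acoustic_log_likelihood: dict lexeme -> log P(audio | lexeme).
    prior_by_lexeme: dict lexeme -> probability index for this caller,
        assumed here to be a probability in (0, 1].
    """
    def score(lexeme):
        return acoustic_log_likelihood[lexeme] + math.log(prior_by_lexeme[lexeme])

    return max(acoustic_log_likelihood, key=score)


# Hypothetical scores: the acoustics slightly favor "Kensington", but the
# caller's number makes "Kingston" far more likely a priori.
ACOUSTIC = {"Kingston": -10.0, "Kensington": -9.5}
PRIOR_FOR_CALLER = {"Kingston": 0.6, "Kensington": 0.05}
UNIFORM_PRIOR = {"Kingston": 0.5, "Kensington": 0.5}
```

With `PRIOR_FOR_CALLER` the sketch returns "Kingston"; with `UNIFORM_PRIOR` the acoustics alone decide and it returns "Kensington".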
