System and method for building and evaluating automatic speech recognition via an application programmer interface
    1.
    Granted patent
    System and method for building and evaluating automatic speech recognition via an application programmer interface (status: in force)

    Publication number: US09484018B2

    Publication date: 2016-11-01

    Application number: US12952829

    Filing date: 2010-11-23

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for building an automatic speech recognition system through an Internet API. A network-based automatic speech recognition server configured to practice the method receives feature streams, transcriptions, and parameter values as inputs from a network client independent of knowledge of internal operations of the server. The server processes the inputs to train an acoustic model and a language model, and transmits the acoustic model and the language model to the network client. The server can also generate a log describing the processing and transmit the log to the client. On the server side, a human expert can intervene to modify how the server processes the inputs. The inputs can include an additional feature stream generated from speech by algorithms in the client's proprietary feature extraction.

    SYSTEM AND METHOD FOR BUILDING AND EVALUATING AUTOMATIC SPEECH RECOGNITION VIA AN APPLICATION PROGRAMMER INTERFACE
    2.
    Patent application
    SYSTEM AND METHOD FOR BUILDING AND EVALUATING AUTOMATIC SPEECH RECOGNITION VIA AN APPLICATION PROGRAMMER INTERFACE (status: in force)

    Publication number: US20120130709A1

    Publication date: 2012-05-24

    Application number: US12952829

    Filing date: 2010-11-23

    IPC classification: G10L19/00 G06F9/46

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for building an automatic speech recognition system through an Internet API. A network-based automatic speech recognition server configured to practice the method receives feature streams, transcriptions, and parameter values as inputs from a network client independent of knowledge of internal operations of the server. The server processes the inputs to train an acoustic model and a language model, and transmits the acoustic model and the language model to the network client. The server can also generate a log describing the processing and transmit the log to the client. On the server side, a human expert can intervene to modify how the server processes the inputs. The inputs can include an additional feature stream generated from speech by algorithms in the client's proprietary feature extraction.

    System and method for speech recognition modeling for mobile voice search
    3.
    Granted patent
    System and method for speech recognition modeling for mobile voice search (status: in force)

    Publication number: US09558738B2

    Publication date: 2017-01-31

    Application number: US13042671

    Filing date: 2011-03-08

    IPC classification: G10L15/00 G10L15/06 G10L15/14

    CPC classification: G10L15/063 G10L15/14

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating an acoustic model for use in speech recognition. A system configured to practice the method first receives training data and identifies non-contextual lexical-level features in the training data. Then the system infers sentence-level features from the training data and generates a set of decision trees by node-splitting based on the non-contextual lexical-level features and the sentence-level features. The system decorrelates training vectors, based on the training data, for each decision tree in the set of decision trees to approximate full-covariance Gaussian models, and then can train an acoustic model for use in speech recognition based on the training data, the set of decision trees, and the training vectors.
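
The abstract does not say which decorrelating transform is applied. As a minimal sketch only, the Python below shows one generic option, whitening with a Cholesky factor of the sample covariance. After this transform the vectors have identity covariance, so a diagonal-covariance Gaussian fitted in the transformed space corresponds to a full-covariance Gaussian in the original space. All function names are hypothetical, and in the setting the abstract describes a transform of this kind would presumably be estimated separately for each decision tree.

```python
import math


def mean_vector(vectors):
    """Per-dimension mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def covariance(vectors, mu):
    """Sample covariance matrix (divides by n) around the mean mu."""
    n, d = len(vectors), len(mu)
    cov = [[0.0] * d for _ in range(d)]
    for v in vectors:
        c = [x - m for x, m in zip(v, mu)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += c[i] * c[j] / n
    return cov


def cholesky(a):
    """Lower-triangular factor low with low @ low.T == a (a must be positive definite)."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def decorrelate(vectors):
    """Center the vectors and whiten them so their covariance becomes the identity."""
    mu = mean_vector(vectors)
    low = cholesky(covariance(vectors, mu))
    out = []
    for v in vectors:
        c = [x - m for x, m in zip(v, mu)]
        y = []
        for i in range(len(c)):  # forward substitution: solve low @ y = c
            y.append((c[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
        out.append(y)
    return out


# Illustrative, made-up training vectors with strongly correlated dimensions.
vectors = [[1.0, 2.0], [2.0, 4.5], [3.0, 5.5], [4.0, 8.5]]
whitened = decorrelate(vectors)
print(covariance(whitened, mean_vector(whitened)))
```

The printed matrix should be the identity up to floating-point error, which is what allows the off-diagonal covariance terms to be dropped in the transformed space.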

    SYSTEM AND METHOD FOR SPEECH RECOGNITION MODELING FOR MOBILE VOICE SEARCH
    4.
    Patent application
    SYSTEM AND METHOD FOR SPEECH RECOGNITION MODELING FOR MOBILE VOICE SEARCH (status: in force)

    Publication number: US20120232902A1

    Publication date: 2012-09-13

    Application number: US13042671

    Filing date: 2011-03-08

    IPC classification: G10L15/06

    CPC classification: G10L15/063 G10L15/14

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating an acoustic model for use in speech recognition. A system configured to practice the method first receives training data and identifies non-contextual lexical-level features in the training data. Then the system infers sentence-level features from the training data and generates a set of decision trees by node-splitting based on the non-contextual lexical-level features and the sentence-level features. The system decorrelates training vectors, based on the training data, for each decision tree in the set of decision trees to approximate full-covariance Gaussian models, and then can train an acoustic model for use in speech recognition based on the training data, the set of decision trees, and the training vectors.

    Navigation route updates
    5.
    Granted patent
    Navigation route updates (status: in force)

    Publication number: US08825374B2

    Publication date: 2014-09-02

    Application number: US13489020

    Filing date: 2012-06-05

    IPC classification: G01C21/34 G01C21/00 G01C21/36

    Abstract: Concepts and technologies are disclosed herein for providing navigation routes and/or providing navigation route updates. According to various embodiments of the concepts and technologies disclosed herein, a navigation application can be configured to obtain route data from a routing service. The routing service can be configured to use navigation data locally stored and/or obtained from a number of sources to generate navigation routes and/or to update navigation routes. The generated and/or updated navigation routes can be provided to the user device as route data that can be used to provide navigation directions to a user.

    NAVIGATION ROUTE UPDATES
    6.
    Patent application
    NAVIGATION ROUTE UPDATES (status: in force)

    Publication number: US20130325320A1

    Publication date: 2013-12-05

    Application number: US13489020

    Filing date: 2012-06-05

    IPC classification: G01C21/34

    Abstract: Concepts and technologies are disclosed herein for providing navigation routes and/or providing navigation route updates. According to various embodiments of the concepts and technologies disclosed herein, a navigation application can be configured to obtain route data from a routing service. The routing service can be configured to use navigation data locally stored and/or obtained from a number of sources to generate navigation routes and/or to update navigation routes. The generated and/or updated navigation routes can be provided to the user device as route data that can be used to provide navigation directions to a user.

    SYSTEM AND METHOD FOR COMBINING FRAME AND SEGMENT LEVEL PROCESSING, VIA TEMPORAL POOLING, FOR PHONETIC CLASSIFICATION
    7.
    Patent application
    SYSTEM AND METHOD FOR COMBINING FRAME AND SEGMENT LEVEL PROCESSING, VIA TEMPORAL POOLING, FOR PHONETIC CLASSIFICATION (status: in force)

    Publication number: US20130103402A1

    Publication date: 2013-04-25

    Application number: US13281102

    Filing date: 2011-10-25

    IPC classification: G10L15/04 G10L15/00

    CPC classification: G10L15/02 G10L15/08 G10L15/16

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for combining frame and segment level processing, via temporal pooling, for phonetic classification. A frame processor unit receives an input and extracts the time-dependent features from the input. A plurality of pooling interface units generates a plurality of feature vectors based on pooling the time-dependent features and selecting a plurality of time-dependent features according to a plurality of selection strategies. Next, a plurality of segmental classification units generates scores for the feature vectors. Each segmental classification unit (SCU) can be dedicated to a specific pooling interface unit (PIU) to form a PIU-SCU combination. Multiple PIU-SCU combinations can be further combined to form an ensemble of combinations, and the ensemble can be diversified by varying the pooling operations used by the PIU-SCU combinations. Based on the scores, the plurality of segmental classification units selects a class label and returns a result.
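
The pipeline in this abstract can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the patented implementation: the pooling operations (mean, max, per-third mean), the linear scorer standing in for a segmental classification unit, the summing of scores across the ensemble, and every name, weight, and class label below are hypothetical. It shows how variable-length frame features can be pooled into fixed-length segment vectors, scored by one classifier per pooling operation, and combined to pick a class label.

```python
from statistics import fmean


def pool_mean(frames):
    """Average each feature dimension over all frames in the segment."""
    return [fmean(col) for col in zip(*frames)]


def pool_max(frames):
    """Take the maximum of each feature dimension over the segment."""
    return [max(col) for col in zip(*frames)]


def pool_thirds_mean(frames):
    """Average each third of the segment separately and concatenate the results."""
    n = len(frames)
    bounds = [0, n // 3, 2 * n // 3, n]
    out = []
    for start, end in zip(bounds, bounds[1:]):
        region = frames[start:end] or frames  # fall back for very short segments
        out.extend(pool_mean(region))
    return out


def linear_scores(weights, vector):
    """Stand-in for a segmental classification unit: one dot product per class."""
    return {label: sum(w * x for w, x in zip(ws, vector))
            for label, ws in weights.items()}


def classify(frames, combos):
    """Sum the scores of every (pooling, classifier) pair and return the best label."""
    totals = {}
    for pool, weights in combos:
        for label, score in linear_scores(weights, pool(frames)).items():
            totals[label] = totals.get(label, 0.0) + score
    return max(totals, key=totals.get)


# Three frames of two made-up, time-dependent features each.
frames = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]

# Each pooling operation is paired with its own classifier weights (made up).
combos = [
    (pool_mean, {"aa": [1.0, 0.0], "iy": [0.0, 0.5]}),
    (pool_max, {"aa": [0.5, 0.0], "iy": [0.0, 0.2]}),
    (pool_thirds_mean, {"aa": [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
                        "iy": [0.0, 0.0, 0.0, 0.0, 0.0, 0.1]}),
]

print(classify(frames, combos))  # prints "aa"
```

Pairing each pooling function with its own weights mirrors the one-to-one PIU-SCU combinations in the abstract, and adding or swapping pooling functions in `combos` is the sketch's analogue of diversifying the ensemble.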

    System and method for combining frame and segment level processing, via temporal pooling, for phonetic classification
    8.
    Granted patent
    System and method for combining frame and segment level processing, via temporal pooling, for phonetic classification (status: in force)

    Publication number: US08886533B2

    Publication date: 2014-11-11

    Application number: US13281102

    Filing date: 2011-10-25

    IPC classification: G10L15/08 G10L15/16 G10L15/02

    CPC classification: G10L15/02 G10L15/08 G10L15/16

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for combining frame and segment level processing, via temporal pooling, for phonetic classification. A frame processor unit receives an input and extracts the time-dependent features from the input. A plurality of pooling interface units generates a plurality of feature vectors based on pooling the time-dependent features and selecting a plurality of time-dependent features according to a plurality of selection strategies. Next, a plurality of segmental classification units generates scores for the feature vectors. Each segmental classification unit (SCU) can be dedicated to a specific pooling interface unit (PIU) to form a PIU-SCU combination. Multiple PIU-SCU combinations can be further combined to form an ensemble of combinations, and the ensemble can be diversified by varying the pooling operations used by the PIU-SCU combinations. Based on the scores, the plurality of segmental classification units selects a class label and returns a result.
