专利检索 ipc:G10L15/06 第 7 页

61.

发明公开
RECURRENT NEURAL NETWORK LEARNING METHOD, COMPUTER PROGRAM FOR SAME, AND VOICE RECOGNITION DEVICE 审中-公开
标题翻译：再现的神经网络学习方法，计算机程序及语音识别装置

公开(公告)号：EP3296930A1

公开(公告)日：2018-03-21

申请号：EP16792676.5

申请日：2016-05-10

申请人： National Institute of Information and Communication Technology

发明人： KANDA, Naoyuki

IPC分类号： G06N3/08 , G06N3/04 , G10L15/06 , G10L15/16

CPC分类号： G06N3/08 , G06N3/04 , G06N3/0445 , G06N3/084 , G10L15/06 , G10L15/063 , G10L15/16 , G10L2015/0635

摘要： [Object]
An object is to provide a training method of improving training of a recurrent neural network (RNN) using time-sequential data.
[Solution]
The training method includes a step 220 of initializing the RNN, and a training step 226 of training the RNN by designating a certain vector as a start position and optimizing various parameters to minimize error function. The training step 226 includes: an updating step 250 of updating RNN parameters through Truncated BPTT using consecutive N (N≥3) vectors having a designated vector as a start point and using a reference value of a tail vector as a correct label; and a first repetition step 240 of repeating the process of executing the training step by newly designating a vector at a position satisfying a prescribed relation with the tail of N vectors used at the updating step until an end condition is satisfied. The vector at a position satisfying the prescribed relation is positioned at least two vectors behind the designated vector.

摘要翻译： 本发明的目的是提供一种使用时间序列数据改进递归神经网络（RNN）的训练的训练方法。解决方案训练方法包括初始化RNN的步骤220和训练步骤226，通过指定某个向量作为开始位置并优化各种参数以最小化误差函数来训练RNN。训练步骤226包括：更新步骤250，使用具有指定向量的连续N（N≥3）个向量作为起点并使用尾向量的参考值作为正确标签，通过截断BPTT更新RNN参数; 以及第一重复步骤240，通过在满足与在更新步骤中使用的N个向量的尾部的规定关系的位置重新指定向量直到满足结束条件来重复执行训练步骤的处理。满足规定关系的位置处的矢量位于指定矢量后面的至少两个矢量中。

62.

发明公开
AUTOMATIC ACCENT DETECTION 审中-公开

公开(公告)号：EP3286756A1

公开(公告)日：2018-02-28

申请号：EP15784191

申请日：2015-09-30

申请人： APPLE INC

发明人： NALLASAMY UDHYAKUMAR , KAJAREKAR SACHIN , PAULIK MATTHIAS , SEIGEL MATTHEW

IPC分类号： G10L15/06 , G10L25/51

CPC分类号： G10L15/07 , G10L15/063 , G10L15/10 , G10L15/16 , G10L25/51 , G10L2015/0631

摘要： Systems and processes for automatic accent detection are provided. In accordance with one example, a method includes, at an electronic device with one or more processors and memory, receiving a user input, determining a first similarity between a representation of the user input and a first acoustic model of a plurality of acoustic models, and determining a second similarity between the representation of the user input and a second acoustic model of the plurality of acoustic models. The method further includes determining whether the first similarity is greater than the second similarity. In accordance with a determination that the first similarity is greater than the second similarity, the first acoustic model may be selected; and in accordance with a determination that the first similarity is not greater than the second similarity, the second acoustic model may be selected.

63.

发明公开
OBFUSCATING TRAINING DATA 审中-公开
标题翻译：培训数据

公开(公告)号：EP3262634A1

公开(公告)日：2018-01-03

申请号：EP15710122.1

申请日：2015-02-26

申请人： Longsand Limited

发明人： BETLEY, Abigail , PYE, David , TEFERA ASEFA, Milky

IPC分类号： G10L15/06

CPC分类号： G10L15/063 , G06F17/241 , G10L15/02 , G10L15/26 , G10L2015/0631

摘要： Examples disclosed herein involve obfuscating training data. An example method includes computing a sequence of acoustic features from audio data of training data, the training data comprising the audio data and a corresponding text transcript; mapping the acoustic features to acoustic model states to generate annotated feature vectors, the annotated feature vectors comprising the acoustic features and corresponding context from the text transcript; and providing a randomized sequence of the annotated feature vectors as obfuscated training data to an audio analysis system.

64.

发明公开
DEPLOYED END-TO-END SPEECH RECOGNITION 审中-公开
标题翻译：部署的端对端语音识别

公开(公告)号：EP3245652A1

公开(公告)日：2017-11-22

申请号：EP16869294.5

申请日：2016-11-23

申请人： Baidu USA LLC

发明人： CATANZARO, Bryan , CHEN, Jingdong , CHRZANOWSKI, Mike , ELSEN, Erich , ENGEL, Jesse , FOUGNER, Christopher , HAN, Xu , HANNUN, Awni , PRENGER, Ryan , SATHEESH, Sanjeev , SENGUPTA, Shubhabrata , YOGATAMA, Dani , WANG, Chong , ZHAN, Jun , ZHU, Zhenyao , AMODEI, Dario

IPC分类号： G10L15/16 , G10L15/06 , G10L15/08 , G10L15/183

CPC分类号： G10L15/16 , G06N3/0445 , G06N3/084 , G10L15/02 , G10L15/063 , G10L15/14 , G10L15/183 , G10L15/197 , G10L25/18 , G10L25/21 , G10L2015/0635

摘要： Embodiments of end-to-end deep learning systems and methods are disclosed to recognize speech of vastly different languages, such as English or Mandarin Chinese. In embodiments, the entire pipelines of hand-engineered components are replaced with neural networks, and the end-to-end learning allows handling a diverse variety of speech including noisy environments, accents, and different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.

65.

发明授权
MIXED SPEECH SIGNALS RECOGNITION 有权
标题翻译：混合语音信号识别

公开(公告)号：EP3123466B1

公开(公告)日：2017-11-15

申请号：EP15714120.1

申请日：2015-03-19

申请人： Microsoft Technology Licensing, LLC

发明人： YU, Dong , WENG, Chao , SELTZER, Michael L. , DROPPO, James

IPC分类号： G10L15/16 , G10L15/06 , G10L15/20

CPC分类号： G10L15/16 , G10L15/063 , G10L15/20 , G10L17/18 , G10L25/21 , G10L25/84 , G10L25/90

摘要： The claimed subject matter includes a system and method for recognizing mixed speech from a source. The method includes training a first neural network to recognize the speech signal spoken by the speaker with a higher level of a speech characteristic from a mixed speech sample. The method also includes training a second neural network to recognize the speech signal spoken by the speaker with a lower level of the speech characteristic from the mixed speech sample. Additionally, the method includes decoding the mixed speech sample with the first neural network and the second neural network by optimizing the joint likelihood of observing the two speech signals considering the probability that a specific frame is a switching point of the speech characteristic.

66.

发明授权
DETECTING SPEECH INPUT PHRASE CONFUSION RISK 有权
标题翻译：检测语音输入相位混乱风险

公开(公告)号：EP3066663B1

公开(公告)日：2017-08-16

申请号：EP14799316.6

申请日：2014-11-05

申请人： Microsoft Technology Licensing, LLC

发明人： TJALVE, Michael , KARNAM, Pavan , MOONEY, Dennis

IPC分类号： G10L15/01 , G10L15/187 , G10L15/06

CPC分类号： G10L15/19 , G10L15/01 , G10L15/187 , G10L25/48 , G10L25/78 , G10L2015/0635 , G10L2025/786

67.

发明公开
TRAINING METHOD AND APPARATUS FOR LANGUAGE MODEL, AND DEVICE 审中-公开
标题翻译： SPRACHMODELLTRAININGSVERFAHREN SOWIE VORRICHTUNG UND EINRICHUNG

公开(公告)号：EP3179473A4

公开(公告)日：2017-07-12

申请号：EP16762948

申请日：2016-06-06

申请人： LE HOLDINGS (BEIJING) CO LTD , LE SHI ZHI XIN ELECTRONIC TECH (TIANJIN) LTD

发明人： YAN ZHIYONG

IPC分类号： G10L15/183 , G10L15/06

CPC分类号： G10L15/063 , G10L15/06 , G10L15/183 , G10L15/197 , G10L2015/0633 , G10L2015/0635

摘要： The present disclosure provides a language model training method and apparatus and a device. The method includes: obtaining a universal language model in an offline training mode, and clipping the universal language model to obtain a clipped language model; obtaining a log language model of logs within a preset time period in an online training mode; fusing the clipped language model with the log language model to obtain a first fusion language model used for carrying out first time decoding; and fusing the universal language model with the log language model to obtain a second fusion language model used for carrying out second time decoding. The method is used for solving the problem that a language model obtained offline in the prior art has poor coverage on new corpora, resulting in a reduced language recognition rate.

摘要翻译： 本公开提供了一种语言模型训练方法和装置及设备。该方法包括：在离线训练模式下获得通用语言模型，裁剪通用语言模型以获得裁剪语言模型; 在在线训练模式下获取预设时间段内的日志的日志语言模型; 将剪裁的语言模型与日志语言模型融合以获得用于执行第一次解码的第一融合语言模型; 以及将通用语言模型与日志语言模型融合以获得用于执行第二次解码的第二融合语言模型。该方法用于解决现有技术离线获取的语言模型对新语料的覆盖率较差，导致语言识别率降低的问题。

68.

发明公开
AUTHENTICATION METHOD, TERMINAL AND COMPUTER STORAGE MEDIUM BASED ON VOICEPRINT CHARACTERISTIC 审中-公开
标题翻译：基于语音特征的认证方法，终端和计算机存储介质

公开(公告)号：EP3185162A1

公开(公告)日：2017-06-28

申请号：EP15833692.5

申请日：2015-04-28

申请人： ZTE Corporation

发明人： LIU, Xueqin

IPC分类号： G06F21/32 , G10L15/02 , G10L15/06

CPC分类号： G06F21/32 , G10L17/04

摘要： A secure authentication method based on a voiceprint characteristic, the method comprising: upon receiving a voice acquisition instruction, a terminal acquires to-be-measured voice data recorded by a user; extracting a voiceprint characteristic of the to-be-measured voice data to obtain voiceprint characteristic information; and according to the currently extracted voiceprint characteristic information and pre-stored voiceprint characteristic information, authenticating the identity of the current user. Also disclosed are a corresponding terminal and computer storage medium.

摘要翻译： 一种基于声纹特征的安全认证方法，所述方法包括：终端接收到语音获取指令时，获取用户记录的待测量语音数据; 提取待测量语音数据的声纹特征，得到声纹特征信息; 根据当前提取的声纹特征信息和预先存储的声纹特征信息，对当前用户的身份进行认证。还公开了相应的终端和计算机存储介质。

69.

发明公开
SESSION CONTEXT MODELING FOR CONVERSATIONAL UNDERSTANDING SYSTEMS 有权
标题翻译：对话理解系统的会话上下文建模

公开(公告)号：EP3158559A1

公开(公告)日：2017-04-26

申请号：EP15736702.0

申请日：2015-06-17

申请人： Microsoft Technology Licensing, LLC

发明人： AKBACAK, Murat , HAKKANI-TUR, Dilek Z. , TUR, Gokhan , HECK, Larry P.

IPC分类号： G10L15/06 , G06F17/30 , G10L15/183

CPC分类号： G06F17/2836 , G06F17/2818 , G06F17/30766 , G10L15/06 , G10L15/183 , G10L2015/227

摘要： Systems and methods are provided for improving language models for speech recognition by adapting knowledge sources utilized by the language models to session contexts. A knowledge source, such as a knowledge graph, is used to capture and model dynamic session context based on user interaction information from usage history, such as session logs, that is mapped to the knowledge source. From sequences of user interactions, higher level intent sequences may be determined and used to form models that anticipate similar intents but with different arguments including arguments that do not necessarily appear in the usage history. In this way, the session context models may be used to determine likely next interactions or “turns” from a user, given a previous turn or turns. Language models corresponding to the likely next turns are then interpolated and provided to improve recognition accuracy of the next turn received from the user.

摘要翻译： 通过将语言模型使用的知识源适配到会话上下文中，提供了用于改善语音识别的语言模型的系统和方法。知识源（例如知识图）用于基于映射到知识源的使用历史记录（例如会话日志）中的用户交互信息来捕获和建模动态会话上下文。根据用户交互的序列，可以确定更高级别的意图序列并且将其用于形成预测类似意图但具有不同参数的模型，所述参数包括不一定出现在使用历史中的参数。以这种方式，会话上下文模型可以被用于确定可能的下一个交互或者在给定之前的转向或转向时从用户“转向”。然后插入并提供对应于可能的下一个回合的语言模型，以提高从用户接收到的下一个回合的识别准确度。

70.

发明公开
KNOWLEDGE SOURCE PERSONALIZATION TO IMPROVE LANGUAGE MODELS 审中-公开
标题翻译： WISSENSQUELLENPERSONALISIERUNG ZUR VERBESSERUNG VON SPRACHMODELLEN

公开(公告)号：EP3143522A1

公开(公告)日：2017-03-22

申请号：EP15728256.7

申请日：2015-05-15

申请人： Microsoft Technology Licensing, LLC

发明人： AKBACAK, Murat , HAKKANI-TUR, Dilek Z. , TUR, Gokhan , HECK, Larry P. , DUMOULIN, Benoit

IPC分类号： G06F17/30 , G10L15/06

CPC分类号： G10L15/18 , G06F17/30654 , G06F17/30734 , G06F17/30766 , G10L15/07

摘要： Systems and methods are provided for improving language models for speech recognition by personalizing knowledge sources utilized by the language models to specific users or user-population characteristics. A knowledge source, such as a knowledge graph, is personalized for a particular user by mapping entities or user actions from usage history for the user, such as query logs, to the knowledge source. The personalized knowledge source may be used to build a personal language model by training a language model with queries corresponding to entities or entity pairs that appear in usage history. In some embodiments, a personalized knowledge source for a specific user can be extended based on personalized knowledge sources of similar users.

摘要翻译： 提供了系统和方法，用于通过将语言模型所使用的知识源个人化为特定用户或用户群体特征来改进用于语音识别的语言模型。通过将实体或用户操作与用户的使用历史（例如查询日志）映射到知识源，为特定用户个性化知识源。个性化知识源可以用于通过训练具有对应于出现在使用历史中的实体或实体对的查询的语言模型来构建个人语言模型。在一些实施例中，可以基于类似用户的个性化知识源来扩展用于特定用户的个性化知识源。

搜索结果

国家/区域

专利有效性

申请日

公布(公告)日

申请人

申请人所在国/区域

发明人

IPC

IPC部

IPC大类

IPC小类

IPC大组

IPC小组

外观分类