Patent search ap:("GOOGLE INC.") AND inv:"Johan Schalkwyk" Page 1

1.

Patent application
RE-RECOGNIZING SPEECH WITH EXTERNAL DATA SOURCES (under examination, published)

Publication number: US20170301352A1

Publication date: 2017-10-19

Application number: US15637526

Filing date: 2017-06-29

Applicant: Google Inc.

Inventor: Trevor D. Strohman, Johan Schalkwyk, Gleb Skobeltsyn

IPC: G10L15/32, G10L15/22, G10L15/19, G10L25/51, G10L15/02

CPC classification number: G10L15/32, G10L15/02, G10L15/183, G10L15/19, G10L15/22, G10L15/26, G10L25/51, G10L2015/025

Abstract: Methods, including computer programs encoded on a computer storage medium, for improving speech recognition based on external data sources. In one aspect, a method includes obtaining an initial candidate transcription of an utterance using an automated speech recognizer and identifying, based on a language model that is not used by the automated speech recognizer in generating the initial candidate transcription, one or more terms that are phonetically similar to one or more terms that do occur in the initial candidate transcription. Additional actions include generating one or more additional candidate transcriptions based on the identified one or more terms and selecting a transcription from among the candidate transcriptions.
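
The steps in this abstract (find phonetically similar terms from a source the recognizer did not use, generate extra candidates, select one) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the crude `phonetic_key` function, the `EXTERNAL_VOCAB` scores standing in for the external language model, and the summed-score selection rule.

```python
from itertools import product

# Hypothetical external language model: term -> score. Stands in for a data
# source (for example a contact list) that the recognizer did not use.
EXTERNAL_VOCAB = {"john": 1.5, "jon": 0.2}

def phonetic_key(word):
    """Crude stand-in for a phonetic form: drop 'h' and collapse doubled letters."""
    out = []
    for ch in word.lower():
        if ch == "h" or (out and out[-1] == ch):
            continue
        out.append(ch)
    return "".join(out)

def similar_terms(term, vocab):
    """External terms that share a phonetic key with `term`."""
    key = phonetic_key(term)
    return [w for w in vocab if w != term and phonetic_key(w) == key]

def expand_candidates(initial, vocab):
    """Initial transcription plus every variant built from similar terms."""
    options = [[t] + similar_terms(t, vocab) for t in initial.split()]
    return [" ".join(words) for words in product(*options)]

def select_transcription(candidates, vocab):
    """Pick the candidate with the highest summed external score (assumed rule)."""
    return max(candidates, key=lambda c: sum(vocab.get(w, 0.0) for w in c.split()))

candidates = expand_candidates("call jon smith", EXTERNAL_VOCAB)
best = select_transcription(candidates, EXTERNAL_VOCAB)
print(candidates, best)
```

A real system would compare pronunciations from a lexicon rather than spelling-based keys, and would combine the external score with the recognizer's own score.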

2.

Patent application
NEURAL NETWORK FOR KEYBOARD INPUT DECODING (under examination, published)

Publication number: US20170199665A1

Publication date: 2017-07-13

Application number: US15473010

Filing date: 2017-03-29

Applicant: Google Inc.

Inventor: Shumin Zhai, Thomas Breuel, Ouais Alsharif, Yu Ouyang, Francoise Beaufays, Johan Schalkwyk

IPC: G06F3/0488, G06N3/08, G06F3/0482

CPC classification number: G06F3/04886, G06F3/0219, G06F3/0233, G06F3/0237, G06F3/0482, G06F3/04883, G06F3/04895, G06F17/273, G06F17/276, G06F17/2765, G06N3/0445, G06N3/08

Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.

3.

Granted patent
Keyword detection without decoding (in force)

Publication number: US09378733B1

Publication date: 2016-06-28

Application number: US13860982

Filing date: 2013-04-11

Applicant: Google Inc.

Inventor: Vincent O. Vanhoucke, Oriol Vinyals, Patrick An Phu Nguyen, Maria Carolina Parada San Martin, Johan Schalkwyk

IPC: G10L17/24, G06F21/46, G10L15/08, G10L15/22, G10L25/51

CPC classification number: G10L15/08, G10L15/02, G10L2015/088

Abstract: Embodiments pertain to automatic speech recognition in mobile devices to establish the presence of a keyword. An audio waveform is received at a mobile device. Front-end feature extraction is performed on the audio waveform, followed by acoustic modeling, high level feature extraction, and output classification to detect the keyword. Acoustic modeling may use a neural network or a vector quantization dictionary and high level feature extraction may use pooling.
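
The four stages named in this abstract (front-end features, acoustic modeling, pooled high-level features, output classification) can be sketched as a toy Python pipeline. This is a sketch under strong assumptions and not the patented method: the frame-energy feature, the one-dimensional `codebook` playing the role of a vector quantization dictionary, average pooling, and the linear `weights` and `threshold` are all hypothetical choices.

```python
def frame_energies(waveform, frame_size):
    """Front-end features: mean absolute amplitude of each full frame."""
    return [
        sum(abs(x) for x in waveform[i:i + frame_size]) / frame_size
        for i in range(0, len(waveform) - frame_size + 1, frame_size)
    ]

def quantize(features, codebook):
    """Acoustic modeling with a VQ dictionary: index of the nearest codeword."""
    return [
        min(range(len(codebook)), key=lambda k: abs(codebook[k] - f))
        for f in features
    ]

def pool(codes, codebook_size):
    """High-level features by average pooling: share of frames per codeword."""
    return [codes.count(k) / len(codes) for k in range(codebook_size)]

def detect_keyword(waveform, codebook, weights, threshold, frame_size=4):
    """Output classification: thresholded linear score over pooled features."""
    features = frame_energies(waveform, frame_size)
    if not features:  # waveform shorter than one frame
        return False
    pooled = pool(quantize(features, codebook), len(codebook))
    return sum(w * p for w, p in zip(weights, pooled)) >= threshold

loud = [1.0, -1.0, 1.0, -1.0] * 3
quiet = [0.0] * 12
print(detect_keyword(loud, [0.0, 1.0], [0.0, 1.0], 0.5))
print(detect_keyword(quiet, [0.0, 1.0], [0.0, 1.0], 0.5))
```

No word sequence is decoded at any stage, which is the point the title makes; the abstract also allows a neural network in place of the quantization step.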

4.

Patent application
DETERMINING HOTWORD SUITABILITY (in force)

Publication number: US20160133259A1

Publication date: 2016-05-12

Application number: US15002044

Filing date: 2016-01-20

Applicant: Google Inc.

Inventor: Andrew E. Rubin, Johan Schalkwyk, Maria Carolina Parada San Martin

IPC: G10L17/24, G06F21/46, G06F21/32, G10L15/22, G10L15/08

CPC classification number: G10L17/24, G06F21/32, G06F21/46, G10L15/06, G10L15/08, G10L15/22, G10L25/51, G10L2015/0638, G10L2015/088, G10L2015/225

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user; evaluating the speech data or a transcription of the candidate hotword using one or more predetermined criteria; generating a hotword suitability score for the candidate hotword based on that evaluation; and providing a representation of the hotword suitability score for display to the user.
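
As a rough illustration of scoring a candidate hotword against predetermined criteria, the Python sketch below evaluates a transcription only. The two criteria (phrase length and avoidance of very common words), their equal weighting, and the `COMMON_WORDS` list are assumptions invented for this example; the abstract does not say which criteria are used.

```python
# Hypothetical list of words judged too common to make a good hotword.
COMMON_WORDS = {"the", "a", "ok", "yes", "no", "hi"}

def hotword_suitability(transcription):
    """Return a score in [0, 1]; higher means a more suitable hotword."""
    words = transcription.lower().split()
    if not words:
        return 0.0
    # Assumed criterion 1: longer phrases are harder to trigger by accident.
    letters = sum(len(w) for w in words)
    length_score = min(letters / 12.0, 1.0)
    # Assumed criterion 2: everyday words make poor hotwords.
    rarity_score = sum(w not in COMMON_WORDS for w in words) / len(words)
    return round(0.5 * length_score + 0.5 * rarity_score, 2)

def render_score(score):
    """Representation of the score for display to the user."""
    return f"Hotword suitability: {score:.0%}"

for candidate in ("hi", "hey computer wake up"):
    print(candidate, "->", render_score(hotword_suitability(candidate)))
```

A production scorer would presumably also evaluate the speech data itself, for example pronunciation consistency across repetitions, which this sketch leaves out.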

5.

Granted patent
Virtual participant-based real-time translation and transcription system for audio and video teleconferences (in force)

Publication number: US09292500B2

Publication date: 2016-03-22

Application number: US14486312

Filing date: 2014-09-15

Applicant: Google Inc.

Inventor: Jakob David Uszkoreit, Ashish Venugopal, Johan Schalkwyk, Joshua James Estelle

IPC: G10L21/00, G10L25/00, G06F17/28, G10L15/00, H04M3/56

CPC classification number: G06F17/289, G10L15/005, H04M3/568, H04N7/155

Abstract: The present disclosure describes a teleconferencing system that may use a virtual participant processor to translate language content of the teleconference into each participant's spoken language without additional user inputs. The virtual participant processor may connect to the teleconference as do the other participants. All text or audio data that was previously exchanged between the participants may now be intercepted by the virtual participant processor. Upon obtaining a partial or complete language recognition result or making a language preference determination, the virtual participant processor may call a translation engine appropriate for each of the participants. The virtual participant processor may send the resulting translation to a teleconference management processor. The teleconference management processor may deliver the respective translated text or audio data to the appropriate participant.
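
The message flow in this abstract (intercept, translate per participant, hand off for delivery) can be sketched in a few lines of Python. The names `toy_translate`, `virtual_participant`, and `deliver` are hypothetical, and the translation engine is a stub that only tags the text; the sketch covers text messages and ignores speech recognition and audio.

```python
def toy_translate(text, source, target):
    """Stub translation engine (assumption): tags text instead of translating."""
    if source == target:
        return text
    return f"[{source}->{target}] {text}"

def virtual_participant(message, speaker, languages, translate=toy_translate):
    """Intercept one message and build a translation for every other participant."""
    source = languages[speaker]
    return {
        listener: translate(message, source, target)
        for listener, target in languages.items()
        if listener != speaker
    }

def deliver(translations, inboxes):
    """Teleconference management step: route each translation to its recipient."""
    for listener, text in translations.items():
        inboxes.setdefault(listener, []).append(text)
    return inboxes

languages = {"ana": "es", "bo": "en", "chen": "zh"}
inboxes = deliver(virtual_participant("hello", "bo", languages), {})
print(inboxes)
```

Keeping translation and delivery in separate functions mirrors the split between the virtual participant processor and the teleconference management processor described above.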

6.

Patent application
LANGUAGE MODELING OF COMPLETE LANGUAGE SEQUENCES (in force)

Publication number: US20140278407A1

Publication date: 2014-09-18

Application number: US13875406

Filing date: 2013-05-02

Applicant: Google Inc.

Inventor: Ciprian I. Chelba, Hasim Sak, Johan Schalkwyk

IPC: G10L15/26

CPC classification number: G10L15/063, G10L15/197

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language modeling of complete language sequences. Training data indicating language sequences is accessed, and counts for a number of times each language sequence occurs in the training data are determined. A proper subset of the language sequences is selected, and a first component of a language model is trained. The first component includes first probability data for assigning scores to the selected language sequences. A second component of the language model is trained based on the training data, where the second component includes second probability data for assigning scores to language sequences that are not included in the selected language sequences. Adjustment data that normalizes the second probability data with respect to the first probability data is generated, and the first component, the second component, and the adjustment data are stored.
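
One way to read this abstract is as a two-part model: whole-sequence relative frequencies for a selected subset, a fallback model for everything else, and a scaling factor that makes the two sum to one. The Python sketch below follows that reading. The selection rule (`min_count`), the unigram fallback with an end marker, and the exact form of the adjustment are assumptions chosen for illustration, not details confirmed by the source.

```python
import math
from collections import Counter

EOS = "</s>"  # end-of-sequence marker for the fallback model (assumed)

def second_score(second, sequence):
    """Second component: unigram probability of the words plus EOS."""
    return math.prod(second.get(w, 0.0) for w in sequence.split() + [EOS])

def train_model(training_sequences, min_count=2):
    counts = Counter(training_sequences)
    total = sum(counts.values())
    # Proper subset: sequences seen at least `min_count` times (assumed rule).
    selected = {s for s, c in counts.items() if c >= min_count}
    # First component: relative frequency of each selected complete sequence.
    first = {s: counts[s] / total for s in selected}
    # Second component: unigram probabilities over words and EOS.
    word_counts = Counter(
        w for s in training_sequences for w in s.split() + [EOS]
    )
    n_words = sum(word_counts.values())
    second = {w: c / n_words for w, c in word_counts.items()}
    # Adjustment data: scale the second component to the probability mass
    # that the first component leaves over.
    covered = sum(second_score(second, s) for s in selected)
    adjustment = (1.0 - sum(first.values())) / (1.0 - covered)
    return {"first": first, "second": second, "adjustment": adjustment}

def score(model, sequence):
    """Use the first component when it covers the sequence, else the second."""
    if sequence in model["first"]:
        return model["first"][sequence]
    return model["adjustment"] * second_score(model["second"], sequence)

data = ["new york"] * 3 + ["san jose", "new jersey"]
model = train_model(data)
print(score(model, "new york"), score(model, "san jose"))
```

With this toy data only "new york" is selected, so it receives its relative frequency directly, while "san jose" is scored by the rescaled fallback.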

7.

Granted patent
Context-based speech recognition (in force)

Publication number: US09311915B2

Publication date: 2016-04-12

Application number: US14030265

Filing date: 2013-09-18

Applicant: Google Inc.

Inventor: Eugene Weinstein, Pedro J. Mengibar, Johan Schalkwyk

IPC: G10L21/00, G10L15/16

CPC classification number: G10L15/16

Abstract: A processing system receives an audio signal encoding a portion of an utterance. The processing system receives context information associated with the utterance, wherein the context information is not derived from the audio signal or any other audio signal. The processing system provides, as input to a neural network, data corresponding to the audio signal and the context information, and generates a transcription for the utterance based on at least an output of the neural network.

8.

Granted patent
Multi-modal input on an electronic device (in force)

Publication number: US09251791B2

Publication date: 2016-02-02

Application number: US14299837

Filing date: 2014-06-09

Applicant: Google Inc.

Inventor: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson, Michael J. LeBeau

IPC: G06F17/20, G10L15/26, G06F17/28, G10L15/30, G10L15/18, G10L15/183, G10L15/197

CPC classification number: G06F3/167, G06F3/04886, G06F17/277, G06F17/289, G10L15/005, G10L15/18, G10L15/183, G10L15/197, G10L15/22, G10L15/26, G10L15/265, G10L15/30, G10L2015/223, G10L2015/228

Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.

9.

Granted patent
Methods and systems for sharing of adapted voice profiles (in force)

Publication number: US09117451B2

Publication date: 2015-08-25

Application number: US13872401

Filing date: 2013-04-29

Applicant: Google Inc.

Inventor: Javier Gonzalvo Fructuoso, Johan Schalkwyk

IPC: G10L17/00, G10L15/28, G10L13/04, H04L29/08, G10L13/10, G10L13/033, G10L15/07

CPC classification number: G10L15/28, G10L13/033, G10L13/04, G10L13/10, G10L15/07, H04L67/306, H04M1/00, H04M1/578

Abstract: Methods and systems for sharing of adapted voice profiles are provided. The method may comprise receiving, at a computing system, one or more speech samples, and the one or more speech samples may include a plurality of spoken utterances. The method may further comprise determining, at the computing system, a voice profile associated with a speaker of the plurality of spoken utterances, and including an adapted voice of the speaker. Still further, the method may comprise receiving, at the computing system, an authorization profile associated with the determined voice profile, and the authorization profile may include one or more user identifiers associated with one or more respective users. Yet still further, the method may comprise the computing system providing the voice profile to at least one computing device associated with the one or more respective users, based at least in part on the authorization profile.
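
The sharing step at the end of this abstract amounts to an access check: a voice profile is provided only to devices belonging to users named in the authorization profile. The Python sketch below shows that check. The `VoiceProfile` and `AuthorizationProfile` classes, their fields, and the `devices_by_user` registry are hypothetical, and the adapted voice is a placeholder byte string rather than a real model.

```python
from dataclasses import dataclass, field

@dataclass
class VoiceProfile:
    speaker_id: str
    adapted_voice: bytes  # placeholder for adapted voice data

@dataclass
class AuthorizationProfile:
    # Identifiers of users allowed to receive the associated voice profile.
    user_ids: set = field(default_factory=set)

def share_profile(profile, authorization, devices_by_user):
    """Map each device of an authorized user to the profile it should receive."""
    deliveries = {}
    for user_id in sorted(authorization.user_ids):
        for device in devices_by_user.get(user_id, []):
            deliveries[device] = profile
    return deliveries

profile = VoiceProfile("speaker-1", b"adapted-voice-data")
authorization = AuthorizationProfile({"alice"})
devices = {"alice": ["alice-phone"], "bob": ["bob-laptop"]}
print(list(share_profile(profile, authorization, devices)))
```

Users missing from the authorization profile, such as "bob" here, receive nothing, which matches the abstract's condition that sharing is based at least in part on that profile.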

10.

Patent application
CONTEXT-BASED SPEECH RECOGNITION (in force)

Publication number: US20150039299A1

Publication date: 2015-02-05

Application number: US14030265

Filing date: 2013-09-18

Applicant: Google Inc.

Inventor: Eugene Weinstein, Pedro J. Moreno Mengibar, Johan Schalkwyk

IPC: G10L15/16

CPC classification number: G10L15/16

Abstract: A processing system receives an audio signal encoding a portion of an utterance. The processing system receives context information associated with the utterance, wherein the context information is not derived from the audio signal or any other audio signal. The processing system provides, as input to a neural network, data corresponding to the audio signal and the context information, and generates a transcription for the utterance based on at least an output of the neural network.
