Methods and Systems for Sharing of Adapted Voice Profiles
    11.
    Patent application
    Status: in force

    Publication No.: US20140236598A1

    Publication date: 2014-08-21

    Application No.: US13872401

    Filing date: 2013-04-29

    Applicant: Google Inc.

    Abstract: Methods and systems for sharing of adapted voice profiles are provided. The method may comprise receiving, at a computing system, one or more speech samples, and the one or more speech samples may include a plurality of spoken utterances. The method may further comprise determining, at the computing system, a voice profile associated with a speaker of the plurality of spoken utterances, and including an adapted voice of the speaker. Still further, the method may comprise receiving, at the computing system, an authorization profile associated with the determined voice profile, and the authorization profile may include one or more user identifiers associated with one or more respective users. Yet still further, the method may comprise the computing system providing the voice profile to at least one computing device associated with the one or more respective users, based at least in part on the authorization profile.
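The sharing step in this abstract can be pictured with a minimal sketch. The class names, the `adapted_voice` dictionary, and the `devices_by_user` mapping below are all hypothetical stand-ins chosen for illustration; the abstract does not specify any data layout. The sketch only shows the gating idea: a voice profile is provided to a device when the device's user appears in the authorization profile.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceProfile:
    speaker_id: str
    # Hypothetical stand-in for adapted voice parameters.
    adapted_voice: dict


@dataclass
class AuthorizationProfile:
    # User identifiers allowed to receive the associated voice profile.
    authorized_user_ids: set = field(default_factory=set)


def share_voice_profile(profile, authorization, devices_by_user):
    """Return {device_id: profile} for devices owned by authorized users only."""
    deliveries = {}
    for user_id, device_ids in devices_by_user.items():
        if user_id in authorization.authorized_user_ids:
            for device_id in device_ids:
                deliveries[device_id] = profile
    return deliveries


profile = VoiceProfile("alice", {"pitch_shift": 1.1})
auth = AuthorizationProfile({"bob"})
devices = {"bob": ["bob-phone"], "carol": ["carol-laptop"]}
shared = share_voice_profile(profile, auth, devices)
```

Here only `bob-phone` receives the profile, because `carol` is not listed in the authorization profile.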


    Neural network for keyboard input decoding

    Publication No.: US10248313B2

    Publication date: 2019-04-02

    Application No.: US15473010

    Filing date: 2017-03-29

    Applicant: Google Inc.

    Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.

    SERVER SIDE HOTWORDING
    13.
    Patent application

    Publication No.: US20180233150A1

    Publication date: 2018-08-16

    Application No.: US15432358

    Filing date: 2017-02-14

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting hotwords using a server. One of the methods includes receiving an audio signal encoding one or more utterances including a first utterance; determining whether at least a portion of the first utterance satisfies a first threshold of being at least a portion of a key phrase; in response to determining that at least the portion of the first utterance satisfies the first threshold of being at least a portion of a key phrase, sending the audio signal to a server system that determines whether the first utterance satisfies a second threshold of being the key phrase, the second threshold being more restrictive than the first threshold; and receiving tagged text data representing the one or more utterances encoded in the audio signal when the server system determines that the first utterance satisfies the second threshold.
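The two-threshold flow described in this abstract might be sketched as follows. The numeric scores, the default threshold values, and the shape of the returned "tagged text" dictionary are assumptions made for illustration; the abstract only states that the server-side threshold is more restrictive than the client-side one.

```python
def handle_utterance(client_score, server_score, transcript,
                     first_threshold=0.5, second_threshold=0.9):
    """Simulate client-side screening followed by server-side confirmation.

    Scores are hypothetical confidences in [0, 1] that the utterance
    contains the key phrase. Returns tagged text, or None if either
    stage rejects the utterance.
    """
    if second_threshold <= first_threshold:
        raise ValueError("second threshold must be more restrictive than the first")
    # Stage 1 (client): a permissive check; audio is forwarded only on success.
    if client_score < first_threshold:
        return None
    # Stage 2 (server): a stricter check on the forwarded audio.
    if server_score < second_threshold:
        return None
    # Hypothetical tagged text data returned to the client.
    return {"hotword_detected": True, "text": transcript}


rejected_on_client = handle_utterance(0.3, 0.95, "ok play music")
rejected_on_server = handle_utterance(0.6, 0.70, "ok play music")
accepted = handle_utterance(0.6, 0.95, "ok play music")
```

The permissive first stage keeps the on-device check cheap, while the stricter second stage filters out false accepts before any tagged text is returned.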

    Language modeling of complete language sequences

    Publication No.: US09786269B2

    Publication date: 2017-10-10

    Application No.: US13875406

    Filing date: 2013-05-02

    Applicant: Google Inc.

    CPC classification number: G10L15/063 G10L15/197

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language modeling of complete language sequences. Training data indicating language sequences is accessed, and counts for a number of times each language sequence occurs in the training data are determined. A proper subset of the language sequences is selected, and a first component of a language model is trained. The first component includes first probability data for assigning scores to the selected language sequences. A second component of the language model is trained based on the training data, where the second component includes second probability data for assigning scores to language sequences that are not included in the selected language sequences. Adjustment data that normalizes the second probability data with respect to the first probability data is generated, and the first component, the second component, and the adjustment data are stored.
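A toy sketch of the two-component idea follows. It assumes the first component is a relative-frequency table over whole sequences seen at least `min_count` times, and that the second component is a unigram word model with an end-of-sequence token; both choices, and all names, are illustrative assumptions rather than details taken from the abstract. The adjustment factor rescales the second component so that the probability mass left over by the first component is spread across the unselected sequences.

```python
import math
from collections import Counter


def train_model(sequences, min_count=2):
    counts = Counter(sequences)
    total = sum(counts.values())
    # Select frequently occurring sequences; this must be a proper subset.
    selected = {s for s, c in counts.items() if c >= min_count}
    if selected == set(counts):
        raise ValueError("selected sequences must be a proper subset")
    # First component: whole-sequence relative frequencies.
    first = {s: counts[s] / total for s in selected}

    # Second component (assumed): unigram model with an end token "</s>".
    word_counts = Counter()
    for s in sequences:
        word_counts.update(s.split() + ["</s>"])
    n_words = sum(word_counts.values())
    unigram = {w: c / n_words for w, c in word_counts.items()}

    def second_prob(seq):
        return math.prod(unigram.get(w, 0.0) for w in seq.split() + ["</s>"])

    # Adjustment: mass not used by the first component, divided by the mass
    # the second component assigns to sequences outside the selected set.
    first_mass = sum(first.values())
    second_mass_on_selected = sum(second_prob(s) for s in selected)
    adjustment = (1.0 - first_mass) / (1.0 - second_mass_on_selected)
    return first, second_prob, adjustment


def score(seq, model):
    first, second_prob, adjustment = model
    if seq in first:
        return first[seq]
    return adjustment * second_prob(seq)


model = train_model(["a b", "a b", "a b", "c", "b a"])
```

With this data only `"a b"` is selected, so `score("a b", model)` comes from the first component and every other sequence is scored by the rescaled second component.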

    Disambiguation of a spoken query term

    Publication No.: US09418177B1

    Publication date: 2016-08-16

    Application No.: US13958740

    Filing date: 2013-08-05

    Applicant: Google Inc.

    CPC classification number: G06F17/30976 G10L15/197 G10L15/265

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing spoken query terms. In one aspect, a method includes performing speech recognition on an audio signal to select two or more textual, candidate transcriptions that match a spoken query term, and to establish a speech recognition confidence value for each candidate transcription, obtaining a search history for a user who spoke the spoken query term, where the search history references one or more past search queries that have been submitted by the user, generating one or more n-grams from each candidate transcription, where each n-gram is a subsequence of n phonemes, syllables, letters, characters, words or terms from a respective candidate transcription, and determining, for each n-gram, a frequency with which the n-gram occurs in the past search queries, and a weighting value that is based on the respective frequency.
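The n-gram weighting described in this abstract can be illustrated with a small sketch. It assumes word-level n-grams up to length 2, a `log1p` weighting of history frequency, and a combined score equal to the recognizer confidence multiplied by one plus the summed weights. The abstract stops at computing a weighting value per n-gram, so the way candidates are combined and ranked here is an assumption.

```python
import math
from collections import Counter


def word_ngrams(text, n):
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def history_counts(past_queries, max_n=2):
    """Count how often each word n-gram occurs in the user's past queries."""
    counts = Counter()
    for query in past_queries:
        for n in range(1, max_n + 1):
            counts.update(word_ngrams(query, n))
    return counts


def rescore(candidates, past_queries, max_n=2):
    """Rank (transcription, confidence) pairs using search-history n-grams."""
    counts = history_counts(past_queries, max_n)
    rescored = []
    for text, confidence in candidates:
        weight = 0.0
        for n in range(1, max_n + 1):
            for gram in word_ngrams(text, n):
                # Frequency-based weight; unseen n-grams contribute zero.
                weight += math.log1p(counts[gram])
        rescored.append((text, confidence * (1.0 + weight)))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


candidates = [("whether in boston", 0.55), ("weather in boston", 0.45)]
history = ["weather tomorrow", "boston weather forecast"]
ranked = rescore(candidates, history)
```

Although the recognizer slightly prefers "whether in boston", the history contains "weather" twice, so the rescored list places "weather in boston" first.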

    Methods and systems for sharing of adapted voice profiles

    Publication No.: US09318104B1

    Publication date: 2016-04-19

    Application No.: US14796245

    Filing date: 2015-07-10

    Applicant: Google Inc.

    Abstract: Methods and systems for sharing of adapted voice profiles are provided. The method may comprise receiving, at a computing system, one or more speech samples, and the one or more speech samples may include a plurality of spoken utterances. The method may further comprise determining, at the computing system, a voice profile associated with a speaker of the plurality of spoken utterances, and including an adapted voice of the speaker. Still further, the method may comprise receiving, at the computing system, an authorization profile associated with the determined voice profile, and the authorization profile may include one or more user identifiers associated with one or more respective users. Yet still further, the method may comprise the computing system providing the voice profile to at least one computing device associated with the one or more respective users, based at least in part on the authorization profile.

    VIRTUAL PARTICIPANT-BASED REAL-TIME TRANSLATION AND TRANSCRIPTION SYSTEM FOR AUDIO AND VIDEO TELECONFERENCES
    18.
    Patent application
    Status: in force

    Publication No.: US20150006144A1

    Publication date: 2015-01-01

    Application No.: US14486312

    Filing date: 2014-09-15

    Applicant: GOOGLE INC.

    CPC classification number: G06F17/289 G10L15/005 H04M3/568 H04N7/155

    Abstract: The present disclosure describes a teleconferencing system that may use a virtual participant processor to translate language content of the teleconference into each participant's spoken language without additional user inputs. The virtual participant processor may connect to the teleconference as do the other participants. The virtual participant processor may intercept all text or audio data that was previously exchanged between the participants. Upon obtaining a partial or complete language recognition result or making a language preference determination, the virtual participant processor may call a translation engine appropriate for each of the participants. The virtual participant processor may send the resulting translation to a teleconference management processor. The teleconference management processor may deliver the respective translated text or audio data to the appropriate participant.


    Natural language correction for speech input
    20.
    Granted patent
    Status: in force

    Publication No.: US09483459B1

    Publication date: 2016-11-01

    Application No.: US13799767

    Filing date: 2013-03-13

    Applicant: Google Inc.

    Abstract: A system is configured to receive a first string corresponding to an interpretation of a natural-language user voice entry; provide a representation of the first string as feedback to the natural-language user voice entry; receive, based on the feedback, a second string corresponding to a natural-language corrective user entry, where the natural-language corrective user entry may correspond to a correction to the natural-language user voice entry; parse the second string into one or more tokens; determine at least one corrective instruction from the one or more tokens of the second string; generate, from at least a portion of each of the first and second strings and based on the at least one corrective instruction, candidate corrected user entries; select a corrected user entry from the candidate corrected user entries; and output the selected, corrected user entry.
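The parse, generate, and select steps in this abstract could look roughly like the sketch below. The recognized phrasings ("change X to Y", "replace X with Y", "I meant Y"), the one-candidate-per-position generation, and the character-similarity selection via `difflib.SequenceMatcher` are all illustrative assumptions; the abstract does not say how instructions are recognized or how candidates are ranked.

```python
import difflib
import re


def tokenize(text):
    return re.findall(r"[\w']+", text.lower())


def parse_instruction(second):
    """Parse a corrective entry into (old_words or None, new_words)."""
    tokens = tokenize(second)
    if tokens[:1] in (["change"], ["replace"]):
        for sep in ("to", "with"):
            if sep in tokens[2:]:
                i = tokens.index(sep, 2)
                return tokens[1:i], tokens[i + 1:]
    if tokens[:2] == ["i", "meant"]:
        return None, tokens[2:]
    return None, tokens


def generate_candidates(first, old, new):
    """Build (corrected_entry, similarity) pairs from both strings."""
    words = tokenize(first)
    span = len(old) if old else 1
    results = []
    for i in range(len(words) - span + 1):
        replaced = words[i:i + span]
        if old and replaced != old:
            continue  # An explicit target only matches its own position.
        similarity = difflib.SequenceMatcher(
            None, " ".join(replaced), " ".join(new)).ratio()
        results.append((" ".join(words[:i] + new + words[i + span:]), similarity))
    return results


def correct(first, second):
    old, new = parse_instruction(second)
    options = generate_candidates(first, old, new)
    if not options:
        return first
    # Select the candidate whose replaced span sounds closest to the fix.
    return max(options, key=lambda option: option[1])[0]


implicit = correct("call john", "I meant joan")
explicit = correct("call john", "change john to joan")
```

Both calls yield "call joan": the explicit form names the word to replace, while the implicit form relies on "john" being more similar to "joan" than "call" is. Output is lowercased because the sketch tokenizes in lowercase.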

