Keyword detection without decoding
    11.
    Granted patent
    Keyword detection without decoding (status: in force)

    Publication number: US09378733B1

    Publication date: 2016-06-28

    Application number: US13860982

    Filing date: 2013-04-11

    Applicant: Google Inc.

    CPC classification number: G10L15/08 G10L15/02 G10L2015/088

    Abstract: Embodiments pertain to automatic speech recognition in mobile devices to establish the presence of a keyword. An audio waveform is received at a mobile device. Front-end feature extraction is performed on the audio waveform, followed by acoustic modeling, high level feature extraction, and output classification to detect the keyword. Acoustic modeling may use a neural network or a vector quantization dictionary and high level feature extraction may use pooling.
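    The four stages named in the abstract can be sketched as a toy pipeline. This is a minimal illustration under stated assumptions, not the patented method: the front-end features (log energy and zero-crossing rate), the two-entry codebook, the linear classifier, and every function name below are hypothetical stand-ins chosen for brevity. The vector quantization option is used for acoustic modeling and max pooling for the high level features.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a waveform into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def front_end_features(frame):
    """Toy front end (assumption): log energy and zero-crossing rate."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return (math.log(energy + 1e-10), crossings / (len(frame) - 1))

def quantize(feature, codebook):
    """Acoustic modeling stand-in: one-hot code of the nearest codebook entry."""
    dists = [sum((f - c) ** 2 for f, c in zip(feature, entry))
             for entry in codebook]
    best = dists.index(min(dists))
    return [1.0 if i == best else 0.0 for i in range(len(codebook))]

def max_pool(codes):
    """High level features: maximum over time of each code dimension."""
    return [max(column) for column in zip(*codes)]

def classify(pooled, weights, bias):
    """Output classification: linear score thresholded at zero."""
    score = sum(w * p for w, p in zip(weights, pooled)) + bias
    return score > 0.0

def detect_keyword(samples, codebook, weights, bias, frame_len=8, hop=4):
    """Run the four stages in order; no decoder or word lattice is involved."""
    frames = frame_signal(samples, frame_len, hop)
    codes = [quantize(front_end_features(f), codebook) for f in frames]
    return classify(max_pool(codes), weights, bias)

# Hypothetical codebook: entry 0 resembles silence, entry 1 a loud, busy frame.
CODEBOOK = [(-23.0, 0.0), (0.0, 0.5)]
WEIGHTS, BIAS = [0.0, 1.0], -0.5

silence = [0.0] * 32
burst = [0.0] * 16 + [1.0, -1.0] * 8
print(detect_keyword(silence, CODEBOOK, WEIGHTS, BIAS))
print(detect_keyword(burst, CODEBOOK, WEIGHTS, BIAS))
```

    Pooling over time is what lets the classifier see a fixed-size input regardless of utterance length, which is why no decoding pass is needed in this sketch.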

    PROCESSING IMAGES USING DEEP NEURAL NETWORKS
    12.
    Patent application
    PROCESSING IMAGES USING DEEP NEURAL NETWORKS (status: in force)

    Publication number: US20160063359A1

    Publication date: 2016-03-03

    Application number: US14839452

    Filing date: 2015-08-28

    Applicant: GOOGLE INC.

    CPC classification number: G06K9/66 G06N3/0454 G06N3/063 G06N3/084

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image processing using deep neural networks. One of the methods includes receiving data characterizing an input image; processing the data characterizing the input image using a deep neural network to generate an alternative representation of the input image, wherein the deep neural network comprises a plurality of subnetworks, wherein the subnetworks are arranged in a sequence from lowest to highest, and wherein processing the data characterizing the input image using the deep neural network comprises processing the data through each of the subnetworks in the sequence; and processing the alternative representation of the input image through an output layer to generate an output from the input image.
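    The control flow described in the abstract (an ordered sequence of subnetworks followed by an output layer) can be sketched as below. This is a minimal illustration, assuming each subnetwork is a small stack of dense layers with ReLU activations; the abstract does not specify what a subnetwork contains, and the helper names `dense`, `subnetwork`, and `run_network` are hypothetical.

```python
def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in vector]

def dense(weights, bias):
    """Build a fully connected layer from a weight matrix and a bias vector."""
    def apply(vector):
        return [sum(w * x for w, x in zip(row, vector)) + b
                for row, b in zip(weights, bias)]
    return apply

def subnetwork(*layers):
    """Group layers into one subnetwork; ReLU after each layer (assumption)."""
    def apply(vector):
        for layer in layers:
            vector = relu(layer(vector))
        return vector
    return apply

def run_network(image_data, subnetworks, output_layer):
    """Pass the input through each subnetwork from lowest to highest,
    then map the resulting alternative representation to an output."""
    representation = image_data
    for sub in subnetworks:
        representation = sub(representation)
    return output_layer(representation)

# Tiny hypothetical network: two subnetworks and a two-unit output layer.
lowest = subnetwork(dense([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))
highest = subnetwork(dense([[2.0, 1.0]], [0.5]))
output_layer = dense([[1.0], [-1.0]], [0.0, 0.0])

print(run_network([1.0, -2.0], [lowest, highest], output_layer))
```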

    ASYNCHRONOUS OPTIMIZATION FOR SEQUENCE TRAINING OF NEURAL NETWORKS
    13.
    Patent application
    ASYNCHRONOUS OPTIMIZATION FOR SEQUENCE TRAINING OF NEURAL NETWORKS (status: in force)

    Publication number: US20150127337A1

    Publication date: 2015-05-07

    Application number: US14258139

    Filing date: 2014-04-22

    Applicant: Google Inc.

    CPC classification number: G10L15/063 G06N3/0454 G10L15/16 G10L15/183

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
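    The pattern in the abstract (several model replicas, each obtaining its own batch and the current parameters, then producing optimized parameters) resembles asynchronous gradient descent against a shared parameter store. The sketch below illustrates only that pattern, under loud assumptions: a single scalar parameter replaces the neural network, a squared-error loss replaces any sequence-level training criterion, and `ParameterStore`, `worker`, and `train_async` are hypothetical names not taken from the patent.

```python
import threading

class ParameterStore:
    """Shared parameter holder; replicas fetch and update it independently."""
    def __init__(self, w):
        self._w = w
        self._lock = threading.Lock()

    def fetch(self):
        with self._lock:
            return self._w

    def apply(self, grad, lr):
        # The gradient may be stale: another replica can update between
        # this replica's fetch and its apply.
        with self._lock:
            self._w -= lr * grad

def gradient(w, batch):
    """Gradient of the mean squared error of the toy model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def worker(store, batch, steps, lr):
    """One replica: obtain parameters, compute an update on its own batch."""
    for _ in range(steps):
        w = store.fetch()
        store.apply(gradient(w, batch), lr)

def train_async(batches, steps=200, lr=0.05, w0=0.0):
    """Run one replica per batch concurrently and return the final parameter."""
    store = ParameterStore(w0)
    threads = [threading.Thread(target=worker, args=(store, b, steps, lr))
               for b in batches]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.fetch()

# Two hypothetical batches drawn from y = 3 * x.
first_batch = [(1.0, 3.0), (2.0, 6.0)]
second_batch = [(0.5, 1.5), (1.5, 4.5)]
print(round(train_async([first_batch, second_batch]), 6))
```

    The replicas never wait for each other; the lock only protects individual reads and writes of the shared value, so the order of updates varies from run to run while the result still converges for this toy problem.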

    Asynchronous optimization for sequence training of neural networks

    Publication number: US10019985B2

    Publication date: 2018-07-10

    Application number: US14258139

    Filing date: 2014-04-22

    Applicant: Google Inc.

    CPC classification number: G10L15/063 G06N3/0454 G10L15/16 G10L15/183

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
