-
Publication No.: US20170092297A1
Publication date: 2017-03-30
Application No.: US14986985
Filing date: 2016-01-04
Applicant: Google Inc.
Inventor: Tara N. Sainath, Gabor Simko, Maria Carolina Parada San Martin
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting voice activity. In one aspect, a method includes the actions of receiving, by a neural network included in an automated voice activity detection system, a raw audio waveform; processing, by the neural network, the raw audio waveform to determine whether the audio waveform includes speech; and providing, by the neural network, a classification of the raw audio waveform indicating whether the raw audio waveform includes speech.
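The receive, process, classify flow in this abstract can be illustrated with a toy example. This is a minimal sketch only, not the network disclosed in the application: the abstract does not specify an architecture, so the single convolutional layer, the max-pooling step, the logistic output, every weight value, and the function names `conv1d`, `speech_probability`, and `includes_speech` are all assumptions made for illustration.

```python
import math


def conv1d(samples, kernel):
    """Valid-mode 1-D cross-correlation applied directly to raw samples."""
    k = len(kernel)
    return [
        sum(samples[i + j] * kernel[j] for j in range(k))
        for i in range(len(samples) - k + 1)
    ]


def speech_probability(samples, kernels, out_weights, out_bias):
    """Map a raw waveform to a probability that it contains speech.

    Hypothetical toy network: one convolutional layer with ReLU,
    max-pooling over time, then a logistic output unit.
    """
    features = []
    for kernel in kernels:
        activations = [max(0.0, v) for v in conv1d(samples, kernel)]  # ReLU
        features.append(max(activations))  # max-pool over time
    logit = sum(w * f for w, f in zip(out_weights, features)) + out_bias
    return 1.0 / (1.0 + math.exp(-logit))


def includes_speech(samples, kernels, out_weights, out_bias, threshold=0.5):
    """Return the classification: True if the waveform is judged to be speech."""
    prob = speech_probability(samples, kernels, out_weights, out_bias)
    return prob >= threshold


# Hand-picked, untrained weights (assumptions, for illustration only).
KERNELS = [[1.0, -1.0], [0.5, 0.5]]
OUT_WEIGHTS = [2.0, 1.0]
OUT_BIAS = -1.0

silence = [0.0] * 8
tone = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]

print(includes_speech(silence, KERNELS, OUT_WEIGHTS, OUT_BIAS))
print(includes_speech(tone, KERNELS, OUT_WEIGHTS, OUT_BIAS))
```

The point of the sketch is that the network takes raw samples as input, with no hand-engineered spectral features computed beforehand. A real system would learn the kernels and output weights from labeled audio.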
-
Publication No.: US20190035390A1
Publication date: 2019-01-31
Application No.: US15659016
Filing date: 2017-07-25
Applicant: Google Inc.
Inventor: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classification using neural networks. One method includes: receiving audio data corresponding to an utterance; obtaining a transcription of the utterance; generating a representation of the audio data; generating a representation of the transcription of the utterance; providing (i) the representation of the audio data and (ii) the representation of the transcription of the utterance to a classifier that, based on a given representation of the audio data and a given representation of the transcription of the utterance, is trained to output an indication of whether the utterance associated with the given representation is likely directed to an automated assistant or is likely not directed to an automated assistant; receiving, from the classifier, an indication of whether the utterance corresponding to the received audio data is likely directed to the automated assistant or is likely not directed to the automated assistant; and selectively instructing the automated assistant based at least on the indication of whether the utterance corresponding to the received audio data is likely directed to the automated assistant or is likely not directed to the automated assistant.
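The steps listed in this abstract (two representations, one classifier, a selective instruction) can be sketched as follows. This is an illustrative toy under stated assumptions, not the claimed implementation: the abstract does not say how either representation is computed or how the classifier is built, so the energy and peak audio features, the bag-of-words transcript features over a made-up `VOCAB`, the logistic classifier, all weight values, and every function name here are hypothetical.

```python
import math

# Hypothetical vocabulary for the toy transcript representation.
VOCAB = ["what", "play", "set", "weather", "timer"]


def audio_representation(samples):
    """Toy audio representation: mean energy and peak amplitude."""
    n = max(len(samples), 1)
    energy = sum(s * s for s in samples) / n
    peak = max((abs(s) for s in samples), default=0.0)
    return [energy, peak]


def transcript_representation(text):
    """Toy transcript representation: word counts over VOCAB."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


def directed_probability(audio_rep, text_rep, weights, bias):
    """Logistic classifier over the concatenated representations."""
    joint = audio_rep + text_rep
    logit = sum(w * x for w, x in zip(weights, joint)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def handle_utterance(samples, transcript, weights, bias, threshold=0.5):
    """Selectively instruct the assistant based on the classifier output."""
    prob = directed_probability(
        audio_representation(samples),
        transcript_representation(transcript),
        weights,
        bias,
    )
    return "respond" if prob >= threshold else "ignore"


# Hand-picked, untrained parameters (assumptions, for illustration only).
WEIGHTS = [0.5, 0.5, 1.5, 1.5, 1.5, 1.5, 1.5]
BIAS = -2.0
SAMPLES = [0.2, -0.3, 0.4]

print(handle_utterance(SAMPLES, "set a timer", WEIGHTS, BIAS))
print(handle_utterance(SAMPLES, "see you tomorrow", WEIGHTS, BIAS))
```

Concatenating the two representations lets a single classifier weigh acoustic cues and lexical cues together, which matches the abstract's description of a classifier that receives both inputs.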
-