SPEECH RECOGNITION WITH SEQUENCE-TO-SEQUENCE MODELS

Patent Application

US20220005465A1 SPEECH RECOGNITION WITH SEQUENCE-TO-SEQUENCE MODELS (in force)

Patent title: SPEECH RECOGNITION WITH SEQUENCE-TO-SEQUENCE MODELS
Application number: US17448119

Filing date: 2021-09-20
Publication number: US20220005465A1

Publication date: 2022-01-06
Inventors: Rohit Prakash Prabhavalkar, Zhifeng Chen, Bo Li, Chung-Cheng Chiu, Kanury Kanishka Rao, Yonghui Wu, Ron J. Weiss, Navdeep Jaitly, Michiel A.U. Bacchiani, Tara N. Sainath, Jan Kazimierz Chorowski, Anjuli Patricia Kannan, Ekaterina Gonina, Patrick An Phu Nguyen
Applicant: Google LLC
Applicant address: US CA Mountain View
Patentee: Google LLC
Current patentee: Google LLC
Current patentee address: US CA Mountain View
Main classification: G10L15/16
IPC classifications: G10L15/16; G10L15/22; G10L15/02; G06N3/08; G10L15/06; G10L25/30

Abstract:

A method for performing speech recognition using sequence-to-sequence models includes receiving audio data for an utterance and providing features indicative of acoustic characteristics of the utterance as input to an encoder. The method also includes processing an output of the encoder using an attender to generate a context vector, generating speech recognition scores using the context vector and a decoder trained using a training process, and generating a transcription for the utterance using word elements selected based on the speech recognition scores. The transcription is provided as an output of the automatic speech recognition (ASR) system.
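The encoder, attender, and decoder flow described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the abstract does not say which attention mechanism or decoder architecture is used, so the sketch assumes dot-product attention and a single linear layer followed by a softmax. All names (`attend`, `decoder_scores`, `greedy_step`, `output_weights`, `vocab`) are hypothetical, and the encoder outputs are given directly rather than computed from audio features.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(encoder_outputs, decoder_state):
    """Attender (assumed dot-product attention): weight each encoder
    output by its similarity to the decoder state and return the
    weighted sum as the context vector, plus the attention weights."""
    weights = softmax([dot(h, decoder_state) for h in encoder_outputs])
    dim = len(encoder_outputs[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_outputs))
        for i in range(dim)
    ]
    return context, weights


def decoder_scores(context, output_weights):
    """Toy stand-in for a trained decoder: one linear layer and a
    softmax, giving one score per word element."""
    return softmax([dot(row, context) for row in output_weights])


def greedy_step(encoder_outputs, decoder_state, output_weights, vocab):
    """Select the word element with the highest score for one step."""
    context, _ = attend(encoder_outputs, decoder_state)
    scores = decoder_scores(context, output_weights)
    best = max(range(len(vocab)), key=lambda i: scores[i])
    return vocab[best], scores


# Hypothetical toy values: two encoder frames, two word elements.
encoder_outputs = [[1.0, 0.0], [0.0, 1.0]]
decoder_state = [5.0, 0.0]
output_weights = [[2.0, 0.0], [0.0, 2.0]]
vocab = ["a", "b"]

context, weights = attend(encoder_outputs, decoder_state)
word, scores = greedy_step(encoder_outputs, decoder_state, output_weights, vocab)
```

In a full system the decoder state would be updated after each selected word element and the step repeated until an end-of-sequence symbol is produced. A beam search would commonly replace the greedy selection shown here.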

Published/Granted Documents

US12106749B2 Speech recognition with sequence-to-sequence models. Publication/grant date: 2024-10-01

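IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/08	.Speech classification or search
G10L15/16	..using artificial neural networks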