Applying neural network language models to weighted finite state transducers for automatic speech recognition

Invention Grant

US10354652B2 Applying neural network language models to weighted finite state transducers for automatic speech recognition (Active)

Patent Title: Applying neural network language models to weighted finite state transducers for automatic speech recognition
Application No.: US16035513

Application Date: 2018-07-13
Publication No.: US10354652B2

Publication Date: 2019-07-16
Inventor: Rongqing Huang, Ilya Oparin
Applicant: Apple Inc.
Applicant Address: US CA Cupertino
Assignee: Apple Inc.
Current Assignee: Apple Inc.
Current Assignee Address: US CA Cupertino
Agency: Dentons US LLP
Main IPC: G10L15/28
IPC: G10L15/28; G10L15/14; G10L15/16; G10L15/19; G10L15/197; G10L15/193; G10L15/08

Abstract:

Systems and processes for converting speech to text are provided. In one example process, speech input can be received. A sequence of states and arcs of a weighted finite state transducer (WFST) can be traversed. A negating finite state transducer (FST) can be traversed. A virtual FST can be composed using a neural network language model and based on the sequence of states and arcs of the WFST. One or more virtual states of the virtual FST can be traversed to determine a probability of a candidate word given one or more history candidate words. Text corresponding to the speech input can be determined based on the probability of the candidate word given the one or more history candidate words. An output can be provided based on the text corresponding to the speech input.
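
The rescoring idea in the abstract can be illustrated with a minimal sketch. It assumes one plausible reading that the abstract does not spell out: the WFST path carries an n-gram language model cost, the negating FST contributes the opposite of that cost, and the virtual FST contributes a cost taken from the neural network language model. All names, tables, and numbers below (NGRAM_COST, toy_neural_lm, rescore, best_candidate) are hypothetical and serve only as illustration; they are not taken from the patent.

```python
import math

# Hypothetical n-gram costs (negative log probabilities) that would sit on
# WFST arcs. Keys are (history, candidate word). Values are invented.
NGRAM_COST = {
    (("the",), "cat"): -math.log(0.2),
    (("the",), "cap"): -math.log(0.1),
}


def toy_neural_lm(history, word):
    """Stand-in for a neural network LM: returns P(word | history).

    A real model would compute this from the history words; this lookup
    table is an assumption made purely for illustration.
    """
    table = {(("the",), "cat"): 0.6, (("the",), "cap"): 0.05}
    return table.get((tuple(history), word), 1e-6)


def rescore(acoustic_cost, history, word):
    """Combine the three traversals into one path cost (lower is better)."""
    key = (tuple(history), word)
    wfst_cost = acoustic_cost + NGRAM_COST[key]         # WFST states and arcs
    negating_cost = -NGRAM_COST[key]                    # negating FST cancels the n-gram cost
    virtual_cost = -math.log(toy_neural_lm(history, word))  # virtual FST arc from the neural LM
    return wfst_cost + negating_cost + virtual_cost


def best_candidate(history, candidates):
    """Pick the candidate word with the lowest rescored cost.

    `candidates` is a list of (word, acoustic_cost) pairs.
    """
    return min(candidates, key=lambda wc: rescore(wc[1], history, wc[0]))[0]
```

With these invented numbers, the n-gram costs alone would favor "cap" after "the", while the rescored costs favor "cat", because the neural model's probability replaces the n-gram weight rather than being added on top of it.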

Public/Granted literature

US20180374484A1 APPLYING NEURAL NETWORK LANGUAGE MODELS TO WEIGHTED FINITE STATE TRANSDUCERS FOR AUTOMATIC SPEECH RECOGNITION
Public/Granted day: 2018-12-27

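IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/28	. Constructional details of speech recognition systems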