Invention Application
- Patent Title: SYSTEM AND METHOD FOR SPEECH UNDERSTANDING VIA INTEGRATED AUDIO AND VISUAL BASED SPEECH RECOGNITION
- Application No.: PCT/US2019/018215
- Application Date: 2019-02-15
- Publication No.: WO2019161198A1
- Publication Date: 2019-08-22
- Inventor: SHUKLA, Nishant ; DHARNE, Ashwin
- Applicant: DMAI, INC.
- Applicant Address: 10940 Wilshire Blvd, Suite 1100 Los Angeles, California 90024 US
- Assignee: DMAI, INC.
- Current Assignee: DMAI, INC.
- Current Assignee Address: 10940 Wilshire Blvd, Suite 1100 Los Angeles, California 90024 US
- Agency: GADKAR, Arush
- Priority: US62/630,976 20180215
- Main IPC: G06K9/00
- IPC: G06K9/00 ; G10L15/00 ; G10L15/04 ; G10L17/00 ; G10L21/00
Abstract:
The present teaching relates to a method, system, medium, and implementations for speech recognition. An audio signal is received that represents the speech of a user engaged in a dialogue. A visual signal is received that captures the user uttering the speech. A first speech recognition result is obtained by performing audio-based speech recognition on the audio signal. Based on the visual signal, lip movement of the user is detected, and a second speech recognition result is obtained by performing lip-reading-based speech recognition. The first and second speech recognition results are then integrated to generate an integrated speech recognition result.
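The integration step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two results are combined, so the confidence-weighted scoring below is an assumption chosen for illustration, not the method claimed in the application. The names `Hypothesis`, `integrate`, and `audio_weight` are hypothetical, and the two recognizers are represented only by their outputs.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One candidate transcription with a recognizer confidence in [0, 1]."""
    text: str
    confidence: float


def integrate(audio, visual, audio_weight=0.7):
    """Combine audio-based and lip-reading-based hypotheses into one result.

    Assumed fusion rule (not taken from the application): every candidate
    text accumulates a weighted confidence from each modality, and the
    candidate with the highest combined score is returned.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be between 0 and 1")

    scores = {}
    # First speech recognition result: audio-based candidates.
    for hyp in audio:
        scores[hyp.text] = scores.get(hyp.text, 0.0) + audio_weight * hyp.confidence
    # Second speech recognition result: lip-reading-based candidates.
    for hyp in visual:
        scores[hyp.text] = scores.get(hyp.text, 0.0) + (1.0 - audio_weight) * hyp.confidence

    if not scores:
        raise ValueError("at least one hypothesis is required")

    best = max(scores, key=scores.get)
    return Hypothesis(best, scores[best])


if __name__ == "__main__":
    audio = [Hypothesis("wreck a nice beach", 0.55), Hypothesis("recognize speech", 0.50)]
    visual = [Hypothesis("recognize speech", 0.80)]
    print(integrate(audio, visual).text)
```

In this sketch the audio recognizer alone slightly prefers the wrong transcription, and agreement from the lip-reading result tips the combined score toward the other candidate. A lower `audio_weight` could stand in for a noisy environment where the visual signal deserves more trust.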