SYSTEM AND METHOD FOR SPEECH UNDERSTANDING VIA INTEGRATED AUDIO AND VISUAL BASED SPEECH RECOGNITION

Invention Application

WO2019161198A1 SYSTEM AND METHOD FOR SPEECH UNDERSTANDING VIA INTEGRATED AUDIO AND VISUAL BASED SPEECH RECOGNITION Under Examination - Published

Patent Title: SYSTEM AND METHOD FOR SPEECH UNDERSTANDING VIA INTEGRATED AUDIO AND VISUAL BASED SPEECH RECOGNITION
Application No.: PCT/US2019/018215

Application Date: 2019-02-15
Publication No.: WO2019161198A1

Publication Date: 2019-08-22
Inventor: SHUKLA, Nishant; DHARNE, Ashwin
Applicant: DMAI, INC.
Applicant Address: 10940 Wilshire Blvd, Suite 1100 Los Angeles, California 90024 US
Assignee: DMAI, INC.
Current Assignee: DMAI, INC.
Current Assignee Address: 10940 Wilshire Blvd, Suite 1100 Los Angeles, California 90024 US
Agency: GADKAR, Arush
Priority: US62/630,976 (2018-02-15)
Main IPC: G06K9/00
IPC: G06K9/00; G10L15/00; G10L15/04; G10L17/00; G10L21/00

Abstract:

The present teaching relates to a method, system, medium, and implementations for speech recognition. An audio signal representing the speech of a user engaged in a dialogue is received, together with a visual signal that captures the user uttering the speech. A first speech recognition result is obtained by performing audio-based speech recognition on the audio signal. Lip movement of the user is detected from the visual signal, and a second speech recognition result is obtained by performing lip-reading-based speech recognition. The first and second speech recognition results are then integrated to generate an integrated speech recognition result.
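
The abstract does not say how the two recognition results are integrated. As an illustration only, the sketch below assumes each recognizer returns a list of candidate transcripts with confidence scores and combines them by a weighted sum per transcript (a simple late-fusion scheme). The names `Hypothesis`, `integrate_results`, and `audio_weight` are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Hypothesis:
    """One candidate transcript with a confidence in [0, 1] (assumed format)."""
    text: str
    confidence: float


def integrate_results(
    audio: List[Hypothesis],
    visual: List[Hypothesis],
    audio_weight: float = 0.7,
) -> Hypothesis:
    """Fuse audio-based and lip-reading-based candidates.

    Assumption: integration is a weighted sum of per-transcript confidences.
    The application may use a different integration method.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be between 0 and 1")
    if not audio and not visual:
        raise ValueError("at least one recognition result is required")

    scores: Dict[str, float] = {}
    # Accumulate the weighted audio-based scores.
    for hyp in audio:
        scores[hyp.text] = scores.get(hyp.text, 0.0) + audio_weight * hyp.confidence
    # Accumulate the weighted lip-reading-based scores.
    visual_weight = 1.0 - audio_weight
    for hyp in visual:
        scores[hyp.text] = scores.get(hyp.text, 0.0) + visual_weight * hyp.confidence

    # The integrated result is the transcript with the highest fused score.
    best_text = max(scores, key=lambda text: scores[text])
    return Hypothesis(best_text, scores[best_text])


if __name__ == "__main__":
    # "nine" and "mine" are easy to confuse acoustically but differ in lip closure.
    audio_result = [Hypothesis("nine", 0.52), Hypothesis("mine", 0.48)]
    visual_result = [Hypothesis("mine", 0.80), Hypothesis("nine", 0.20)]
    print(integrate_results(audio_result, visual_result, audio_weight=0.6))
```

In this made-up example the audio recognizer slightly prefers "nine", but the lip-reading recognizer strongly prefers "mine", so the fused result is "mine". A fixed weight is the simplest choice; a system could instead vary the weight with conditions such as background noise, which is again an assumption rather than something stated in the abstract.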

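IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06K	Graphical data reading (image or video recognition or understanding G06V); presentation of data; record carriers; handling record carriers
G06K9/00	Methods or arrangements for recognising patterns (methods or arrangements for graph-reading or for converting the pattern of mechanical parameters, e.g. force or presence, into electrical signals G06K11/00) (image or video recognition or understanding G06V) (speech recognition G10L15/00)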