Invention Grant
- Patent Title: Adaptive visual speech recognition
- Application No.: US18571553
- Application Date: 2022-06-15
- Publication No.: US12211488B2
- Publication Date: 2025-01-28
- Inventor: Ioannis Alexandros Assael, Brendan Shillingford, Joao Ferdinando Gomes de Freitas
- Applicant: DeepMind Technologies Limited
- Applicant Address: GB London
- Assignee: DeepMind Technologies Limited
- Current Assignee: DeepMind Technologies Limited
- Current Assignee Address: GB London
- Agency: Fish & Richardson P.C.
- Priority: GR20210100402 20210618
- International Application: PCT/EP2022/066419 WO 20220615
- International Announcement: WO2022/263570 WO 20221222
- Main IPC: G10L25/30
- IPC: G10L25/30; G10L15/06

Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing video data using an adaptive visual speech recognition model. One of the methods includes receiving a video that includes a plurality of video frames that depict a first speaker; obtaining a first embedding characterizing the first speaker; and processing a first input comprising (i) the video and (ii) the first embedding using a visual speech recognition neural network having a plurality of parameters, wherein the visual speech recognition neural network is configured to process the video and the first embedding in accordance with trained values of the parameters to generate a speech recognition output that defines a sequence of one or more words being spoken by the first speaker in the video.
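
The three steps in the abstract (receive the video, obtain a speaker embedding, process both with one set of trained parameters) can be illustrated with a toy sketch. Nothing here is taken from the patent's actual implementation: the name `recognize`, the `VOCAB` list, the single linear layer, and the greedy collapse of blanks and repeated labels are all hypothetical choices made only to show how a speaker embedding might be combined with each frame before decoding.

```python
from typing import List, Sequence

# Hypothetical toy vocabulary; index 0 is a "blank" (no word) label.
VOCAB = ["<blank>", "hello", "world"]


def recognize(video: List[Sequence[float]],
              speaker_embedding: Sequence[float],
              weights: List[List[float]]) -> List[str]:
    """Toy stand-in for a visual speech recognition network (assumption).

    Each frame is a flat feature vector. The speaker embedding is
    concatenated to every frame, so the same weights can behave
    differently for different speakers.
    """
    words: List[str] = []
    previous = 0
    for frame in video:
        # Input = (i) the video frame and (ii) the speaker embedding.
        features = list(frame) + list(speaker_embedding)
        # One linear layer: one score per vocabulary entry.
        scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
        best = max(range(len(scores)), key=scores.__getitem__)
        # Greedy decoding: drop blanks and immediately repeated labels.
        if best != 0 and best != previous:
            words.append(VOCAB[best])
        previous = best
    return words


# Made-up demo data: 4 frames of 2 features each, 1-dimensional embedding.
demo_weights = [[0.0, 0.0, 0.5], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
demo_video = [[1.0, 0.0], [1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
demo_embedding = [1.0]
print(recognize(demo_video, demo_embedding, demo_weights))
```

In this sketch the weights stay fixed while the embedding changes per speaker, which is one simple way to read "adaptive" in the title; the patent itself may realize the adaptation differently.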
Public/Granted literature
- US20240265911A1 ADAPTIVE VISUAL SPEECH RECOGNITION Public/Granted day: 2024-08-08