-
Publication No.: US12236935B2
Publication Date: 2025-02-25
Application No.: US17931026
Filing Date: 2022-09-09
Applicant: GOOGLE LLC
Inventor: Andrew R. Levine, Buddhika Kottahachchi, Christopher Davie, Kulumani Sriram, Richard James Potts, Sasakthi S. Abeysinghe
Abstract: The present disclosure relates to generating and adjusting translated audio from a video-based source. The method includes receiving video data and corresponding audio data in a first language; generating a translated preliminary transcript in a second language; aligning timing windows of portions of the translated preliminary transcript with corresponding segments of the audio data; determining portions of the translated aligned transcript in the second language that exceed a timing window range of the corresponding segments of the audio data in the first language to generate flagged transcript portions; transmitting the original transcript, the translated aligned transcript, and the first speech dub to a first device, the generated flagged transcript portions included in the original transcript and the translated aligned transcript; receiving, from the first device, a modified original transcript; and generating, based on the modified original transcript, a second speech dub in the second language.
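Illustrative sketch: the flagging step in the abstract (finding translated portions that exceed the timing window of the corresponding source-audio segment) could look roughly like the Python below. This is not the patented implementation. The `Segment` type, the `estimated_duration` heuristic based on a fixed speaking rate, and the `tolerance` parameter are all assumptions introduced here for illustration; the abstract does not say how spoken duration is measured or how the timing window range is defined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical container: one aligned portion of the translated transcript."""
    start: float          # start of the source-language audio segment, in seconds
    end: float            # end of the source-language audio segment, in seconds
    translated_text: str  # translated text aligned to this segment


def estimated_duration(text: str, words_per_second: float = 2.5) -> float:
    """Assumed heuristic: estimate spoken duration from the word count."""
    return len(text.split()) / words_per_second


def flag_overlong(segments: List[Segment],
                  tolerance: float = 0.1,
                  words_per_second: float = 2.5) -> List[int]:
    """Return indices of segments whose estimated dub duration exceeds
    the source timing window by more than the given fractional tolerance."""
    flagged = []
    for index, seg in enumerate(segments):
        window = seg.end - seg.start
        limit = window * (1 + tolerance)
        if estimated_duration(seg.translated_text, words_per_second) > limit:
            flagged.append(index)
    return flagged


segments = [
    Segment(0.0, 2.0, "Hola a todos"),
    Segment(2.0, 3.0, "Esta es una frase bastante más larga que la ventana"),
]
print(flag_overlong(segments))
```

In a workflow like the one described, the flagged indices would mark the portions shown to the user for editing before a second speech dub is generated. A real system would more plausibly measure the duration of the synthesized audio itself instead of estimating it from word counts.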
-
Publication No.: US20240087557A1
Publication Date: 2024-03-14
Application No.: US17931026
Filing Date: 2022-09-09
Applicant: GOOGLE LLC
Inventor: Andrew R. Levine, Buddhika Kottahachchi, Christopher Davie, Kulumani Sriram, Richard James Potts, Sasakthi S. Abeysinghe
CPC classification number: G10L13/02, G06F40/58, G10L13/086
Abstract: The present disclosure relates to generating and adjusting translated audio from a video-based source. The method includes receiving video data and corresponding audio data in a first language; generating a translated preliminary transcript in a second language; aligning timing windows of portions of the translated preliminary transcript with corresponding segments of the audio data; determining portions of the translated aligned transcript in the second language that exceed a timing window range of the corresponding segments of the audio data in the first language to generate flagged transcript portions; transmitting the original transcript, the translated aligned transcript, and the first speech dub to a first device, the generated flagged transcript portions included in the original transcript and the translated aligned transcript; receiving, from the first device, a modified original transcript; and generating, based on the modified original transcript, a second speech dub in the second language.
-