-
Publication No.: US11582527B2
Publication Date: 2023-02-14
Application No.: US16975696
Filing Date: 2018-02-26
Applicant: GOOGLE LLC
Inventor: Terrence Paul McCartney, Jr., Brian Colonna, Michael Nechyba
IPC: H04N7/10, H04N21/488, G06F40/58, G06F40/30, H04N21/43
Abstract: A method for aligning a translation of original caption data with an audio portion of a video is provided. The method includes identifying, by a processing device, original caption data for a video that includes a plurality of caption character strings. The processing device identifies speech recognition data that includes a plurality of generated character strings and associated timing information for each generated character string. The processing device maps the plurality of caption character strings to the plurality of generated character strings using assigned values indicative of semantic similarities between character strings. The processing device assigns timing information to the individual caption character strings based on timing information of mapped individual generated character strings. The processing device aligns a translation of the original caption data with the audio portion of the video using assigned timing information of the individual caption character strings.
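The mapping step described in this abstract can be illustrated with a minimal sketch. This is not the patented implementation: the abstract does not specify the similarity measure, so `difflib.SequenceMatcher` stands in for the "assigned values indicative of semantic similarities", and the data shapes (`start`/`end` fields) are illustrative assumptions.

```python
from difflib import SequenceMatcher


def align_captions(caption_strings, generated):
    """Map each caption string to its most similar ASR-generated
    string and inherit that string's timing information.

    caption_strings: list of caption text strings (no timing).
    generated: list of dicts {"text", "start", "end"} from speech
    recognition, where start/end are seconds into the audio.
    """
    aligned = []
    for cap in caption_strings:
        # Pick the generated string with the highest similarity score;
        # a production system would use a semantic measure, not
        # character-level matching.
        best = max(
            generated,
            key=lambda g: SequenceMatcher(None, cap.lower(),
                                          g["text"].lower()).ratio(),
        )
        aligned.append({"text": cap, "start": best["start"],
                        "end": best["end"]})
    return aligned
```

With timing attached to each caption string, a translation of the captions can be cued against the audio using those same intervals.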
-
Publication No.: US12114048B2
Publication Date: 2024-10-08
Application No.: US18109243
Filing Date: 2023-02-13
Applicant: Google LLC
Inventor: Terrance Paul McCartney, Jr., Brian Colonna, Michael Nechyba
IPC: H04N7/10, G06F40/30, G06F40/58, H04N21/43, H04N21/488
CPC classification number: H04N21/4884, G06F40/30, G06F40/58, H04N21/43074
Abstract: A method for aligning a translation of original caption data with an audio portion of a video is provided. The method involves identifying original caption data for the video that includes caption character strings, identifying translated language caption data for the video that includes translated character strings associated with the audio portion of the video, and mapping caption sentence fragments generated from the caption character strings to corresponding translated sentence fragments generated from the translated character strings based on timing associated with the original caption data and the translated language caption data. The method further involves estimating time intervals for individual caption sentence fragments using timing information corresponding to individual caption character strings, assigning time intervals to individual translated sentence fragments based on estimated time intervals of the individual caption sentence fragments, generating a set of translated sentences using consecutive translated sentence fragments, and aligning the set of translated sentences with the audio portion of the video using assigned time intervals of individual translated sentence fragments from corresponding translated sentences.
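The fragment-mapping step here can be sketched as a timing-overlap assignment. This is a hedged illustration, not the claimed method: the abstract does not say how overlap is scored, and the tuple layout `(text, (start, end))` is an assumed representation.

```python
def overlap(a, b):
    """Length in seconds of the intersection of two (start, end) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def assign_intervals(caption_frags, translated_frags):
    """Assign each translated sentence fragment the estimated time
    interval of the caption fragment it overlaps most.

    caption_frags: [(text, (start, end))] with intervals estimated
    from per-string timing of the original captions.
    translated_frags: [(text, (rough_start, rough_end))] with the
    coarse timing carried by the translated caption track.
    """
    out = []
    for text, rough in translated_frags:
        _, best_interval = max(caption_frags,
                               key=lambda cf: overlap(cf[1], rough))
        out.append((text, best_interval))
    return out
```

Consecutive fragments assigned this way can then be concatenated into translated sentences, with each sentence spanning from its first fragment's start to its last fragment's end.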
-
Publication No.: US20190147105A1
Publication Date: 2019-05-16
Application No.: US15813978
Filing Date: 2017-11-15
Applicant: Google LLC
Inventor: Hang Chu, Michael Nechyba, Andrew C. Gallagher, Utsav Prabhu
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for partitioning videos. In one aspect, a method includes obtaining a partition of a video into one or more shots. Features are generated for each shot, including visual features and audio features. The generated features for each shot are provided as input to a partitioning neural network that is configured to process the generated features to generate a partitioning neural network output. The partition of the video into one or more chapters is determined based on the partitioning neural network output, where a chapter is a sequence of consecutive shots that are determined to be taken at one or more locations that are semantically related.
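The final chaptering step, grouping consecutive shots once the partitioning network has produced its output, can be sketched as a threshold over per-boundary scores. The network itself is out of scope here; `boundary_scores` and the threshold value are illustrative assumptions about the shape of the network output, not details from the patent.

```python
def shots_to_chapters(boundary_scores, threshold=0.5):
    """Group consecutive shots into chapters.

    boundary_scores[i]: model-estimated probability that a chapter
    boundary falls between shot i and shot i+1.
    Returns a list of chapters, each a list of shot indices.
    """
    chapters, current = [], [0]
    for i, score in enumerate(boundary_scores):
        if score >= threshold:
            # Close the current chapter at this shot boundary.
            chapters.append(current)
            current = []
        current.append(i + 1)
    chapters.append(current)
    return chapters
```

For example, four shots with boundary scores `[0.1, 0.8, 0.2]` yield two chapters: shots 0-1 and shots 2-3.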
-