-
Publication number: US11538461B1
Publication date: 2022-12-27
Application number: US17249930
Application date: 2021-03-18
Applicant: Amazon Technologies, Inc.
Inventor: Honey Gupta, Mayank Sharma
IPC: G10L15/08, G10L15/16, H04N21/488, G10L25/93
Abstract: Some implementations include methods for detecting missing subtitles associated with a media presentation and may include receiving an audio component and a subtitle component associated with a media presentation, the audio component including an audio sequence, the audio sequence divided into a plurality of audio segments; evaluating the plurality of audio segments using a combination of a recurrent neural network and a convolutional neural network to identify refined speech segments associated with the audio sequence, the recurrent neural network trained based on a plurality of languages, the convolutional neural network trained based on a plurality of categories of sound; determining timestamps associated with the identified refined speech segments; and determining missing subtitles based on the timestamps associated with the identified refined speech segments and timestamps associated with subtitles included in the subtitle component.
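The final step of this abstract, comparing speech timestamps against subtitle timestamps, can be illustrated with a minimal sketch. It assumes the neural networks have already produced the refined speech segments as (start, end) pairs in seconds. The function name `find_uncovered_speech` and the `min_gap` tolerance are hypothetical and are not taken from the patent, which does not state how the comparison is performed.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_seconds, end_seconds)


def find_uncovered_speech(
    speech: List[Interval],
    subtitles: List[Interval],
    min_gap: float = 0.5,  # assumed tolerance; shorter gaps are ignored
) -> List[Interval]:
    """Return the parts of each speech interval that no subtitle cue covers."""
    cues = sorted(subtitles)
    missing: List[Interval] = []
    for start, end in sorted(speech):
        cursor = start  # earliest point in this speech interval not yet covered
        for cue_start, cue_end in cues:
            if cue_end <= cursor:
                continue  # cue ends before the uncovered region begins
            if cue_start >= end:
                break  # cue starts after this speech interval ends
            if cue_start - cursor >= min_gap:
                missing.append((cursor, cue_start))
            cursor = max(cursor, cue_end)
            if cursor >= end:
                break
        if end - cursor >= min_gap:
            missing.append((cursor, end))
    return missing


speech_segments = [(0.0, 4.0), (10.0, 12.0)]
subtitle_cues = [(0.0, 1.5), (3.0, 4.0)]
gaps = find_uncovered_speech(speech_segments, subtitle_cues)
```

Here `gaps` holds the speech intervals that have no matching subtitle; a real system would likely add alignment tolerance at cue boundaries as well.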
-
Publication number: US20240223872A1
Publication date: 2024-07-04
Application number: US18411720
Application date: 2024-01-12
Applicant: Amazon Technologies, Inc.
Inventor: Mayank Sharma, Prabhakar Gupta, Honey Gupta, Kumar Keshav
IPC: H04N21/8549, H04N21/466, H04N21/472
CPC classification number: H04N21/8549, H04N21/466, H04N21/47217
Abstract: A respective set of features, including emotion-related features, are extracted from segments of a video for which a preview is to be generated. A subset of the segments is chosen using the features and filtering criteria including at least one emotion-based filtering criterion. Respective weighted preview-suitability scores are assigned to the segments of the subset using at least a metric of similarity between individual segments and a plot summary of the video. The scores are used to select and combine segments to form a preview for the video.
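The filter, score, and combine pipeline in this abstract can be sketched as follows. Everything specific here is an assumption for illustration: the `Segment` fields, the cosine similarity between a segment embedding and a plot-summary embedding, the weights, and the thresholds are hypothetical and are not stated in the patent.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    start: float
    end: float
    emotion_intensity: float    # hypothetical emotion-related feature in [0, 1]
    embedding: Sequence[float]  # hypothetical vector representation of the segment


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_preview(
    segments: List[Segment],
    summary_embedding: Sequence[float],
    min_emotion: float = 0.3,   # assumed emotion-based filtering criterion
    w_similarity: float = 0.7,  # assumed weight for plot-summary similarity
    w_emotion: float = 0.3,     # assumed weight for emotion intensity
    max_segments: int = 2,
) -> List[Segment]:
    """Filter by emotion, rank by weighted score, return picks in play order."""
    candidates = [s for s in segments if s.emotion_intensity >= min_emotion]

    def score(s: Segment) -> float:
        return (w_similarity * cosine(s.embedding, summary_embedding)
                + w_emotion * s.emotion_intensity)

    chosen = sorted(candidates, key=score, reverse=True)[:max_segments]
    return sorted(chosen, key=lambda s: s.start)


segments = [
    Segment(0.0, 10.0, 0.1, [1.0, 0.0]),   # dropped by the emotion filter
    Segment(10.0, 20.0, 0.5, [1.0, 0.0]),
    Segment(20.0, 30.0, 0.9, [0.0, 1.0]),
    Segment(30.0, 40.0, 0.4, [1.0, 1.0]),
]
preview = build_preview(segments, summary_embedding=[1.0, 0.0])
```

The selected segments are returned in chronological order so that they could be concatenated into a preview.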
-
Publication number: US12205614B1
Publication date: 2025-01-21
Application number: US17661165
Application date: 2022-04-28
Applicant: Amazon Technologies, Inc.
Inventor: Mayank Sharma, Anil Kumar Nelakanti, Palanivelu Balakrishnan, Saravanan Santhamoorthy Theckyam, Honey Gupta
Abstract: Methods and apparatus are described for evaluating dubbing of media content. Emotions are identified based on combinations of attributes determined for segments of a source language audio and a dubbed audio. The emotions may be compared to determine emotional prosody transfer between the source audio and dubbed audio. Based on the comparison, a notification is generated indicating whether an emotion classification associated with the source audio matches an emotion classification associated with the dubbed audio.
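The comparison and notification step in this abstract can be sketched as below. The sketch assumes an upstream classifier has already assigned one emotion label to each aligned pair of source and dubbed segments; the names `EmotionCheck`, `compare_emotions`, and `mismatch_notifications` are hypothetical and do not come from the patent.

```python
from typing import List, NamedTuple


class EmotionCheck(NamedTuple):
    segment_index: int
    source_emotion: str
    dubbed_emotion: str
    matches: bool


def compare_emotions(
    source_labels: List[str], dubbed_labels: List[str]
) -> List[EmotionCheck]:
    """Compare per-segment emotion labels of the source and dubbed audio."""
    if len(source_labels) != len(dubbed_labels):
        raise ValueError("source and dubbed segments must be aligned one to one")
    return [
        EmotionCheck(i, src, dub, src == dub)
        for i, (src, dub) in enumerate(zip(source_labels, dubbed_labels))
    ]


def mismatch_notifications(checks: List[EmotionCheck]) -> List[str]:
    """Build one human-readable notification per mismatched segment."""
    return [
        f"Segment {c.segment_index}: source is '{c.source_emotion}' "
        f"but dub is '{c.dubbed_emotion}'"
        for c in checks
        if not c.matches
    ]


checks = compare_emotions(
    ["angry", "sad", "happy"],
    ["angry", "neutral", "happy"],
)
notes = mismatch_notifications(checks)
```

Exact label equality is the simplest possible match rule; a production system might instead compare class probabilities or allow near-equivalent emotion classes.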
-
Publication number: US11910073B1
Publication date: 2024-02-20
Application number: US17819918
Application date: 2022-08-15
Applicant: Amazon Technologies, Inc.
Inventor: Mayank Sharma, Prabhakar Gupta, Honey Gupta, Kumar Keshav
IPC: H04N21/8549, H04N21/466, H04N21/472
CPC classification number: H04N21/8549, H04N21/466, H04N21/47217
Abstract: A respective set of features, including emotion-related features, are extracted from segments of a video for which a preview is to be generated. A subset of the segments is chosen using the features and filtering criteria including at least one emotion-based filtering criterion. Respective weighted preview-suitability scores are assigned to the segments of the subset using at least a metric of similarity between individual segments and a plot summary of the video. The scores are used to select and combine segments to form a preview for the video.
-