-
Publication No.: US11816849B2
Publication date: 2023-11-14
Application No.: US17937323
Application date: 2022-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Xiaohan Nie, Muhammad Raffay Hamid
IPC: G06T7/33, G06T7/73, G06V20/40, G06V10/75, H04N5/272, G06F18/40, G06F18/214, G06F18/21, H04N21/2187, H04N21/234
CPC classification number: G06T7/337, G06F18/214, G06F18/2163, G06F18/2178, G06F18/41, G06T7/73, G06V10/751, G06V20/46, H04N5/272, G06T2207/10016, G06T2207/20021, G06T2207/20081, G06T2207/20092, G06T2207/30204, G06T2207/30228, H04N21/2187, H04N21/23418
Abstract: Methods and systems are described for registering a sports field to a video. Video of a live event may feature participants at a venue. A template of the venue, including virtual markings that represent real markings on the venue, may be obtained. A homographic transformation between an image plane and a ground plane may be determined by matching virtual markings to corresponding real markings captured in at least one frame of the video. The determined homographic transformation may be used in the automated analysis of sports statistics and in improving inserted annotations and visualizations.
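The matching step in this abstract can be illustrated with a small sketch. It assumes exactly four already-matched point pairs (template marking to image marking) and solves the standard 8-unknown linear system for a homography with its last entry fixed to 1. The function names `solve_linear`, `estimate_homography`, and `project` are hypothetical and are not taken from the patent, which does not specify how the transformation is computed.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def estimate_homography(template_pts, image_pts):
    """Return a 3x3 matrix mapping four template points to four image points.

    Assumption: the correspondences are already known and h[2][2] can be 1.
    """
    a, b = [], []
    for (x, y), (u, v) in zip(template_pts, image_pts):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), same pattern for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(h, pt):
    """Apply homography h to a 2D point, dividing by the homogeneous term."""
    x, y = pt
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

Once estimated, the same matrix can map any template location (for example, a line marking or an annotation anchor) into the frame; a production system would typically use many correspondences and a robust estimator rather than exactly four points.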
-
Publication No.: US11776273B1
Publication date: 2023-10-03
Application No.: US17107514
Application date: 2020-11-30
Applicant: Amazon Technologies, Inc.
Inventor: Shixing Chen, Muhammad Raffay Hamid, Vimal Bhat, Shiva Krishnamurthy
IPC: G06V20/40, G06N5/04, G06N20/20, G10L25/78, G06F18/213
CPC classification number: G06V20/49, G06F18/213, G06N5/04, G06N20/20, G10L25/78
Abstract: Techniques for automatic scene change detection are described. As one example, a computer-implemented method includes receiving a request to train an ensemble of machine learning models on a training dataset of videos having labels that indicate scene changes to detect a scene change in a video, partitioning each video file of the training dataset of videos into a plurality of shots, training the ensemble of machine learning models into a trained ensemble of machine learning models based at least in part on the plurality of shots of the training dataset of videos and the labels that indicate scene changes, receiving an inference request for an input video, partitioning the input video into a plurality of shots, generating, by the trained ensemble of machine learning models, an inference of one or more scene changes in the input video based at least in part on the plurality of shots of the input video, and transmitting the inference to a client application or to a storage location.
-
Publication No.: US20230282006A1
Publication date: 2023-09-07
Application No.: US18175044
Application date: 2023-02-27
Applicant: Amazon Technologies, Inc.
Inventor: Tamojit Chatterjee, Mayank Sharma, Muhammad Raffay Hamid, Sandeep Joshi
IPC: G06V20/62, G11B27/10, G06F40/169, G06V20/40, G06N7/01
CPC classification number: G06V20/635, G11B27/10, G06F40/169, G06V20/40, G06N7/01, G06V20/44
Abstract: Systems, methods, and computer-readable media are disclosed for language-agnostic subtitle drift detection and correction. A method may include determining subtitles and/or captions from media content (e.g., videos), the subtitles and/or captions corresponding to dialog in the media content. The subtitles may be broken up into segments which may be analyzed to determine a likelihood of drift (e.g., a likelihood that the subtitles are out of synchronization with the dialog in the media content) for each segment. For segments with a high likelihood of drift, the subtitles may be incrementally adjusted to determine an adjustment that eliminates and/or reduces the amount of drift, and the drift in the segment may be corrected based on the drift amount detected. A linear regression model and/or human blocks determined by human operators may be used to otherwise optimize drift correction.
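The incremental adjustment described in this abstract can be sketched as a search over candidate time shifts. The sketch assumes that speech intervals for a segment are already available (for example, from a voice activity detector, which is an assumption here) and scores each shift by how much of the subtitle time overlaps speech. The names `alignment_score` and `best_shift`, and the default search range and step, are hypothetical.

```python
def overlap(a, b):
    """Length of the intersection of two (start, end) intervals, in seconds."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def alignment_score(subs, speech, shift):
    """Fraction of subtitle time that overlaps speech after shifting subtitles.

    Assumes the speech intervals do not overlap one another.
    """
    total = sum(end - start for start, end in subs)
    if total == 0:
        return 0.0
    covered = sum(overlap((s + shift, e + shift), sp)
                  for s, e in subs for sp in speech)
    return covered / total


def best_shift(subs, speech, max_shift=5.0, step=0.1):
    """Try shifts in [-max_shift, max_shift] and return the best-scoring one.

    Ties are broken in favor of the smallest absolute shift.
    """
    n = int(round(max_shift / step))
    candidates = [round(i * step, 6) for i in range(-n, n + 1)]
    return max(candidates,
               key=lambda d: (alignment_score(subs, speech, d), -abs(d)))
```

A low `alignment_score` at zero shift would correspond to a segment with a high likelihood of drift; the value returned by `best_shift` is the correction to apply to that segment.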
-
Publication No.: US11354905B1
Publication date: 2022-06-07
Application No.: US17247324
Application date: 2020-12-07
Applicant: Amazon Technologies, Inc.
Inventor: Kewen Chen, Tu Anh Ho, Muhammad Raffay Hamid, Shixing Chen
Abstract: Methods and apparatus are described for generating compelling preview clips of media presentations. Compelling clips are identified based on the extent to which human faces are shown and/or the loudness of the audio associated with the clips. One or more of these compelling clips are then provided to a client device for playback.
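One way to picture the selection described in this abstract is a simple ranking over candidate clips. The sketch below assumes each clip already carries a face-visibility fraction and a normalized loudness value, and combines them with a weighted sum; the `Clip` fields, the weighting, and the function `rank_clips` are illustrative assumptions, not the scoring used in the patent.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    start: float          # seconds
    end: float            # seconds
    face_fraction: float  # assumed precomputed: share of frames showing faces, 0..1
    loudness: float       # assumed precomputed: normalized audio loudness, 0..1


def clip_score(clip, face_weight=0.5):
    """Weighted combination of face visibility and loudness (hypothetical)."""
    return face_weight * clip.face_fraction + (1.0 - face_weight) * clip.loudness


def rank_clips(clips, face_weight=0.5, top_k=1):
    """Return the top_k clips by score, highest first."""
    ordered = sorted(clips, key=lambda c: clip_score(c, face_weight), reverse=True)
    return ordered[:top_k]
```

Setting `face_weight` to 1.0 or 0.0 reduces the ranking to faces only or loudness only, which mirrors the "and/or" wording of the abstract.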
-
Publication No.: US11336972B1
Publication date: 2022-05-17
Application No.: US17141855
Application date: 2021-01-05
Applicant: Amazon Technologies, Inc.
Inventor: Muhammad Raffay Hamid, Kewen Chen, Anne TuAnh Thanh Thuy Ho, Guy Friedel, Arun Velayudhan Pillai, Dhaval Damani, Jacob William Jensen, Zuzanna Maria Stepniakowska Coggins, Maciej Tadeusz Golonka, Anantha Krishna Hodrali Srinivasa Bhatta
IPC: H04N21/8549, G10L25/78, H04N21/44, H04N21/475, H04N21/442
Abstract: Systems, methods, and computer-readable media are disclosed for automated video preview generation. Example methods may include determining video content, determining a first shot transition, a second shot transition, a third shot transition, and a fourth shot transition in the video content, and determining that human speech is present during the first shot transition and the second shot transition. Example methods may include determining a first timestamp associated with the third shot transition, determining a second timestamp associated with the fourth shot transition, generating a first video preview of the video content, where the first video preview includes a segment of the video content from the first timestamp to the second timestamp, and causing presentation of the first video preview, where the first video preview does not include a segment of the video content between the first shot transition and the second shot transition.
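The selection logic in this abstract can be read as: discard shot transitions that fall inside speech, then cut the preview between two remaining transitions. The sketch below is one hedged interpretation; it assumes shot-transition timestamps and speech intervals are already detected, and the names `in_speech` and `pick_preview` and the `min_len` threshold are hypothetical.

```python
def in_speech(t, speech):
    """True if timestamp t lies inside any (start, end) speech interval."""
    return any(start <= t <= end for start, end in speech)


def pick_preview(transitions, speech, min_len=5.0):
    """Return (start, end) of the first preview whose boundaries avoid speech.

    Transitions that coincide with speech are skipped so the preview does not
    begin or end mid-sentence. Returns None when no pair is long enough.
    """
    clean = [t for t in sorted(transitions) if not in_speech(t, speech)]
    for start, end in zip(clean, clean[1:]):
        if end - start >= min_len:
            return (start, end)
    return None
```

With transitions at 12, 20, 31, and 45 seconds and speech covering the first two, the function returns the span from the third to the fourth transition, which matches the example sequence in the abstract.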
-
Publication No.: US10917704B1
Publication date: 2021-02-09
Application No.: US16680825
Application date: 2019-11-12
Applicant: Amazon Technologies, Inc.
Inventor: Muhammad Raffay Hamid, Kewen Chen, Anne TuAnh Thanh Thuy Ho, Guy Friedel, Arun Velayudhan Pillai, Dhaval Damani, Jacob William Jensen, Zuzanna Maria Stepniakowska Coggins, Maciej Tadeusz Golonka, Anantha Krishna Hodrali Srinivasa Bhatta
IPC: H04N21/8549, G10L25/78, H04N21/475, H04N21/44, H04N21/442
Abstract: Systems, methods, and computer-readable media are disclosed for automated video preview generation. Example methods may include determining video content, determining a first shot transition, a second shot transition, a third shot transition, and a fourth shot transition in the video content, and determining that human speech is present during the first shot transition and the second shot transition. Example methods may include determining a first timestamp associated with the third shot transition, determining a second timestamp associated with the fourth shot transition, generating a first video preview of the video content, where the first video preview includes a segment of the video content from the first timestamp to the second timestamp, and causing presentation of the first video preview, where the first video preview does not include a segment of the video content between the first shot transition and the second shot transition.
-
Publication No.: US12073625B1
Publication date: 2024-08-27
Application No.: US18223487
Application date: 2023-07-18
Applicant: Amazon Technologies, Inc.
Inventor: Najmeh Sadoughi Nourabadi, Kewen Chen, Tu Anh Ho, Christina Botkins, Dongqing Zhang, Muhammad Raffay Hamid
IPC: G06V20/40, G06F16/71, G06F16/735, G06F16/75, G06N20/00
Abstract: Systems and methods are provided herein for generating optimized video segments. A derivative video segment (e.g., a scene) can be identified from derivative video content (e.g., a movie trailer). The segment may be used as a query to search video content (e.g., the movie) for the segment. Once found, an optimized video segment may be generated from the video content. The optimized video segment may have a different start time and/or end time than those corresponding to the original segment. Once optimized, the video segment may be presented to a user or stored for subsequent content recommendations.
-
Publication No.: US11869537B1
Publication date: 2024-01-09
Application No.: US17523777
Application date: 2021-11-10
Applicant: Amazon Technologies, Inc.
Inventor: Mayank Sharma, Sandeep Joshi, Muhammad Raffay Hamid
CPC classification number: G10L25/84, G10L15/063, G10L15/16, G10L15/22, G10L25/18
Abstract: Systems, methods, and computer-readable media are disclosed for language agnostic automated voice activity detection. Example methods may include determining an audio file associated with video content, generating audio segments using the audio file, the audio segments including a first segment and a second segment, and determining that the first segment includes first voice activity. Methods may include determining that the second segment comprises second voice activity, determining that voice activity is present between a first timestamp associated with the first segment and a second timestamp associated with the second segment, and generating text data representing the voice activity that is present between the first timestamp and the second timestamp.
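The step of turning per-segment decisions into a voiced time range can be sketched as interval merging. The sketch assumes each audio segment has already been classified as voiced or not by some detector (not shown); the function name `merge_voiced` and the `max_gap` tolerance are hypothetical.

```python
def merge_voiced(segments, max_gap=0.3):
    """Merge voiced segments into continuous (start, end) spans.

    segments: iterable of (start, end, has_voice) tuples, times in seconds.
    Voiced segments separated by at most max_gap seconds are joined.
    """
    spans = []
    for start, end, has_voice in sorted(segments):
        if not has_voice:
            continue
        if spans and start - spans[-1][1] <= max_gap:
            # Extend the previous span instead of opening a new one.
            spans[-1] = (spans[-1][0], max(spans[-1][1], end))
        else:
            spans.append((start, end))
    return spans
```

Each resulting span corresponds to the "first timestamp" and "second timestamp" pair in the abstract, and could be written out as text data such as an empty subtitle cue marking where dialog occurs.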
-
Publication No.: US11790695B1
Publication date: 2023-10-17
Application No.: US17322753
Application date: 2021-05-17
Applicant: Amazon Technologies, Inc.
Inventor: Abhinav Aggarwal, Yash Pandya, Laxmi Shivaji Ahire, Lokesh Amarnath Ravindranathan, Manivel Sethu, Muhammad Raffay Hamid
IPC: G06K9/00, H04N21/44, G06F16/78, G06F16/75, G06F16/783, G06V40/16, G06V20/40, G06F18/23, G06F18/21
CPC classification number: G06V40/173, G06F18/2178, G06F18/23, G06V20/40, G06V40/179
Abstract: Devices, systems, and methods are provided for enhanced video annotations using image analysis. A method may include identifying, by a first device, first faces of first video frames, and second faces of second video frames. The method may include determining a first score for the first video frames, the first score indicative of a first number of faces to label, the first number of faces represented by the first video frames, and determining a second score for the second video frames, the second score indicative of a second number of faces to label. The method may include selecting the first video frames for face labeling, and receiving a first face label for the first face. The method may include generating a second face label for the second faces. The method may include sending the first face label and the second face label to a second device for presentation.
-
Publication No.: US11617008B1
Publication date: 2023-03-28
Application No.: US17218009
Application date: 2021-03-30
Applicant: Amazon Technologies, Inc.
Inventor: Tarun Gupta, Mayank Sharma, Xiang Hao, Muhammad Raffay Hamid, Zhitao Qiu
IPC: H04N21/439, G06N20/00, H04N21/466, H04N21/475
Abstract: Methods, systems, and computer-readable media for media classification using local and global audio features are disclosed. A media classification system determines local features of an audio input using an audio event detector model that is trained to detect a plurality of audio event classes descriptive of objectionable content. The local features are extracted using maximum values of audio event scores for individual audio event classes. The media classification system determines one or more global features of the audio input using the audio event detector model. The global feature(s) are extracted using averaging of clip-level descriptors of a plurality of clips of the audio input. The media classification system determines a content-based rating for media comprising the audio input based (at least in part) on the local features of the audio input and based (at least in part) on the global feature(s) of the audio input.
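The two pooling operations named in this abstract (maximum of audio event scores for local features, averaging of clip-level descriptors for global features) can be sketched directly. The input layout (one row per clip) and the function names are assumptions for illustration; the audio event detector and the downstream rating classifier are not shown.

```python
def local_features(event_scores):
    """Max-pool per audio event class across clips.

    event_scores[i][k] is the score of event class k in clip i.
    """
    return [max(column) for column in zip(*event_scores)]


def global_features(clip_descriptors):
    """Average clip-level descriptors across clips, dimension by dimension."""
    n = len(clip_descriptors)
    return [sum(column) / n for column in zip(*clip_descriptors)]


def combined_features(event_scores, clip_descriptors):
    """Concatenate local and global features into one input vector."""
    return local_features(event_scores) + global_features(clip_descriptors)
```

Max pooling preserves a single strong event (for example, one brief loud incident) that averaging would dilute, while the averaged descriptor summarizes the overall character of the audio; a rating model would consume the concatenated vector.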