Ensemble of machine learning models for automatic scene change detection

    Publication Number: US11776273B1

    Publication Date: 2023-10-03

    Application Number: US17107514

    Filing Date: 2020-11-30

    CPC classification number: G06V20/49 G06F18/213 G06N5/04 G06N20/20 G10L25/78

    Abstract: Techniques for automatic scene change detection are described. As one example, a computer-implemented method includes: receiving a request to train, on a training dataset of videos whose labels indicate scene changes, an ensemble of machine learning models to detect a scene change in a video; partitioning each video file of the training dataset into a plurality of shots; training the ensemble of machine learning models into a trained ensemble based at least in part on the plurality of shots of the training dataset and the labels that indicate scene changes; receiving an inference request for an input video; partitioning the input video into a plurality of shots; generating, by the trained ensemble of machine learning models, an inference of one or more scene changes in the input video based at least in part on the plurality of shots of the input video; and transmitting the inference to a client application or to a storage location.
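The claimed pipeline (partition a video into shots, then infer scene changes with a trained ensemble) can be sketched at a high level. This is a minimal, hypothetical illustration; the shot-boundary heuristic, the toy "models", and all thresholds are assumptions for illustration, not details from the patent:

```python
def partition_into_shots(frame_diffs, threshold=0.5):
    """Split a video into shots, given per-frame difference scores.

    A new shot starts wherever the inter-frame difference exceeds the
    threshold (an assumed, simplistic boundary cue). Returns a list of
    (start, end) frame-index pairs, end-exclusive.
    """
    boundaries = [0]
    for i, d in enumerate(frame_diffs, start=1):
        if d > threshold:
            boundaries.append(i)
    boundaries.append(len(frame_diffs) + 1)  # one more frame than diffs
    return [(boundaries[i], boundaries[i + 1])
            for i in range(len(boundaries) - 1)]

def ensemble_scene_changes(shot_features, models, vote_threshold=0.5):
    """Average per-boundary scores from several models; a scene change
    is inferred wherever the mean score exceeds the vote threshold."""
    changes = []
    for i in range(len(shot_features) - 1):
        scores = [m(shot_features[i], shot_features[i + 1]) for m in models]
        if sum(scores) / len(scores) > vote_threshold:
            changes.append(i + 1)  # scene change begins at shot i + 1
    return changes
```

Here each "model" is simply a callable scoring the dissimilarity of adjacent shot features; in the patented method the ensemble members would be learners trained on the labeled shots.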

    Language agnostic drift correction

    Publication Number: US20230282006A1

    Publication Date: 2023-09-07

    Application Number: US18175044

    Filing Date: 2023-02-27

    Abstract: Systems, methods, and computer-readable media are disclosed for language-agnostic subtitle drift detection and correction. A method may include determining subtitles and/or captions from media content (e.g., videos), the subtitles and/or captions corresponding to dialog in the media content. The subtitles may be broken into segments, which may be analyzed to determine a likelihood of drift (e.g., a likelihood that the subtitles are out of synchronization with the dialog in the media content) for each segment. For segments with a high likelihood of drift, the subtitles may be incrementally adjusted to determine an adjustment that eliminates and/or reduces the drift, and the drift in the segment may be corrected based on the drift amount detected. A linear regression model and/or blocks annotated by human operators may be used to further optimize drift correction.
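The incremental-adjustment step described above can be sketched as a simple offset search: shift the subtitle timestamps over a range of candidate offsets and keep the one that best aligns them with detected speech times. The cost function, search range, and step size below are assumptions, not values from the patent:

```python
def best_offset(subtitle_times, speech_times,
                search_range=(-5.0, 5.0), step=0.5):
    """Incrementally shift subtitle timestamps (seconds) and return the
    offset that minimizes total distance to the nearest detected speech
    time -- a crude stand-in for the patent's drift-reduction search."""
    def cost(offset):
        return sum(min(abs(t + offset - s) for s in speech_times)
                   for t in subtitle_times)

    offsets = []
    o = search_range[0]
    while o <= search_range[1] + 1e-9:
        offsets.append(round(o, 6))
        o += step
    return min(offsets, key=cost)  # first minimum wins on ties
```

For subtitles that lag the audio by two seconds, `best_offset([1.0, 3.0, 5.0], [3.0, 5.0, 7.0])` recovers the `2.0`-second correction. Because the alignment uses only timestamps, not transcript text, the approach is language agnostic, which matches the spirit of the claim.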

    Language agnostic automated voice activity detection

    Publication Number: US11869537B1

    Publication Date: 2024-01-09

    Application Number: US17523777

    Filing Date: 2021-11-10

    CPC classification number: G10L25/84 G10L15/063 G10L15/16 G10L15/22 G10L25/18

    Abstract: Systems, methods, and computer-readable media are disclosed for language agnostic automated voice activity detection. Example methods may include determining an audio file associated with video content, generating audio segments using the audio file, the audio segments including a first segment and a second segment, and determining that the first segment includes first voice activity. Methods may include determining that the second segment comprises second voice activity, determining that voice activity is present between a first timestamp associated with the first segment and a second timestamp associated with the second segment, and generating text data representing the voice activity that is present between the first timestamp and the second timestamp.
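The segment-then-detect flow in the abstract can be sketched with a toy energy-based detector: split raw samples into fixed-length segments, flag segments whose energy exceeds a threshold as voice activity, and merge adjacent voiced segments into timestamped spans. The energy heuristic and all parameters are illustrative assumptions; the patent's detector may be a trained model (per the G10L15/16 classification):

```python
def segment_energies(samples, segment_len):
    """Split raw samples into fixed-length segments and compute each
    segment's mean absolute amplitude (a crude, language-agnostic
    energy cue)."""
    return [sum(abs(x) for x in samples[i:i + segment_len]) / segment_len
            for i in range(0, len(samples) - segment_len + 1, segment_len)]

def voiced_spans(energies, threshold, segment_dur):
    """Flag segments whose energy exceeds the threshold as voice
    activity and merge adjacent voiced segments into (start, end)
    timestamp spans, in seconds."""
    spans = []
    for i, e in enumerate(energies):
        if e > threshold:
            start, end = i * segment_dur, (i + 1) * segment_dur
            if spans and abs(spans[-1][1] - start) < 1e-9:
                spans[-1] = (spans[-1][0], end)  # contiguous: extend span
            else:
                spans.append((start, end))
    return spans
```

The merged spans correspond to the abstract's "voice activity ... between a first timestamp and a second timestamp"; the spans would then be handed to a transcription step to generate the text data.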

    Media classification using local and global audio features

    Publication Number: US11617008B1

    Publication Date: 2023-03-28

    Application Number: US17218009

    Filing Date: 2021-03-30

    Abstract: Methods, systems, and computer-readable media for media classification using local and global audio features are disclosed. A media classification system determines local features of an audio input using an audio event detector model that is trained to detect a plurality of audio event classes descriptive of objectionable content. The local features are extracted using maximum values of audio event scores for individual audio event classes. The media classification system determines one or more global features of the audio input using the audio event detector model. The global feature(s) are extracted using averaging of clip-level descriptors of a plurality of clips of the audio input. The media classification system determines a content-based rating for media comprising the audio input based (at least in part) on the local features of the audio input and based (at least in part) on the global feature(s) of the audio input.
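The local/global feature extraction described here (per-class maxima for local features, averaged clip-level descriptors for global features) can be sketched directly. The event-class names, weights, and the thresholded rating rule below are hypothetical stand-ins for the trained audio event detector and rating logic the patent describes:

```python
def local_features(clip_scores):
    """Local features: the maximum score per audio event class across
    all clips (clip_scores is a list of {class: score} dicts, one per
    clip, as an assumed detector output shape)."""
    classes = clip_scores[0].keys()
    return {c: max(clip[c] for clip in clip_scores) for c in classes}

def global_features(clip_descriptors):
    """Global features: element-wise mean of clip-level descriptor
    vectors over the whole audio input."""
    n = len(clip_descriptors)
    return [sum(d[i] for d in clip_descriptors) / n
            for i in range(len(clip_descriptors[0]))]

def content_rating(local, global_feat, weights, threshold):
    """Toy rating rule (assumption): a weighted sum of the per-class
    event maxima plus the mean of the global descriptor, thresholded
    into a coarse content-based rating."""
    score = sum(weights[c] * v for c, v in local.items())
    score += sum(global_feat) / len(global_feat)
    return "mature" if score > threshold else "general"
```

The max-pooling captures whether an objectionable event occurs anywhere in the audio, while the averaged descriptors summarize the overall character of the input; combining both is the core idea the claim recites.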
