-
Publication No.: US10582125B1
Publication date: 2020-03-03
Application No.: US14727782
Filing date: 2015-06-01
Applicant: Amazon Technologies, Inc.
Inventor: Ross David Roessler, Matthew Alan Townsend, Yinfei Yang, Jim Oommen Thomas, Deon Poncini, William Evan Welbourne, Geoff Hunter Donaldson, Paul Aksenti Savastinuk, Cheng-Hao Kuo
Abstract: A video capture device may include multiple cameras that simultaneously capture video data. The video capture device and/or one or more remote computing resources may stitch the video data captured by the multiple cameras to generate stitched video data that corresponds to 360° video. The remote computing resources may apply one or more algorithms to the stitched video data to identify one or more frames that depict content that is likely to be of interest to a user. The video capture device and/or the remote computing resources may generate one or more images from the one or more frames, and may send the one or more images to the user.
-
Publication No.: US10277813B1
Publication date: 2019-04-30
Application No.: US14751024
Filing date: 2015-06-25
Applicant: Amazon Technologies, Inc.
Inventor: Jim Oommen Thomas, Paul Aksenti Savastinuk, Cheng-Hao Kuo, Tsz Ho Yu, Ross David Roessler, William Evan Welbourne, Yinfei Yang
Abstract: A viewing device, such as a virtual reality headset, allows a user to view a panoramic scene captured by one or more video capture devices that may include multiple cameras that simultaneously capture 360° video data. The viewing device may display the panoramic scene in real time and change the display in response to moving the viewing device and/or changing perspectives by switching to video data being captured by a different video capture device within the environment. Moreover, multiple video capture devices located within an environment can be used to create a three-dimensional representation of the environment that allows a user to explore the three-dimensional space while viewing the environment in real time.
-
Publication No.: US10178301B1
Publication date: 2019-01-08
Application No.: US14750895
Filing date: 2015-06-25
Applicant: Amazon Technologies, Inc.
Inventor: William Evan Welbourne, Ross David Roessler, Cheng-Hao Kuo, Jim Oommen Thomas, Paul Aksenti Savastinuk, Yinfei Yang
Abstract: Devices, systems and methods are disclosed for improving facial recognition and/or speaker recognition models by using results obtained from one model to assist in generating results from the other model. For example, a device may perform facial recognition for image data to identify users and may use the results of the facial recognition to assist in speaker recognition for corresponding audio data. Alternatively or additionally, the device may perform speaker recognition for audio data to identify users and may use the results of the speaker recognition to assist in facial recognition for corresponding image data. As a result, the device may identify users in video data that are not included in the facial recognition model and may identify users in audio data that are not included in the speaker recognition model. The facial recognition and/or speaker recognition models may be updated during run-time and/or offline using post-processed data.
-
Publication No.: US12293040B1
Publication date: 2025-05-06
Application No.: US18307397
Filing date: 2023-04-26
Applicant: Amazon Technologies, Inc.
Inventor: Shuang Gao, Jim Oommen Thomas, Jingyi Zhang, Songyao Jiang, Junwu Luo
IPC: G06F3/041, G06F3/0354
Abstract: A stylus provides input via a touchscreen comprising a touch sensor and a display. Latency between placement of a stylus tip and corresponding presentation of visual indicia on the display is reduced or eliminated by determining a predicted path of the stylus tip during a stroke. Visual indicia are presented on the display based on the predicted path. Inputs from the touch sensor may include hover events associated with detection of the tip while not in contact with the touchscreen and touch events associated with presence of the tip on the touchscreen. A machine learning network may be trained to determine the predicted path. A portion of the network may be trained to accept dynamic-length sequences of events and generate fixed-length sequences, reducing subsequent network complexity. A hand may be detected and used to determine the predicted path. The end of a stroke may be predicted, reducing overshoot.
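Illustrative sketch (not from the patent): the abstract describes a trained machine learning network that predicts the stylus path. The Python below substitutes a much simpler constant-velocity extrapolation, only to show what drawing ahead along a predicted path could look like. The function name `predict_path`, the `(t_ms, x, y)` event layout, and the 4 ms step are all assumptions made for this example.

```python
def predict_path(events, horizon_ms, step_ms=4.0):
    """Extrapolate future stylus-tip positions from the last two events.

    events: list of (t_ms, x, y) tuples, oldest first (hypothetical layout).
    Returns a list of predicted (t_ms, x, y) points covering horizon_ms.
    This is plain constant-velocity extrapolation, a stand-in for the
    trained network described in the abstract.
    """
    if len(events) < 2:
        return []  # not enough history to estimate a velocity
    (t0, x0, y0), (t1, x1, y1) = events[-2], events[-1]
    dt = t1 - t0
    if dt <= 0:
        return []  # timestamps must be strictly increasing
    vx = (x1 - x0) / dt  # velocity in pixels per millisecond
    vy = (y1 - y0) / dt
    n_steps = int(horizon_ms // step_ms)
    return [
        (t1 + k * step_ms, x1 + vx * k * step_ms, y1 + vy * k * step_ms)
        for k in range(1, n_steps + 1)
    ]


if __name__ == "__main__":
    stroke = [(0.0, 0.0, 0.0), (4.0, 2.0, 1.0)]
    for point in predict_path(stroke, horizon_ms=8.0):
        print(point)
```

A real predictor would also need to shorten the horizon near the end of a stroke, which is the overshoot problem the abstract mentions; this sketch does not attempt that.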
-
Publication No.: US11367306B1
Publication date: 2022-06-21
Application No.: US16909074
Filing date: 2020-06-23
Applicant: Amazon Technologies, Inc.
Inventor: Cheng-Hao Kuo, Anuja Shantaram Dawane, Che-Chun Su, Jingjing Zheng, Jim Oommen Thomas
Abstract: An autonomous mobile device (AMD) or other device may perform various tasks during operation. The AMD includes a camera to acquire an image. Some tasks, such as presenting information on a display screen or a video call, may involve the AMD determining whether a user is engaged with the AMD. The AMD may move a component, such as the camera or the display screen, to provide a best experience for an engaged user. Images from the camera are processed to determine attributes of the user, such as yaw of the face of the user, pitch of the face of the user, distance from the camera, and so forth. Based on the values of these attributes, a user engagement score is determined. The score may be used to select a particular user from many users in the image, or to otherwise facilitate operation of the AMD.
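Illustrative sketch (not from the patent): the abstract says an engagement score is determined from attributes such as face yaw, face pitch, and distance, but gives no formula. The Python below assumes one plausible form, a product of three terms that each fall linearly from 1 to 0, and then selects the highest-scoring user. The names `FaceAttributes`, `engagement_score`, `most_engaged` and the limits (60 degrees, 5 m) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FaceAttributes:
    user_id: str
    yaw_deg: float      # 0 means the face points straight at the camera
    pitch_deg: float    # 0 means the face is level with the camera
    distance_m: float   # estimated distance from the camera


def engagement_score(face, max_angle_deg=60.0, max_distance_m=5.0):
    """Return a score in [0, 1]; higher means more likely engaged.

    Assumed scoring rule: each attribute contributes a term that drops
    linearly to 0 at its limit, and the terms are multiplied together.
    """
    yaw_term = max(0.0, 1.0 - abs(face.yaw_deg) / max_angle_deg)
    pitch_term = max(0.0, 1.0 - abs(face.pitch_deg) / max_angle_deg)
    distance_term = max(0.0, 1.0 - face.distance_m / max_distance_m)
    return yaw_term * pitch_term * distance_term


def most_engaged(faces):
    """Pick the face with the highest score, or None if there are no faces."""
    return max(faces, key=engagement_score, default=None)


if __name__ == "__main__":
    faces = [
        FaceAttributes("facing_camera", yaw_deg=0.0, pitch_deg=0.0, distance_m=1.0),
        FaceAttributes("looking_away", yaw_deg=45.0, pitch_deg=10.0, distance_m=1.0),
    ]
    print(most_engaged(faces).user_id)
```

Multiplying the terms means a face that is turned fully away scores zero however close it is; a weighted sum would be an equally valid assumption with different behavior.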
-
Publication No.: US10027883B1
Publication date: 2018-07-17
Application No.: US14307491
Filing date: 2014-06-18
Applicant: Amazon Technologies, Inc.
Inventor: Cheng-Hao Kuo, Jim Oommen Thomas, Tianyang Ma, Stephen Vincent Mangiat, Sisil Sanjeev Mehta, Ambrish Tyagi, Amit Kumar Agrawal, Kah Kuen Fu, Sharadh Ramaswamy
Abstract: Various embodiments enable a primary user to be identified and tracked using stereo association and multiple tracking algorithms. For example, a face detection algorithm can be run on each image captured by a respective camera independently. Stereo association can be performed to match faces between cameras. If the faces are matched and a primary user is determined, a face pair is created and used as the first data point in memory for initializing object tracking. Further, features of a user's face can be extracted and the change in position of these features between images can determine what tracking method will be used for that particular frame.
-
Publication No.: US09973711B2
Publication date: 2018-05-15
Application No.: US14753826
Filing date: 2015-06-29
Applicant: Amazon Technologies, Inc.
Inventor: Yinfei Yang, William Evan Welbourne, Ross David Roessler, Paul Aksenti Savastinuk, Cheng-Hao Kuo, Jim Oommen Thomas, Tsz Ho Yu
CPC classification number: H04N5/2628, G06K9/00711, G06K9/00751, G06K9/3233, G06T3/40, G11B27/031, G11B27/06, H04N5/23238
Abstract: Devices, systems and methods are disclosed for identifying content in video data and creating content-based zooming and panning effects to emphasize the content. Content may be detected and analyzed in the video data using computer vision or machine learning algorithms, or specified through a user interface. Panning and zooming controls may be associated with the content, with panning or zooming based on the location and size of the content within the video data. The device may determine a number of pixels associated with content and may frame the content to be a certain percentage of the edited video data, such as a close-up shot where a subject is displayed as 50% of the viewing frame. The device may identify an event of interest, may determine multiple frames associated with the event of interest and may pan and zoom between the multiple frames based on a size/location of the content within the multiple frames.
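Illustrative sketch (not from the patent): the framing arithmetic in the abstract, where content is made to fill a chosen percentage of the viewing frame, can be worked through in a few lines of Python. The sketch assumes the percentage refers to pixel area, that the crop keeps the source aspect ratio, and that content is given as an `(x, y, width, height)` bounding box. The function name `crop_for_fraction` is hypothetical.

```python
import math


def crop_for_fraction(frame_w, frame_h, box, fraction=0.5):
    """Return (left, top, width, height) of a crop window in which the
    content box covers roughly `fraction` of the crop's pixel area.

    Assumptions: area-based fraction, crop aspect ratio equal to the
    frame's, crop centered on the content and clamped to the frame.
    """
    x, y, w, h = box
    content_pixels = w * h
    aspect = frame_w / frame_h
    # Area the crop needs so that content_pixels / crop_area == fraction.
    crop_area = content_pixels / fraction
    # Solve width * height = crop_area with width / height = aspect,
    # never exceeding the full frame.
    crop_w = min(float(frame_w), math.sqrt(crop_area * aspect))
    crop_h = crop_w / aspect
    # Center on the content, then clamp so the crop stays inside the frame.
    cx, cy = x + w / 2, y + h / 2
    left = min(max(cx - crop_w / 2, 0.0), frame_w - crop_w)
    top = min(max(cy - crop_h / 2, 0.0), frame_h - crop_h)
    return left, top, crop_w, crop_h


if __name__ == "__main__":
    # A 320x180 subject in a 1920x1080 frame, framed as a close-up.
    print(crop_for_fraction(1920, 1080, (800, 400, 320, 180), fraction=0.5))
```

If the content box has a very different aspect ratio from the frame, an area-based crop may cut off part of the box; a practical version would enlarge the crop until the box fits. Panning and zooming between frames could then interpolate between successive crop windows.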
-
Publication No.: US09160993B1
Publication date: 2015-10-13
Application No.: US13945823
Filing date: 2013-07-18
Applicant: Amazon Technologies, Inc.
Inventor: Christopher John Lish, Geoffrey Scott Heller, Jim Oommen Thomas, Chang Yuan, Oleg Rybakov
CPC classification number: H04N9/3185, G06F3/0425, G06F3/0488, H04N5/23219, H04N5/23229, H04N5/23293, H04N9/3194
Abstract: Approaches enable the projection of one or more visual elements, such as one or more dynamically changing graphical elements, that can substantially bound, or otherwise at least partially surround or identify, an object recognized by a computing device. The computing device can project the graphical elements to collectively appear as a bounding element for the recognized/actionable object or object portion. As such, the graphical elements can appear as a bounding element that adorns, decorates, highlights, and/or emphasizes, etc., the recognized/actionable object or object portion. The graphical elements can be dynamic. For example, the graphical elements can be projected to move around individually over time, while still appearing to at least partially surround the recognized/actionable object or object portion. Further, the graphical elements can be used to improve various object recognition approaches.