-
Publication number: US20210073589A1
Publication date: 2021-03-11
Application number: US16821315
Filing date: 2020-03-17
Applicant: Apple Inc.
Inventor: Atila Orhon , Marco Zuliani , Vignesh Jagadeesh
Abstract: Training a network for image processing with temporal consistency includes obtaining un-annotated frames from a video feed. A pretrained network is applied to the first frame of a first frame set comprising a plurality of frames to obtain a first prediction, wherein the pretrained network is pretrained for a first image processing task. A current version of the pretrained network is applied to each frame of the first frame set to obtain a first prediction. A content loss term is determined, based on the first prediction and a current prediction for the frame, based on the current network. A temporal consistency loss term is also determined based on a determined consistency of pixels within each frame of the first frame set. The pretrained network may be refined based on the content loss term and the temporal term to obtain a refined network.
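The two loss terms named in this abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how either term is computed, so the squared-error form, the frame-to-frame difference used for temporal consistency, and the names `content_loss`, `temporal_consistency_loss`, `total_loss`, and `temporal_weight` are all assumptions made for illustration. Predictions are flattened to plain lists of per-pixel values.

```python
def content_loss(reference_pred, current_pred):
    """Mean squared error between the frozen pretrained network's prediction
    and the current network's prediction for the same frame (assumed form)."""
    n = len(reference_pred)
    return sum((r - c) ** 2 for r, c in zip(reference_pred, current_pred)) / n


def temporal_consistency_loss(frame_preds):
    """Mean squared difference between the current network's predictions for
    consecutive frames. Assumes pixels are already aligned across frames;
    no motion compensation is attempted in this sketch."""
    total, count = 0.0, 0
    for prev, curr in zip(frame_preds, frame_preds[1:]):
        for p, c in zip(prev, curr):
            total += (p - c) ** 2
            count += 1
    return total / count if count else 0.0


def total_loss(reference_pred, frame_preds, temporal_weight=0.5):
    """Content term on the first frame plus a weighted temporal term.
    temporal_weight is a hypothetical hyperparameter."""
    content = content_loss(reference_pred, frame_preds[0])
    temporal = temporal_consistency_loss(frame_preds)
    return content + temporal_weight * temporal
```

Minimizing a combined objective of this kind pulls the refined network toward the pretrained network's output on the first frame while penalizing flicker between neighboring frames.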
-
Publication number: US12052315B2
Publication date: 2024-07-30
Application number: US17129579
Filing date: 2020-12-21
Applicant: Apple Inc.
Inventor: Stephen Cosman , Kalu Onuka Kalu , Marcelo Lotif Araujo , Michael Chatzidakis , Thi Hai Van Do , Alexis Hugo Louis Durocher , Guillaume Tartavel , Sowmya Gopalan , Vignesh Jagadeesh , Abhishek Bhowmick , John Duchi , Julien Freudiger , Gaurav Kapoor , Ryan M. Rogers
IPC: H04L67/1097 , G06F16/2457 , G06F16/438 , G06F16/44 , G06F18/214 , G06F21/62 , G06N3/063 , G06N20/00 , G06V10/774 , G06V10/82 , H04L67/00
CPC classification number: H04L67/1097 , G06F16/24578 , G06F16/438 , G06F16/447 , G06F18/2148 , G06F21/6254 , G06N3/063 , G06N20/00 , G06V10/7747 , G06V10/82 , H04L67/34
Abstract: Embodiments described herein provide for a non-transitory machine-readable medium storing instructions to cause one or more processors to receive, at a client device, a machine learning model from a server, detect a usage pattern for a content item, store an association between the content item and the detected usage pattern in local data, train the machine learning model using local data for the content item with the detected usage pattern to generate a trained machine learning model, generate an update for the machine learning model, privatize the update for the machine learning model, and transmit the privatized update for the machine learning model to the server.
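The client-side steps "generate an update" and "privatize the update" can be sketched as follows. The abstract does not state which privatization mechanism is used; this sketch assumes norm clipping followed by Gaussian noise, a common choice in differentially private learning. The function names and the `clip_norm` and `noise_std` parameters are hypothetical.

```python
import math
import random


def model_update(old_weights, new_weights):
    """Update = locally trained weights minus the weights received from the server."""
    return [n - o for o, n in zip(old_weights, new_weights)]


def privatize(update, clip_norm=1.0, noise_std=0.1, rng=None):
    """Clip the update to an L2 norm of at most clip_norm, then add
    zero-mean Gaussian noise to every coordinate (assumed mechanism)."""
    rng = rng or random.Random()
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [u * scale + rng.gauss(0.0, noise_std) for u in update]


# Example: only the privatized update would be transmitted to the server.
update = model_update([0.0, 0.0], [3.0, 4.0])
private_update = privatize(update, clip_norm=1.0, noise_std=0.1,
                           rng=random.Random(0))
```

Clipping bounds how much any single device can influence the aggregate, and the noise masks the individual contribution; the raw local data and the unclipped update stay on the device.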
-
Publication number: US11663806B2
Publication date: 2023-05-30
Application number: US17659377
Filing date: 2022-04-15
Applicant: Apple Inc.
Inventor: Vignesh Jagadeesh , Yingjun Bai , Guillaume Tartavel , Gregory Guyomarc'h
IPC: G06F18/214 , G06V10/20 , G06V10/46 , G06V20/64
CPC classification number: G06V10/255 , G06F18/214 , G06V10/462 , G06V20/64
Abstract: Various methods for utilizing saliency heatmaps are described. The methods include obtaining image data corresponding to an image of a scene, obtaining a saliency heatmap for the image of the scene based on a saliency network, wherein the saliency heatmap indicates a likelihood of saliency for a corresponding portion of the scene, and manipulating the image data based on the saliency heatmap. In embodiments, the saliency heatmap may be produced using a trained machine learning model. The saliency heatmap may be used for various image processing tasks, such as determining which portion(s) of a scene to base an image capture device's autofocus, auto exposure, and/or white balance operations upon. According to some embodiments, one or more bounding boxes may be generated based on the saliency heatmap, e.g., using an optimization operation, which bounding box(es) may be used to assist or enhance the performance of various image processing tasks.
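One simple way to turn a saliency heatmap into a bounding box is sketched below. The abstract mentions an optimization operation without describing it; this sketch substitutes plain thresholding, which is a deliberately simpler technique and not the one claimed. The function name `bounding_box` and the `threshold` parameter are hypothetical, and the heatmap is assumed to be a list of rows of values in [0, 1].

```python
def bounding_box(heatmap, threshold=0.5):
    """Return (top, left, bottom, right), inclusive, of the smallest box that
    contains every cell whose saliency is at least `threshold`, or None if
    no cell qualifies."""
    rows = [r for r, row in enumerate(heatmap)
            if any(v >= threshold for v in row)]
    cols = [c for row in heatmap
            for c, v in enumerate(row) if v >= threshold]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


heatmap = [
    [0.1, 0.2, 0.1],
    [0.1, 0.9, 0.7],
    [0.0, 0.6, 0.1],
]
box = bounding_box(heatmap)  # (1, 1, 2, 2)
```

A box of this kind could then serve as the region of interest for autofocus, auto exposure, or white balance, which are the uses the abstract lists.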
-
Publication number: US11250041B2
Publication date: 2022-02-15
Application number: US16147444
Filing date: 2018-09-28
Applicant: Apple Inc.
Inventor: Vivek Kumar Rangarajan Sridhar , Xingwen Xu , Vignesh Jagadeesh
Abstract: A device implementing a system for expanded search includes a processor configured to identify plural words, and generate, for each word of the plural words, a word vector based on a proximity of the word relative to other words of the plural words, the word vector comprising plural dimensions. The processor is further configured to create a compressed word vector structure comprising clusters of subsets of the plural dimensions across the word vectors, each cluster including similar values of the respective dimensions, convert the word vectors to points on at least one plane, and partition the at least one plane into nested groupings of the points based on a threshold number of points per nested grouping. The processor is further configured to create a tree look-up structure of the nested groupings, and provide the compressed word vector structure and the tree look-up structure to a client device.
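The step of partitioning a plane into nested groupings with a threshold number of points per grouping resembles a quadtree, and can be sketched that way. This is an assumption: the abstract does not name the data structure, and `build_tree`, `max_points`, and `max_depth` are hypothetical names. Word vectors are assumed to have already been converted to 2D points.

```python
def build_tree(points, bounds, max_points=2, depth=0, max_depth=16):
    """Recursively split `bounds` = (x0, y0, x1, y1) into four quadrants
    until each leaf holds at most `max_points` points."""
    if len(points) <= max_points or depth >= max_depth:
        return {"bounds": bounds, "points": list(points)}
    x0, y0, x1, y1 = bounds
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    quadrants = [(x0, y0, mx, my), (mx, y0, x1, my),
                 (x0, my, mx, y1), (mx, my, x1, y1)]
    buckets = [[] for _ in quadrants]
    for x, y in points:
        # Bit 0 selects the right half, bit 1 selects the upper half.
        idx = (1 if x >= mx else 0) + (2 if y >= my else 0)
        buckets[idx].append((x, y))
    children = [build_tree(b, q, max_points, depth + 1, max_depth)
                for b, q in zip(buckets, quadrants)]
    return {"bounds": bounds, "children": children}


def leaves(node):
    """Yield every leaf grouping in the tree."""
    if "children" in node:
        for child in node["children"]:
            yield from leaves(child)
    else:
        yield node


tree = build_tree([(0.1, 0.1), (0.2, 0.2), (0.9, 0.9)], (0.0, 0.0, 1.0, 1.0))
```

Looking up a query point then only requires descending through the quadrants that contain it, which keeps the search on the client device cheap compared with scanning every word vector.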
-
Publication number: US11196943B2
Publication date: 2021-12-07
Application number: US16653704
Filing date: 2019-10-15
Applicant: Apple Inc.
Inventor: Shuang Gao , Vasilios E. Anton , Robert A. Bailey , Emilie Kim , Vignesh Jagadeesh , Paul Schneider , Piotr Stanczyk , Arwen Bradley , Jason Klivington , Jacques Gasselin De Richebourg , Joe Triscari , Sébastien Beysserie , Yang Yang , Afshin Dehghan , Rudolph van der Merwe
Abstract: Techniques are disclosed for editing captured media to overcome operational difficulties that may arise during capture operations. According to these techniques, content may be captured with a pair of cameras, a first camera having a wider field of view than a second camera. Object(s) may be detected from captured content from the wider field of view camera. The captured content may be processed from the wider field of view camera in a location of at least one detected object. Typically, operators may attempt to frame content using content from the narrower field of view camera. As a result, operators may be unaware that desired content is captured using a second, wider field of view camera. Results from the processed wider field of view data may be proposed to operators for review and, if desired, retention.
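Processing the wide field of view content "in a location of at least one detected object" can be pictured as cropping a window around the detection. The sketch below is one hypothetical reading, not the disclosed technique: it centers a fixed-size crop on the detected box and clamps it to the frame. The name `crop_window` and all parameters are assumptions, and the crop is assumed to be no larger than the frame.

```python
def crop_window(frame_w, frame_h, box, out_w, out_h):
    """Return (left, top, right, bottom) of an out_w x out_h window centered
    on the detected object's box (x0, y0, x1, y1) and clamped so that it
    stays inside the wide-camera frame."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    left = int(cx - out_w / 2)
    top = int(cy - out_h / 2)
    left = max(0, min(left, frame_w - out_w))
    top = max(0, min(top, frame_h - out_h))
    return (left, top, left + out_w, top + out_h)


# An object near the right edge of a 1000x500 wide frame.
window = crop_window(1000, 500, (900, 100, 1000, 200), 400, 300)
```

The resulting window could be rendered from the wide camera's content and proposed to the operator alongside the narrower camera's framing.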