Patent search ap:("Apple Inc.") AND inv:"Atila Orhon"

1.

Patent application
TRAINING TRANSFORMER NEURAL NETWORKS TO GENERATE PARAMETERS OF CONVOLUTIONAL NEURAL NETWORKS (Status: Active)

Publication No.: US20220391693A1

Publication date: 2022-12-08

Application No.: US17372367

Filing date: 2021-07-09

Applicant: Apple Inc.

Inventor: Atila Orhon

IPC: G06N3/08 , G06K9/00

Abstract: This application relates to the use of transformer neural networks to generate dynamic parameters for use in convolutional neural networks. In various embodiments, received image data is encoded and the encoded signal is sent to both a decoder and a transformer neural network. The decoder outputs decoded data for input into a convolutional neural network. The transformer outputs a set of dynamic parameter values for input into the convolutional neural network. The convolutional neural network may use the decoded data and the set of dynamic parameter values to output instance image data identifying a number of objects in an image. In various embodiments, the decoded data is also used to generate semantic data. The semantic data may be combined with the instance data to form panoptic image data.
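
The step in which the convolutional neural network consumes both the decoded data and the dynamic parameter values can be illustrated with a minimal pure-Python sketch. It assumes, which the abstract does not state, that each dynamic parameter set is the weight vector and bias of a 1x1 convolution followed by a sigmoid, with one parameter set per candidate object. The function names and the toy numbers standing in for the decoder and transformer outputs are hypothetical.

```python
import math

def dynamic_conv1x1(features, weights, bias):
    """Apply a 1x1 convolution whose parameters were produced at run time.

    features: H x W grid of C-dimensional feature vectors (the decoded data).
    weights:  C values; bias: scalar (one hypothetical dynamic parameter set).
    Returns an H x W grid of mask probabilities in (0, 1).
    """
    return [
        [1.0 / (1.0 + math.exp(-(sum(w * f for w, f in zip(weights, px)) + bias)))
         for px in row]
        for row in features
    ]

def instance_masks(features, param_sets, threshold=0.5):
    """Produce one binary mask per dynamic parameter set (assumed: one per object)."""
    masks = []
    for weights, bias in param_sets:
        probs = dynamic_conv1x1(features, weights, bias)
        masks.append([[1 if p > threshold else 0 for p in row] for row in probs])
    return masks

# Toy stand-ins for the decoder output (2x2 map, 2 channels) and for the
# transformer output (two parameter sets); real values would come from the networks.
decoded = [[[1.0, 0.0], [1.0, 0.0]],
           [[0.0, 1.0], [0.0, 1.0]]]
params = [([4.0, -4.0], 0.0),   # responds to channel 0
          ([-4.0, 4.0], 0.0)]   # responds to channel 1
masks = instance_masks(decoded, params)
```

In this toy case the first parameter set selects the top row and the second selects the bottom row, so the same convolution code yields a different mask for each generated parameter set.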

2.

Granted patent
Interactive image segmentation (Status: Active)

Publication No.: US11887310B2

Publication date: 2024-01-30

Application No.: US17078086

Filing date: 2020-10-22

Applicant: Apple Inc.

Inventor: Vignesh Jagadeesh , Atila Orhon

IPC: G06T7/00 , G06T7/11 , G06T7/10 , G06F17/18 , G06N20/00 , G06N3/047 , G06V10/762 , G06V10/77 , G06V10/82

CPC classification number: G06T7/11 , G06F17/18 , G06N3/047 , G06N20/00 , G06T7/10 , G06V10/763 , G06V10/7715 , G06V10/82

Abstract: A first subset of pixels of an image may be labeled with an object identifier based on user interactions with the image. Pixel data representing the pixels of the image may be passed through an embedding neural network model to generate pixel embedding vectors. A prototype embedding vector associated with the object identifier may be generated based on pixel embedding vectors corresponding to the first subset of pixels. For each pixel of a second subset of pixels of the image, a probability that the pixel should be labeled with the object identifier may be determined based on the prototype embedding vector and pixel embedding vectors corresponding to the second subset of pixels. Pixels of the second subset of pixels may be labeled with the object identifier based on the determined probabilities, and the pixels in the image may be segmented based on the pixels labeled with the object identifier.
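
The prototype-and-probability steps in this abstract can be sketched in a few lines of pure Python. The sketch assumes that the prototype is the mean of the labeled pixels' embedding vectors and that the probability is exp(-squared distance) to the prototype; the abstract specifies neither choice, and the function names, threshold, and toy embeddings are hypothetical.

```python
import math

def prototype(embeddings):
    """Mean of the embedding vectors of the user-labeled pixels (assumed rule)."""
    n = len(embeddings)
    return [sum(col) / n for col in zip(*embeddings)]

def membership_probability(embedding, proto):
    """Assumed similarity: exp(-squared distance), equal to 1.0 at the prototype."""
    d2 = sum((e - p) ** 2 for e, p in zip(embedding, proto))
    return math.exp(-d2)

def label_pixels(unlabeled, proto, object_id, threshold=0.5):
    """Return {pixel: object_id} for unlabeled pixels whose probability passes the threshold."""
    return {
        pixel: object_id
        for pixel, emb in unlabeled.items()
        if membership_probability(emb, proto) >= threshold
    }

labeled = [[1.0, 0.0], [0.8, 0.2]]     # embeddings of the pixels the user marked
unlabeled = {(0, 2): [0.9, 0.1],       # close to the prototype
             (5, 5): [-1.0, 2.0]}      # far from the prototype
proto = prototype(labeled)
labels = label_pixels(unlabeled, proto, object_id=1)
```

Only the pixel whose embedding lies near the prototype receives the object identifier; segmentation then follows from the set of labeled pixels.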

3.

Granted patent
Training transformer neural networks to generate parameters of convolutional neural networks (Status: Active)

Publication No.: US12033075B2

Publication date: 2024-07-09

Application No.: US17372367

Filing date: 2021-07-09

Applicant: Apple Inc.

Inventor: Atila Orhon

IPC: G06N3/08 , G06V20/10 , G06V20/40

CPC classification number: G06N3/08 , G06V20/10 , G06V20/41

Abstract: This application relates to the use of transformer neural networks to generate dynamic parameters for use in convolutional neural networks. In various embodiments, received image data is encoded and the encoded signal is sent to both a decoder and a transformer neural network. The decoder outputs decoded data for input into a convolutional neural network. The transformer outputs a set of dynamic parameter values for input into the convolutional neural network. The convolutional neural network may use the decoded data and the set of dynamic parameter values to output instance image data identifying a number of objects in an image. In various embodiments, the decoded data is also used to generate semantic data. The semantic data may be combined with the instance data to form panoptic image data.
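
The final step, combining semantic data with instance data to form panoptic image data, can be sketched as follows. The merge rule is an assumption, since the abstract does not give one: each pixel becomes a (class_id, instance_id) pair, instance_id 0 marks pixels that belong to no instance, and earlier masks win where masks overlap. All names are hypothetical.

```python
def merge_panoptic(semantic, instance_masks, instance_classes):
    """Combine a semantic label map with per-instance binary masks.

    semantic: H x W grid of class ids.
    instance_masks: list of H x W binary masks, one per detected object.
    instance_classes: class id for each mask.
    Returns an H x W grid of (class_id, instance_id) pairs.
    """
    # Start from the semantic map, with every pixel assigned to "no instance".
    panoptic = [[(cls, 0) for cls in row] for row in semantic]
    pairs = zip(instance_masks, instance_classes)
    for inst_id, (mask, cls) in enumerate(pairs, start=1):
        for y, row in enumerate(mask):
            for x, on in enumerate(row):
                # Assumed tie-break: a pixel keeps the first instance that claims it.
                if on and panoptic[y][x][1] == 0:
                    panoptic[y][x] = (cls, inst_id)
    return panoptic

semantic = [[7, 7, 7],
            [7, 3, 3]]
masks = [[[0, 0, 0], [0, 1, 0]],
         [[0, 0, 0], [0, 1, 1]]]
result = merge_panoptic(semantic, masks, instance_classes=[3, 3])
```

Here the two class-3 pixels end up in different instances, while the class-7 background pixels keep instance id 0.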

4.

Patent application
Method for Improving Temporal Consistency of Deep Neural Networks (Status: Active)

Publication No.: US20210073589A1

Publication date: 2021-03-11

Application No.: US16821315

Filing date: 2020-03-17

Applicant: Apple Inc.

Inventor: Atila Orhon , Marco Zuliani , Vignesh Jagadeesh

IPC: G06K9/62 , G06K9/00 , G06N3/04 , G06N3/08

Abstract: Training a network for image processing with temporal consistency includes obtaining un-annotated frames from a video feed. A pretrained network is applied to the first frame of a first frame set comprising a plurality of frames to obtain a first prediction, wherein the pretrained network is pretrained for a first image processing task. A current version of the pretrained network is applied to each frame of the first frame set to obtain a first prediction. A content loss term is determined based on the first prediction and a current prediction for the frame from the current network. A temporal consistency loss term is also determined based on a determined consistency of pixels within each frame of the first frame set. The pretrained network may be refined based on the content loss term and the temporal consistency loss term to obtain a refined network.
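
The two loss terms can be illustrated with a minimal pure-Python sketch. The concrete formulas are assumptions, since the abstract does not define them: the content loss is taken as the mean squared error between the pretrained network's prediction on the first frame and the current network's prediction on that frame, and the temporal consistency loss as the mean squared difference between the current network's predictions on consecutive frames, ignoring motion. The weighting factor and all names are hypothetical.

```python
def mse(a, b):
    """Mean squared error between two equally sized flat predictions."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def temporal_consistency_loss(current_predictions):
    """Assumed form: average MSE between predictions on consecutive frames."""
    pairs = list(zip(current_predictions, current_predictions[1:]))
    return sum(mse(a, b) for a, b in pairs) / len(pairs)

def total_loss(first_prediction, current_predictions, weight=1.0):
    """Content loss on the first frame plus a weighted temporal consistency loss."""
    content = mse(first_prediction, current_predictions[0])
    temporal = temporal_consistency_loss(current_predictions)
    return content + weight * temporal

# Toy flat predictions: the pretrained network's output on the first frame,
# and the current network's outputs on three frames of the frame set.
first_prediction = [0.0, 1.0]
current = [[0.0, 1.0], [0.0, 0.5], [0.0, 0.5]]
loss = total_loss(first_prediction, current, weight=2.0)
```

In this toy case the content loss is zero because the current network still matches the pretrained prediction on the first frame, so the whole loss comes from the change between the first and second frames. Refinement would update the current network to reduce this combined value.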
