METHOD AND APPARATUS FOR CLASSIFYING IMAGES USING AN ARTIFICIAL INTELLIGENCE MODEL

    Publication (Announcement) No.: US20220309774A1

    Publication (Announcement) Date: 2022-09-29

    Application No.: US17701209

    Application Date: 2022-03-22

    Abstract: An apparatus for performing image processing may include at least one processor configured to: input an image to a vision transformer comprising a plurality of encoders that include at least one fixed encoder and a plurality of adaptive encoders; process the image via the at least one fixed encoder to obtain image representations; determine one or more layers of the plurality of adaptive encoders to drop, by inputting the image representations to a policy network configured to determine layer dropout actions for the plurality of adaptive encoders; and obtain a class of the input image using the remaining layers of the plurality of adaptive encoders other than the dropped one or more layers.
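
    A minimal PyTorch sketch of the mechanism this abstract describes: the fixed encoders always run, a policy network maps their pooled output to a keep/drop decision for each adaptive layer, and only the kept layers are applied before classification. All names and hyperparameters (PolicyNetwork, AdaptiveViTClassifier, embed_dim, and so on) are illustrative assumptions, and the dropout decision is applied batch-wise here rather than per image for brevity.

    import torch
    import torch.nn as nn

    class PolicyNetwork(nn.Module):
        """Maps pooled image representations to keep/drop decisions per adaptive layer."""
        def __init__(self, embed_dim: int, num_adaptive: int):
            super().__init__()
            self.head = nn.Linear(embed_dim, num_adaptive)

        def forward(self, image_repr: torch.Tensor) -> torch.Tensor:
            # image_repr: (batch, embed_dim); returns one boolean keep-flag per adaptive layer
            return torch.sigmoid(self.head(image_repr)) > 0.5

    class AdaptiveViTClassifier(nn.Module):
        def __init__(self, embed_dim=192, num_heads=3, num_fixed=2, num_adaptive=10, num_classes=1000):
            super().__init__()
            def make_layer():
                return nn.TransformerEncoderLayer(embed_dim, num_heads, batch_first=True)
            self.fixed = nn.ModuleList([make_layer() for _ in range(num_fixed)])        # always executed
            self.adaptive = nn.ModuleList([make_layer() for _ in range(num_adaptive)])  # dropout candidates
            self.policy = PolicyNetwork(embed_dim, num_adaptive)
            self.classifier = nn.Linear(embed_dim, num_classes)

        def forward(self, tokens: torch.Tensor) -> torch.Tensor:
            # tokens: (batch, seq_len, embed_dim) patch embeddings of the input image
            x = tokens
            for layer in self.fixed:
                x = layer(x)                           # image representations from the fixed encoders
            keep = self.policy(x.mean(dim=1))          # layer dropout actions
            for i, layer in enumerate(self.adaptive):
                if bool(keep[:, i].any()):             # batch-wise decision (simplification)
                    x = layer(x)                       # only the remaining (non-dropped) layers run
            return self.classifier(x.mean(dim=1))      # class logits for the input image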

    System and method for supervised contrastive learning for multi-modal tasks

    Publication (Announcement) No.: US12183062B2

    Publication (Announcement) Date: 2024-12-31

    Application No.: US17589535

    Application Date: 2022-01-31

    Abstract: A method includes obtaining a batch of training data including multiple paired image-text pairs and multiple unpaired image-text pairs, where each paired image-text pair and each unpaired image-text pair includes an image and a text. The method also includes training a machine learning model using the training data based on an optimization of a combination of losses. The losses include, for each paired image-text pair, (i) a first multi-modal representation loss based on the paired image-text pair and (ii) a second multi-modal representation loss based on two or more unpaired image-text pairs, selected from among the multiple unpaired image-text pairs, wherein each of the two or more unpaired image-text pairs includes either the image or the text of the paired image-text pair.
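
    A minimal sketch of how such a combined loss could be assembled, assuming a symmetric InfoNCE loss as a stand-in for the unspecified "multi-modal representation loss", assuming image_encoder and text_encoder return fixed-size embeddings for pre-processed image and text tensors, and assuming the unpaired pairs that reuse the image or the text of each paired pair have been grouped beforehand. All function names are hypothetical.

    import torch
    import torch.nn.functional as F

    def info_nce(img_emb: torch.Tensor, txt_emb: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
        # Symmetric InfoNCE over a batch of (image, text) embedding rows.
        img_emb = F.normalize(img_emb, dim=-1)
        txt_emb = F.normalize(txt_emb, dim=-1)
        logits = img_emb @ txt_emb.t() / tau                         # (B, B) similarity matrix
        targets = torch.arange(img_emb.size(0), device=img_emb.device)
        return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

    def combined_loss(paired, unpaired_groups, image_encoder, text_encoder):
        # paired: list of (image, text) tensors that truly match.
        # unpaired_groups: one list per paired pair, each holding two or more unpaired
        # (image, text) pairs that include either the image or the text of that paired pair.
        imgs = torch.stack([p[0] for p in paired])
        txts = torch.stack([p[1] for p in paired])
        loss_paired = info_nce(image_encoder(imgs), text_encoder(txts))      # first loss (i)

        loss_unpaired = 0.0
        for group in unpaired_groups:                                        # second loss (ii)
            g_imgs = torch.stack([u[0] for u in group])
            g_txts = torch.stack([u[1] for u in group])
            loss_unpaired = loss_unpaired + info_nce(image_encoder(g_imgs), text_encoder(g_txts))
        return loss_paired + loss_unpaired / max(len(unpaired_groups), 1)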

    SMALL AND FAST TRANSFORMER MODEL FOR MULTI-MODAL OR OTHER TASKS

    Publication (Announcement) No.: US20230177338A1

    Publication (Announcement) Date: 2023-06-08

    Application No.: US18073383

    Application Date: 2022-12-01

    CPC classification number: G06N3/082 G06V10/82 G06V10/772

    Abstract: A method includes obtaining, using a first electronic device, a weight matrix associated with a trained transformer model. The method also includes factorizing the weight matrix into a dictionary weight matrix and an intermediate matrix. The method further includes pruning the intermediate matrix to generate a sparse intermediate matrix. The method also includes fine-tuning the sparse intermediate matrix based on a training dataset to generate a fine-tuned sparse intermediate matrix. The method further includes determining an index matrix and a coefficient matrix based on the fine-tuned sparse intermediate matrix. In addition, the method includes deploying the dictionary weight matrix, the index matrix, and the coefficient matrix to a second electronic device without deploying the weight matrix to the second electronic device. A number of parameters in the dictionary weight matrix, the index matrix, and the coefficient matrix is smaller than a number of parameters in the weight matrix.
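
    A minimal sketch of this compression pipeline, using truncated SVD as one possible way to factorize W into a dictionary matrix and an intermediate matrix (the abstract does not fix the factorization method) and per-column magnitude pruning to obtain the index and coefficient matrices; the fine-tuning of the sparse intermediate matrix is noted but omitted. Shapes and function names are assumptions.

    import torch

    def factorize_and_compress(W: torch.Tensor, rank: int, keep_per_column: int):
        # W: (out_features, in_features) weight matrix from a trained transformer model.
        # Returns the dictionary matrix D, an index matrix, and a coefficient matrix such
        # that W is approximated by D @ S, where each column of S keeps only
        # `keep_per_column` non-zero coefficients.
        U, sigma, Vh = torch.linalg.svd(W, full_matrices=False)
        D = U[:, :rank] * sigma[:rank]                           # (out, rank) dictionary weight matrix
        S = Vh[:rank, :]                                         # (rank, in) intermediate matrix

        # Prune S: keep the largest-magnitude entries in each column (sparse intermediate matrix).
        idx = S.abs().topk(keep_per_column, dim=0).indices       # (keep, in) index matrix
        coeff = torch.gather(S, 0, idx)                          # (keep, in) coefficient matrix
        # A full pipeline would fine-tune the sparse S on a training dataset before this step.
        return D, idx, coeff

    def reconstruct(D, idx, coeff, in_features):
        # On the second device, rebuild the approximated weights from the deployed parts only.
        S_sparse = torch.zeros(D.shape[1], in_features)
        S_sparse.scatter_(0, idx, coeff)
        return D @ S_sparse                                      # approximation of the original W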

    Structured Pruning of Vision Transformer

    Publication (Announcement) No.: US20230073835A1

    Publication (Announcement) Date: 2023-03-09

    Application No.: US17900126

    Application Date: 2022-08-31

    Abstract: In one embodiment, a method includes accessing a batch B of a plurality of images, wherein each image in the batch is part of a training set of images used to train a vision transformer comprising a plurality of attention heads. The method further includes determining, for each attention head A, a similarity between (1) the output of the attention head evaluated using each image in the batch and (2) the output of each attention head evaluated using each image in the batch. The method further includes determining, based on the determined similarities, an importance score for each attention head; and pruning, based on the importance scores, one or more attention heads from the vision transformer.
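
    A minimal sketch of one plausible similarity-based importance score, assuming cosine similarity between flattened head outputs over the batch and treating heads whose outputs closely mirror the other heads as redundant; the exact score definition is an assumption of this sketch, not taken from the patent.

    import torch
    import torch.nn.functional as F

    def head_importance(head_outputs: torch.Tensor) -> torch.Tensor:
        # head_outputs: (num_heads, batch, dim) -- each head's output for every image in batch B.
        H = F.normalize(head_outputs.flatten(start_dim=1), dim=-1)    # (num_heads, batch * dim)
        sim = H @ H.t()                                               # pairwise head-to-head similarities
        off_diag = sim - torch.eye(sim.size(0))
        # A head that is highly similar to the others adds little that is new, so its
        # importance is the negative of its mean similarity to the remaining heads.
        return -off_diag.sum(dim=1) / (sim.size(0) - 1)

    def select_heads_to_prune(head_outputs: torch.Tensor, num_to_prune: int):
        order = head_importance(head_outputs).argsort()               # ascending importance
        return order[:num_to_prune].tolist()                          # indices of heads to prune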
