-
Publication number: US10783611B2
Publication date: 2020-09-22
Application number: US15859992
Application date: 2018-01-02
Applicant: Google LLC
Inventor: Raviteja Vemulapalli, Matthew Brown, Seyed Mohammad Mehdi Sajjadi
Abstract: The present disclosure provides systems and methods to increase resolution of imagery. In one example embodiment, a computer-implemented method includes obtaining a current low-resolution image frame. The method includes obtaining a previous estimated high-resolution image frame, the previous estimated high-resolution frame being a high-resolution estimate of a previous low-resolution image frame. The method includes warping the previous estimated high-resolution image frame based on the current low-resolution image frame. The method includes inputting the warped previous estimated high-resolution image frame and the current low-resolution image frame into a machine-learned frame estimation model. The method includes receiving a current estimated high-resolution image frame as an output of the machine-learned frame estimation model, the current estimated high-resolution image frame being a high-resolution estimate of the current low-resolution image frame.
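The recurrent loop described in this abstract can be sketched as follows. This is a minimal illustration rather than the patented implementation: frames are toy 1-D lists, and `warp`, `frame_model`, `identity_warp`, `toy_model`, and the zero-frame initialisation are all hypothetical stand-ins that the abstract does not specify.

```python
from typing import Callable, List, Optional, Sequence

Frame = List[float]  # toy 1-D "image"; a real frame would be a 2-D or 3-D array


def upsample_nearest(lr: Sequence[float], scale: int) -> Frame:
    """Repeat each low-resolution sample `scale` times."""
    return [v for v in lr for _ in range(scale)]


def super_resolve_sequence(
    lr_frames: Sequence[Sequence[float]],
    warp: Callable[[Frame, Sequence[float]], Frame],
    frame_model: Callable[[Frame, Sequence[float]], Frame],
    scale: int,
) -> List[Frame]:
    """Estimate one high-resolution frame per input, feeding each estimate forward."""
    outputs: List[Frame] = []
    prev_hr: Optional[Frame] = None
    for lr in lr_frames:
        if prev_hr is None:
            # Assumption: the abstract does not say how the first frame is
            # handled; here the recurrence is seeded with a zero frame.
            prev_hr = [0.0] * (len(lr) * scale)
        warped = warp(prev_hr, lr)         # align previous HR estimate to the current frame
        prev_hr = frame_model(warped, lr)  # current HR estimate
        outputs.append(prev_hr)
    return outputs


# Hypothetical stand-ins: an identity warp, and a "model" that averages the
# warped estimate with a nearest-neighbour upsampling of the current LR frame.
def identity_warp(prev_hr: Frame, lr: Sequence[float]) -> Frame:
    return list(prev_hr)


def toy_model(warped: Frame, lr: Sequence[float], scale: int = 2) -> Frame:
    up = upsample_nearest(lr, scale)
    return [0.5 * a + 0.5 * b for a, b in zip(warped, up)]


result = super_resolve_sequence([[1.0, 3.0], [1.0, 3.0]], identity_warp, toy_model, scale=2)
```

In a real system the warp would typically depend on estimated motion between frames and the model would be a trained network; the sketch only shows how the previous estimate is carried into the next step.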
-
Publication number: US20200151438A1
Publication date: 2020-05-14
Application number: US16743439
Application date: 2020-01-15
Applicant: Google LLC
Inventor: Raviteja Vemulapalli, Aseem Agarwala
Abstract: The present disclosure provides systems and methods that include or otherwise leverage use of a facial expression model that is configured to provide a facial expression embedding. In particular, the facial expression model can receive an input image that depicts a face and, in response, provide a facial expression embedding that encodes information descriptive of a facial expression made by the face depicted in the input image. As an example, the facial expression model can be or include a neural network such as a convolutional neural network. The present disclosure also provides a novel and unique triplet training scheme which does not rely upon designation of a particular image as an anchor or reference image.
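One way a triplet loss can avoid designating an anchor is sketched below. The abstract does not give the loss, so this form is an assumption: the triplet is labelled only with which pair is most similar, and that pair's distance is pushed below both distances to the third image. The function names and the `margin` value are hypothetical.

```python
from typing import Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def anchor_free_triplet_loss(
    e1: Sequence[float],
    e2: Sequence[float],
    e3: Sequence[float],
    margin: float = 0.1,
) -> float:
    """e1 and e2 are the pair labelled most similar; e3 is the odd one out.

    The loss is symmetric in e1 and e2, so neither plays the role of an anchor.
    """
    d12 = sq_dist(e1, e2)
    return (
        max(0.0, d12 - sq_dist(e1, e3) + margin)
        + max(0.0, d12 - sq_dist(e2, e3) + margin)
    )
```

Swapping `e1` and `e2` leaves the value unchanged, which is the property that distinguishes this form from the usual anchor/positive/negative formulation.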
-
Publication number: US20190206026A1
Publication date: 2019-07-04
Application number: US15859992
Application date: 2018-01-02
Applicant: Google LLC
Inventor: Raviteja Vemulapalli, Matthew Brown, Seyed Mohammad Mehdi Sajjadi
CPC classification number: G06T3/4053, G06N20/00, G06T3/0093, G06T3/4046, G06T5/50, G06T7/248, G06T2207/20081
Abstract: The present disclosure provides systems and methods to increase resolution of imagery. In one example embodiment, a computer-implemented method includes obtaining a current low-resolution image frame. The method includes obtaining a previous estimated high-resolution image frame, the previous estimated high-resolution frame being a high-resolution estimate of a previous low-resolution image frame. The method includes warping the previous estimated high-resolution image frame based on the current low-resolution image frame. The method includes inputting the warped previous estimated high-resolution image frame and the current low-resolution image frame into a machine-learned frame estimation model. The method includes receiving a current estimated high-resolution image frame as an output of the machine-learned frame estimation model, the current estimated high-resolution image frame being a high-resolution estimate of the current low-resolution image frame.
-
Publication number: US20230359865A1
Publication date: 2023-11-09
Application number: US18044842
Application date: 2020-09-16
Applicant: Google LLC
Inventor: Zhuoran Shen, Raviteja Vemulapalli, Irwan Bello, Xuhui Jia, Ching-Hui Chen
Abstract: The present disclosure provides systems, methods, and computer program products for modeling dependencies throughout a network using a global self-attention model with a content attention layer and a positional attention layer that operate in parallel. The model receives input data comprising content values and context positions. The content attention layer generates one or more output features for each context position based on a global attention operation applied to the content values independent of the context positions. The positional attention layer generates an attention map for each of the context positions based on one or more content values of the respective context position and associated neighboring positions. Output is determined based on the output features generated by the content attention layer and the attention map generated for each context position by the positional attention layer. The model improves efficiency and can be used throughout a deep network.
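The two parallel branches can be illustrated with a toy 1-D sketch. It is an interpretation of the abstract, not the claimed model: the query/key/value inputs, the relative-position embedding table `rel_emb`, the neighbourhood `radius`, and the choice to sum the two branch outputs are all assumptions made for illustration.

```python
import math
from typing import List, Sequence

Vec = Sequence[float]


def softmax(xs: Sequence[float]) -> List[float]:
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vec, b: Vec) -> float:
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights: Sequence[float], rows: Sequence[Vec]) -> List[float]:
    return [sum(w * r[c] for w, r in zip(weights, rows)) for c in range(len(rows[0]))]


def content_attention(queries, keys, values) -> List[List[float]]:
    """Global attention over every position, using content only (no position terms)."""
    return [weighted_sum(softmax([dot(q, k) for k in keys]), values) for q in queries]


def positional_attention(queries, values, rel_emb, radius: int) -> List[List[float]]:
    """Attention over a local neighbourhood; each weight comes from the query
    and a relative-position embedding (hypothetical table `rel_emb`)."""
    n = len(queries)
    out = []
    for i, q in enumerate(queries):
        nbrs = [j for j in range(i - radius, i + radius + 1) if 0 <= j < n]
        attn_map = softmax([dot(q, rel_emb[j - i + radius]) for j in nbrs])
        out.append(weighted_sum(attn_map, [values[j] for j in nbrs]))
    return out


def global_self_attention(queries, keys, values, rel_emb, radius: int):
    """Run both branches on the same input and add their outputs (an assumption)."""
    c = content_attention(queries, keys, values)
    p = positional_attention(queries, values, rel_emb, radius)
    return [[a + b for a, b in zip(ci, pi)] for ci, pi in zip(c, p)]
```

With all-zero queries both softmaxes become uniform, so the content branch returns the global mean of the values and the positional branch returns the mean over each position's neighbourhood, which makes the behaviour easy to check by hand.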
-
Publication number: US20230281979A1
Publication date: 2023-09-07
Application number: US18006078
Application date: 2020-08-03
Applicant: Xuhui JIA, Raviteja VEMULAPALLI, Yukun ZHU, Bradley Ray GREEN, Bardia DOOSTI, Ching-Hui CHEN, Google LLC
Inventor: Xuhui Jia, Raviteja Vemulapalli, Bradley Ray Green, Bardia Doosti, Ching-Hui Chen
IPC: G06V10/82, G06V10/776
CPC classification number: G06V10/82, G06V10/776
Abstract: Systems and methods of the present disclosure are directed to a method for training a machine-learned visual attention model. The method can include obtaining image data that depicts a head of a person and an additional entity. The method can include processing the image data with an encoder portion of the visual attention model to obtain latent head and entity encodings. The method can include processing the latent encodings with the visual attention model to obtain a visual attention value and processing the latent encodings with a machine-learned visual location model to obtain a visual location estimation. The method can include training the models by evaluating a loss function that evaluates differences between the visual location estimation and a pseudo visual location label derived from the image data and between the visual attention value and a ground truth visual attention label.
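The two-term loss described here might look like the sketch below. The abstract does not name the individual loss functions, so squared error for the location term, binary cross-entropy for the attention term, and the `weight` parameter are assumptions chosen only to make the structure concrete.

```python
import math
from typing import Sequence


def combined_loss(
    pred_location: Sequence[float],
    pseudo_location: Sequence[float],
    pred_attention: float,
    true_attention: float,
    weight: float = 1.0,
) -> float:
    """Attention loss against a ground-truth label plus a weighted location
    loss against a pseudo label derived from the image data."""
    # Location term (assumed form): squared error to the pseudo label.
    loc = sum((p - t) ** 2 for p, t in zip(pred_location, pseudo_location))
    # Attention term (assumed form): binary cross-entropy, clamped for stability.
    eps = 1e-7
    p = min(max(pred_attention, eps), 1.0 - eps)
    att = -(true_attention * math.log(p) + (1.0 - true_attention) * math.log(1.0 - p))
    return att + weight * loc
```

Only the attention term needs human annotation; the location term is supervised by a label computed from the image itself, which is the point the abstract emphasises.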
-
Publication number: US20230214656A1
Publication date: 2023-07-06
Application number: US18009629
Application date: 2020-06-10
Applicant: Google LLC
Inventor: Raviteja Vemulapalli, Jianrui Cai, Bradley Ray Green, Ching-Hui Chen, Lior Shapira
IPC: G06N3/082, G06V10/82, G06V10/764
CPC classification number: G06N3/082, G06V10/82, G06V10/764
Abstract: At training time, a base neural network can be trained to perform each of a plurality of basis subtasks included in a total set of basis subtasks (e.g., individually or some combination thereof). Next, a description of a desired combined subtask can be obtained. Based on the description of the combined subtask, a mask generator can produce a pruning mask which is used to prune the base neural network into a smaller combined-subtask-specific network that performs only the two or more basis subtasks included in the combined subtask.
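The mask-then-prune step can be sketched as below. In the disclosure the mask generator is presumably learned; here a fixed lookup table `subtask_units` (hypothetical, as are the subtask names) stands in for it, mapping each basis subtask to the hidden units it needs.

```python
from typing import Dict, List, Sequence, Set


def generate_mask(
    subtask_units: Dict[str, Set[int]],
    combined_subtask: Sequence[str],
    num_units: int,
) -> List[int]:
    """Return a 0/1 mask keeping every unit needed by any selected basis subtask."""
    keep: Set[int] = set()
    for name in combined_subtask:
        keep |= subtask_units[name]
    return [1 if u in keep else 0 for u in range(num_units)]


def prune(weights: Sequence[Sequence[float]], mask: Sequence[int]) -> List[List[float]]:
    """Drop the rows (one per hidden unit) that the mask zeroes out."""
    return [list(row) for row, m in zip(weights, mask) if m]


# Hypothetical base layer with four hidden units and three basis subtasks.
subtask_units = {"cat": {0, 1}, "dog": {1, 2}, "car": {3}}
base_weights = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]

mask = generate_mask(subtask_units, ["cat", "dog"], num_units=4)
small_weights = prune(base_weights, mask)
```

The pruned layer keeps only the units serving the "cat" and "dog" subtasks, so the resulting network is smaller than the base network and no longer carries the unit used only for "car".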
-
Publication number: US11163987B2
Publication date: 2021-11-02
Application number: US16743439
Application date: 2020-01-15
Applicant: Google LLC
Inventor: Raviteja Vemulapalli, Aseem Agarwala
Abstract: The present disclosure provides systems and methods that include or otherwise leverage use of a facial expression model that is configured to provide a facial expression embedding. In particular, the facial expression model can receive an input image that depicts a face and, in response, provide a facial expression embedding that encodes information descriptive of a facial expression made by the face depicted in the input image. As an example, the facial expression model can be or include a neural network such as a convolutional neural network. The present disclosure also provides a novel and unique triplet training scheme which does not rely upon designation of a particular image as an anchor or reference image.