Patent search ap:("Google LLC") AND inv:"Shan Yang" Page 1

1.

Invention application
Monitoring Animal Pose Dynamics from Monocular Images (Legal status: active)

Publication No.: US20220383652A1

Publication date: 2022-12-01

Application No.: US17775529

Filing date: 2020-11-04

Applicant: Google LLC

Inventors: Bryan Andrew Seybold, Shan Yang, Bo Hu, Kevin Patrick Murphy, David Alexander Ross

IPC: G06V40/10, G06V40/20, G06V10/82, G06V10/62, A01K29/00

Abstract: A computing system comprising one or more computing devices can obtain one or more images of an animal. The computing system can determine, using at least one of one or more machine-learned models, a plurality of joint positions associated with the animal based on the one or more images. The computing system can determine a body model for the animal. The computing system can estimate a body pose for the animal based on the one or more images, the plurality of joint positions, and the determined body model.

2.

Invention publication
Attention Bottlenecks for Multimodal Fusion (Legal status: under examination, published)

Publication No.: US20230177384A1

Publication date: 2023-06-08

Application No.: US17545526

Filing date: 2021-12-08

Applicant: Google LLC

Inventors: Arsha Nagrani, Shan Yang, Anurag Arnab, Chen Sun, Cordelia Luise Schmid

IPC: G06N20/00, G06N5/04

CPC classification number: G06N20/00, G06N5/04

Abstract: Example embodiments according to aspects of the present disclosure provide an example computer-implemented method for multimodal data processing with improved cross-modal attention. The example method includes inputting a multimodal sequence to an example machine-learned model. The example model includes a first modal processing stream receiving a first modal portion of the multimodal sequence and a second modal processing stream receiving a second modal portion of the multimodal sequence. The example model includes fusing the first modal processing stream and the second modal processing stream across one or more fusion layers of the machine-learned model through a plurality of cross-modal context encodings. The example method includes outputting an inference based at least in part on the plurality of cross-modal context encodings.
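
The fusion step described in this abstract can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes the "cross-modal context encodings" behave like a small set of shared tokens that each modality stream attends over together with its own tokens, and that the per-stream updates of those shared tokens are averaged after every fusion layer. The names `fusion_layer`, `fuse`, and `context`, the single-head attention with identity projections, and the mean-pooled readout are all hypothetical simplifications.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    # Single-head self-attention with identity projections
    # (an illustrative simplification, not a trained layer).
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)
    return softmax(scores) @ tokens

def fusion_layer(stream_a, stream_b, context):
    # Each stream attends only over its own tokens plus the shared
    # context tokens, so cross-modal information must pass through them.
    n_a, n_b = len(stream_a), len(stream_b)
    out_a = self_attention(np.concatenate([stream_a, context]))
    out_b = self_attention(np.concatenate([stream_b, context]))
    new_a, ctx_a = out_a[:n_a], out_a[n_a:]
    new_b, ctx_b = out_b[:n_b], out_b[n_b:]
    # Assumption: merge the two per-stream context updates by averaging.
    new_context = (ctx_a + ctx_b) / 2
    return new_a, new_b, new_context

def fuse(stream_a, stream_b, context, num_layers=2):
    # Apply several fusion layers in sequence.
    for _ in range(num_layers):
        stream_a, stream_b, context = fusion_layer(stream_a, stream_b, context)
    return stream_a, stream_b, context

rng = np.random.default_rng(0)
video = rng.normal(size=(6, 8))    # first modal portion: 6 tokens, width 8
audio = rng.normal(size=(4, 8))    # second modal portion: 4 tokens, width 8
context = rng.normal(size=(2, 8))  # 2 shared context tokens (hypothetical size)

video_out, audio_out, context_out = fuse(video, audio, context)
# Hypothetical readout: pool the context tokens into one feature vector.
inference = context_out.mean(axis=0)
```

Keeping the number of shared tokens small relative to the stream lengths is what makes this a bottleneck in the sketch: the two streams never attend to each other directly.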
