-
Publication No.: WO2021258978A1
Publication date: 2021-12-30
Application No.: PCT/CN2021/096269
Filing date: 2021-05-27
Applicant: 北京字节跳动网络技术有限公司
IPC: A63F13/213 , A63F13/52 , G06K9/00 , G06V40/161 , G06V40/174
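Abstract: An operation control method and apparatus, the method comprising: acquiring a face image of a target user (S101); detecting position information of a target part in the face image (S102); based on the detected position information, displaying a target virtual prop in an initial display form at a relative position on the face image corresponding to the detected position information (S103); and adjusting the display form of the target virtual prop according to detected state information of the target part (S104). The method lets the user control the display form of the virtual prop in real time, displays the user's face image together with the virtual prop, and makes operating the virtual prop feel more realistic. In addition, because the virtual prop replaces a real prop, the method also saves material costs, protects the environment (by reducing waste from real props), and makes operation results easier to tally.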
-
Publication No.: WO2022212787A1
Publication date: 2022-10-06
Application No.: PCT/US2022/022953
Filing date: 2022-03-31
Applicant: SONY INTERACTIVE ENTERTAINMENT LLC
Inventor: WEDIG, Geoff
IPC: G06T13/40 , A63F13/57 , G06T17/20 , G06V10/774 , G06V40/168 , G06V40/172 , G06V40/174
Abstract: Methods and systems are provided for training a model using a simulated character for animating a facial expression of a game character. The method includes generating facial expressions of the simulated character using input label value files (iLVFs). The method includes capturing mesh data of the simulated character using a virtual camera to generate three-dimensional (3D) depth data of a face of the simulated character. In one embodiment, the 3D depth data is output as mesh files corresponding to frames captured by the virtual camera. The method includes processing the iLVFs and the mesh data to train the model. In one embodiment, the model is configured to receive input mesh files from a human actor to generate output label value files (oLVFs) that are used for animating the facial expression of the game character. In this way, a real human actor is not required for training the model.
-
Publication No.: WO2022212309A1
Publication date: 2022-10-06
Application No.: PCT/US2022/022253
Filing date: 2022-03-29
Applicant: SNAP INC.
Inventor: GOLOBOKOV, Roman , MARINENKO, Alexandr , MASHRABOV, Aleksandr , BROMOT, Aleksei , TKACHENKO, Grigoriy
IPC: G06T17/00 , G06T13/40 , G06T19/006 , G06T2200/24 , G06V40/168 , G06V40/174
Abstract: The subject technology captures first image data by a computing device, the first image data comprising a target face of a target actor and facial expressions of the target actor, the facial expressions including lip movements. The subject technology generates, based at least in part on frames of a source media content, sets of source pose parameters. The subject technology receives a selection of a particular facial expression from a set of facial expressions. The subject technology generates, based at least in part on sets of source pose parameters and the selection of the particular facial expression, an output media content. The subject technology provides augmented reality content based at least in part on the output media content for display on the computing device.
-
Publication No.: WO2022212171A1
Publication date: 2022-10-06
Application No.: PCT/US2022/021747
Filing date: 2022-03-24
Applicant: SNAP INC. , MARINENKO, Alexandr , MASHRABOV, Aleksandr , PCHELNIKOV, Alexey
Inventor: MARINENKO, Alexandr , MASHRABOV, Aleksandr , PCHELNIKOV, Alexey
IPC: G06T13/40 , H04N21/431 , G06T7/73 , G06V10/82 , G06V40/16 , H04N21/4788 , G06Q30/0276 , G06Q50/01 , G06T11/00 , G06V40/161 , G06V40/174 , H04N21/4312 , H04N21/4318
Abstract: The subject technology receives frames of a source media content, the frames of the source media content including representations of a head and a face of a source actor. The subject technology generates sets of source pose parameters. The subject technology receives at least one target image, the at least one target image including representations of a target head and a target face of a target entity. The subject technology generates, based at least in part on the sets of source pose parameters, an output media content, wherein each frame of the output media content includes an image of the target face. The subject technology provides an online advertisement based at least in part on the output media content for display on a computing device.
-
Publication No.: WO2022240525A1
Publication date: 2022-11-17
Application No.: PCT/US2022/024179
Filing date: 2022-04-11
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventor: WILLIAMS, Todd Matthew
IPC: G10H1/00 , G10H1/36 , G06F40/279 , G06K9/6256 , G06N20/00 , G06V10/40 , G06V20/41 , G06V20/44 , G06V20/46 , G06V40/172 , G06V40/174 , G10H1/0025 , G10H1/368 , G10H2210/005 , G10H2210/036 , G10H2210/111 , G10H2220/441 , G10H2240/085 , G10H2250/311 , G10L25/57 , G11B27/036
Abstract: A method for training one or more AI models for generating audio scores accompanying visual datasets includes obtaining training data comprising a plurality of audiovisual datasets and analyzing each of the plurality of audiovisual datasets to extract multiple visual features, textual features, and audio features. The method also includes correlating the multiple visual features and textual features with the multiple audio features via a machine learning network. Based on the correlations between the visual features, textual features, and audio features, one or more AI models are trained for composing one or more audio scores for accompanying a given dataset.
-
Publication No.: WO2022238083A1
Publication date: 2022-11-17
Application No.: PCT/EP2022/060068
Filing date: 2022-04-14
Applicant: REACTIVE REALITY AG
Inventor: VALTA, Thomas , BRAULIO, Sespede , GRASMUG, Philipp , HAUSWIESNER, Stefan
IPC: G06T17/00 , G06T19/20 , G06V10/44 , G06V10/77 , G06V20/64 , G06V40/16 , G06T2219/2012 , G06T2219/2021 , G06V10/7715 , G06V20/647 , G06V40/161 , G06V40/171 , G06V40/174
Abstract: A method for generating a 3D avatar using a computer system is disclosed, wherein the computer system has access to a parametric body model, PBM, template and to a base texture map associated with the PBM template, and the PBM template comprises a 3D representation of a surface of a template body. The method includes receiving a set of keyframes captured with a digital camera, each of the keyframes including a color image having a different viewing angle of at least a head area of a user, together with 3D metadata, generating a head data structure thereof comprising a 3D representation and the keyframes and being adapted for image based rendering, generating a textured PBM based on the PBM template, and generating the avatar by assembling the head data structure and the textured PBM.
-
Publication No.: WO2021244217A1
Publication date: 2021-12-09
Application No.: PCT/CN2021/092344
Filing date: 2021-05-08
Applicant: 腾讯科技(深圳)有限公司
IPC: G06K9/00 , G06K9/62 , G06K9/6256 , G06K9/6269 , G06K9/629 , G06V40/168 , G06V40/174
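Abstract: A method for training an expression transfer model, applied in the field of artificial intelligence, the method comprising: acquiring a source-domain face image, a target-domain face image, and a face feature image; obtaining a synthesized face image by means of an expression transfer model to be trained; obtaining a first discrimination result and a second discrimination result by means of a discriminative network model; obtaining a category feature vector by means of an image classification model; and updating the model parameters of the expression transfer model to be trained according to the category feature vector, the first discrimination result, and the second discrimination result, to obtain the expression transfer model. The present application also discloses an expression transfer method and apparatus. The method requires no complex image processing of the face images, which reduces training difficulty and training cost, and helps the expression transfer model output more realistic face images.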
-
Publication No.: WO2023056842A1
Publication date: 2023-04-13
Application No.: PCT/CN2022/120511
Filing date: 2022-09-22
Applicant: NINGBO GEELY AUTOMOBILE RESEARCH & DEVELOPMENT CO., LTD. , ZHEJIANG GEELY HOLDING GROUP CO., LTD.
Inventor: GARDTMAN, Angelika , NILSSON, Magnus
IPC: G10K15/12 , G10K11/16 , G10L21/003 , G10L25/63 , G10L21/0316 , G06V40/174 , G10L15/26
Abstract: The disclosure relates to a system (100) for silencing a person, the system (100) comprising a microphone (10a, 10b, 10c, 10d) configured to obtain sound of a first person (1), a speaker (20a, 20b, 20c, 20d) configured to play back sound to the first person (1), and a processing circuitry (102a, 102b, 102c) connected to the microphone (10a, 10b, 10c, 10d) and the speaker (20a, 20b, 20c, 20d) and configured to obtain sound of the first person (1) by the microphone (10a, 10b, 10c, 10d), and play back the obtained sound of the first person (1) with a predefined time delay (td) by the speaker (20a, 20b, 20c, 20d). The disclosure further relates to a method for silencing a person and a computer program product (500).
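Note: The delayed playback described in this abstract can be illustrated with a fixed-length delay line. The following is only a minimal sketch under stated assumptions, not the applicant's implementation: the function names `delay_in_samples` and `delayed_playback`, the 0.2 s delay, and the 16 kHz sample rate are all hypothetical, since the abstract specifies only "a predefined time delay (td)".

```python
from collections import deque


def delay_in_samples(td_seconds, sample_rate_hz):
    """Convert a time delay td (seconds) to a whole number of samples."""
    return round(td_seconds * sample_rate_hz)


def delayed_playback(samples, delay_samples):
    """Yield each microphone sample delayed by `delay_samples` positions.

    Silence (0.0) is emitted until the delay buffer has filled.
    """
    # Pre-fill the buffer with silence so output starts immediately.
    buffer = deque([0.0] * delay_samples)
    for sample in samples:
        buffer.append(sample)    # newest microphone sample goes in
        yield buffer.popleft()   # oldest sample goes to the speaker


# Hypothetical values: td = 0.2 s at a 16 kHz sample rate.
n = delay_in_samples(0.2, 16000)
speaker_out = list(delayed_playback([1.0, 2.0, 3.0, 4.0], 2))
```

With a delay of two samples, the speaker output starts with two samples of silence and then repeats the microphone input shifted by two positions.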
-
Publication No.: WO2022212503A1
Publication date: 2022-10-06
Application No.: PCT/US2022/022545
Filing date: 2022-03-30
Applicant: SNAP INC. , TKACHENKO, Grigoriy , ZAITSEVA, Inna
Inventor: TKACHENKO, Grigoriy , ZAITSEVA, Inna
IPC: G06T13/40 , G06T7/00 , G06T19/00 , G06F3/0482 , G06T19/006 , G06T2200/24 , G06T2207/20084 , G06T2207/30201 , G06T7/70 , G06V10/82 , G06V40/103 , G06V40/161 , G06V40/174 , G06V40/20
Abstract: The subject technology receives a selection of a selectable graphical item to initiate generating augmented reality content including facial synthesis, the selection being received by a third party application, the third party application being executed by a computing device separate from a first party application and a messaging server system. The subject technology captures image data by the client device. The subject technology generates, by the one or more hardware processors and based at least in part on frames of a source media content, sets of source pose parameters. The subject technology generates, based at least in part on sets of the source pose parameters, an output media content using an interface communicating with the messaging server system. The subject technology provides augmented reality content based at least in part on the output media content for display on the computing device.
-
Publication No.: WO2022062403A1
Publication date: 2022-03-31
Application No.: PCT/CN2021/091086
Filing date: 2021-04-29
Applicant: 平安科技(深圳)有限公司
Inventor: 易苗
IPC: G06K9/00 , G06K9/6256 , G06N3/0454 , G06V40/168 , G06V40/174
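Abstract: The present application is applicable to the technical field of biometric recognition and provides an expression recognition model training method and apparatus, a terminal device, and a storage medium. The method comprises: acquiring first expression data and performing data augmentation on the first expression data to generate second expression data; inputting the second expression data into a preset face recognition model to extract supervision features of the second expression data; and training a preset initial expression recognition model according to the second expression data and its supervision features, to obtain the expression recognition model. The expression recognition model obtained with this method can improve the accuracy of facial expression recognition.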