-
Publication No.: US20240428494A1
Publication Date: 2024-12-26
Application No.: US18749032
Application Date: 2024-06-20
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Karthik Mohan Kumar , Michael Mantor , Pedro Antonio Pena , Archana Ramalingam
IPC: G06T13/40 , H04N19/124
Abstract: Systems and techniques for generating and animating non-player characters (NPCs) within virtual digital environments are provided. Multimodal input data is received that comprises a plurality of input modalities for interaction with an NPC having a set of body features and a set of facial features. The multimodal input data is processed through one or more neural networks to generate animation sequences for both the body features and facial features of the NPC. Generating such animation sequences includes disentangling the multimodal input data to generate substantially disentangled latent representations, combining these representations with the multimodal input data, and using a large language model (LLM) to generate speech data for the NPC. Further processing using reverse diffusion generates face vertex displacement data and joint trajectory data based on the combined representation and generated speech data. The face vertex displacement data, joint trajectory data, and speech data are used to produce an animated representation of the NPC, which is then provided to environment-specific adapters to animate the NPC within a virtual digital environment.
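The abstract's end-to-end flow (disentangle → combine → LLM speech → reverse diffusion → animation outputs) can be sketched as a minimal pipeline. This is an illustrative reading of the abstract only: every function name, data shape, and the toy stand-in bodies are assumptions, not the patent's actual implementation.

```python
# Hypothetical sketch of the described NPC animation pipeline.
# All names and bodies are illustrative stand-ins, not the claimed method.

def disentangle(multimodal_input):
    """Stand-in for producing substantially disentangled latent factors,
    one per input modality."""
    return {m: (hash(v) % 100) / 100.0 for m, v in multimodal_input.items()}

def combine(latents, multimodal_input):
    """Combine the disentangled latents with the raw multimodal input."""
    return {"latents": latents, "raw": multimodal_input}

def generate_speech(combined):
    """Stand-in for the LLM that produces NPC speech data."""
    return f"NPC response to: {combined['raw'].get('text', '')}"

def reverse_diffusion(combined, speech):
    """Stand-in for denoising that yields animation targets."""
    face_vertex_displacements = [0.0] * 4   # per-vertex offsets (toy size)
    joint_trajectories = [[0.0, 0.0, 0.0]]  # per-joint positions over time
    return face_vertex_displacements, joint_trajectories

def animate_npc(multimodal_input):
    latents = disentangle(multimodal_input)
    combined = combine(latents, multimodal_input)
    speech = generate_speech(combined)
    faces, joints = reverse_diffusion(combined, speech)
    # The animated representation would then go to environment adapters.
    return {"speech": speech, "face": faces, "joints": joints}

result = animate_npc({"text": "hello", "audio": "wave-bytes", "gaze": "left"})
```

The point of the sketch is only the data flow: each stage consumes the previous stage's output, and speech generation happens before, and feeds into, the diffusion step.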
-
Publication No.: US20240424398A1
Publication Date: 2024-12-26
Application No.: US18748920
Application Date: 2024-06-20
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Karthik Mohan Kumar , Michael Mantor , Pedro Antonio Pena , Archana Ramalingam
Abstract: Systems and techniques for generating and animating non-player characters (NPCs) within virtual digital environments are provided. Multimodal input data is received that comprises a plurality of input modalities for interaction with an NPC having a set of body features and a set of facial features. The multimodal input data is processed through one or more neural networks to generate animation sequences for both the body features and facial features of the NPC. Generating such animation sequences includes disentangling the multimodal input data to generate substantially disentangled latent representations, combining these representations with the multimodal input data, and using a large language model (LLM) to generate speech data for the NPC. Further processing using reverse diffusion generates face vertex displacement data and joint trajectory data based on the combined representation and generated speech data. The face vertex displacement data, joint trajectory data, and speech data are used to produce an animated representation of the NPC, which is then provided to environment-specific adapters to animate the NPC within a virtual digital environment.
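The "reverse diffusion" stage named in the abstract can be illustrated with a toy denoising loop: starting from noise, iterate toward a model's estimate of the clean signal (here, a face vertex displacement vector). The linear schedule and the constant "predictor" are simplifying assumptions for illustration; the patent does not specify these details.

```python
import random

# Toy reverse-diffusion loop. The schedule and predictor are illustrative
# assumptions, not taken from the patent.

def reverse_diffusion(steps, predict_clean, dim, seed=0):
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # start from pure noise
    for t in range(steps, 0, -1):
        alpha = t / steps                # fraction of noise kept this step
        x0 = predict_clean(x, t)         # model's estimate of the clean data
        # Move partway from the noisy sample toward the clean estimate.
        x = [alpha * xi + (1 - alpha) * ci for xi, ci in zip(x, x0)]
    return x

# A trivial "model" that always predicts the same displacement vector.
target = [0.1, -0.2, 0.05]
samples = reverse_diffusion(steps=50, predict_clean=lambda x, t: target, dim=3)
```

Because each step shrinks the remaining noise by the factor `alpha`, the loop converges to the predicted clean vector; in the patented system the predictor would instead be conditioned on the combined latent representation and the generated speech data.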
-
Publication No.: US20240424407A1
Publication Date: 2024-12-26
Application No.: US18749065
Application Date: 2024-06-20
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Karthik Mohan Kumar , Michael Mantor , Pedro Antonio Pena , Archana Ramalingam
IPC: A63F13/67 , G06F40/284
Abstract: Systems and techniques for generating and animating non-player characters (NPCs) within virtual digital environments are provided. Multimodal input data is received that comprises a plurality of input modalities for interaction with an NPC having a set of body features and a set of facial features. The multimodal input data is processed through one or more neural networks to generate animation sequences for both the body features and facial features of the NPC. Generating such animation sequences includes disentangling the multimodal input data to generate substantially disentangled latent representations, combining these representations with the multimodal input data, and using a large language model (LLM) to generate speech data for the NPC. Further processing using reverse diffusion generates face vertex displacement data and joint trajectory data based on the combined representation and generated speech data. The face vertex displacement data, joint trajectory data, and speech data are used to produce an animated representation of the NPC, which is then provided to environment-specific adapters to animate the NPC within a virtual digital environment.
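The "environment-specific adapters" in the abstract suggest one engine-neutral animated representation translated into per-environment payloads. A minimal sketch of that adapter pattern follows; the interface, the adapter names, and the payload field names are all hypothetical.

```python
# Hypothetical adapter layer: one generic animated-NPC representation,
# translated per target environment. Names and fields are illustrative.

class EnvironmentAdapter:
    def apply(self, animation):
        raise NotImplementedError

class EngineAAdapter(EnvironmentAdapter):
    """Maps generic joint trajectories and face displacements into one
    engine's assumed payload layout."""
    def apply(self, animation):
        return {"bones": animation["joints"], "morphs": animation["face"]}

class EngineBAdapter(EnvironmentAdapter):
    """Same animation, a different assumed payload layout."""
    def apply(self, animation):
        return {"skeleton": animation["joints"],
                "blendshapes": animation["face"]}

# One animated representation, fanned out to each registered environment.
animation = {"joints": [[0.0, 0.0, 0.0]], "face": [0.1, 0.2], "speech": "hi"}
adapters = {"engine_a": EngineAAdapter(), "engine_b": EngineBAdapter()}
payloads = {name: a.apply(animation) for name, a in adapters.items()}
```

The design keeps the generation pipeline engine-agnostic: adding support for a new virtual environment means adding one adapter, not retraining or changing the upstream networks.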