CONVERSATIONAL AI PLATFORM WITH RENDERED GRAPHICAL OUTPUT

    Publication Number: US20210358188A1

    Publication Date: 2021-11-18

    Application Number: US17318871

    Application Date: 2021-05-12

    Abstract: In various examples, a virtually animated and interactive agent may be rendered for visual and audible communication with one or more users of an application. For example, a conversational artificial intelligence (AI) assistant may be rendered and displayed for visual communication in addition to audible communication with end-users. As such, the AI assistant may leverage the visual domain, in addition to the audible domain, to more clearly communicate with users, including interacting with a virtual environment in which the AI assistant is rendered. Similarly, the AI assistant may leverage audio, video, and/or text inputs from a user to determine a request, mood, gesture, and/or posture of the user for more accurately responding to and interacting with the user.
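
    The closing sentence of this abstract describes a multimodal fusion step: audio, video, and/or text inputs are combined to estimate the user's request, mood, gesture, and/or posture before the rendered agent responds. The Python sketch below illustrates one way such a fusion step could be structured; every class, function, and heuristic here is a hypothetical stand-in, not a detail taken from the patent.

    # Hypothetical sketch of multimodal input fusion for a rendered
    # conversational agent. All names and heuristics are illustrative.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class UserSignals:
        transcript: str                  # from automatic speech recognition
        detected_mood: Optional[str]     # e.g., from a video-based emotion model
        detected_gesture: Optional[str]  # e.g., "pointing", "waving"
        typed_text: Optional[str]        # direct text input, if any

    @dataclass
    class AgentResponse:
        speech: str     # text to synthesize as audio
        animation: str  # named animation clip for the rendered avatar

    def respond(signals: UserSignals) -> AgentResponse:
        """Fuse the modalities into one request, then pick a response.

        A real system would use trained intent and sentiment models; this
        stub only shows how the modalities could be combined.
        """
        request = signals.typed_text or signals.transcript
        # Let a detected gesture disambiguate the spoken request.
        if signals.detected_gesture == "pointing":
            request += " (user is pointing at an on-screen object)"
        # Match the avatar's demeanor to the user's detected mood.
        animation = "concerned_nod" if signals.detected_mood == "frustrated" else "friendly_smile"
        return AgentResponse(speech=f"Sure, let me help with: {request}", animation=animation)

    out = respond(UserSignals("where is my order", "frustrated", None, None))
    print(out.speech, "|", out.animation)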

    CONVERSATIONAL AI PLATFORM WITH RENDERED GRAPHICAL OUTPUT

    Publication Number: US20250045996A1

    Publication Date: 2025-02-06

    Application Number: US18921922

    Application Date: 2024-10-21

    Abstract: In various examples, a virtually animated and interactive agent may be rendered for visual and audible communication with one or more users of an application. For example, a conversational artificial intelligence (AI) assistant may be rendered and displayed for visual communication in addition to audible communication with end-users. As such, the AI assistant may leverage the visual domain, in addition to the audible domain, to more clearly communicate with users, including interacting with a virtual environment in which the AI assistant is rendered. Similarly, the AI assistant may leverage audio, video, and/or text inputs from a user to determine a request, mood, gesture, and/or posture of the user for more accurately responding to and interacting with the user.

    AUDIO-DRIVEN FACIAL ANIMATION USING MACHINE LEARNING

    Publication Number: US20250061634A1

    Publication Date: 2025-02-20

    Application Number: US18457251

    Application Date: 2023-08-28

    Abstract: Systems and methods of the present disclosure include animating virtual avatars or agents according to input audio and one or more selected or determined emotions and/or styles. For example, a deep neural network can be trained to output motion or deformation information for a character that is representative of the character uttering speech contained in audio input. The character can have different facial components or regions (e.g., head, skin, eyes, tongue) modeled separately, such that the network can output motion or deformation information for each of these different facial components. During training, the network can use a transformer-based audio encoder with locked parameters to train an associated decoder using a weighted feature vector. The network output can be provided to a renderer to generate audio-driven facial animation that is emotion-accurate.
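
    The abstract outlines a specific training architecture: a transformer-based audio encoder with locked (frozen) parameters drives a trainable decoder, conditioned on emotion, that emits separate motion or deformation outputs per facial region (head, skin, eyes, tongue). The PyTorch sketch below is one plausible reading of that design, assuming illustrative layer sizes, per-region output dimensions, and an interpretation of the "weighted feature vector" as emotion-weighted audio features.

    # Minimal sketch: frozen transformer audio encoder, trainable decoder,
    # and one output head per facial region. All sizes are assumptions.
    import torch
    import torch.nn as nn

    REGIONS = {"head": 9, "skin": 72, "eyes": 8, "tongue": 12}  # illustrative dims

    class AudioDrivenFaceModel(nn.Module):
        def __init__(self, feat_dim: int = 256, num_emotions: int = 8):
            super().__init__()
            layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=4, batch_first=True)
            self.audio_encoder = nn.TransformerEncoder(layer, num_layers=4)
            for p in self.audio_encoder.parameters():  # lock the encoder
                p.requires_grad = False
            self.emotion_embed = nn.Embedding(num_emotions, feat_dim)
            self.decoder = nn.GRU(feat_dim, feat_dim, batch_first=True)
            # One output head per facial component, as in the abstract.
            self.heads = nn.ModuleDict(
                {name: nn.Linear(feat_dim, dim) for name, dim in REGIONS.items()}
            )

        def forward(self, audio_feats: torch.Tensor, emotion: torch.Tensor):
            # audio_feats: (batch, frames, feat_dim); emotion: (batch,) label ids
            enc = self.audio_encoder(audio_feats)
            # Weight the audio features by the emotion embedding (one possible
            # reading of the "weighted feature vector" in the abstract).
            weighted = enc * self.emotion_embed(emotion).unsqueeze(1)
            dec, _ = self.decoder(weighted)
            return {name: head(dec) for name, head in self.heads.items()}

    model = AudioDrivenFaceModel()
    out = model(torch.randn(2, 30, 256), torch.tensor([0, 3]))
    print({k: tuple(v.shape) for k, v in out.items()})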

    Conversational AI platform with rendered graphical output

    Publication Number: US12205210B2

    Publication Date: 2025-01-21

    Application Number: US17318871

    Application Date: 2021-05-12

    Abstract: In various examples, a virtually animated and interactive agent may be rendered for visual and audible communication with one or more users of an application. For example, a conversational artificial intelligence (AI) assistant may be rendered and displayed for visual communication in addition to audible communication with end-users. As such, the AI assistant may leverage the visual domain, in addition to the audible domain, to more clearly communicate with users, including interacting with a virtual environment in which the AI assistant is rendered. Similarly, the AI assistant may leverage audio, video, and/or text inputs from a user to determine a request, mood, gesture, and/or posture of the user for more accurately responding to and interacting with the user.

    SYNTHETIC AUDIO-DRIVEN BODY ANIMATION USING VOICE TEMPO

    Publication Number: US20240233229A1

    Publication Date: 2024-07-11

    Application Number: US18007867

    Application Date: 2021-11-08

    CPC classification number: G06T13/205 G06T13/40

    Abstract: In various examples, animations may be generated using audio-driven body animation synthesized with voice tempo. For example, full body animation may be driven from an audio input representative of recorded speech, where voice tempo (e.g., a number of phonemes per unit time) may be used to generate a 1D audio signal for comparing to datasets including data samples that each include an animation and a corresponding 1D audio signal. One or more loss functions may be used to compare the 1D audio signal from the input audio to the audio signals of the datasets, as well as to compare joint information of joints of an actor between animations of two or more data samples, in order to identify optimal transition points between the animations. The animations may then be stitched together—e.g., using interpolation and/or a neural network trained to seamlessly stitch sequences together—using the transition points.
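
    The matching pipeline here is concrete enough to sketch: bin phoneme times into a 1D tempo signal, score dataset clips against it with a signal loss, score candidate transitions with a joint-distance term, and stitch the chosen clips with interpolation. The NumPy sketch below follows that outline; the bin size, loss forms, and blend length are illustrative assumptions, not values from the patent.

    # Rough sketch of tempo-signal matching and clip stitching.
    import numpy as np

    def tempo_signal(phoneme_times, duration, fps=30):
        """Count phonemes per frame-sized bin to form a 1D tempo signal."""
        bins = np.linspace(0.0, duration, int(duration * fps) + 1)
        counts, _ = np.histogram(phoneme_times, bins=bins)
        return counts.astype(np.float64)

    def signal_loss(query, clip):
        """L2 distance between 1D tempo signals, over their shared length."""
        n = min(len(query), len(clip))
        return float(np.mean((query[:n] - clip[:n]) ** 2))

    def transition_cost(pose_end, pose_start):
        """Joint-position mismatch between the end of one animation and the
        start of the next; poses are (num_joints, 3) arrays."""
        return float(np.linalg.norm(pose_end - pose_start, axis=1).mean())

    def stitch(clip_a, clip_b, blend=5):
        """Linearly interpolate poses over a short overlap at the transition."""
        w = np.linspace(0.0, 1.0, blend)[:, None, None]
        seam = (1 - w) * clip_a[-blend:] + w * clip_b[:blend]
        return np.concatenate([clip_a[:-blend], seam, clip_b[blend:]])

    rng = np.random.default_rng(0)
    query = tempo_signal(np.sort(rng.uniform(0.0, 2.0, 14)), duration=2.0)
    a, b = rng.normal(size=(60, 24, 3)), rng.normal(size=(60, 24, 3))
    print(signal_loss(query, query), transition_cost(a[-1], b[0]), stitch(a, b).shape)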

    ESTIMATING FACIAL EXPRESSIONS USING FACIAL LANDMARKS

    Publication Number: US20230144458A1

    Publication Date: 2023-05-11

    Application Number: US18051209

    Application Date: 2022-10-31

    CPC classification number: G06V40/174 G06V40/171 G06V40/165 G06V10/82 G06T13/40

    Abstract: In examples, locations of facial landmarks may be applied to one or more machine learning models (MLMs) to generate output data indicating profiles corresponding to facial expressions, such as facial action coding system (FACS) values. The output data may be used to determine geometry of a model. For example, video frames depicting one or more faces may be analyzed to determine the locations. The facial landmarks may be normalized, then applied to the MLM(s) to infer the profile(s), which may then be used to animate the model for expression retargeting from the video. The MLM(s) may include sub-networks that each analyze a set of input data corresponding to a region of the face to determine profiles that correspond to that region. The profiles from the sub-networks, along with the global locations of facial landmarks, may be used by a subsequent network to infer the profiles for the overall face.
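
    The two-stage structure in this abstract (per-region sub-networks whose outputs, together with global landmark locations, feed a subsequent network) maps naturally to code. The PyTorch sketch below assumes a 68-point landmark layout, hypothetical region splits, and an output of 46 FACS-style values; all of these are illustrative choices, not details from the patent.

    # Sketch: per-region sub-networks plus a global head for FACS values.
    import torch
    import torch.nn as nn

    # Illustrative landmark index ranges per facial region (68-point layout).
    REGION_SLICES = {"brows": slice(17, 27), "eyes": slice(36, 48), "mouth": slice(48, 68)}

    def normalize(landmarks):
        """Center landmarks and scale by their spread; (batch, 68, 2) in and out."""
        centered = landmarks - landmarks.mean(dim=1, keepdim=True)
        return centered / (centered.norm(dim=-1).amax(dim=1)[:, None, None] + 1e-8)

    class FacsEstimator(nn.Module):
        def __init__(self, region_out=16, num_facs=46):
            super().__init__()
            # One sub-network per region, each seeing only its landmark subset.
            self.subnets = nn.ModuleDict({
                name: nn.Sequential(
                    nn.Linear((sl.stop - sl.start) * 2, 64), nn.ReLU(),
                    nn.Linear(64, region_out),
                )
                for name, sl in REGION_SLICES.items()
            })
            fused = region_out * len(REGION_SLICES) + 68 * 2
            self.head = nn.Sequential(nn.Linear(fused, 128), nn.ReLU(),
                                      nn.Linear(128, num_facs))

        def forward(self, landmarks):
            lm = normalize(landmarks)
            regional = [net(lm[:, sl].flatten(1))
                        for sl, net in zip(REGION_SLICES.values(), self.subnets.values())]
            # Combine regional profiles with the global landmark locations.
            fused = torch.cat(regional + [lm.flatten(1)], dim=1)
            return self.head(fused)  # (batch, num_facs) FACS-style activations

    print(FacsEstimator()(torch.randn(4, 68, 2)).shape)  # torch.Size([4, 46])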
