EMOTION AND CHARACTER PARAMETERS FOR DIFFUSION MODEL CONTENT GENERATION SYSTEMS AND APPLICATIONS

    Publication Number: US20240304177A1

    Publication Date: 2024-09-12

    Application Number: US18178762

    Application Date: 2023-03-06

    CPC classification number: G10L13/10 G06F40/247 G06T17/20

    Abstract: Approaches presented herein provide systems and methods for generating three-dimensional (3D) content with fine-grained emotions and character traits. A set of classifiers may be used to identify emotions and character traits from an input provided by a user. Each of the classifiers in the set may use a set of seed words that is expanded through methods including manual collection, synonym extension, and/or word alignment. An input may then be evaluated for indications of emotions and/or character traits, such as by identifying certain words or phrases present within the input. Output vectors associated with the identified emotions and/or character traits may then be provided to different generative models to adjust content, such as modifications to output audio or facial expressions for digital character representations.
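
    The classifier stage described above can be illustrated with a short sketch. The emotion labels, seed-word lists, keyword matching, and vector format below are illustrative assumptions for this listing, not the patent's actual data or method.

        # Hypothetical seed-word emotion classifier (Python), sketching the idea of
        # scoring an input against manually collected, expandable seed-word lists.
        from collections import defaultdict

        EMOTIONS = ["joy", "sadness", "anger", "fear", "neutral"]

        # Manually collected seed words; the described approach would further expand
        # these via synonym extension and/or word alignment.
        SEED_WORDS = {
            "joy": {"happy", "delighted", "glad"},
            "sadness": {"sad", "unhappy", "gloomy"},
            "anger": {"angry", "furious", "annoyed"},
            "fear": {"afraid", "scared", "worried"},
        }

        def classify_emotions(text):
            """Return an emotion vector by counting seed-word hits in the input."""
            counts = defaultdict(int)
            for token in text.lower().split():
                word = token.strip(".,!?")
                for emotion, seeds in SEED_WORDS.items():
                    if word in seeds:
                        counts[emotion] += 1
            total = sum(counts.values())
            if total == 0:
                return [0.0, 0.0, 0.0, 0.0, 1.0]  # no hits: default to neutral
            return [counts[e] / total for e in EMOTIONS[:-1]] + [0.0]

        # The resulting vector could then condition downstream generative models,
        # e.g. adjusting output audio or facial expressions for a digital character.
        print(dict(zip(EMOTIONS, classify_emotions("I am so happy and delighted!"))))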

    PERSONALIZED LANGUAGE MODELS FOR CONVERSATIONAL AI SYSTEMS AND APPLICATIONS

    Publication Number: US20250018298A1

    Publication Date: 2025-01-16

    Application Number: US18351900

    Application Date: 2023-07-13

    Inventors: Yi Dong; Xianchao Wu

    Abstract: Disclosed are systems and techniques for training personalized language models. The techniques include applying a plurality of first machine learning models to a first input prompt. Each of the plurality of first machine learning models generates a respective reward value of a first plurality of reward values. The techniques include applying a second machine learning model to the first plurality of reward values to obtain first reward value embeddings; applying a third machine learning model to the first reward value embeddings and the first input prompt to obtain a first output response; calculating a first loss based on a comparison between the first output response and the first input prompt; and causing the second machine learning model to be modified based on the first loss.
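
    The flow among the three model stages and the loss can be sketched as follows. The toy PyTorch modules, dimensions, and the reconstruction-style loss used here are assumptions for illustration rather than the patent's actual architecture.

        # Hypothetical PyTorch sketch: reward models -> reward-value embeddings ->
        # prompt-conditioned response model, with a loss used to update the embedder.
        import torch
        import torch.nn as nn

        NUM_REWARDS, EMBED_DIM, VOCAB = 4, 32, 1000

        reward_models = nn.ModuleList(                        # plurality of first models
            [nn.Linear(EMBED_DIM, 1) for _ in range(NUM_REWARDS)]
        )
        reward_embedder = nn.Linear(NUM_REWARDS, EMBED_DIM)   # second model (trained here)
        response_model = nn.GRU(EMBED_DIM, EMBED_DIM, batch_first=True)  # third model
        token_embed = nn.Embedding(VOCAB, EMBED_DIM)
        lm_head = nn.Linear(EMBED_DIM, VOCAB)

        prompt = torch.randint(0, VOCAB, (1, 16))             # first input prompt
        prompt_emb = token_embed(prompt)

        # Each reward model scores the prompt, yielding a vector of reward values.
        pooled = prompt_emb.mean(dim=1)
        rewards = torch.cat([rm(pooled) for rm in reward_models], dim=-1)

        # Embed the reward values and condition the response model on them.
        reward_emb = reward_embedder(rewards).unsqueeze(1)
        hidden, _ = response_model(torch.cat([reward_emb, prompt_emb], dim=1))
        logits = lm_head(hidden[:, 1:, :])                    # first output response

        # Loss compares the output response with the input prompt (a simple
        # reconstruction-style proxy) and is backpropagated to the embedder.
        loss = nn.functional.cross_entropy(logits.reshape(-1, VOCAB), prompt.reshape(-1))
        loss.backward()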

    LANGUAGE MODEL TUNING IN CONVERSATIONAL ARTIFICIAL INTELLIGENCE SYSTEMS AND APPLICATIONS

    Publication Number: US20240311579A1

    Publication Date: 2024-09-19

    Application Number: US18123055

    Application Date: 2023-03-17

    CPC classification number: G06F40/40

    Abstract: Disclosed are systems and techniques that may generate prompts for language models. The techniques include obtaining a first dataset and a second dataset and training a hierarchical virtual token generator (VTG) model to generate a large language model (LLM) input prompt. Training the hierarchical VTG includes training, based on the first dataset, a first VTG to output a first virtual token embedding and training, based on the second dataset, a second VTG to output a second virtual token embedding. The generated LLM input prompt includes the first virtual token embedding and the second virtual token embedding.
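
    A prefix-tuning-style reading of the hierarchical VTG is sketched below; treating each VTG as a small set of learnable embeddings, along with the dimensions and names used, are assumptions for illustration.

        # Hypothetical PyTorch sketch: two virtual token generators whose outputs are
        # concatenated with the embedded user prompt to form the LLM input prompt.
        import torch
        import torch.nn as nn

        EMBED_DIM, NUM_VIRTUAL = 64, 8

        class VirtualTokenGenerator(nn.Module):
            """Produces a fixed number of learnable virtual-token embeddings."""
            def __init__(self):
                super().__init__()
                self.tokens = nn.Parameter(torch.randn(NUM_VIRTUAL, EMBED_DIM))

            def forward(self):
                return self.tokens

        # The first VTG would be trained on the first dataset and the second VTG on the
        # second dataset; only the relevant VTG's parameters are updated in each phase.
        vtg_first = VirtualTokenGenerator()
        vtg_second = VirtualTokenGenerator()

        def build_llm_prompt(prompt_embeddings):
            """Prepend both virtual-token embeddings to the embedded user prompt."""
            return torch.cat([vtg_first(), vtg_second(), prompt_embeddings], dim=0)

        user_prompt = torch.randn(12, EMBED_DIM)   # stand-in for embedded prompt tokens
        llm_input = build_llm_prompt(user_prompt)  # shape: (8 + 8 + 12, EMBED_DIM)
        print(llm_input.shape)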

    DOMAIN-CUSTOMIZABLE MODELS FOR CONVERSATIONAL AI SYSTEMS AND APPLICATIONS

    Publication Number: US20240193445A1

    Publication Date: 2024-06-13

    Application Number: US18064125

    Application Date: 2022-12-09

    Inventors: Yi Dong; Xianchao Wu

    CPC classification number: G06N5/043 G06F40/40

    Abstract: In various examples, systems and methods are disclosed that train a machine learning model(s)—such as a large language model (LLM)—for one or more specific domains. In some embodiments, the machine learning model(s) may include at least a base model(s) as well as additional parts, such as additional layers, associated with the domains for which the machine learning model(s) is being trained. As such, the parts of the machine learning model(s) may be trained separately, such that training data associated with a domain is used to train a part of the machine learning model(s) that is associated with the domain without training the other part(s) of the machine learning model(s). The systems and methods may then use these parts when deploying the machine learning model(s), such as by activating and/or deactivating parts based on the input data being processed.
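
    One adapter-style reading of the separable parts is sketched below; the per-domain linear layers, freezing scheme, and dimensions are assumptions for illustration, not the patent's specified architecture.

        # Hypothetical PyTorch sketch: a shared base model plus per-domain parts that
        # are trained separately and activated selectively at deployment.
        import torch
        import torch.nn as nn

        EMBED_DIM = 64
        DOMAINS = ["retail", "healthcare"]

        base_model = nn.Sequential(           # shared base part
            nn.Linear(EMBED_DIM, EMBED_DIM), nn.ReLU()
        )
        domain_parts = nn.ModuleDict({        # additional per-domain layers
            d: nn.Linear(EMBED_DIM, EMBED_DIM) for d in DOMAINS
        })

        def train_domain(domain, batch, target):
            """Train only the part associated with `domain`; leave other parts untouched."""
            for p in base_model.parameters():
                p.requires_grad_(False)
            for d, layer in domain_parts.items():
                for p in layer.parameters():
                    p.requires_grad_(d == domain)
            out = domain_parts[domain](base_model(batch))
            nn.functional.mse_loss(out, target).backward()

        def infer(domain, batch):
            """At deployment, activate only the requested domain's part."""
            with torch.no_grad():
                return domain_parts[domain](base_model(batch))

        x, y = torch.randn(4, EMBED_DIM), torch.randn(4, EMBED_DIM)
        train_domain("retail", x, y)
        print(infer("healthcare", x).shape)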

    JOINT TRAINING OF SPEECH RECOGNITION AND SPEECH SYNTHESIS MODELS FOR CONVERSATIONAL AI SYSTEMS AND APPLICATIONS

    Publication Number: US20250014571A1

    Publication Date: 2025-01-09

    Application Number: US18347031

    Application Date: 2023-07-05

    Abstract: Disclosed are systems and techniques for training machine learning models. The techniques include providing a first data of a first modality as input to a first machine learning model to obtain a first output of a second modality, providing the first output of the second modality as input to a second machine learning model to obtain a second output of the first modality, providing the first data as input to a third machine learning model to obtain a first tensor, providing the second output as input to the third machine learning model to obtain a second tensor, calculating a first loss based on a comparison between the first tensor and the second tensor, and causing the first machine learning model to be modified based on the first loss.
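
    The round-trip training loop can be sketched as follows; the toy linear models standing in for the speech recognition, speech synthesis, and shared encoder models, and the MSE loss, are assumptions for illustration.

        # Hypothetical PyTorch sketch: speech -> text -> speech round trip, with a loss
        # computed between encodings of the original and reconstructed speech.
        import torch
        import torch.nn as nn

        SPEECH_DIM, TEXT_DIM, LATENT_DIM = 80, 32, 16

        asr_model = nn.Linear(SPEECH_DIM, TEXT_DIM)      # first model: speech -> text
        tts_model = nn.Linear(TEXT_DIM, SPEECH_DIM)      # second model: text -> speech
        encoder = nn.Linear(SPEECH_DIM, LATENT_DIM)      # third model: shared encoder

        speech = torch.randn(4, SPEECH_DIM)              # first data (first modality)

        text_out = asr_model(speech)                     # first output (second modality)
        speech_out = tts_model(text_out)                 # second output (first modality)

        # Encode both the original and the round-tripped speech, then compare tensors.
        first_tensor = encoder(speech)
        second_tensor = encoder(speech_out)
        loss = nn.functional.mse_loss(first_tensor, second_tensor)
        loss.backward()                                  # used to modify the ASR model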

    NEURAL NETWORKS TRAINED USING EVENT OCCURRENCES

    Publication Number: US20230135659A1

    Publication Date: 2023-05-04

    Application Number: US17519532

    Application Date: 2021-11-04

    Inventor: Xianchao Wu

    Abstract: Apparatuses, systems, and techniques to facilitate financial natural language processing (NLP) training and tasks, such as sentiment analysis, machine reading comprehension, question answering, and causal inferencing. In at least one embodiment, training of one or more neural networks uses a bidirectional encoder representations from transformers (BERT) machine learning model and input data that includes timestamps of financial news articles.
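
    One simple way to combine article text features with the timestamp signal is sketched below; the concatenation scheme, dimensions, and sentiment head are assumptions for illustration and are not specified by the abstract.

        # Hypothetical PyTorch sketch: BERT-style text features concatenated with an
        # embedded publication timestamp before a downstream task head (e.g. sentiment).
        import torch
        import torch.nn as nn

        TEXT_DIM, TIME_DIM, NUM_CLASSES = 768, 16, 3     # e.g. BERT hidden size; 3 sentiments

        time_proj = nn.Linear(1, TIME_DIM)               # embeds a normalized timestamp
        classifier = nn.Linear(TEXT_DIM + TIME_DIM, NUM_CLASSES)

        text_features = torch.randn(8, TEXT_DIM)         # stand-in for BERT [CLS] outputs
        timestamps = torch.rand(8, 1)                    # normalized article timestamps

        features = torch.cat([text_features, time_proj(timestamps)], dim=-1)
        sentiment_logits = classifier(features)          # e.g. negative / neutral / positive
        print(sentiment_logits.shape)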

    MULTI-LINGUAL AUTOMATIC SPEECH RECOGNITION FOR CONVERSATIONAL AI SYSTEMS AND APPLICATIONS

    Publication Number: US20250022457A1

    Publication Date: 2025-01-16

    Application Number: US18349716

    Application Date: 2023-07-10

    Abstract: Disclosed are systems and techniques for training machine learning models. The techniques include generating, using a first automatic speech recognition (ASR) model, a first text output based on a vector representation of a first speech data and generating, using a second ASR model, a second text output, wherein the second ASR model adds noise to a vector representation of the first text output to obtain a noisy vector representation of the first text output and is trained to remove the noise from the noisy vector representation of the first text output. The techniques include calculating a first loss of the second ASR model based at least on a comparison between the second text output and the first text output and modifying learnable parameters of the second ASR model to improve an accuracy of the second ASR model.
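
    The noising/denoising relationship between the two ASR models can be sketched as follows; the toy linear models, Gaussian noise, and MSE loss are assumptions for illustration.

        # Hypothetical PyTorch sketch: a first ASR model provides target text vectors,
        # and a second model is trained to remove noise added to those vectors.
        import torch
        import torch.nn as nn

        SPEECH_DIM, TEXT_DIM = 80, 32

        first_asr = nn.Linear(SPEECH_DIM, TEXT_DIM)        # first ASR model (teacher)
        second_asr = nn.Linear(TEXT_DIM, TEXT_DIM)         # second ASR model (denoiser)

        speech_vec = torch.randn(4, SPEECH_DIM)            # vector repr. of speech data
        first_text = first_asr(speech_vec).detach()        # first text output (target)

        noisy = first_text + 0.1 * torch.randn_like(first_text)   # add noise
        second_text = second_asr(noisy)                    # learn to remove the noise

        # First loss compares the second text output with the first text output and is
        # used to modify the learnable parameters of the second ASR model.
        loss = nn.functional.mse_loss(second_text, first_text)
        loss.backward()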

    FINANCIAL INVESTMENT PREDICTIONS AND RECOMMENDATIONS USING NEURAL NETWORKS

    Publication Number: US20240144373A1

    Publication Date: 2024-05-02

    Application Number: US18051206

    Application Date: 2022-10-31

    CPC classification number: G06Q40/06 G06N3/08

    Abstract: In various examples, interactive systems that use neural networks to determine financial investment predictions or recommendations are presented. Systems and methods are disclosed that determine financial predictions or recommendations associated with one or more investments using a neural network(s). The financial predictions may include a predicted movement of an investment (e.g., extremely down, down, preserved, up, extremely up, etc.), a predicted price of an investment (e.g., a future stock price, etc.), a specific investment for a user to buy/sell/trade, and/or so forth. In some examples, the systems and methods may include an interactive system(s), such as a dialogue system(s), that interacts with users to provide the financial predictions.
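
    A minimal two-headed predictor over the listed movement categories and a future price is sketched below; the feature set, architecture, and category identifiers are assumptions for illustration only.

        # Hypothetical PyTorch sketch: one head classifies predicted movement, another
        # regresses a predicted price; both could back an interactive dialogue system.
        import torch
        import torch.nn as nn

        MOVEMENTS = ["extremely_down", "down", "preserved", "up", "extremely_up"]
        FEATURE_DIM = 10                                  # e.g. recent prices, volume, news scores

        movement_model = nn.Sequential(
            nn.Linear(FEATURE_DIM, 32), nn.ReLU(), nn.Linear(32, len(MOVEMENTS))
        )
        price_model = nn.Sequential(                      # regression head for a future price
            nn.Linear(FEATURE_DIM, 32), nn.ReLU(), nn.Linear(32, 1)
        )

        features = torch.randn(1, FEATURE_DIM)
        movement = MOVEMENTS[movement_model(features).argmax(dim=-1).item()]
        predicted_price = price_model(features).item()
        print(movement, predicted_price)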

    REVERSIBLE SPEECH-TO-SPEECH TRANSLATION FOR CONVERSATIONAL AI SYSTEMS AND APPLICATIONS

    Publication Number: US20240428020A1

    Publication Date: 2024-12-26

    Application Number: US18212408

    Application Date: 2023-06-21

    Abstract: Disclosed are apparatuses, systems, and techniques that may use machine learning for reversible translations of speech utterances. The techniques include training and using duplex neural networks (NNs) having a first subnetwork and a second subnetwork that are mirror images of each other. Training data for training the duplex NNs may include a target output that includes a first speech utterance in a first language, a first training input that includes the target output distorted by a first noise, and a second training input that includes a second speech utterance in a second language. The duplex NNs may be trained to identify, using the first training input and the second training input, at least one of the target output or the first noise.
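
    A rough sketch of the duplex training setup follows; the pair of linear subnetworks, the latent stand-in for the second-language utterance, and the denoising objective are assumptions for illustration, not the patent's specified networks.

        # Hypothetical PyTorch sketch: two mirror-image subnetworks trained to recover a
        # clean target utterance from its noisy version plus a second-language input.
        import torch
        import torch.nn as nn

        SPEECH_DIM, LATENT_DIM = 80, 32

        forward_net = nn.Linear(SPEECH_DIM, LATENT_DIM)    # first subnetwork
        backward_net = nn.Linear(LATENT_DIM, SPEECH_DIM)   # mirror-image second subnetwork

        target = torch.randn(4, SPEECH_DIM)                # speech utterance, first language
        noise = 0.1 * torch.randn_like(target)
        noisy_input = target + noise                       # first training input
        other_lang = torch.randn(4, LATENT_DIM)            # second-language utterance (latent stand-in)

        # Pass the noisy utterance through the duplex pair, conditioning on the
        # second-language representation, and learn to recover the clean target.
        latent = forward_net(noisy_input) + other_lang
        reconstruction = backward_net(latent)
        loss = nn.functional.mse_loss(reconstruction, target)
        loss.backward()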
