ADVERSARIAL LEARNING FRAMEWORK FOR PERSONA-BASED DIALOGUE MODELING

    Publication No.: US20230368778A1

    Publication Date: 2023-11-16

    Application No.: US18204746

    Filing Date: 2023-06-01

    Abstract: Various embodiments may be generally directed to the use of an adversarial learning framework for persona-based dialogue modeling. In some embodiments, automated multi-turn dialogue response generation may be performed using a persona-based hierarchical recurrent encoder-decoder-based generative adversarial network (phredGAN). Such a phredGAN may feature a persona-based hierarchical recurrent encoder-decoder (PHRED) generator and a conditional discriminator. In some embodiments, the conditional discriminator may include an adversarial discriminator that is provided with attribute representations as inputs. In some other embodiments, the conditional discriminator may include an attribute discriminator, and attribute representations may be handled as targets of the attribute discriminator. The embodiments are not limited in this context.
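The two conditional-discriminator variants named in the abstract differ in where the attribute representation enters: as an input alongside the response, or as a target to be predicted from the response. The following is a minimal sketch of that interface difference only. It assumes toy linear scorers in place of the recurrent networks a phredGAN would use, and every function name, vector, and weight here is hypothetical rather than taken from the application.

```python
import math

def dot(weights, features):
    """Inner product of two equal-length lists."""
    return sum(w * x for w, x in zip(weights, features))

def adversarial_discriminator(weights, response_vec, attribute_vec):
    """Variant 1 (assumed form): the attribute representation is an INPUT.

    Returns a score in (0, 1) for 'this response is real, given this attribute'.
    """
    features = response_vec + attribute_vec  # list concatenation
    return 1.0 / (1.0 + math.exp(-dot(weights, features)))

def attribute_discriminator(weight_rows, response_vec):
    """Variant 2 (assumed form): the attribute is a TARGET.

    Sees only the response and returns one probability per candidate attribute.
    """
    logits = [dot(row, response_vec) for row in weight_rows]
    peak = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy vectors: a 3-dimensional response and a 2-dimensional persona.
response = [0.2, -0.1, 0.7]
persona = [1.0, 0.0]
score = adversarial_discriminator([0.0] * 5, response, persona)
probs = attribute_discriminator([[0.0] * 3, [0.0] * 3], response)
```

With all-zero weights, the first variant is maximally uncertain about real versus generated, and the second spreads its probability evenly over the candidate attributes. Training would move both away from that starting point.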

DELTA MODELS FOR PROVIDING PRIVATIZED SPEECH-TO-TEXT DURING VIRTUAL MEETINGS

    Publication No.: US20230352026A1

    Publication Date: 2023-11-02

    Application No.: US17732876

    Filing Date: 2022-04-29

    Abstract: Provided herein are systems and methods for using delta models to provide privatized speech-to-text during virtual meetings. In one embodiment, a system may include a non-transitory computer-readable medium, a communications interface, and a processor. The processor may be configured to execute processor-executable instructions to join a virtual meeting in which each participant may exchange one or more audio streams with the other participants. The instructions may include receiving, from a video conference provider, a local model for speech recognition. The local model may be a copy of a centralized model. The instructions may include performing speech recognition on the one or more audio streams using the local model. Performing speech recognition may include identifying audio feature data within the one or more audio streams; identifying, based on a vocabulary database, user-specific vocabulary within the audio feature data; and generating, based on the user-specific vocabulary, a private transcription of the one or more audio streams.
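The three recognition steps in the abstract (identify audio feature data, identify user-specific vocabulary from a vocabulary database, generate a private transcription) can be sketched as a small pipeline. This is an illustrative sketch under strong assumptions: the "audio feature data" is stood in for by already-decoded text chunks, the vocabulary database is a plain dictionary, and all function names and sample terms are hypothetical. It does not model the local or centralized speech models themselves.

```python
def extract_features(audio_stream):
    """Step 1 (stand-in): normalize each non-empty chunk into a feature token."""
    return [chunk.strip().lower() for chunk in audio_stream if chunk.strip()]

def find_user_vocabulary(features, vocabulary_db):
    """Step 2: map feature positions to user-specific terms found in the database."""
    return {
        index: vocabulary_db[token]
        for index, token in enumerate(features)
        if token in vocabulary_db
    }

def private_transcription(features, matches):
    """Step 3: build the transcript, preferring user-specific terms where matched."""
    return " ".join(matches.get(index, token) for index, token in enumerate(features))

# Hypothetical user vocabulary: a phrase a generic model tends to mishear.
vocabulary_db = {"kube cuddle": "kubectl"}
stream = ["Run", "kube cuddle", " ", "now"]

features = extract_features(stream)
matches = find_user_vocabulary(features, vocabulary_db)
transcript = private_transcription(features, matches)
print(transcript)  # run kubectl now
```

Keeping the vocabulary lookup as its own step means the user-specific terms stay in a structure separate from the generic recognition output, which is the part a privatized design would keep on the participant's side.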