Duplex communications for conversational AI by dynamically responsive interrupting content

    Publication No.: US11605384B1

    Publication Date: 2023-03-14

    Application No.: US17390118

    Application Date: 2021-07-30

    Abstract: Systems and methods of presenting interrupting content during human speech are disclosed. The proposed systems offer improved duplex communications in conversational AI platforms. In some embodiments, the system receives speech data and evaluates the data using linguistic models. If the linguistic models detect indications of linguistic irregularities such as mispronunciation, a smart feedback assistant can determine that the system should interrupt the speaker in near-real-time and provide feedback regarding their pronunciation. In addition, conversational irregularities may also be detected, causing the smart feedback assistant to interrupt with presentation of moderating guidance. In some cases, emotion models may also be utilized to detect emotional states based on the speaker's voice in order to offer near-immediate feedback. Users can also customize the manner and occasions in which they are interrupted.
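
The interrupt decision described in the abstract above can be pictured with a minimal sketch. Everything named here (`Detection`, `InterruptPreferences`, `choose_interruption`, the confidence threshold) is hypothetical and not taken from the patent; the sketch only assumes that each model reports a kind of irregularity with a confidence, and that user preferences filter which kinds may interrupt.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Detection:
    """One irregularity reported by a model (hypothetical structure)."""
    kind: str          # e.g. "pronunciation", "conversation", "emotion"
    confidence: float  # 0.0 to 1.0
    message: str       # feedback to present if we interrupt


@dataclass
class InterruptPreferences:
    """User-customizable settings (assumed, not from the patent)."""
    enabled_kinds: Set[str] = field(
        default_factory=lambda: {"pronunciation", "conversation", "emotion"}
    )
    min_confidence: float = 0.7


def choose_interruption(
    detections: Iterable[Detection], prefs: InterruptPreferences
) -> Optional[Detection]:
    """Return the detection to interrupt with, or None to stay silent."""
    eligible = [
        d for d in detections
        if d.kind in prefs.enabled_kinds and d.confidence >= prefs.min_confidence
    ]
    if not eligible:
        return None
    # Interrupt with the most confident eligible detection.
    return max(eligible, key=lambda d: d.confidence)
```

A user who only wants pronunciation feedback would pass `InterruptPreferences(enabled_kinds={"pronunciation"})`, and all other detections would be ignored.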

    Selective data encoding and machine learning video synthesis for content streaming systems and applications

    Publication No.: US12301785B1

    Publication Date: 2025-05-13

    Application No.: US18045915

    Application Date: 2022-10-12

    Abstract: Systems and methods of selectively compressing video data are disclosed. The proposed systems provide a computer-implemented process configured to classify a person's behavior(s) during a video and encode the behaviors as a representation of the video. The encoding will be tailor-generated based on the specific display configuration of the target device at which playback is expected to occur. Target device displays with lower resolution and video quality characteristics will trigger an encoding of the video data that has less complexity than target device displays with higher resolution and video quality characteristics. When playback of the video is requested at the target device, a reconstruction of the video is generated by a video synthesizer based on a reference image of the person and the encoding rather than the original video file, significantly reducing memory, processing, power, and bandwidth requirements.
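
As a rough illustration of the display-dependent encoding in the abstract above, the sketch below keeps fewer behavior codes for lower-resolution targets. The pixel thresholds, the three detail levels, and the idea of tagging each behavior code with a minimum level are all assumptions made for this example; the patent abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Display:
    """Target device display configuration (hypothetical structure)."""
    width: int
    height: int


def encoding_level(display: Display) -> int:
    """Map display resolution to a detail level: 1 (coarse) to 3 (fine)."""
    pixels = display.width * display.height
    if pixels <= 640 * 480:      # assumed threshold
        return 1
    if pixels <= 1920 * 1080:    # assumed threshold
        return 2
    return 3


def encode(behaviors: Sequence[Tuple[str, int]], display: Display) -> List[str]:
    """Keep only the behavior codes whose minimum level the display supports.

    Each behavior is a (code, min_level) pair; subtle behaviors carry a
    higher min_level and are dropped for low-resolution targets.
    """
    level = encoding_level(display)
    return [code for code, min_level in behaviors if min_level <= level]
```

For instance, with behaviors `[("nod", 1), ("smile", 2), ("blink", 3)]`, a 320x240 target would receive only the coarse `"nod"` code, while a 4K target would receive all three.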

    Conversational AI-encoded language for video navigation

    Publication No.: US12288570B1

    Publication Date: 2025-04-29

    Application No.: US18049446

    Application Date: 2022-10-25

    Abstract: Systems and methods of compressing video content as encoded data and selectively reconstructing portions of the content are disclosed. The proposed systems provide a computer-implemented process configured to classify a person's behavior(s) during a video and encode the behaviors as a representation of the video. When playback of the video is requested, a video navigation assistant will allow the end-user to select specific segments of the video based on topics discussed in the video and the codes that were generated to represent the video. The user is then able to move through segments of the video in a sequence that aligns with their viewing preferences.

    Pronunciation features for language models

    Publication No.: US12266350B1

    Publication Date: 2025-04-01

    Application No.: US17583812

    Application Date: 2022-01-25

    Abstract: Systems and methods are directed toward evaluating auditory inputs against a range of tolerance to provide feedback regarding pronunciation. An auditory input may be evaluated using a trained machine learning system and evaluated for similarity against a target word. Similarity may be scored and then evaluated to determine whether the similarity falls within a range of tolerance, wherein the range of tolerance may be adjusted or modified for particular uses. A score within the range of tolerance is indicative of a word that has been pronounced such that it would be perceptible.
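
The score-then-check-tolerance flow in the abstract above might look like the following sketch. The abstract describes a trained machine learning system as the scorer; here `difflib.SequenceMatcher` over phoneme lists is swapped in purely as a stand-in so the example runs. The `Tolerance` class, the phoneme symbols, and the `"ok"`/`"retry"` results are hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Sequence


@dataclass(frozen=True)
class Tolerance:
    """Inclusive range of similarity scores treated as acceptable."""
    low: float
    high: float = 1.0

    def contains(self, score: float) -> bool:
        return self.low <= score <= self.high


def similarity(heard: Sequence[str], target: Sequence[str]) -> float:
    """Stand-in scorer: phoneme-sequence similarity from 0.0 to 1.0.

    A real system would use a trained model here (per the abstract).
    """
    return SequenceMatcher(None, list(heard), list(target)).ratio()


def pronunciation_feedback(
    heard: Sequence[str], target: Sequence[str], tolerance: Tolerance
) -> str:
    """Return "ok" if the score falls within tolerance, else "retry"."""
    score = similarity(heard, target)
    return "ok" if tolerance.contains(score) else "retry"
```

Because the tolerance is a separate value, it can be adjusted per use case: a beginner lesson might accept `Tolerance(low=0.8)`, while an advanced drill might require `Tolerance(low=0.9)` for the same scorer and target word.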

    Volume control for audio and video conferencing applications

    Publication No.: US11487498B2

    Publication Date: 2022-11-01

    Application No.: US17153466

    Application Date: 2021-01-20

    Abstract: In various examples, when a local user initiates an instance of a video conference application, the user may be provided with a user interface (UI) that displays an icon corresponding to the user as well as several other icons corresponding to participants in the instance of the video conference application. As the users converse, the local user may find that a particular participant is speaking loudly compared to the other remote users. The local user may then select an icon corresponding to the particular participant and move the icon away from the local user's icon in the UI. Based on moving the remote user's icon away from the local user's icon, the system may reduce the output volume of the audio data for the participant. Further, if the local user moves the participant icon closer to the local user's icon, the volume for the participant may be increased.
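
The icon-distance behavior in the abstract above can be sketched as a mapping from on-screen distance to output gain. The abstract only states that moving an icon away lowers the volume and moving it closer raises it; the linear falloff, the `max_distance` of 400 pixels, and the function name below are assumptions chosen for illustration.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def volume_for_distance(
    local_xy: Point,
    remote_xy: Point,
    max_distance: float = 400.0,  # assumed: distance at which gain bottoms out
    min_gain: float = 0.0,        # assumed: quietest allowed gain
) -> float:
    """Return a gain in [min_gain, 1.0] that falls linearly with icon distance."""
    distance = math.hypot(remote_xy[0] - local_xy[0], remote_xy[1] - local_xy[1])
    fraction = min(distance / max_distance, 1.0)  # clamp beyond max_distance
    return 1.0 - fraction * (1.0 - min_gain)
```

A UI drag handler would call this with the new icon position after each move and apply the result to that participant's audio stream. Setting `min_gain` above zero would keep a distant participant faintly audible rather than muted.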

    Volume control for audio and video conferencing applications

    Publication No.: US20220229626A1

    Publication Date: 2022-07-21

    Application No.: US17153466

    Application Date: 2021-01-20

    Abstract: In various examples, when a local user initiates an instance of a video conference application, the user may be provided with a user interface (UI) that displays an icon corresponding to the user as well as several other icons corresponding to participants in the instance of the video conference application. As the users converse, the local user may find that a particular participant is speaking loudly compared to the other remote users. The local user may then select an icon corresponding to the particular participant and move the icon away from the local user's icon in the UI. Based on moving the remote user's icon away from the local user's icon, the system may reduce the output volume of the audio data for the participant. Further, if the local user moves the participant icon closer to the local user's icon, the volume for the participant may be increased.
