-
Publication No.: US12126791B1
Publication date: 2024-10-22
Application No.: US17664265
Application date: 2022-05-20
Applicant: NVIDIA Corporation
Inventor: Pratyush Mahapatra, Somayyeh Rahimi, Ruthie Lyle
IPC: H04N19/103, G06V20/40, G06V40/16, G06V40/20, H04L65/403, H04N19/136
CPC classification number: H04N19/103, G06V20/40, G06V40/176, G06V40/20, H04L65/403, H04N19/136
Abstract: Systems and methods of compressing video data are disclosed. The proposed systems provide a computer-implemented process configured to classify a person's behavior(s) during a video and encode the behaviors as a representation of the video. When playback of the video is requested, a reconstruction of the video is generated by a video synthesizer based on a reference image of the person and the sequence of codes corresponding to their behavior during the video. Storage and transmission of the video can then be limited to the reference image and the behavioral codes rather than the video file itself, significantly reducing memory and bandwidth requirements.
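As a rough illustration of the storage idea in this abstract, the sketch below keeps only a reference image and a sequence of behavior codes. It assumes per-frame behavior labels have already been produced by some classifier, and it uses run-length grouping as a stand-in encoding; the names `BehaviorCode`, `CompressedVideo`, `encode_behaviors`, and `expand_codes` are hypothetical and do not come from the patent. The video synthesizer that would turn the codes back into frames is not shown.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class BehaviorCode:
    """Hypothetical code: one behavior label held for a number of frames."""
    label: str
    frames: int


@dataclass(frozen=True)
class CompressedVideo:
    """What gets stored or transmitted instead of the video file."""
    reference_image: bytes
    codes: tuple


def encode_behaviors(reference_image, frame_labels):
    # Assumption: frame_labels is the per-frame output of a behavior classifier.
    # Consecutive identical labels collapse into a single (label, frames) code.
    codes = tuple(
        BehaviorCode(label, sum(1 for _ in run))
        for label, run in groupby(frame_labels)
    )
    return CompressedVideo(reference_image, codes)


def expand_codes(video):
    # Recover the per-frame label timeline that a synthesizer would consume.
    return [code.label for code in video.codes for _ in range(code.frames)]
```

A synthesizer would then take `reference_image` together with the expanded timeline and render frames; that step depends on the generative model and is outside this sketch.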
-
Publication No.: US11605384B1
Publication date: 2023-03-14
Application No.: US17390118
Application date: 2021-07-30
Applicant: NVIDIA Corporation
Inventor: Steven Dalton, Siddha Ganju, Ruthie Lyle
Abstract: Systems and methods of presenting interrupting content during human speech are disclosed. The proposed systems offer improved duplex communications in conversational AI platforms. In some embodiments, the system receives speech data and evaluates the data using linguistic models. If the linguistic models detect indications of linguistic irregularities such as mispronunciation, a smart feedback assistant can determine that the system should interrupt the speaker in near-real-time and provide feedback regarding their pronunciation. In addition, conversational irregularities may also be detected, causing the smart feedback assistant to interrupt with presentation of moderating guidance. In some cases, emotion models may also be utilized to detect emotional states based on the speaker's voice in order to offer near-immediate feedback. Users can also customize the manner and occasions in which they are interrupted.
-
Publication No.: US12301785B1
Publication date: 2025-05-13
Application No.: US18045915
Application date: 2022-10-12
Applicant: NVIDIA Corporation
Inventor: Pratyush Mahapatra, Ruthie Lyle
IPC: H04N19/103, H04N19/146, H04N19/156
Abstract: Systems and methods of selectively compressing video data are disclosed. The proposed systems provide a computer-implemented process configured to classify a person's behavior(s) during a video and encode the behaviors as a representation of the video. The encoding will be tailor-generated based on the specific display configuration of the target device at which playback is expected to occur. Target device displays with lower resolution and video quality characteristics will trigger an encoding of the video data that has less complexity than target device displays with higher resolution and video quality characteristics. When playback of the video is requested at the target device, a reconstruction of the video is generated by a video synthesizer based on a reference image of the person and the encoding rather than the original video file, significantly reducing memory, processing, power, and bandwidth requirements.
-
Publication No.: US12288570B1
Publication date: 2025-04-29
Application No.: US18049446
Application date: 2022-10-25
Applicant: NVIDIA Corporation
Inventor: Ruthie Lyle, Somayyeh Rahimi, Pratyush Mahapatra
IPC: G06F3/0482, G06V10/94, G06V20/40, G11B27/031, G11B27/34, G06V40/16
Abstract: Systems and methods of compressing video content as encoded data and selectively reconstructing portions of the content are disclosed. The proposed systems provide a computer-implemented process configured to classify a person's behavior(s) during a video and encode the behaviors as a representation of the video. When playback of the video is requested, a video navigation assistant will allow the end-user to select specific segments of the video based on topics discussed in the video and the codes that were generated to represent the video. The user is then able to move through segments of the video in a sequence that aligns with their viewing preferences.
-
Publication No.: US12266350B1
Publication date: 2025-04-01
Application No.: US17583812
Application date: 2022-01-25
Applicant: Nvidia Corporation
Inventor: Siddha Ganju, Ruthie Lyle, Steven Dalton
Abstract: Systems and methods are directed toward evaluating auditory inputs against a range of tolerance to provide feedback regarding pronunciation. An auditory input may be evaluated using a trained machine learning system and evaluated for similarity against a target word. Similarity may be scored and then evaluated to determine whether the similarity falls within a range of tolerance, wherein the range of tolerance may be adjusted or modified for particular uses. A score within the range of tolerance is indicative of a word that has been pronounced such that it would be perceptible.
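The scoring-and-tolerance step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the abstract refers to a trained machine learning system, whereas the `similarity` function here substitutes Python's `difflib.SequenceMatcher` over phoneme-like strings purely as a placeholder. The function names and the default tolerance bounds are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(heard, target):
    # Placeholder scorer (assumption): a real system would use a trained model.
    # Returns a value in [0.0, 1.0], where 1.0 means identical sequences.
    return SequenceMatcher(None, heard, target).ratio()


def within_tolerance(score, tolerance=(0.8, 1.0)):
    # The tolerance range is adjustable per use case, as the abstract notes.
    low, high = tolerance
    return low <= score <= high


def evaluate_pronunciation(heard, target, tolerance=(0.8, 1.0)):
    # Score the input against the target, then test it against the range.
    score = similarity(heard, target)
    return score, within_tolerance(score, tolerance)
```

Widening the range, for example `evaluate_pronunciation(heard, target, tolerance=(0.5, 1.0))`, would accept looser pronunciations, which is one way the adjustable range might be applied for different learners or settings.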
-
Publication No.: US12167169B1
Publication date: 2024-12-10
Application No.: US17933186
Application date: 2022-09-19
Applicant: NVIDIA Corporation
Inventor: Siddha Ganju, Ruthie Lyle, Naveen Kumar Rai, Ronay Ak, Andrew Russell
Abstract: A digital avatar system can process video streams and generate synthetic video with a digital avatar. The digital avatar provides the appearance of a participant from the video stream talking and one or more of performing various behaviors or actions consistent with the participant's behavior when they are live streamed. A digital avatar system can detect triggering events during a live stream and automatically switch to an avatar mode.
-
Publication No.: US11487498B2
Publication date: 2022-11-01
Application No.: US17153466
Application date: 2021-01-20
Applicant: NVIDIA Corporation
Inventor: Henning Lysdal, Ruthie Lyle
Abstract: In various examples, when a local user initiates an instance of a video conference application, the user may be provided with a user interface (UI) that displays an icon corresponding to the user as well as several other icons corresponding to participants in the instance of the video conference application. As the users converse, the local user may find that a particular participant is speaking loudly compared to the other remote users. The local user may then select an icon corresponding to the particular participant and move the icon away from the local user's icon in the UI. Based on moving the remote user's icon away from the local user's icon, the system may reduce the output volume of the audio data for the participant. Further, if the local user moves the participant icon closer to the local user's icon, the volume for the participant may be increased.
-
Publication No.: US20220229626A1
Publication date: 2022-07-21
Application No.: US17153466
Application date: 2021-01-20
Applicant: NVIDIA Corporation
Inventor: Henning Lysdal, Ruthie Lyle
Abstract: In various examples, when a local user initiates an instance of a video conference application, the user may be provided with a user interface (UI) that displays an icon corresponding to the user as well as several other icons corresponding to participants in the instance of the video conference application. As the users converse, the local user may find that a particular participant is speaking loudly compared to the other remote users. The local user may then select an icon corresponding to the particular participant and move the icon away from the local user's icon in the UI. Based on moving the remote user's icon away from the local user's icon, the system may reduce the output volume of the audio data for the participant. Further, if the local user moves the participant icon closer to the local user's icon, the volume for the participant may be increased.