-
11.
Publication No.: US20250124734A1
Publication Date: 2025-04-17
Application No.: US18999826
Filing Date: 2024-12-23
Applicant: Nvidia Corporation
Inventor: Sakthivel Sivaraman , Nishant Puri , Yuzhuo Ren , Atousa Torabi , Shubhadeep Das , Niranjan Avadhanam , Sumit Kumar Bhattacharya , Jason Roche
IPC: G06V40/10 , G06F3/01 , G06F16/632 , G06T7/73 , G06T15/06
Abstract: Interactions with virtual systems may be difficult when users inadvertently fail to provide sufficient information to proceed with their requests. Certain types of inputs, such as auditory inputs, may lack sufficient information to properly provide a response to the user. Additional information, such as image data, may enable user gestures or poses to supplement the auditory inputs to enable response generation without requesting additional information from users.
-
12.
Publication No.: US12211308B2
Publication Date: 2025-01-28
Application No.: US17462833
Filing Date: 2021-08-31
Applicant: Nvidia Corporation
Inventor: Sakthivel Sivaraman , Nishant Puri , Yuzhuo Ren , Atousa Torabi , Shubhadeep Das , Niranjan Avadhanam , Sumit Kumar Bhattacharya , Jason Roche
IPC: G06V40/10 , G06F3/01 , G06F16/632 , G06T7/73 , G06T15/06
Abstract: Interactions with virtual systems may be difficult when users inadvertently fail to provide sufficient information to proceed with their requests. Certain types of inputs, such as auditory inputs, may lack sufficient information to properly provide a response to the user. Additional information, such as image data, may enable user gestures or poses to supplement the auditory inputs to enable response generation without requesting additional information from users.
-
13.
Publication No.: US20240143072A1
Publication Date: 2024-05-02
Application No.: US18410801
Filing Date: 2024-01-11
Applicant: NVIDIA Corporation
Inventor: Nuri Murat Arar , Sujay Yadawadkar , Hairong Jiang , Nishant Puri , Niranjan Avadhanam
CPC classification number: G06F3/013 , G06F18/2148 , G06F18/2178 , G06V10/462 , G06V20/597 , G06V40/165 , G06V40/171
Abstract: In various examples, systems and methods are disclosed that provide highly accurate gaze predictions that are specific to a particular user by generating and applying, in deployment, personalized calibration functions to outputs and/or layers of a machine learning model. The calibration functions corresponding to a specific user may operate on outputs (e.g., gaze predictions from a machine learning model) to provide updated values and gaze predictions. The calibration functions may also be applied to one or more last layers of the machine learning model to operate on features identified by the model and provide values that are more accurate. The calibration functions may be generated using explicit calibration methods by instructing users to gaze at a number of identified ground truth locations within the interior of the vehicle. Once generated, the calibration functions may be modified or refined through implicit gaze calibration points and/or regions based on gaze saliency maps.
-
14.
Publication No.: US11657263B2
Publication Date: 2023-05-23
Application No.: US17005914
Filing Date: 2020-08-28
Applicant: NVIDIA Corporation
Inventor: Nuri Murat Arar , Hairong Jiang , Nishant Puri , Rajath Shetty , Niranjan Avadhanam
IPC: G06K9/62 , G06F18/214 , G06N20/00 , G06V10/94 , G06V20/59 , G06V20/64 , G06V40/16 , G06V40/18 , G06F18/21
CPC classification number: G06F18/214 , G06F18/2193 , G06N20/00 , G06V10/95 , G06V20/597 , G06V20/647 , G06V40/171 , G06V40/193
Abstract: Systems and methods for determining the gaze direction of a subject and projecting this gaze direction onto specific regions of an arbitrary three-dimensional geometry. In an exemplary embodiment, gaze direction may be determined by a regression-based machine learning model. The determined gaze direction is then projected onto a three-dimensional map or set of surfaces that may represent any desired object or system. Maps may represent any three-dimensional layout or geometry, whether actual or virtual. Gaze vectors can thus be used to determine the object of gaze within any environment. Systems can also readily and efficiently adapt for use in different environments by retrieving a different set of surfaces or regions for each environment.
-
15.
Publication No.: US11487968B2
Publication Date: 2022-11-01
Application No.: US17004252
Filing Date: 2020-08-27
Applicant: NVIDIA Corporation
Inventor: Nuri Murat Arar , Niranjan Avadhanam , Nishant Puri , Shagan Sah , Rajath Shetty , Sujay Yadawadkar , Pavlo Molchanov
Abstract: Systems and methods for more accurate and robust determination of subject characteristics from an image of the subject. One or more machine learning models receive as input an image of a subject, and output both facial landmarks and associated confidence values. Confidence values represent the degrees to which portions of the subject's face corresponding to those landmarks are occluded, i.e., the amount of uncertainty in the position of each landmark location. These landmark points and their associated confidence values, and/or associated information, may then be input to another set of one or more machine learning models which may output any facial analysis quantity or quantities, such as the subject's gaze direction, head pose, drowsiness state, cognitive load, or distraction state.
-
16.
Publication No.: US20210183072A1
Publication Date: 2021-06-17
Application No.: US17010205
Filing Date: 2020-09-02
Applicant: NVIDIA Corporation
Inventor: Nishant Puri
Abstract: Machine learning systems and methods that determine gaze direction by using face orientation information, such as facial landmarks, to modify eye direction information determined from images of the subject's eyes. System inputs include eye crops of the eyes of the subject, as well as face orientation information such as facial landmarks of the subject's face in the input image. Facial orientation information, or facial landmark information, is used to determine a coarse prediction of gaze direction as well as to learn a context vector of features describing subject face pose. The context vector is then used to adaptively re-weight the eye direction features determined from the eye crops. The re-weighted features are then combined with the coarse gaze prediction to determine gaze direction.
-
17.
Publication No.: US20210182609A1
Publication Date: 2021-06-17
Application No.: US17005914
Filing Date: 2020-08-28
Applicant: NVIDIA Corporation
Inventor: Nuri Murat Arar , Hairong Jiang , Nishant Puri , Rajath Shetty , Niranjan Avadhanam
Abstract: Systems and methods for determining the gaze direction of a subject and projecting this gaze direction onto specific regions of an arbitrary three-dimensional geometry. In an exemplary embodiment, gaze direction may be determined by a regression-based machine learning model. The determined gaze direction is then projected onto a three-dimensional map or set of surfaces that may represent any desired object or system. Maps may represent any three-dimensional layout or geometry, whether actual or virtual. Gaze vectors can thus be used to determine the object of gaze within any environment. Systems can also readily and efficiently adapt for use in different environments by retrieving a different set of surfaces or regions for each environment.
-
18.
Publication No.: US20240265254A1
Publication Date: 2024-08-08
Application No.: US18605628
Filing Date: 2024-03-14
Applicant: NVIDIA Corporation
Inventor: Nuri Murat Arar , Niranjan Avadhanam , Nishant Puri , Shagan Sah , Rajath Shetty , Sujay Yadawadkar , Pavlo Molchanov
IPC: G06N3/08 , G06F18/21 , G06F18/214 , G06N20/00 , G06V10/764 , G06V10/774 , G06V10/82 , G06V10/94 , G06V20/59 , G06V20/64 , G06V40/16 , G06V40/18
CPC classification number: G06N3/08 , G06F18/214 , G06F18/2193 , G06N20/00 , G06V10/764 , G06V10/774 , G06V10/82 , G06V10/95 , G06V20/597 , G06V20/647 , G06V40/171 , G06V40/193
Abstract: Systems and methods for more accurate and robust determination of subject characteristics from an image of the subject. One or more machine learning models receive as input an image of a subject, and output both facial landmarks and associated confidence values. Confidence values represent the degrees to which portions of the subject's face corresponding to those landmarks are occluded, i.e., the amount of uncertainty in the position of each landmark location. These landmark points and their associated confidence values, and/or associated information, may then be input to another set of one or more machine learning models which may output any facial analysis quantity or quantities, such as the subject's gaze direction, head pose, drowsiness state, cognitive load, or distraction state.
-
19.
Publication No.: US11886634B2
Publication Date: 2024-01-30
Application No.: US17206585
Filing Date: 2021-03-19
Applicant: NVIDIA Corporation
Inventor: Nuri Murat Arar , Sujay Yadawadkar , Hairong Jiang , Nishant Puri , Niranjan Avadhanam
CPC classification number: G06F3/013 , G06F18/2148 , G06F18/2178 , G06V10/462 , G06V20/597 , G06V40/165 , G06V40/171
Abstract: In various examples, systems and methods are disclosed that provide highly accurate gaze predictions that are specific to a particular user by generating and applying, in deployment, personalized calibration functions to outputs and/or layers of a machine learning model. The calibration functions corresponding to a specific user may operate on outputs (e.g., gaze predictions from a machine learning model) to provide updated values and gaze predictions. The calibration functions may also be applied to one or more last layers of the machine learning model to operate on features identified by the model and provide values that are more accurate. The calibration functions may be generated using explicit calibration methods by instructing users to gaze at a number of identified ground truth locations within the interior of the vehicle. Once generated, the calibration functions may be modified or refined through implicit gaze calibration points and/or regions based on gaze saliency maps.
-
20.
Publication No.: US20230351807A1
Publication Date: 2023-11-02
Application No.: US17661706
Filing Date: 2022-05-02
Applicant: NVIDIA Corporation
Inventor: Yuzhuo Ren , Weili Nie , Arash Vahdat , Animashree Anandkumar , Nishant Puri , Niranjan Avadhanam
IPC: G06V40/16 , G06V10/82 , G06V10/774 , G06V10/62
CPC classification number: G06V40/176 , G06V10/82 , G06V10/774 , G06V10/62 , G06V40/164
Abstract: A machine learning model (MLM) may be trained and evaluated. Attribute-based performance metrics may be analyzed to identify attributes for which the MLM is performing below a threshold when each is present in a sample. A generative neural network (GNN) may be used to generate samples including compositions of the attributes, and the samples may be used to augment the data used to train the MLM. This may be repeated until one or more criteria are satisfied. In various examples, a temporal sequence of data items, such as frames of a video, may be generated which may form samples of the data set. Sets of attribute values may be determined based on one or more temporal scenarios to be represented in the data set, and one or more GNNs may be used to generate the sequence to depict information corresponding to the attribute values.