Abstract:
The present disclosure provides a non-invasive, inexpensive and unobtrusive system for heart rate (HR) monitoring that addresses the well-known issues of face-video-based systems caused by respiration, facial expressions, out-of-plane movements, camera parameters and environmental factors. These issues are alleviated by filtering, pulse modelling and HR tracking. Quality measures incorporating out-of-plane movements are defined to assign a quality to each video frame, unlike existing approaches that provide a single quality for the entire video. To handle out-of-plane movement, Fourier basis functions are employed to reconstruct pulse signals at affected locations. A Bayesian-decision-theory-based method performs HR tracking using previous HR and quality estimates for improved HR monitoring.
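The Fourier-basis reconstruction step described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the basis size (`n_harmonics`) and the least-squares fit to uncorrupted frames are illustrative assumptions.

```python
import numpy as np

def fourier_reconstruct(signal, good_mask, n_harmonics=4):
    """Reconstruct a pulse signal at corrupted frames by least-squares
    fitting a truncated Fourier basis to the uncorrupted samples.
    Illustrative sketch; basis size and fitting choice are assumptions."""
    n = len(signal)
    t = np.arange(n) / n
    # Build a Fourier design matrix: DC term plus sin/cos harmonics.
    cols = [np.ones(n)]
    for k in range(1, n_harmonics + 1):
        cols.append(np.sin(2 * np.pi * k * t))
        cols.append(np.cos(2 * np.pi * k * t))
    B = np.column_stack(cols)
    # Fit coefficients using only the good (unaffected) frames.
    coef, *_ = np.linalg.lstsq(B[good_mask], signal[good_mask], rcond=None)
    recon = B @ coef
    out = signal.copy()
    out[~good_mask] = recon[~good_mask]  # replace only the affected locations
    return out
```

Fitting only on the unaffected frames lets the periodic structure of the pulse fill in the frames damaged by out-of-plane movement.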
Abstract:
A method and system are provided for measuring and analyzing gait parameters and postural balance of a person using a Kinect system. The system is easy to use and can be installed at home as well as in a clinic. The system includes a Kinect sensor, a software development kit (SDK) and a processor. The temporal skeleton information obtained from the Kinect sensor is used to evaluate gait parameters including stride length, stride time, stance time and swing time. Eigenvector-based curvature detection is used to analyze the gait pattern at different speeds. In another embodiment, Eigenvector-based curvature detection is employed to detect static single limb stance (SLS) duration along with gait variables for evaluating body balance.
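Given the skeleton stream, stride length and stride time reduce to simple computations between successive heel strikes of the same foot. The sketch below assumes heel-strike frame indices have already been detected upstream (e.g. by the curvature-based method); the function names and inputs are hypothetical.

```python
import numpy as np

def stride_parameters(ankle_xyz, timestamps, strike_frames):
    """Given per-frame 3-D ankle positions from a skeleton stream and the
    frame indices of successive heel strikes of the SAME foot, compute
    stride lengths (m) and stride times (s). Heel-strike detection itself
    is assumed to have been done upstream."""
    lengths, times = [], []
    for a, b in zip(strike_frames[:-1], strike_frames[1:]):
        lengths.append(float(np.linalg.norm(ankle_xyz[b] - ankle_xyz[a])))
        times.append(float(timestamps[b] - timestamps[a]))
    return lengths, times
```

Stance and swing times follow the same pattern, using toe-off events in addition to heel strikes.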
Abstract:
This disclosure relates generally to a method and system for multi-object tracking and navigation without pre-sequencing. Multi-object navigation is an embodied AI task in which, unlike object navigation, which searches only for an instance of a single target object, a robot must localize instances of a plurality of target objects in an environment. The method of the present disclosure employs a deep reinforcement learning (DRL) based framework for sequence-agnostic multi-object navigation. The robot receives from an actor-critic network a deterministic local policy to compute a low-level navigational action and navigates along the shortest path from its current location to the long-term goal so as to reach a target object. A deep reinforcement learning network is trained with a computed reward function that rewards the robot when a navigational action brings it to an instance of any of the plurality of target objects.
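One way to make the reward sequence-agnostic is to reward reaching any not-yet-found target instance, with no imposed visiting order. The sketch below is a hypothetical stand-in for the disclosed reward function; all constants (step penalty, success bonus, reach radius) are illustrative assumptions.

```python
import math

def compute_reward(robot_xy, targets, reached,
                   step_penalty=0.01, success_bonus=2.5, reach_radius=1.0):
    """Hypothetical sequence-agnostic reward: the robot earns a bonus for
    reaching ANY target instance it has not yet found (no fixed order),
    minus a small per-step penalty. `reached` is mutated in place."""
    reward = -step_penalty
    for i, (tx, ty) in enumerate(targets):
        if i in reached:
            continue  # this instance was already credited
        if math.hypot(robot_xy[0] - tx, robot_xy[1] - ty) <= reach_radius:
            reached.add(i)
            reward += success_bonus
    return reward
```

Because the bonus fires for whichever instance is reached first, the learned policy is free to choose its own visiting order.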
Abstract:
This disclosure addresses the unresolved problem of object disambiguation for an embodied agent. The embodiments of the present disclosure provide a method and system for disambiguation of referred objects for embodied agents. With the phrase-to-graph network disclosed in the system of the present disclosure, any natural language object description specifying the object disambiguation task can be converted into a semantic graph representation. This not only provides a formal representation of the referred object and object instances but also helps to detect ambiguity in disambiguating the referred object using a real-time multi-view aggregation algorithm. The real-time multi-view aggregation algorithm processes multiple observations from an environment and finds the unique instances of the referred object. The method of the present disclosure demonstrates significant improvement in ambiguity detection, providing accurate, context-specific information sufficient for a user to formulate a reply that resolves the ambiguity.
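A semantic graph of the kind described typically has the referred object and related objects as nodes, with attributes and spatial relations as edges. The toy builder below stands in for the phrase-to-graph network: the real system predicts this structure from free-form language, whereas here the parse ("red cup on the table") is supplied by hand.

```python
def phrase_to_graph(target, attributes, relations):
    """Toy semantic-graph builder standing in for the phrase-to-graph
    network. Nodes are the referred object and related objects; edges
    carry attributes and spatial relations. The parse is hand-supplied."""
    nodes = {target}
    edges = []
    for attr in attributes:
        edges.append((target, "has_attribute", attr))
    for rel, other in relations:
        nodes.add(other)
        edges.append((target, rel, other))
    return {"nodes": sorted(nodes), "edges": edges}
```

Matching this graph against observed object instances is what exposes ambiguity: if several instances satisfy every edge, the description is under-specified.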
Abstract:
The disclosure generally relates to methods and systems for enabling human-robot interaction by cognition sharing that includes gesture and audio. Conventional techniques that use gestures and speech require extra hardware setup and are limited to navigation in structured outdoor driving environments. The present disclosure provides methods and systems that solve the technical problem of enabling human-robot interaction with a two-step approach that transfers cognitive load from the human to the robot. In the first step, an accurate shared perspective associated with the task is determined by computing relative frame transformations based on understanding of the subject's navigational gestures. The shared perspective is then transformed into the robot's field of view. In the second step, the transformed shared perspective is given to a language grounding technique to accurately determine a final goal associated with the task.
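The core of the first step, a relative frame transformation, can be illustrated in 2-D: a goal point indicated by the human, expressed in the world frame, is re-expressed in the robot's frame given the robot's pose. This is a generic rigid-transform sketch, not the disclosed pipeline.

```python
import numpy as np

def to_robot_frame(point_world, robot_pose):
    """Transform a point from the world frame into the robot's frame,
    given the robot pose (x, y, heading in radians). Generic helper
    illustrating the relative frame transformation step."""
    x, y, theta = robot_pose
    c, s = np.cos(theta), np.sin(theta)
    # Apply the inverse of the robot's world pose to the point.
    dx, dy = point_world[0] - x, point_world[1] - y
    return np.array([c * dx + s * dy, -s * dx + c * dy])
```

Once the indicated goal is in the robot's frame, the language grounding step can interpret spatial references ("to the left of", "behind") from the robot's own viewpoint.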
Abstract:
This disclosure relates generally to a method and system for generating 2D animated lip images synchronized to an audio signal for an unseen subject. Recent advances in Convolutional Neural Network (CNN) based approaches generate convincing talking heads, but personalizing such talking heads requires training the model with a large number of samples of the target person, which is time consuming. The lip generator system receives an audio signal and a target lip image of an unseen target subject as inputs from a user and processes these inputs to extract a plurality of high-dimensional audio-image features. The lip generator system is meta-trained with a training dataset that covers a large variety of subject ethnicities and vocabulary. The meta-trained model generates realistic animation for a previously unseen face and unseen audio when fine-tuned with only a few-shot samples for a predefined interval of time. Additionally, the method preserves intrinsic features of the unseen target subject.
Abstract:
Systems and methods of the present disclosure facilitate rigid point cloud registration with characteristics including a shape constraint, translation proportional to distance, and a spatial point-set distribution model for handling scale. The method of the present disclosure enables registration of a rigid template point cloud to a given reference point cloud. Shape-constrained gravitation, as induced by the reference point cloud, controls movement of the template point cloud such that at each iteration the template point cloud better aligns with the reference point cloud in terms of shape. This enables alignment in difficult conditions such as the presence of outliers and/or missing parts, translation, rotation and scaling. Systems and methods of the present disclosure also provide an automated method, as against conventional methods that depend on manually adjusted parameters.
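The gravitation-with-shape-constraint idea can be sketched as a toy iteration: every template point is pulled toward the reference cloud by softened inverse-square forces, and the resulting motion is then projected onto the nearest rigid transform (via a Kabsch fit) so the template's shape is preserved. The step size, softening constant and force law are illustrative assumptions, not the disclosed formulation.

```python
import numpy as np

def gravitation_step(template, reference, step=0.5):
    """One toy iteration of shape-constrained gravitation. `template` and
    `reference` are (n, d) arrays of points."""
    # Gravitational displacement for every template point.
    disp = np.zeros_like(template)
    for i, p in enumerate(template):
        d = reference - p                    # vectors to all reference points
        r2 = (d ** 2).sum(axis=1) + 1.0      # softened squared distances
        disp[i] = (d / r2[:, None]).sum(axis=0)
    target = template + step * disp / len(reference)
    # Project the motion onto the best-fit rigid transform (Kabsch),
    # enforcing the shape constraint.
    mu_t, mu_g = template.mean(0), target.mean(0)
    H = (template - mu_t).T @ (target - mu_g)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                 # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return (template - mu_t) @ R.T + mu_g
```

Because every update is rigid, inter-point distances within the template are preserved exactly while the cloud drifts toward alignment.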
Abstract:
Motion blur occurs when acquiring images and videos with cameras fitted to high-speed devices, for example, drones. Distorted images interfere with the mapping of visual points, so pose estimation and tracking may become corrupted. A system and method for solving inverse problems using a coupled autoencoder is disclosed. In an embodiment, solving an inverse problem, for example, generating a clean sample from an unknown corrupted sample, is disclosed. The coupled autoencoder learns the autoencoder weights and the coupling map (between source and target) simultaneously. The technique is applicable to any transfer learning problem. The embodiments of the present disclosure implement a new formulation that recasts deblurring as a transfer learning problem, which is solved using the proposed coupled autoencoder.
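The joint objective can be illustrated with a toy *linear* coupled autoencoder: encoder/decoder pairs for the corrupted (source) and clean (target) domains plus a latent coupling map M, all minimized together by gradient descent. Everything here (linearity, sizes, learning rate, equal loss weights) is an illustrative assumption, not the disclosed architecture.

```python
import numpy as np

def train_coupled_ae(Xs, Xt, k=3, lr=0.005, iters=4000, seed=0):
    """Toy linear coupled autoencoder: minimize
    ||Ds Es Xs - Xs||^2 + ||Dt Et Xt - Xt||^2 + ||M Es Xs - Et Xt||^2
    jointly over (Es, Ds, Et, Dt, M). Columns of Xs/Xt are paired samples."""
    rng = np.random.default_rng(seed)
    ds, dt = Xs.shape[0], Xt.shape[0]
    Es = rng.standard_normal((k, ds)) * 0.1
    Ds = rng.standard_normal((ds, k)) * 0.1
    Et = rng.standard_normal((k, dt)) * 0.1
    Dt = rng.standard_normal((dt, k)) * 0.1
    M = np.eye(k)
    losses = []
    for _ in range(iters):
        Zs, Zt = Es @ Xs, Et @ Xt
        Rs, Rt = Ds @ Zs - Xs, Dt @ Zt - Xt   # reconstruction residuals
        C = M @ Zs - Zt                        # latent coupling residual
        losses.append((Rs**2).sum() + (Rt**2).sum() + (C**2).sum())
        # Gradient steps on the joint objective (factor of 2 folded into lr).
        Ds -= lr * (Rs @ Zs.T)
        Es -= lr * (Ds.T @ Rs + M.T @ C) @ Xs.T
        Dt -= lr * (Rt @ Zt.T)
        Et -= lr * (Dt.T @ Rt - C) @ Xt.T
        M -= lr * (C @ Zs.T)
    return Es, Ds, Et, Dt, M, losses

def restore(x_corrupted, Es, Dt, M):
    """Clean-domain estimate of a corrupted sample: encode, couple, decode."""
    return Dt @ (M @ (Es @ x_corrupted))
```

At inference time only the source encoder, coupling map and target decoder are used, which is exactly the transfer-learning reading of deblurring: map a blurred sample into the clean domain.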
Abstract:
A system and method are provided for real-time estimation of heart rate (HR) from one or more face videos acquired in a non-invasive manner. The system receives face videos and obtains several blocks, consisting of facial skin areas, as regions of interest (ROI). Temporal fragments are then extracted from these blocks and filtered to minimize noise. Temporal fragments corrupted by noise are identified using an image-processing range filter and pruned, and the remaining fragments are used for further processing. The HR of each remaining temporal fragment, referred to as the local HR, is estimated along with its quality. Eventually, a quality-based fusion is applied to estimate a global HR corresponding to the received face videos. In addition, the disclosure herein is applicable to frontal, profile and multiple faces and performs in real time.
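The final fusion step admits a very compact sketch: combine the local HR estimates into a global HR by weighting each with its quality score. The quality-weighted average shown here is an illustrative reading of "quality-based fusion", not necessarily the disclosed rule.

```python
import numpy as np

def fuse_hr(local_hrs, qualities):
    """Quality-based fusion: combine per-fragment (local) HR estimates
    into a single global HR, weighting each estimate by its quality."""
    local_hrs = np.asarray(local_hrs, float)
    w = np.asarray(qualities, float)
    if w.sum() <= 0:
        return float(local_hrs.mean())  # fall back to plain averaging
    return float((w * local_hrs).sum() / w.sum())
```

High-quality fragments thus dominate the global estimate, while noisy fragments that survived pruning contribute little.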
Abstract:
A method and system for estimating three-dimensional measurements of a physical object utilizing readings from inertial sensors are provided. The method involves capturing, by a handheld unit, three-dimensional aspects of the physical object. Raw recordings received from the inertial sensors are used to develop a raw rotation matrix. The raw rotation matrix is subjected to low-pass filtering to obtain a processed rotation matrix constituted of filtered Euler angles. Coordinates from the processed rotation matrix are used to estimate the gravitational component along each of the three axes, leading to determination of acceleration values and, further, calculation of the measurement of each dimension of the physical object.
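The final stage, from gravity-compensated acceleration to a dimension estimate, can be sketched as a double integration along one axis. The gravity samples are assumed to come from the filtered rotation matrix as described above; trapezoidal integration is an illustrative choice.

```python
import numpy as np

def dimension_from_accel(accel, gravity, dt):
    """Estimate the length of one swept dimension from raw accelerometer
    samples along one axis: subtract the per-sample gravity component,
    then double-integrate the linear acceleration over the sweep."""
    lin = np.asarray(accel, float) - np.asarray(gravity, float)
    # Trapezoidal integration: acceleration -> velocity -> position.
    vel = np.concatenate(([0.0], np.cumsum((lin[1:] + lin[:-1]) / 2 * dt)))
    pos = np.concatenate(([0.0], np.cumsum((vel[1:] + vel[:-1]) / 2 * dt)))
    return abs(pos[-1] - pos[0])
```

Repeating the sweep along each axis of the object yields the three dimensions; in practice, drift makes the low-pass filtering of the rotation matrix essential before the gravity subtraction.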