Abstract:
Systems and methods are disclosed to perform multi-human 3D tracking with a plurality of cameras. At each view, a module receives each camera output and provides 2D human detection candidates. A plurality of 2D tracking modules are connected to the CNNs, each 2D tracking module managing 2D tracking independently. A 3D tracking module is connected to the 2D tracking modules to receive promising 2D tracking hypotheses. The 3D tracking module selects trajectories from the 2D tracking modules to generate 3D tracking hypotheses.
Abstract:
Systems and methods are disclosed to detect unsafe system states by capturing and analyzing data from a plurality of sensors detecting parameters of the system; and applying temporal difference (TD) learning to learn a function to approximate an expected future reward given current and historical sensor readings.
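As an illustration of the temporal difference idea named above, the following is a minimal sketch of TD(0) with a linear value function. It is not the method of the disclosure: the function name `td0_linear`, the one-hot "sensor" features, the reward of -1 for reaching an unsafe state, and the values of `alpha`, `gamma`, and `sweeps` are all assumptions chosen for the example.

```python
def dot(w, x):
    """Inner product of a weight vector and a feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def td0_linear(episodes, n_features, alpha=0.1, gamma=0.9, sweeps=200):
    """Learn weights w so that dot(w, x) approximates the expected
    discounted future reward from feature vector x (TD(0) update)."""
    w = [0.0] * n_features
    for _ in range(sweeps):
        for episode in episodes:
            # Each step is (features, reward, next_features or None at the end).
            for x, r, x_next in episode:
                v = dot(w, x)
                v_next = 0.0 if x_next is None else dot(w, x_next)
                delta = r + gamma * v_next - v  # temporal difference error
                w = [wi + alpha * delta * xi for wi, xi in zip(w, x)]
    return w


# Hypothetical two-state chain: "normal" -> "warning" -> unsafe terminal state.
normal, warning = [1.0, 0.0], [0.0, 1.0]
episode = [(normal, 0.0, warning), (warning, -1.0, None)]
weights = td0_linear([episode], n_features=2)
print(weights)
```

In this toy chain the learned value of the "warning" state approaches -1 and the value of the "normal" state approaches -0.9, so a low predicted value flags a state that tends to lead to the unsafe outcome.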
Abstract:
Systems and methods are disclosed for determining 3D human pose by generating an Appearance and Position Context (APC) local descriptor that achieves selectivity and invariance while requiring no background subtraction; jointly learning visual words and pose regressors in a supervised manner; and estimating the 3D human pose.
Abstract:
Systems and methods are disclosed to predict driving danger by capturing vehicle dynamic parameters, driver physiological data, and driver behavior features; applying a learning algorithm to the features; and predicting driving danger.
Abstract:
Systems and methods are disclosed for processing a low-resolution image by performing high-resolution edge segment extraction on the low-resolution image; performing image super-resolution on each edge segment; performing reconstruction constraint reinforcement; and generating a high-quality image from the low-quality image.
Abstract:
A method and system for training a neural network of a visual recognition computer system extract at least one feature of an image or video frame with a feature extractor; approximate the at least one feature of the image or video frame with an auxiliary output provided in the neural network; and measure a feature difference between the extracted at least one feature of the image or video frame and the approximated at least one feature of the image or video frame with an auxiliary error calculator. A joint learner of the method and system adjusts at least one parameter of the neural network to minimize the measured feature difference.
Abstract:
Systems and methods are disclosed for determining human pose by generating an Appearance and Position Context (APC) local descriptor that achieves selectivity and invariance while requiring no background subtraction; jointly learning visual words and pose regressors in a supervised manner; and estimating the human pose.
Abstract:
A fully automatic, computationally efficient video segmentation method is disclosed that employs sequential clustering of sparse image features. Both edge and corner features of a video scene are used to capture the outline of foreground objects. The feature clustering is built on motion models that work for any type of object and for moving or static cameras; two motion layers are assumed, arising from camera and/or foreground motion and from the depth difference between the foreground and background. Sequential linear regression is applied to the sequences and the instantaneous replacements of image features in order to compute affine motion parameters for the foreground and background layers while also accounting for temporal smoothness. The foreground layer is then extracted based on the sparse feature clustering, which is time efficient, and is refined incrementally using Kalman filtering.
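To make the affine motion parameters mentioned above concrete, the following is a minimal sketch that fits a single affine motion model to matched feature positions by ordinary batch least squares. It is a simplification, not the sequential regression of the disclosure: the function names `solve3` and `fit_affine`, the sample points, and the motion values are hypothetical, and no layer clustering or Kalman refinement is attempted.

```python
def solve3(m, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = a[r][n] - sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / a[r][r]
    return x


def fit_affine(src, dst):
    """Least-squares fit of x' = a11*x + a12*y + tx and y' = a21*x + a22*y + ty.
    Returns [[a11, a12, tx], [a21, a22, ty]]."""
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations: (A^T A) p = A^T b, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):
        atb = [sum(r[i] * d[k] for r, d in zip(rows, dst)) for i in range(3)]
        params.append(solve3(ata, atb))
    return params


# Hypothetical feature positions in one frame and their positions in the next.
true_motion = [[1.0, 0.1, 2.0], [-0.1, 1.0, -1.0]]
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
dst = [tuple(m[0] * x + m[1] * y + m[2] for m in true_motion) for x, y in src]
estimated = fit_affine(src, dst)
print(estimated)
```

With noise-free correspondences the fit recovers the assumed motion; in a two-layer setting one such model would be fitted per layer, with features assigned to whichever model explains their motion better.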