Abstract:
Various examples related to determining a location of an active speaker are provided. In one example, image data of a room from an image capture device is received and a three-dimensional model is generated. First audio data from a first microphone array at the image capture device is received. Second audio data from a second microphone array laterally spaced from the image capture device is received. Using the three-dimensional model, a location of the second microphone array with respect to the image capture device is determined. Using the audio data and the location and angular orientation of the second microphone array, an estimated location of the active speaker is determined. Using the estimated location, a setting for the image capture device is determined and outputted to highlight the active speaker.
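The final estimation step can be illustrated with a small two-dimensional sketch. It assumes, beyond what the abstract states, that each microphone array reports a single direction-of-arrival angle, that the first array sits at the origin with zero yaw, and that the speaker lies where the two bearing lines cross. The names `intersect_bearings` and `estimate_speaker` are hypothetical.

```python
import math


def intersect_bearings(p1, angle1, p2, angle2):
    """Intersect two 2D bearing lines given origins and room-frame angles (radians).

    Returns (x, y), or None when the lines are parallel.
    """
    d1 = (math.cos(angle1), math.sin(angle1))
    d2 = (math.cos(angle2), math.sin(angle2))
    # 2D cross product of the two directions; zero means parallel lines.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-9:
        return None
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    # Distance along the first bearing to the crossing point.
    t = (dx * d2[1] - dy * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])


def estimate_speaker(camera_doa, array2_pos, array2_yaw, array2_doa):
    """Estimate the speaker position in the camera frame.

    Assumption: the first array is at the origin with zero yaw, and the second
    array's position and yaw were recovered from the three-dimensional model.
    """
    # Convert the second array's local bearing into the camera frame.
    world_angle2 = array2_yaw + array2_doa
    return intersect_bearings((0.0, 0.0), camera_doa, array2_pos, world_angle2)


speaker = estimate_speaker(
    camera_doa=math.radians(45),
    array2_pos=(4.0, 0.0),
    array2_yaw=math.radians(90),
    array2_doa=math.radians(45),
)
```

The returned coordinates could then drive a pan or zoom setting; how that setting is derived is not specified in the abstract.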
Abstract:
A system and method for tracking, identifying, and labeling objects or features of interest is provided. In some embodiments, tracking is accomplished using a unique signature of the feature of interest and image stabilization techniques. According to some aspects, a frame of reference using predetermined markers is defined and updated based on a change in location of the markers and/or specific signature information. Individual objects or features within the frame may also be tracked and identified. Objects may be tracked by comparing two still images, determining a change in position of an object between the still images, calculating a movement vector of the object, and using the movement vector to update the location of an image device.
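The two-image tracking step can be sketched as follows. This is a minimal illustration that assumes the object is the only bright region in each grayscale still and that its position is taken as the centroid of pixels above a threshold; the abstract does not say how the position is found. The helper names are hypothetical.

```python
def centroid(frame, threshold=128):
    """Return the (x, y) centroid of pixels at or above threshold, or None."""
    xs, ys = [], []
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value >= threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def movement_vector(frame_a, frame_b, threshold=128):
    """Change in object position between two still images, as (dx, dy)."""
    a = centroid(frame_a, threshold)
    b = centroid(frame_b, threshold)
    if a is None or b is None:
        return None
    return (b[0] - a[0], b[1] - a[1])


def update_location(location, vector):
    """Shift the device location by the movement vector (assumed same units)."""
    return (location[0] + vector[0], location[1] + vector[1])


# Two 3x4 stills with a single bright pixel that moves between them.
frame_a = [[0, 0, 0, 0], [0, 255, 0, 0], [0, 0, 0, 0]]
frame_b = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 255]]
vector = movement_vector(frame_a, frame_b)
new_location = update_location((10, 10), vector)
```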
Abstract:
A system for determining occupancy includes a first luminaire having a first camera to detect a first occupant and a second luminaire having a second camera to detect a second occupant. The system further includes a processor to determine whether the first camera and the second camera have a common visual field and to determine whether the first occupant and the second occupant are the same occupant in response to determining that the first camera and the second camera have a common visual field.
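The decision logic can be sketched under some stated assumptions: each camera's visual field is modeled as an axis-aligned rectangle in shared floor coordinates, and two detections count as the same occupant when they fall within a distance threshold. Neither the rectangle model nor the `max_dist` threshold comes from the abstract, and the function names are hypothetical.

```python
import math


def fields_overlap(a, b):
    """True if two rectangles (xmin, ymin, xmax, ymax) share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def count_occupants(field1, detection1, field2, detection2, max_dist=0.5):
    """Count distinct occupants seen by two luminaire cameras.

    The same-occupant check is only attempted when the cameras have a common
    visual field, mirroring the order of operations in the abstract.
    """
    if not fields_overlap(field1, field2):
        return 2
    same_occupant = math.dist(detection1, detection2) <= max_dist
    return 1 if same_occupant else 2


# Overlapping fields and nearby detections: treated as one occupant.
occupants = count_occupants((0, 0, 4, 4), (3.5, 2.0), (3, 0, 7, 4), (3.6, 2.1))
```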
Abstract:
A viewing device for a vehicle is provided with plural cameras (20), an image generation unit (36), a display (40) and a switching control unit (38). The plural cameras have different imaging ranges. On the basis of captured images from one or a plural number of the cameras, the image generation unit generates plural viewing images that differ in at least one of viewpoint or viewing angle. The display displays the viewing images. The switching control unit is capable of switching a viewing image being displayed at the display to another of the viewing images.
Abstract:
A method based on Structure from Motion for processing a plurality of sparse images acquired by one or more acquisition devices to generate a sparse 3D point cloud and a plurality of internal and external parameters of the acquisition devices includes the steps of collecting the images; extracting keypoints therefrom and generating keypoint descriptors; organizing the images in a proximity graph; pairwise image matching and generating tracks connecting keypoints according to maximum proximity between keypoints; performing an autocalibration between image clusters to extract internal and external parameters of the acquisition devices, wherein calibration groups are defined that contain a plurality of image clusters and wherein a clustering algorithm iteratively merges the clusters in a model expressed in a common local reference system starting from clusters belonging to the same calibration group; and performing a Euclidean reconstruction of the object as a sparse 3D point cloud based on the extracted parameters.
Abstract:
A method is disclosed of processing a sequence of video frames showing motion of a subject to compare the motion of the subject with a reference motion. The method comprises storing at least one reference motion data frame defining a reference motion, each reference motion data frame corresponding to respective first and second reference video frames in a sequence of video frames showing the reference motion and comprising a plurality of optical flow vectors, each optical flow vector corresponding to a respective area segment defined in the first reference video frame and a corresponding area segment defined in the second reference video frame and defining optical flow between the area segment defined in the first reference video frame and the area segment defined in the second reference video frame. The method further comprises receiving a sequence of video frames to be processed. The method further comprises processing at least one pair of the received video frames to generate a motion data frame defining motion of a subject between the pair of received video frames. Each pair of received video frames that is processed is processed by, for each area segment of the reference video frames, determining a corresponding area segment in a first video frame of the pair and a corresponding area segment in a second video frame of the pair. Each of the pairs of received video frames is further processed by, for each determined pair of corresponding area segments, comparing the area segments and generating an optical flow vector defining optical flow between the area segments. Each of the pairs of received video frames is further processed by generating a motion data frame for the pair of received video frames, the motion data frame comprising the optical flow vectors generated for the determined pairs of corresponding area segments. 
The method further comprises comparing the at least one reference motion data frame defining the reference motion to the at least one generated motion data frame defining the motion of the subject and generating a similarity metric for the motion of the subject and the reference motion.
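The comparison step can be illustrated with a short sketch. The abstract does not name a particular similarity metric, so mean cosine similarity between corresponding optical flow vectors is used here purely as an assumption. Each motion data frame is represented as a list of (dx, dy) vectors, one per area segment, and `flow_similarity` is a hypothetical name.

```python
import math


def flow_similarity(reference_frame, generated_frame):
    """Mean cosine similarity between corresponding optical flow vectors.

    Both arguments are lists of (dx, dy) tuples, one per area segment, in the
    same segment order. Segments with a zero-length vector in either frame are
    skipped. Returns a value in [-1, 1], or 0.0 if no segment is usable.
    """
    if len(reference_frame) != len(generated_frame):
        raise ValueError("motion data frames must cover the same segments")
    total = 0.0
    counted = 0
    for (rx, ry), (gx, gy) in zip(reference_frame, generated_frame):
        ref_norm = math.hypot(rx, ry)
        gen_norm = math.hypot(gx, gy)
        if ref_norm == 0.0 or gen_norm == 0.0:
            continue
        total += (rx * gx + ry * gy) / (ref_norm * gen_norm)
        counted += 1
    return total / counted if counted else 0.0


reference = [(1.0, 0.0), (0.0, 2.0), (0.0, 0.0)]
subject = [(2.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
score = flow_similarity(reference, subject)
```

Cosine similarity ignores vector magnitude, so it compares direction of motion only; a metric that also penalizes speed differences would be an equally plausible choice.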
Abstract:
A method and apparatus for processing video data streams for an area. Objects are identified in the area from images in the video data streams. The video data streams are generated by cameras. First locations are identified for the objects using the images. The first locations are defined using a coordinate system for the images. Graphical representations are formed for the objects using the images. The graphical representations are displayed for the objects in second locations in a model of the area on a display system with respect to features in the area that are represented in the model. The second locations are defined using a geographic coordinate system for the model. A first location in the first locations for an object in the objects corresponds to a second location in the second locations for a corresponding graphical representation in the graphical representations.
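The correspondence between a first location in image coordinates and a second location in geographic coordinates can be sketched as below. The sketch assumes a top-down camera whose image edges align with a known latitude/longitude bounding box, so a linear interpolation suffices; the abstract does not describe the mapping, and a real installation would likely need a calibrated projective transform. The name `image_to_geo` and the `geo_bounds` layout are hypothetical.

```python
def image_to_geo(px, py, width, height, geo_bounds):
    """Map a pixel location to (latitude, longitude) by linear interpolation.

    geo_bounds is (lat_top, lon_left, lat_bottom, lon_right), the assumed
    geographic extent covered by the image.
    """
    lat_top, lon_left, lat_bottom, lon_right = geo_bounds
    lon = lon_left + (px / width) * (lon_right - lon_left)
    lat = lat_top + (py / height) * (lat_bottom - lat_top)
    return (lat, lon)


# Place an object detected at the image center into the model of the area.
second_location = image_to_geo(320, 240, 640, 480, (10.0, 20.0, 8.0, 24.0))
```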
Abstract:
[Object] To perform more stable and highly accurate attitude estimation. [Solution] The attitude optimization unit optimizes attitude parameters of a human body model (tree structure), such as the articulation position, the angle, and the number of articulations, so as to match a region in which a human body can exist, switching among a plurality of optimization techniques to use an optimum technique. Note that optimization techniques include 1. initial value, 2. algorithm, and 3. restriction, and optimization is performed by switching among these three. For example, it is possible to apply the present disclosure to an image processing device that performs image processing of optimizing the articulation position and angle of a human body model.
Abstract:
A portable terminal (1) includes an information acquisition unit (55), an output control unit (56) and an output unit (20). The information acquisition unit (55) acquires move locus information on a move due to daily behavior of a user and a move due to non-daily behavior. The output control unit (56) controls an output unit (20) such that the output unit (20) outputs the acquired move locus information in a state where it is possible to distinguish the move due to non-daily behavior of the user. In this way, in the portable terminal (1), the acquired move locus information on the move due to daily behavior of the user and the move due to non-daily behavior can be output in a state where it is possible to distinguish the move due to non-daily behavior of the user.
Abstract:
An image processing apparatus that provides privacy protection and monitoring acquires a background image and a captured image, extracts an object region corresponding to a predetermined object from the captured image, sets a masking color based on color information about the background image, and masks the extracted object region based on the masking color.
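The masking step can be sketched as follows. The abstract says only that the masking color is set based on color information about the background image; using the mean background color is an assumption made here for illustration. Images are nested lists of (R, G, B) tuples, the object region is a set of (row, column) pairs, and the function names are hypothetical.

```python
def mean_color(image):
    """Average (R, G, B) over all pixels, using integer division."""
    pixels = [pixel for row in image for pixel in row]
    count = len(pixels)
    return tuple(sum(pixel[c] for pixel in pixels) // count for c in range(3))


def mask_region(captured, region, color):
    """Return a copy of captured with every pixel in region set to color."""
    return [
        [color if (r, c) in region else pixel for c, pixel in enumerate(row)]
        for r, row in enumerate(captured)
    ]


background = [[(10, 20, 30), (10, 20, 30)], [(10, 20, 30), (10, 20, 30)]]
captured = [[(10, 20, 30), (200, 150, 120)], [(10, 20, 30), (10, 20, 30)]]
object_region = {(0, 1)}  # pixels assumed to belong to the extracted object

masking_color = mean_color(background)
masked = mask_region(captured, object_region, masking_color)
```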