Abstract:
An omni-directional (360-degree) camera with an integrated microphone array is proposed. The primary application for such a camera is videoconferencing and meeting recording, and the device is designed to be placed on a meeting room table. The microphone array is in a planar configuration, and the microphones are located as close to the desktop as possible to eliminate sound reflections from the table. The camera is connected to the microphone array base with a thin cylindrical rod, which is acoustically invisible to the microphone array for the frequency range 50-4000 Hz. This gives sound a direct path from the person talking to every microphone in the array, so the array can be used for sound source localization (determining the location of the talker) and beam-forming (improving the sound quality of the talker by passing only sound from a particular direction). The camera array is elevated from the table to provide a near frontal viewpoint of the meeting participants.
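As an illustration of the sound source localization step, the following is a minimal sketch for a single pair of microphones. The abstract does not describe the localization algorithm; the plain cross-correlation search, the far-field geometry, and the names `best_lag` and `bearing_from_lag` are assumptions made for this example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def best_lag(a, b, max_lag):
    """Return the lag in samples at which b best matches a delayed copy of a."""
    best, best_score = 0, float("-inf")
    n = len(a)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += a[i] * b[j]  # cross-correlation at this lag
        if score > best_score:
            best, best_score = lag, score
    return best


def bearing_from_lag(lag, sample_rate, mic_spacing):
    """Far-field angle (radians) of the source from the broadside of a mic pair."""
    delay = lag / sample_rate
    # Clamp to the valid asin domain in case noise gives an impossible delay.
    s = max(-1.0, min(1.0, delay * SPEED_OF_SOUND / mic_spacing))
    return math.asin(s)
```

A full array would combine the estimates from several microphone pairs; this sketch covers only one pair.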
Abstract:
A system captures both whiteboard content and audio signals of a meeting using a digital camera and a microphone. The system can be retrofitted to any existing whiteboard. It computes the time stamps of pen strokes on the whiteboard by analyzing the sequence of captured snapshots. It also automatically produces a set of key frames representing all the written content on the whiteboard before each erasure. The whiteboard content serves as a visual index for efficiently browsing the meeting audio. The system not only captures the whiteboard content, but also helps users view and manage the captured meeting content efficiently and securely.
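The two analysis steps, stroke time stamps and key frames before each erasure, can be sketched on a toy model. Everything here is an assumption for illustration: snapshots are small grids of 0/1 ink flags, an erasure is detected as a large drop in ink count, and the names `stroke_timestamps` and `key_frame_indices` are hypothetical. The abstract does not specify the actual image analysis.

```python
def stroke_timestamps(snapshots, times):
    """Map each cell (row, col) to the time it first appears inked."""
    first_seen = {}
    for grid, t in zip(snapshots, times):
        for r, row in enumerate(grid):
            for c, inked in enumerate(row):
                if inked and (r, c) not in first_seen:
                    first_seen[(r, c)] = t
    return first_seen


def key_frame_indices(snapshots, drop_ratio=0.5):
    """Indices of snapshots taken just before a large drop in ink, plus the last inked one."""
    ink = [sum(sum(row) for row in grid) for grid in snapshots]
    keys = [i for i in range(len(ink) - 1)
            if ink[i] > 0 and ink[i + 1] < ink[i] * drop_ratio]
    last = len(ink) - 1
    if ink and ink[last] > 0 and last not in keys:
        keys.append(last)
    return keys
```

In this simplified form a cell that is erased and rewritten keeps its first time stamp; a fuller version would reset the map at each detected erasure.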
Abstract:
Audio and video frames are synchronized by hashing an audio frame at a sender and combining the resultant hash value with the video frame. The audio frame is transmitted over an audio network, such as a telephone network, and the video frame is transmitted over a digital network, such as an intranet. The audio frame may be combined with additional audio signals from an audio bridge. The receiver receives the audio signal from the audio bridge and performs the same hash function on the mixed signal as was performed on the original signal. The receiver correlates the hash value on the mixed signal with the hash value included with the video frame (wherein the video frame is one of several video frames buffered by the receiver). The receiver can thus identify the video frame that corresponds to the audio frame and render them simultaneously.
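The matching step at the receiver can be sketched as follows. The abstract does not name the hash function, so this example assumes a coarse signature built from the relative energy of sub-blocks, chosen because it tends to survive low-level additions from a bridge; `audio_signature`, `hamming`, and `match_video_frame` are hypothetical names.

```python
def audio_signature(frame, blocks=9):
    """Coarse signature: one bit per adjacent pair of sub-block energies."""
    size = len(frame) // blocks
    energy = [sum(x * x for x in frame[k * size:(k + 1) * size])
              for k in range(blocks)]
    return tuple(1 if energy[k] > energy[k + 1] else 0
                 for k in range(blocks - 1))


def hamming(a, b):
    """Number of positions at which two signatures differ."""
    return sum(x != y for x, y in zip(a, b))


def match_video_frame(received_audio, buffered):
    """Pick the buffered video frame whose attached signature is closest.

    buffered: list of (signature, video_frame) pairs held by the receiver.
    """
    sig = audio_signature(received_audio)
    return min(buffered, key=lambda item: hamming(sig, item[0]))[1]
```

The sender would attach `audio_signature(frame)` to the video frame captured at the same time; the receiver calls `match_video_frame` on the audio it gets from the bridge.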
Abstract:
A User Interface (UI) for a real-time panoramic image correction system and method that simplifies the use of the system for the user. The UI includes a control panel that allows a user to enter meeting table size and shape, camera position and orientation, and the amount of normalization desired (e.g. 0 to 100%). A window can also be implemented on a display that displays the corrected panoramic image. In this window, the head (either normalized or non-normalized) of a meeting participant, preferably one that is speaking, is extracted and displayed in a separate window. Additionally, the corrected panoramic image, whose size will vary in conjunction with the amount of warping applied, can be displayed and transmitted with extra pixels around its perimeter in order to allow the corrected or normalized panoramic image to adapt to any of the standard display sizes and resolutions and to simplify network transmission. The corrected image can also be transmitted with standard resolutions using non-unity pixel aspect ratios to simplify network transmission.
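The idea of adding extra pixels around the perimeter can be sketched as a padding calculation. The list of standard sizes, the centering policy, and the function name `pad_to_standard` are assumptions for illustration only.

```python
# Illustrative list of display sizes; the abstract does not enumerate any.
STANDARD_SIZES = [(640, 480), (1280, 720), (1920, 1080)]


def pad_to_standard(width, height, sizes=STANDARD_SIZES):
    """Return (target_w, target_h, left, top, right, bottom) for the smallest size that fits."""
    for tw, th in sorted(sizes, key=lambda s: s[0] * s[1]):
        if tw >= width and th >= height:
            dx, dy = tw - width, th - height
            left, top = dx // 2, dy // 2  # center the image in the frame
            return tw, th, left, top, dx - left, dy - top
    raise ValueError("image is larger than every listed size")
```

For example, a corrected panorama of 1200 x 301 pixels would be centered inside a 1280 x 720 frame.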
Abstract:
A panoramic camera design that is lower in cost and more robust, stable, and user-friendly than prior art designs. The camera design makes use of a unified molded structure of optical material to house a mirror, aligned sensor, and lens assembly. The unified molded structure of the camera keeps the sensed optical path enclosed to keep out dust and users' fingers and to maintain optical alignment.
Abstract:
A system and process for highlighting the current speaker on an on-going basis in each frame of a low frame-rate video of an event having multiple people in attendance, such as a video teleconference, is presented. In general, this is accomplished by periodically identifying an attendee that is currently speaking at a rate substantially faster than the video frame rate, and for each frame of the video updating the frame to highlight the attendee currently speaking. More particularly, an A/V source provides video and audio data streams to the client computing device, with current speaker data embedded into the audio stream via audio watermarking techniques. The client device extracts the current speaker data from the audio stream, and then renders and displays the video while using the current speaker data to periodically update the frame being displayed to highlight the current speaker.
Abstract:
A system and process for highlighting the current speaker on an on-going basis in each frame of a low frame-rate video of an event having multiple people in attendance, such as a video teleconference, is presented. In general, this is accomplished by periodically identifying an attendee that is currently speaking at a rate substantially faster than the video frame rate, and for each frame of the video updating the frame to highlight the attendee currently speaking. More particularly, an audio/visual (A/V) source provides separate video, audio, and current speaker data streams to a client computing device. The client device then uses these data streams to render and display the video and to periodically update the frame being displayed to highlight the current speaker depicted therein.
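The client-side update can be sketched as a lookup of the most recent speaker event at each refresh. The event format (time stamp, speaker identifier) and the name `current_speaker` are assumptions; the abstract does not define the layout of the current speaker data stream.

```python
from bisect import bisect_right


def current_speaker(events, now):
    """Return the speaker id of the latest event at or before `now`, or None.

    events: list of (timestamp, speaker_id) pairs sorted by timestamp.
    """
    times = [t for t, _ in events]
    i = bisect_right(times, now)
    return events[i - 1][1] if i else None
```

Because the speaker events arrive faster than the video frames, the client can call this several times while a single frame is on screen and move the highlight without waiting for the next frame.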
Abstract:
A foveated panoramic camera system includes multiple cameras oriented so that individual images captured by the cameras can be combined to form a panoramic image. Each of the cameras includes a lens having a focal length that corresponds to a field of view for the camera. A field of view for a camera overlaps with the field(s) of view of each adjacent camera. At least one of the cameras has a field of view that differs from fields of view of other cameras for capturing images that are situated at a greater distance from the camera system than are images captured by the other cameras. As a result, a more uniform resolution is achieved across all images captured by the multiple cameras. A mirror assembly is utilized to reflect object images into the multiple cameras to achieve a near center of projection for the camera system.
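The relation between focal length and field of view can be made concrete with the standard pinhole approximation, FOV = 2 * atan(w / (2 * f)), where w is the sensor width and f the focal length. The sketch below assumes a rectilinear lens focused at infinity; the function name and the sample values are illustrative and not taken from the abstract.

```python
import math


def field_of_view_deg(sensor_width_mm, focal_length_mm):
    """Horizontal field of view in degrees under the pinhole approximation."""
    return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))
```

A longer focal length gives a narrower field of view, which spreads the same number of sensor pixels over a smaller angle. This is why a camera aimed at more distant subjects can use a longer lens to keep resolution more uniform.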