Localized audio source extraction from video recordings
Abstract:
Technologies are generally described for a system that processes a collection of video recordings of a scene to extract and localize the sources of the recorded audio. According to some examples, video recordings captured by mobile devices from different perspectives may be uploaded to a central database. Video segments that capture an overlapping portion of the scene during an overlapping time period may be identified, and the relative location of each video capturing device may be determined. Audio data for the video segments may be indexed with a sub-frame time reference and with the relative locations as a function of the overlapping time. Using the indices that include the sub-frame time references and relative locations, audio sources for the audio data may be extracted and localized. The extracted audio sources may be transcribed and indexed to enable searching, and may be added back to each video recording as a separate audio channel.
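
The abstract does not say how the sub-frame time reference is computed. As an illustration only, the sketch below shows one common way to measure the time offset between two overlapping audio tracks at finer than whole-sample resolution: cross-correlation followed by parabolic interpolation around the peak. This technique is an assumption and is not taken from the source. The function names, the `max_lag` parameter, and the synthetic `gaussian_pulse` helper are all hypothetical.

```python
import math


def gaussian_pulse(center, length, width=5.0):
    """Hypothetical test signal: a smooth pulse centered at a fractional sample index."""
    return [math.exp(-((n - center) ** 2) / (2.0 * width ** 2)) for n in range(length)]


def cross_correlation(a, b, max_lag):
    """Return {lag: sum over n of a[n] * b[n + lag]} for lags in [-max_lag, max_lag]."""
    scores = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for n, x in enumerate(a):
            m = n + lag
            if 0 <= m < len(b):
                total += x * b[m]
        scores[lag] = total
    return scores


def estimate_delay(a, b, sample_rate, max_lag):
    """Estimate, in seconds, how far track b lags track a (negative if b leads)."""
    scores = cross_correlation(a, b, max_lag)
    best = max(scores, key=scores.get)  # integer lag with the strongest match
    frac = 0.0
    if -max_lag < best < max_lag:
        # Fit a parabola through the peak and its two neighbors to refine
        # the estimate to a fraction of a sample.
        y0, y1, y2 = scores[best - 1], scores[best], scores[best + 1]
        denom = y0 - 2.0 * y1 + y2
        if denom != 0.0:
            frac = 0.5 * (y0 - y2) / denom
    return (best + frac) / sample_rate


if __name__ == "__main__":
    # Two synthetic "recordings" of the same sound, 3.4 samples apart.
    track_a = gaussian_pulse(100.0, 200)
    track_b = gaussian_pulse(103.4, 200)
    delay = estimate_delay(track_a, track_b, sample_rate=48000, max_lag=10)
    print(f"estimated delay: {delay * 1e6:.1f} microseconds")
```

Under these assumptions, pairwise delays of this kind could be combined with the relative device locations to constrain where a sound source lies, since each delay multiplied by the speed of sound gives a difference in path length from the source to the two devices. The source does not confirm that the described system works this way.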