Abstract:
A system (20) includes a memory (26, 29) and one or more processors (24, 27) configured to cooperatively carry out a process. The process includes loading a first audio file (28) and a digital note file (30) from the memory, computing a first spectrogram of the first audio file, computing a second spectrogram of a second audio file (32) generated from the digital note file, computing a mapping between first spectra of the first spectrogram and respective second spectra of the second spectrogram that minimizes a distance measure under predefined constraints, and, based on the mapping, shifting notes in the digital note file and adjusting their respective durations so as to improve the alignment of the notes with the first audio file. Other embodiments are also described.
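A minimal sketch of one way such a constrained spectral mapping could be computed, using dynamic time warping over magnitude spectrograms with librosa; the function names, distance metric, and parameters here are illustrative assumptions, not the claimed method:

```python
import numpy as np
import librosa

def align_spectrograms(audio_path, synth_path, sr=22050, hop=512):
    # Load the first audio file and the audio rendered from the note file
    y1, _ = librosa.load(audio_path, sr=sr)
    y2, _ = librosa.load(synth_path, sr=sr)
    # First and second magnitude spectrograms
    S1 = np.abs(librosa.stft(y1, hop_length=hop))
    S2 = np.abs(librosa.stft(y2, hop_length=hop))
    # DTW yields a monotonic frame-to-frame mapping minimizing the
    # accumulated distance, a common stand-in for a constrained mapping
    D, wp = librosa.sequence.dtw(X=S1, Y=S2, metric='cosine')
    # Warping path in forward order, converted from frames to seconds;
    # each (t_audio, t_synth) pair indicates how far the corresponding
    # notes would be shifted or stretched
    return np.asarray(wp)[::-1] * hop / sr
```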
Abstract:
Presented herein are methods and systems for generating a mood-based summary video for a full-length movie, comprising receiving a full-length movie, receiving a Knowledge Graph (KG) annotated model of the full-length movie generated by annotating features extracted from the full-length movie, segmenting the full-length movie into a plurality of mood-based time intervals each expressing a certain dominant mood based on an analysis of the KG, computing a score for each of the plurality of mood-based time intervals according to one or more of a plurality of metrics expressing a relevance level of the respective mood-based time interval to a narrative of the full-length movie, generating a mood-based summary video by concatenating a subset of the plurality of mood-based time intervals having a score exceeding a predefined threshold, and outputting the mood-based summary video for presentation to one or more users.
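An illustrative sketch of the scoring-and-selection step only; the interval fields and the relevance metric are placeholders, not the patent's actual KG-derived metrics:

```python
from dataclasses import dataclass

@dataclass
class MoodInterval:
    start: float      # seconds into the movie
    end: float
    mood: str         # dominant mood label from the KG analysis
    relevance: float  # placeholder score in [0, 1]

def select_intervals(intervals, threshold=0.7):
    # Keep only intervals whose score exceeds the threshold,
    # ordered by start time so the summary follows the narrative
    kept = [iv for iv in intervals if iv.relevance > threshold]
    return sorted(kept, key=lambda iv: iv.start)

# The kept intervals would then be concatenated (e.g., with a video
# editing tool) to form the mood-based summary video.
```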
Abstract:
A method and a device for generating a composite video are provided. The method comprises obtaining primary and secondary video segments each comprising a sequence of intra-coded I frames and predicted P frames, the primary and secondary video segments having first and second priority levels and first and second capture time intervals, wherein the second priority level is higher than the first priority level and the second capture time interval overlaps with the first capture time interval. The method comprises time-aligning the primary and secondary video segments; identifying, in the primary video segment, a start merge time corresponding to a first anchor I frame of the secondary video segment; and merging frames of the primary and secondary video segments, without transcoding, to generate a composite video, wherein the composite video comprises frames of the primary video segment up to the start merge time, the first anchor I frame, and frames of the secondary video segment subsequent to the first anchor I frame.
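A simplified sketch of the merge logic, assuming frames have already been demuxed into (capture_time, frame_type, payload) tuples; the data representation is an assumption for illustration:

```python
def merge_without_transcoding(primary, secondary):
    """primary/secondary: time-aligned lists of (time, 'I' or 'P', bytes)."""
    # Locate the first anchor I frame of the higher-priority secondary segment
    anchor_idx = next(i for i, (_, ftype, _) in enumerate(secondary)
                      if ftype == 'I')
    start_merge_time = secondary[anchor_idx][0]
    # Frames of the primary segment strictly before the start merge time...
    head = [f for f in primary if f[0] < start_merge_time]
    # ...followed by the anchor I frame and all subsequent secondary frames.
    # Coded payloads are copied as-is, i.e., no transcoding takes place.
    return head + secondary[anchor_idx:]
```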
Abstract:
A video production server comprising at least one processor and a storage is presented. Software modules composed of executable program code are loaded into a working memory of the at least one processor. Each software module, when executed by the at least one processor, provides an elementary service. A concatenation of elementary services provides a functionality involving processing of video and/or audio signals needed for producing a broadcast program. The video production server includes a set of software components that run on conventional hardware. Each functionality of the video production server is achieved by a specific piece of software that is assembled from reusable functional software blocks and that can run on any compatible hardware platform. Furthermore, a method for operating the video production server and a distributed video production system including the video production server are presented.
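A conceptual sketch of modelling a functionality as a concatenation of elementary services, each one a reusable software block; the service names and placeholder bodies are invented for illustration:

```python
def deinterlace(frame):
    return frame  # placeholder for a real elementary service

def color_correct(frame):
    return frame  # placeholder

def make_functionality(*services):
    """Concatenate elementary services into one processing functionality."""
    def run(signal):
        for service in services:
            signal = service(signal)
        return signal
    return run

# Assemble one functionality from reusable blocks
process = make_functionality(deinterlace, color_correct)
```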
Abstract:
A method for automatically starting an audio recording includes receiving audio data and dividing the audio data into a first set of consecutive segments and a second set of consecutive segments that occurs after the first set. The method further includes analyzing the first set of segments by measuring an average energy and peak value for each segment of the first set and determining a silence score therefrom, and analyzing the second set of segments by measuring an average energy and peak value for each segment of the second set and determining a music score therefrom. The method begins a recording of the audio data if the silence score is above a first predetermined value and the music score is above a second predetermined value.
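A rough sketch of the trigger logic, assuming mono PCM samples in a NumPy array; the score formulas are placeholders, since the abstract only states that the scores derive from per-segment average energy and peak values:

```python
import numpy as np

def should_start_recording(samples, seg_len=1024, n_segs=8,
                           silence_thresh=0.9, music_thresh=0.5):
    """samples: mono PCM in [-1, 1], long enough for 2 * n_segs segments."""
    segs = [samples[i * seg_len:(i + 1) * seg_len] for i in range(2 * n_segs)]
    first, second = segs[:n_segs], segs[n_segs:]  # second set occurs later

    def avg_energy(s):
        return float(np.mean(s ** 2))

    def peak(s):
        return float(np.max(np.abs(s)))

    # Placeholder score forms combining both per-segment measures
    silence_score = np.mean([1.0 - 0.5 * (avg_energy(s) + peak(s))
                             for s in first])
    music_score = np.mean([0.5 * (avg_energy(s) + peak(s))
                           for s in second])
    return silence_score > silence_thresh and music_score > music_thresh
```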
Abstract:
Methods and systems are disclosed related to the processing of video and sensor data recorded by a video camera. For example, a first embodiment is directed to the storage of sensor data in a metadata portion of a digital media file, a second embodiment is directed to the storage of highlight data in a metadata portion of a digital media file, and a third embodiment is directed to the creation of highlight data based on sensor data.
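A hypothetical illustration of packing sensor samples into a metadata portion alongside media data; the container layout here is invented for the sketch, whereas a real implementation would use an established format such as MP4 metadata boxes:

```python
import json
import struct

def write_media_with_sensor_metadata(path, video_bytes, sensor_samples):
    # Serialize sensor data (e.g., GPS or accelerometer readings)
    meta = json.dumps({"sensor": sensor_samples}).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack(">I", len(meta)))  # metadata length prefix
        f.write(meta)                          # metadata portion
        f.write(video_bytes)                   # media data portion
```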
Abstract:
Methods and apparatus for contextual video content adaptation are disclosed. Video content is adapted based on any number of criteria such as a target device type, viewing conditions, network conditions or various use cases, for example. A target adaptation of content may be defined for a specified video source. For example, based on receiving a request from a portable device for a live sports feed, a shortened and reduced-resolution version of the live sports feed video may be defined for the portable device. The source content may be accessed and adapted (e.g., temporally, spatially, etc.) and an adapted version of the content generated. For example, the source content may be cropped to a particular spatial region of interest and/or reduced in length to a particular scene. The generated adaptation may be transmitted to a device in response to the request, or stored to a storage device.
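A sketch of one possible adaptation step using ffmpeg (assumed to be installed): trim the source to one scene, crop a spatial region of interest, and reduce resolution for a portable device. The parameter values are illustrative only:

```python
import subprocess

def adapt_content(src, dst, start, duration, crop, width, height):
    x, y, w, h = crop
    # -ss/-t trim temporally; crop/scale adapt spatially
    subprocess.run([
        "ffmpeg", "-ss", str(start), "-t", str(duration), "-i", src,
        "-vf", f"crop={w}:{h}:{x}:{y},scale={width}:{height}",
        dst,
    ], check=True)

# e.g., a shortened, reduced-resolution version for a portable device:
# adapt_content("feed.mp4", "mobile.mp4", 120, 30, (0, 0, 1280, 720), 640, 360)
```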
Abstract:
Systems, methods, and computer-readable media bearing instructions for capturing notes from passive recording of an ongoing content stream and associating visual content (e.g., images and video) with the notes are presented. Passive recording comprises temporarily recording the most recent content of the ongoing content stream. An ongoing content stream is passively recorded into a passive recording buffer, which is configured to store a limited amount of recorded content corresponding to the most recently recorded content of the ongoing content stream. Upon an indication by the user, a note is generated from the recorded content in the passive recording buffer, associated with visual content, and stored in a note file for the user.
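A minimal sketch of a passive recording buffer as a ring buffer that retains only the most recent content of the stream; the buffer size and note format are assumptions for illustration:

```python
from collections import deque

class PassiveRecorder:
    def __init__(self, max_chunks=300):          # e.g., ~5 min of 1 s chunks
        self.buffer = deque(maxlen=max_chunks)   # oldest chunks drop off

    def on_stream_chunk(self, chunk):
        self.buffer.append(chunk)                # temporary recording only

    def capture_note(self, visual_content=None):
        # On user indication, snapshot the buffered content into a note
        # and associate it with visual content (e.g., an image or video)
        return {"recorded": list(self.buffer), "visual": visual_content}
```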
Abstract:
There is provided a method of altering a supplementary audio recording in preparation for adding it to a video recording of a scene comprising a sound source (F). The video recording was recorded by a camera (C) and a microphone (E), and the supplementary audio recording was recorded at a different time from the video recording. The method comprises receiving the supplementary audio recording and location information defining relative positions (T5, T4) of the sound source and the microphone, or relative positions (T5, T3) of the sound source and the camera, and altering characteristics of the supplementary audio recording based on the relative positions.
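A hedged sketch of one plausible alteration: scaling amplitude by the inverse of the source-to-microphone distance and delaying by the propagation time. The patent's actual alterations may differ; the positions and reference distance are assumptions:

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def alter_recording(samples, sr, source_pos, mic_pos, ref_distance=1.0):
    # Distance between the sound source and the microphone
    d = float(np.linalg.norm(np.asarray(source_pos) - np.asarray(mic_pos)))
    gain = ref_distance / max(d, ref_distance)   # inverse-distance law
    delay = int(round(d / SPEED_OF_SOUND * sr))  # propagation delay, samples
    return np.concatenate([np.zeros(delay), samples * gain])
```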
Abstract:
A method and apparatus for generating an extrapolated image from existing film or video content, which can be displayed beyond the borders of the existing film or video content to increase viewer immersiveness, are provided. The present principles provide for generating the extrapolated image without salient objects included therein, that is, objects that may distract the viewer from the main image. Such an extrapolated image is generated by determining salient areas and generating the extrapolated image with fewer salient objects in their place. Alternatively, salient objects can be detected in the extrapolated image and removed. Additionally, selected salient objects may be added to the extrapolated image.
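An illustrative sketch of the detect-and-remove alternative, using OpenCV's spectral-residual saliency detector (available in opencv-contrib-python) followed by inpainting; this is one plausible realization under assumed parameters, not the patent's specific method:

```python
import cv2
import numpy as np

def suppress_salient(extrapolated_bgr, thresh=0.5):
    # Detect salient regions in the candidate border extrapolation
    saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, sal_map = saliency.computeSaliency(extrapolated_bgr)
    mask = (sal_map > thresh).astype(np.uint8)
    # Inpaint over the salient objects so they no longer distract
    # the viewer from the main image
    return cv2.inpaint(extrapolated_bgr, mask, 3, cv2.INPAINT_TELEA)
```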