Abstract:
The technology is directed to maintaining high image quality when a subtitle is superimposed on a video. A reception device includes circuitry configured to receive a video stream and a subtitle stream. The circuitry processes the video stream to obtain video data of a video, and processes the subtitle stream to obtain subtitle bitmap data of a subtitle bitmap image. The circuitry adjusts the color gamut of the subtitle bitmap data to the color gamut of the video data, based on color gamut identification information of the subtitle bitmap data and of the video data. The circuitry then superimposes the color-gamut-adjusted subtitle bitmap image on the video.
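A minimal sketch of such a gamut adjustment, assuming the subtitle is flagged as BT.709 and the video as BT.2020 (the string identifiers, function name, and linear-light handling here are illustrative, not from the source): subtitle RGB is converted through CIE XYZ using the two gamuts' RGB-to-XYZ matrices.

```python
import numpy as np

# RGB -> XYZ matrices (linear light, D65 white point), from the
# respective ITU-R recommendations.
RGB709_TO_XYZ = np.array([
    [0.4124, 0.3576, 0.1805],
    [0.2126, 0.7152, 0.0722],
    [0.0193, 0.1192, 0.9505],
])
RGB2020_TO_XYZ = np.array([
    [0.6370, 0.1446, 0.1689],
    [0.2627, 0.6780, 0.0593],
    [0.0000, 0.0281, 1.0610],
])

def adjust_gamut(subtitle_rgb, subtitle_gamut, video_gamut):
    """Convert linear subtitle RGB into the video's color gamut.

    subtitle_rgb: (..., 3) array of linear-light RGB values.
    subtitle_gamut / video_gamut: "bt709" or "bt2020" -- stand-ins for
    the streams' color gamut identification information.
    """
    matrices = {"bt709": RGB709_TO_XYZ, "bt2020": RGB2020_TO_XYZ}
    if subtitle_gamut == video_gamut:
        return subtitle_rgb  # gamuts already match; nothing to adjust
    # Subtitle RGB -> XYZ -> video RGB in one combined matrix.
    m = np.linalg.inv(matrices[video_gamut]) @ matrices[subtitle_gamut]
    out = subtitle_rgb @ m.T
    return np.clip(out, 0.0, 1.0)  # clip any out-of-gamut values
```

Because both gamuts share the D65 white point, white is (to rounding error) preserved by the conversion, which is a quick sanity check on the matrices.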
Abstract:
A camera for capturing simultaneous images in the visible and non-visible light spectrums for use in videoing and tracking object movement is disclosed. The camera comprises a camera housing with a single lens capable of transmitting both visible and non-visible light frequencies; and a prism system for directing the non-visible frequencies to a first sensor and the visible frequencies to a second sensor. A processor relates the pixel data extracted from the first non-visible sensor with the pixel data from the second visible sensor.
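A sketch of how the processor might relate the two sensors' pixel data: since both sensors receive the scene through the same lens via the prism system, a pixel at a given (row, col) on one sensor corresponds to the same (row, col) on the other. The function name and the assumption of equal sensor resolutions are illustrative, not from the source.

```python
import numpy as np

def relate_sensor_data(ir_frame, rgb_frame):
    """Associate each non-visible (IR) pixel with its visible (RGB) pixel.

    ir_frame:  H x W array from the first (non-visible) sensor.
    rgb_frame: H x W x 3 array from the second (visible) sensor.
    Returns an H x W x 4 array: R, G, B plus the IR intensity, so each
    visible pixel carries its co-located non-visible measurement.
    """
    if ir_frame.shape != rgb_frame.shape[:2]:
        raise ValueError("sensors assumed to share resolution")
    return np.dstack([rgb_frame, ir_frame])
```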
Abstract:
A method and system for preparing subtitles for use in a stereoscopic presentation are described. The method allows a subtitle to be displayed without being truncated or masked by comparing the subtitle's initial footprint with an image display area. If any portion of the initial footprint lies outside the image display area, the subtitle is adjusted according to adjustment information, which includes at least one of: a scale factor, a translation amount and a disparity change, so that the adjusted subtitle lies completely within the image display area. Furthermore, the disparity of the subtitle can be adjusted by taking into account the disparities of one or more objects in an underlying image to be displayed with the subtitle.
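The footprint comparison and adjustment can be sketched as simple rectangle geometry. This is a minimal illustration of the scale-and-translate portion of the adjustment (the `Rect` type, parameter names, and the readability floor are assumptions, and the disparity change is omitted):

```python
from dataclasses import dataclass

@dataclass
class Rect:
    """Axis-aligned rectangle: (x, y) top-left corner plus size."""
    x: float
    y: float
    w: float
    h: float

def adjust_subtitle(footprint: Rect, display: Rect,
                    max_scale_down: float = 0.5) -> Rect:
    """Return an adjusted footprint lying completely inside the display.

    First scales the subtitle down (about its top-left corner) if it is
    larger than the display area, then translates it so that no edge of
    the footprint lies outside the image display area.
    """
    # Scale factor: shrink only if the subtitle exceeds the display size.
    s = min(1.0, display.w / footprint.w, display.h / footprint.h)
    s = max(s, max_scale_down)  # don't shrink below a readability floor
    w, h = footprint.w * s, footprint.h * s
    # Translation: clamp the corner so the whole box stays on screen.
    x = min(max(footprint.x, display.x), display.x + display.w - w)
    y = min(max(footprint.y, display.y), display.y + display.h - h)
    return Rect(x, y, w, h)
```

For example, a subtitle whose initial footprint extends past the right edge of a 1920x1080 display area is translated left until it lies completely within the area.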
Abstract:
A three-dimensional image signal comprises a first image component, a second component for creating a three-dimensional image in combination with the first image component, a text component comprising text-based subtitles and/or presentation graphics-based bitmap images for inclusion in the three-dimensional image, and a shared Z-location component comprising Z-location information describing the depth location of the text component within the three-dimensional image. The signal is rendered by rendering a three-dimensional image from the first image component and the second component, including rendering the text-based subtitles and/or presentation graphics-based bitmap images in the three-dimensional image, where the rendering of the text component includes adjusting the depth location of the text-based subtitles and/or presentation graphics-based bitmap images based on the shared Z-location component. Advantageously, the Z-location for both text-based and presentation-graphics-based subtitles is the same and only needs to be stored once per stream.
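In stereoscopic rendering, a depth location is typically realized as a horizontal disparity between the left-eye and right-eye views. A minimal sketch of applying the shared Z-location to text placement, where the Z range and the mapping constants are purely illustrative (the signal's actual Z encoding is not specified here):

```python
def subtitle_views(x, y, z_location, max_disparity_px=40.0, z_range=255.0):
    """Place text in the left/right views from a shared Z-location.

    Maps a Z-location (assumed 0..z_range, larger = nearer the viewer)
    to a horizontal pixel disparity, then shifts the text half the
    disparity in opposite directions per eye. The same function serves
    text-based subtitles and presentation-graphics bitmaps alike, since
    both share one Z-location per stream.
    """
    disparity = max_disparity_px * (z_location / z_range)
    left = (x + disparity / 2.0, y)
    right = (x - disparity / 2.0, y)
    return left, right
```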
Abstract:
A subtitle generation method for live programs with numerous participants (1), including: assigning a subtitling technician (2) to each participant (1) of the program; each subtitling technician repeating, in summarized form, what the participant assigned to him/her is saying; transforming the acoustic signals of each subtitling technician (2) into audio signals using a microphone (7); extracting an adapted text (4) from each audio signal in an automatic voice recognition phase; sending each adapted text (4) to a central data processing unit (6); revising, with the help of a subtitling editor (5), the adapted text (4) received by the central data processing unit (6), thereby giving the text the proper format and/or correcting errors produced during the automatic voice recognition stage; and inserting the revised adapted text (4) of each participant into the images of the program and transmitting said images to the screen (9) with the text (4) already inserted.
Abstract:
A data combining apparatus and a data combining method in which the timing for transformation, movement (arrangement), and generation of data to be combined can be corrected in accordance with timing information, and a screen formation needed for timing matching can be formed by combination. A data combining apparatus 100 generates timing information for a change of the screen formation in a timing information generation unit 108 and synchronizes the generation timing in a graphics generation unit 103 with the processing timing for image data in a transformation and movement processing unit 105, enabling the image data and the graphics to be combined without timing deviation, so that the desired combined data is obtained in a combining unit 107.
Abstract:
Speech of a speaker is repeated by a repeating person, the repeated speech is recognized, and the video of the speaker is delayed so that it is displayed together with the recognized characters, making the speech of the speaker easier to understand. A video delay unit (2) outputs delayed video data of the video input from a camera (1). A first speech recognition unit (5) recognizes the content of the first language spoken by a first repeating person into a first speech input unit (3) and converts it into first visible language data. A second speech recognition unit (6) recognizes the content of the second language spoken by a second repeating person into a second speech input unit (4) and converts it into second visible language data. A layout setting unit (8) receives the first and second language data from the first and second speech recognition units (5, 6) and the delayed video data from the video delay unit (2), sets a display layout for these data, creates a display video, and displays it on a character video display unit (9).
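The video delay unit can be sketched as a fixed-length frame buffer: frames go in as they arrive from the camera, and each arrival releases the frame captured a fixed number of frames earlier, so the delayed picture can be composited with the recognition results that lag behind real time. The class name and frame-count parameter are illustrative assumptions.

```python
from collections import deque

class VideoDelayUnit:
    """Delay video by a fixed number of frames (a sketch of unit (2)).

    Frames are pushed as they arrive from the camera; once the buffer
    holds more than `delay_frames` frames, each push yields the frame
    captured `delay_frames` frames earlier.
    """
    def __init__(self, delay_frames: int):
        self.buffer = deque()
        self.delay_frames = delay_frames

    def push(self, frame):
        """Accept a new frame; return the delayed frame, or None while filling."""
        self.buffer.append(frame)
        if len(self.buffer) > self.delay_frames:
            return self.buffer.popleft()
        return None
```

With a delay of two frames, the first two pushes return nothing while the buffer fills; from the third push onward, each new frame releases the frame captured two frames before it.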