Abstract:
Systems, methods, and devices for an interactive viewing experience by detecting on-screen data are disclosed. One or more frames of video data are analyzed to detect regions in the visual video content that contain text. A character recognition operation can be performed on the regions to generate textual data. Based on the textual data and the regions, a graphical user interface (GUI) definition can be generated. The GUI definition can be used to generate a corresponding GUI superimposed onto the visual video content to present users with controls and functionality with which to interact with the text or enhance the video content. Context metadata can be determined from external sources or by analyzing the continuity of audio and visual aspects of the video data. The context metadata can then be used to improve the character recognition or inform the generation of the GUI.
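The mapping from recognized text regions to a GUI definition can be illustrated with a minimal Python sketch. The detection and recognition steps are assumed to be supplied externally, and the schema here (dicts with "bounds", "text", and a default "action") is a hypothetical stand-in, not the patent's actual GUI definition format.

```python
def build_gui_definition(text_regions, frame_size):
    """Map recognized text regions to overlay controls.

    text_regions: list of (x, y, w, h, recognized_text) tuples produced
                  by upstream text detection plus character recognition.
    frame_size:   (width, height) of the video frame, used to keep
                  controls inside the visible area.
    """
    width, height = frame_size
    controls = []
    for (x, y, w, h, text) in text_regions:
        # Discard regions that fall outside the frame boundaries.
        if x < 0 or y < 0 or x + w > width or y + h > height:
            continue
        controls.append({
            "bounds": (x, y, w, h),  # where to superimpose the control
            "text": text,            # the recognized characters
            "action": "search",      # hypothetical default interaction
        })
    return {"frame_size": frame_size, "controls": controls}
```

A renderer consuming this definition would draw one interactive control per entry, positioned over the corresponding on-screen text.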
Abstract:
A video processing device includes a histogram generating component, an analyzing component, a comparator and an encoding component. The histogram generating component can generate a histogram for image data of an image frame. The analyzing component can analyze the histogram, can identify an isolated spike in the histogram and can output at least one strobe parameter. The comparator can compare the at least one strobe parameter with at least one predetermined threshold, can output a first instruction signal when the at least one comparison operation is indicative of a strobe and can output a second instruction signal when the at least one comparison operation is not indicative of a strobe. The encoding component can encode the image data in a first manner based on the first instruction signal and can encode the image data in a second manner based on the second instruction signal.
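The histogram-spike test can be sketched in a few lines of Python. Here the "strobe parameter" is taken to be the fraction of pixels falling into a single histogram bin, and a spike counts as isolated when its neighboring bins are near-empty; the bin count and threshold values are illustrative assumptions, not the patent's predefined thresholds.

```python
def detect_strobe(luma_values, num_bins=16, min_fraction=0.8):
    """Return True when the luminance histogram shows an isolated spike.

    luma_values:  per-pixel luminance in [0, 255].
    min_fraction: illustrative threshold on the spike's pixel fraction.
    """
    # Build the luminance histogram for the frame.
    hist = [0] * num_bins
    for v in luma_values:
        hist[min(v * num_bins // 256, num_bins - 1)] += 1
    total = len(luma_values)
    for i, count in enumerate(hist):
        left = hist[i - 1] if i > 0 else 0
        right = hist[i + 1] if i < num_bins - 1 else 0
        spike_fraction = count / total        # the "strobe parameter"
        isolated = (left + right) < count * 0.1
        # First instruction path: comparison indicative of a strobe.
        if isolated and spike_fraction >= min_fraction:
            return True
    # Second instruction path: no comparison indicated a strobe.
    return False
```

An encoder would then pick between the two encoding modes based on this boolean, e.g. forcing an intra-coded frame when a strobe is flagged.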
Abstract:
A method of classifying the shot type of a video frame is disclosed, comprising loading a frame, dividing the frame into field pixels and non-field pixels based on a first playfield detection criterion, determining an initial shot type classification using the number of the field pixels and the number of the non-field pixels, partitioning the frame into one or more regions based on the initial classification, determining the status of each of the one or more regions based upon the number of the field pixels and the non-field pixels located within each region, and determining a shot type classification for the frame based upon the status of each region.
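The two-stage classification can be sketched as follows: a global field-pixel ratio yields the initial shot type, and a per-region pass refines it. The thresholds, the shot-type labels, and the left/right region partition are illustrative placeholders, not the patent's actual criteria.

```python
def classify_shot(frame, is_field_pixel, long_thresh=0.6, medium_thresh=0.3):
    """Classify a frame's shot type from its playfield coverage.

    frame:          2D list of pixel values (rows of pixels).
    is_field_pixel: predicate implementing the playfield detection
                    criterion (hypothetical; e.g. a green-hue test).
    """
    h, w = len(frame), len(frame[0])
    field = sum(1 for row in frame for p in row if is_field_pixel(p))
    ratio = field / (h * w)
    # Stage 1: initial classification from the global field-pixel count.
    if ratio >= long_thresh:
        initial = "long"
    elif ratio >= medium_thresh:
        initial = "medium"
    else:
        initial = "close-up"
    # Stage 2: partition into regions and check each region's status;
    # a "long" shot with one sparsely covered half is demoted.
    if initial == "long":
        for half in (slice(0, w // 2), slice(w // 2, w)):
            region_field = sum(
                1 for row in frame for p in row[half] if is_field_pixel(p))
            region_total = h * (half.stop - half.start)
            if region_field / region_total < medium_thresh:
                return "medium"
    return initial
```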
Abstract:
Systems and methods are provided for presenting content to a user (202). An exemplary method involves establishing (302, 404, 406) a relationship between a first device (204) and the user (202), wherein, based on the relationship, one or more instances of secondary content are automatically excluded (306) from display by the first device (204) while primary content is displayed (230, 412) by the first device (204). The method continues by presenting (240, 308, 416) an instance of secondary content to the user (202) in a manner that is influenced by the relationship.
Abstract:
A method of identifying long shots of sports video is disclosed, comprising receiving a video frame comprising a plurality of pixels, classifying each of the plurality of pixels as a candidate field pixel or a candidate non-field pixel, determining whether at least a predefined percentage of the plurality of pixels are candidate field pixels, calculating a first standard deviation, the first standard deviation being the standard deviation of the hues of all candidate field pixels, and classifying the video frame as a long shot of sports video when at least the predefined percentage of the plurality of pixels are candidate field pixels and the first standard deviation is equal to or lower than a predefined maximum standard deviation value.
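The two conditions, a minimum field-pixel percentage and a maximum hue standard deviation, combine as in this minimal Python sketch. The candidate-field predicate and both threshold values are illustrative assumptions standing in for the patent's predefined values.

```python
from math import sqrt

def is_long_shot(hues, is_candidate_field, min_field_pct=0.5, max_hue_std=10.0):
    """Classify a frame as a long shot from its per-pixel hues.

    hues:               iterable of per-pixel hue values.
    is_candidate_field: predicate on a hue (hypothetical, e.g. a green
                        range) implementing the field-pixel test.
    """
    hues = list(hues)
    field_hues = [h for h in hues if is_candidate_field(h)]
    # Condition 1: at least the predefined percentage of field pixels.
    if len(field_hues) / len(hues) < min_field_pct:
        return False
    # Condition 2: hue standard deviation of the field pixels is small,
    # i.e. the playfield color is uniform across the frame.
    mean = sum(field_hues) / len(field_hues)
    std = sqrt(sum((h - mean) ** 2 for h in field_hues) / len(field_hues))
    return std <= max_hue_std
```

The hue-uniformity check is what separates a wide view of a single playfield from, say, a crowd shot that happens to contain scattered green pixels.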
Abstract:
A video processing device is provided that includes a buffer, a luminance component, a maximum threshold component, a minimum threshold component and a flagging component. The buffer can store frame image data for a plurality of video frames. The luminance component can generate a first luminance value corresponding to a first frame image data and can generate a second luminance value corresponding to a second frame image data. The maximum threshold component can generate a maximum indicator signal when the difference between the second luminance value and the first luminance value is greater than a maximum threshold. The minimum threshold component can generate a minimum indicator signal when the difference between the second luminance value and the first luminance value is less than a minimum threshold. The flagging component can generate a flagged signal based on the maximum indicator signal and the minimum indicator signal.
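The interplay of the maximum and minimum threshold components reduces to a per-transition comparison of frame-to-frame luminance differences, sketched below with illustrative threshold values (the patent does not specify them).

```python
def flag_frames(luma_series, max_threshold=40.0, min_threshold=-40.0):
    """Flag frame transitions with abrupt luminance changes.

    luma_series: per-frame luminance values (e.g. frame means), one per
                 buffered frame.
    Returns one boolean per consecutive frame pair.
    """
    flags = []
    for i in range(1, len(luma_series)):
        diff = luma_series[i] - luma_series[i - 1]
        maximum = diff > max_threshold    # "maximum indicator signal"
        minimum = diff < min_threshold    # "minimum indicator signal"
        flags.append(maximum or minimum)  # "flagged signal"
    return flags
```

A sudden brightening (large positive difference) and a sudden darkening (large negative difference) each raise the flag, which is why both a maximum and a minimum threshold are needed.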
Abstract:
A method implemented in a computer system for controlling the delivery of data and audio/video content is disclosed. The method delivers primary content to the subscriber device for viewing by a subscriber. The method also delivers secondary content to the companion device for viewing by the subscriber in parallel with the subscriber viewing the primary content, where the secondary content relates to the primary content. The method extracts attention estimation features from the primary content, and monitors the companion device to determine an interaction measurement for the subscriber viewing the secondary content on the companion device. The method calculates an attention measurement for the subscriber viewing the primary content based on the attention estimation features and the interaction measurement, and controls the delivery of the secondary content to the companion device based on the attention measurement.
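One way the attention measurement could combine the two inputs is a weighted blend, sketched below. The linear form, the weight, and the delivery threshold are all hypothetical illustrations; the patent only specifies that both the attention estimation features and the interaction measurement feed the calculation.

```python
def attention_measurement(feature_scores, interaction_level, feature_weight=0.7):
    """Estimate attention on the primary content in [0, 1].

    feature_scores:    attention estimation features extracted from the
                       primary content, each normalized to [0, 1].
    interaction_level: how actively the subscriber is interacting with
                       secondary content on the companion device, in
                       [0, 1]; heavy interaction implies attention has
                       shifted away from the primary content.
    """
    feature_score = sum(feature_scores) / len(feature_scores)
    return feature_weight * feature_score + (1 - feature_weight) * (1 - interaction_level)

def should_deliver_secondary(attention, min_attention=0.4):
    # Hold back further secondary content when primary-content
    # attention is low, so it does not pile up unseen.
    return attention >= min_attention
```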