Abstract:
A method of sensing depth using an RGB camera. In an example method, a color image of a scene is received from an RGB camera. The color image is applied to a trained machine learning component which uses features of the image elements to assign all or some of the image elements a depth value which represents the distance between the surface depicted by the image element and the RGB camera. In various examples, the machine learning component comprises one or more entangled geodesic random decision forests.
Abstract:
A method and apparatus for performing motion estimation in a digital video system is disclosed. Specifically, the present invention discloses a system that quickly calculates estimated motion vectors in a very efficient manner. In one embodiment, a first multiplicand is determined by multiplying a first display time difference between a first video picture and a second video picture by a power-of-two scale value. This step scales up the numerator of a ratio. Next, the system determines a scaled ratio by dividing that scaled numerator by a second display time difference between said second video picture and a third video picture. The scaled ratio is then stored for calculating motion vector estimations. By storing the scaled ratio, all the estimated motion vectors can be calculated quickly and with good precision, since the scaled ratio preserves significant bits and the scale is reduced by simple shifts.
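The scaled-ratio idea in the abstract above can be illustrated with a minimal sketch. The function names, the shift of 8 bits, and the truncating (non-rounding) shift are assumptions made for illustration; the abstract only states that the scale value is a power of two and that the scale is reduced by shifts.

```python
# Illustrative sketch only: names and the 8-bit shift are assumptions.
SCALE_SHIFT = 8  # assumed exponent; the scale value is 2**8 = 256


def scaled_ratio(dt_first_second: int, dt_second_third: int,
                 shift: int = SCALE_SHIFT) -> int:
    """Scale the numerator by 2**shift, then divide by the second
    display time difference to obtain an integer scaled ratio."""
    if dt_second_third == 0:
        raise ValueError("second display time difference must be non-zero")
    return (dt_first_second << shift) // dt_second_third


def estimate_motion_vector(mv, ratio: int, shift: int = SCALE_SHIFT):
    """Multiply each component of a motion vector by the stored scaled
    ratio, then remove the scale with a right shift (no rounding)."""
    return tuple((component * ratio) >> shift for component in mv)


# The ratio is computed once and reused for every motion vector.
ratio = scaled_ratio(1, 2)
estimated = estimate_motion_vector((10, 6), ratio)
```

Because the division happens once per picture pair, each motion vector afterwards costs only a multiply and a shift per component. A real codec would likely add a rounding offset before the shift; that detail is not stated in the abstract and is omitted here.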
Abstract:
Some embodiments provide a method for detecting and/or identifying a set of faces in a video frame and performing a set of image processing operations based on locations of the set of faces. In particular, the method identifies a set of respective locations of the set of faces in the video frame and applies one or more image processing operations based on the locations of the set of faces found in the video frame. The image processing operations include color correction operations, non-color correction operations, and image processing operations that modify areas inside or outside of the detected and/or identified faces. Additionally, some embodiments provide a graphical user interface for automatically applying image processing operations to an area of a video frame isolated by an ellipse-shaped mask. Furthermore, some embodiments provide a system for automatically applying image processing operations to an area of a video frame isolated by an ellipse-shaped mask.
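As a rough illustration of applying an operation only to the area isolated by an ellipse-shaped mask, the following sketch treats a frame as a list of rows of numeric pixel values. The function names, the frame representation, and the axis-aligned ellipse are assumptions for this example; the abstract does not specify how the mask is represented or how faces are located.

```python
# Illustrative sketch only: frame layout and names are assumptions.

def inside_ellipse(x, y, cx, cy, rx, ry):
    """Return True if pixel (x, y) lies inside or on an axis-aligned
    ellipse centered at (cx, cy) with radii rx and ry."""
    return ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0


def apply_inside_mask(frame, cx, cy, rx, ry, op):
    """Return a new frame in which op is applied to every pixel inside
    the ellipse; pixels outside the ellipse are copied unchanged."""
    return [
        [op(pixel) if inside_ellipse(x, y, cx, cy, rx, ry) else pixel
         for x, pixel in enumerate(row)]
        for y, row in enumerate(frame)
    ]


# Brighten a small elliptical region of a 5x5 frame of zeros.
frame = [[0] * 5 for _ in range(5)]
masked = apply_inside_mask(frame, cx=2, cy=2, rx=2, ry=1, op=lambda p: p + 1)
```

Modifying the area outside the face, which the abstract also mentions, would amount to inverting the membership test.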
Abstract:
A method and apparatus for variable accuracy inter-picture timing specification for digital video encoding is disclosed. Specifically, the present invention discloses a system that allows the relative timing of nearby video pictures to be encoded in a very efficient manner. In one embodiment, the display time difference between a current video picture and a nearby video picture is determined. The display time difference is then encoded into a digital representation of the video picture. In a preferred embodiment, the nearby video picture is the most recently transmitted stored picture. For coding efficiency, the display time difference may be encoded using a variable length coding system or arithmetic coding. In an alternate embodiment, the display time difference is encoded as a power of two to reduce the number of bits transmitted.
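The alternate embodiment, in which the display time difference is encoded as a power of two, can be sketched as transmitting only the exponent. The function names and the restriction to positive differences that are exact powers of two are assumptions for this example; the abstract does not describe sign handling or how other values are treated.

```python
# Illustrative sketch only: names and input restrictions are assumptions.

def encode_power_of_two(diff: int) -> int:
    """Return the exponent e with 2**e == diff.
    Assumes diff is a positive, exact power of two."""
    if diff <= 0 or diff & (diff - 1):
        raise ValueError("display time difference must be a positive power of two")
    return diff.bit_length() - 1


def decode_power_of_two(exponent: int) -> int:
    """Recover the display time difference from the transmitted exponent."""
    return 1 << exponent


# A difference of 8 time units is sent as the exponent 3.
exponent = encode_power_of_two(8)
restored = decode_power_of_two(exponent)
```

The exponent needs far fewer bits than the difference itself, which is the saving the abstract refers to; the variable length or arithmetic coding mentioned for the main embodiment is a separate option and is not shown.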
Abstract:
An effective method for dynamically selecting the number of I, P and B frames during video coding is proposed. Short-term look-ahead analysis of a video sequence yields a variable number of B frames to be coded between any two stored pictures. The first picture of a group of frames (GOF) may be coded as a B picture. Motion speed is calculated for each picture of the GOF with respect to the first picture of the GOF. Subject to exceptions, as long as the subsequent pictures exhibit motion speeds that are similar and motion vector displacements that are co-linear with those of the first picture in the GOF, they may be coded as B pictures. When a picture is encountered having a motion speed that is not the same as that of the first picture in the GOF, the picture may be coded as a P picture. In some embodiments, a sequence of B pictures that terminates in a P picture may be called a “group of frames” (GOF).
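The frame-type decision described above can be sketched in simplified form. This example assumes that a motion speed has already been computed for each picture relative to the first picture of the GOF, and it uses a hypothetical `tolerance` parameter to decide whether two speeds are "similar". The co-linearity check on motion vector displacements and the exceptions mentioned in the abstract are not modeled.

```python
# Illustrative sketch only: the tolerance test and names are assumptions.

def assign_picture_types(speeds, tolerance=0.1):
    """Assign 'B' to pictures whose motion speed stays within tolerance
    of the first picture's speed; the first picture that deviates is
    coded as 'P' and terminates the group of frames."""
    if not speeds:
        return []
    reference = speeds[0]
    types = ["B"]  # the first picture of the GOF may be coded as B
    for speed in speeds[1:]:
        if abs(speed - reference) <= tolerance:
            types.append("B")
        else:
            types.append("P")
            break  # the P picture closes this GOF
    return types


# Three pictures with similar speed, then one that deviates.
gof = assign_picture_types([1.0, 1.05, 0.95, 2.0, 1.0])
```

Pictures after the terminating P picture would start the next GOF, so the look-ahead analysis would be repeated from that point.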