Abstract:
In one aspect, the present invention is directed to a method and an apparatus for organizing digital media, particularly digital photos, using face recognition. According to a first aspect of the present invention, a computer-based method for organizing digital photos comprises: extracting objects of interest from a plurality of photographs; cropping said plurality of photographs to generate images of isolated objects of interest; applying a recognition algorithm to determine the similarity of isolated objects of interest with a reference; displaying a plurality of objects arranged as a function of the determined similarity; and receiving user input to associate said objects with a particular classification.
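The "arrange as a function of the determined similarity" step can be sketched as follows. This is only an illustration, not the recognition algorithm of the abstract: it assumes each cropped object has already been reduced to a numeric feature vector, and it uses cosine similarity as a stand-in measure. The names `cosine_similarity` and `arrange_by_similarity` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def arrange_by_similarity(objects, reference):
    """Return object names ordered from most to least similar to the reference.

    `objects` maps a name to a feature vector (assumed to be precomputed
    from the cropped image of the isolated object of interest).
    """
    return sorted(
        objects,
        key=lambda name: cosine_similarity(objects[name], reference),
        reverse=True,
    )
```

A user interface could display the objects in this order and let the user confirm or correct the classification of each one.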
Abstract:
A system provides for controlling a cursor on a screen automatically and dynamically when using a video camera as a pointing device. A computer displays static or dynamic content to a screen. A video camera connected to the computer points at the screen. As the video camera films the screen, frames captured by the video camera are sent to the computer. A target image is displayed by the computer onto the screen and marks the position of the screen cursor of the video camera. Frames captured by the video camera include the target image, and the computer dynamically moves the target image on the screen to ensure that the target image stays in the center of the view of the video camera.
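A minimal sketch of the re-centering step, under stated assumptions: the target has already been located in the captured frame, the camera axes are aligned with the screen axes, and a single `scale` factor converts frame pixels to screen pixels. The function name, the `scale` and `gain` parameters, and the sign convention are all illustrative and are not taken from the abstract.

```python
def update_target(screen_pos, target_in_frame, frame_size, scale=1.0, gain=1.0):
    """Move the on-screen target so it drifts toward the center of the camera view.

    screen_pos      -- current (x, y) of the target on the screen
    target_in_frame -- (x, y) where the target was detected in the camera frame
    frame_size      -- (width, height) of the camera frame
    """
    center_x = frame_size[0] / 2
    center_y = frame_size[1] / 2
    # Offset of the detected target from the center of the camera view.
    err_x = target_in_frame[0] - center_x
    err_y = target_in_frame[1] - center_y
    # If the target appears right of center, the camera points to its left,
    # so the target is moved left on the screen (and likewise vertically).
    return (
        screen_pos[0] - gain * scale * err_x,
        screen_pos[1] - gain * scale * err_y,
    )
```

Calling this once per captured frame gives a simple proportional control loop. A `gain` below 1 would damp the motion if detection is noisy.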
Abstract:
Embodiments of the present invention include a video server that can detect and track the image of a pointing indicator in an input video stream representation of a computer display. The video server checks ordered frames of the video signal and determines movements of a pointing indicator such as a mouse arrow. Certain motions by the pointing indicator, such as lingering over or circling a button or menu item, can trigger a control action on the server.
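The "lingering" gesture can be sketched as a dwell test over per-frame pointer positions. This assumes the pointer has already been detected in each ordered frame. The rectangular `region` and the `min_frames` threshold are hypothetical parameters chosen for illustration.

```python
def detect_dwell(positions, region, min_frames):
    """Return True if the pointer stays inside `region` for at least
    `min_frames` consecutive frames.

    positions -- iterable of (x, y) pointer positions, one per ordered frame
    region    -- (left, top, right, bottom) bounds of a button or menu item
    """
    left, top, right, bottom = region
    run = 0  # length of the current streak of frames inside the region
    for x, y in positions:
        if left <= x <= right and top <= y <= bottom:
            run += 1
            if run >= min_frames:
                return True
        else:
            run = 0  # the pointer left the region, so the streak restarts
    return False
```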
Abstract:
A system and method for identifying query-related keywords in documents found in a search using latent semantic analysis. The documents are represented as a document-term matrix M containing one or more document term-weight vectors d, which may be term-frequency (tf) vectors or term-frequency inverse-document-frequency (tf-idf) vectors. This matrix is subjected to a truncated singular value decomposition. The resulting transform matrix U can be used to project a query term-weight vector q into the reduced N-dimensional space, followed by its expansion back into the full vector space using the inverse of U. To perform a search, the similarity of q_expanded is measured relative to each candidate document vector in this space. Exemplary similarity functions are dot product and cosine similarity. The keywords selected are the terms with the highest values in q_expanded that also appear in at least one document. Matching keywords from the query may be highlighted in the search results.
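The projection and expansion can be written out as below. This is one reading of the abstract under explicit assumptions: M has T terms as rows and D documents as columns, the subscript N marks truncation to the N largest singular values, and, since the truncated U is not square, the "inverse of U" is taken to be the pseudo-inverse of its transpose, which equals U itself because its columns are orthonormal.

```latex
\begin{align*}
  % Truncated SVD of the T x D document-term matrix (assumed orientation)
  M &\approx U_N \Sigma_N V_N^{\top},
    & U_N &\in \mathbb{R}^{T \times N},\quad U_N^{\top} U_N = I_N \\
  % Projection of the query into the reduced N-dimensional space
  q_{\text{reduced}} &= U_N^{\top} q,
    & q_{\text{reduced}} &\in \mathbb{R}^{N} \\
  % Expansion back into the full term space
  q_{\text{expanded}} &= U_N\, q_{\text{reduced}} = U_N U_N^{\top} q,
    & q_{\text{expanded}} &\in \mathbb{R}^{T} \\
  % Cosine similarity against a candidate document vector d
  \operatorname{sim}(q_{\text{expanded}}, d)
    &= \frac{q_{\text{expanded}}^{\top} d}
            {\lVert q_{\text{expanded}} \rVert \, \lVert d \rVert}
\end{align*}
```

Under this reading, the expanded query assigns nonzero weight to terms that co-occur with the query terms across the collection, which is why its largest entries are candidates for query-related keywords.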
Abstract:
Embodiments of the present invention provide the ability to navigate, view, and manipulate a collection of digital images utilizing a GUI that has the familiar context of a calendar. Graphical objects representative of digital images are displayed within a particular day displayed in a calendar-based GUI. A user may group digital images into groups, modify the date with which a digital image is associated and perform various other manipulations using embodiments of a calendar-based GUI.
Abstract:
The systems and methods of this invention watermark an original data file using dimensional compression and expansion. The original data file extends along a given dimension and has portions that extend along that given dimension. The information is embedded into the data file by selectively dimensionally compressing or expanding the size of each of some or all of the portions along the given dimension, which can be space or time. The portions of the data file are selectively dimensionally expanded or compressed according to a given encoding scheme. This encoding scheme can use the kind of modification, the relationship between the types of modification applied to adjacent portions, or the duration or degree of compression or expansion to store a portion of the embedded information. The portions of the embedded information can be individual bits of binary or trinary information, or can be a portion of analog information.
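A toy sketch of the simplest encoding scheme mentioned, where the kind of modification carries one bit per portion: expansion encodes 1 and compression encodes 0. It assumes the decoder knows the original (nominal) length of each portion, which the abstract does not state. The function names and the `delta` parameter are hypothetical.

```python
def embed(nominal_lengths, bits, delta=0.05):
    """Expand a portion by `delta` to store a 1, compress it to store a 0.

    nominal_lengths -- original size of each portion along the chosen
                       dimension (for example, duration in seconds)
    bits            -- one bit (0 or 1) per portion
    """
    return [
        length * (1 + delta if bit else 1 - delta)
        for length, bit in zip(nominal_lengths, bits)
    ]


def extract(nominal_lengths, observed_lengths):
    """Recover the bits by comparing each observed length with its nominal one."""
    return [
        1 if observed > nominal else 0
        for nominal, observed in zip(nominal_lengths, observed_lengths)
    ]
```

A real system would also have to resample the underlying signal for each portion; only the length bookkeeping is modeled here.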
Abstract:
Apparatus for determining the location of a signal-generating source (e.g., a conferee in a telephone conference) includes at least three sensors (e.g., microphones) arranged in a plurality of sets, each having two or more sensors. A surface-finding element responds to receipt at each sensor set of signals (e.g., speech) from the source for identifying a geometric surface (e.g., the surface of a hyperboloid or cone) representing potential locations of the source as a function of sensor locations and time difference of arrival of the signals. A location-approximating element coupled to two or more of the sets identifies a line that further defines potential source locations at the intersection of the surfaces. A location signal representing those potential locations is generated in accordance with parameters of that line. Further functionality provides for generating the location signal as a function of the closest intersections of plural ones of the aforementioned lines.
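The time-difference-of-arrival constraint can be illustrated with a small 2D sketch, in which each hyperboloid surface reduces to a hyperbola. This is not the surface-and-line intersection procedure of the abstract: it substitutes a brute-force search over candidate points for the point that best satisfies every pairwise constraint. The speed of sound value and all function names are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, assumed value for air


def tdoa(source, mic_a, mic_b, c=SPEED_OF_SOUND):
    """Arrival time at mic_a minus arrival time at mic_b for a given source."""
    return (math.dist(source, mic_a) - math.dist(source, mic_b)) / c


def hyperbola_residual(point, mic_a, mic_b, delta_t, c=SPEED_OF_SOUND):
    """Zero when `point` lies on the curve defined by the pair's time difference."""
    return math.dist(point, mic_a) - math.dist(point, mic_b) - c * delta_t


def locate(mic_pairs, delays, candidates):
    """Return the candidate point with the smallest summed squared residual."""
    def cost(point):
        return sum(
            hyperbola_residual(point, a, b, dt) ** 2
            for (a, b), dt in zip(mic_pairs, delays)
        )
    return min(candidates, key=cost)
```

With three microphones forming two or more pairs, the measured delays can be passed to `locate` together with a grid of candidate positions. A finer grid or a numerical solver would replace the brute-force search in practice.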